Show HN: Prompt-refiner – Lightweight optimization for LLM inputs and RAG

github.com

3 points by xinghaohuang a day ago

Hi HN,

While building RAG agents, I noticed that a large share of the token budget was wasted on formatting overhead (HTML tags, JSON structure, redundant whitespace). Existing solutions felt too heavy (often requiring torch/transformers), so I wrote this lightweight, zero-dependency library to solve it.

It includes strategies for context packing, PII redaction, and tool output compression. Benchmarks show it can save ~15% of tokens with negligible latency overhead (<0.5ms).
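To give a sense of the kind of compression involved: the sketch below strips HTML tags, collapses whitespace, and minifies JSON, which are the main sources of formatting overhead mentioned above. The function names (`compress_context`, `minify_json`) are illustrative only, not the library's actual API.

```python
import json
import re

def compress_context(text: str) -> str:
    """Strip formatting overhead from retrieved context before prompting.
    Illustrative sketch -- not the library's actual API."""
    # Replace HTML tags with a space so adjacent words don't fuse
    text = re.sub(r"<[^>]+>", " ", text)
    # Collapse runs of whitespace into a single space
    return re.sub(r"\s+", " ", text).strip()

def minify_json(payload: str) -> str:
    """Re-serialize JSON without indentation or separator padding."""
    return json.dumps(json.loads(payload), separators=(",", ":"))

raw = "<div>\n  <p>Total revenue:   $1.2M</p>\n</div>"
print(compress_context(raw))  # -> Total revenue: $1.2M
```

Savings vary with how tag- and whitespace-heavy the retrieved context is; the ~15% figure above is an average across the benchmark inputs.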

Happy to answer any questions!