Show HN: Warehouse OpenAI requests to your own database

www.usevelvet.com

18 points by elawler24 1 year ago

Today we’re launching Velvet, an AI gateway for warehousing OpenAI and Anthropic requests to your PostgreSQL instance.

We originally built an AI SQL editor, but realized that customers were using it to monitor their AI requests in production. We had already built an AI request warehousing tool internally to debug our SQL editor and gave some customers access.

A few days into testing this idea, our pilot customer launched [1] and we began warehousing 1,500 requests per second. We worked closely with their engineering team in the following weeks, completely re-architecting Velvet for scale and additional features (such as Batch support). Along the way, other companies began seeking out Velvet to get visibility into their own LLM requests.

We’re launching our AI gateway as a self-serve product today, but our pilot customers are already warehousing over 3 million requests per week - so the system is stable and performant.

What makes Velvet unique is that you own the data in your own database. Also, we’re the first proxy that gives visibility into OpenAI batch calls - so you can observe and monitor async calls that save you money.

Some technical notes:

- Supports OpenAI and Anthropic endpoints

- Data is formatted as JSON and logged to your own PostgreSQL instance (can add support for other databases for paying customers).

- You can include queryable metadata in the header, such as user ID, org ID, model ID, and version ID.

- Built on Cloudflare workers, which keeps latency minimal (using our caching feature will reduce latency overall)

- Built for security + starting process of SOC II soon

Why warehouse your requests?

- Understand where money is spent. Use custom headers to calculate the cost per customer, model, or service.

- Download real request/response data, so you can evaluate new models (e.g., re-running requests with a cheaper mini model)

- Monitor time to completion of batch jobs. (e.g., OpenAI says 24 hours, but our customers average 3-4 hours)

- Export a subset of example requests for fine-tuning

It’s just a 2 line code change to get started.

Try a sandbox demoing the logging proxy here: https://usevelvet.com/sandbox

More details in our docs https://docs.usevelvet.com

[1] https://news.ycombinator.com/item?id=40801494

toomuchtodo 1 year ago

This is great for folks who need to proxy and/or log for compliance and regulatory purposes.

  • elawler24 1 year ago

    Yep, we can warehouse directly to your database which makes it secure by default. It's your data and we're never going to train on top of your logs. Plus there's no data abstraction when you want to switch between models or fine-tune your own models.

samstave 1 year ago

Can we please have this as a VS-Code|Cursor Extension such that All copilot operations flow through a dash like this for a personal version of all your code prompting through vscode/cursor/positron.

Then, in positron, you can just have a viewer that can run and graph the queries in the IDE

  • elawler24 1 year ago

    A VS code / cursor extension is on our roadmap. Ideally this data is embedded in your workflow so developing with LLMs is seamless. With the current product, we can also warehouse requests directly to your PostgreSQL DB so you can query in your IDE. The SQL editor in our app is just one (optional) way to query the data.

    • samstave 1 year ago

      How about you guys have a friendly with InstanDB folks. :-)

      https://news.ycombinator.com/item?id=41322281

      There are so fn many amazing tools being shown on HN recently - Im having ShinyObject Overwhelming Tool Envy constantly.

      I wonder if there is a method where your system can also act as a memory scratchpad?

      Like can I setup rules on how to route the proxied info into my postgres, as you say = so using the InstantDB instaml - I can create rules on how to log and warehouse the promtps.

      Do you guys have bestpractice/schema template ideas for meaningful structure of the prompt /request warehouse?

      (Whats that called? 'request schema'?)

      • elawler24 1 year ago

        If I’m understanding, you’d want to choose which logs to warehouse. That’s a feature we’ll add soon, in case you don’t want to store everything. Like for cost analysis, it’s useful to have every request - but maybe you don’t need to keep them forever.

        We structure request/response logs as JSON so they’re queryable, and can work with any model. This also lets you add custom metadata to the header for any unique identifiers you want to include.

        An InstanDB integration is possible! Try the proxy out and we can add InstanDB support if it helps you build faster.