Show HN: AI Subroutines – Run automation scripts inside your browser tab

www.rtrvr.ai

44 points by arjunchint 2 days ago

We built AI Subroutines in rtrvr.ai. Record a browser task once, save it as a callable tool, and replay it with zero token cost, zero LLM inference delay, and zero mistakes.

The subroutine itself is a deterministic script composed of discovered network calls hitting the site's backend as well as page interactions like click/type/find.

The key architectural decision: the script executes inside the webpage itself, not through a proxy, not in a headless worker, not out of process. The script dispatches requests from the tab's execution context, so auth, CSRF, TLS session, and signed headers get added to all requests and propagate for free. No certificate installation, no TLS fingerprint modification, no separate auth stack to maintain.

During recording, the extension intercepts network requests (MAIN-world fetch/XHR patch + webRequest fallback). We score and trim ~300 requests down to ~5 based on method, timing relative to DOM events, and origin. Volatile GraphQL operation IDs are detected and force a DOM-only fallback before they break silently on the next run.
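To make the scoring step concrete, here is a minimal sketch of ranking captured requests by method, timing relative to DOM events, and origin. It is written in Python for brevity and is not the extension's actual code: the `CapturedRequest` type, the weights, and the one-second window are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class CapturedRequest:
    method: str
    url: str
    timestamp_ms: float  # when the request was sent


def score_request(req, event_times_ms, page_origin, window_ms=1000.0):
    """Heuristic score; higher means more likely to be part of the task.

    All weights are illustrative assumptions, not rtrvr.ai's real values.
    """
    score = 0.0
    # State-changing methods are more likely to be the recorded action itself.
    if req.method.upper() in {"POST", "PUT", "PATCH", "DELETE"}:
        score += 2.0
    # Requests sent shortly after a user DOM event (click, keypress) were
    # probably caused by it; a shorter delay scores higher.
    delays = [
        req.timestamp_ms - t
        for t in event_times_ms
        if 0 <= req.timestamp_ms - t <= window_ms
    ]
    if delays:
        score += 2.0 * (1.0 - min(delays) / window_ms)
    # Same-origin calls usually hit the site's own backend rather than
    # analytics or ad hosts.
    parts = urlsplit(req.url)
    if f"{parts.scheme}://{parts.netloc}" == page_origin:
        score += 1.0
    return score


def trim_requests(requests, event_times_ms, page_origin, keep=5):
    """Keep only the `keep` highest-scoring requests."""
    ranked = sorted(
        requests,
        key=lambda r: score_request(r, event_times_ms, page_origin),
        reverse=True,
    )
    return ranked[:keep]
```

With a click recorded at 1000 ms, a same-origin POST sent at 1100 ms outranks both a third-party tracking pixel and a static asset that loaded before the click.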

The generated code combines network calls with DOM actions (click, type, find) in the same function via an rtrvr.* helper namespace. Point the agent at a spreadsheet of 500 rows, and a single LLM call assigns the parameters and kicks off 500 Subroutines.
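One way to picture the spreadsheet fan-out: the single LLM call only has to decide which column feeds which subroutine parameter, and everything after that is plain data binding. The sketch below assumes that shape; the column names, parameter names, and the `plan_runs` helper are hypothetical and not part of rtrvr.ai's API.

```python
import csv
import io


def bind_parameters(row, column_to_param):
    """Build one subroutine call's arguments from one spreadsheet row."""
    return {param: row[column] for column, param in column_to_param.items()}


def plan_runs(csv_text, column_to_param):
    """Return one argument dict per data row; no LLM is involved here."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [bind_parameters(row, column_to_param) for row in reader]


# Hypothetical output of the single LLM call: column name -> parameter name.
mapping = {"Handle": "recipient", "Note": "message"}
sheet = "Handle,Note\nalice,Hi Alice\nbob,Hi Bob\n"
runs = plan_runs(sheet, mapping)
```

Each entry in `runs` is the argument set for one subroutine invocation, so 500 rows yield 500 deterministic calls after one inference step.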

Key use cases:

- Record sending an IG DM once, then reuse it as a callable routine that sends DMs at zero token cost

- Create a routine that fetches the latest products in a site's catalog, then call it to pull thousands of products via direct GraphQL queries

- Set up a routine that files an EHR form from the tool's parameters; the AI infers the parameters from the current page context and calls the tool

- Reuse a routine daily to sync outbound messages on LinkedIn/Slack/Gmail to a CRM using an MCP server

We think the fundamental reason browser agents haven't taken off is that, for repetitive tasks, going through the inference loop is unnecessary. It is better to record once and have the LLM generate a script that uses every available way to interact with a site and the wider web: calling backend APIs directly, interacting with the DOM, and calling 3P tools/APIs/MCP servers.

saadn92 8 hours ago

I built something like this but much worse. No extension, no recording, I literally sit there with Chrome devtools open, do the action manually, copy the 3-4 network requests into a Python script, and replay them with urllib and a cookie jar.

It's absurd but it works. Gumroad's cover image upload for example, their actual API can't do it, but the browser makes 3 requests (presign to their Rails Active Storage endpoint, PUT the binary to S3, POST the signed_blob_id to attach it). Captured those once in April, been replaying them since. I uploaded covers and thumbnails to 9 products today without opening a browser.

Obviously falls apart the second they change anything.
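For anyone who wants to try the manual approach described above, a minimal sketch of replaying captured requests with urllib and a cookie jar might look like this. The dict shape for a captured request and the helper names are assumptions, and no real Gumroad endpoints are shown.

```python
import json
import urllib.request
from http.cookiejar import CookieJar


def make_opener(jar):
    """Opener that stores and resends cookies across requests."""
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def build_request(captured):
    """Turn one captured request (a plain dict) into a urllib Request."""
    body = captured.get("json")
    # A JSON body is serialized here; raw bytes can be passed as "data".
    data = json.dumps(body).encode("utf-8") if body is not None else captured.get("data")
    headers = dict(captured.get("headers", {}))
    if body is not None:
        headers.setdefault("Content-Type", "application/json")
    return urllib.request.Request(
        captured["url"], data=data, headers=headers, method=captured["method"]
    )


def replay(opener, captured_requests):
    """Send captured requests in order and return the raw response bodies."""
    bodies = []
    for captured in captured_requests:
        with opener.open(build_request(captured), timeout=30) as resp:
            bodies.append(resp.read())
    return bodies
```

In a real chain like presign, upload, attach, later requests depend on values returned by earlier ones, so the list would be built step by step rather than replayed verbatim. Session cookies exported from the browser could be loaded with `http.cookiejar.MozillaCookieJar` instead of starting from an empty `CookieJar`.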

  • arjunchint 5 hours ago

    Yes, exactly. Imagine that now anyone, even non-technical people, can just prompt and interact with this hidden/deeper layer of the web, all in their regular browser!

tim-projects 16 hours ago

If you could take this recording and turn it into a Playwright script, that would be a massive time saver.

Having to redo recordings once they break sounds like too much hassle.

  • arjunchint 16 hours ago

    Hey, that's a great idea, we will look into this export option. But how would it save time by being a Playwright script?

    Right now since we have a custom sandbox to re-execute the code in, we are using our own syntax and exposed methods. So even now you can edit the generated script.

JSR_FDED 1 day ago

Maybe there’s a middle ground where a small local model can roll with the variations in a site that would break a script, while saving the per token costs?

  • arjunchint 20 hours ago

    We found Gemini Flash to be the sweet spot for both agentic actions as well as writing code. Even Flash-Lite is too hit or miss.

    We are thinking through self-healing mechanisms, like falling back to a live web agent and rewriting the script.
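The fallback idea can be sketched as a thin wrapper: run the deterministic script, and only if it raises hand the same parameters to an agent. This is a hypothetical illustration, not the product's design; `script`, `agent`, and `on_heal` are placeholder callables supplied by the caller.

```python
def run_with_fallback(script, agent, params, on_heal=None):
    """Try the deterministic script first; fall back to the agent on failure."""
    try:
        return {"result": script(params), "used_agent": False}
    except Exception as err:
        # The script broke (for example, the site changed), so pay for
        # inference only on this failing run.
        result = agent(params, err)
        if on_heal is not None:
            # Hook where a rewritten script could be saved for future runs.
            on_heal(params, err)
        return {"result": result, "used_agent": True}
```

The happy path stays at zero token cost, and `on_heal` marks the point where a regenerated script could replace the broken one.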

amelius 1 day ago

The problem: I don't trust extensions one bit.

  • notepad0x90 21 hours ago

    auditing the code is fairly straightforward if it isn't obfuscated. so long as it doesn't execute dynamic code that is. but the big issue is you can't control when the extension itself gets an update (to my knowledge). and it isn't uncommon to sell browsing data, or the extension itself to someone more shady than the original author down the road.

    • amelius 16 hours ago

      Yes, this exactly.

  • quarkcarbon279 21 hours ago

    The reason we open up our client-side code is to build trust when you put rtrvr's DOM intelligence in your web apps - https://github.com/rtrvr-ai/rover/tree/main . Our monetization is straightforward, by subscription - https://www.rtrvr.ai/pricing . Extensions that ship anything or sell user data tend to be built as side gigs, whereas we have poured more than a year into building a highly accurate automation engine. We also have cloud sandboxes if you prefer to run the same intelligence in the cloud rather than on your own device.

    PS: Also, our data policy if you are interested: https://www.rtrvr.ai/blog/rtrvr-ai-privacy-security-how-we-h...

daylab 19 hours ago

oh this is clever. running in main world dodges a lot of the usual scraping pain. how do you handle sites with strict csp that block inline scripts, is the extension somehow exempt?

  • arjunchint 18 hours ago

    We execute the code in a sandbox and proxy the fetch calls through main world!

rvz 1 day ago

Aren't there just many ways for the website to just break the automation?

Does this work on sites that have protection against LLMs such as captchas, LLM tarpits and PoW challenges?

I just see this as a never ending cat and mouse game.

  • acoyfellow 1 day ago

    It is. They are saying “we are willing to chase the mouse for you for money”.

  • arjunchint 1 day ago

    The bigger goal is to build and maintain a global library of popular automations. Users can also quickly re-record and recreate the scripts to update.

    Since it runs inside your own browser, there should be no captchas or challenges. On failure it can fall back to our regular web agent, which can solve captchas.

    Big picture, with the launch of Mythos it might just become impossible for websites to keep up, and they will have to go the way of Salesforce and just expose APIs for everything.