Show HN: I built a fast RSS reader in Zig
Well, I certainly tried. I had to, because it has a certain quirk inspired by "digital minimalism."
The quirk is that it only allows you to fetch new articles once per day (or once every X days).
Why? Let me explain...
I want my internet content to be like a boring newspaper. You get it in the morning, read the whole thing while sipping your coffee, and then you're done! No more new information for today. No pings, no alerts; peace, quiet, zen, etc.
But for that, I needed it to be able to fetch all the articles from my hundreds of feeds in one sitting. This is where the Zig and curl optimisations come in. I tried every trick in the book. If I missed something, let me know!
First off, I'm using curl multi for the network layer. The cool thing is it automatically does HTTP/2 multiplexing, which means if your feeds are hosted on the same CDN it reuses the same connection. I've got it configured to handle 50 connections total with up to 6 per host, which seems to be the sweet spot before servers start getting suspicious. Also, conditional GETs. If a feed hasn't changed since last time, the server just says "Not Modified" and we bail immediately.
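In case you're curious, the setup is roughly this (a simplified sketch rather than the exact code from the repo; the libcurl option names are the real ones):

    const c = @cImport(@cInclude("curl/curl.h"));

    fn setupMulti() ?*c.CURLM {
        const multi = c.curl_multi_init();
        // Cap parallelism: 50 connections in total, at most 6 per host.
        _ = c.curl_multi_setopt(multi, c.CURLMOPT_MAX_TOTAL_CONNECTIONS, @as(c_long, 50));
        _ = c.curl_multi_setopt(multi, c.CURLMOPT_MAX_HOST_CONNECTIONS, @as(c_long, 6));
        // Let curl multiplex several transfers over one HTTP/2 connection.
        _ = c.curl_multi_setopt(multi, c.CURLMOPT_PIPELINING, @as(c_long, c.CURLPIPE_MULTIPLEX));
        return multi;
    }

    fn setConditionalGet(easy: ?*c.CURL, last_fetch_unix: c_long) void {
        // The server replies "304 Not Modified" if the feed is unchanged since last fetch.
        _ = c.curl_easy_setopt(easy, c.CURLOPT_TIMECONDITION, @as(c_long, c.CURL_TIMECOND_IFMODSINCE));
        _ = c.curl_easy_setopt(easy, c.CURLOPT_TIMEVALUE, last_fetch_unix);
    }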
While curl is downloading feeds, I don't want the CPU sitting idle, so the moment curl finishes downloading a single feed, it fires a callback that immediately throws the XML into a worker thread pool for parsing. The main thread keeps managing all the network stuff while worker threads chew through XML in parallel. Zig's memory model is perfect for this: each feed gets its own ArenaAllocator, basically a playground where you can allocate strings freely during parsing, and when we're done, we nuke the entire arena in one go.
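The shape of it looks roughly like this (a simplified sketch; Zig's std APIs shift between versions, so treat the exact names as illustrative):

    const std = @import("std");

    // Each feed's parse gets its own arena; a single deinit() frees everything
    // allocated while parsing that feed.
    fn parseFeed(xml: []const u8) void {
        var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
        defer arena.deinit(); // nuke the entire arena in one go
        const alloc = arena.allocator();
        _ = alloc; // the expat callbacks would dupe titles/links/dates from `alloc`
        _ = xml;
    }

    pub fn main() !void {
        var gpa = std.heap.GeneralPurposeAllocator(.{}){};
        var pool: std.Thread.Pool = undefined;
        try pool.init(.{ .allocator = gpa.allocator() });
        defer pool.deinit();
        // In the real reader, this spawn happens inside curl's completion callback.
        try pool.spawn(parseFeed, .{"<rss>...</rss>"});
    }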
For parsing itself, I'm using libexpat because it doesn't load the entire XML document into memory like a DOM parser would. This matters because some feeds, podcast feeds especially, are 10MB+ of XML. So with smart truncation we download only the first X MB (configurable), scan backwards to find the last complete </item> tag, cut it there, and parse just that. Keeps memory usage sane even when feed sizes get massive.
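The backward scan is the simple part; a minimal sketch of the idea (the real code also has to tolerate the now-unclosed channel/rss elements at end of input):

    const std = @import("std");

    // Cut a partially-downloaded feed at the last complete </item> so the
    // parser never sees a half-written entry.
    fn truncateToLastItem(body: []const u8) []const u8 {
        const needle = "</item>";
        if (std.mem.lastIndexOf(u8, body, needle)) |idx| {
            return body[0 .. idx + needle.len];
        }
        return body; // no closing tag found; hand the whole buffer to the parser
    }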
And for the UI I just pipe everything to the system's "less" command. You get vim-style navigation, searching, and paging for free. Plus I'm using OSC 8 hyperlinks, so you can actually click links to open them in your browser. Zero TUI framework needed. I've also included OPML import/export and feed groups as additional features.
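The whole UI layer boils down to something like this (a sketch; the process-spawning API naming varies across Zig versions):

    const std = @import("std");

    // OSC 8 hyperlink: ESC]8;;URL ESC\ text ESC]8;; ESC\
    fn writeLink(writer: anytype, url: []const u8, text: []const u8) !void {
        try writer.print("\x1b]8;;{s}\x1b\\{s}\x1b]8;;\x1b\\", .{ url, text });
    }

    fn pipeToLess(alloc: std.mem.Allocator, rendered: []const u8) !void {
        // -R makes less pass raw escape sequences through to the terminal
        // (recent versions of less understand OSC 8 hyperlinks).
        var child = std.process.Child.init(&.{ "less", "-R" }, alloc);
        child.stdin_behavior = .Pipe;
        try child.spawn();
        try child.stdin.?.writeAll(rendered);
        child.stdin.?.close();
        child.stdin = null; // avoid a double-close in wait()
        _ = try child.wait();
    }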
The result: content from hundreds of RSS feeds retrieved in a matter of seconds, and peace of mind for the rest of the day.
The code is open source and MIT licensed. If you have ideas on how to make it even faster or better, comment below. Feature requests and other suggestions are also welcome, here or on GitHub.
It is a fine project for limiting 'doomscrolling', but I think the premise is wrong.
- I have created my own RSS reader that contains 500+ sources. I do not doomscroll
- doomscrolling appears when a social media algorithm feeds you content, even from a month ago
- I have various filters so I can browse whatever I want
So RSS just needs filters, categories, and extensive search capabilities to solve doomscrolling, while on the other hand still being able to provide you with extensive amounts of data.
The human psychology is intriguing. The more I see "... Zig ..." the stronger the urge to learn and build something with it becomes. In practice, though, I've found that building things with languages I am familiar with is more enjoyable for me, personally.
A severe case of mimetic desire. I suspect a lot of devs suffer from it.
it works like fashion in clothing.
> And for the UI I just pipe everything to the system's "less" command.
I never thought of that as an option. Thanks for the tip haha.
sometimes the simplest solution is the best! i love the unix philosophy:
Write programs that do one thing and do it well.
Write programs to work together.
Write programs to handle text streams, because that is a universal interface.
The 'once a day' fetching limitation is a fascinating idea. It really captures the vibe of reading a physical newspaper in the morning rather than constantly checking for updates. I think many of us could use a tool that enforces a bit of 'digital silence' like this.
In middle school (age 11-13 in the late '90s, USA) I had a hand-me-down Palm Pilot (probably upgraded to a Handspring in there). I'd leave it on my serial(?) port cradle and have it download my daily news from sites like IGN and Slashdot over 56K before I woke up. I was also the kid that regularly read the "Time Life for Kids" mags they'd pass out to us in homeroom. That's the outlet I learned about Napster from and hooked my school onto. Your comment reminded me of those days. I've been desperately hooked on RSS ever since those early days.
ETA: When I was a late teen I ended up managing a bunch of younger teams for a free mod for an indie PC game called Blockland. I had them code up IRC and RSS capabilities into the mod from scratch in the Torque Game Engine's custom TorqueScript. I couldn't believe what those kids were capable of. They all went into programming, engineering, or founding their own companies out of high school and college. If one of them told me something was impossible I'd just tell them that I saw that a competing mod already figured it out. Magically my dudes had a solution really quick lol. Sometimes when you have limited resources and/or experience the old and proven ways are just as good.
Was great when they had all that XML experience in a weird scripting language and I asked them to implement Jabber in-game from my Dreamhost shared-hosting plan. Crazy what a bunch of teens can do for an online Lego-like game.
Thanks for letting this older dude wax nostalgic off the rails. Hope it reminds others on HN about early hacking days like OP's project.
Love it. It’s funny how we are now building modern tools just to try and get back that simple 'Palm Pilot morning read' vibe.
I've been wanting a browser plugin like this for ages. Basically, tell it which sites to limit; then, once a page is loaded, it won't re-load for a certain amount of time, or until the next day (not necessarily 24 hours later). This way there's no reason to keep checking the news; it won't change.
Kagi News does something similar, for what it's worth.
I’m currently evaluating whether I’m happy with Kagi News in my RSS reader compared to separate news outlets. So far it seems to capture all the important bits.
I think it's cool that more people are building what I call "calm tech". More technology should try to serve a purpose quickly and then get out of the way, instead of trying to artificially stay on your screen as long as possible.
Incidentally, I built my own calm RSS reader some time ago that has many similar ideas to yours: https://github.com/lukasknuth/briefly
https://newsboat.org/
The best RSS reader program; I have been using it for years.
nice! yeah, i agree. calm tech is a nice way to put it. the current big platforms are highly tuned to keep people engaged and enraged to the max; rss is kind of the antithesis of that. that's probably why big companies try to bury and hide it. youtube and reddit still give pretty good rss support though, which is nice.
If anyone is interested in RSS feeds, here are mine in an SQLite table:
https://github.com/rumca-js/Internet-feeds
This looks kinda scary to me, considering that Zig isn't a memory-safe language and it's being used here to parse untrusted data from the internet. Would the ReleaseSafe mode that Zig provides prevent attempts to exploit memory safety bugs?
that's a valid concern.
first of all, i'm not trying to reinvent the wheel here. for xml parsing, i'm using libexpat, one of the most widely used c xml parsers.
for networking, i'm using libcurl, the industry standard.
i have some limits in place, too. the feed size is capped at 200 kb, and there are timeouts for hanging connections. there's also a sanitization step that strips control characters that could mess with the terminal emulator, mitigating escape sequences.
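the stripping step itself is tiny, roughly this (an illustrative sketch, not the exact code from the repo):

    const std = @import("std");

    // keep printable bytes plus newline/tab; drop ESC and other control
    // characters so a hostile feed can't inject terminal escape sequences
    fn sanitize(alloc: std.mem.Allocator, input: []const u8) ![]u8 {
        var out = try alloc.alloc(u8, input.len);
        var n: usize = 0;
        for (input) |byte| {
            if (byte == '\n' or byte == '\t' or (byte >= 0x20 and byte != 0x7f)) {
                out[n] = byte;
                n += 1;
            }
        }
        return out[0..n];
    }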
that said, i'm no security expert, and the source code is public. if anyone more knowledgeable spots a security hole, i'd be happy to merge a fix.
Nice work! I did mine in C, using Termbox2, in a very suckless fashion (config.h & make install).
I like the idea of the daily digest.
That gave me a good chuckle:
Consider having a shortcut to load a feed item's comments in the browser, if that's not already there.
Why MIT and not GPL3?
why not? isn't mit just objectively a better license for open source? i just hope rss makes a comeback and makes the internet a little saner again, and if someone wants to use my source code as a base for their own rss reader, whether commercial or not, great!
Copyleft licences "care about the user" as in "as a user, I want you to be able to patch the code you run so I enforce it in my licence". It's a different philosophy from permissive licences that say "companies can use them in their closed, proprietary product, I just want them to mention somewhere that they use my code". Note that more often than not, those using permissive licences don't even bother to follow that simple rule.
As a user, I'm happier with copyleft. I like to take my Marshall smart speaker as an example: that thing doesn't get any updates, ever. But it connects to the Internet. The app absolutely sucks, the connectivity is passable at best (often frustrating), but the hardware itself is nice (it looks nice in my living room and the sound is good when it works).
If all the open source software running inside that thing was GPLv3, Marshall would have to provide me with a way to patch it. So at the very least I could make security updates myself. But because Marshall used permissively-licenced dependencies, they locked it down in such a way that I can't do that.
The permissive licence helped Marshall, but for me as a user, the code may as well be proprietary.
It also has an impact on contribution. In my experience with small open source projects, if I license my library permissively, people will almost never contribute or open source anything. They will gladly ask for bugfixes and features, though.
If I use a copyleft licence (I like EUPL or MPLv2), it doesn't mean that they will open clean PRs, but at least they have to publish their changes in their own fork. It has happened to me that I could go read a fork, find a few things that were interesting and bring them back to my project.
With permissive licences, the risk is that those (typically businesses) who keep their fork open source probably don't see a lot of value in their fork, otherwise they would have made it private, "just in case".
Explain what you mean by "objectively better"? Your response makes it sound like you don't know the difference and are doing it because everybody else does it. It also makes it sound like you don't understand the difference between open source software and free software. Both are free licenses; open source is just one part of it.
The main difference is that GPL3 is a copyleft license, whereas MIT is not, meaning that legally there is nothing in the license preventing a company from taking your code and using it for their own purposes without contributing improvements back.
i know the difference. i use gpl3 in my other project, lue, for example. i meant objectively better for the open source community: the spread of new ideas benefits from the mit license because the ideas in the code can travel farther.
the reason i picked mit is because rss is in a rough spot right now. the tech isn't mainstream, and big companies are trying to squash it since it doesn't drive engagement like the infinite scroll. anything that helps rss move forward is a win, and the mit license makes that easier.
Ok thank you for explaining.
> the reason i picked mit is because rss is in a rough spot right now.
I don't think another client is the solution, just saying. There's about three billion of them out there (though I don't dispute that yours might have something unique).