Ask HN: How close are we to local LLMs being useful? What's the impact?
Feels to me like local models are an under-covered aspect of this whole AI boom.
If everything improves over time, at some point a good chunk of tasks won’t need to be done in data centers or be subject to the whims of a few frontier AI labs.
How close are we to that? Or is my thinking flawed?
I do classification with SLMs and for my tasks when I have a few thousand samples the frontier models in zero-shot and few-shot modes are embarassingly bad in comparison.
I think we're past that point; they're absolutely useful already for a lot of tasks. I think it's about costs, convenience, and benefits of a frontier model for what you're doing.
Which ones are most useful? Any suggestions on where to go to start exploring this world?
A good place to browse is the LocalLLaMa subreddit. [0]
A good software to start is LM Studio [1]. Another popular alternative is Ollama [2].
A better software when you're used to it all is llama.cpp as it's usually a bit faster and more frequently updated [3].
A good place to get models is HuggingFace, particularly the Unsloth models [4]
Most popular models lately to run on "regular" gaming PC's, workstations, Macs etc are: Qwen 3.5 9b, Qwen 3.6 35B-A3B, Qwen 3.6 27B, Gemma 4.
But there are hundreds or thousands of other models and different quantizations, finetunes, etc, etc. Have fun :)
[0] https://www.reddit.com/r/LocalLLaMA/
[1] https://lmstudio.ai/
[2] https://ollama.com/
[3] https://github.com/ggml-org/llama.cpp
[4] https://huggingface.co/unsloth/collections
Local LLMs have been useful since 2024. If you don't know this then you are just far behind. Catch up!
until its cheaper to train and infer than 100k gpu data centers...i doubt it will ever compete.