sothatsit 2 hours ago

I tend to think that the reason people over-index on complex use-cases for LLMs is actually reliability, not a lack of interest in boring projects.

If an LLM can solve a complex problem 50% of the time, then that is still very valuable. But if you are writing a system of small LLMs doing small tasks, then even 1% error rates can compound into highly unreliable systems when stacked together.
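
To make the compounding concrete (a back-of-the-envelope sketch of my own, assuming the steps fail independently):

    # If each step in a pipeline succeeds with probability p,
    # a chain of n independent steps succeeds with p ** n.
    p = 0.99
    for n in (1, 10, 50, 100):
        print(f"{n:3d} steps at 99% each -> {p ** n:.1%} end-to-end")
    #   1 steps at 99% each -> 99.0%
    #  10 steps at 99% each -> 90.4%
    #  50 steps at 99% each -> 60.5%
    # 100 steps at 99% each -> 36.6%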

The cost of LLMs occasionally giving you wrong answers is worth it for answers to harder tasks, in a way that it is not worth it for smaller tasks. For those smaller tasks, usually you can get much closer to 100% reliability, and more importantly much greater predictability, with hand-engineered code. This makes it much harder to find areas where small LLMs can add value for small boring tasks. Better auto-complete is the only real-world example I can think of.

  • a_bonobo an hour ago

    >If an LLM can solve a complex problem 50% of the time, then that is still very valuable

    I'd adjust that statement: if an LLM can solve a complex problem 50% of the time and I can evaluate the correctness of the output, then that is still very valuable. I've seen too many people blindly pass along LLM output. For a short while it was a trend in the scientific literature to have LLMs evaluate the output of other LLMs; who knows how correct that was. Luckily that has ended.

    • sothatsit an hour ago

      True! This is what has me more excited about LLMs producing Lean proofs than written maths proofs. The Lean proofs can be mechanically checked for correctness, whereas the maths proofs require experts to verify them and look for mistakes.
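
      As a toy illustration of what machine-checking buys you (a minimal sketch of my own, not from the comment above): Lean's kernel either accepts a proof term or rejects it, with no expert judgement in the loop.

          -- A trivial theorem; if this proof were wrong, Lean would
          -- refuse to accept it, no human reviewer required.
          theorem add_comm' (a b : Nat) : a + b = b + a :=
            Nat.add_comm a b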

      That said, I do think there are lots of problems where verification is easier than doing the task itself, especially in computer science. In fact, I think it would be easier to list the tasks that aren't easier to verify than to do from scratch. Security is one major one.

mikepalmer 3 hours ago

"LLMs are not intelligent and they never will be."

If he means they will never outperform humans at cognitive or robotics tasks, that's a strong claim!

If he just means they aren't conscious... then let's not debate that any more here. :-)

I agree that we could be in a bubble at the moment though.

sporkxrocket 2 hours ago

I like this article, and I didn't expect to, because volumes have been written about how you should be boring, and how building things in an interesting way just for the hell of it is bad (something I don't agree with).

Small models tackling interesting (boring, to the author) use-cases is a fine frontier!

I don't agree at all with this though:

> "LLMs are not intelligent and they never will be."

LLMs already write code better than most humans. The problem is that we expect them to one-shot things that a human might spend hours, days, weeks, or months doing. We're lacking coordination for long-term LLM work. The models themselves are probably even more powerful than we realize; we just need to get them to "think" for as long as a human would.

  • da_chicken an hour ago

    The issue is one that's been stated here before: LLMs are language models. They are not world models. They are not problem models. They do not actually understand the world, the underlying entities represented by language, or the problems being addressed. LLMs understand the shape of a correct answer, and how the components of language fit together to form one. They do that because they have seen enough language to know what correct answers look like.

    In human terms, we would call that knowing how to bullshit. But just like a college student hitting junior year, sooner or later you'll learn that bullshitting only gets you so far.

    That's what we've really done. We've taught computers how to bullshit. We've also managed to finally invent something that lets us communicate relatively directly with a computer using human languages. The language processing capabilities of an LLM are an astonishing multi-generational leap. These types of models will absolutely be the foundation for computing interfaces in the future. But they're still language models.

    To me it feels like we've invented a new keyboard, and people are fascinated by the stories the thing produces.

  • bigstrat2003 an hour ago

    > LLMs already write code better than most humans.

    If you mean better than most humans across the set of all humans, sure. But they write code worse than most humans who have learned how to write code. That's not very promising for their developing intelligence.

tibbar 2 hours ago

I think this is, essentially, a wishful take. The biggest barrier to models being able to do more advanced knowledge work is creating appropriately annotated training data, followed by a few specific technical improvements the labs are working on. Models have already nearly maxed out "work on a well-defined puzzle that can be feasibly solved in a few hours" -- stunning! -- and now labs will turn to expanding other dimensions.

stephenlf 3 hours ago

Great take. I personally find the thought of spec-driven development tedious and boring. But maybe that’s a good thing.

alberth 2 hours ago

OT: Since the author is a former Apple UX designer who worked on the Human Interface Guidelines, I hope he shares his thoughts on the recent macOS 26 and iOS updates - especially on Liquid Glass.

https://jenson.org/about-scott/

akagusu 2 days ago

I also agree that boring is good, but in our current society you won't get a job for being boring, and when you do get a job, it is guaranteed you are not being paid to solve problems.

  • keyle 3 hours ago

    > and when you do get a job, it is guaranteed you are not being paid to solve problems

    That's just your experience, based on your geolocation and chain of events.

  • com2kid 2 hours ago

    > but in our current society you won't get a job for being boring,

    One can argue that every field of engineering outside of Software Engineering specializes in making complex things into boring things.

    We are the unique snowflakes who take business use cases and build castles in the clouds that may or may not actually solve the business problem at hand.

    • voxelghost 40 minutes ago

      We're everything from the architect to the concrete guy, to the framer, carpenter, sparky, and plumber.

      ... and if it all falls down, don't blame us - you clicked the EULA /s

  • Tagbert 3 hours ago

    One of my main job functions is to watch out for and solve problems.