u8 1 day ago

The disconnect with AI is that it's a jagged frontier, and it only really shines where one of its jagged peaks extends into one of your valleys.

If you've been writing Perl for 30 years, you might not want to learn JavaScript just to make a little fun idea in your head to show your wife. Vibe code that shit man. Who cares? Your wife does not care about LOC or those internal design decisions you made.

If you're trying to learn something new like an algorithm, protocol, or API, write that shit by hand. You learn by doing, and when you know how the thing works and have that mental context, you will always be faster than an AI. Also, when did we stop liking to learn? Why is it a bad thing to know all the ins and outs of a programming language? To write and make all the decisions yourself? That shit is fun. I don't care if you disagree.

If you're at work and they really care about getting something out of the door, do whatever you think is best. If you just wanna ship vibed code and review PRs all day, all the power to you. If you wanna write it by hand, and use AI like a scalpel to write up boilerplate, review code, do PR audits, etc... go for it!

A hammer is a really great tool that has thousands of purpose-designed uses. I still prefer my key to get into my car. It's all tools, you are a person.

A lot of this stuff is coming top-down from people who do not have the experience you do. Wouldn't a smart employee use their expertise to advise the organization? If you work at a company where that would not be okay, maybe it's time to start looking for another firm.

  • rimmontrieu 1 day ago

    Some people actually don't really like to learn new things. If the machine spits out plausible working code, they'd be perfectly happy with that. Personally I think AI is doing a lot more harm than good and I can't wait for the bubble to burst.

    • skeledrew 1 day ago

      Let those who want to learn go learn. And let those who just want something that works well enough without having to learn get it.

    • justech 1 day ago

      I don’t think it’s going to burst the way other people expect. The technology is already out there; when it loses steam, people aren’t suddenly going to stop using it. I predict it’ll be more like the dot-com crash, where companies that can survive the downturn come out dominant.

      • kyykky 1 day ago

        It ends like this: all codebases become unmaintainable spaghetti after agentic AI spends years on them. Then, once every agent in existence has to spend a minimum of 24 hours reading the codebase just to add a simple feature, the software is abandoned.

        • necovek 1 day ago

          I believe most codebases were "unmaintainable spaghetti" even before LLMs: depends on how you define it though.

          To me, it means expensive to evolve.

        • r4nd0m4ch 20 hours ago

          no way it's going to happen

    • hirvi74 1 day ago

      To use an analogy, LLMs are like the Ring of Power in Lord of the Rings. The Ring of Power does not corrupt one nor does it magically turn one evil. Rather, the Ring just serves as a catalyst for what is already inside the bearer.

      Many that wore the Ring had pure and righteous intentions. The thought of "If I were in power, I would..." is exactly the arrogance and corruption the Ring amplifies.

      So, I cannot agree that it is AI doing the harm. Rather, AI just gives us the power to do the harm, the shortcuts, the cheats, etc. we have always desired. And just like the Ring, I believe much of the harm from LLMs often comes from people that started with good intentions, and the power it grants is just too tempting for many.

  • Xeronate 1 day ago

    When I started spending 40-60 hours a week programming and wanted to spend my remaining time doing other things.

    • hirvi74 1 day ago

      I imagine my future will involve spending 40–60 hours a week using LLMs to do the work of multiple roles instead of just one, while wishing I could spend my remaining time doing other things.

  • jesterson 1 day ago

    > Why is it a bad thing to know all the ins and outs of a programming language? To write and make all the decisions yourself? That shit is fun.

    It's not just fun (I agree it is), but it is also essential for creation.

    What we have done with the 'AI' is to create a lot of ignorant morons who think they can create a lot of things without knowledge. This is not gonna end well.

    • zx8080 1 day ago

      > they can create a lot of things without knowledge. This is not gonna end well.

      Who said "managers"?

      • jesterson 1 day ago

        Oh managers are not the biggest evil here. At least they know the basics.

        Now we have an influx of people without a single shred of technical knowledge thinking they can create something.

  • allthetime 1 day ago

    AI is just revealing the two types of people in this line of work. Those who don’t actually like software and just do it because it’s lucrative, and the actual nerds who care.

    • tyyyy3 1 day ago

      Can we build a list of the actual nerds who care? Need it for my future recruitment needs lol.

      • andai 1 day ago

        The benchmark is "do they do it for fun", i.e. personal projects.

        But the real trick isn't "number of personal projects", but how weird they are. There's no "rational" reason to do them, they don't increase the person's marketability / hireability. They are done purely for intrinsic reasons.

        (On reflection, this also seems to be a pretty robust predictor of autism. :)

    • eli 1 day ago

      A much more charitable framing: people who enjoy the process vs people who enjoy the result.

      (Though, granted, the results are a lot better if you craft it by hand)

      • Daishiman 1 day ago

        I am not really sure. I wrote some scripts that aggregated data from several APIs with an LLM, and the LLM had the foresight to create a caching layer for the API responses, as it properly inferred that I would need the results over and over again, as well as using asyncio to accelerate fetching. This would have been a v2 or v3 for me, and it one-shotted it perfectly.
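        As a hedged sketch of the pattern described above (a cache in front of concurrent asyncio fetches), with a made-up `fake_fetch` standing in for the real API calls:

```python
import asyncio

fetch_count = 0  # counts "network" calls so the cache's effect is visible

async def fake_fetch(url: str) -> str:
    # Stand-in for a real API request; a real script would use an HTTP client.
    global fetch_count
    fetch_count += 1
    await asyncio.sleep(0)  # pretend network latency
    return f"payload:{url}"

# Cache the Task rather than the result: concurrent requests for the same
# URL then share one in-flight fetch instead of stampeding the API.
_cache: dict[str, asyncio.Task] = {}

async def cached_fetch(url: str) -> str:
    if url not in _cache:
        _cache[url] = asyncio.ensure_future(fake_fetch(url))
    return await _cache[url]

async def main() -> list[str]:
    urls = ["users", "orders", "users", "orders", "users"]
    # gather runs all five requests concurrently; only two real fetches happen
    return await asyncio.gather(*(cached_fetch(u) for u in urls))

results = asyncio.run(main())
```

        Memoizing the Task instead of the response is what avoids the cache-stampede problem when the same URL is requested concurrently.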

        • anygivnthursday 1 day ago

          Yeah, they are good at applying generic patterns, but often that can be overkill/YAGNI and lead to more maintenance work in places that are fine with a much simpler, more straightforward solution. But this is something the engineer can decide, and with LLMs they won't be forced to make the trade-off based on how long it takes to build, but rather on whether it is really necessary or not.

          • Daishiman 16 hours ago

            For sure, but the engineer will always be ultimately responsible. There's always a QA and review stage in my process for trimming the fat.

        • eli 19 hours ago

          When it works, it feels genuinely miraculous. Working in a common problem space, like gluing together APIs, it generally does well. Doing something novel or even a little complicated, it can really lead you astray.

      • sdevonoes 1 day ago

        But business people always cared only about the result. My PM (who speaks like a salesman) only cares about the results. My “head of”, same. My CEO, same. The only ones who ever cared about the process and quality were us, the engineers… if we don’t have that care, well, to hell with everything.

        • IanCal 1 day ago

          That's not true as a simple statement, many business people really do care about quality and process, and you may find you care much more about them than you think.

          How often have engineers decried yet another rewrite that some project is doing? Or talked about "over-engineering" something that isn't needed, or complained that another person on the team has set up a full kubernetes gitops thing that's glorious to them, when you just want to scp a go binary and be done with it?

          I've seen truly excellent engineers hit this issue, I worked in a team years ago and people disagreed on the approach to take on a new project. So we all made a prototype and presented it, so we could pick a direction. There was a requirement that it be done in ruby since that was the language most of the developers were most fluent in. One of the engineers, remarkably smart, wrote a lisp interpreter in ruby so that technically it'd be "in ruby" but have the benefits of lisp.

          He cared about the quality and process in one area. Deeply. However, focussing on that would have been to the detriment of the rest of the actual product we wanted to ship. If you considered the quality of the product as a whole and the process at the level of the organisation, you'd do something very different.

          Now, none of this means all business people are good at this or long term vision or anything, just as it doesn't mean all engineers have a very narrow focus. But I've seen engineers focus on the quality or engineering of some component without looking at what it is you're actually trying to achieve as a business, and so push for a worse overall process and lower "quality" result. It's the same sort of disconnect that leads a lot of engineers to rail against meetings and PMs that slow them down without seeing from the other side that it's often better to build the right thing more slowly than the wrong thing more quickly.

        • iugtmkbdfil834 23 hours ago

          Assuming it is accurate, the logical conclusion is that the race is over. Management can get their $result, and fast. Now, whether it is good or bad is a separate story, and only time will tell whether they will be forced to learn anything. Right now, the expectation is to push for results, and management seems to ascribe the current set of failures to people not embracing AI enough.

        • eli 19 hours ago

          I think that's a common experience but not universal.

          Just about everyone cares about process and quality when things start falling apart. And at least with current technology, it seems like vibe coding your way into a large project will inevitably land you in that spot.

      • anygivnthursday 1 day ago

        > enjoy the process

        This means different things to different people. Lots of people enjoy the process of engineering solutions with LLM agents, building out tailored skills and custom approaches that make up their own flavour of "agentic" workflow. There are also people who find joy in JavaScript in a way other people cannot understand. And others again love systems languages, or even tinkering with assembly, etc.

        What I wanted to say is that LLM use does not automatically mean people just want to get results faster, there are still nerds enjoying the process of working with these new tools.

      • MrScruff 23 hours ago

        That the results are a lot better crafted by hand I would agree with, if one removes any notion of a time constraint. Sometimes the comparison point is between the LLM-authored software and nothing at all.

    • pdntspa 1 day ago

      Why exactly does "actual nerds who care" stipulate writing code?

    • Daishiman 1 day ago

      I care a lot about software and I use LLMs extensively. There are some things I deeply understand yet I don't care for doing anymore because I've done them for years and there's nothing to be gained from doing them manually.

      • allthetime 19 hours ago

        That’s just you using the tools responsibly. Not using LLMs to perform well-defined, virtually deterministic tasks that you fully understand is simply a waste of time. There’s a big difference between that and just letting agents go wild and do your design for you.

    • okdood64 1 day ago

      I take software engineering and production reliability very seriously. But coding is just a small part of my job. It's not really the meat and potatoes. I'll vibe code (responsibly) where I can.

    • techpression 1 day ago

      It goes for all professions really: there are people who do it for work and people who care. Apply it to any profession, plumbers, doctors, carpenters, cleaners, etc. Most of us have experienced both types, and I haven’t heard of anyone preferring the ”do it for work” kind over the ones who care. And like those other professions, in software we accept the worse of the two, because finding people who care is both time consuming and often much more expensive.

      • andai 1 day ago

        >in software we accept the worse of the two

        and the whole world suffers for it.

    • enraged_camel 1 day ago

      I care about solving problems for and delivering value to my users. The software is simply a means to that end. It needs to work well, but that does not mean every line of code requires an artisanal touch and high attention to detail.

      • ehnto 1 day ago

        I think there's some ambiguity in the discussion around what people mean when they say "good code".

        Good code for a business is robust code that's functionally correct, efficient where it needs to be, and does not cost too much.

        I believe most developers who care about good code are trying to articulate this, they care about a strong system that delivers well, which comes from good architecture.

        LLMs actually deliver pretty well on the more trivial code cleanliness stuff, or can be made to pretty trivially with linters, so I don't think devs working with them should be worried about that aspect.

        What is changing fast is that last point I mentioned, "that doesn't cost too much" because if you can get 70% of the requirements for 10% of the perceived up front cost, that calculus has changed. But you are not going to be getting the same level of system architecture for that time/cost ratio. That can bite you later, as it does often enough with human coders too.

        • MrScruff 1 day ago

          I think the other aspect to this which you allude to at the end is that all of these arguments start with the assumption that all human software engineers produce high quality code that meets the requirements, but obviously that’s very much not the case in the real world. After all, 80-90% of drivers rate themselves as above average.

          If one compares a single competent software engineer directing a number of agents against a random group of engineers (not necessarily working at FAANG or a YC startup), then those quality arguments are going to be significantly less compelling.

        • theshrike79 22 hours ago

          But the trick is that if / when you can define "good code" in a deterministic manner, then the LLM can also deliver "good code".

          But if it's just based on feels, then of course it can't do it because it's not a mind reading machine.

          • allthetime 19 hours ago

            Once you’ve done the work to deterministically define your system, you’re not vibe coding anymore. You’re officially an engineer who cares about the consistency and robustness of your product, not just its superficial outcomes.

            • theshrike79 18 hours ago

              So it's Vibe Engineering then? :)

    • stephenr 1 day ago

      I've posited for a while now that the people who find spicy autocomplete to be exciting are the people who can't really do what it does.

      I played with Image Playground last year some time. It was really fun. You know why? I can't draw, and I can't paint, to save my life. It's letting me do something I can't do well/at all on my own.

      Using an LLM to do something I can do, with the caveat that it's pretty mediocre at the task, and needs to be constantly monitored to check it isn't doing stupid things? If I wanted that I'd just get an intern and watch them copy crappy examples from StackOverflow all day.

      The same logic explains the use of LLMs to write emails and other long-form text.

      It makes accessible something that people otherwise cannot do well. Go look at submissions on community writing sites. The people who write because they're good at it are adamant they don't use an LLM.

      People use LLMs to do things they're otherwise not able to do. I will die on this hill.

      • aquariusDue 1 day ago

        Initially I wanted to write more, but I can boil it down to taste and context mismatch. By that I mean some people see LLM output as tasteless or kitsch (a view I generally subscribe to), and another set of people (overlapping more often than not) hold disdain for, or at the very least look funny at, heavy LLM users, the way gym-goers would look at someone in the middle of the gym loudly suggesting using a dolly or forklift instead of barbell training.

        So yeah, I guess the value of doodles has shot up simply because of optics.

        Somewhere else in this comment section someone tried to broaden the definition of nerd so much that pretty much anybody who is a consummate professional is also a nerd. The hill I will die on is that people don't actually dislike all this new AI stuff, but rather the attitude of people heavily invested in it.

        And to add another data point regarding your hill my drawing/painting moment was NLP stuff. Now if I want to do (rudimentary) sentiment analysis or keyword extraction I can lean on a local LLM. Yet I don't go around yelling Snowball (I think?) is obsolete.

        • stephenr 1 day ago

          > more so the attitude of people heavily invested in it

          Exactly.

          LLM bros are just the new blockchain/crypto bros, but they aren't necessarily even writing their own spruiking comments any more.

      • cpursley 1 day ago

        While you are dying on a hill, with the help of LLMs I'm shipping quality software and features to my customers at a pace I haven't been able to before. And no, not some nextjs slop. If you are letting your LLM look at StackOverflow, you are doing it wrong - it needs to be grounded in your stack's official docs and any other style/rules you prefer, wired up with other tooling like linting/formatting, duplication checking, etc. And yes, you have to constantly monitor the output and review every line of code - but it's still faster, and if managed correctly it produces better code and (this is the hill I will die on) better test suites and documentation than I would have written.

        • stephenr 1 day ago

          > If you are letting your LLM look at StackOverflow, you are doing it wrong

          So you've evaluated all the sources that the model was trained on initially have you? How long did that take you?

          > I'm shipping quality software and features to my customers at a pace I haven't been able to before.

          I'm sorry are you agreeing with me or not? It sounds like you're agreeing with me.

          • cpursley 23 hours ago

            I’m just saying that you can’t just let it rip based on its training alone, it needs to be grounded and harnessed in stack specific tooling.

            • galangalalgol 23 hours ago

              I'd be more general and say it needs verification to guide it, and narrowed scope so it doesn't wander off. How those get provided can vary. While I can do what I'm asking it to do, and have so many times that I don't want to anymore, I can't do it as fast as it can. But as someone said, it is stupid really fast. The bottleneck is now me slowing down this intern who thinks fast, stopping it to redirect it when it does bad things. The more pre-prompting, context, and verification tools I give it, the less I have to do that, so the faster it goes. Then I get to solve the parts of the problem I haven't done until it's boring.

      • MrScruff 1 day ago

        Is your argument that there is no imaginable situation where someone who was competent at software development could find use for a semi-automated tool for writing software?

        That would imply that either the person in question has infinite time, or has access to all software that could ever be of utility to them, which seems unlikely.

        • stephenr 1 day ago

          There's a reason I call it spicy autocomplete.

          • MrScruff 1 day ago

            Which is what?

            • stephenr 23 hours ago

              .... that an IDE providing a suggestion about what comes next as you type is not new, and the entire basis of how an LLM works is "what word probably comes next".

              I'd have thought someone who's so enamoured with the tech would have at least a basic understanding of how it works.

              • MrScruff 23 hours ago

                Indeed. To be honest, I think everyone on HN is aware of how LLMs work at this point, it’s not actually adding a great deal to the discussion to keep going on about autocomplete or ‘stochastic parrots’.

                • ewild 20 hours ago

                  At this point if someone calls it auto complete they can be written off as a Luddite with nothing valuable to say. The irony being they themselves are being a stochastic parrot by parroting the jargon other people say about llms.

      • Kiro 23 hours ago

        "I've posited for a while now" and you post the most lukewarm and outdated take like it's an enlightenment. I've been coding for 20 years and can very well do everything the AI does, and so can all devs I know. We use it because it amplifies us, not because we couldn't otherwise. You've chosen a very ridiculous hill to die on.

    • smugglerFlynn 1 day ago

      You are probably talking about people who just crunch out some half baked solutions for the sake of getting somewhere.

      But there are other nerds who care, just not about the code quality, but about conversion, testing out business ideas quickly, getting to know their customers better.

      There are nerds who care about business strategy.

      There are nerds who care about accounting principles and clean financial reporting.

      There are nerds who care about sales targets and partnerships.

      There are many types of nerds out there. Don’t limit nerds to engineers, because the “tech” world is not just an engineering world anymore. All these nerds you can team up with to build meaningful things, because they do care.

      • 16bitvoid 1 day ago

        They very clearly weren't talking about nerds in general but rather nerds who care about software.

      • bnug 23 hours ago

        This resonates with me. I'm a Mechanical Engineer who loves the process of coding. I did take an intro to business class in undergraduate though, and my professor said one thing that has stuck with me for 30+ years - 'The fundamental goal of a business is to make profit now and in the future'. Vibe coded slop might get some traction and make money now, but high quality code will reduce technical debt and allow it to be made in the future. So, in some ways, both camps are right. The PM/Manager/VP want to make money now, but if they completely disregard the nerdy engineer, they will sabotage their future.

        I see a disconnect between these two camps that will probably cause a lot of chaos in the near future. Those that figure it out will thrive.

        • theshrike79 22 hours ago

          But also time to market matters.

          While Company A is building their product in perfect hand-coded Rust with zero defects, Company B is on their third iteration of vibe'd "slop" and getting actual customer feedback - which helps them iterate further.

          It's mostly a matter of whether Company B is smart enough to refactor the code into a stabler, more maintainable form, or whether they run headlong into a vibeslop wall.

    • XenophileJKO 1 day ago

      This is such a naive take. Most of the nerdiest and most "quality" oriented engineers are hard leaning in to agentic coding. I feel like the most impressive engineers I know have always leaned in to learning how to "sharpen the axe" and AI is really the biggest axe we have seen.

      • allthetime 21 hours ago

        Where did I say they weren’t? We’re all using LLMs now, it would be stupid not to. It’s how we use them and how much care we’re willing to give up for speed that is at issue here.

    • michaelcampbell 23 hours ago

      I think there's a continuum here, too. I've heard it said, in jest, mind, that LLMs square the dev: they turn a 1.5x dev into a 2.25x dev, but they also turn a 0.75x dev into a ~0.56x dev.

      I think the exponent of 2 is probably too high, but it's not a bad approximation of a very messy reality.
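      Taken literally, the quip above is just raising a dev's multiplier to a power. A toy sketch, not a measurement; the exponent `p` is a made-up knob:

```python
# Toy "LLMs square the dev" model. p = 2 is the literal squaring from the
# joke; the comment suggests the real exponent is probably lower. Either
# way, multipliers above 1 get amplified and multipliers below 1 shrink.

def with_llm(multiplier: float, p: float = 2.0) -> float:
    return multiplier ** p

strong = with_llm(1.5)   # 1.5^2  = 2.25
weak = with_llm(0.75)    # 0.75^2 = 0.5625, i.e. ~0.56x
```
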

      There is also the division between people who value the thing being produced and people who value the actual production of that thing, whether or not it's used. I don't see one side here being "right", necessarily, but when a company is behind it, one is certainly more valued, and I think not incorrectly.

    • hgoel 22 hours ago

      Your category of "nerds who care" is actually "nerds who only want to be coders" and not "nerds who care about solving problems".

    • cableshaft 21 hours ago

      I've been a software nerd all my life (and there was a time where I worked 60 hours a week at a startup working hard to make mobile games), but there's just been so much extra crap associated with it (especially web development, and especially corporate web development, what currently pays my bills) over the years that it's worn me down and I'm happy to let A.I. churn through the hard or frustrating or endless amounts of boilerplate bits, and let me focus on other things.

      Part of me still wishes we were making websites with just HTML, CSS, PHP, and a little Javascript here and there (before AJAX). I'm still not convinced all this extra SPA functionality is really needed for most corporate website needs (something like Google maps or real-time chatting, sure, other things not so much), but I do it because they insist.

      I also really like game design, and I had a fairly simple game idea that I prototyped a physical version of and playtested a few times and thought, 'yeah, this is pretty fun'.

      But I don't have the energy to code it in my spare time anymore. I was curious how close to a working MVP it could get, so yesterday I wrote up a specification with the help of ChatGPT (after brainstorming a few aspects of the design) and dumped that spec into a new repo on GitHub. About 20 minutes later, it had a fully functional game that worked exactly like my physical prototype.

      It was still missing other features, like tutorials and stats and sharing abilities and the like, and I'd like to adjust the presentation some, and the computer opponent A.I. was a bit weak and could have been stronger, but it was fully functional and even looked pretty good, kind of like a Wordle presentation, which was what I was going for anyway.

      Something that would have taken me probably 40 hours of dedicated work at least to get everything working and looking as nice as it did.

      So yeah, it's kind of like 'well what's the point of me manually coding this anymore'.

      What I really like about software was solving puzzles, but now I can focus on the more interesting puzzle of what makes a good game design and 'how best to present this to players' instead of how to get five different libraries and/or APIs to play nice together and learn how it all works.

      If coding hadn't become some labyrinthine monstrosity and instead got out of your way, I probably would want to keep coding more.

      Some languages/frameworks get close to that, Lua/Love2D is pretty smooth except when it gets to you wanting to distribute it on platforms other than PC/Mac/Linux, or integrate with external libraries, or for me work with shaders since I'm still pretty weak with shaders.

      But even then, it was hard to deny how much faster A.I. could code a feature and I've started getting more hands-off there as well.

      That being said, work has gotten less fulfilling, since I'm not doing any actual design work really, just implementing features, making them look according to Figma specifications, or fixing bugs. So it's gotten less fulfilling without the busywork of solving coding puzzles (now it's 'how do I say this to the A.I. to get it to fix this right', which is still a puzzle, but a much weaker one). I'm starting to get tempted to make a go of starting my own business so I can have more autonomy again.

    • munksbeer 20 hours ago

      There are more types of people. I do it because it is lucrative, because it turns out I'm good at being a professional software engineer, but I also enjoy it more than other things I could be doing.

      However, ultimately, I got into software because I was intellectually curious and programming was a tool I could use to explore that curiosity. When I stop working professionally, I will stop caring about the sorts of stuff I care about today and go back to using programming for what I love. A tool to explore.

    • avgDev 20 hours ago

      I am a nerd who cared. Caring is not what puts food on the table though; delivering stuff is.

      I still enjoy diving into documentation but AI has transformed how I work. I can quickly get code examples I can debug. I learn new things as sometimes AI generates approaches I haven't used before.

  • ditchfieldcaleb 1 day ago

    I agree with you on everything you said here except:

    > when you know how the thing works and have that mental context, you will always be faster than an AI

    That's just plain false, honestly. No one can type at the speed AI can code, even factoring in the time you need to spend to properly write out the spec & design rules the AI needs to follow when implementing your app/feature/whatever. And that gap will only increase as LLMs get more intelligent.

    • JSR_FDED 1 day ago

      It should be “…you will always be faster than someone _without the knowledge_ using an AI”

    • leostarship 1 day ago

      As I understood it, he's referring to the overall time it takes to build a complete, finished piece of software, accounting for the refactoring and bug fixes and all that. Because had you not understood the tools you're using, you'd keep running into roadblocks, and that adds up.

    • notnullorvoid 1 day ago

      Some of us do actually have intimate knowledge in certain areas where guiding an AI takes longer than doing it yourself. It's not about typing speed. When you know something really, really well, the solution/code is already known to you, or the very act of thinking about the problem makes the solution known to you in full. When that happens, it's less text to write the solution than to write a sufficient description of it for the AI (not even counting the back and forth required to review the AI's output and correct it).

      • bulbar 1 day ago

        Giving a precise description of what the computer is supposed to do is exactly what programming is.

        The more specific your requirements the closer you get to natural language not being useful anymore.

        • kyykky 1 day ago

          I code mostly in APL and J. It’s much faster to type the code than explain everything to AI.

          • satvikpendem 1 day ago

            The exceptions that prove the rule. When your programming language is built up of singular Unicode characters with specific meanings, of course that's faster than typing out in English what you want.

            What do you use them for? For most AI users it's usually CRUD and I've never seen a web server or frontend in APL like languages.

            • noosphr 1 day ago

              The exception is the rule.

              The reason why programming is hard is because most languages force you to use a hammer when you need a screwdriver. LLMs are very good at misusing hammers, and most people find them useful for that reason.

              If you use a sane dsl instead the natural language description of a problem is always more complex and much longer than the equivalent description in a dsl. It's also usually wrong to boot.

              This is what algebra used to look like before variables: https://en.wikipedia.org/wiki/Archimedes%27s_cattle_problem#...

              I don't think you will find anyone who can do better than an LLM at one shotting the prose version of the problem. Both will of course be wrong.

              But I also don't think you will find an LLM that can solve the problem faster than a human with Prolog when you have to use the prose description of the problem.

              • stingraycharles 1 day ago

                Using esoteric programming languages doesn’t suddenly make it true for the majority of development, which is web apps, CRUD stuff, some data science, etc.

                • noosphr 13 hours ago

                  SQL and algebra are not esoteric languages.

              • satvikpendem 7 hours ago

                Who is using APL and J these days? I guarantee 90+% of Claude users are developing CRUD web apps, or something similar. Your point about algebra is a non sequitur to what people are actually developing for these days.

        • gspr 1 day ago

          This is actually my biggest gripe with vibecoding. The single best feature of any programming language is that it is precise. And that is what we throw out?! In favor of natural language, of all things?! We're insane!

          • timschmidt 1 day ago

            It turns out an awful lot of precision (plenty for many things) lives in library and web APIs, documentation, header files, and dependency manifests. Language can literally just point at it without repeating it all. Avoiding mistakes by eliminating manual copying, in things like actuarial and ballistics tables, was what the original computers were built for.

            • stingraycharles 1 day ago

              Custom written code can also point at those APIs and libraries without repeating it all? Or am I missing your point?

              • theshrike79 22 hours ago

                API glue is the easy and boring part of programming. Nobody really enjoys wiring API A to API B, combining the results, and using API C to push it forward.

                Any semi-competent AI Agent can do that with a plan you've written in 5 minutes.

                • skydhash 22 hours ago

                  I would love to see an AI try to make sense of GTK API.

                  I may be wrong, but it seems that when people talk about easy glue code, they mean web service APIs, not OS APIs, not graphics or sound APIs, not file-format libraries, ...

                  • theshrike79 18 hours ago

                    I used Sonnet 3.5 over a year ago to decrypt a notoriously shitty local government API to get data out of meetings, votes and discussions.

                    I know it's a piece of shit API done in the worst possible way on purpose (they don't want openness, but had to fulfill a law that mandates "openness") because I had previously tried to do it manually - twice. I ran out of whisky before I got anything done.

                    Sonnet _3.5_ almost one-shotted it with just the API "documentation" they had and access to Python and curl.

                    People have also hooked stuff into proprietary APIs on "smart" devices with zero documentation, just by having an Agent tirelessly run through thousands of permutations to figure it out.

                  • pydry 16 hours ago

                    Even with web services it usually shits the bed if you ask it to maintain something with tech debt.

          • hgoel 22 hours ago

            That's because very often the precision is just common sense that can be derived, either from general knowledge, or from your existing code.

            • esafak 19 hours ago

              If you had to give precise instructions to someone so they could get anything done you'd call them a junior.

          • robotresearcher 20 hours ago

            Historically we almost entirely moved from ASM to C, a language with lots of undefined behavior, because precision is not the most valued feature of languages.

            • bulbar 15 hours ago

              UB is about edge cases that a compiler shouldn't be forced to check for, and any occurrence of it is always a bug. You don't necessarily need a precise description of the actual faulty behavior.

              • robotresearcher 11 hours ago

                Right. The language has well-formed expressions with no defined meaning in terms of machine instructions. My claim is that this is a reduction in precision compared to assembly language.

                Grandparent said:

                > The single best feature of any programming language is that it is precise.

                C overtook a more precise language family because it has features other than precision that people cared about. Perhaps a better tradeoff of expressiveness and readability with precision.

                Grandparent could be correct, and precision is the best feature of C, despite being less precise than ASM. And its better expressiveness nets out to a better overall programmer experience. I just wanted to point out that precision is something we do trade away for other things we want.

                • gspr 4 hours ago

                  Could you please explain why you feel that having UB makes C less precise than asm?

                  To me, the notion of precision isn't in any way related to whether any given statement is sound. It's about the behavior of the language for sound programs.

            • gspr 4 hours ago

              When I say "best part of any programming language" I obviously mean "best part of the in-spec defined parts of any programming language".

              Your suggestion that because languages have specified undefined behaviour, they are somehow not precise, makes little sense.

      • lmeyerov 1 day ago

        Maybe a failure to automate?

        The volume of people successfully adopting agentic engineering practices suggests this stuff isn't rocket science, but it is a learned skill and takes setup.

        A year into heavy AI coding, my experience is that the kind of expertise you're describing should help you run 5+ agents simultaneously on a project, because you know what you're doing, you set things up right, and you know how to tell agents to leverage that properly.

        • necovek 1 day ago

          You seem to have missed OP's point: some things are only encoded in our brains once we are sufficiently experienced.

          Translating that into code can be done directly by you, or through prompt iterations that need to result in the same or similar code.

          In other words, when it matters how something works and it is full of intricate details, you do not need to specify it, you just do it. As an example (probably not the best one): knowing how to avoid an N+1 query performance issue. You don't need a ticket or spec to be explicit about it; you just avoid it at no extra effort. Models are probably OK at this one, since it's such a pervasive gotcha, but there are so many more like it.
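          For anyone unfamiliar with the N+1 gotcha mentioned above, a minimal sketch in Python with sqlite3 (the schema and data are made up): the first version issues one query per author, the second gets the same result in a single round trip.

```python
import sqlite3

# Hypothetical schema: authors and their posts.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
""")

# N+1: one query for the authors, then one more query per author.
authors = conn.execute("SELECT id, name FROM authors").fetchall()
n_plus_1 = {
    name: [t for (t,) in conn.execute(
        "SELECT title FROM posts WHERE author_id = ?", (aid,))]
    for aid, name in authors
}

# Batched: a single join, one round trip instead of N+1.
batched = {}
for name, title in conn.execute(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY p.id"):
    batched.setdefault(name, []).append(title)

assert n_plus_1 == batched
```

          On a real database over a network, the N extra round trips are what hurt; the point above is that an experienced engineer writes the join version without anyone having to spell it out.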

          • timschmidt 1 day ago

            I think there's a level above that where the words to describe such structure are familiar and readily available and hey guess what? The model understands those too. Just about every pattern has a name. Or a shape. Or an analog or metaphor in other languages or codebases. All work as descriptors.

            • necovek 1 day ago

              This presumes that most of this stays encoded as words in our brains: the effort to translate some of these into words might be similar to translating it into code (still words, just very precise).

              It's like talking legalese vs plain English; or formal logic vs English. Some people have the formal stuff come more naturally, and then spitting code out is not a burden.

              • timschmidt 13 hours ago

                No, it really doesn't presume anything about brains or information encoding. Just points out that there is a level of mastery in which all the techniques and all the forms have names or adequate descriptions. Teachers often attempt to achieve this, to facilitate education.

                • necovek 13 hours ago

                  It's no accident there is an adage from Aristotle in the vein of: "Those who can, do. Those who understand, teach."

                  So yes, there is a level of mastery that is beyond being able to do a good job of designing and evolving complex systems which enables people to teach others the same skill set.

                  However, this is a smaller number of practitioners, and most have learned through practice and looking over how more experienced engineers apply their knowledge.

                  Where I disagree is with the idea that this means everybody is equally capable of teaching with words, or that there are no experts who are bad at teaching (humans or directing AI). The existence of such experts clearly indicates the knowledge is not encoded as words for them.

                  • timschmidt 11 hours ago

                    It's been pretty clear in my experience that experts tend to be capable of working with the same ideas in many different forms. That's what I would call mastery. It implies "complete" knowledge, which probably means several interrelated encodings with loci in different parts of the brain. Those interrelated encodings will be highly associated, and discerning in an expert. Which implies a high degree of usefulness and specificity in communication. This matches my experience.

          • lmeyerov 1 day ago

            That's the failure to automate. The AI isn't telepathic, so agentic engineers who don't automate this stuff are skipping the engineering part.

            You set up the environment and then you do the work. Unless you are switching employers every week, you invest in writing that stuff down so the generation is right-ish, and in generating validation tooling so mistakes are auto-detected and self-repaired.

            • mittensc 1 day ago

              Sometimes you write the feature yourself, and write it well so it's reusable.

              imagine you have to implement a specific algorithm for a quantum computer.

              There's no value in setting up AI to do the writing for you. That might be orders of magnitude harder than writing the algorithm directly.

              For highly specialized one-off features, it doesn't always pay off.

              On the other hand, if all you do are generic items that AI can do well... then I'm not sure you're going to have a job long term; your prompts and automation will be useful to the new junior hires who will specialize in using them and be cost-effective.

              • lmeyerov 16 hours ago

                That feels true in theory, but in practice we see the reverse on advanced projects where AI is helping us a lot. A decent chunk of our core IP falls into the bucket you're describing:

                We have been building a GPU-accelerated graph investigation platform that has grown over 10+ years with fancy stuff all over the place - think accelerated query languages, layout kernels, distribution, etc. R&D-grade high performance engineering projects and kernels end up needing a lot of iterations to make a prototype and initial release. Likewise, they're more devilish to maintain when they need a small tweak later because of the sophistication and bus factor. Both phases benefit.

                AI coding helps automate investigation, testing, measurement, patching, etc. The immediate effect is we can squeeze in many more experimental iterations with more fidelity and reach. Having an AI help automatically explore the design space and the details helps a LOT. And later, maintaining a wide surface area of code here that is delicate to touch and infrequently edited is traditionally stressful for teammates, and AI editing + AI-generated automation is helping destress that a LOT. We very much invest in upgrading our team, processes, and tooling here.

                • mittensc 5 hours ago

                  Alright, thank you! I need to re-evaluate, then.

        • dodu_ 1 day ago

          Maybe you're the exception and are actually doing it right and getting good results. But every time I have heard this, it has been an ignorance-is-bliss scenario: the person saying it is generating massive amounts of code they don't understand, not because they're incapable but because they don't care to, and they immediately wipe their hands of it afterward.

          To give an example of where I hear this: it is indistinguishable from what I hear from my coworkers. "You just need the right setup!" (IMO the actual difference is that I need to turn off the part of my brain that cares about what the code actually does, or considers edge cases at all.) What I actually see, in practice, are constant bugs where nobody ever addresses the root cause; instead it gets paved over with a new Claude mass-edit that inevitably introduces another bug, and we repeat the same process when we run into the next production issue.

          We end up making no actual progress, but boy do we close tickets, push PRs, and move fast and oh man do we break things. We're just doing it all in-place. But at least we're sucking ourselves off for how fast we're moving and how cutting edge we are, I guess.

          I dunno, maybe I'm doing it wrong, maybe my team is all doing it wrong. But like I said the things they say are indistinguishable from the common HN comment that insists how this stuff is jet fuel for them, and I see the actual results, not just the volume of output, and there's no way we're occupying the same reality.

          • human305893 1 day ago

            1. If what you're replying to were a thing, wouldn't there be an open source project where I could see this in action? Or some sort of example I could watch on YouTube somewhere. 2. The people who talk like this in my company spin up new projects all the time and then just hand them off for other teams to clean up the mess and decode what the heck is going on.

            • lmeyerov 20 hours ago

              1. Probably most of https://github.com/simonw , but take care to separate adopted / semi-professional work from exploratory personal work

              2. That sounds like your company has a weak engineering culture and is early in its upskilling journey. We explicitly separate projects into prototypes vs production, where vibes are fine for the former (e.g., demos by designers / data scientists / sales engineers) but traditional code review standards apply to whatever is going into production. That mirrors my qualifier in #1.

              I find that success here is a combination of engineering seniority, prompting experience, and domain experience. Anything lacking breaks the automation loop, like not knowing how and what to automate. E.g., all of our team finds value in AI coding, but junior engineers struggle on these dimensions, so they are not running the 3+ agents that senior ones are.

          • lmeyerov 20 hours ago

            Yes and no

            I've seen productivity surveys of senior programmers that share the reverse, and that matches our experience. A common finding is that gardening projects are a lot cheaper now when they're just a few extra terminal tabs running in parallel - security, refactoring, more testing, etc. Non-feature backlog items that senior developers value around tech debt are less of a discussion now. They're often essential now: to make AI coding work well, there is an effective automation poverty line around verification, testing, and specification that needs to be reached.

            The understanding-code thing is tough. E.g., when a non-senior fullstack developer manually edits frontend CSS code and didn't start from pixel-perfect designs across all form factors, do they really understand what they did? I wrote the first formal mechanized specification of the CSS standard, and would claim 95%+ of web developers do not understand core CSS layout rules to begin with: it was a struggle to semantically formalize even a tiny core of the box model as soon as you have floats. If the AI generates live storybooks and in-tool screenshots of all these things as part of the review process, and the code review "looks good", what's the difference?

            I don't truly think this way - my point is to challenge basic claims of manual coding to be good to begin with and whether AI coding is being held to an artificial standard. What I see in commercial and defense software is a joke compared to what we do in the verification world. AI coding automating review iteration fixes in areas like security engineering and test coverage+amplification has been a blessing for quality improvement.

            More fundamentally, we require developers by default to be responsible for knowing what the code does and having tested it. Every case of relaxing that rule has to be explicit, eg, clear that something is a prototype, or an area is vibed with what alternate review/test flow, and we are learning as a team what that means in different situations. In practice, our senior ai coders are doing more quality engineering work than the manual coders, both per-pr and in broader gardening contributions.

            • dodu_ 12 hours ago

              > do they really understand what they did? ...

              I know you said you don't truly think that way, but to counter anyway since some people seem to legitimately hold this viewpoint:

              I take issue with the implication that not necessarily having a full understanding of what the code/library/driver/compiler/abstraction is doing is somehow justification/permission to embrace and celebrate having basically no understanding of what any of the code is doing. The in-between space there is the vast majority of the surface area where nuance can and should exist.

              >my point is to challenge basic claims of manual coding to be good to begin with and whether AI coding is being held to an artificial standard

              That's fair, and I can only speak for myself here; I don't have any inherent philosophical issue with manual vs AI, but my personal experience is that AI coding is just straight-up a frustrating nightmare to deal with, IMO orders of magnitude worse than manual. It's faster, sure, but I end my rage-filled LLM debugging session walking away knowing I learned pretty much nothing and that there's no compounding knowledge or outcome that will keep me from experiencing the same thing tomorrow, and I hate that. I am Sisyphus rolling prompts into a terminal.

              But I'm not gonna sit here and act like manual coding makes you morally virtuous or pure or whatever. IMO it's a great forcing function to better (even if not completely) understand what is going on in your system(s) and I think most everyone would agree with that. What's up for debate is probably whether that's worth the time tradeoff now that we have a magic time compressor machine available to us.

              Maybe I only find that knowledge tradeoff valuable because I'm a lowly IC and not some super turbo chad 10x principal who built a distributed database in brainfuck 10 years ago for fun and has nothing left to learn, or a technical founder of 5 concurrent startups who is optimizing for business value. It's possible that a heavy bias for learning/skill acquisition blinds me here.

              >we require developers by default to be responsible for knowing what the code does and having tested it. Every case of relaxing that rule has to be explicit

              This sounds pretty reasonable tbh.

        • stephenr 1 day ago

          > successfully adopting agentic engineering practices

          What's your definition of "successfully"?

          More LOC committed per day is probably the only one that's guaranteed when you let spicy autocomplete take the wheel.

          I don't think it's at all possible to reason about the other, more meaningful metrics in software development, because we simply don't have the context of what each human is working on. And, as with the WYSIWYG fad of three decades ago, "success" is generally self-reported by people who don't know what they don't know, and thus don't know what spicy autocomplete is getting woefully wrong.

          "But it {compiles,runs,etc}" isn't a meaningful metric when a large portion of the code in question is dynamic/loosely typed in a non-compiled language (JavaScript, Python, Ruby, PHP, etc).

          • pepperoni_pizza 23 hours ago

            Also, if your boss tells you "we're an AI company now, you will use AI or be fired", then of course you will use AI and claim it is productive.

          • bdangubic 22 hours ago

            If you are on the right team with the right professionals, you can measure. When we first started using LLMs, we decided to run the same process as if they did not exist: same sprint planning meetings, same estimation. We did this for 6 months and saw roughly a 55% increase in output compared to pre-LLM usage. There are biases in what we tried to achieve (it is not easy to estimate that something will take XX hours when you know you won't have to write some portion of it, for example the documentation or parts of the test coverage), but we did our best. After we convinced ourselves of the productivity gains, we stopped doing this. Saying you can't measure something is typical SWE BS, like "we can't estimate" and the other lies we successfully convinced everyone of.

      • larodi 1 day ago

        Care to explain which particular intimate knowledge allowed you, in the last 6-9 months, to be faster than AI in a certain area?

        Honestly, I'm still faster than AI at cooking scrambled eggs, but definitely not faster than either an AI or a compiler at translating stuff into code.

        • tdeck 1 day ago

          Not the parent but I've had this happen when debugging for sure. Sometimes I ask Claude Code to help me debug something and it makes a wrong assumption and just churns in circles burning tokens. While it's doing that I realize the problem and fix it.

          • murkt 1 day ago

            Sometimes debugging is faster indeed, and making small, very focused changes too.

            But during feature development? Not possible. And I consider myself a very fast developer.

            • tdeck 1 day ago

              Don't you find that debugging takes place as part of feature development though?

              • murkt 1 day ago

                What I meant is that I am only sometimes faster than Claude at debugging. When it's a standalone problem, a report in Sentry, and I just know immediately where I need to go to fix it, then it's faster to do it myself than to tell Claude what the problem is and where to look, and wait.

                Bugs happen during feature development, as you say, but then Claude is already in the context and I don't need to tell it where to go; it sees the bug via the failing tests, or something similar.

                BTW. One thing that helps my Claude with debugging harder problems is that I tell it to apply scientific method to debugging. Generate hypotheses, gather pros/cons evidence, write to a journal file debug-<problem>.md, design minimal experiments to debunk hypotheses.

                You can add that as a skill, and sometimes it will pick it up automatically, but it works wonders just as a single sentence in the input.
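                A sketch of what such a skill file might look like (the path and frontmatter follow Claude Code's skill layout as I understand it; the wording here is invented):

```markdown
---
name: scientific-debugging
description: Apply the scientific method when debugging hard problems
---

When debugging a hard problem:

1. Generate several plausible hypotheses for the root cause.
2. Gather evidence for and against each hypothesis.
3. Keep a running journal in debug-<problem>.md.
4. Design minimal experiments that can debunk each hypothesis.
5. Only patch the code once a hypothesis survives the experiments.
```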

          • larodi 1 day ago

            ...but then you ignore all the other times CC got it right. Statistically, I would bet that CC (or Codex, or PI) does it right more often than you would, and is right more often than it's not.

            Besides, it is a system that you query and it responds. I'm sure your DBs are not always 'right' either, particularly when you ask the wrong questions.

        • jasonfarnon 1 day ago

          I interpret "faster than AI" to include writing the prompt. For me (scientific computing) it is more often than not faster to write out a simulation or design in a language I know inside out like fortran or mathematica than explicate the requirements to an LLM to request the code. Obviously if someone wrote out a prompt to me and the LLM it would be way faster, but I don't think that's what the commenter had in mind.

        • Zecc 1 day ago

          If you're good at SQL, or SQL-like languages like LINQ, it might be more efficient to precisely write a reasonably complex query yourself than to try to explain it in detail to an AI.
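          To make that concrete, here's the kind of query in question, sketched against sqlite3 from Python (the employees table and its data are made up): walking a reporting hierarchy with a recursive CTE is a few lines of SQL for someone fluent, but a paragraph of careful English for a prompt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, boss_id INTEGER, name TEXT);
    INSERT INTO employees VALUES
        (1, NULL, 'ceo'), (2, 1, 'vp'), (3, 2, 'eng'), (4, 2, 'eng2');
""")

# Everyone in the VP's reporting chain, direct or indirect.
rows = conn.execute("""
    WITH RECURSIVE reports(id) AS (
        SELECT id FROM employees WHERE boss_id = 2
        UNION ALL
        SELECT e.id FROM employees e JOIN reports r ON e.boss_id = r.id
    )
    SELECT name FROM employees WHERE id IN (SELECT id FROM reports)
    ORDER BY id
""").fetchall()

names = [name for (name,) in rows]
assert names == ["eng", "eng2"]
```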

          • larodi 1 day ago

            I am very good at SQL. I worked half my life with SQL, taught it, and know all kinds of SQL flavours. But good luck getting ahead of AI on a complex query with recursive CTEs, left outer joins, 625-column tables that change semantics depending on certain properties, and then some obscure Oracle package APIs.

            No way you beat an LLM on this, even on trivial queries. LLMs have been better at this since at least 2024; if you haven't noticed, then perhaps you're not doing enough SQL.

            But, of course it took years for people to realize they cannot outpace Visual Studio in the 90s by being very good at x86 assembly.

      • Fr0styMatt88 1 day ago

        Yeah it’s when you go off the happy path that it gets difficult. Like there’s a weird behaviour in your vibe-coded app that you don’t quite know how to describe succinctly and you end up in some back-and-forth.

        But man AI is phenomenal for getting stuff out of your head and working quick.

      • threethirtytwo 1 day ago

        I don't believe this. Either you're lying, or you just haven't caught on with how to use Agentic AI.

        Everything I do to interact with my computer is through an agent now.

        • hatefulheart 1 day ago

          I don't believe this. Either you're lying, or you just haven't caught on how to use a computer.

          Everything I do to interact with my computer is still the same.

          See how boring you are?

          • threethirtytwo 1 day ago

            Ok sorry about that. I seriously don't believe him. The Agent is so fast there's literally no way you can be faster.

            Telling the agent your high-level plan, one you are extremely familiar with, and then having the agent execute on 2000 lines of code is FASTER than executing on those 2000 lines yourself. There is no reality where that can be physically beaten, even by someone typing really quickly with zero pause. Physically impossible.

            Less boring or not? Another way to put it... although my answer is boring, I think I'm right. He is either a liar or, like many other people, lacks skill in using AI... because the transition to AI is happening so fast, not many people are fully utilizing AI to its maximum potential. Many still use IDEs, many still interact with the terminal. Many people still don't use it to configure infrastructure, do database administration, deploy code... etc.

            • dzjkb 1 day ago

              AI can write 2000 lines faster than you, but you can write the 2000 lines correctly on the first shot faster than the AI can do 10 iterations on those 2000 lines, with your guidance, to finally get it right.

              I know that a better plan could mean fewer iterations, but that again extends the time you need to spend on the plan, and therefore the total time of the AI solution.

              • threethirtytwo 15 hours ago

                Right, but those 10 iterations only take up prompt-writing time. When the agent is executing, I move on to other tasks in parallel. AI is faster when you parallelize your workflow.

            • matt_kantor 23 hours ago

              Why are you starting the clock at the time when you already have a "high level plan that you are extremely familiar with"? I think it's fairer to start from "I received a bug report/feature request" or similar.

              Also, haven't you ever had a situation where the prompt you started with ends up being longer than the final code diff? Perhaps a subtle bug that's hard to describe/trigger, but ended up having a simple root cause like an off-by-one error?

              Also also, coding agents are infamous for generating way more code than is strictly necessary. The 2000 lines of code that the agent generated may well have been only 200 lines had you written it yourself.

              • threethirtytwo 15 hours ago

                >Why are you starting the clock at the time when you already have a "high level plan that you are extremely familiar with"? I think it's fairer to start from "I received a bug report/feature request" or similar.

                I've done both. We tag the LLM on Slack in a reply, and the ticket gets created and forwarded to an agent that automatically works on it. The only time a human is in the loop is for review or for requests for changes.

                >Also, haven't you ever had a situation where the prompt you started with ends up being longer than the final code diff? Perhaps a subtle bug that's hard to describe/trigger, but ended up having a simple root cause like an off-by-one error?

                Sometimes. Getting rarer and rarer.

                >Also also, coding agents are infamous for generating way more code than is strictly necessary. The 2000 lines of code that the agent generated may well have been only 200 lines had you written it yourself.

                It depends on the agent, and it's random. This was mostly true probably 5 months ago. It's much less true now.

            • notnullorvoid 21 hours ago

              Again it's not about typing speed. High level plans simply don't work very well, especially for big tasks where the optimal solution actually would take 2k lines. Unless you are building something that is extremely generic, AI coming up with the optimal solution rarely ever happens.

              > He is either a liar or like many other people lacks skill in using AI

              Not a liar, and I'm sorry to say, but AI really doesn't take much skill to use. People who say such statements give me the impression that their ceiling for skills is quite low.

              There are areas where I do and will continue to use AI, and it works well enough. Getting prototypes for projects I don't have a lot of knowledge about is one. But I use those prototypes to learn.

              > configure infrastructure

              I make templates I can copy and tweak to do this faster than it takes to tell an agent what to do.

              > database administration

              Don't do that... Sure get it to write you some SQL to update a table, but don't give it DB admin access for fucks sake.

              > deploy code

              Tell me, how is your agent able to deploy code more effectively than hitting merge on a PR? Or do you simply mean setting up CI/CD for you? That's usually a set and forget thing that doesn't take much time, so I'd rather do it myself.

              • threethirtytwo 15 hours ago

                >Again it's not about typing speed. High level plans simply don't work very well, especially for big tasks where the optimal solution actually would take 2k lines. Unless you are building something that is extremely generic, AI coming up with the optimal solution rarely ever happens.

                Nope. Not universally true. It depends on the randomness of the RNG, the type of task, the agent, and the current state of AI. Right now, for frontier models, what you're saying is true only in a minority of cases, in my experience.

                >Not a liar, and I'm sorry to say, but AI really doesn't take much skill to use. People who say such statements give me the impression that their ceiling for skills is quite low.

                It does take a little skill. Very little, but it requires new habits that are harder to pick up. For example, I never work on one project at a time anymore. I work on 5 projects and context-switch between all of them. Prompt, switch, come back, prompt, switch, prompt, switch, review... etc. That takes getting used to.

                >I make templates I can copy and tweak to do this faster than it takes to tell an agent what to do.

                I have a huge change, and within that change the agent does this automatically.

                >Don't do that... Sure get it to write you some SQL to update a table, but don't give it DB admin access for fucks sake.

                You can fuck off prick, don't fucking talk like that to my face. I do it and I have no problems with it. If you don't want to, that's your own fucking prerogative.

                >Tell me, how is your agent able to deploy code more effectively than hitting merge on a PR? Or do you simply mean setting up CI/CD for you? That's usually a set and forget thing that doesn't take much time, so I'd rather do it myself.

                Because the agent merges for me. Prompt: "Complete task A". Agent: "Task completed", Me: "reviewed and good to go"

                The agent then does its thing. Of course there's always some adjustment and more conversation than this, but that's the gist of it.

      • squidbeak 1 day ago

        > Some of us do actually have intimate knowledge in certain areas where guidance of an AI takes longer than doing it yourself.

        You speak as if AI development is frozen, and you ignore the poster's point:

        > that gap will only increase as LLMs get more intelligent

      • Tuna-Fish 1 day ago

        That doesn't matter. The statement wasn't "faster than AI right now", it was "will always be faster than AI". And that's just nonsense.

        Current AI systems are extremely serial, in that very little of the inherent parallelism of the problem is utilized. Current-gen AI systems run at most a few hundreds of thousands of operations in parallel, while for frontier models, billions of operations could be run in parallel. Or in other words, what currently takes AI 8 hours will take it barely long enough for you to perceive the delay after you release the enter key.

        For a demo, play around with https://chatjimmy.ai/ , the AI chatbot of Taalas, where they etched the model into silicon in a distributed way instead of storing it in RAM and sucking it into the execution units through a straw. It's an 8B-parameter model, so it's unsuitable for complex problems, but the techniques used for it will work for larger models too, and they are working to get there.

        And even Taalas is very far from the limits. Modern better-quality LLM chatbots operate at ~40 tokens per second; the Taalas chatbot operates at 17,000 tokens/s. If you took full advantage of parallelism, you should be able to have a latency of low hundreds of clock cycles per token, or single-request throughput of tens of millions of tokens per second (with a fully pipelined model able to serve one token per clock cycle, from low hundreds of requests). Why doesn't everyone do it like that right now? Because to do this you need to etch your model into silicon, which on a modern leading-edge process is a very involved undertaking that runs hundreds of millions+ in development and mask costs (we are not talking about single chips here; you can barely fit that 8B model into one) and takes around a year. So long as models keep improving so much that a year-old model is considered too old to pay back the capital costs, the investment is not justified. But when it is done, it will not just make AI faster, it will also make it much more energy-efficient per token. Most of the energy cost comes from moving data around and loading/storing it in memory.
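
        The arithmetic above can be sanity-checked with a toy calculation. The clock frequency and per-token latency below are illustrative assumptions, not figures from any real chip:

```python
# Back-of-envelope check of the throughput claims, under assumed numbers:
# a 2 GHz clock and ~100 cycles of pipeline latency per token per request.
clock_hz = 2_000_000_000      # assumed clock frequency
cycles_per_token = 100        # "low hundreds of clock cycles per token"

# One request generating tokens serially, waiting out the latency each time:
single_request_tps = clock_hz // cycles_per_token   # 20 million tokens/s

# Fully pipelined silicon emitting one token per clock cycle overall,
# shared across ~100 in-flight requests:
aggregate_tps = clock_hz                            # 2 billion tokens/s

print(f"{single_request_tps:,} tokens/s per request")
print(f"{aggregate_tps:,} tokens/s aggregate")
```

        Those assumed numbers land squarely in the "tens of millions of tokens per second per request" range claimed above.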

        And I want to stress that none of the above depends on any kind of new developments or inventions. We know how to do it; it's held back only by the pace of model improvement and economics. When models reach a state of truly "good enough", it will happen. It feels perverse to me that people are treating this situation as "there was a pre-AI period that worked like X, now we are in a post-AI period and we have figured out that it will work like Y". No. We are at the very bottom of a very steep curve, and everything will be very different when it's over.

        • u8 20 hours ago

          Huh, I have to say that I am impressed with Chat Jimmy. No doubt the hardware running this model operates faster than any human. If this were possible to scale (and I'm not saying it isn't, I just don't think it's likely right now), LLMs have a real shot at replacing real-time graphics, frontend UIs, and all sorts of interactive media, if the market allows it.

          I still think regardless of how fast a model outputs tokens, it still benefits the person responsible for that output to be well informed and knowledgeable about the abstractions they're piling on top of. If you have deep knowledge, you can operate faster than other people, and make those important decisions in a more intelligent manner than any model.

          Maybe in the end we do get superintelligence and my point will finally break, but at that point I don't think I'll be worried about being wrong on the internet.

      • lukan 1 day ago

        Yes, there are still many areas where skilled humans are faster than AI (meaning faster coding yourself, than providing so much context and guidance that the AI can do it on its "own").

        But in general the statement really isn't true anymore; generic projects/problems have a pretty good chance that the AI can one-shot a working solution from a lazily typed, vague prompt.

    • Turskarama 1 day ago

      In my experience AI can write _something_ from scratch, but often edge cases won't be handled until I go through and read the results or test it. Usually when I'm writing by hand I will naturally find the majority of edge cases as I go. By the time I've read through the results and fixed said edge cases, I usually would have been faster just doing it myself.

      • toponijo 1 day ago

        It also loves to add edge-case handling where it's not needed, and in poorly chosen places.

      • Kiro 1 day ago

        My experience is the opposite: AI takes too many edge cases into account and guards against even the most unlikely things. The upside is that it often handles edge cases that I either didn't think about or was too lazy to implement.

        I can say with full confidence that the code AI writes is more robust and safe than if I had written it myself. The code definitely becomes more bloated, though.

        • strken 22 hours ago

          My experience has been that it wraps all the obvious things, and even some obscure things, in error handling. In this sense it is safer.

          It also fails to write abstractions unless they're carbon copies of a well established pattern, and when abstractions already exist, it needs babysitting to ensure it will use them appropriately. It won't introspect about its current direction unless forced to by the user or by an error, and when forced it will happily "fix" non-issues just because you pointed them out, since it's a happy little yes-man.

          Because of this, code written by a good engineer is more likely to start out broken but converges towards correctness as more abstractions get built, while code written by AI duplicates abstraction layers, leaks between them, and never converges towards anything.

          • cableshaft 21 hours ago

            I've definitely had a lot of these same experiences (in fact I've been fighting it on one particular issue the past couple of days and I'm pretty much just giving up and going back to solving it manually now).

            But it still seems to get it right (or at least close enough to right that I keep using it) more often than it gets into these traps.

      • iugtmkbdfil834 23 hours ago

        This has been my experience thus far. Yes, a complete prototype can be made, but you don't really know until you read the code and test it. Just yesterday, small things came up in terms of Qt screen focus that wouldn't have surfaced save for initial testing.

        I think, and I recognize this goes mostly against the 'agentic' push, that I will stick with slow iteration.

    • charcircuit 1 day ago

      >No one can type at the speed AI can code

      You can definitely be faster than frontier models. The number of tokens per second is not that high and they require a lot of tokens for thinking and navigating things.

      • Aerolfos 1 day ago

        Especially if you use auto-complete AI, ironically. You type a few characters, the line fills out in less than a second, as opposed to a reasoning model that takes maybe a second per 2-3 lines it writes out.

    • gaanbal 1 day ago

      if you've never had the experience of handing something off to someone else being more laborious and slower than doing it yourself, due to having to set constraints and define success, then you simply haven't held a senior enough position to comment on this with any authority

      • andai 1 day ago

        Also employees who work slower than you (and spend most of their time not actually working).

    • erfgh 1 day ago

      Where does this certainty that LLMs will get more intelligent stem from?

      • eloisant 22 hours ago

        They progressed very quickly in the past year. Not just models, but all the harness around them to code.

        When they start plateauing, then we can assume they're done progressing.

    • utopiah 1 day ago

      > No one can type at the speed AI can code

      Don't we already have a weekly post nowadays explaining, again, that typing isn't the bottleneck?

      • Kiro 1 day ago

        Which is still false and not serious. It's one of the dumbest rationalizations I've seen. AI has many flaws but pretending that it's useless because of that is not it.

      • esafak 18 hours ago

        That is not true in startups, where people are getting work done. Maybe in later stage companies where 'stakeholders' are 'synergizing' in meetings over the Q2 roadmap.

    • stephenr 1 day ago

      > LLMs get more intelligent

      The Spicy Autocomplete koolaid club is out in force today I see.

      We clearly have different ideas of what the word "intelligent" means.

      • skinfaxi 1 day ago

        Explaining your idea of intelligent would have been a better comment than name calling and shallow dismissal.

        • stephenr 1 day ago

          Your views might carry more weight if the crux of your rebuttal wasn't manufactured outrage that I used a laughably accurate nickname for a type of software.

    • TeriyakiBomb 1 day ago

      Plenty of cars can get off the line faster than an F1 car. But around a track, an F1 is by far the fastest in the world.

      Going fast isn’t the difficult bit.

    • draxil 1 day ago

      Except it's often faster to make the change yourself than explain it to an AI.

    • jmull 1 day ago

      They probably mean faster to a higher-level goal rather than SLOC. Typing speed and SLOC have never been that useful for measuring productivity.

  • jtr1 1 day ago

    I have been building an iOS app that I had kicking around in my head for years but never had time to build. I have been a frontend UX engineer for the better part of a decade and went through a handful of tutorials on Swift. The project definitely sits in this uncanny valley for me. I have test suites for every aspect of the app and have the agent using TDD to avoid cheating - this has gotten me pretty far without having to look too close at the output other than general structure. As I'm reaching a more mature stage of the project though, I'm finding that I want to tweak a lot by hand in the code to get the details right without burning tokens.

    • throwaway219450 1 day ago

      The agents always do the best work IMO if you already know exactly what you want, but are too lazy to implement it. I like having the agent mock up a working solution before reimplementing it.

      To split the difference, I now try to hand code as much as I can from the beginning, leave TODO comments for the agent to mop up and I'll ask it to complete the issue with reference to the current diff. It reduces the surface for agents to make stupid assumptions. If I can get it done fast on my own, win for me, if the agent finds issues or there's logic that needs checking, also a win. This way you stay sharp, but you have access to an oracle if you get stuck and it costs you fewer tokens.

      • jtr1 12 hours ago

        Yeah, I like the "get out of jail free" card approach. The thing I always used to hate before this era was getting stuck in a hole on something that would take days or worse to grind through. It's nice to drop a little plank bridge across those now.

  • nbvkappowqpeop 1 day ago

    thanks for this take, articulates what i've been feeling towards "AI" without my angst

  • Finbel 1 day ago

    >Also, when did we stop liking to learn? Why is it a bad thing to know all the ins and outs of a programming language?

    I do not know the ins and outs of the assembly my high-level code ends up as. It's not because I don't like to learn; it's because I genuinely don't need to. At a certain level of AI performance, how will this be any different?

    • californical 1 day ago

      Because you may not know the specifics of the assembly being generated, but you’ve likely learned a language built on top of assembly. And the compilers do some great tricks behind the scenes to generate efficient assembly, but those tricks are specifically coupled to semantics of the source language.

      An LLM is not coupled to anything and can generate output that simply does not relate to the input. This doesn’t happen with compilers, and if it does, then it’s a specific bug to be addressed. An LLM can never guarantee certain output based on the input.

      If I write x < 100, I know exactly how the compiler will treat that code every single time, and I know what < means and how it differs from <=

      If I tell an LLM that “I want numbers up to 100.” Will that give me < or <= and will it be consistent every single time, even the ten thousandth program that I write?

      The language is ambiguous where the code is specific
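
      As a toy illustration of that ambiguity (a sketch, not anyone's actual prompt): the English "numbers up to 100" has two defensible readings, while each line of code below pins down exactly one.

```python
# "Numbers up to 100" in English: does that include 100?
# The prose is ambiguous; each line of code is not.
exclusive = list(range(100))   # n < 100  -> last element is 99
inclusive = list(range(101))   # n <= 100 -> last element is 100

print(exclusive[-1], inclusive[-1])  # 99 100
```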

      • Finbel 1 day ago

        To me this is semantics as far as it's related to "why don't you want to learn?"

        I have a co-worker in another team who writes Java endpoints we consume. I can tell him what I need and I trust the output. I don't need to know Java to trust him; it doesn't mean I don't want to learn.

        There are a thousand examples like this across every stack and abstraction level, from SSH handshakes to GPS.

        Sure my co-worker is fundamentally different from a compiler which is fundamentally different from an LLM.

        My argument is that the chain-of-trust where you offload knowledge to an external source is identical. We do it all the time but somehow doing it with an LLM means we no longer want to learn?

    • sdevonoes 1 day ago

      One difference is: to use a top-notch compiler/assembler you don't need to pay. They are open source and have a lot of support. To use the latest and greatest models (because no one around likes to use non-SOTA ones) you need to pay a premium price.

      Multibillion-dollar companies are now the gateway for every line of code you need to write. That's dystopian. It sucks.

      • Finbel 1 day ago

        Yes, but that's a completely different argument (that I agree with). Essentially, yes they are conceptually similar but one is bad because you have to pay rent to use it.

      • Zetaphor 1 day ago

        Local models are increasingly becoming capable of taking on serious coding tasks that I would have previously sent to a frontier lab

    • 0xpgm 1 day ago

      However, curious programmers who develop in high level languages will dabble with assembly maybe for fun, and will be much better off for it than those who treat parts of the stack like a black box never to be opened.

  • imrozim 1 day ago

    100% agreed. I learn coding by building stuff and breaking it; when you let AI do everything, you skip that pain and also skip the understanding.

  • pjmlp 1 day ago

    Except those are the same people who will decide who gets hired, and who gets laid off because of increasing productivity.

    And no, this isn't playing what ifs.

    I have seen it happening with offshoring, migration to cloud, serverless, SaaS and iPaaS products, and now AI powered automations via agents.

    Fewer devops people, fewer backend devs, no translation team, no asset creation team, ...

    I have been laid off a few times, having to do competence transfer to offshoring teams; the quality of the output is something C-suites don't care about at all.

    Do you wanna bet what is behind Microslop, Apple Tahoe bugs and so forth?

  • rufasterisco 1 day ago

    Let’s see if someone can point me towards some resources over the following.

    The problem is mixing vibe-coding and agentic-eng, and switching the brain in 2 different modes (fast-feedback gratification vs deep-focus gratification).

    There’s no clear cut rule on what works. Different people, different brains, and especially amongst devs some optimized low-key neurodivergence.

    And then there’s waiting mode, those N seconds/minutes that agents take to think and write.

    What’s the right mix? Keep a main focused project and … what do you do in the meantime? Vibe code something else? Hn? Social media? Draw lines on a paper sheet? Wood carving? Exercise? Rewatch some old tv series?

    I have experimented….

    There are side activities that help you go back to the task at hand in the correct mental framework for it. Not just for productivity, but for efficiency and enhancing critical thinking on the main task. Or whatever you choose to optimize for. Can anyone point me towards some people talking about this?

  • sdevonoes 1 day ago

    Agree except for this part

    > If you're at work and they really care about getting something out of the door, do whatever you think is best.

    If you don’t mind being jobless, sure do whatever you think is best. Not all of us can simply switch companies easily. Folks need to realise that AI in a company setting works for the benefit of the company, not for the individual.

    • 0xpgm 1 day ago

      But do companies really know how to use AI? I think most of it is experimentation - throwing things to the wall and seeing what sticks.

      It's the practitioner who eventually figures out what really works. I see this the same way the agile movement emerged. It was initiated by people who were hands-on programmers and showed enough benefit at minimizing software waste before it took a life of its own and started getting peddled by people who didn't really understand the underlying principles.

      • dpoloncsak 21 hours ago

        > I think most of it is experimentation - throwing things to the wall and seeing what sticks.

        This is true in macro, but I think we're specifically referring to LLM-generated/assisted code (vibe coding). 'Getting something out the door' is not necessarily a reference to an AI-infused product, just new code written by AI.

  • IanCal 1 day ago

    Fundamentally you need to start with "what am I trying to do?" and "given that goal, where is my time best spent?".

    I made a checklist for my kids to stamp off items after they get back from school (sort bag, get changed, etc). I had two goals: 1) I was trying to solve a problem at home and would have pip installed a library that just straight up did this already, and 2) I wanted to check out what the claude website's output was like at the time. My time was best spent poking at claude a bit but mostly playing with my kids, so vibe coding it was.

    Client test speedup issues: I'm trying to speed up tests for them and spend as little time as possible doing so. Vibe coded some analysis and visualisation tools, mostly AI with some review; guided multiple prototypes for the timing work and let it just fix whatever. More dedicated review went into the actual solutions.

    Learning a new thing - goal is to learn that thing. AI there is good for doing a lot of the work around that. Maybe I'm focussing on, say, Z3. AI there can help with debugging, finding docs, setting up an environment and leave me to do the central part.

  • latexr 1 day ago

    > Also, when did we stop liking to learn?

    I suspect it happened when we achieved a level of such constant stimulation (there is a pocket computer always on us with infinite effortless distraction) that we’re never bored and never engage the default mode network.

    https://en.wikipedia.org/wiki/Default_mode_network

    https://www.youtube.com/watch?v=orQKfIXMiA8

    When you’re bored, your mind goes to places it wouldn’t otherwise go. Curiosity kicks in. Curiosity is a precursor to learning. Learning engages the brain and is fun. But it’s not fun all the time, some of it is challenging and frustrating (which is good, that’s the process that teaches you).

    When you have the digital equivalent to infinite candy and the brain equivalent to a sweet tooth, it’s hard to resist the siren’s call. The consequence is the brain equivalent to a stomachache—depression and loss of meaning—but unfortunately it doesn’t hit you the same way so you don’t make the immediate connection to make yourself stop. When you think about it, it’s ridiculous from several angles: the candy is infinite, it’s never going to run out, so you don’t need to gorge! But then we justify ourselves as only a true addict would, that while the candy is infinite, the flavours are limited editions and always rotating, and what if I miss that really good one everyone is on?! Then you miss it, is the answer. No one will be talking about it in fifteen minutes anyway.

    • AStrangeMorrow 1 day ago

      I still love learning, especially outside of tech. Been working in the ML field for over 8 years, and while I went into it because I liked the field, I did lose some interest in learning things, but mostly because of the sheer volume of publication and the rate of change. Learning stopped being something I enjoyed doing and went to something I had to do to keep up. And it just stopped having the same flavor.

    • gchamonlive 1 day ago

      > it happened when we achieved a level of such constant stimulation (...) that we’re never bored and never engage the default mode network.

      I don't know... I don't disagree, but I think this has been repeated so much that I believe everyone, at least everyone that is actively participating in HN discussions is aware of this.

      So if we are aware of this and we consciously choose to keep engaging in dopaminergic activities, without having some time to be bored, I think it starts to become a choice. We can blame tech for starting this trend of stealing our attention, but once we become aware of this, we can only blame ourselves for perpetuating it.

      • therealpygon 23 hours ago

        Or at least, aware that this argument continues to be made with tenuous evidence and anecdotes. And yet, people are being more productive (actually productive) with AI. Release schedules are increasing, bugs are getting fixed faster, security issues identified and patched sooner, so on and so forth.

        I’m not denying (at all) that unused skills languish. I take issue with AI being characterized as a magic eraser that mystically makes people forget what they have already learned. I’ve just done a study and concluded that dogs get dumber when I throw a ball. What’s my evidence? They stop staring at me to chase it. The ball definitely made them forget who I was, so we shouldn’t allow dogs to have balls anymore.

        Can AI make developers lazy in new ways? Of course! Why wouldn’t it? I don’t write things in ASM because I can be “lazy” and write 50x more useful instructions with a few lines of a modern language. I doubt I’d be able to write working ASM anymore without a serious refresher. Did newer languages erase my memory of ASM and make me “lazy”, or did my efforts evolve to make use of the newest technology regardless of “lost” skills?

        • latexr 23 hours ago

          > this argument continues to be made with tenuous evidence and anecdotes.

          The linked Wikipedia page has plenty of evidence and studies and you can find plenty more with a basic web search. This is not something someone just made up; if you don’t know there are a multitude of studies on the harms of social media, you haven’t looked at all. Which is fine, it’s our prerogative to not search for information, but don’t turn around and say it doesn’t exist or is anecdotal.

          > And yet, people are being more productive (actually productive) with AI.

          You said that, ironically, without providing evidence, in the same paragraph where you complained about evidence not being provided for something else which has plenty of it. Furthermore, there are several studies suggesting AI may in fact decrease productivity, but I’m not going to link to those because the more important point is that AI has nothing to do with the conversation. The original poster mentioned AI, but this branched thread is exclusively about the “liking to learn” part.

          • addedGone 18 hours ago

            Come on... productivity decrease was maybe a thing last year, but this year, with all the tools and models (especially if you stack many accounts + many models), if productivity hasn't increased DRASTICALLY for everyone, I don't know, they must be using it wrong or simply "vibe coding" with Claude Code or basic tooling.

        • kiba 22 hours ago

          > Can AI make developers lazy in new ways? Of course! Why wouldn’t it? I don’t write things in ASM because I can be “lazy” and write 50x more useful instructions with a few lines of a modern language. I doubt I’d be able to write working ASM anymore without a serious refresher. Did newer languages erase my memory of ASM and make me “lazy”, or did my efforts evolve to make use of the newest technology regardless of “lost” skills?

          I would argue that's a misuse of AI. If the point of an engineer is to know how things work behind a piece of software, then shipping code without an understanding of how it all works is a failure.

          You wouldn't trust a bridge that an engineer vibe-engineered, would you?

          So instead of focusing on AI as a productivity tool, focus on AI as a means of adding rigor and understanding to your workflow.

          • g3f32r 20 hours ago

            > You wouldn't trust an engineer a bridge that an engineer vibe-engineered would you?

            If it were as easy to stress test/battery test/materials test/etc. a bridge as it is to test code, then yes, I'd trust an engineer who vibe-engineered a bridge.

            ---

            The problem with mapping digital problems into meat-space is that there is inherently a few orders of magnitude of cost automatically added to anything that happens in meat-space.

            I can spin up an arbitrary number (10, 10k, 500k) docker instances, X with fuzzed inputs, Y with explicit edge cases, Z with tolerance testing, etc etc. And if that doesn't work - I can fix and push a button and it just happens again.

            If a bridge engineer could do that with bridges - yes I'd expect them to be vibing just as hard as we are now.
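
            A minimal sketch of that "many parallel fuzzed runs" idea, done in-process with threads rather than docker; `target` here is a hypothetical stand-in for the code under test:

```python
import random
from concurrent.futures import ThreadPoolExecutor

def target(x: int) -> int:
    """Hypothetical stand-in for the code under test."""
    return abs(x) % 100

def fuzz_one(seed: int) -> bool:
    """One fuzzed run: seeded random input, check the invariant holds."""
    x = random.Random(seed).randint(-10**9, 10**9)
    return 0 <= target(x) < 100

def fuzz_many(n: int) -> bool:
    """Run n fuzzed cases concurrently; True if the property never fails."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        return all(pool.map(fuzz_one, range(n)))
```

            Scaling n from 10 to 500k is a parameter change here; re-pouring concrete is not.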

            • esafak 19 hours ago

              Don't mechanical engineers do that with FEM simulations? Example: https://www.youtube.com/watch?v=tZspM_TvPKQ

              • therealpygon 12 hours ago

                Absolutely. These days engineers use AI and simulation to design new types of engines, jet nozzles, etc. Treating it not like a tool is the mistake, and the assumption many make is that “other people must be making that mistake too”.

            • kiba 19 hours ago

              That's verification. An engineer still understands the bridge and the engineering decisions they made while building it.

          • randallsquared 17 hours ago

            > If the point of an engineer is to know how things work behind a piece of software

            That might be the point of an engineer in some orgs, but mostly the point of an engineer is to ship a product or release that matches someone's vision of what should come next, and doesn't cause additional noticeable problems in the next quarter or three.

          • therealpygon 12 hours ago

            No, but I’d certainly trust an engineer to use software with built-in algorithms to design a bridge instead of using a typewriter, calculator, and a pencil. What you argue is your perception that this means everything is vibe coded and the engineer doesn’t understand any of it. My thought would be that this sentiment says more about your views or how you use it than about someone else. I’m not saying it isn’t possible for you to be right in some circumstances, but rather that you seem to assume your view fits all circumstances.

            Is the point of the engineer to know how all of it works? So one must know the specific details of how every hard-disk driver is implemented, with the algorithms being used, and have checked all the math and inspected all that code, just to be able to “read file”? Do you also argue that million+ line codebases should be inspected through every dependency and every file, line by line, and run through a debugger, before making a single-line change anywhere?

            Seems more like an extensive exercise in self-flagellation on the company’s dime than would be appropriate, but that’s just my opinion.

            We all have our domains. A person can absolutely use AI within their domain and understand its output perfectly as much as having written it themselves.

            The need for knowledge in those domains also changes over time. The need to understand a domain at a depth is directly proportional to the depth of the changes you are making to a stack. I don’t need to know how the hard drive spins to write a program. I don’t even need to understand that hard drives exist to write most programs these days, because it is not an area of concern. All of the implementation and efficiencies happen at a level below, or another below that, or within a trusted dependency. The people working on all those things understand those domains better than I ever could have the time to.

            Maybe you could better explain where vibe-coding significantly differs from the above as another “layer” of separation?

            Look, I’m not arguing vibe coding is “good” in and of itself by any means, but just basically that it’s not all vibe coding, and those of us that understand that don’t really have as much need to argue about the inevitable or that things change.

        • swsieber 22 hours ago

          > And yet, people are being more productive (actually productive) with AI. Release schedules are increasing, bugs are getting fixed faster, security issues identified and patched sooner, so on and so forth.

          I didn't see anything in the parent chain that implied this. Nor did I see it "characterized as a magic eraser"; I saw it framed as something that impedes learning, and that was tied back to constant stimulation.

          > Or at least, aware that this argument continues to be made with tenuous evidence and anecdotes

          The arguments I read and the argument you seem to be replying to seem to be different things.

      • latexr 23 hours ago

        > I think this has been repeated so much that I believe everyone, at least everyone that is actively participating in HN discussions is aware of this.

        I promise you that is incorrect. People who actively participate on HN are a group more diverse than is often given credit, and I strongly believe there is nothing “everyone knows” here.

        https://xkcd.com/1053/

        Just nine days ago, someone on HN was vaguely aware of the idea but did not know it’s called the default mode network. How many more aren’t even aware of the idea?

        https://news.ycombinator.com/item?id=47926043

        Not knowing the name means you’re not aware of all the details, intricacies, studies and ideas pertaining to it.

        Finally, even if everyone knew about it that would still not be reason to not talk about it. Talking and doing something repeatedly is how you create habits and change behaviour. Same way you should still call out when someone does something bad even if “everyone knows they do it”.

        > I think it starts to become a choice. (…) we can only blame ourselves for perpetuating it.

        That is called blaming the victim. There are multiple billion dollar corporations and industries actively working to get you addicted, bombarding you from every side. It’s not a simple choice of “I’m not going to engage”; rather, you have to actively disengage from what’s thrown in your face all the time. It’s exhausting. You’re falling into their trap and repeating the words they want you to. It’s like a supermarket which offers 99% junk and only a tiny section of always the same selection for healthy eating (not a hypothetical, I have several like that nearby) then blaming buyers for not eating more healthily. It’s not a fair choice if you’re constantly pushing and finding ways to trick people into one specific direction.

        And again, not everyone is aware of what is happening. Most people aren’t. And even those who are (which, again, is not even everyone on HN) aren’t immune.

        • gchamonlive 23 hours ago

          Fair enough. It's always tricky to generalize like this, so I won't defend that position.

          However, for those who know, I don't think this is blaming the victim. I think victim blaming is a form of debate simplification in this case, just like "this is life" or "shit happens".

          Sure, there are billions of dollars invested in attention-stealing mechanisms, just as there are billions invested in gambling sites, in alcohol, tobacco and highly processed foods, or in the scamming industry. However, while we as a society need to discuss mechanisms to control and maybe prohibit these practices, a functional adult human being should be expected to create safeguards to protect themselves against this. Maybe the phrasing wasn't the best, but my point stands. Once you are aware of things that aren't good for you, you can really only count on yourself to do something about it.

          • latexr 22 hours ago

            > gambling sites, in alcohol, tobacco and highly processed foods, or in the scamming industry

            Those are great examples because they show that leaving it all up to the individual is not enough. All of those are regulated by the state because we as a society recognised they were doing their damnedest to screw everyone else for their own gain. Social media is going the same route, with several countries already introducing bills to ban them for minors.

            There is another discussion to be had if we’re going about it the right way (I certainly do not support privacy invasion in the form of age checks), but it does show we’re recognising its harm.

            • gchamonlive 21 hours ago

              Exactly, but I still think those are two slightly different conversations. If we are talking about harmful habits at a more general level, I'll defend that we need to be very restrictive in those examples. Online gambling, alcohol and such shouldn't be allowed to advertise, they should pay almost prohibitively expensive taxes, etc...

              But if we are talking about the individual, the one embedded in society at a particular moment in time, the conversation changes. We have to admit that it isn't enough to wait for laws and culture to change in order for the individual to be able to protect him/herself. To be a functional adult is to recognize what's around us that is harmful and do our best to protect ourselves. This is why, if people recognize the harm social media is doing to their attention and to their ability to be bored, they only have themselves to blame if they don't take action, because only blaming the multi-billion dollar industry for the habits it exploits won't do much for the individual.

              • fugalfervor 17 hours ago

                You've clearly never been addicted to anything. You seem to have little understanding of, or empathy for, those who have become addicted.

                I quit smoking cigarettes. It took years. It was incredibly difficult on an emotional level, and took a lot of failure and disappointment to finally make it through. And I almost lost all my progress when I relapsed after my Dad died unexpectedly.

                Every pair of eyes that you see walking down the street has an entire universe behind them that we cannot see. It is not simple like you assume.

                I suggest you recognize your exceptional self-discipline and relatively unaddicted lifestyle as the stroke of good fortune that it is; you are genetically predisposed or developmentally more well-prepared than most. Recognize that others are less fortunate than you in that regard, but no less deserving of aid, comfort, and a legal avenue to seek recompense from unscrupulous actors.

                • gchamonlive 16 hours ago

                  You might disagree with my point, but you don't know me and you can't really lecture me on empathy. You just glossed over the parts of the argument where I am for discussing public policies that we can implement to care for those that suffer from addiction. But at the individual level, after recognizing the hardships of being addicted to anything, the ultimate choice and responsibility to do something is yours.

                  Your cigarette addiction might have started because of social pressure or because of advertisement, but every choice to light another one or not was entirely yours. Just like it's your merit to quit it, it would have been just as well your fault if you kept on smoking after recognizing you needed to quit.

                  • latexr 2 hours ago

                    > Your cigarette addiction might have started because of social pressure or because of advertisement, but every choice to light another one or not was entirely yours.

                    This is contradictory. Once you are addicted, the choice is no longer “entirely yours”. That’s what being an addict means, your physiology and your wants are in conflict and require constant active vigilance to contradict. Your head begins to rationalise and you’ll even forget you wanted to stop. If it were simply “entirely your choice”, addiction wouldn’t be an issue.

                    The advertising and other factors which caused you to become addicted don’t stop after you are addicted. So if you’re willing to admit that external factors may trigger the problem, you must be able to comprehend those factors also contribute to stopping you from solving it. But now you have your own biology as another obstacle.

                    I agree with you that the previous commenter made unreasonable assumptions about you, but I agree with them that at least in this particular conversation you’re not demonstrating empathy for the addict. What you’re essentially saying, repeatedly, is that they’re choosing to be addicts because they don’t simply choose to stop. This is not true, and you’ll quickly realise that if you engage with addicts, especially if they’re someone you knew from before. There is a transformation, addiction turns you into a different person you don’t always recognise.

                    • gchamonlive 1 hour ago

                      I am rereading my comments and you guys are right. What I meant to say, before getting sidetracked doubling down on the "your fault" argument, is that you can't help someone who doesn't want help. In this sense it's the responsibility of the person with an addiction to first recognize that they need to want help; only then is help effective. But yeah, you are right that you are like a different person during an abstinence episode.

                      Wish I could rephrase my comments, but at least I know next time I will treat this topic with more care.

          • kiba 22 hours ago

            We do all we can as individuals, but it's not enough. The obesity crisis goes on unabated, except for GLP-1 drugs cleaning up the mess.

            • gchamonlive 21 hours ago

              We can also apply regulations, but they are also not enough; otherwise people wouldn't OD on controlled substances. At some point the individual needs to start taking responsibility for their actions.

              • fugalfervor 18 hours ago

                And the unscrupulous purveyors of addictive products should likewise take responsibility for their actions. How often do you see that happen, though?

                • gchamonlive 17 hours ago

                  Not nearly enough, which is why as individuals we need to be twice as vigilant.

      • rickdeckard 23 hours ago

        > So if we are aware of this and we consciously choose to keep engaging in dopaminergic activities, [..] I think it starts to become a choice.

        ...or a subtle addiction that also creates the impression of productivity/progress/social interaction...

        If so, then all applicable studies on addiction should be taken into consideration as well, but their context probably doesn't even begin to cover the size of the issue here.

      • lvales 23 hours ago

        Why don't addicts choose to stop their addictive behaviour?

        And this isn't an excuse btw, but if you want to understand why, this is a good place to start.

        • gchamonlive 21 hours ago

          If they don't want to, go for it. I'm all up for the freedom to choose your poison, as long as it doesn't restrict someone else's freedom of choice (like jumping off a building and landing on someone, killing you both). What I'm saying is that if you recognize booze is bad for you, but you don't do anything about it because heck, there is a billion dollar industry behind it, everyone drinks and you'll die anyway, IDK, it seems to me like it's mostly your fault, because you'd know where to get help if you really wanted to. That is, of course, assuming where you live has good policies for treating people with such diseases.

      • _DeadFred_ 17 hours ago

        You are pitting your randomly acquired willpower, and your in large part unintentional stumbling through life, against all of humankind's psychology knowledge and against billions of dollars spent on advertising and advertising research. That is, at this point, tens, maybe hundreds, of millions of years of accumulated human knowledge of how to manipulate you, versus your very randomly acquired 'willpower'.

        Have you seen the quotes coming out of the richest/most powerful companies on the planet? These are very intentional impacts by companies more powerful than entire nations.

        I don't think 'but your willpower' stands a chance if you want to be connected to the modern world.

        • gchamonlive 13 hours ago

          So if you are helpless in a world dominated by the billionaires and the clerics of psychological research, what's the hope for the average person? Should we just accept our fate and waste away in endless Instagram feeds, alcohol, drugs, gambling and all forms of addiction? These need to be managed at society level, banning or taxing goods. But no amount of regulation can compensate for someone determined to destroy himself. So yeah, it's your fault. But it's also society's fault. And at the end of the day the most effective thing you have is your choice.

    • theshrike79 22 hours ago

      > When you’re bored, your mind goes to places it wouldn’t otherwise go. Curiosity kicks in. Curiosity is a precursor to learning. Learning engages the brain and is fun. But it’s not fun all the time, some of it is challenging and frustrating (which is good, that’s the process that teaches you).

      And I love how I can go from a curious brainfart "hmm, could I do a movie catalogue app that uses a web page + phone camera + OpenAI API to identify physical DVDs by front/back cover instead of trying to find a reliable barcode database" to it actually working in maybe two hours of real time. Just paused the movie I was watching, typed the idea to Claude Code on mobile and kept watching.

      After the movie went back to my computer, merged the changes and tested whether it worked. It mostly did. The UI/UX was horrible etc, but the basic idea was functional. It even got some of the movie extras correctly.

      I didn't try to turn it into a product, didn't buy a domain for it or advertise it on Reddit or Show HN. But now I know it CAN be done. Curiosity sated.

      • latexr 22 hours ago

        I don’t see what that has to do with “when did we stop liking to learn”, which is the only point I’m addressing. My point has nothing to do with AI and it doesn’t seem like you actually learned anything from that experiment.

        • swsieber 22 hours ago

          I read it more as response to your argument that it was a lack of curiosity due to over stimulation, which they responded to by citing an example of a time when they were curious while stimulated and chucked something at a vibe-coding agent to satisfy that curiosity.

          • theshrike79 17 hours ago

            Yep, now I have the time to satisfy my curiosity in weird things because it doesn't take active time from me that much.

            Before all those curiosity brain farts would've just slipped away with "oh well, I'll never have the time to find out".

            Wikipedia has scratched the same itch multiple times, but from a different angle. Doing deep dives in weird stuff is so much easier with Wikipedia than it used to be when you had to go to the library and pull books from the shelves for an hour or two.

    • madduci 22 hours ago

      We also stopped learning when someone had the idea to put unrealistic deadlines on projects, and tackling tech debt became a denied request and the activity management hates most.

  • maxsilver 1 day ago

    > Also, when did we stop liking to learn?

    When the economy got so bad for so many people, that every waking moment has to be either chasing fresh cash (or spent in recovery from cash-chasing, worrying about new cash), to the point they have to largely ignore their own long term goals or basic morals or principles.

    You can blame all the new gadgets (phones/social media/tiktok/‘dopamine-things’) — but it’s a very much blaming the symptom, not the problem.

    (It’s the meme. “Guys, this isn’t funny. Humans only do this when they’re very distressed”)

  • dpoloncsak 22 hours ago

    Just here to say I love the line 'A hammer is a really great tool that has thousands of purpose-designed uses. I still prefer my key to get into my car.'

    Been saying 'A hammer is a great tool, but you need to know when to use it, just like AI' to coworkers, and I'm ̶s̶t̶e̶a̶l̶i̶n̶g̶ borrowing your quote instead now.

  • miki123211 20 hours ago

    In my view, AI is worst at crossing the Rubicon from a 200-line script to a maintainable architecture of ~10kloc.

    If you already have a decent architecture, adding a new feature is usually fine. If you have nothing and need it to write a 200-line script, that's usually fine. If you need it to figure out a maintainable architecture that will be easy to extend in the future... that's where the problems start.

    • esafak 19 hours ago

      You need to be involved in the architecture.

  • throwanem 19 hours ago

    > Also, when did we stop liking to learn?

    When it got dangerous to spend that kind of time without a bullet-point deliverable.

  • kreneskyp 19 hours ago

    > If you're trying to learn something new like an algorithm, protocol, or API write that shit by hand. You learn by doing, and when you know how the thing works and have that mental context, you will always be faster than an AI. Also, when did we stop liking to learn?

    I vibe engineer to learn. I am currently doing this with a project to build a Vector DB extension in postgres. Several aspects of this project are very new to me. I don't write any of the code. I have never written a single line of Rust. I do, however, spend a significant amount of time discussing architecture and design with the agents.

    I started with well known algorithms (HNSW, IVF, DiskANN, TurboQuant, RabitQ, PQFastScan) and have since moved on to a novel implementation based on fairly recent research papers.

    My primary goal is to learn. That is a success and ongoing. A stretch goal is to contribute novel ideas back to the community, which may be useful even if what I build isn't ever production ready.
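
    For readers unfamiliar with the algorithms named above, the core idea of an IVF (inverted-file) index can be sketched in a few lines: partition the vectors into buckets around centroids, then search only the buckets nearest the query. This is a toy Python sketch, not the commenter's Rust extension; the random-centroid "training" is deliberately naive, and it illustrates why results are approximate when you probe fewer cells than exist:

```python
import math
import random

class ToyIVF:
    """Toy inverted-file (IVF) index: assign each vector to its nearest
    centroid, then search only the nprobe buckets closest to the query."""

    def __init__(self, vectors, ncells=4, seed=0):
        rng = random.Random(seed)
        # naive "training": use randomly sampled data points as centroids
        self.centroids = rng.sample(vectors, ncells)
        self.buckets = [[] for _ in range(ncells)]
        for v in vectors:
            cell = min(range(ncells),
                       key=lambda c: math.dist(v, self.centroids[c]))
            self.buckets[cell].append(v)

    def search(self, query, k=1, nprobe=2):
        # probe only the nprobe cells nearest the query; this is the IVF
        # speedup, and also why recall drops when nprobe < ncells
        cells = sorted(range(len(self.centroids)),
                       key=lambda c: math.dist(query, self.centroids[c]))[:nprobe]
        candidates = [v for c in cells for v in self.buckets[c]]
        return sorted(candidates, key=lambda v: math.dist(query, v))[:k]

# toy usage on synthetic 2-D points
vectors = [(i / 10.0, (i * 7 % 13) / 13.0) for i in range(50)]
index = ToyIVF(vectors, ncells=4)
nearest = index.search((0.5, 0.5), k=3, nprobe=2)
```

    With nprobe equal to the number of cells the search is exhaustive and exact; shrinking nprobe trades recall for speed, which is the whole IVF bargain.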

  • dirtbag__dad 11 hours ago

    > Also, when did we stop liking to learn?

    Says who? One of the most enriching things about coding with agents is I have them provide new information, tools, patterns, whatever as a follow up to every feature I work on. I’m learning a ton and it’s helping me build better with agents, too.

etothet 1 day ago

Vibe Coding (and LLMs) did not create undisciplined engineering organizations or engineers. They exposed and accelerated them.

Plenty of engineers have loose (or no!) standards and practices for how they write code. Similarly, plenty of engineering teams have weak and loose standards for how code gets pushed to production. This concept isn't new; it's just a lot easier for individuals and teams who have never really adhered to any sort of standards in their SDLC to produce a lot more code and flesh out ideas.

  • datsci_est_2015 1 day ago

    Bad engineers continue being bad, good engineers continue being good.

    I personally don’t know any colleagues who were good engineers just because they wrote code faster. The best engineers I know were ones who drew on experience and careful consideration and shared critical insights with their team that steered the direction of the system positively.

    > Claude, engineer a system for me, but do it good. Thanks!

    • embedding-shape 1 day ago

      > I personally don’t know any colleagues who were good engineers just because they wrote code faster

      Same, if anything, the opposite seems to be true, the ones that I'd call "good engineers" were slower, less panicked when production was down and could reason their way (slowly) through pretty much anything thrown at them.

      And the opposite experience: I've sat next to developers trying their fastest to restore production and making more mistakes that made it even worse, or developers who rush through the first implementation idea they had for a feature, failing to consider so many things, and so on.

      • ryandrake 1 day ago

        > Same, if anything, the opposite seems to be true, the ones that I'd call "good engineers" were slower

        Unfortunately, a lot of workplaces are ignoring this, believing their engineers are assembly line workers, and the ones who complete 10 widgets per minute are simply better than the ones who complete 5 widgets per minute.

        • nathan_compton 1 day ago

          It isn't just that they believe this - they want a business model where this is how it works. For a big company a star coder is a liability - they have strong labor power, they can leave and they are hard to replace, etc.

          Companies want workflows that work with mediocre programmers because they are more like interchangeable parts. This is the real secret to why AI programming will work in a lot of places. If you look at the externalities of employing talented people, shitty code actually looks better than great code.

          • ryandrake 1 day ago

            To these kinds of companies, what's even better than a rack of mediocre programmers? AI agents that you can just conjure up and prompt. They take up no facility space, don't require lunch breaks or vacations, obey all commands and direction, and produce a predictable and consistent amount of output per dollar.

            This is the earworm the leaders of these companies have allowed into their minds. Like Agent Mulder, they Want To Believe in this so badly...

            • overfeed 1 day ago

              > This is the earworm the leaders of these companies have allowed into their minds. Like Agent Mulder, they Want To Believe in this so badly...

              If you assume they are not idiots and analyze the FOMO incentives via a little game-theory, it becomes clear why.

              Assuming the competition has adopted AI, leadership can ignore it, or pursue it. If they adopt it, then they are level with the competition whether AI actually succeeds or fails - they get to keep their executive job.

              If leadership ignores AI, and it actually delivers the productivity gains to the competition, they will be fired. If they ignore AI and it's a bust, they gain nothing.

              • untrust 1 day ago

                What if the outcome is the competition burns their money on LLM usage for little to no gain? If you're an exec and you jumped into LLMs as well then you also lose any advantage you would have had by saving your money or hiring a few more humans.

                • overfeed 1 day ago

                  > What if the outcome is the competition burns their money on LLM usage for little to no gain?

                  The company does better than the money-burning competition, but the executives personally gain nothing; there are no bonuses just because the competition took a misstep.

              • m4x 1 day ago

                If AI turns out to be a bust, ignoring it could become a significant win. One possible outcome of AI adoption is that existing code bases are degraded, and existing programmer capability is allowed to atrophy. In that situation, companies that adopt AI lose out relative to companies that eschew it.

            • sanderjd 1 day ago

              Yeah but does this work? Are there companies doing this successfully?

          • datsci_est_2015 1 day ago

            Glad I find myself employed under a division called Research and Development. Poaching and retaining highly compensated individuals is the entire purpose.

          • anal_reactor 1 day ago

            Bingo. This is something that many people fail to understand.

            • hyperadvanced 1 day ago

              I think you can understand that line of reasoning, but you can question its feasibility. You might not have any “star coders”, nor need them day-to-day, but I think the cost of not having one true expert, or having a completely vibe coded system that crashes in production will be extremely high.

          • duskdozer 1 day ago

            It's also true that a lot of times, it doesn't even matter how shitty the code is. For example, I'm locked in to a company whose web "app" hasn't functioned for me for the vast majority of the last two to three years. I can't leave without effectively being required to leave my job. So, they still get my business.

      • sanderjd 1 day ago

        This is true. But I find AI tools to be a huge help for all of this. Not to do any of it faster, but to remove a bunch of the tedium from the process of testing ideas and iterating on them. Instead of "I wonder if the problem is..." requiring half an hour of research, now I can do an initial check of that theory in less than a minute, and then dig further, or move onto the next one. Or say I estimate it's gonna take me an hour or more to test an idea, I might just decide I don't have time to invest in that. Well now maybe I can get a tentative answer on that by spending a minute laying out the theory and letting an agent spend ten or twenty minutes on it in the background. In this way I can explore space I just would have determined was not worth the effort previously.

        To me, none of this feels like "going faster", it feels like "opening up possibilities to try more things, with a lot less tedious work".

        • skydhash 1 day ago

          Have you ever wondered how people do it without it being tedious for them?

          For things that have a visual element, like UI and UX, you can start with sketches (analog or digital), eliminate the bad ideas, and refine the good ones with higher-quality renderings. Then choose one concept and implement it. By that time, the code is trivial. What I found with LLM usage is that people will settle on the first one, declaring it good enough, and not explore further (because that is tedious for them).

          The other kinds of problems mostly fall into three categories (mathematical, logical, or data/information/communication). For the first type you have to find the formula, prove it is correct, and translate it faithfully to code. But we rarely have that kind of problem today unless you’re in a research lab or dealing with floating-point issues.

          The second type is more common: you are enacting rules based on axioms originating from the systems you depend on. That leads to the creation of constraints and invariants. Again, I’m not seeing LLMs helping there, as they lack the internal consistency required for this type of activity. (Learning Prolog helps in solving that kind of problem.)

          The third type is about modeling real-world elements as data structures and designing how they transform over time and how they interact with each other. To do it well, you need deep domain knowledge about the problem. If an LLM can help you there, that means one of two things: a) your knowledge is lacking and you ought to talk to the people you’re building the system for; b) the problem is solved and you’d do well to learn from the solution. (Basically what the DDD books are all about.)

          Most problems are a combination of subproblems of those three categories (recursively). But from my (admittedly small amount of) interactions with pro LLM users, they don’t want to solve a problem, they want it to be solved for them. So it’s not about avoiding tediousness, it’s sidestepping the whole thing.
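
          To make the "constraints and invariants" of the second category concrete: an invariant is a property re-checked after every state change, and maintaining it is exactly the kind of internal consistency at issue here. A hypothetical toy sketch in Python (the Inventory class and its rule are invented for illustration, not taken from the thread):

```python
class Inventory:
    """Toy example of rules derived from an axiom.
    Axiom: stock counts can never go negative.
    Every mutation re-checks that invariant."""

    def __init__(self):
        self._stock = {}

    def _check_invariant(self):
        # the invariant, enforced after every state change
        assert all(n >= 0 for n in self._stock.values()), "negative stock"

    def receive(self, item, qty):
        self._stock[item] = self._stock.get(item, 0) + qty
        self._check_invariant()

    def ship(self, item, qty):
        # the constraint guards the mutation so the invariant can't break
        if self._stock.get(item, 0) < qty:
            raise ValueError(f"cannot ship {qty} x {item}")
        self._stock[item] -= qty
        self._check_invariant()

inv = Inventory()
inv.receive("widget", 5)
inv.ship("widget", 3)
```

          The point of the pattern is that the rule lives in one place and is enforced mechanically, rather than being something each caller (human or LLM) must remember to respect.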

          • sanderjd 21 hours ago

            I've been doing this for a couple decades. I don't wonder how people did it before AI, I did it for years and years before any of this existed...

            > What I found with LLM usage is that people will settle on the first one, declaring it good enough, and not exploring further (because that is tedious for them).

            I don't relate to this at all. It's so much easier (and less tedious) to experiment and iterate now. I see people doing a lot more of it, not less.

            AI tools are also excellent aids to all the other types of problems you elucidated. You're doing theorycraft, and I might even agree with you if I just sat down and theorycrafted out how I thought this would work for each type of problem as you're doing here. (Indeed, you can probably find HN comments I made in 2022 and 2023 that say very similar things as you're saying!)

            But in practice, I find all your theories here about why AI tools are not useful in this or that case to just be totally wrong.

    • jkaptur 1 day ago

      > I personally don’t know any colleagues who were good engineers just because they wrote code faster.

      However, the best engineers I know are usually among the quickest to open an editor or debugger and use it fluently to try something out. It's precisely that speed that enables a process like "let's try X, hmm, how about Y, no... ok, Z is nice; ok team, here are the tradeoffs...". Then they remember their experience with X, Y, and Z, and use it to shape their thinking going forward.

      Meanwhile, other engineers have gotten X to finally mostly work and are invested in shipping it because they just want to be done. In my experience, this is how a lot of coding agents seem to act.

      It's not obvious to me how to apply the expert loop to agentic coding. Of course you can ask your agent to try several different things and pick the best, or ask it to recommend architectural improvements that would make a given change easier...

      • skydhash 1 day ago

        > However, the best engineers I know are usually among the quickest to open an editor or debugger and use it fluently to try something out

        The Pragmatic Programmer book has whole chapters about this. Ultimately, you either solve the problem in analog form (whiteboard, deep thinking on a sofa), or you get fast at trying things out AND keeping the good bits.

      • datsci_est_2015 1 day ago

        Or: depth-first search of the solution space vs breadth-first (or balanced) search of the solution space.

        > Of course you can ask your agent to try several different things and pick the best, or ask it to recommend architectural improvements that would make a given change easier

        The ideal solution increasingly seems to be encoding everything that differentiates a good engineer from a bad engineer into your prompt.

        But at that point the LLM isn’t really the model as much as the medium. And I have some doubts that LLMs are the ideal medium for encoding expertise.

      • beacon294 1 day ago

        As you practice, it will become apparent: you simply keep working on the application architecture yourself.

      • sanderjd 1 day ago

        I really don't relate to this...

        The way you apply the expert loop is to be the expert. "Can we try this...", "have you checked that...", "but what about...".

        To some degree you can try to get agents to work like this themselves, but it's also totally fine (good, actually) to be nudging the work actively.

      • Quekid5 1 day ago

        > However, the best engineers I know are usually among the quickest to open an editor or debugger and use it fluently to try something out.

        That's not my experience... mostly it's about first interrogating the actual problem with the customer and conditions under which it occurs. Maybe we even have appropriate logging in our production application? We usually do, because you know, we usually need to debug things that have already happened.

        (If it's new/unreleased code, sure fine, let's find a debugger.)

    • truncate 1 day ago

      >> Bad engineers continue being bad, good engineers continue being good.

      I don't know if good engineers can necessarily continue to be good. There is a limit to how much careful consideration one can give if everything is on an accelerated timeline. And good or not, there is a limit to how much influence you have over setting those timelines. The whole playing field is changing.

      • andai 1 day ago

        An old comic I like:

        - I've taken a controversial new pill that accelerates my brain.

        -- So you're smart now?

        - I'm stupid faster!

        That being said, being stupid faster can work if validation is cheap (and exists in the first place).

        Turns out "eh close enough" for AGI is just stupidity in an "until done" loop. (Technically referred to as Ralphing.)
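
        The "until done" loop only beats plain stupidity when a cheap validator gates each attempt. A minimal sketch, with a hypothetical propose function standing in for any generator (an LLM or otherwise):

```python
def run_until_done(propose, validate, max_attempts=10):
    """Repeatedly ask a (possibly unreliable) generator for a candidate
    and accept the first one the validator passes. Cheap validation is
    what turns 'stupid faster' into useful iteration."""
    for attempt in range(1, max_attempts + 1):
        candidate = propose(attempt)
        if validate(candidate):
            return candidate, attempt
    raise RuntimeError("no valid candidate within the attempt budget")

# toy usage: the 'generator' guesses squares, the validator is the spec
answer, tries = run_until_done(
    propose=lambda n: n * n,      # stand-in for any candidate generator
    validate=lambda x: x >= 16,   # the cheap check that gates acceptance
)
```

        If validation is expensive or missing, the loop degenerates into exactly the "eh close enough" failure mode the comic describes.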

        • sanderjd 1 day ago

          Yep, validation is key. The smartest thing I've heard on this, which has reoriented how I think about this is that the objective function of a piece of software is now more important to get right than the implementation.

          • dspillett 1 day ago

            > the objective function of a piece of software is now more important to get right than the implementation

            That has always been the case. That is why weeks or even months of programming and other project busywork so often take the place of what a couple of days spent getting properly fleshed-out requirements down would have achieved.

            • sanderjd 1 day ago

              Agreed, it has always been the case. But I've never thought of it that way so explicitly. And I might argue that the important distinction is that the objective function is programmatically verifiable (which the word "requirements" has not always implied).

          • andai 1 day ago

            Turns out what was being rewarded all along is "the code looks all right" and "it looks like it works".

            • sanderjd 21 hours ago

              No, what is rewarded is "the code has been shown to conform to the given objective function" and especially "that objective function is a good representation of what we are trying to accomplish with this code".

        • Salgat 1 day ago

          So the chimpanzees on the keyboard thing is real.

        • dpoloncsak 21 hours ago

          I say this usually about self-driving cars, but the phrase fits here too. "It doesn't need to be perfect. It just needs to be better than the average human, and humans suck at driving."

      • sanderjd 1 day ago

        Hmmm, I think I disagree with this.

        I estimate that I'm now spending about 10 to 30 hours less time a week in the mechanical parts of writing and refactoring code, researching how to plumb components together, and doing "figure out how to do unfamiliar thing" research.

        All of those hours are time that can now be spent doing "careful consideration" (or just being with my family or at the gym or reading a book, which is all cognitively valuable as well).

        Now, I suppose I agree that if timelines accelerate beyond that amount of regained time, then I'm net worse off, but in my experience that's not the situation at the moment.

        • truncate 1 day ago

          Maybe we do different things. Not that you are wrong about spending less time on things you don't care about, but all that mechanical work helps you build a really good mental model of your product, from high-level design down to individual classes. If I already have a good mental model, I can direct AI to make really good changes fast; if I don't, I will get things done... but it ends up producing less-than-ideal changes that compound over time.

          What you said about "figure out how to do unfamiliar thing" is correct, and will get things done, but overall quality, maintainability, and understanding how the individual pieces work... that's what you don't get. One can argue who cares about all that, since AI can take care of it, or already can. I don't think that's true today, at least.

          • sanderjd 1 day ago

            I guess I just don't really agree that doing the tedious mechanical things is all that helpful for building the necessary mental model. I mean, I do think it was useful (indeed, necessary) for me to actually type out very similar lines of code over and over again when I was building up the programming skillset, but I really think the marginal value of that is just very low for me at this point. I worry a lot about how we're going to train the next generation of people without there being any incentive to do this part of the process! But for me, I already did that part.

            What I actually find necessary for building a mental model of the system is not typing out the definitions of the classes and such, but operating and debugging the system. I really do need to try to do things, dig into logs, and figure out what's going on when something is off. And that pretty much always ends up requiring reading and understanding a bunch of the implementation. But whether I personally typed out that implementation, or one of my colleagues did, or an AI did, is less important.

            I mean, I already had to be able to build a mental model of a system that I didn't fully implement myself! I essentially never work on anything that I have developed in its entirety on my own.

        • kdnxownxkwkd 1 day ago

          Yeah! I mean, who needs to LEARN how to do these things properly when you can just let an autocorrect on steroids hallucinate the closest thing to “barely working”. Right?

          10 to 30 hours saved on not learning new things! Hurray!

          • sanderjd 1 day ago

            I genuinely don't understand what you're talking about with this comment. Learn how to do what things properly? I've been writing software for two decades... I'm not primarily in a learning phase, I'm in a doing phase. I'll take advantage of tools that save me time and energy in my work (for the right price). Why wouldn't I?

            What do you mean by "barely working"? I can now put more iterations into getting things working better, more quickly, with less effort. That seems good to me.

            10 to 30 hours a week is 25% to 75% of my time working. Seems like a pretty good trade?

            I do understand that the calculation is different for people who are new to this. And I worry a lot about how people will build their skills and expertise when there is no incentive to put in all the tedious legwork. But that just isn't the phase of my career that I'm in...

            • skydhash 1 day ago

              My one question for you: What’s your level of editor fluency? Because I would really like to know if there’s a correlation between claiming these kinds of time savings and not using advanced features in your editor.

              My time is spent more on editing code than on writing new lines. Because code is so repetitive, I mostly copy-paste, use the completion and snippets engines, and reorganize code. If I need a new module, I just copy whatever is most similar, strip out everything, and add the new parts. That means I only write 20 lines of that 200-line diff.

              Also my editor (emacs) is my hub where I launch builds and tests, where I commit code, where I track todo and jot notes. Everything accessible with a short sequence of keys. Once you have a setup like this, it’s flow state for every task. Using LLM tools is painful, like being in a cubicle reading reports when you could be mentally skiing on code.

              • sanderjd 21 hours ago

                High.

                My 2023 to early 2025 usage of AI was as "slight improvement to my existing editing and autocomplete capabilities". That was great and I loved it. But sometime over the last 12 months it has switched to "mostly using the editor pane to read rather than edit".

                Honestly I experience this as a great loss. All these hours over all these years perfecting the vim editing movements! And now I only spend like 10% of my time directly editing things anymore.

                I feel like it would be fun (and also sad and nostalgic) to see a time lapse of the relative size and time spent focused between my editor pane, terminal pane, and AI tool pane. It has changed massively, especially in the last year.

            • vultour 1 day ago

              There is simply no chance that LLMs are saving you 30 hours of work a week, especially if they're doing something where you'd have to do the research yourself. Either you're just simply wrong, or you went from understanding the code you were writing to skimming whatever the magic box spits out and either merging it outright or pawning off the effort of review on someone else.

              • sanderjd 21 hours ago

                That's why I gave a range. I didn't say it is saving me 30 hours every week; I said 10 to 30 hours a week. So 30 is the max of the range, and I'd say the distribution is heavily skewed toward the low end. It really depends on what I'm doing, but I do think there are weeks where it has saved me 75% of the time I would otherwise have spent. I think there are two kinds of weeks where this is the case:

                1. A week where I would have otherwise actually spent the majority of my time writing out and doing a ton of refactoring of a lot of implementation code. This is very rare for me, but it does exist. I can remember how it could actually take me a whole week to just "code up" meaningfully sized prototypes or greenfield implementations of some unambiguous thing. Truly, now, for that kind of work, claude code can save me full days of mechanical work.

                2. A week where there is something very subtle going on that I have to figure out, probably having to do with some component or system I'm not very familiar with yet. Having an AI tool as a rubber ducky, or like a supercharged stackoverflow, can save me days of reading, debugging, working on minimal repros, etc.

                Again, I'm not saying this is the common case at all. And estimating this kind of thing is always wildly inaccurate, so sure, take it with a grain of salt. But I know that a few times now, doing estimates based on my past experience, I've said "that will take me a week" (in case #1) or "gosh, I dunno, that's a tricky one, that might take me a week to figure out" (in case #2), and instead it only took me a day.

                But honestly I think people focus too much on the high end of this range. The more valuable thing to me is the large number of weeks where it saves me that 10 to 15 hours, where I can then use that time to research new things, try more ideas, say "yes" to more things, or just not spend that time working.

      • ori_b 1 day ago

        It's deeper. We used to mock architects that stepped back and stopped coding, because they generated trash.

        There's a cycle that is needed for good system design. Start with a problem and an approach, and write some code. As you write the code, you reify the design and flesh out the edge cases, learning where you got the details wrong. As you learn the details, you go back to the drawing board and shuffle the puzzle pieces, and try again.

        Polished, effective systems don't just fall out of an engineers head. They're learned as you shape them.

        Good engineers won't continue to be good when vibe-coding, because the thing that made them good was the learning loop. They may be able to coast for a while, at best.

        • beeandapenguin 1 day ago

          Reminds me of Gall’s Law from his book Systemantics.

          A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.

          https://en.wikipedia.org/wiki/John_Gall_(author)#Gall's_law

          • ori_b 1 day ago

            I find that the learning and iteration tends to lead to a simplified system, if you're willing to look hard enough at the shapes needed.

            When there's a lot of complexity, it's often repetitive translation layers, and not something fundamental to the problem being solved.

        • sothatsit 1 day ago

          You don’t need to write code by hand to learn from iterations and experiments. I run more experiments and try out more different solutions than I ever could before, and that leads to better decisions. I still read all the code that gets shipped, and don’t want to give that up, but the idea that all craft and learning is lost when you don’t is a bit silly. The craft/learning just moves.

          • ori_b 1 day ago

            How much calculus do you think you could pick up skimming a textbook without doing exercises?

            We mocked these "architects" from experience. We knew that if you weren't feeling the friction yourself, you wouldn't learn enough to do good design.

            Maybe you don't care about engineering great systems. Most companies don't. It's good for profit. This isn't new, though AI enables less care.

            • sothatsit 1 day ago

              The entire mistake you are making is comparing using AI to skimming textbooks, or taking shortcuts. Your entire premise is wrong.

              People who care about craft will care about the quality of what they produce whether they use AI or not.

              The code I ship now is better tested and better thought through than before I used AI, because I can do a lot more. The extra time goes into additional experiments, jumping down more rabbit holes, and trying out ideas I previously couldn’t due to time constraints. It’s freeing to be able to spend more time improving quality, because the ROI on time spent experimenting has gone up dramatically.

              • ori_b 21 hours ago

                You can keep telling yourself that. I have seen the results from others making the same arguments. The result is invariably trash.

                • sothatsit 21 hours ago

                  Well you have obviously already made up your mind, so have fun with your confirmation bias. We'll all be over here having a good time, getting more work done. Feel free to come over when you put down your grudge.

                • codebolt 19 hours ago

            My agentic workflow probably differs somewhat from the majority of others here, but I can positively guarantee you that both the quality and quantity of my output is significantly higher than it has ever been in my 20-something years of writing code. And at least 90% of the code I've written this year was output by an LLM. You can keep sticking your head in the sand; in the end it will only be to your own detriment.

              • zarkov99 20 hours ago

                I care deeply about craft, but:

                a) I cannot effectively review more than 2000 lines of code a day. The LLMs can produce much more than that.

                b) Even if I accepted my reading-throughput limitation as the cost of being in the loop, reading is not enough to keep cognitive debt in check: my skills will atrophy if I do not participate in the writing ("What I cannot create I cannot understand").

                So, to me, it seems we humans either have to come up with higher-level (and deterministic) abstractions than code to communicate with LLMs, or resign ourselves to letting the LLM guess what we want from English and then banging on the output to see if it sort of works. The latter seems to be the current trend, and I find it absolutely revolting.

                • sothatsit 19 hours ago

                  I think the distinction is that for experiments and prototypes the behaviour of the final system is what we are trying to design. We can experiment and see the tradeoffs and explore the design space before committing to a direction. And then we can sit down and produce the final code to a quality we are happy with. If you are serious about this process, there is no way you are producing 1000s of lines of code a day, unless it is trivial boilerplate.

                  In terms of higher-level abstractions, I agree this is one particularly treacherous rung on the ladder. Previous abstractions like compilers or garbage collectors at least had more structure and rules to rely on. I don't know exactly how it will look, but I don't think we will rely solely on banging on the output: we will also spot-check the source code, use profilers and other tools to inspect the behaviour of systems, and ask the agent to explain its architectural decisions. I do believe that people who care will still find ways to do good work.

            • necovek 1 day ago

              This is an unpopular take, but when I was in undergrad maths, in old-school two-semester courses with a single exam (exercises + oral) covering everything at the end, I was able to score 60-80% on the exercises having done only theory as prep.

              I couldn't do the exercises that hinged on tricks/shortcuts you learn by grinding through a lot of exercises, but many of those are the same tricks/shortcuts used in the proofs.

              This was indeed rare among students, but let's not discount that there are people who _can_ learn from well-systemized material and then apply it in practice. Everyone does this to some extent, or we would all have to learn everything from the basics.

              The problem with SW design is that it is not well systemized, and we still have at least two strong opposing currents (agile/iterative vs waterfall/pre-designed).

            • torginus 1 day ago

              Imo the biggest issue with these no-code architects has been that you could become one without ever having coded at any noteworthy level of skill (which meant most of them were like this).

              In my experience, in a lot of organizations, a lot of people either lacked the ability or the willingness to achieve any level of technical competence.

              Many of these people played the management game, and even if they started out as devs (very mediocre ones at best), they quickly transitioned out from the trenches and started producing vague technical guidance that usually did nothing to address the problems at hand, but could be endlessly recycled to any scenario.

      • datsci_est_2015 1 day ago

        > if everything is on an accelerated timeline

        Good engineers are also capable of managing expectations. They can effectively communicate with stakeholders what compromises must be made in order to meet accelerated timelines, just as they always have.

        We’ve already had conversations with overeager product people what the ramifications are for introducing their vibe coded monstrosities:

          - Have you considered X?
          - Have you considered Y?
        

        Their contributions are quickly shot down by other stakeholders as being too risky compared to the more measured contributions of proper engineers (still accelerated by AI, but not fully vibe-coded).

        If that’s not the situation where you work, then unfortunately it’s time to start playing politics or find a new place to work that knows how to properly assess risk.

      • runarberg 1 day ago

        With all that crap out there, good engineers may simply check out, call it good, and leave the industry. Personally, seeing the proliferation of vibe-coded apps has made me hesitant to publish and promote my AI-free apps.

      • paulddraper 1 day ago

        There is no limit.

        Or at least, the limit is increasing by the day.

    • LtWorf 1 day ago

      Good engineers need to be allowed to be good. If they are told to pump features or lose their job, they might act like bad engineers as well.

      • sanderjd 1 day ago

        Aren't they more likely to leave?

        • LtWorf 1 day ago

          Depends. If they have a good salary, nice coworkers, and WFH, and they can tolerate having to produce crap, they might stick around when the other factors are above average.

          For someone with 3-4 kids who lives far from the city, WFH and time flexibility can be important motivators.

    • nly 1 day ago

      The best paid engineers I know seem to be the super fast hackers who write unfathomable amounts of code in short order.

      Unfortunately thoughtful design and engineering doesn't get recognised

      • bdangubic 1 day ago

        in my experience this is because there are very very very very few thoughtful designers and engineers, especially compared to people that are cranking out code.

        • galangalalgol 1 day ago

          Also, "thoughtful code" ranges from the library that does the thing you need with an API so intuitive you don't even need autocomplete or docs (though it has docs), to the library that is extensible to every possible use case you will never need but missing the obvious ones you do, or at least making them horribly unergonomic, in the name of extensibility and purity with regard to some random paradigm that is self-evidently the best one.

    • notnullorvoid 1 day ago

      > Bad engineers continue being bad, good engineers continue being good.

      Unfortunately I have seen some really good software engineering peers regress into bad engineers through an increasing reliance on AI.

      Conversely some very bad engineers (undeserving of the title) have been producing better outputs than I ever expected possible of them.

  • bitexploder 1 day ago

    Vibe-coded apps with barely any tests, invariants, etc. No wonder they turn into spaghetti. You can always refactor code and force agents to write small, modular pieces and files. Good engineering is good engineering, whether an agent or a human wrote the code. Take the time to make agents refactor and explore choices. Humans must still understand and drive architecture at this point. Agents can help, do recon amazingly well, and provide suggestions.

    • mleo 1 day ago

      I can’t understand this. The first thing I do with a new agent-driven project is set up quality checks: linters, test frameworks, static analysis, etc. Whatever I would expect a developer to do, I expect an agent to do. All implementation has to go through build success and mixed agent reviews before moving on. I might not do this with an initial research/throwaway prototype, but once I know what direction to go in and expect the code to reach production, it is vital to set guard rails.

      • gck1 1 day ago

        > The first thing I do with new agent driven project is set up quality checks. Linters, test frameworks, static analysis, etc

        I do this too, but then I sit and observe how the agent gets very creative about going around all of these layers just to reach the finish line faster.

        Say, for example, I needlessly pass a mutable reference and the linter screams at me. Either the linter is wrong in this case, or I should listen to it and change the signature. If I make the lazy choice, I will be dissatisfied with myself; I might even get scolded, or fired if I keep making lazy choices.

        An LLM doesn't get these feelings.

        An LLM will almost always go for silencing the warning, because it stands between the model and the 'reward'. If you put up guardrails so that the LLM isn't allowed to silence anything, you get things like 'ok, I'll just do foo.accessed = 1 to satisfy the linter'.
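        The "silence it" move can be sketched in a few lines of Python. This is pure illustration: the function names and the unused audit_log parameter are invented, and the unused-argument warning stands in for whatever the linter flagged.

```python
# Hypothetical illustration of the "silence it" move vs. the honest fix.
# All names here are invented for the sketch.

def process_order_lazy(order, audit_log):
    # The "foo.accessed = 1" trick: touch the argument so the linter stops
    # complaining about an unused parameter, without ever using it.
    _ = audit_log
    return order["total"]

def process_order_fixed(order):
    # The honest fix: the parameter was never needed, so the signature
    # changes and every caller has to be reconsidered.
    return order["total"]

order = {"total": 42}
assert process_order_lazy(order, audit_log=[]) == 42
assert process_order_fixed(order) == 42
```

        Both versions satisfy the linter; only the second one removes the design smell, and only the second one forces the question of why the argument was being passed at all.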

        Same story with tests. Who decides when it's the test that should be changed/deleted or the implementation?

        • Daishiman 1 day ago

          > Same story with tests. Who decides when it's the test that should be changed/deleted or the implementation?

          Claude is remarkably good at figuring this out. I asked it to look at a failing test in a large and messy Python codebase. It found the root cause, asked whether the failure was a regression or an insufficiently specified test, performed its own investigation, and found that the test harness was missing mocks that were exposed by the bug fix.

          It has become amazingly good at investigating.
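          A hypothetical reconstruction of that kind of missing-mock situation, sketched with Python's unittest.mock. RateClient, convert_buggy, and convert_fixed are invented names for this sketch, not anything from the codebase described above.

```python
# A bug fix makes previously dead code run, exposing that the test harness
# never mocked a dependency. All names are invented for illustration.
from unittest import mock

class RateClient:
    def fetch_rate(self, currency):
        # The real implementation would hit the network, unavailable in tests.
        raise RuntimeError("network access not available in tests")

def convert_buggy(client, amount, currency):
    # Bug: the early return meant only for "USD" fires for every currency,
    # so fetch_rate is never called and the old tests passed by accident.
    if currency:
        return amount
    return amount * client.fetch_rate(currency)

def convert_fixed(client, amount, currency):
    if currency == "USD":
        return amount
    return amount * client.fetch_rate(currency)  # now actually reached

client = RateClient()

# The old harness "passed" without any mock, because the buggy code path
# never touched the network...
assert convert_buggy(client, 10, "EUR") == 10

# ...and the fixed code raises RuntimeError until the harness adds the
# missing mock, which is the diagnosis described in the comment above.
with mock.patch.object(client, "fetch_rate", return_value=2):
    assert convert_fixed(client, 10, "EUR") == 20
```

          The interesting part is the direction of the inference: the test did not start failing because the fix was wrong, but because the fix made the harness's missing mock matter for the first time.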

          • gck1 18 hours ago

            If you point it at a specific thing and ask a specific question, yes, it will figure it out.

            But I never have "fix this test" as a task. What happens when you task it with a feature implementation and a test breaks in the middle of the session? It will not behave the same way.

        • bitexploder 18 hours ago

          You have to not "stress" the agents out over testing. If the gate is "no failing tests", they cheat. If the gate is "triage failing tests, quantify the risk of each failure, prioritize fixes in the next work cycles", agents are dramatically less likely to cheat on tests.

      • Quekid5 1 day ago

        Generated tests... I mean... listen to yourself.

        I can generate a lot of tests amounting to assert(true). Yeah, LLM-generated tests aren't quite that simplistic, but are you checking that all the tests actually make sense and test something useful? If not, those tests are useless. If you say yes, I don't actually believe you.
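        As an illustration of how a test suite can pass while constraining nothing, here is an invented Python example; apply_discount and all four tests are made up for this sketch.

```python
# A deliberately vacuous "generated" test suite: every test passes, but
# only the last one constrains the implementation. All names are invented.

def apply_discount(price, pct):
    return price - price * pct // 100

def test_exists():
    assert apply_discount is not None                 # always true

def test_returns_number():
    assert isinstance(apply_discount(100, 10), int)   # true for almost any body

def test_smoke():
    apply_discount(100, 10)                           # no assertion at all

def test_value():
    assert apply_discount(100, 10) == 90              # pins down real behaviour

for t in (test_exists, test_returns_number, test_smoke, test_value):
    t()  # all four pass; only test_value would catch a regression
```

        Reviewing generated tests means asking, for each one, whether any plausible bug could make it fail; the first three here survive almost any implementation.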

        It's the typical 10 line diff getting scrutinized to death, 1000 line diff: Instant LGTM.

        Pay attention to YOUR OWN incentives.

  • jakevoytko 1 day ago

    Yeah, a lot of people came of age with a "we'll fix it when it's a problem" mindset. Previously their codebases would start to resist feature development, you'd fix the immediate bottlenecks, and then you could kick the can down the road a bit until you hit the next point of resistance. You kinda refactor as you do features.

    The frontier models have pushed the "it's a problem" moment further back. They can kinda work with whatever pile of code you give them... to a point. So it manifests as the LLM introducing extra regressions, or dropping more requirements than it used to, but it's not really manifesting as the job being harder for you. It's just not as smooth as it was from an empty repository.

    Then you hit the point where it just breaks too much and you need to fix it. And the whole codebase is just fractal layers of decisions that you didn't make. That's hard to untangle. And you're not editing the code yourself, so you don't have that visceral "adding this specific thing in this specific way has a lot of tension" reaction that allows you to have those refactoring breakthroughs.

    • meridian-v 1 day ago

      This is the sharpest observation in the thread. The "tension" you describe is proprioception for code — you feel where the abstractions leak, where the seams don't align, through the act of writing and refactoring. It's not a visual signal. You can't get it from reading a diff.

      The risk isn't that agents write bad code. It's that developers lose the sense that tells them where code is bad. Code review is perception. Writing code is proprioception. They're different senses and one doesn't substitute for the other.

      The question for the agent era isn't "is the code good enough to ship" — it's "do I still have enough coupling to the codebase to know when it isn't?"

  • jsemrau 1 day ago

    The same applies to banks and lending standards. In the end it is a function of governance and professional conduct.

  • lumost 1 day ago

    Honestly, the problem is one of BS detection.

    Lead engineer says something is not workable? PM overrides, saying Claude Code could do it. Problems surface months later at launch, and now the engineers are on the hook.

    New junior onboardee declares that their new vision is the best and gets management onto it cuz it’s trendy -> broken app.

    It’s made collaboration nearly unbearable as you are beholden to the person with the lowest standards.

    • tom1337 1 day ago

      I hate how correct you are. Working at a company with only two engineers and a few sales and marketing people, I constantly get "hey I made that feature with Claude, when can we ship it for the customer? I showed them and they really like it", only to look at the code and find out that it doesn't adhere to any of our standards and isn't of good quality either. But if you point that out, it's "yeah but everyone is AI shipping now and we can't be the ones not doing it, we will lose customers..." Yeah, but now we are losing maintainability and understanding of our codebase, and making ourselves dependent on LLM providers who get more expensive every week.

    • zxspectrumk48 1 day ago

      > It’s made collaboration nearly unbearable as you are beholden to the person with the lowest standards.

      Exactly right.

  • teeray 1 day ago

    Can’t wait for the next stage of escalation when teams start to feel code review is keeping them from vibe coding utopia. It’ll probably be “AI review only, keep your human opinions to yourself” just so they can continue to check the “all changes are reviewed” box on security checklists.

  • tbrownaw 1 day ago

    > Vibe Coding (and LLMs) did not create undisciplined engineering organizations or engineers.

    Loss of discipline can be a result of panic or greed.

    Perhaps believing that your own costs or your competitors' costs are suddenly becoming 10x lower could inspire one of those conditions?

    (Also for greenfield projects specifically, it can plausibly be an experiment just to verify what happens. Some orgs are big enough that of course they can put a couple people on a couple-month project that'll quite likely fall flat.)

  • layoric 1 day ago

    This is very true. I've found the tools I am highly encouraged to use very hit and miss, which they are by nature. After using Matt Pocock's skills, I've come around to the idea that an LLM's main utility is to act as the ultimate rubber ducky. The `grill-me` feature is honestly the most useful, not for guiding the follow-up writing of code, but for making me write down and explore the idea in my head more quickly. Its guesses at questions to ask are generally pretty good. I don't believe there is any 'understanding', so I feel the rubber-ducky analogy works quite well. This isn't anything you couldn't do before with some discipline, but I at least find it helps me be more consistent.

    • pydry 1 day ago

      The first time I used LLMs, it was to try to refactor behind a solid body of tests I trusted.

      I figure if it can't code when it has all of the necessary context available and obscure failures are easily detected, then why would I trust it when building features and fixing bugs?

      It never did get good enough at refactoring.

      • layoric 1 day ago

        I agree. The mechanical refactoring of modern IDE tooling, especially with typed languages, is so much faster and safer that it's not even close. These tools can be useful for sure, but I think in general they are being way over-prescribed for different tasks.

  • jillesvangurp 1 day ago

    It's also helping the engineers that do have standards. A lot of what I put in my guard rails (crafted to get better outcomes for my prompts) is not exactly rocket science. Those guard rails just impose some sane engineering processes and stuff I care about.

    As models get better, they seem to be biased to doing most of these things without needing to be told. Also, coding tools come with built in skills and system prompts that achieve similar things.

    Two years ago I was copy-pasting together a working Python FastAPI server for a client from ChatGPT. This was pre-agentic tooling; it could sort of do small systems and work on a handful of files. I'm not a regular Python user (most of my experience is Kotlin based), but I understand how to structure a simple server product. Simple CRUD stuff: all we're talking about here was some APIs, a DB, and a few other things. I made it use async IO and generate integration tests for all the endpoints. It took me about a day to get to a working state. Python is simple enough that I can read it and understand what it's doing, but I had never used any of the frameworks it picked.

    That's 2 years ago. I could probably condense that in a simple prompt and achieve the same result in 15 minutes or so. And there would be no need for me to read any of that code. I would be able to do it in Rust, Go, Zig, or whatever as well. What used to be a few days of work gets condensed into a few minutes of prompt time. And that's excluding all the BS scrum meetings we'd have to have about this that and the other thing. The bloody meetings take longer than generating the code.

    A few weeks ago I did a similar effort around banging together a Go server for processing location data. I've been working against a pretty detailed specification with a pretty large API surface and I wanted an OSS version of that. I have almost no experience with Go. I'd be fairly useless doing a detailed code review on a Go code base. So, how can I know the thing works? Very simple, I spent most of my time prompting for tests for edge cases, benchmarking, and iterating on internal architecture to improve the benchmark. The initial version worked alright but had very underwhelming performance. Once I got it doing things that looked right to me, I started working on that.

    To fix performance, I iterated on trying to figure out what was on the critical path and why and asking it for improvements and pointed questions about workers, queues, etc. In short, I was leaning on my experience of having worked on high throughput JVM based systems. I got performance up to processing thousands of locations per second; up from tens/hundreds. This system is intended for processing high frequency UWB data. There probably is some more wiggle room there to get it up further. I'm not done yet. The benchmark I created works with real data and I added generated scripts to replay that data and play it back at an accelerated rate with lots of interpolated position data. As a stress test it works amazingly well.
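    The replay-at-accelerated-rate idea above can be sketched in a few lines: densify recorded (t, x, y) fixes with linear interpolation, then feed them out with the recorded time gaps divided by a speedup factor. (The real project is in Go; this is an illustrative Python sketch with made-up names.)

```python
import time

def interpolate(fixes, steps_between=4):
    """Insert `steps_between` linearly interpolated points into each gap."""
    out = []
    for (t0, x0, y0), (t1, x1, y1) in zip(fixes, fixes[1:]):
        for i in range(steps_between + 1):
            f = i / (steps_between + 1)
            out.append((t0 + f * (t1 - t0),
                        x0 + f * (x1 - x0),
                        y0 + f * (y1 - y0)))
    out.append(fixes[-1])
    return out

def replay(fixes, speedup=100.0, emit=print):
    """Feed fixes to `emit`, compressing the recorded gaps by `speedup`."""
    for prev, cur in zip(fixes, fixes[1:]):
        time.sleep((cur[0] - prev[0]) / speedup)
        emit(cur)

# One recorded 1-second gap becomes six points: the original pair plus
# four interpolated fixes in between, replayed 1000x faster than real time.
dense = interpolate([(0.0, 0.0, 0.0), (1.0, 10.0, 0.0)])
seen = []
replay(dense, speedup=1000.0, emit=seen.append)
```

    Cranking up `speedup` and `steps_between` against real recorded data is what turns a replay script into a stress test.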

    This is what agentic engineering looks like. I'm not writing or reviewing code. But I still put in about a week plus of time here and I'm leaning on experience. It's not that different from how I would poke at some external component that I bought or sourced to figure out if it works as specified. At some point you stop hitting new problems and confidence levels rise to a point where you can sign off on the thing without ever having seen the code. Having managed teams, it's not that different from tasking others to do stuff. You might glance at their work but ultimately they do the work, not you.

  • adastra22 1 day ago

    LLMs are accelerants. They elevate great engineers to ever more dizzying heights of productivity. They also multiply massively the sloppy output of shit engineers.

zarzavat 1 day ago

Perhaps I've missed a few weeks worth of progress, but I don't think that AIs have become more trustworthy, the errors are just more subtle.

If the code doesn't compile, that's easy to spot. If the code compiles but doesn't work, that's still somewhat easy to spot.

If the code compiles and works, but it does the wrong thing in some edge case, or has a security vulnerability, or introduces tech debt or dubious architectural decisions, that's harder to spot but doesn't reduce the review burden whatsoever.

If anything, "truthy" code is more mentally taxing to review than just obviously bad code.

  • christoff12 1 day ago

    This has generally been the case, though. As mentioned in the post, "You want solutions that are proven to work before you take a risk on them" remains true and will be the place where the edges are found.

    • zarzavat 1 day ago

      It's about responsibility.

      If I get pwned because my AI agent wrote code that had a security vulnerability, none of my users are going to accept the excuse that I used AI and it's a brave new world. I will get the blame, not Anthropic or OpenAI or Google but me.

      The same goes for if my AI generated code leads to data loss, or downtime, or if uses too many resources, or it doesn't scale, or it gives out error messages like candy.

      The buck stops with me and therefore I have to read the code, line-by-line, carefully.

      It's not even a formality. I constantly find issues with AI generated code. These things are lazy and often just stub out code instead of making a sober determination of whether the functionality can be stubbed out or not.

      You could say "just AI harder and get the AI to do the review", and I do this a lot, but reviewing is not a neutral activity. A review itself can be harmful if it flags spurious issues where the fix creates new problems. So I still have to go through the AI generated review issue-by-issue and weed out any harmful criticism.

      • user34283 1 day ago

        On the other hand, I don’t need to review carefully every line of code in my thumbnail generator and associated UI.

        My nonexistent backend isn’t going to be pwned if there is a bug in the thumbnail generation.

        After the QA testing on my device, a quick scroll-through of the code is enough.

        Maybe prompt "are errors during thumbnail generation caught to prevent app crashes?" if we're feeling extra cautious today.

        And just like that it saved a day of work.
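        The defensive pattern that prompt is checking for looks something like this (platform-agnostic sketch; `render_frame` is a hypothetical decoder hook, the real app is in-app iOS):

```python
import logging

PLACEHOLDER = b"\x00"  # stand-in for a blank thumbnail image

def generate_thumbnails(frames, render_frame):
    thumbs = []
    for i, frame in enumerate(frames):
        try:
            thumbs.append(render_frame(frame))
        except Exception:
            # Swallow per-frame failures so one corrupt frame can't
            # crash the app; fall back to a placeholder thumbnail.
            logging.exception("thumbnail %d failed", i)
            thumbs.append(PLACEHOLDER)
    return thumbs

# Toy renderer standing in for the real decoder.
def render(frame):
    if frame == "corrupt":
        raise ValueError("decode error")
    return f"thumb:{frame}".encode()

thumbs = generate_thumbnails(["a", "corrupt", "b"], render)
```

        One prompt to verify every decode path is wrapped like this is cheaper than reading each call site.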

        • jaggederest 1 day ago

          > My nonexistent backend isn’t going to be pwned if there is a bug in the thumbnail generation.

          Hmm. Historically image editing was one of the easier to exploit security holes in many systems. How do you feel about having unknown entities having shell inside your datacenter or vpc?

          • user34283 1 day ago

            I feel pretty good about the odds of attackers exploiting security holes in image editing functions my app does not have, in order to enter my also nonexistent datacenter or vpc.

        • djhn 1 day ago

          But a thumbnail generator is a 1-hour task at best if you're on a solo greenfield project, and it'll still be a 6-week project at an enterprise, even with AI.

          • user34283 1 day ago

            I would be impressed if you implement it in an hour with the following features:

            - webview fallback with canvas capture for codecs not supported in the default player

            - detecting blank frames and diff between thumbnails to maximize variety

            - UI integration to visualize progress and pending thumbnails, batched updates to the gallery

            - versioning scheme and backfill for missing/outdated thumbnail formats

            Honestly, a day seems rather optimistic to me. Maybe if I were an expert on this platform and had implemented a similar feature before, then I could hope to do it in a day.

            If I had to handwrite it and estimate it for Scrum at work, I'd budget a week.

            • djhn 1 day ago

              Ok, fair. I incorrectly assumed you meant resizing static images to create a lower resolution preview image.

              Video thumbnails are a different beast altogether. And you might want to double check your assumptions about security considerations. If any of your ffmpeg, opencv, pyscenedetect code is running on your server, it might well be exploitable.

              • user34283 1 day ago

                It’s in-app on iOS.

                Ironically, already another user in this comment section was concerned about the security of my nonexistent backend.

                But it’s good to know, I was not previously aware that video processing on the backend is a common source of vulnerabilities.

        • parliament32 22 hours ago

          I assume you're talking about a local application? You don't care if a malicious image you downloaded pwns your PC then? Like CVE-2016-3714

      • jaggederest 1 day ago

        I think there's a couple levels here:

        First of all, building a system that constrains the output of the AI sufficiently, whether that's typing, testing, external validation, or manual human review in extremis. That gets you the best result out of whatever harness or orchestration you're using.

        Secondly, there's the level at which you're intervening, something along the hierarchy of "validate only usage from the customer perspective" to "review, edit, and validate every jot and tiddle of the codebase and environment". I think for relatively low importance things reviewing at the feature level (all code, but not interim diffs) is fine, but if you're doing network protocol you better at least validate everything carefully with fuzzing and prop testing or something like that.

        And then you've got how you structure your feedback to the LLM itself - is it an in-the-loop chat process, an edit-and-retry spec loop, go-nogo on a feature branch, or what? How does the process improve itself, basically?

        I agree with you entirely that the responsibility rests on the human, but there are a variety of ways to use these things that can increase or decrease the quality of code to time spent reviewing, and obviously different tasks have different levels of review scrutiny, as well.
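        The "fuzzing and prop testing" suggestion above can be sketched with a stdlib-only round-trip property check. A real setup would use Hypothesis or a proper fuzzer; the length-prefixed framing here is just an illustrative stand-in for protocol code:

```python
import random
import struct

def encode(messages):
    """Frame each message as a 4-byte big-endian length prefix plus payload."""
    return b"".join(struct.pack(">I", len(m)) + m for m in messages)

def decode(data):
    """Inverse of encode(): walk the buffer, slicing out each framed payload."""
    out, i = [], 0
    while i < len(data):
        (n,) = struct.unpack_from(">I", data, i)
        out.append(bytes(data[i + 4 : i + 4 + n]))
        i += 4 + n
    return out

# Property: decode(encode(x)) == x for many randomly generated inputs,
# including empty messages and empty batches. Fixed seed for reproducibility.
rng = random.Random(0)
for _ in range(1000):
    msgs = [rng.randbytes(rng.randrange(64)) for _ in range(rng.randrange(8))]
    assert decode(encode(msgs)) == msgs, msgs
```

        The property (a round trip) is much easier to state and review than the code it constrains, which is exactly why it suits LLM-generated protocol code.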

  • xantronix 1 day ago

    I know there are good uses of LLMs out there. I do. But.

    The current fever pitch mandates from above seem to want it applied liberally, and pushing back against that is so discouraging and often career-limiting as to wear the fabric of one's psyche threadbare. With all the obvious problems being pointed out to people, there are just as many workarounds; and these workarounds, as is often revealed shortly thereafter, have their own problems, which beget new solutions, ad infinitum.

    At some point it genuinely seems like all this work is for the sake of the machine itself. I suppose that is true: the real goal has become so obscured at many firms today that all that remains is the LLM. Are the people betting the farm, and those helping implement their visions, guaranteed a soft exit to cushion them from the consequences, or is rationality really being discarded altogether?

    Sure, sound engineering principles can help work around these problems, but what efficiency is truly gained, in terms of cognitive load, developer time, money, or finite resources? Or were those ever an earnest concern?

    • Daishiman 1 day ago

      There are two sides to the AI mandates.

      The degenerate side is clueless upper management and fad-driven engineering. We have talked extensively about this.

      There is a more rational side to it that I've seen in my org: some engineers absolutely refuse to use AI and as a consequence they are now, clearly and objectively, much less productive than other engineers. The thing is, you still need to learn how to use the tool, so a nontrivial percentage of obstinate engineers need to be driven to use this in the same way that some developers have refused to use Docker or k8s or whatever.

      • callc 1 day ago

        Ah yes, we must force these obstinate engineers to the right path! Only after getting everyone to see the light will they understand and thank us for boundless productivity!! /s

        Perhaps these “obstinate” engineers have good reason in their decision. And it should be their decision!

        To be so confident in what is “the right way (TM)” and try to force it onto others is... revealing.

        • empthought 1 day ago

          Engineers that didn't move past src.v35.final.zip version control don't really have jobs today, either.

          • jaggederest 1 day ago

            You would be absolutely shocked how many software projects are still run, to this day, without source control at all. Or automated (or manual) testing. And how many hand crafted artisanal servers are running on AWS, never to be recovered if their EC2 instance is killed for some reason.

            • zbentley 1 day ago

              Sure, but that’s a small and shrinking market. Not a source of economic security or growth for its employees, nor for most of its companies (though some have defended niches).

              • jaggederest 1 day ago

                I've seen growing companies running multiple million ARR through systems like that. It's way more common than you'd think if you're a professional software developer.

          • seanw444 1 day ago

            I seriously don't see how version control and LLMs are comparable. A deterministic way to track code changes over time, versus an essentially non-deterministic statistical code generator that might get you what you want, and might do it in a reasonable time frame, and that might not land you in a minefield of short-term-good/long-term-bad design points.

            • foldr 1 day ago

              > an essentially non-deterministic statistical code generator that might get you what you want, and might do it in a reasonable time frame, and that might not land you in a minefield of short-term-good/long-term-bad design points.

              Sounds like a human? The ‘statistical’ part is arguable, I suppose.

          • xantronix 1 day ago

            There is an absolute embarrassment of modern tooling in other categories I have no problem whatsoever embracing. I'm not a holdout for being stuck in my ways. Maybe I value things other than expediency at massive cost. Maybe I speak just as well to computers as I do to humans.

            I'm sure I will have no problem whatsoever remaining in the employ of a firm that trusts me to make products and tooling that still push the envelope of what's possible without having to resort to the sheer brute force of trillion parameter-scale models.

            • Daishiman 1 day ago

              There is no massive cost. For 80% of the brute work that needs to be done day in and day out, LLMs provide code as good as a senior engineer's, provided you have sufficient competency in steering the model, and at breakneck pace.

        • Daishiman 1 day ago

          I ran the statistics myself and my company is spending 40% less time doing feature development since AI agents began to be used en masse and pushing 50% more tickets without any noticeable increase in regressions.

          After 18 months the hard evidence is in place. And much like replacing bare-metal servers with k8s for many use cases, or substituting Terraform for shell scripts, once the evidence is in it's time to move on.

          I don't really see a place for no AI usage in line-of-business software apps anymore.

          • svieira 1 day ago

            What did you use to fill the time you aren't doing feature development in with? Or are you all now working 20 hour work weeks?

            • Daishiman 1 day ago

              Faster feature development; more strategic thinking about how to keep the dev pipeline full; braindead mechanical improvements that pay off tech debt which otherwise would never have gotten management sign-off; writing GUI-based tools for support teams that previously had to scour reams of shell scripts; spending more time refining specifications and estimations; writing throwaway concepts of different design ideas so we can have better architecture discussions based on real code instead of pseudocode; and clearing out the backlog of bugs that used to be terribly annoying to reproduce and that I can now resolve by throwing brute compute at them.

              • sdevonoes 1 day ago

                Sounds awful. Just filling the time with worthless stuff. You are basically a liability. Wouldn’t like to have you in my team. Less is more (nowadays more than ever)

                • Daishiman 20 hours ago

                  Sounds like you don't read, and you don't understand what adds value in an engineering team.

        • jaggederest 1 day ago

          Around the turn of the century there were the same exact arguments being made about automated testing (not just TDD, but any automated tests at all!)

    • user34283 1 day ago

      In my opinion you are just wrong.

      It’s an absolute game changer, and it can now multiply your productivity fivefold if it’s a solo greenfield project.

      Maybe half a year ago it was as you said. You had to wait for the agent to finish, you had to review carefully, and often the result was not that great. You did not save a lot of time.

      Now I can spin up 3+ parallel conversations in Codex, each in a git worktree. My work is mainly QA testing the features, refining the behavior, and sometimes making architectural decisions.

      The results are now undeniable. In the past I could not have developed a product of that scope in my free time.

      That is what is possible today. I suspect many engineers have not yet tried things that became feasible over the last months. Like parallel agents, resolving merge conflicts, separating out functionality from a large branch into proper PRs.

      • atomicnumber3 1 day ago

        "many engineers have not yet tried things that became feasible over the last months"

        I have heard this statement every single day for 2 years and yet we still have no companies compressing 10 years into 1 year thus exploding past all the incumbents who don't "get it".

        • passivepinetree 1 day ago

          Well, the GP mentioned

          > if it’s a solo greenfield project

          which is a pretty large caveat. Anecdotally, I've found my side projects (which are solo greenfield projects, and don't need to be supported to the same standards as enterprise software) have gained the boost the GP was talking about.

          At work, it's different, since design, review, and maintenance is much more onerous.

        • simonw 1 day ago

          If you want an example of a project that condensed 5 years into 6 months and exploded past the competition I suggest looking at OpenClaw.

          The first line of code was written on November 25th. It achieved adoption in the "personal agents" space that far exceeded the other companies that had tried the same thing.

          (Whether or not you trust the quality of the software you can't deny the impact it had in such a short time. It defined a new category of software.)

          • mlsu 1 day ago

            Ideally, the given example would be something not adjacent to the presently white-hot category of "AI agents".

            Like, look at e.g. YC minus the AI and AI-adjacent companies. Are those startups meaningfully more impressive or feature-rich compared to a couple of years ago?

            • simonw 1 day ago

              Not yet, no. I think that's because coding agents got good in November, most people didn't notice until January and it still takes 3-4 months to go from idea to releasing something.

              I expect we will start seeing the impact of the new coding agent enhanced development processes over the next few months.

          • retinaros 1 day ago

            It's trash: vibe-coded markdown files around pi. This exemplifies well what the OP is saying. We are at the ICO stage of LLMs. Hopefully there won't be an NFT one.

            • mschuster91 1 day ago

              As much as I love to hate on AI: even the bad apples still produce something that one can reasonably work with.

              Cryptocurrencies? Barely any other use than money laundering, buying drugs and betting on the outcome of battles in war. And NFTs? No use at all other than money laundering and setting money ablaze.

              • gck1 1 day ago

                Privacy and security from government overreach is not enough?

                • mschuster91 1 day ago

                  What privacy? Enough drug dealers have already been busted with solid evidence from trailing the paths on public blockchains.

              • retinaros 14 hours ago

                Crypto is a few hundred billion smaller as an industry than GenAI is. I guarantee you that AI is a far better money laundering scheme; maybe the two best money laundering schemes would be the construction business and the global warming business. Doesn't mean that some of the stuff produced is good.

          • mjr00 1 day ago

            OpenClaw is definitely not a "5 years" project pre-AI though. That was more like a month of greenfield work compressed into a weekend -- which is still really impressive, don't get me wrong! -- but I think the point is we're not seeing mature, legacy codebases get outcompeted by new, agile, AI-driven codebases; we're seeing greenfield projects get spun up faster. Which, again, is still impressive and valuable.

            If agents could really compress 10 years of development into 1 year, you'd see people making e.g. HFT platforms and becoming obscenely rich, not making a fun open-source project and getting hired by OpenAI as an employee.

            • thunky 1 day ago

              You're framing it like the only barrier to writing wildly successful money printing software is software development skills.

              If that were true, all of these anti-AI greybeards who have been in the game for 30 years would all own their own jets.

            • simonw 1 day ago

              41,964 commits is a lot more than "a month of greenfield work".

              https://tools.simonwillison.net/github-repo-stats?repo=OpenC...

              • timr 1 day ago

                Seriously? Commit count is right up there with lines of code as a classically dumb measurement of productivity.

                • simonw 1 day ago

                  Sure, but it's still a good counter to "a month of work".

                  • timr 1 day ago

                    No it isn't. There's basically no upper bound on the number of commits an LLM can generate. If the LLM takes 10,000 commits to do what a human would do in 10, then the comparison is meaningless.

                    I don't know anything about the code quality of OpenClaw, but telling me the number of commits tells me precisely nothing of use.

                    • simonw 23 hours ago

                      OK, now do that for 369,293 stars, 76,193 forks, 138 releases and 2,133 contributors.

                      I expect there is no number I could bring up here that won't be instantly shot down as telling "precisely nothing". My mistake for bringing up any numbers at all.

                      OpenClaw is a good example of a completely new project written using coding agents that made a significant impression on the world and would not have been built without them.

                      I'm surprised this is a hill I have to die on, but there we are.

                      (I'm not even a user of OpenClaw! I don't think it's secure or safe enough to use in my own life.)

                      • timr 20 hours ago

                        > OK, now do that for 369,293 stars, 76,193 forks, 138 releases and 2,133 contributors.

                        You're counting forks and stars as code metrics now? Oy.

                        Look, those aren't nothing -- they're a decent enough proxy for popularity -- but they aren't a rebuttal to the original comment. (The other day some LLM dudebro got a bajillion stars on GH for his vibe-coded hot mess of a repo that sets three environment variables. I should go check the number of commits on that...)

                        > OpenClaw is a good example of a completely new project written using coding agents that made a significant impression on the world and would not have been built without them. I'm surprised this is a hill I have to die on, but there we are.

                        The fundamental problem here is that you were asked to provide an example of some software where LLMs have made a revolutionary difference, and OpenClaw is what you chose. That just says a lot, right there.

                        I don't even really care about that debate, since OpenClaw probably meets the literal requirements of the original question (if not the spirit), and sure, it's had a big splash. But the point of the OP is well-taken: everyone is so "productive", but if the only thing we're seeing from it is Moltbook and 10,001 half-broken pokemon games, then eventually the bloom is going to fall off the rose.

                        The fact that you felt you had to rebut the "I could do that in a weekend" guy with commit counts is both poetic and oddly fitting for where we are with these things.

                        • simonw 18 hours ago

                          I stand by what I said. OpenClaw proved that "personal digital agents" are a category with a huge amount of demand, to the point that people will jump through major hoops and completely ignore the colossal security risks involved in adopting that software.

                          It's spawned dozens of imitations, some of which are looking quite credible.

                          Anthropic themselves have been cloning OpenClaw features.

                          I get that it's not cool to say "OpenClaw is significant and influential" but I truly believe that it is.

                      • mjr00 20 hours ago

                        > OpenClaw is a good example of a completely new project written using coding agents that made a significant impression on the world and would not have been built without them.

                        Nobody is denying that OpenClaw is popular, and nobody (in this thread, at least) is denying that AI rapidly speeds up the ability to make an initial release or prototype for greenfield projects. But the comment that spawned this discussion was:

                        > we still have no companies compressing 10 years into 1 year thus exploding past all the incumbents who don't "get it".

                        The issue is that you're extrapolating OpenClaw, which upon release was a month of pre-AI development work compressed into a few days, to cover the "10 years into 1 year" scenario. However, this isn't appropriate because software development is non-linear. As anyone who has worked on a greenfield project pre-AI should know, those first weeks and months have much faster development cycles. There's no tech debt to worry about; there's no urgent bug tickets or feature requests from customers; there's no thinking about whether it's okay to ship a breaking change.

                  • sdevonoes 1 day ago

                    It isn't, man. Anyone can easily split a single good commit into 10 just to inflate the numbers. C'mon, this is Git 101.

              • mjr00 1 day ago

                > 41,964 commits is a lot more than "a month of greenfield work".

                I meant a month for the initial release, not current state.

                Regardless, much like lines of code, number of commits is not a good metric, not even as a proxy, for how much "work" was actually done. Quickly browsing there are plenty[0] of[1] really[2] small[3] commits[4]. Agentic coding naturally optimizes for small commits because that's what the process is meant to do, but it doesn't mean that more work is being done, or that the work is effective. If anything, looking at the changelog[5] OpenClaw feels like a directionless dumpster fire right now. I would expect a lot more from a project if it had multiple people working on it for 5 years, pre-AI.

                [0] https://github.com/openclaw/openclaw/commit/e43ae8e8cd1ffc07...

                [1] https://github.com/openclaw/openclaw/commit/377c69773f0a1b8e...

                [2] https://github.com/openclaw/openclaw/commit/ffafa9008da249a0...

                [3] https://github.com/openclaw/openclaw/commit/506b0bbaad312454...

                [4] https://github.com/openclaw/openclaw/commit/512f777099eb19df...

                [5] https://github.com/openclaw/openclaw/blob/main/CHANGELOG.md

                • simonw 1 day ago

                  That's why my original comment said:

                  > (Whether or not you trust the quality of the software you can't deny the impact it had in such a short time. It defined a new category of software.)

                  I brought up OpenClaw here because the challenge was:

                  > we still have no companies compressing 10 years into 1 year thus exploding past all the incumbents who don't "get it".

              • sdevonoes 1 day ago

                Didn't we learn anything from the past? Using LOC, number of commits, or GitHub stars to measure success or productivity is so backwards. It seems everyone on the AI bandwagon is either young (and so doesn't know our history) or has simply forgotten all the good practices in software engineering.

                • atomicnumber3 21 hours ago

                  That latter bit is my experience. As soon as AI enters the equation, we have to immediately ignore everything we ever learned and just type text into the prompt box, or you're not doing it right.

              • krater23 23 hours ago

                My bash script can do that in a few hours. The Git repo contains no working software after that, but if that is what you want to measure...

          • demorro 1 day ago

            > It defined a new category of software

            Which is exactly why you can't use it as an example: there is no control. This is basic stuff.

          • cbarte01 1 day ago

            The condensation argument is totally true... Strikes me, though, that the other metric I'd look at is how long code survives before being rewritten. Feels like it's a bit early to tell for that one...

      • valcron1000 1 day ago

        > if it’s a solo greenfield project

        That's a big if. I don't have numbers but most professional engineers are not working on such projects

      • xantronix 1 day ago

        The thing is, I don't care any longer. I sincerely believe velocity without direction is not a good strategy for delivering quality in the long term. And that's the thing about it: How sustainable is this velocity, in terms of socioeconomic concerns, product strategy, and mental health?

        • user34283 23 hours ago

          Velocity without direction?

          I'm personally directing and QA testing every feature.

          I don’t know how socioeconomic concerns, product strategy, and mental health are a concern for me here.

          I'm having a great time with my project and it's been the most fun I've had in many years of building.

      • nananana9 1 day ago

        > and it can now multiply your productivity fivefold if it’s a solo greenfield project.

        Why do I not see 5x as many interesting greenfield projects as before?

      • heavyset_go 1 day ago

        All of the "solo greenfield projects" I let LLMs mostly write (despite supplying the scaffolding, structure, and specific implementation details as code, prompts, or context), I can't tell you much about 6+ months later, except for the parts I did write.

        It's like I never wrote them, because I didn't. I've got the gist of them, but it's the same way I get the gist of something like Numpy: I know how it works theoretically, but certainly not specifically enough to jump in and write some working Fortran that fixes bugs or adds features.

        I now have a bunch of stalled projects I'm not very familiar with. I no longer do solo green field projects that way.

    • steventhedev 1 day ago

      The dirty secret if you work inside BigCorp and look around at the projects they're showcasing:

      1. They're low stakes to get wrong.

      2. The most common are MCPs or similar AI tooling.

      3. Making them look good takes time and effort still. It's a multiplier, not a replacement.

      4. Quality and maintainability require investment. I had to restart an agentic project several times because it painted itself into a corner.

  • asdfman123 1 day ago

    You can direct LLMs to do test-driven development, though. Write several tests, then make sure the code matches them. And also make sure the agent organizes the code correctly.
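    The test-first flow looks like this in miniature: the tests are written (or generated and human-reviewed) first and pin down the behavior; the implementation comes second and is iterated until they pass. `slugify` is just an illustrative example function, not something from the thread:

```python
# Step 1: the tests, written before any implementation exists.
def test_slugify():
    assert slugify("Hello World") == "hello-world"
    assert slugify("  trim  me  ") == "trim-me"
    assert slugify("Already-Good") == "already-good"

# Step 2: the implementation (hand-written or agent-generated),
# iterated until the tests above pass.
def slugify(text: str) -> str:
    return "-".join(text.lower().split())

test_slugify()
```

    Reviewing three assertions is far cheaper than reviewing every implementation the agent might produce, which is the whole appeal.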

    • CharlieDigital 1 day ago

      The LLM obliges and writes a lot of useless tests. I have asked devs to delete several tests in the last day alone.

      • seanw444 1 day ago

        "I don't trust this giant statistical model to generate correct code, so to fix it, I'm going to have this giant statistical model generate more code to confirm that the other code it generated is correct."

        I swear I'm living through mass hysteria.

        • hyperadvanced 1 day ago

          A lot of times the act of specifying test criteria prevents developers from accidentally vibe coding themselves into a bad implementation. You can then read the tests and verify that it does what you want it to. You can read the code!

          I'm not saying that it's all hunky-dory, but you can use AI for straight-up test-driven development to catch edge cases and correct sloppy implementations before they even get coded by your giant chaos machine.

      • asdfman123 1 day ago

        Well, yeah, you don't just make it bang out a bunch of useless code without monitoring it.

        You instruct it to write the code you want to be written. You still have to know how to develop, it just makes you faster.

  • hintymad 1 day ago

    > I don't think that AIs have become more trustworthy, the errors are just more subtle.

    Honest question: what about the counter-argument that humans make subtle mistakes all the time, so why do we treat AI any differently?

    A difference to me is that when we manually write code, we reason about the code carefully with a purpose. Yes we do make mistakes, but the mistakes are grounded in a certain range. In contrast, AI generated code creates errors that do not follow common sense. That said, I don't feel this differentiation is strong enough, and I don't have data to back it up.

    • sumeno 1 day ago

      Humans can't make mistakes at the sheer scale that AI can.

      Yes, as an engineer I make mistakes, but I could never make as many mistakes per day as an LLM can

      • throwuxiytayq 1 day ago

        Obviously, the measure isn’t mistakes per day, it’s mistakes per LOC. And that’s not the whole story either - AI self-corrects in addition to being corrected by the operator. If the operator’s committed bugs/LOC rate is as low as the unaugmented programmer’s bugs/LOC, you always choose the AI operator. If it’s higher, it might still be viable to choose them if you care about velocity more than correctness. I’m a slow, methodical programmer myself, but it’s not clear to me that I have a moat.

    • BoorishBears 1 day ago

      This is like having a coworker who's as skilled as you if not more skilled, but also an alien.

      Their mental model doesn't map cleanly enough to yours, and so where for a human you'd have some way to follow their thought patterns and identify mistakes, here the alien makes mistakes that don't add up.

      Like the alien has encyclopedic knowledge of op codes in some esoteric soviet MCU but sometimes forgets how to look for a function definition, says "It looks like the read tool failed, that's ok, I can just make a mock implementation and comment out the test for now."

      • AndrewKemendo 1 day ago

        Some of my favorite peer engineers work exactly like that

        People used to like them and they used to be legends (even if not everyone liked them)

        Notch, Woz, Linus and Geohot come to mind

        The Metasploit creator Dean McNamee worked for me and he was just like that and a total monster at engineering hard tech products

        • BoorishBears 1 day ago

          No they don't because they have brains.

          I have no idea why people can't accept that intelligence formed separately from a human brain can truly be alien: not in the hyperbolic sense of "that person is so unique it's like they're a different species", but "that thing does not have a brain, so it can have intelligence that is not human-like".

          A human without a brain would die. An LLM doesn't have a brain and can do wondrous things.

          It just does them in ways that require first accepting that no homo sapiens thinks like an LLM.

          We trained it on human language so often times it borrows our thought traces so to speak, but effective agentic systems form when you first erase your preconceived notions of how intelligence works and actually study this non-human intelligence and find new ways to apply it.

          It's like the early days of agents when everyone thought if you just made an agent for each job role in a company and stuck them in a virtual office handing off work to each other it'd solve everything, but then Claude Code took off and showed that a simple brain dead loop could outperform that.

          Now subagents almost always are task specific, not role specific.

          I feel like we could leap ahead a decade if people could divorce themselves from "we use language, and it uses language, so it is like us", but I think there's just something really challenging about that because it's never been true.

          Nothing had this level of mastery over human language before that wasn't a human. And funnily enough, the first times we even came close (like Eliza) the same exact thing happened: so this seems like a persistent gap in how humans deal with non-humans using language.

          • tyyyy3 1 day ago

            "I feel like we could leap ahead a decade if people could divorce "we use language, and it uses language so it is like us","

            Or maybe just maybe... the thing should be much better designed around the human.

            That's how personal computers made their way into homes. People like yourself are comical and can't understand how widespread adoption takes place to obtain value from what the thing intrinsically possesses.

            Firms literally exist to take care of the hassle so that the person can get the value from the thing closer to the present - like hello...?

            • BoorishBears 1 day ago

              You quote me then start speaking about things completely unrelated to anything I said.

              We can't choose if the LLM is like us unless you want to go back 10-20 years in time and choose a new direction for AI/ML.

              We stumbled upon an architecture with mostly superficial similarities to how we think and learn, and instead focused on being able to throw more compute and more data at our models.

              You're talking about ergonomics that exist at a completely different layer: even if you want to make LLM based products for humans, around humans, you have to accept it's not a human and it won't make mistakes like a human (even if the mistakes look human) -

              If anything you're going to make something that burns most people if you just blindly pretend it's human-like: a great example being products that give users a false impression of LLM memory to hide the nitty gritty details.

              In the early days ChatGPT would silently truncate the context window at some point and bullshit its way through recalling earlier parts of the conversation.

              With compaction it does better, but still degrades noticeably.

              If they'd exposed the concept of a context window to the user through top level primitives (like being able to manage what's important for example), maybe it'd have been a bit less clean of a product interface... but way more laypeople today would have a much better understanding of an LLM's very un-human equivalent to memory.

              Instead we still give users lossy incomplete pictures of this all with the backends silently deciding when to compact and what information to discard. Most people using the tools don't know this because they're not being given an active role in the process.

          • AndrewKemendo 1 day ago

            I think these are reasonable questions, but they assume that everything is actually a black box instead of merely being treated as such.

            Despite what the headlines say, these systems aren’t inscrutable.

            We know how these things work and can build around and within and change parameters and activation functions etc…and actually use experience and science and guidance.

            However, those are not technical problems; those are organizational, social, and quite frankly resource-allocation problems.

            • BoorishBears 1 day ago

              I said the opposite of what your comment is replying to.

              > but effective agentic systems form when you first erase your preconceived notions of how intelligence works and actually study this non-human intelligence and find new ways to apply it.

              There's no reason you can't make good use of them and learn how to do it more reliably and predictably; it's just that chasing those gains through a human-intelligence-like model because they use human language leads to more false starts and local maxima than trying to understand them as their own systems.

              I don't think it should even be a particularly contentious point: we humans think differently based on the languages we learn and grew up with, what would you expect when you remove the entire common denominator of a human brain?

      • wilsonnb3 1 day ago

        Dealing with the alien coworkers has always been the job; that is what software is to most people.

        Software developers get paid big money because they can speak alien, the only thing that is changing is the dialect.

        • BoorishBears 1 day ago

          Nope, I tried my best to be really detailed and already knew these replies would come flooding.

          I'm an engineer's engineer: I get that the job isn't LOC but being able to communicate and translate meatspace into composable and robust systems.

          So I mean an alien when I say an alien.

          Not human.

          Not in the cute "oh that guy just hears what everyone else hears and somehow interprets it entirely differently like he's from a different planet" alien way, but in the, "it is a different definition of intelligence derived from lacking wetware" alien way.

          Intelligence is such a multidimensional concept that all of humanity, as varied as we are, can fit in a part of the space that has no overlap with an LLM.

          -

          Now none of that is saying it can't be incredibly useful, but 99% of the misuse and misunderstanding of LLMs stems from humans refusing to internalize that a form of intelligence can exist that uses their language but doesn't occupy the same "space" of thinking that we all operate in, no matter how weird or unique we think we are.

    • chromacity 1 day ago

      One answer, as another person pointed out, is that LLM mistakes are just different. They are less explicable, less predictable, and therefore harder to spot. I can easily anticipate how an inexperienced engineer is going to mess up their first pull request for my project. I have no idea what an LLM might do. Worse, I know it might ace the first fifty pull requests and then make an absolutely mind-boggling mistake in the 51st one.

      But another answer is that human autonomy is coupled to responsibility. For most line employees, if they mess up badly enough, it's first and foremost their problem: a bad performance review, getting fired, ending up in court or even in prison. Because you bear responsibility for your actions, your boss doesn't have to watch what you're up to 24x7. Their career is typically not on the line unless they're deeply complicit in your misbehavior.

      LLMs have no meaningful responsibility, so whoever is operating them is ultimately on the hook for what they do. It's a different dynamic. It's probably why most software engineers are not gonna get replaced by robots - your director or VP doesn't want to be liable for an agent that goes haywire - but it's also why the "oh, I have an army of 50 YOLO agents do the work while I'm browsing Reddit" is probably not a wise strategy for line employees.

      • wilsonnb3 1 day ago

        > I can easily anticipate how an inexperienced engineer is going to mess up their first pull request for my project.

        Isn’t this just because you have seen a lot of PRs from inexperienced engineers? People learn LLM behavior over time, too.

        • chromacity 1 day ago

          I'm pretty sure that I've seen more LLM mistakes than coworker mistakes at this point and I'm nowhere closer to enlightenment.

    • philipwhiuk 1 day ago

      > Honest question: what about the counter-argument that humans make subtle mistakes all the time, so why do we treat AI any differently?

      We're investing in the human getting better rather than paying $100 to Anthropic and hoping that's enough that they don't make the product worse.

  • sanderjd 1 day ago

    Yeah I relate to this. I think working in smaller chunks helps a lot. (Just like how it is for work done by humans!)

  • esalman 16 hours ago

    This.

    My manager reported a couple of days ago that Copilot manipulated some tests in order to make edge cases pass.

    We have standalone prototypes for our product, so it was easy to catch, but actually going in to debug and fix was much harder than expected.

    It absolutely did nothing to increase confidence in Copilot, though. I personally manually accept each line of code Copilot writes, unless it's a skill/MCP server we have no plan to deploy.

jwpapi 1 day ago

> I know full well that if you ask Claude Code to build a JSON API endpoint that runs a SQL query and outputs the results as JSON, it’s just going to do it right. It’s not going to mess that up. You have it add automated tests, you have it add documentation, you know it’s going to be good.

I feel like this is just not true. A JSON API endpoint also needs several decisions made.

- How should the endpoint be named

- What options do I offer

- How are the properties named

- How do I verify the response

- How do I handle errors

- What parts are common in the codebase and should be re-used.

- How will it potentially be changed in the future.

- How is the query running, is the query optimized.

If I know the answer to all these questions, wiring it together takes me LESS time than passing it to Claude Code.

If I don’t know the answer the fastest way to find the answer is to start writing the code.

Additionally, whilst writing it I usually realize additional edge cases, optimizations, better logging, observability and whatnot.

The author clearly stated the context for this quote is production code.

I don’t see any benefits in passing it to Claude Code. It’s not that I need 1000s of JSON API endpoints.
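For concreteness, even the "trivial" endpoint under discussion hard-codes answers to several of those questions. A minimal stdlib-only sketch (table, key names, and error shape are hypothetical choices, not the only reasonable ones):

```python
import json
import sqlite3

def list_todo_items(conn: sqlite3.Connection, limit: int = 50) -> str:
    """Run a parameterized query and serialize the rows as JSON.

    Decisions baked in: keys mirror column names, a `limit` guards
    against unbounded responses, and errors surface as a JSON body.
    """
    try:
        rows = conn.execute(
            "SELECT id, title, done FROM todo_items ORDER BY id LIMIT ?",
            (limit,),
        ).fetchall()
        items = [{"id": r[0], "title": r[1], "done": bool(r[2])} for r in rows]
        return json.dumps({"items": items})
    except sqlite3.Error as exc:
        return json.dumps({"error": str(exc)})

# Usage with an in-memory database, purely for illustration:
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE todo_items (id INTEGER PRIMARY KEY, title TEXT, done INTEGER)")
conn.execute("INSERT INTO todo_items (title, done) VALUES ('write tests', 0)")
print(list_todo_items(conn))  # {"items": [{"id": 1, "title": "write tests", "done": false}]}
```

Each of those choices (plural envelope key, boolean coercion, error shape) is exactly the kind of thing the decision list above is about.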

  • yieldcrv 1 day ago

    I’ve seen the best REST APIs since Claude Code has taken the wheel

    Every verb implemented, and implemented correctly according to the obscure IETF and most compatible way when the IETF never made it clear

    Intuitively named routes; error handling and authentication all easily done and swappable for another approach if necessary

    I feel like our timeline split if you’re not seeing this

    • jwpapi 1 day ago

      I don’t want every verb implemented, and I also don’t want an IETF standard. I want as little as possible, so I have to worry about as little as possible in the future.

      Use cases differ; you described a complete REST API, which can be as much of a problem as too little.

      • hnuser123456 1 day ago

        I see you haven't encountered an API where a GET command can modify the database.

        • adithyassekhar 1 day ago

          Now why would you make such a monstrosity? Audit logs? I was having a good day till now.

        • asa400 1 day ago

          Similarly, I once worked somewhere that had an HTTP API that returned status code 200 {“error”: “ok”} to indicate an error occurred.

          • Schiendelman 20 hours ago

            A lot of GraphQL APIs are like this! They return a 200 just to mean the damn GraphQL is well formed, and the call can totally fail underneath.
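            Which is why status-code-only checks are useless against such APIs: the client has to inspect the body. A sketch, using the GraphQL convention of a top-level "errors" key:

```python
import json

def graphql_call_failed(status: int, body: str) -> bool:
    """A 200 from a GraphQL endpoint only means the request parsed;
    any actual failure lives in the response body."""
    if status != 200:
        return True
    payload = json.loads(body)
    # Per the GraphQL spec, failures appear under a top-level "errors" key.
    return bool(payload.get("errors"))

# A "successful" HTTP response that is actually an error:
assert graphql_call_failed(200, '{"errors": [{"message": "not authorized"}]}')
assert not graphql_call_failed(200, '{"data": {"viewer": {"id": "u1"}}}')
```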

        • IshKebab 21 hours ago

          I have not either. What's your point? AI isn't perfect.

        • esafak 17 hours ago

          I saw a code base where SQL UDFs were used to mutate the data with SELECTs...

      • yieldcrv 1 day ago

        Then just tell it to do that

        It'll even suggest it

          You want a single RPC websocket? Go for it.

        • jwpapi 1 day ago

          By the time it has explored the codebase, asked me follow-up questions, suggested the code change, and incorporated my fixes (plus the cost of the context switch, and the extra time I’ll need in 3 months to rebuild the mental model when somebody requests a change), I’m way faster just writing it myself, mental model included.

          • phpnode 1 day ago

            If it's genuinely the case that you can write code faster than you can prompt it into existence then you're not being ambitious enough with your coding agent. Ask it to do more. Tackle bigger problems.

            • batshit_beaver 1 day ago

              1. It's unclear why creating more code faster is a good thing. Software engineering wisdom for decades has been that code is a cost, not a product. There are great reasons for that, which haven't changed with the appearance of LLMs.

              2. There absolutely are cases where modifying code "manually" is unquestionably faster than prompting an LLM. There are trivial examples for this - eg only an insane person would ask an LLM to rename a variable rather than using an LSP for that. It would provably and consistently take more keystrokes. There are less trivial examples as well, like, you know, having an understanding of your codebase and using good abstractions/libraries within it that let you make large changes to the program's behavior with little boilerplate code.

              One can argue that producing a lot of complex changes through an LLM is faster, which I would agree with, but then see point #1. Sustainable software development has up to this point relied on iterative discovery of the right small components that together form a complete, functional, stable system (see "Programming as Theory Building").

              There's zero indication so far that LLMs are capable of speeding up the process of creating complete, functional, stable systems. What every org within my career and friend circle is seeing (and research into productivity impacts of LLMs on software development is showing) is the same story - fast prototypes that either turn into abandonware, personal tools, or maintenance nightmares.

              • phpnode 23 hours ago

                1. More code faster is not the goal. More features / value faster is the goal. Obviously to get there you need to write more code, but it's not writing code for code's sake.

                2. Yes, true, but the point is to move up the abstraction hierarchy, so instead of asking the LLM to rename a variable you describe the concrete business goal you're trying to achieve.

                It is true that coding agents cannot build fully complete stable systems completely unguided yet. That's why we still have jobs. But it's wrong to suggest that they don't deliver value or that they're destined to produce trash every time. It is a matter of oversight and guidance and setting your codebase up for success. That does require work, but it is not impossible, just a different skillset from the ones we've been used to.

            • yieldcrv 1 day ago

              bro is probably using a local LLM at 2 tokens/sec

              • jwpapi 1 day ago

                Ad hominem

                • yieldcrv 22 hours ago

                  Extremely relevant as that’s the only way it would make sense that your experience with agentic coding is still so slow and so poor

    • theteapot 1 day ago

      The obscure IETF? Which standard is that, exactly? Who cares, I guess - Claude does that stuff.

  • weird-eye-issue 1 day ago

    > If I know the answer to all these questions, wiring it together takes me LESS time than passing it to Claude Code

    How so?

    • jwpapi 1 day ago

      Like writing code to me is not slower than writing text?

      When I write code every character I type in my computer has less ambiguity than when I write it in human language? I also have the help of LSPs, Linters and Auto-completes.

      • spoiler 1 day ago

        It's not much to go on, but I kinda feel ya. I think one exception I'd perhaps make is doing a large mechanical refactor. I find them incredibly daunting. So, I'll just ask AI for that. I mean, it probably takes me a similar time to do, but it feels less daunting.

        I've been trying to get into agentic coding, and there are non-refactoring instances where I might reach for it (like any time I need to work on something using Tailwind; I'm dyslexic and I'd get actual headaches, not exaggerating, trying to decipher Tailwind gibberish while juggling their docs before AIs came around).

        • jwpapi 1 day ago

          I usually use JetBrains features for that; it has great tools built in.

          Let's say on that JSON API I want to extract part of the logic into a repository file: I Ctrl+W the function, and then I have almost all of my shortcuts as left-Alt two-character chords. So once it's marked, I do LAlt+E+M for Extract Method, which puts me in an intermediate step to rename the function, and then LAlt+M+V for MoVe, which puts me in an interface to name the function.

          Once you're used to it, it's like a gamer doing APMs, and it's deterministic and fast. I also have R+N (rename), G+V (generate Vitest), Q+C (query console), Q+H (query history) and many more. Really useful. Probably also doable with other editors.

        • IanCal 1 day ago

          I highly recommend looking into codemods for larger mechanical refactorings. I did things like converting large test suites from one testing library to another by having codex write a codemod to convert it as a first pass.
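          As a toy illustration of the idea (a real migration would lean on an AST rewriter such as libcst, since regexes trip over nested parentheses, but the shape is the same):

```python
import re

# Toy codemod: mechanically rewrite unittest-style assertions into plain
# pytest asserts. The rules are deliberately naive; nested parentheses
# would break them, which is exactly why real codemods use AST tools.
RULES = [
    (re.compile(r"self\.assertEqual\((.+?),\s*(.+?)\)"), r"assert \1 == \2"),
    (re.compile(r"self\.assertTrue\((.+?)\)"), r"assert \1"),
    (re.compile(r"self\.assertIsNone\((.+?)\)"), r"assert \1 is None"),
]

def convert(source: str) -> str:
    for pattern, replacement in RULES:
        source = pattern.sub(replacement, source)
    return source

print(convert("self.assertEqual(result, 4)"))   # assert result == 4
print(convert("self.assertTrue(user.active)"))  # assert user.active
```

          The value is that the codemod itself is reviewable and rerunnable, unlike N ad-hoc hand edits.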

      • jameson 1 day ago

        I have a similar sentiment. Who is making the claim that AI writes code faster matters a lot, because some programmers heavily use LSPs, linters, auto-completes, key bindings, snippets, CLI commands, etc. to speed up writing code.

      • dreambuffer 1 day ago

        This assumes:

        - that you spend no amount of time looking things up, reorganising, or otherwise getting stuck

        - that you have a solution to the problem ready to go at all times

        - that your solution is better than the LLM's solution

        I highly, highly doubt that all 3 of these are true. I doubt even 1 of them is true, I think you just don't know how to use LLMs in a focused way.

        • jwpapi 1 day ago

          I use AI to look things up and I try to learn. That part is sped up, but once I know how X works I’m faster doing it myself. My assumption is that most people who see things differently are comparing Claude against their own performance when they don’t know how X works, but not against someone who’s really good at X. Which makes a lot of sense given LLMs are prediction generators. My take is that the best use of AI is to get you to the point where you are really good at X, and then naturally your AI usage will go down.

          • jatora 22 hours ago

            > but once I know how X works I’m faster doing it myself.

            Survey says: Legacy coder.

          • r_lee 22 hours ago

            What my experience says is that when you get "really good" with X, you can easily write a prompt that states exactly how it needs to be done, and you'll get it done much faster than writing it all yourself, because you know the important parts and the rest is just glue.

      • weird-eye-issue 1 day ago

        I use voice-to-text, and for me coding is way faster now. You don't need to sit down and type up a perfect spec lol. I give it terrible prompts with poor grammar and typos from incorrect transcriptions and it does an amazing job. Definitely not perfect, I iterate with it a ton, but it's still faster than typing it out by hand.

      • fragmede 1 day ago

        You're still typing? I don't know how fast you can type, but I can speak way faster than I can type. Somewhere in the neighborhood of 300 wpm. Speech-to-text is pretty good now, and prompting an AI means I'm not trying to speak curly brace semicolon new line.

        • jwpapi 1 day ago

          Average speaking speed for english speakers is 100-120 wpm for complex topic. I type 130wpm peak and I have the most common coding characters on my home row using neo layout.

          • weird-eye-issue 1 day ago

            I hope you never get RSI. It absolutely blows and I can barely type for the last few years without getting pain. And this is with physical therapy...

            • jwpapi 1 day ago

              I had an ulnar nerve issue, but I’ve changed my setup and it really helped. I don’t have any problems anymore.

              It took quite some time to figure out what works and what triggers it. However, I don’t know if it’s the same for RSI.

              I’m grateful for the ability to use speaking as a second option, but having used both I can’t accept that speaking is even remotely close to typing :/

  • eric_cc 1 day ago

    You can also just talk it out loud to Claude while you’re on a walk getting some sunshine. Done.

    • jwpapi 1 day ago

      Yeah, I can and I’ve done it, and for a fun project it’s fun and cool. But it’s like using templates to build your website. You’ll be annoyed, and at some point your project goes into the endless graveyard of abandoned projects.

      • jmilloy 1 day ago

        I think most people are finding the opposite. Claude Code is not only reducing how many projects get abandoned, it's also resurrecting projects from the graveyard.

        • sumeno 1 day ago

          The number of Show HNs recently that have a day's worth of commits and are never touched again disagrees. It's creating a lot of projects that are immediately abandoned.

          • tyyyy3 1 day ago

            I think it's a direct reflection of the fact that most people really prefer to go-go-go and not spend time up front thinking about what their project even is, why it matters, and whether it's worth dedicating resources to it. The abandonment usually reflects the answer: no, it was not worth it.

            LLMs amplify this behaviour.

          • jmilloy 1 day ago

            There is a difference between a project that is eventually abandoned out of annoyance because you couldn't accomplish what you wanted and a project that gets a day or two of attention and then gets aborted because you figured out it wasn't worth it or got interested in something else. I think the parent comment is talking about the former and I'm responding to that, while you're talking about the latter.

    • nozzlegear 1 day ago

      Now you're working when you should be taking a break and enjoying your surroundings. Not good!

      • Schiendelman 20 hours ago

        Maybe instead you get twice as much walking in the sunshine.

    • dodu_ 1 day ago

      I'd rather just be an actual schizophrenic at that point. It seems like less of a mental illness.

      Just be outside and present.

  • eddieroger 1 day ago

    > If I know the answer to all these questions, wiring it together takes me LESS time than passing it to Claude Code.

    That's just not true, and if it is in your case, then you're not great at writing prompts yet.

    > Take the todo_items table in Postgres and build a Micronaut API based around it. The base URL should be /v1/todo_items. You can connect to Postgres with pguser:pgpass@1.2.3.4

    That's about all it takes these days. Less lines of code than your average controller.

    • apsurd 1 day ago

      I've drunk the AI koolaid so I'm not a hater, but to say "you're just not prompting right" is such a cop-out. Prompting right takes a metric fuck ton of effort. I'm actually kinda agreeing with you: if you make it to where your dev environment is sufficiently harnessed, then you can give it one-liner magic prompts. But getting there, learning to get there, paying that cost, hot mother of god it's a lot of effort.

      Communicating, in words, is extremely hard. I don't think this should be as controversial as it seems in the prompt era.

      Versus: someone who has mastered one of the myriad OpenAPI generators, and it's shipped.

      • xmprt 1 day ago

        I'll go in the other direction and say that if you're spending a lot of your time learning to prompt better then you're wasting it because LLMs are only going to get better at understanding your intent regardless of "prompt engineering". The JSON API example to wire up a database can be one-shot pretty easily by the latest models without much context and without setting up any harness. The more time you spend perfecting your harness, the more time you would have wasted when the next model comes out to make it obsolete.

        • derangedHorse 1 day ago

          I was thinking of this interpretation as I read that:

          "I'll go in the other direction and say that if you're spending a lot of your time learning to [program] better then you're wasting it because [computer]s are only going to get better at [computing] regardless of "[software] engineering". The JSON API example to wire up a database can be [run] pretty easily by the latest [computer]s without much [design] and without setting up any [optimizations]. The more time you spend perfecting your [program], the more time you would have wasted when the next [computer] comes out to make it obsolete."

        • apsurd 1 day ago

          but then how can the parent comment land? "you're just not prompting right"

          • xmprt 1 day ago

            I don't think it does. If I had to guess, the top comment was using an older version of AI or a local model which wouldn't be able to solve the JSON API task. A lot of AI skepticism comes from people who used it once a while back and decided not to keep up with the latest developments. If I only had experience with gpt-3.5 then I'd also assume what the original commenter said.

            • majormajor 1 day ago

              An experiment I'd love to do, but which isn't actually possible anymore, is run GPT 3.5 or the original 4 API release through a modern "agentic" harness for a task like this.

              I think 3.5 would probably need more frequent intervention than a lot of harnesses give. But I bet 4 could do a simple JSON API one-shot with the right harness. Just back then I had to manually be the harness.

        • majormajor 1 day ago

          The hardest thing about software engineering has always been that your intent often has to be decided on the fly once you get into complicated edge cases, weird-or-legacy-business requirements, or things that the spec literally has no answers for.

          Letting the tool figure out your assumed intent on those things is a double-edged sword. It's better than never thinking of them at all, but it potentially means either subtly broken contracts that test coverage missed (since nobody has full combinatoric coverage, or the patience to run it) or further steps into a messy codebase that will cost ever more tokens to change safely.

      • phpnode 1 day ago

        it does take a little while to get good at this new skill, yes. Just like, say, learning a new programming language and the ecosystem around it takes some effort. After you get over the hump it's really very straightforward and mostly a matter of knowing the kinds of mistakes the LLM is likely to make ahead of time, and then kindly asking it to do something smarter. If you've successfully mentored junior engineers you already have this skill.

        • apsurd 1 day ago

          That's well put. But I'd stress that mentoring junior engineers is really a high-effort, high-leverage, high-demand skill. A good teacher is gold, and not common.

      • yakbarber 1 day ago

        This seems disingenuous. Even if your premise is true (which I don't think it is), it only really holds for the first few endpoints. Most systems have many, and the models are very good at copying established patterns, to the point that you wouldn't normally have to re-explain every detail for every endpoint. So you might be right for the first (you're not), but you're definitely wrong for the next 50.

        • majormajor 1 day ago

          To be fair, I don't know many humans who would write endpoints 2-50 from scratch either in that situation.

          Time-wise, it's easy-mode vs easy-mode at that point.

          The human is more likely to make copypasta errors, though!

      • eddieroger 1 day ago

        I disagree it's a cop-out, but I agree it's hard to get good at writing prompts and takes a lot of effort. But so is programming. We're trading one skill set for another and getting a bigger return on it.

        I started as a skeptic and have similarly drunk the kool-aid. The reality is AI can read code faster than I can, including following code paths. It can build and keep more context than I can, and do it faster as well. And it can write code faster than I can type. So the effort to learn how to tell it what to do is worthwhile.

        • apsurd 1 day ago

          yep fully agree. i'm taking issue with the flippant "not prompting right" as if they're holding it upside down vs it's actually a meaningful skill to have to invest in so it's fully believable that someone trained in normal code gen is much more proficient up front.

    • majormajor 1 day ago

      Every day I do something where the llm writes it ten times faster than I would with twice the test coverage.

      And every day I do something else where the LLM output is off enough that I end up spending the same amount of time on it as if I'd done it by hand. It wrote a nice race condition bug in a race I was trying to fix today, but it was pretty easy for me to spot at least.

      And once a week or so I ask for something really ambitious that would save days or even weeks, but 90% of the time it's half-baked or goes in weird directions early and would leave the codebase a mess in a way that would make future changes trickier. These generally suggest that I don't understand the problem well enough yet.

      But the interesting things are:

      1) many of the things it saves 90% of the time on are saving 5+ hours

      2) many of the things I have to rework only cost me 2+ hours

      3) even the things that I throw away make it way faster to reach the "oh, we don't understand this problem well enough to make the right decisions here yet" conclusion than just starting out on that project without assistance would

      so I'm generally coming out well ahead.

      • qingcharles 1 day ago

        This. There is definitely a ratio. A year ago, it was 50/50. It still felt better, because the hard things it did fast while I sipped coffee outweighed the negatives in my mind.

        Now that ratio is swinging way over towards the LLMs favor.

    • sarchertech 1 day ago

      >you’re not great at writing prompts yet

      How do you reconcile that with your example prompt, which demonstrates no skill requirement whatsoever. It’s the first thing any developer would think of.

      • vlunkr 1 day ago

        It’s simple but contains all the necessary info. You can say “build an endpoint to get user data” and it will absolutely do something, but it might be stupid, and when you compound 1000 stupid prompts like that you get spaghetti.

        • sarchertech 1 day ago

          A programmer wouldn’t write a prompt like that. Notice the questions the OP talked about answering first.

        • eloisius 1 day ago

          It doesn’t contain any information at all about the structure of the JSON output. Is this a greenfield endpoint where anything will work, or does it need to conform to an existing API? What about response codes for different failure modes? What about logging?

          Your comment exemplifies what a lot of people complain about with vibe coding: it works great for greenfielding CRUD apps, but it’s a bitch to use in a real code base.
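
          To make those unspecified decisions concrete, here is a minimal sketch (all names hypothetical) of what even a trivial "get user data" endpoint has to pin down: response envelope, per-failure-mode status codes, and logging.

```python
import logging
from http import HTTPStatus

logger = logging.getLogger("users_api")

def get_user(user_id, store):
    """Hypothetical handler; every branch below is a contract decision
    that a one-line prompt leaves to the model's imagination."""
    if not isinstance(user_id, int) or user_id <= 0:
        return HTTPStatus.BAD_REQUEST, {"error": "invalid user id"}   # 400 or 422?
    user = store.get(user_id)
    if user is None:
        logger.info("user %s not found", user_id)                     # log level? PII?
        return HTTPStatus.NOT_FOUND, {"error": "user not found"}      # 404 or empty 200?
    # Bare object or {"data": ...} envelope? Which fields are exposed?
    return HTTPStatus.OK, {"data": {"id": user_id, "name": user["name"]}}
```

          None of these choices are hard, but each one is a contract the rest of the codebase depends on, which is why "build an endpoint to get user data" underdetermines the result.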

          • theshrike79 21 hours ago

            If it's not a greenfield project, the Agent will go and look at the other endpoints and mimic the style as much as possible. Just like a human would.

            "But we have documentati..." then give the Agent access to the same docs as humans and it'll use them.

            • sarchertech 14 hours ago

              On a real codebase there’s going to be v5 that is the newest version, v4 that we planned to migrate off of but we had to keep around for iOS clients, and v3 that no one except for a dozen huge enterprise customers use. But we need to support all 3 styles. This stuff is sort of documented, but not completely, and there’s a push by some people in the org to use v5 style for every new feature, but there’s one director pushing back on that. So you need to go talk to a few people and get enough context to CYA before deciding what to do.

              Some version of that happens in every big company or every long running app. Claude isn’t AGI and that prompt isn’t nearly specific enough for anything outside of greenfield.

              • theshrike79 1 hour ago

                AI can't fix shit processes =)

                I actually believe it just surfaces them: humans will tolerate ambiguity like that and deal with it, while AI agents either won't work properly or will just fail to do anything useful.

    • sdevonoes 1 day ago

      I have worked with people like you. Worst colleagues ever.

    • wiseowise 1 day ago

      > you’re not great at writing prompts

      > provides not great prompt

    • philipwhiuk 1 day ago

      > you’re not great at writing prompts yet

      You know what we call adequately specifying the system such that the computer can run it as a viable system?

      Coding. We call it coding.

    • awalGarg 17 hours ago

      > > If I know the answer to all these questions, wiring it together takes me LESS time than passing it to Claude Code.

      > That's just not true, and if it is in your case, then you're not great at writing prompts yet.

      That's just not true, and if it is in your case, then you're not great at writing code yet.

  • slashdave 1 day ago

    You've forgotten the important part: permissions

  • cyral 1 day ago

    This may have been a problem a year or two ago, but any premium model will explore the codebase and check similar routes to answer all these questions if you don't specify them.

    • rufasterisco 1 day ago

      Exactly. As long as the codebase is consistently following some given patterns, LLMs nowadays stick to it.

      Understanding that limiting the number of “design patterns” in a codebase made it better (easier to code and understand) was a good proxy for seniority before LLMs.

      Now it’s even better: if all of a sudden “unusual code” is in a PR, either the person opening the PR or the one reviewing it has lost touch with the codebase. Very important signal, since you don’t want that to happen with code you care about.

  • senordevnyc 19 hours ago

    This is just bizarre to me. Do people not use Plan mode?

    I start by telling the agent what I'm trying to accomplish, and then I throw in some questions like this, concerns I have, edge cases I've thought about, whatever. It goes out and does all the research, both in my code base and beyond, asks me questions where it needs clarification, and then writes me a plan. I review the plan, we go back and forth a bit with adjustments to the plan, and then the plan is ready for implementation. At that point, the implementation is mostly a formality, because all of the difficult parts are already done.

    On top of that, most of what you've described as decisions that need to be made are either trivially made by a frontier model without even needing to be told, or stuff I can bake into my skills so I don't need to specify it on every task.

    Given the above, I can't fathom an approach where I'd be faster without AI than with it, because the acceleration is the planning / decision-making, not the implementation. Whether the implementation takes the agent two minutes or six hours really doesn't matter, because I'm not involved at that point.

    • jwpapi 14 hours ago

      You're getting swindled into over-engineering.

devin 1 day ago

> If you can go from producing 200 lines of code a day to 2,000 lines of code a day, what else breaks? The entire software development lifecycle was, it turns out, designed around the idea that it takes a day to produce a few hundred lines of code. And now it doesn’t.

It is so embarrassing that LOC is being used as a metric for engineering output.

  • etothet 1 day ago

    Agreed. And LOC has historically been one of the metrics we've collectively fought against management using to evaluate a "productive" developer!

    • ButyTh0 1 day ago

      Why?

      We should have gone the other way; generated a lot of code and demanded pay raises; look at the LOC I cranked out! Company is now in my debt!

      If they weren't going to care enough as managers to learn and line go up is all that matters to them, make all lines go up = winning

      You all think there's more to this than performative barter for coin to spend on food/shelter.

      • embedding-shape 1 day ago

        Because not everyone is just out to earn the most money; some people also want to enjoy the workplace where they work. Personally, the quality of the codebase and infrastructure matters a lot for how much you enjoy working in it, and I'd much rather work in a codebase I enjoy and earn half than in one made by churning out as many LOC as possible and earn double.

        Although this requires you to take pride in your profession and what you do.

        • ButyTh0 1 day ago

          All of human agency must prop up the vanity of you. Of all people.

          Got it.

          ...ok fine; lack of political action to put us all on the hook for your healthcare is your choice to take a gamble on a paycheck. It's a choice to say your own existence is not owed the assurance of healthcare.

          So I will honor your choice and not care you exist.

  • estimator7292 1 day ago

    At least "mentions of LOC" is now a great metric for "how clueless is this person"

  • ilikebits 1 day ago

    LOC is useful here not because it's a metric for output but because it's a metric for _understandability_. Reviewing 200 lines is a very different workload than reviewing 2000.

    • jazzypants 1 day ago

      That's assuming the 200 lines are logical and consistent. Many of my most frustrating LLM bugs are caused by things that look right and are even supported by lengthy comments explaining their (incorrect) reasoning.
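
      A hypothetical example of that failure mode: code whose comment confidently explains behavior the code doesn't actually have.

```python
def moving_average(xs, window):
    # "Averages the trailing `window` values for each position." Sounds right,
    # but the divisor is always `window`, so positions near the start (which
    # have fewer than `window` values available) are silently under-weighted.
    return [sum(xs[max(0, i - window + 1): i + 1]) / window
            for i in range(len(xs))]
```

      Here `moving_average([2, 2, 2, 2], 2)` returns `[1.0, 2.0, 2.0, 2.0]`: the first value should arguably be 2.0, yet nothing about the code or its comment looks wrong at a glance.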

      • mcmcmc 1 day ago

        Ok? No one is saying that all LOC are equal. Ceteris paribus, 2000 lines is 10x more time consuming to review than 200

        • embedding-shape 1 day ago

          > 2000 lines is 10x more time consuming to review than 200

          Very far from the truth in practice, every line of code isn't as difficult/easy to review as the other.

          • jimbokun 1 day ago

            But why would the lines in the 2000 case be easier to review per line?

            • squeaky-clean 1 day ago

              Which of these programs is easier to review

                {x{x,sum -2#x}/0 1}
              

              or

                def f(n):
                    if n <= 1:
                        return n
                    else:
                        return f(n-1) + f(n-2)
              

              They're both the same program

              • jimbokun 23 hours ago

                Good question.

                But it is orthogonal to the question of evaluating 2000 lines of AI code vs 200 lines of human written code. Either the human or the AI could produce idiomatic code for either language, given sufficient training data in the AI’s corpus for the language.

                My guess is that the first one is much quicker to review, for a human equally fluent in both languages.

          • mcmcmc 1 day ago

            Holy shit, read the words I wrote. Ceteris Paribus. Assume the 200 lines and 2000 lines have a similar distribution of complexity.

        • jazzypants 1 day ago

          The point is that LOC is never a good metric for any aspect of determining the quality of code or the coder because it ignores the nuance of reality. It's impossible to generalize because the code can be either deceptively dense or unnecessarily bloated. The only thing that actually matters is whether the business objective is achieved without any unintended side effects.

          • mcmcmc 1 day ago

            > The only thing that actually matters is whether the business objective is achieved without any unintended side effects.

            Objectives change; timeliness matters. The speed at which you deliver value is incredibly important, which is why it matters to measure your process. Deceptively dense is what I’d call software engineers who can’t accept that the process is actually generalizable to a degree and that lines of code are one of the few tangible things that can be used as a metric. Can you deliver value without lines of code?

            • jazzypants 1 day ago

              > Objectives change; timeliness matters. The speed at which you deliver value is incredibly important, which is why it matters to measure your process.

              This assumes that shorter code is faster to write. To quote Blaise Pascal, "I would have written a shorter letter, but I did not have the time."

              > Can you deliver value without lines of code?

              No, but you can also depreciate value when you stuff a codebase full of bloated, bug-ridden code that no man or machine can hope to understand.

              • mcmcmc 1 day ago

                You seem determined to misinterpret. I’m not talking about LOC as a measure of productivity. The ratio of LOC needing review to the capacity of reviewers (using how many LOC can be read/reviewed over the sampling period) is what’s being discussed. Agentic AI/vibe coding has caused that ratio to increase and shows a bottleneck in the SDLC. It’s a proxy metric, get over yourself.

                “All models are wrong, some are useful”. What’s not useful is constantly bitching about how there’s no way to measure your work outside of the binary “is it done” every time process efficiency is brought up.

                • jazzypants 1 day ago

                  Yes, reading this back, I definitely veered off-topic. I apologize. I still don't think that you can say how much time it will take to review code based on how many lines of code are involved, but my argument was not well crafted. I just hope that others can learn something from our discussion. Thank you for being patient with me, and I hope you have a good day! :)

    • moregrist 1 day ago

      It’s still a bad metric.

      I have worked with code where 1000s of lines are very straightforward and linear.

      I’ve worked on code where 100 lines is crucial and very domain specific. It can be exceptionally clean and well-commented and it still takes days to unpack.

      The skills and effort required to review and understand those situations are quite different.

      One is like distance driving a boring highway in the Midwest: don’t get drowsy, avoid veering into the indistinguishable corn fields, and you’ll get there. The other is like navigating a narrow mountain road in a thunderstorm: you’re 100% engaged and you might still tumble or get hit by lightning.

      • lelandfe 1 day ago

        There’s still a limit on how far one can drive in a day, no matter the road.

      • jimbokun 1 day ago

        The number of bugs tends to be linear to lines of code written meaning fewer lines of code for the same functionality will have fewer bugs.

        So I’m pretty skeptical that reviewing 2000 lines of code won’t take any more time than reviewing 200 lines of code.

        Furthermore how do you know the AI generated lines are the open highway lines of code and not the mountain road ones? There might be hallucinations that pattern match as perfectly reasonable with a hard to spot flaw.

        • moregrist 1 day ago

          > The number of bugs tends to be linear to lines of code written meaning fewer lines of code for the same functionality will have fewer bugs.

          It depends on the code. If you’re comparing code of the same complexity then, sure, 2000 lines will take longer than 200.

          I was comparing straight linear code to far more complex code. The bug/line rate will be different and the time to review per line will be different.

          > Furthermore how do you know the AI generated lines are the open highway lines of code and not the mountain road ones?

          Again, it depends on the code. Which was my point.

          Linear code lacks branches, loops, indirection, and recursion. That kind of code is easy to reason about and easy to review. The assumptions are inherently local. You still have to be alert and aware to avoid driving into the cornfields.

          It’s a different beast than something like a doubly-nested state machine with callbacks, though. There you have to be alert and aware, and it’s inherently much harder to review per line of code.

    • mrbnprck 1 day ago

      It's still possible to run any LLM in a loop and optimize for LoC while preserving the wanted outcome.

  • mcmcmc 1 day ago

    Is it? The whole point of the article is that the rate of output for writing code has surpassed the rate at which it can be reviewed by humans. LOC as an input for software review makes a lot of sense, since you literally need to read each line.

  • adtac 1 day ago

    LOC is the worst metric for engineering output, except for all the others - Churchill

    • deadbabe 1 day ago

      The amount of times an engineer says what the fuck while reading code still seems like a reliable metric for code quality assessment.

      • AnimalMuppet 1 day ago

        Somewhat reliable, yes. Not objective, though, and hard to reproduce.

        • deadbabe 1 day ago

          In a world where everything is vibes now that doesn’t matter much.

      • dyauspitr 1 day ago

        We won’t be doing that for much longer, enjoy it while you can.

        • deadbabe 1 day ago

          I’m sure an agent can audibly play “what the fuck” as it crunches tokens reading through a codebase

  • vrganj 1 day ago

    I read somewhere that measuring software engineering output by LoC is like measuring aerospace engineering by pounds added to the plane and I thought that was an apt comparison.

  • root_axis 1 day ago

    He's not using LOC as a metric, he's making an observation about the impact of a change in the typical volume of LOC.

  • faizshah 1 day ago

    I experimented with vibe coding (not looking at the code myself) and it produced around 10k LOC even after refactors etc.

    I rewrote the same program using my own brain and just using ChatGPT as google and autocomplete (my normal workflow), I produced the same thing in 1500 LOC.

    The effort difference was not that significant either, tbh, although my hand-coded approach probably benefited from designing the vibe-coded one, so I had already thought of what I wanted to build.

    • embedding-shape 1 day ago

      Sounds like a great opportunity to understand your own development process and codify it in such detail that the agent can replicate how you work and end up with less code doing the same thing.

      My experience was the same as yours when I started using agents for development about a year ago. Every time I noticed it did something less-than-optimal or just "not up to my standards", I'd hash out exactly what those things meant for me and add it to my reusable AGENTS.md, and the code the agent outputs today is fairly close to what I "naturally" write.

      • 8note 1 day ago

        or go with this, and use the agent to prototype ideas, and write it yourself once you know what you want

  • hungryhobbit 1 day ago

    Humans are also incredibly varied and different.

    Do you reject all stats that treat the number of people involved (e.g. 2 million people protested X) as "embarrassing" ... because they lump incredibly varied people together and pretend they're equal?

  • dyauspitr 1 day ago

    Honestly it’s more like 200 to 100,000 lines of pretty decent quality code at this point.

  • keeda 1 day ago

    LoC is perfectly fine as a metric for engineering output. It is terrible as a standalone measure of engineering productivity, and the problems occur when one tries to use it as such.

    It's still useful, however, because that is the only metric that is instantly intuitively understandable and comparable across a wide variety of contexts, i.e. across companies and teams and languages and applications.

    As we know, within the same team working on the same product, a 1000 LoC diff could take less time than a 1 line bug fix that took days to debug. Hence we really cannot compare PRs or product features or story points across contexts. If the industry could come up with a standard measure of developer productivity, you'd bet everyone would use it, but it's unfeasible basically for this very reason.

    So, when such comparisons are made (and in this case it was clearly a colloquial usage), it helps to assume the context remains the same. Like, a team A working on product P at company C using tech stack T with specific software quality processes Q produced N1 lines of code yesterday, but today with AI they're producing N2 lines of code. Over time the delta between N1 and N2 approximates the actual impact.

    (As an aside, this is also what most of the rigorous studies in AI-assisted developer productivity have done: measure PRs across the same cohorts over time with and without AI, like an A/B test.)

  • moomoo11 1 day ago

    I follow Garry Tan on X and he’s a big proponent of LOCmaxxing using AI.

    AI helps eng ship more and faster, I think that’s the takeaway.

  • autoconfig 1 day ago

    The charitable interpretation here is obviously that the LoCs are equivalent in quality, in which case it is a very useful metric in the context that was presented. The inability to infer that should be embarrassing.

  • jwpapi 1 day ago

    I deleted 75,000 lines of code from my codebase in the last 2 months, and that was tremendously more useful to my business than the 75,000 the AI had written in the 2 months before...

  • np1810 1 day ago

    I just read somewhere on HN that "code is a liability, not an asset, the idea behind the code/final product is the actual asset." And, I can't agree more...

    > It is so embarrassing that LOC is being used as a metric for engineering output.

    In one of my previous orgs, LOC added in the previous year was a metric used to distinguish a good engineer vs. a PIP (bad) engineer. Also, LOC removed was treated as a negative metric for the same. I hope they've changed this methodology for the LLM code-spitting era...

dataviz1000 1 day ago

Have you noticed that the coding agents get really close to the solution on the first one shot and then require tons of work to get that last 10% or 5%?

If we shift the paradigm of how we approach a coding problem, the coding agents can close that gap. Ten years ago every 10 or 15 minutes I would stop coding and start refactoring, testing, and analyzing making sure everything is perfect before proceeding because a bug will corrupt any downstream code. The coding agents don't and can't do this. They keep that bug or malformed architecture as they continue.

The instinct is to get the coding agents to stop at these points. However, that is impossible for several reasons. Instead, because regeneration is very cheap, we should find the first place the agent made a mistake and update the prompt. Instead of fixing the code, delete all of it and run from the top. Continue this iteration process until the prompt yields the perfect code.

Ah, but you say, that is a lot of work done by a human! That is the whole point. The humans are still needed. The process using the tool like this yields 10x speed at writing code.
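
A runnable sketch of that iteration loop, with toy stand-ins for the agent and the test gate (the real loop would shell out to your coding agent and test suite):

```python
def regenerate_until_pass(prompts, generate, passes):
    """Each round throws the previous attempt away entirely and regenerates
    from an improved prompt, rather than patching the agent's output."""
    for prompt in prompts:          # human refines the prompt between rounds
        code = generate(prompt)     # fresh run from the top; old code is deleted
        if passes(code):            # deterministic gate: tests, lint, benchmarks
            return prompt, code
    raise RuntimeError("out of prompt refinements; rethink the approach")

# Toy stand-ins so the control flow is demonstrable without a real agent:
# the "agent" just echoes the prompt, and the gate only accepts the precise one.
prompt, code = regenerate_until_pass(
    prompts=["vague ask", "precise ask"],
    generate=lambda p: f"code for: {p}",
    passes=lambda c: "precise" in c,
)
```

The key design choice is that the prompt, not the generated code, is the artifact you iterate on.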

  • nichochar 1 day ago

    This was often true when writing code manually to be fair.

    You could get to "something that works" rather fast but it took a long time to 1) evaluate other options (maybe before, maybe after), 2) refine it, 3) test it and build confidence around it.

    I think your point stands but no one really knows where. The next year or so is going to be everyone trying to figure that out (this is also why we hear a lot of "we need to reinvent github")

    • SV_BubbleTime 1 day ago

      When I hire fresh out of college… I can see them coming in without the slightest comprehension of the difference between the things they did in school to get a grade and never touch again, and a product that is supposed to exist and work for 10+ years.

  • tyyyy3 1 day ago

    The problem with life in general is that the last 5-10% is always the hardest. And in many cases it makes no economic sense to invest in trying to mechanise that last part.

    I believe the LLM providers went with the wrong approach from the off: the focus should’ve been on complementing labour, not displacement. And I believe they have learned an expensive lesson along the way.

  • skybrian 1 day ago

    I tend to get something working and refactor my way out, which does work and you can use a coding agent to do it, but it takes time. Maybe starting over would have been better, but I didn’t know what I wanted the architecture to look like at the beginning.

  • deadbabe 1 day ago

    That will not work as cleanly as you described once a lot of code has been committed to the code base. You cannot just blow away an entire working code base and start over just because an LLM is struggling to make a feature work with existing architecture.

    • gck1 1 day ago

      This has happened on every single greenfield project that I've started with AI, no matter how rigorous a process I had defined.

      And it's not just easier because it's cheap, it's easier because you're not emotionally attached to that code. Just let it produce slop, log what worked, what didn't, nuke the project and start over.

      It just gets incredibly boring.

      • deadbabe 1 day ago

        People will get attached to code that works just right and they don’t want to mess with it too much.

  • NickNaraghi 1 day ago

    Yes! Anthropic team calls this “regenerate, don’t fix.”

    The person who builds an agentic IDE or GitHub alternative that natively does the process you describe will be a multibillionare.

  • randyrand 1 day ago

    I can go long session with it making great code.

    But the first time I say “No, it should be …” it’s nearly game over. If you say it 3+ times in a row, you’re basically doomed.

    Sure, you can get it to fix the bug, but it comes at the cost of future prompts often barely working.

    • fransje26 1 day ago

      I second that experience.

      The moment I hit the "no, it should be.." point, I know it's the end of it.

      Sometimes I can salvage something by asking for a summary of the work and reasoning done, and doing a fresh restart. But often times, it's manual corrections and full restart from there.

  • fransje26 1 day ago

    > Ah, but you say, that is a lot of work done by a human! That is the whole point. The humans are still needed. The process using the tool like this yields 10x speed at writing code.

    Shame that what is left for the humans is the shitty, tedious part of the work.. It reminds me of the quote:

       I want AI to do my laundry and dishes so that I can do art and writing, not for AI to do my art and writing so that I can do laundry and dishes..
peterbell_nyc 1 day ago

For me the distinction is the quality and rigor of your pipeline.

Vibe coding: one shot or few shot, smoke test the output, use it until it breaks (or doesn't). Ideal for lightweight PoC and low stakes individual, family or small team apps.

Agentic engineering:

- You care about a larger subset of concerns, such as functional correctness, performance, infrastructure, resilience/availability, scalability and maintainability.
- You have a multi-step pipeline for managing the flow of work.
- Stages might be project intake, project selection, project specification, epic decomposition, story decomposition, coding, documentation and deployment.
- Each stage will have some combination of deterministic quality gates (tests must pass, performance must hit a benchmark) and adversarial reviews (business value of proposed project, comprehensiveness of spec, elegance of code, rigor and simplicity of ubiquitous language, etc.)

And it's a slider. Sometimes I throw a ticket into my system because I don't want to have to do an interview and burn tokens on three rounds of adversarial reviews, estimating potential value and then detailed specification and adversarial reviews just to ship a feature.

  • Aurornis 1 day ago

    If your slider only goes between vibe coding and agentic engineering, you're missing an entire range of engineering where the human is more involved.

    I've been using Opus, GPT-5.5, and some lesser models on a daily basis, but not having them handle entire tasks for me. Even when I go to significant effort to define and refine specs, they still do a lot of dumb things that I wouldn't allow through human PR review.

    It would be really easy to just let it all slide into the codebase if I trusted their output or had built some big agentic pipeline that gave me a false sense of security.

    Maybe 10 years from now the situation will be improved, but at the current point in time I think vibe coding and these agentic engineering pipelines are just variations of a same theme of abdicating entirely to the LLM.

    This morning I was working on a single file where I thought I could have Opus on Max handle some changes. It was making mistakes or missing things on almost every turn that I had to correct. The code it was proposing would have mostly worked, but was too complicated and regressed some obvious simplifications that I had already coded by hand. Multiply this across thousands of agentic commits and codebases get really bad.

    • transcriptase 1 day ago

      Next time give it the context required for the task, eg an explanation of why you have those hand coded simplifications, and be amazed at how proper use of a tool works better than just assuming your drill knows what size bit to pick.

  • bryan0 1 day ago

    I agree: vibe coding does not have quality-gate checks at each stage, while agentic engineering does. Dev teams get into trouble when they try to build without a proper process of design, tests, and reviews. This was true before agentic coding, but it's especially true now. The teams that understand how to leverage agents in this process are the ones that will be most successful.

underdeserver 1 day ago

When I was in grad school I graded homework for first year math classes, and the thing about math homework is that the perfect homework takes almost no time to grade.

It's the bad, semi-coherent submissions that eat up your time, because you do want to award some points and tell students where they went wrong. It's the Anna Karenina principle applied to math.

Code review is the same thing. If you're sure Claude wrote your endpoint right, why not review it anyway? It's going to take you two minutes, and you're not going to wonder whether this time it missed a nuance.

  • scottyah 1 day ago

    Typically in engineering you don't know what you're doing. If you're sure of what it should look like going in, you're more of a technician. I think most people coding have no idea what they're doing to a large extent- not many people can do the same rote work for years straight.

wek 1 day ago

What an excellent article by a smart, humble, still-learning person!

Favorite quote: "There are a whole bunch of reasons I’m not scared that my career as a software engineer is over now that computers can write their own code, partly because these things are amplifiers of existing experience. If you know what you’re doing, you can run so much faster with them. [...]

I’m constantly reminded as I work with these tools how hard the thing that we do is. Producing software is a ferociously difficult thing to do. And you could give me all of the AI tools in the world and what we’re trying to achieve here is still really difficult. [...]"

  • keeganpoppen 1 day ago

    it’s sad that i had to triple-read this to determine you weren’t being sarcastic. sad for whom? i don’t know. but the amplifier take is exactly the right one.

    • wek 1 day ago

      I kind of felt the same way reading the article! It felt so unusual to encounter someone who is both smart and humble and willing to admit they were learning. And I was happy to encounter it and sad that I was so surprised by it.

    • rirze 19 hours ago

      I didn't think it was sarcastic till I read your comment, at which point I got confused and read it twice to make sure it wasn't sarcastic.

      Nevertheless, it is refreshing to see nuanced positive energy. I agree that AI is going to be the great multiplier.

  • alishayk 1 day ago

    What do you do if you don't have that existing experience? How do you build it up?

    • mikestaas 1 day ago

      Break things, and then fix them. Repeat many times.

    • sotix 1 day ago

      Build it up in your free time. It's extraordinarily valuable to build up those skills, and I'm not convinced that companies will allow time to slow down and build them.

ofrzeta 1 day ago

"I want professionally managed software companies to use AI coding assistance to make more/better/cheaper software products that they sell to me for money.” (Simon Willison herein quotes Matthew Yglesias) - this is such a naive and sloppy take. What do you want? "better software"? not going to happen. "cheaper software"? not going to happen either. "more software"? for sure, but is it really what you want?

If I hire a plumber it's certainly not cheaper than doing it myself, but when I'm paying money I want to be sure the quality is better than my own vibe plumbing.

  • ianm218 23 hours ago

    I definitely want more, higher-quality software, maybe even 10X more. Even simple things like a personal assistant that can help manage my social life better don't really exist yet, never mind that I want a medical team doing research on my behalf and optimizing my insurance, or a software team in the background building bespoke software for all my hobbies, etc.

  • senordevnyc 19 hours ago

    I'm already getting and creating better software for cheaper. I have lots of software products that I use that are better now than a few years ago because of AI. And much of the software I use is free. What are you talking about exactly?

    And on the creation side, I run a SaaS that's taking over a niche market because it replaces a human-powered process with an AI-powered one. Customers switch to me because they get better results more consistently, much faster, and much cheaper.

kelnos 1 day ago

Yup, the normalization of deviance here is a real thing. I still review all the code the LLM generates (well, really, I have it generate very little code: I use it more for planning, design, rubber-ducking, and helping track down the causes of bugs), but as time goes on without obvious errors, it gets more and more tempting to assume the code is going to be fine, and not look at it too closely.

But resisting that impulse is just another part of being a professional. If your standards involve a certain level of test coverage, but your tests haven't flagged any issues in a long time, you might be tempted to write fewer tests as you continue to write more code. Being a professional means not giving in to that temptation. Keep to your quality standards.

Sure, standards are ultimately somewhat arbitrary, and experience can and should cause you to re-evaluate your standards sometimes to see if they need tweaking. But that should be done dispassionately, not in the middle of rushing to complete a task.

And hell, maybe someday the agents will get so good that our standards suggest that vibe coding is ok, and should be the norm. But you're still the one who's going to be responsible when something breaks.

wg0 1 day ago

Here's for the AI supremacists:

Let's assume AI is 10x better than humans in accuracy, produces 10x fewer bugs, and increases speed by 1000x compared to a very capable software engineer.

Now imagine this: a car travels on a road that has 10x more bumps, but at a 1000x slower pace. Even though there are 10x the bumps, the ride will feel less bumpy because you're encountering them at a far lower rate.

Now imagine a road with 10x fewer bumps, but you're traveling at 1000x the speed. Your ride will be a lot bumpier.

That's agentic coding for you: your ride will be a lot more painful. There's a lot of denial around that, but as time progresses it will become very hard to deny.

Lastly: vibe coding is honest, but agentic coding is snake oil [0]. The arguments for harnesses with dozens of memory, agent, and skill files, pages and pages of them with rules sprinkled throughout, are wrong as well. That paradigm assumes LLMs are perfectly reliable, super-accurate rule followers, and that the industry's only problem is not being able to specify the rules clearly enough.

Such a belief could only be held by someone who hasn't worked with LLMs long enough, or by a non-technical person not knowledgeable enough to know how LLMs work; seeing a highly technical community hold on to such a wrong belief system is highly regrettable.

[0]. https://news.ycombinator.com/item?id=48018018

  • jwpapi 1 day ago

    You are speaking straight from my soul. Thank you. Great example. I have been grinding AI extensively, 14 hours a day, on my own project for months. I’ve been using AI since GPT-2.

    I maxxed out Claude Max $200 subscription and before I justified spending $100/day.

    And it was worth it, but not because it wrote me such good code; it was worth it because I learned the lessons of software engineering fast. I had the exact ride you are describing. My software was incredibly broken.

    Now I see all the cracks, lies and "barking up the wrong tree" issues clearly.

    NOW I treat it as an untrustworthy search engine for domains I’m behind in. I also use next-edit prediction and auto-complete, but I don’t let AI make any edits to my codebase anymore.

  • princevegeta89 1 day ago

    I will 100% agree with this. It feels very scary to see entire teams completely handing off all coding, testing, and even design needs to AI. This not only makes people lose their touch but also lets them push insane amounts of code every day. PRs become impossible for humans to review because they are too huge and add too much burden, so, unsurprisingly, people use AI to review them as well. And with that amount of code churn, nobody knows what exactly is being implemented. I have seen first-hand that as the size of the code base grows, tracing problems and actually debugging when things go wrong gets incredibly rough and complex.

    And AI that has been helping all this time will suddenly stop helping out with this one use case. I have experienced AI running in circles, in this case trying to find a root cause. It failed, and the user is left holding the bag. That is when you feel like you have just been dropped into a vast ocean without a lifeboat. Then you'll have to just start looking through those massive chunks of vibe-coded crap to understand what is going on.

    AI is good for improving speed, but I am afraid we are massively taking it the wrong way as engineers. Everyone is letting it run on autopilot and making it do things completely from start to end. The ideal lies where every piece of code it writes is reviewed by its authors, who make sure they are not checking in crazy stuff day in and day out.

  • ex-aws-dude 19 hours ago

    I don't understand what you mean by the last point

    If I generate code with an agent and review it and iterate back and forth until the quality is as high as I would write myself, the end result is no different

    I'm still in control of holding it to the same quality level?

    With agentic coding there is still a human reviewing the code, that's the main difference from vibe-coding

    The rules are just to try to guide it and save iteration time but there is no illusion that they are actual hard rules since everything is statistical.

    > Such paradigm assumes that LLMs are perfect reliable super accurate rule followers

    That's the whole reason we're not vibe-coding, we are well aware of that.

keeda 1 day ago

I think all coding will become vibe coding, but it will be no less an engineering discipline.

Note: I still review pretty much every line of code that I own, regardless of who generates it, and I see the problems with agents very clearly... but I can also see the trends.

My take: Instead of crafting code, engineering will shift to crafting bespoke, comprehensive validation mechanisms for the results of the agents' work, such that it is technically (maybe even mathematically) provable as far as possible, and any non-provable validations can be reviewed quickly by a human. I would also bet the review mechanisms will be primarily visual, because vision is the highest-bandwidth input available to us.

By comprehensive validations I don't mean just tests, but multiple overlapping, interlocking levels of tests and metrics. Like, I don't just have an E2E test for the UI, I have an overlapping test for expected changes in the backend DB. And in some cases I generate so many test cases that I don't check for individual rows, I look at the distribution of data before and after the test. I have very few unit tests, but I do have performance tests! I color-code some validation results so that if something breaks I instantly know what it may be.
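A minimal sketch of that distribution-level check, assuming nothing beyond the idea above (all names, thresholds, and data are illustrative, not from any real project): instead of asserting on individual rows, reduce a column to coarse statistics and check that a bulk operation moved them the way you expected.

```python
from statistics import mean, stdev
import random

def distribution_summary(values):
    """Reduce a column of numbers to a few coarse statistics."""
    return {"count": len(values), "mean": mean(values), "stdev": stdev(values)}

def validate_distribution(before, after, expected_new_rows, rel_tolerance=0.05):
    """Check that a bulk insert changed the distribution the way we expected."""
    b, a = distribution_summary(before), distribution_summary(after)
    # Exact check where exactness is cheap: the row count.
    assert a["count"] == b["count"] + expected_new_rows, "row count mismatch"
    # Fuzzy check where row-by-row exactness is impractical: the mean should
    # stay within a relative tolerance of its previous value.
    assert abs(a["mean"] - b["mean"]) <= rel_tolerance * abs(b["mean"]), "mean drifted"
    return True

# Example: 1000 synthetic "order amounts", then 100 new ones from the same distribution.
random.seed(42)
before = [random.gauss(100, 10) for _ in range(1000)]
after = before + [random.gauss(100, 10) for _ in range(100)]
print(validate_distribution(before, after, expected_new_rows=100))  # True
```

The point is that the validation is cheap to generate but catches whole classes of bugs (dropped rows, corrupted values) without anyone reading the code that did the insert.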

All of this is overkill to do manually but is a breeze with agents, and over time really enables moving fast without breaking things. I also notice I have to add very few new validations for new code changes these days, so once the upfront cost is paid, the dividends roll in for a long time.

Now, I had to think deeply about the most effective set of technical constraints that give me the most confidence while accounting for the foibles of the LLMs. And all of this is specific to my projects, not much can be generalized other than high-level principles like "multiple interlocking tests." Each project will need its own custom validation (note: not just "test") suites which are very specific to its architecture and technical details.

So this is still engineering, but it will be vibe coding in the sense that we almost never look at the code, we just look at the results.

  • rvz 1 day ago

    This is complete insanity for anyone that actually works on production-grade, hundred billion dollar systems that are critical to the function of the global economy.

    Other than for your own pet projects, almost nothing you said has a place in "vibe engineering" or "vibe coding" serious software products that are needed in life-and-death situations.

    • keeda 1 day ago

      That may be true for highly critical systems, but those are a tiny, tiny, tiny minority of all software projects. I mean, how many engineers work on aviation or automotive or X-ray machine or other life-and-death code compared to pretty much anything else?

      And not all "production-grade, hundred billion dollar systems" are that critical. Like, Claude Code as we all know is clearly vibe-coded and is already a 10-billion (and rapidly increasing!) dollar system. Google Search and various Meta apps meet those criteria and people are already using LLMs on that code, and will soon be "vibe coding" as I described it.

      AWS meets those criteria and has already had an LLM-caused outage! But that's not stopping them from doing even more AI coding. In fact, I bet they will invest in more validation suites instead, because those are a good idea anyway. After all, the cloud providers were having outages long before the age of LLMs.

      The thing most people are missing is that code is cheap, and so automated validations are cheap, and you get more bang for the buck by throwing more code in the form of extensive tests and validations at it than human attention.

      Edited to add: I think I can rephrase the last line better thus: you get more bang for the buck by throwing human attention at extensive automated tests and validations of the code rather than at the code itself.

      • rvz 1 day ago

        This is you:

        >> I think all coding will become vibe coding...

        Nope. First of all, let's get the true definition of "vibe coding" completely clear, from the first mention of it by Karpathy. From [0]:

        >> "There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists." [0]

        >> "I "Accept All" always, I don't read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it. The code grows beyond my usual comprehension, I'd have to really read through it for a while. Sometimes the LLMs can't fix a bug so I just work around it or ask for random changes until it goes away." [0]

        So, by the true definition, you are arguing that all coding will become "vibe-coding", including in mission-critical software. Not even Karpathy would go that far, and even he only claims that it works... "mostly".

        Responsibility is what cannot be vibe-coded. The major cloud providers and the tech companies that own them have contracts with their customers worth billions in revenue. That is why they cannot afford to "vibe-code" infra that costs them $100M+ an hour when a key part of it goes down or stops working.

        So:

        > Like, Claude Code as we all know is clearly vibe-coded and is already a 10-billion (and rapidly increasing!) dollar system.

        That is not vibe-coded anymore; it is maintained by software engineers who look at the code at all times, daily, before merging any changes, AI-generated or not.

        > Google Search and various Meta apps meet those criteria and people are already using LLMs on that code, and will soon be "vibe coding" as I described it.

        Nope. As Karpathy described it, that would never happen: human software engineers will be reviewing the agents' code at all times. And that would not be vibe-coding, would it?

        > AWS meets that criteria and has already had an LLM-caused outage!

        Are they vibe coding now after that outage? I bet that they are not.

        > After all, all the cloud providers have been having outages long before the age of LLMs.

        That isn't the point. Someone was held to account for the outages and had to explain why it happened.

        They would lose trust plus billions of dollars if they admitted that they vibe-coded their entire infra and had zero engineers who understood why it went wrong.

        > The thing most people are missing is that code is cheap, and so automated validations are cheap, and you get more bang for the buck by throwing more code in the form of extensive tests and validations at it than human attention.

        The risk is amplified with the company's reputation on the line, and it's very expensive to lose. I'm talking hundreds of billions annually; a 10% loss of global revenue due to constant outages can cause the stock to fall.

        So do you understand that what you said earlier about AWS in fact strengthens my point about the limitations of vibe coding, especially for mission-critical software?

        [0] https://x.com/karpathy/status/1886192184808149383

        • keeda 1 day ago

          Even ignoring the semantic drift that has happened since he coined the term (on which there have already been a few HN threads), the key part of Karpathy's definition is "...and forget that the code even exists." Which is why I was careful to phrase it thus:

          > So this is still engineering, but it will be vibe coding in the sense that we almost never look at the code, we just look at the results.

          It is pretty clear that "giving in to the vibes" is simply "looking at the results." But I'm predicting that it is going to be an engineering discipline in itself. Note that I started with (emphasis added):

          > I think all coding will become vibe coding but it will be no less an engineering discipline.

          And then I went on to explain the engineering aspect as extensive technical validation. There is a role called Validation Engineers in many industries including semiconductors, and I posit that it's going to be everybody's primary role soon.

          > Responsibility is what cannot be vibe-coded. ... That isn't the point. Someone was held to account for the outages and had to explain why it happened.

          I never implied a loss of accountability anywhere, but I completely agree, and have posted about it before: https://news.ycombinator.com/item?id=46319851

          That is still orthogonal to vibe-coding. People have been sloppy without vibe-coding and were still held accountable. The flaw is assuming all vibe-coding is slop, because my point is that validation will matter much more than the code, which means soon we may never look at the code. In fact, extensive automated validation is probably a better signal for accountability than "We looked at the code very, very carefully."

          • rvz 6 hours ago

            > Even ignoring the semantic drift that has happened since he coined the term (on which there have already been a few HN threads), the key part of Karpathy's definition is "...and forget that the code even exists." Which is why I was careful to phrase it thus:

            The point of bringing up the exact definition is to draw a clear line on what defines "vibe-coding" in the first place. There is no "semantic drift": Karpathy's entire tweet is the definition, and I don't think you can separate out any part of it.

            In your first post you mentioned "vibe coding", and by definition it includes not looking at the code, accepting all changes the AI agent suggests, and copy-pasting errors back to the agent until they are fixed, without any understanding; exactly how Karpathy first defined it.

            > It is pretty clear that "giving in to the vibes" is simply "looking at the results."

            It seems like you don't even understand what vibe coding is. "Giving in to the vibes" is not just looking at the results; it is not looking at the code at all and accepting the agent's output without any understanding of it.

            > But I'm predicting that it is going to be an engineering discipline in itself. Note that I started with (emphasis added):

            "Vibe coding" is no more engineering than "coding" is software engineering.

            > And then I went on to explain the engineering aspect as extensive technical validation. There is a role called Validation Engineers in many industries including semiconductors, and I posit that it's going to be everybody's primary role soon.

            It already is, in the form of quality assurance. That has existed for years alongside formal verification and validation engineers, and "vibe coding" is incompatible with all of it; it breaks the software development lifecycle.

            These roles were already a given at many companies, so there is nothing new that you actually said.

    • tempaccount5050 1 day ago

      Almost no one works on stuff like that, so congrats on finding a corner case I guess.

      • rvz 1 day ago

        Complete nonsense.

        There are people who write software for hedge funds, quant firms, aviation and defense systems, data center providers, major telecom services used by hospitals and emergency services, semiconductor firms, and the big oil and energy companies. That is NOT "almost no one", and these companies make hundreds of billions of dollars a year.

        This is even before me mentioning big tech.

        Perhaps the work most people on this site are doing is not serious enough: toy projects that can be totally vibe-coded and bring in close to $0, so the company doesn't care.

        What I am talking about is the software that is responsible for being the core revenue driver of the business and it being also mission critical.

        • adithyassekhar 1 day ago

          I would prefer hedge funds and traders to vibe code their software. Heck, I am willing to do it if I must.

          • rvz 6 hours ago

            You would never say that if you interviewed at a hedge fund or quant firm and they'll laugh at you before they look at your resume.

        • keeda 1 day ago

          I could list dozens more sectors of the software industry that would far outnumber those you listed. And even within those you listed, those working on the mission critical parts are a very tiny fraction. Statistically, that is almost no-one.

          E.g. there are 100s of millions of lines of code in a car, but the vast majority of that concerns non-critical parts like the dashboard; the primary Engine Control Unit has like ~10K LoC, and the number of people that work on it are proportionally smaller.

          And if you think that is very well-designed code, here's something to help you sleep better: https://www.reddit.com/r/coding/comments/384mjp/nasa_softwar...

  • rirze 19 hours ago

    This take is premature. We forget that AI is seamless in contexts that are well represented in the training datasets (popular programming languages, open source libraries, well-documented algorithms, etc.).

    It very obviously hallucinates when it comes to new programming languages, new domains, and uncommon or poorly documented contexts. And AI is very poor at (3D) spatial visualization, which makes AI-assisted CAD development incredibly hard.

    AI is not capable of genuine logical thinking from fundamentals yet; these are highly trained, curated models.

turtlebits 1 day ago

The scary part is that codebases are accumulating layers of AI complexity, and it's going to cost $$$ to have the latest model decipher them and make changes, since no human can understand the code anymore.

Pretty soon there is no code reuse and we're burning money reinventing the wheel over and over.

  • bossyTeacher 1 day ago

    > The scary part is that codebases are getting layers of AI complexity, that it's going to cost $$$ to have the latest model decipher

    Isn't this a bit like old Java or IDE-heavy languages like old Java/C#? If you tried to make Android apps back in the early days, you HAD to use an IDE; the ridiculous amount of boilerplate needed to display a "Hello World" alert after clicking a button was soul-destroying.

    • turtlebits 1 day ago

      At least a human can get involved. Complex codebases written by humans can be understood.

      If the barrier is too high, code is refactored.

    • layer8 1 day ago

      The difference is that the complexity to achieve “Hello World” was the same for everyone, and more or less well-understood and documented. With AI, you get some different random spaghetti slop each time.

  • ewild 1 day ago

    I genuinely think it's part of a psyop. If we bloat all codebases and eventually start printing the models on chips to reduce inference costs by 50-100x, they'll take in massive profits from 5M-line codebases instead of 350k.

  • somewhereoutth 1 day ago

    Prior to the advent of LLMs, I had this concept of the 'complexity horizon' - essentially a [hand built] software system will naturally tend to get more and more complex until no-one can understand it - until it meets the complexity horizon. And there it stays, being essentially unmaintainable.

    With LLMs, you can race right for that horizon, go right through, and continue far beyond! But then of course you find yourself in a place without reason (the real hell), with all the horror and madness that that entails.

  • eddiewithzato 1 day ago

    The models today will happily slop out a 1k-LOC React index component on a brand-new project.

    They really are bad for creating a healthy codebase

dev360 1 day ago

> It’s not just the downstream stuff, it’s the upstream stuff as well. I saw a great talk by Jenny Wen, who’s the design leader at Anthropic, where she said we have all of these design processes that are based around the idea that you need to get the design right—because if you hand it off to the engineers and they spend three months building the wrong thing, that’s catastrophic.

This is spot on. I think the tooling is evolving so much, particularly on the design side, that it's not worth the "translation cost" to stay (or even be) on the Figma side anymore.

  • christophilus 1 day ago

    If you hand something off to engineering and they spend three months building the wrong thing, you’ve got a dysfunctional organization.

gabriela_c 1 day ago

Claude often does things in more detail, and even better, than I would, in the first pass. But I don't understand how anybody stands comments generated by an LLM?

It's seriously the thing that worries (and bothers) me the most. At a minimum, I almost never let unedited LLM comments pass.

Most of the time, I use my own vibe-coded tool to run multiple GitHub-PR-review-style reviews, and send them off to the agent to make the code look and work fine.

It also struggles with doing things the idiomatic way for huge codebases, or sometimes it's just plain wrong about why something works, even if it gets it right.

And I say this despite the fact that I don't really write much code by hand anymore, only the important ones (if even!) or the interesting ones.

Also, don't even get me started on AI-generated READMEs... I use Claude to refine my Markdown or automatically handle dark/light-mode, but I try to write everything myself, because I can't stand what it generates.

  • jazzypants 1 day ago

    I find that the best thing about generating documentation with LLM's is that it gets me angry enough to rewrite it correctly.

    "Ugh, no! Why would you say it like that? That's not even how it works! Now, I need to write a full paragraph instead of a short snippet to make sure that no future agents get confused in the same way."

  • mkozlows 1 day ago

    The comments aren't an LLM thing, they're a Claude thing. Codex doesn't write those gross hyper-verbose comments.

    • user34283 1 day ago

      In my experience Codex barely writes any comments, despite my attempts to encourage it in the AGENTS.md.

GistNoesis 1 day ago

The real paradigm shift is not here yet, but it's not very far away. I'm talking about the single unified codebase: agents building one codebase for all your software needs.

Most of the complexity in software comes from interfacing with external components; when you don't need to adapt to those, you can write simpler and better code.

Rather than relying on an external library, you just write your own and have full control and can do quality control.

The Linux kernel is 30,000,000 LOC. At 100 tokens/s, say 1 LOC per second produced on a single 4090 GPU, in one year of continuous running (3600 * 24 * 365 = 31,536,000 seconds) everyone can have their own OS.
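Sanity-checking the back-of-the-envelope arithmetic above (the 1 LOC/s sustained rate is the assumption; the rest is just multiplication):

```python
# Seconds in a year of continuous running.
kernel_loc = 30_000_000            # rough size of the Linux kernel
loc_per_second = 1                 # assumed sustained output of one GPU
seconds_per_year = 3600 * 24 * 365

years_per_kernel = kernel_loc / (loc_per_second * seconds_per_year)
print(seconds_per_year)            # 31536000
print(round(years_per_kernel, 2))  # 0.95 -- just under one GPU-year per kernel-sized codebase
```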

It's the "Apps" story all over again: there are millions of apps, but the average user only has about 100 and uses 10 daily at most.

Standardize data and services and you don't need that much software.

What will most likely happen is one company with a few millions GPUs will rewrite a complete software ecosystem, and people will just use this and stop doing any software because anything can be produced on the fly. Then all compute can be spent on consistent quality.

  • deadbabe 1 day ago

    Every happy OS will be the same. Every broken OS will be broken in its own way. What a nightmare.

  • ytoawwhra92 1 day ago

    > Standardize data and services and you don't need that much software.

    We've known this since close to the advent of computing, and yet every generation of software has taken us further away from this goal, largely driven by jealous resource-guarding, particularly when it comes to data. Why don't I have a generic media player app that can stream Netflix, Disney, Hulu, etc.? Those brands want control over my experience. They will continue to want that control indefinitely. That basic human desire for control won't evaporate with a "single unified codebase".

vmaurin 1 day ago

> The entire software development lifecycle was, it turns out, designed around the idea that it takes a day to produce a few hundred lines of code. And now it doesn’t.

No, it was never designed around that. Software development methodologies don't focus much on writing the code; they focus on everything else: requirement definition, quality, maintenance, speed of integrating features, scaling the work, ...

Personally, with 20 years of experience, I have never seen a single company where writing the code was the bottleneck.

  • senordevnyc 19 hours ago

    > requirement definition, quality, maintenance, speed of integrating feature, scaling the work

    Literally every single one of these is much, much faster with AI than without. It's not even close.

noduerme 1 day ago

>> The entire software development lifecycle was, it turns out, designed around the idea that it takes a day to produce a few hundred lines of code.

Yeah. I'm not sure how other people work, but I almost never need to write formal tests because I essentially test locally as I write, one method at a time, and at that moment I have a complete mental map of everything that can potentially go wrong with a piece of code. I write and test constantly in tandem. I can write a test afterwards to prove what I already know, but I already know it. This is time consuming, anal, and obsessive-compulsive, and luckily that kind of work perfectly suits my personality. The end result is perfect before I commit it.

It is a lot of fun asking LLMs to write code around my code. Make 10 charts with chartjs in an html page that show something and put it behind a reverse proxy so the client can see it. Wow. Spot on, would've taken me an hour. I can even rely on Claude to somewhat honestly reason about things in personal projects.

But knowing every implementation decision makes a huge difference when anything real is at stake. "Guilt" wouldn't begin to describe the sense I'd have if my software did something because of a piece of code I hadn't personally reviewed and fully understood, at which point I probably should have just written it myself.

bhagyeshsp 1 day ago

> The thing that really helps me is thinking back to when I’ve worked at larger organizations where I’ve been an engineering manager. Other teams are building software that my team depends on.

> If another team hands over something and says, “hey, this is the image resize service, here’s how to use it to resize your images”... I’m not going to go and read every line of code that they wrote.

The distance between an output and the producer accountable for it is an important metric. Who will be held accountable for which output? Keeping that clear is what keeps the "guilt" away.

So, organizations will need to focus on building better, more granular incentive and punishment mechanisms for large-scale software projects.

drmajormccheese 1 day ago

There are techniques for improving our confidence in our software: unit testing, integration testing, fuzz testing, property-based testing, static analysis, model checking, theorem proving, formal methods, etc. The LLM is not only a tool for generating lines of code. It can also generate lines of testing. The goal is that the tests are easier to audit by the humans than the code.
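As a toy illustration of the property-based idea mentioned above (hand-rolled with the stdlib here; a real library like Hypothesis does this far more thoroughly), the one-line round-trip property at the bottom is much easier for a human to audit than the encoder itself. The encoder and its cap are made up for the example:

```python
import random

def encode(data: bytes) -> bytes:
    """Toy run-length encoder: (count, byte) pairs, run length capped at 255."""
    out = bytearray()
    i = 0
    while i < len(data):
        j = i
        while j < len(data) and data[j] == data[i] and j - i < 255:
            j += 1
        out += bytes([j - i, data[i]])
        i = j
    return bytes(out)

def decode(data: bytes) -> bytes:
    """Inverse of encode: expand each (count, byte) pair."""
    out = bytearray()
    for k in range(0, len(data), 2):
        out += bytes([data[k + 1]]) * data[k]
    return bytes(out)

# The property under test: decode(encode(x)) == x for any input. One short
# property replaces dozens of hand-picked example cases.
random.seed(0)
for _ in range(500):
    x = bytes(random.randrange(4) for _ in range(random.randrange(50)))
    assert decode(encode(x)) == x
print("round-trip property held for 500 random inputs")
```

Auditing "encoding then decoding must return the original" takes seconds; auditing the loop bounds in `encode` does not, and that asymmetry is exactly what makes generated tests worth more human attention than generated code.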

  • exographicskip 1 day ago

    I've found that one of the areas I used to enjoy least is now what I spend a lot of time on: testing!

    Property-based testing in particular has uncovered a number of invariants in every code base I've introduced it to.

    To be fair, depending on the agent/model, a lot of the tests end up being thrown out, so it's possible I _should_ hand-write more tests, but having better prompts and detailed plans seems to mitigate that somewhat.

  • sfjailbird 1 day ago

    How do we make sure the LLM generated code works? We'll have LLM generated tests! Wait a minute...

  • coldtea 1 day ago

    >There are techniques for improving our confidence in our software: unit testing, integration testing, fuzz testing, property-based testing, static analysis, model checking, theorem proving, formal methods, etc. The LLM is not only a tool for generating lines of code. It can also generate lines of testing.

    Which is the same issue of lack of understanding and care and accountability from the human operator, with extra steps and a false sense of security.

mrothroc 19 hours ago

The "blurring" framing makes Simon's tension sound intrinsic when it is actually structural. Vibe coding and agentic engineering aren't on a continuum. They're distinguished by the process.

Engineering is always about a defined process. We follow it to produce predictable artifacts that meet the specifications. Even though code is somewhat "squishy" in that it is an art just as much as a science, it still has to meet the spec.

This has always been true, even before agents started writing code for us. We've all dealt with spaghetti code because of undisciplined practices. That's exactly why we came up with the standard SDLC process: plan, design, code, test, deploy. Repeat.

The part people seem to forget about when looking at this is the space between the steps: the gates. We review the artifacts produced at each stage. If the reviewer does not approve, the engineer has to fix it until it passes. True for human coders, doubly true for agentic coders.

Agentic engineering still follows the process. Artifacts are now cheap to produce, which means we have to adjust it so we don't overwhelm the humans in the loop. For me, this means augmenting my review step with agentic reviewers to catch the dumb stuff. It only escalates to me when either a) it passes clean or b) there is something that genuinely needs my experience.

This is agentic engineering, not vibe coding.

_doctor_love 1 day ago

Repeat after me: most software spends the majority of its lifetime in the maintenance phase.

Repeat after me: it follows that most of the money the software makes occurs during the maintenance phase.

Repeat after me: our industry still does not understand this after almost 100 years of being in existence.

Alan Kay was 100% right when he said that the computer revolution hasn't occurred yet. For all of our current advancements all tools are more or less in the Stone Age.

My great hope is that AI will actually accelerate us to a point where the existing paradigm fully breaks beyond healing and we can finally do something new, different, and better.

So for now - squeee! - put a jetpack on your SDLC with AI and go to town!!! Move fast and break things (like, for real).

  • jwpapi 1 day ago

    I hate code and I want as little of it as possible in my codebase.

    • _doctor_love 1 day ago

      The best code is no code. The second-best code is the code I delete.

      My favorite JIRAs are the ones I prevent from being worked on in the first place because they were unnecessary.

      The ideal prompt is the one I don't fire because it would be a waste.

      In an application with an LLM component, the ideal amount of inference is zero.

      Ultimately this seems to lead to "the ideal amount of computers in the world is none" but for the sake of my continued employment let's let that one go by. :)

  • jFriedensreich 1 day ago

    Most software has a few years lifetime and nearly no users. What you say is only true after reaching a certain milestone like product market fit. I think the idea is to reach that turning point as fast as possible and then rebuild the system from ground up with maintainability and quality focus.

ttariq 19 hours ago

I am not sure about agentic engineering getting close to vibe coding, but I certainly buy into building trust in your agents, similar to how you would trust another team or colleague within your organization (the image resizing example). The best way to make sure a team is working well is to make sure the right context is available to them at the right time, and that whenever they change the code base, they update that "context." In the case of human programming, this context takes the form of architecture docs, tickets, product specs, ADRs, messages, code review comments, etc., and lives in a host of different places. It is also difficult to get humans to fetch and update the context with discipline. With agents, however, it is much easier to get them to consume the right context and keep it updated as they make changes to the code base. I think that is the key to making agents more reliable and to being able to trust their decision making and output. All of this is, of course, on top of standard unit testing etc.

inventor7777 1 day ago

I agree somewhat, but I do still think there is a decently sized separation between true vibe coding (the typical "make me an app...fix this bug") and actual AI assisted development. I personally think that if you are a dev and you simply trust the AI's output, that is still vibe coding.

I am not a developer and have very basic code knowledge. I recently built a small and lightweight Docker container using Codex 5.5/5.4 that ingests logs with rsyslog and has a nice web UI and an organized log storage structure. I did not write any code manually.

Even without writing code, I still had to use common sense to get it to a place I was happy with. If I truly knew nothing, the AI would have made some very poor decisions. Examples: it would have kept everything in main.go, it would have hardcoded the timezone, the settings were all hardcoded in the Go code, the crash handling was nonexistent, and a missing config would have prevented startup. And that is on a ~3000 line app. I cannot imagine unleashing an AI on a large, complex codebase without some decent knowledge and reviewing.

_jss 1 day ago

This is a timely observation and feels right to me. I needed to get a relatively simple batch download -> transform -> api endpoint stood up. I wrote a fairly detailed prompt but left a lot of implementation details out, including data sources.

Opus 4.7 built it about 90% the same way I would, but had way more convenience methods and step-validations included.

It's great, and really frees me up to think about harder problems.

  • exographicskip 1 day ago

    This is my experience too. I'm primarily a python dev, but have been routinely using other backend languages (rust, go, etc) that I'm familiar with but not at the same level.

    Just having ~13yrs experience heavily weighted in one language with some formal studying of others makes directing llms a lot simpler.

    Learning syntax, primitives, package managers, testing, etc isn't that much of a lift compared to how I used to program.

    Was helping a non-dev colleague who's using claude cowork/code to automate reporting the other day. They understand the business intelligence side well, but were struggling with basic diction to vibe code a pyautogui wrapper to pull up RDP and fill out a MS Access abstraction on a vendor DB.

    Think we'll be fine for another 5-10 years as a profession

linuxhansl 1 day ago

I guess it all depends on what you use it for.

I work on database optimizers and other database-related stuff, and I can assure you Claude Code - with all the highest settings - does make mistakes. It will generate a test that does not actually test what it "thinks" it tests. It will confidently break stuff.

Do not get me wrong. It is still awesome! It takes much of the grunt work off me. It can game out design decisions even when that requires refactoring a lot of code. If you point out a mistake, more often than not it can fix it itself.

It's just that, for a critical project, I would never ship it without understanding every line of code - with the exception perhaps of some of the test code. Maybe in a year or two that will be different.

aenis 1 day ago

It's just Economics 101.

People have been running crappy code commercially for over half a century now. Not many companies successfully differentiate by running good code - it usually does not matter to the end consumer, other things are much more important. So now companies will pay less for code, and maybe it is a bit worse (though I personally can't believe AI can do worse than corporate software developers on average). Hobbyists will remain hobbyists, and precious few will be lucky enough to have someone pay them to handcraft stuff. Exactly what happened to woodworkers and other craftsmen.

ok123456 1 day ago

One-shot "vibe coding" is generally a mistake.

But using an agentic LLM to complete boilerplate is attractive simply because we've created a mountain of accidental and intentional complexity in building software. It's more of a regression to the mean of going back to the cognitive load we had when we simply built desktop applications.

  • dyauspitr 1 day ago

    Tell it to make a plan. Ask it to do 3-5 steps at a time. “One shotting” works very well.

    • addedGone 1 day ago

      Why, in May 2026, does it seem that people still haven't discovered loops? People are ignorant: run the same task 20 times in a loop to verify and it's pristine.

redhale 1 day ago

I want to agree, I do. But this point is plainly wrong in my observations:

> The enterprise version of that is I don’t want a CRM unless at least two other giant enterprises have successfully used that CRM for six months. [...] You want solutions that are proven to work before you take a risk on them.

Perhaps not for every category of software and every company. But in practice, any SaaS app that is just CRUD with some business logic + workflows is, imo, absolutely vulnerable to losing customers because people within their customers' orgs vibe coded a replacement.

They are perhaps even more at risk because would-be new customers don't ever even bother searching to find them as an option because they just vibe code a competitor in-house.

The vulnerability lies primarily in the fact that most of the SaaS apps we're talking about are _wrong_ to some meaningful degree. They don't fully fit how your company works, and they never did. There is something about them that you are forced to work around in some way. This is true because it is impossible to build a universally perfect product, to perfectly fit it to every business requirement of every user in every company.

But now it is relatively cheap to build the perfect version for your company in-house. Or maybe even just for YOU.

I think medium/long-term this will mean a redistribution of technical talent from SaaS companies to industry companies. Instead of paying millions for SaaS subscriptions, industry companies will spend fewer millions building precisely what they need in-house with the help of AI. Not every SaaS and not every company, but I already see this happening at my company right now.

nsoonhui 1 day ago

This is my workflow which I find very productive with Agentic AI.

Disclaimer: I'm doing a CAD-like engineering desktop app, and I'm using VS 2026 Copilot, so YMMV.

When I get a Jira ticket, I will first diagnose the problem, and then ask AI to write a test case for it that will reproduce the problem, with guidance on what/how to do the test case (you will be surprised to know how many geometry, seemingly visual problems can be unit tested), and if necessary I provide clues (like which files to read, etc.) for AI to look at, and ask AI to just go and fix the test.
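
The "seemingly visual problems can be unit tested" point can be sketched like this; the `polygon_area` routine and its winding-order bug are hypothetical stand-ins, not from the commenter's actual CAD app:

```python
import math

def polygon_area(points):
    """Hypothetical geometry routine using the shoelace formula. The (now fixed)
    bug: without abs(), clockwise-wound polygons came back with negative area,
    which showed up visually as shapes being culled from the render."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def test_area_ignores_winding_order():
    # Regression test written first to reproduce the visual bug:
    # the same unit square, wound counter-clockwise and clockwise.
    square_ccw = [(0, 0), (1, 0), (1, 1), (0, 1)]
    square_cw = list(reversed(square_ccw))
    assert math.isclose(polygon_area(square_ccw), 1.0)
    assert math.isclose(polygon_area(square_cw), 1.0)

test_area_ignores_winding_order()
```

Once a failing test like this exists, "just go and fix the test" is a well-scoped task to hand to the agent.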

Often AI can do that; AI can make the test pass and make sure that adjacent tests also pass. If in doubt, I will check the output reasoning. I then verify that the fix is done properly via visual inspection (remember, this is a desktop app), and I ask for clarification if needed.

Then at night I'll let my automated test suites run... and oops! Regression found! Who broke it? AI or human? Who cares. I just tell AI that between these times one of the commits must have broken the code — can you please fix it for me? And AI can do that.

This works for small or medium feature implementations, trivial bugfixes, or even annoying geometrical problems that require me to dig the needle out of the haystack. So the productivity gain is very real. But I haven't tried it on a feature that requires weeks or months of implementation; maybe I should try that next time.

It's hard to describe the feeling. It's just that the AI is working like a very capable (junior?) programmer; both might not have full domain knowledge, but with strong test suites and senior guidance, both can go very far. And of course AI is cheaper and a lot more effective.

kommunicate 1 day ago

It's already the case that you get much better results out of LLMs by forcing agents using them to go through additional layers of planning, design & review.

The future is going to dynamically budget and route different parts of the SDLC through different models and subagents running on the cloud. Over time, more and more of that process will be owned by robots and a level of economic thinking will be incorporated into what is thought of today as "software engineering." At some point vibe coding _is_ coding and we're maybe closer to that point than popularly believed.

solomonb 1 day ago

From the podcast episode they talk about the idea of using an LLM for training by disallowing the model to write code. I've been experimenting with exactly that in conjunction with a proof checker (Agda) to help me learn some cubical type theory and category theory.

I find the LLM as interactive tutor reviewing my work in a proof checker to be a really killer combo.

smallnix 18 hours ago

It doesn't matter if you specify system behavior in code, as an LLM conversation, agent instructions, or UML. In all cases you need to be able to translate business needs into very specific computer behavior. This isn't something a layperson can do. But it democratizes software development for all who can make that translation but can't write code.

sevenzero 1 day ago

>If you can go from producing 200 lines of code a day to 2,000 lines of code a day, what else breaks? The entire software development lifecycle was, it turns out, designed around the idea that it takes a day to produce a few hundred lines of code. And now it doesn’t.

How is producing more lines of code any good? How does quality assurance work with immeasurable code bloat? I want good software not slopware with 2000 different features. A good product does few things, but does these really well. There is no need to constantly add lines of code to a working product.

parasti 1 day ago

As a web developer, I feel like this take is wildly optimistic. My remaining qualifications that still provide some sort of value are providing historical/business/architectural context to the agent and testing the agent's output. And that's only because 1) it's not all written down in Markdown and 2) the agent is massively nerfed by costs and Anthropic. The part in the middle, where I get a coffee and write code in a variety of languages, then pop open a debugger, has been fully obsoleted.

cultofmetatron 1 day ago

Two days ago, we updated a Stripe library, which broke everything. With AI, I was able to one-shot wrapping all of the calls into a shared service, patch the broken API contract across the entire app, and get our signup and payment flows working again: a solid day and a half of work. This would have taken days of back-and-forth debugging previously. AI is not a panacea for everything, but it's doing valuable work right now.

  • bamboozled 1 day ago

    What does this have to do with the article?

    I'd say if you're a semi-competent developer, as probably many people reading the article and commenting already are, this comment adds nothing new to the discussion and would already be a very vanilla usage example of "AI".

    I think the point is that while you can "do things" like extracting the stripe integrations out into their own service in ten minutes, you're not stepping into other problems, such as how do you handle failures, how do you scale the stripe service, how do you structure all your other micro services so they can communicate in a coherent way, basically you're speed running yourself into harder decisions when using AI.

    • cultofmetatron 1 day ago

      > basically you're speed running yourself into harder decisions when using AI.

      on the contrary, I freed myself from the burden of having to find all the places in the code base where we used stripe and patched them in one go along with the tests to prevent regressions. That represents DAYS of work that I condensed into a few hours.

      Who cares if it can't know good structure and how to handle failures? I know how to do that. I have a skills file I created that spells out our policy for handling Stripe errors, defaults for structures, as well as guidelines for how we should deal with communications between different systems. Before, I spent hours building this stuff out. Now I just spend 20-30 min reviewing a PR to make sure it follows my directives and move on to other problems.

      That said, I agree with you in principle. I hand-coded an app from solo dev to now managing a team and getting ready for an imminent Series A. AI doesn't save you from scaling issues; you still need to have a clear idea of what you want from the AI and build processes that give it the context to do its job.

      I call that job security :)

bobkb 1 day ago

While those who are hands-on are realising the limits and issues with vibe coding / context engineering / agentic engineering / buzzword-of-the-week, the businesses are pushing hard on the buzzwords. It's high time we start looking at ways to live with the new reality and figure out ways to ensure software reliability.

jFriedensreich 1 day ago

We still don't have the right sandbox and PR abstractions to make the merge of the two complete. Imagine merging a PR and knowing that this code cannot ever possibly reach the internet, that it can only receive and send specific shapes of API requests from these specific services, that it has well-defined resource limits, and that you have a purpose-built UI to review these constraints. In that reality, I can imagine not reviewing a bigger number of PRs.

MikeNotThePope 1 day ago

The more I use AI, the more I find it’s great for anything trivial and uninspired. Need help with some predictable glue code? AI. Need help with something insightful and new to the world? Not AI. Need help with an important task that’s been done a thousand times? AI with scrutiny. Need to invent something new to the world and core to your business? Probably not AI.

  • bluefirebrand 1 day ago

    I'm struggling to imagine the sort of person who struggles with predictable glue code that I would trust with anything more important than that, with or without AI...

    • tempaccount5050 1 day ago

      It's not a struggle for me to walk 15 miles to work every day; I could easily do it. It just makes no sense when I have a car.

      • bluefirebrand 19 hours ago

        Okay. I'm saying if it were a struggle for you to walk 15 miles to work, I wouldn't hire you to do a job that requires a lot of walking just because cars exist.

singpolyma3 1 day ago

I think I'm just too opinionated to go there. If I see something that works fine, but isn't the way I'd do it, it doesn't matter if a human or an LLM wrote it I'm still in there making it match my vision.

  • suzzer99 1 day ago

    100%. I don't think any senior programmer ever looks at another developer's code and says, "Oh yeah, that's just the way I'd do it."

    • ai_slop_hater 1 day ago

      So you are going to waste everyone's time getting another developer to write code the way you want? This resonates with me because at my company I get this all the time. At that point, you might as well close my PR and do it yourself, whatever way you want. I really like the advice from the book Zero to One: assign different areas of responsibility to different people, so that there is no conflict.

      • suzzer99 1 day ago

        > So you are going to waste everyone's time getting another developer to write code the way you want?

        No one is suggesting that.

    • cortesoft 1 day ago

      But I assume you don't go and change all your co-workers code just because they didn't do it how you would have done it?

      • jcgrillo 1 day ago

        Even at the most toxic places I've worked, that kind of behavior would totally get you canned.

    • hirvi74 1 day ago

      I concur, and I think that is one of the most difficult aspects of reviewing another's code. It's difficult for me to sometimes differentiate between what is acceptable vs. what I would have done. I have to be very conscious to not impose my ideals.

  • jstummbillig 1 day ago

    That's not how most organizations work, AI or not.

    • jf22 1 day ago

      What do you mean?

      • jstummbillig 1 day ago

        Organizations usually are not looking for employees who change things that work fine, just because it disagrees with the "vision" of one employee.

  • rglover 1 day ago

    This is the way. If you're a prick about quality and outcomes, whether you typed it with your digits or the robot spit it out is irrelevant.

    What standard of result are you pursuing and are you willing to discipline yourself enough to achieve it?

    AI can't make you un-lazy, no matter how many tokens you pay for.

galkk 1 day ago

Given the rapidly declining quality of, at least, Claude Code output, agentic coding use may decrease. It is insane how bad the results of background agents are now: constant hallucinations, nonsensical outputs.

  • BowBun 1 day ago

    The heavy users of Claude at my job disagree (me included); our work gets shipped and the quality has increased by all metrics. Are you talking about enterprise or consumer Claude subscriptions? I think they're serving drastically different quality depending on how much $ you fork over.

    • galkk 1 day ago

      I don't see much sense in using HN as a support thread, but here are quotes from a single Claude investigation session of mine, and this happens in every Claude Code session I have, especially with 4.7:

      * The first agent's claim that was 3.x-only was wrong
      * is nice-to-have but doesn't target our exact case as cleanly as the agent claimed.
      * The agent's "direct fix for yyy" is overstated.
      * not 57% as the earlier agent claimed

      etc etc etc

      And I forgot how many times my session with claude starts: did you read my personal CLAUDE.md and use background agents for long running operations?

      I use enterprise subscription, max effort, was with both 4.6 and 4.7.

      And please refrain from comments like "you're using it wrong", as the drop in output quality is very clear and noticeable.

arian_ 1 day ago

The gap between "vibe coding" and "agentic engineering" is the same gap between asking someone to do a task and being able to prove they did it correctly. One is vibes. The other is accountability. We keep building more powerful agents without building the audit infrastructure to verify what they actually did.

  • jatora 1 day ago

    I think this sounds much more poignant than it is. It's actually pretty shallow. The same agents can audit the infrastructure lol

kw3b 1 day ago

Strong agree. Most orgs will stay tangled in the mess they hand-coded over the years, a few greenfield teams will pull ahead, but until some LLM-fuelled startup displaces a strong incumbent I'm skeptical that we're on the cusp of anything other than a K-shaped transition. I already see low-quality software and orgs getting flushed to make room for new ideas now that the barrier to entry is slightly lower (but far from free). I just wish the transition were done with more humanity.

imrozim 1 day ago

Used to check every line for my project. Now I just check the tricky parts. Still don't know if that's OK or just lazy?

ianhxu 1 day ago

In my own experience, good engineering practices are still not easy to achieve. As a software engineer with three years of experience, I've been doing solo dev for the past few months. Currently, there is still a lot of the harness to set up manually.

Amber-chen 1 day ago

The distinction between 'vibe coding' and 'agentic engineering' is important. In my experience, the key difference is whether you're reviewing and understanding the code the agent produces. When I use coding agents for non-trivial tasks, I always review the diff before committing — that's the engineering part. The danger is when people skip that step and just trust the output.

  • sodapopcan 1 day ago

    That's exactly what TFA is about.

_pdp_ 1 day ago

About two years ago I was using the term "agentic engineer" to describe someone who builds AI agents - not a vibe coder.

"Agentic engineer" does not make much sense applied to a developer.

It is weird and confusing to call a web designer that uses AI assisted coding tools "agentic engineer".

  • aryehof 1 day ago

    Vanity titles never make much sense, and now even more people can call themselves “engineers”. I was always at a loss why many weren’t calling themselves “web engineers”. Hey Mom, I used Claude Code today at work so I’m an Agentic Engineer!

NikolaosC 1 day ago

The "has someone actually used it" signal is the new code review. Tests, docs, commit count all reproducibl in 30 minutes. Daily usage for 2 weeks isn't. That's the only proof of work that survived the agent era.

skeledrew 1 day ago

It makes sense that they merge over time; it's a mark of the progress being made. The ultimate end is to make them indistinguishable, where the purely vibe coded app will have the quality of the app that has been well engineered over significant time thanks to good user feedback.

kdnxownxkwkd 1 day ago

An AI cannot be held accountable to mistakes, so an AI should not be doing your job for you. End of discussion.

__alexs 1 day ago

The current state of the technology is that you must read at least some of the code, but everyone keeps shipping tools that are focussed on churning out more and more stuff without giving you any affordances to really understand the output.

Claude Code in particular seems really uninterested in this aspect of the problem, and I've stopped using it entirely because of this.

tim-projects 23 hours ago

> Here are some of my highlights, including my disturbing realization that vibe coding and agentic engineering have started to converge in my own work.

Nothing about this should be disturbing unless you want to dig your heels in, cross your arms and refuse to adapt.

AI is a massive opportunity. But if people focus on the issue of the 'change' they simply waste time they could (and should) be spending on integrating it correctly.

I believe that this form of resistance is far more stagnating and dangerous than any of the issues that come with the general onslaught of AI integration.

readgrounded 1 day ago

I agree to some extent. I think that small apps, dashboards, service wrappers, etc. you can vibe code.

But building software still requires domain knowledge: understanding data structures, architecture, which services to use. We probably have 2-5 years before that's fully automated.

tyyyy3 1 day ago

Correct me if I’m wrong Simon, but weren’t you highly optimistic about llm’s and agentic-use of them?

I believe this is a common fault of not being able to zoom out and look at what trade offs are being made. There’s always trade-offs, the question is whether you can define them and then do the analysis to determine whether the result leaves you in a net benefit state.

  • simonw 1 day ago

    I still am. I think setting up LLMs to call tools in a loop is a fascinating way to build interesting software that could not have existed before.

    Coding agents are also upending how software development works, in a way that we are still very much figuring out.

    I don't think anyone has a confident answer for how best to apply them yet, especially on larger production-ready projects.

    • p_stuart82 1 day ago

      I think you kind of answered this in the post, though. "I want somebody to have used the thing" is dogfooding, and it's probably the only quality signal left that can't be generated in 30 minutes.

hiroakiaizawa 1 day ago

One thing I've started appreciating with LLM-assisted workflows is how important fixed evaluation protocols are.

Without pre-defined definitions and locked procedures, it's extremely easy to mistake iterative adaptation for genuine signal.

devoria 20 hours ago

The thing I've been thinking about: agentic engineering still gives you per-step verification.

ppqqrr 1 day ago

the discourse around "code quality" has always attracted the least nuanced minds, ones who see the world and the phenomenon of life as nothing but territory to be divided up by the latest buzzwords. the worst ones insist that we narrow the discussion even further, to focus on the conflicts between these buzzwords. whenever i have to sit through such discussions, i try to meditate on the irony of mother nature weaving the most functionally brutal, ruthlessly redundant poetry that is the genetic code, only for the resulting creatures to deny themselves the power of the principles inherent in their own construction.

tannerr_dev 17 hours ago

Was unaware they were separate or different in the first place

causal 1 day ago

As agents get better at code we trust them to produce more of it. There are still bugs to find, but the haystack gets bigger.

So the number of bugs to find remains constant but the amount of code to review scales with the capability of the agent.

Havoc 1 day ago

Never really bought that there was a clean distinction.

To me it’s a spectrum with varying levels of structure provided, review etc.

Basically oneshot vibes on one side, fully hand coded on other.

lubujackson 1 day ago

I think this is what people mean when they say LLMs are a higher level abstraction. We still need to consider edge cases and have tests. We still need to sweat the architecture, understand how the pieces fit together, and have a mental map of the codebase. But within each bottom node of that architecture we don't sweat the details. Anything obvious gets caught right away. Most subtle/interaction-based issues occur at the architecture level. Anything that bypasses those filters is a weird bug that is no worse or different from a normal bug: an edge case that was hit in a real-world scenario and gets flagged by a user or logged as an error.

There are certain codebases and pieces of code we definitely want every line to be reasoned and understood. But like his API endpoint example, no reason to fuss with the boilerplate.

This has definitely been my shift over the past few months, and the advantage is I can spend much more time and energy on getting the code architecture just right, which automatically prevents most of the subtle bugs that has people wringing their hands. The new bar is architecting code to be defined as well as an API endpoint->service structure so you can rely on LLMs to paint by numbers for new features/logic.

  • exographicskip 1 day ago

    Good description of my thoughts on vibe coding / agentic engineering.

    Spend a lot more time on architecting and testing than hand rolling most repos now.

    Hats off to people who enjoy the minutia of programming everything by hand, but turns out I enjoy the other aspects of software development more.

mattlangston 1 day ago

Software engineering is software engineering.

An ace software engineer is not an ace because of tooling.

It's not the plane, it's the pilot, or something like that.

mohsen1 1 day ago

I am experimenting with writing an entire TypeScript compiler[1] with AI assistance. I've spent 4 months on it already. It might not be successful at the end of the day, but my thinking is that if LLMs are going to write a lot of the code, I had better learn how this can and cannot work. I've learned a lot from this project already. I think we're still in charge of design and big ideas even if all of the code is written by AI.

[1] https://github.com/mohsen1/tsz

  • Insanity 1 day ago

    I'm also experimenting with it more and more. Now I'm trying to create a 2D side-scrolling shooter with it, running in the browser. When it was relatively small, it did a good job. As the codebase and docs/ files that I'm using get larger it starts hallucinating, especially when the context gets at about 50% usage (Codex w/ gpt5.5). As in, it'll literally forget to update parts of the code.

    e.g, I change velocity of player to '200' and of bullets to '300', and it only updated the bullet velocity. Then told me the player was already 'at the correct value' even though it was set to 150. Things like that.. :)

    • mohsen1 1 day ago

      For me, unless there is a concrete way of proving the work is correct, you can't rely on AI coding. tsz has super strict tests around correctness, performance, and architectural boundaries.

      • Insanity 1 day ago

        If I understood you correctly, I think I'm less extreme than that. Most code written by humans is also not provably correct. But I'm assuming you mean provably correct like Lean: https://lean-lang.org/, and not just "passes tests".

        If you mean 'passes tests', that can be tackled by AI. Although AI writing its own tests and then implementing its own code is definitely not a foolproof strategy.

        • mohsen1 1 day ago

          More or less. The tsz solver is pure enough (it doesn't know about the AST) that it might be possible to formally validate it. But in my case I am lucky to have the tsc baseline: anything that produces different output than tsc is a bug.
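The baseline idea here is differential testing: when a trusted reference implementation exists, any divergence from its output is by definition a bug in the candidate. A hedged sketch with toy types (not tsz's actual harness; `Compiler` and the names below are illustrative):

```typescript
// Differential testing against a reference implementation: run both
// compilers over the same inputs and collect every divergence as a bug.

type Compiler = (source: string) => string;

interface Divergence {
  source: string;
  expected: string; // what the trusted baseline produced
  actual: string;   // what the candidate produced
}

function diffAgainstBaseline(
  baseline: Compiler,
  candidate: Compiler,
  cases: string[]
): Divergence[] {
  const bugs: Divergence[] = [];
  for (const source of cases) {
    const expected = baseline(source);
    const actual = candidate(source);
    if (expected !== actual) bugs.push({ source, expected, actual });
  }
  return bugs;
}
```

The appeal for AI-written code is that the oracle is mechanical: no one has to trust the model's own tests, only the baseline.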

  • copypaper 1 day ago

    >25k commits in 4 months or about 1 commit every 7 minutes

    How do you manage/orchestrate this? I'm genuinely curious.

    • mohsen1 1 day ago

      Multiple computers and each multiple Claude Code or Codex sessions. It had lots of ups and downs. Now I have a good enough test harness that makes it easier to iterate faster

      • ai_slop_hater 1 day ago

        Do you not run out of things to code?

        • mohsen1 1 day ago

          Code is not the goal. What code does is.

resters 22 hours ago

agentic engineering is when you go from vibes to trust. It's much like how one feels about a brand-new, unproven, newly hired human team member vs a trusted team member one has worked with for years.

a456463 15 hours ago

Hot take: most people are shit at writing code or logic. We are just going to see more of this vibe coding, and it's exposing the bad coders more than anything else. Everything we now do to prevent and stabilize vibe code is what we already had to do, on a longer timescale; now we have to do a lot more of it, faster.

bigger_fish 1 day ago

Totally agree. The sales pitch is that anyone can use this stuff, but good output is only obtained via thorough understanding.

Sparkyte 1 day ago

The problem with vibe coding is that the agentic output has a very plasticky, samey feel unless you work with something that makes it unique or can pass a template through it.

overgard 1 day ago

I can't really say I agree with this, although I also hate the phrase "agentic engineering".

I'm working on a licensing system for a product I'm building. I've used Claude a little bit to help out with it, but it's also made a lot of very dumb decisions that would have large (security!) consequences if I didn't catch them. And a lot of them are braindead things, like I asked it to create a configurable limit on a certain resource for the trial version of the application. When I said configurable, I mostly meant: put the number in a constant so I can update it later. What Claude thought I asked was "make it so the user can modify the limits of the trial version in the settings panel" (which defeats the entire purpose of a free trial!). Another thing it messed up recently is I was setting up email-magic-link authentication. It defaulted to creating an account for anyone that typed in an email, which could allow a bad actor to both spam people with login requests (probably getting me kicked off Resend) or creating a lot of bogus accounts.

These things do not think. You cannot outsource your thinking to them.

criddell 1 day ago

Agentic engineering? That reads to me a little like amateur oncologist. How are you defining engineering?

Can agentic engineers adhere to a similar code of ethics that a professional engineer is sworn to uphold?

https://www.nspe.org/career-growth/nspe-code-ethics-engineer...

  • vehemenz 1 day ago

    The problem of calling what most of us do "engineering" predates LLMs by a good 15-20 years.

  • senko 1 day ago

    > Can agentic engineers adhere to a similar code of ethics that a professional engineer is sworn to uphold?

    Can software engineers?

  • rglover 1 day ago

    Yes. I do "agentic engineering," primarily using Cline as it allows me to gas-and-brake the AI and review what it's doing on a granular level. So, think pair programming but my #2 is an LLM. I routinely reject turns when a given model goes off into space. I also routinely make hot edits to its changes before advancing, several times per day.

    You can use these tools wisely without letting it run unverified carelessly.

rotis 1 day ago

> my disturbing realization that vibe coding and agentic engineering have started to converge in my own work.

>I firmly staked out my belief that “vibe coding” is a very different beast from responsible use of AI to write code, which I’ve since started to call agentic engineering

Disturbing? Really? I admit I don't do agentic coding and am going only by vibes, but to me agentic engineering is basically vibe coding in an automated loop with some ornamentation. They both stem from the same LLM root, and positioning them as significantly different is weird and unconvincing to me. There may be merit to this article (I gave up after a few sentences), but I reject this specific premise.

  • coldtea 1 day ago

    >They both stem from the same LLM root and positioning them as significantly different is weird and unconvincing to me.

    It's the difference between caring and not caring.

    • rotis 1 day ago

      Caring about what? I could slap together an application and say I vibe coded it, or I could equally claim I agentically engineered it. No one could tell the difference (if there is any) without seeing the code. The only thing you could say is that I used an LLM. And that is what is happening. Most of the code that is "engineered" we don't get to see. So who knows what is really going on there and what the actual result is?

      • coldtea 1 day ago

        >Caring about what? I could slap an application and say I vibe coded it or I could equally claim I agentically engineered it. No one could tell the difference(if there is any) without seeing the code.

        Caring about the result, whether one "can tell the difference" or not.

        • rotis 23 hours ago

          Got it. Vibe coding is all about the end result, damned be the way we got there. So I assume agentic engineering must be the opposite here? Don't care what we will cook. If I get a calculator while asking for an integrator, that is true agentic engineering.

mentos 1 day ago

Why is it one or the other and not one THEN the other?

jcgrillo 1 day ago

> It used to be if you found a GitHub repository with a hundred commits and a good readme and automated tests and stuff, you could be pretty sure that the person writing that had put a lot of care and attention into that project.

I think this highlights a problem that has always existed under the surface, but it's being brought into the light by proliferation of vibeslop and openclaw and their ilk. Even in the beforetimes you could craft a 100.0% pure, correct looking github repo that had never stood the test of production. Even if you had a test suite that covers every branch and every instruction, without putting the code in production you aren't going to uncover all the things your test suite didn't--performance issues, security issues, unexpected user behavior, etc.

As an observer looking at this repo, I have no way to tell. It's got hundreds of tests, hundreds of commits, dozens of stars... how am I to know nobody has ever actually used it for anything?

I don't know how to solve this problem, but it seems like there's a pretty obvious tooling gap here. A very similar problem is something like "contributor reputation", i.e. the plague of drive-by AI generated PRs from people (or openclaws) you've never seen before. Stars and number of commits aren't good enough, we need more.

kensai 1 day ago

No offense, but it feels to me the author writes this piece to convince himself. I am afraid he is right. But the bottom line is the same: vibe coding, agentic engineering, everything AI-related comes for our jobs.

Slash32 1 day ago

Still thinking about LLMs

wiseowise 1 day ago

> I’m starting to treat the agents in the same way. And it still feels uncomfortable, because human beings are accountable for what they do. A team can build a reputation. I can say “I trust that team over there. They built good software in the past. They’re not going to build something rubbish because that affects their professional reputations.”

The most important part, and why slop isn't the same as code written by someone else. The model doesn't care; it just produces whatever it is asked to produce. It doesn't have pride, it doesn't have ego, it doesn't have artisanal qualities, it doesn't have ownership.

Groxx 1 day ago

I mean... yeah? Isn't it obvious that they're essentially the same thing, but one thinks they're in a higher class than the other?

Fast feedback loops and delegating tasks to sub-agents have been pretty common for vibers since well before they were canonicalized by agenteers. Same thing, different day, hardly even any difference in quality: they evolve together, though vibe tends to lead and agents follow and refine... which vibers then use too.

If you think of vibe coders as agentic alpha testers it makes a lot more sense.

themafia 18 hours ago

> responsible use of AI to write code

You have no clue what went into the training data or how much of the output is covered by someone else's copyright. To pretend this is "responsible" is ridiculous.

Then you go on to use lines of code per day as a meaningful metric without any evidence that it has any consequence whatsoever.

Finally you don't mention profitability once.

What are we even doing here? Pretending? Why?

kushalpatil07 1 day ago

Every time I do deep work and think through solutions to a complex problem, I have the opportunity to ask Claude to implement a sub-par AI-slop solution instead.

Do this enough times, and I will have forgotten how to think.

  • dev360 1 day ago

    Or, you just explain the solution and save some typing and get the same thing. I find it refreshing to be able to just talk to Claude and have it generate the same thing I would have built.. It gives me more time to articulate and solve complex problems, and less time with the mundane writing, test loops etc.

  • eddiewithzato 1 day ago

    That’s why I like the term “mind virus” for AI. Humans always go for shortest path

QuantumNomad_ 1 day ago

People in the future are going to wonder what the hell we were thinking when, 30 years down the line, everything is a hot mess of billions of lines of LLM-generated code that almost no human has read and that no one can maintain, with or without LLMs. And the LLM-generated garbage will have drowned out all of the good quality code that ever existed, and no one will be able to find human-written code on the internet anymore.

Makes me want to just give up programming forever and never use a computer again.

  • ativzzz 1 day ago

    By then, the fix will be easy. Fire up the latest LLM, point it at your codebase and tell it "rewrite this from scratch. do it well. fix the architecture mistakes"

    • kurthr 1 day ago

      "Write me a really cool game, that will make me lots of money, fast!"

      • KumaBear 1 day ago

        Make me a 1hr episode of my favorite book. Make it as lore accurate as possible. Plot out the script for the next 100 episodes.

      • estimator7292 1 day ago

        I see your point, however: EA sports has been doing this for literally the entire lifetime of gaming as an industry

        • DonHopkins 1 day ago

          Electronic Sharts slogans and franchises:

          "Shit's in the Game!"

          "Chunder Everything"

          "Maddening NFL 26"

          "FIFiAsco 26"

          "UFC 26 (Un Finished Code)"

          "The Shits 4"

          "Battlefailed"

          "Need for Greed"

    • fnoef 1 day ago

      "Make sure to double check everything, and MAKE NO MISTAKES!!!"

      • unfunco 1 day ago

        Don't hallucinate!

    • bulbar 1 day ago

      Will work just as well as it does today, or did 20 years ago.

      • cortesoft 1 day ago

        Are you suggesting AI coding was as good 20 years ago as it is today?

        • hrldcpr 1 day ago

          I think they're being sarcastic, saying that rewrites from scratch have rarely worked well (whether done by AI or humans).

          • bulbar 1 day ago

            Exactly. Sorry for not being explicit about it. I thought it was clear enough, because 'this code is crap, let's just rewrite the whole thing, doesn't look too hard' has famously been a bad idea most of the time since forever.

        • vrganj 1 day ago

          It sure wrote less crappy code.

    • hasbot 1 day ago

      We can do this today too (though hopefully future LLMs will make better architectural decisions). With Claude, I've been working on an application for the last 2 months. I didn't have a great vision of what I wanted when I started, but I didn't want that to slow me down. The architecture is terrible: Claude separated some functionality into different classes but did a bad job of it and created a big ball of mud. Now that I finally have my vision locked down and implemented (albeit poorly), it'd be a great time to throw it away and start over. It'd be interesting to see the result and how long it takes.

      • ativzzz 1 day ago

        Just have claude (or gpt maybe) do an architecture review and request a multi-phase refactoring plan. This is probably better to do incrementally as you notice the balls of mud forming but it might not be too late. Either way, if it does something you don't like, `git checkout` and start over

    • faizshah 1 day ago

      It won't be an LLM that does it, the entire feature of an LLM is it produces generalizable reasonably "correct" text in response to a context.

      The system that makes it have an opinion about good vs bad architecture or engineering sensibilities will be something on top of the transformer and probably something more deterministic than a prompt.

    • orphea 1 day ago

      Do you think new LLMs are going to write better and better code? When all they are going to have is the slop generated by previous, worse models?

      • chickensong 1 day ago

        Yes. The models may have started from indiscriminate scraping, but people are undoubtedly working on refining the training data. Combined with the overall model capabilities, I suspect code quality will continue to go up.

        What you're suggesting is a negative flywheel where quality spirals down, but I'm hoping it becomes a positive loop and the quality floor goes up. We had plenty of slop before LLMs, and not all LLM output is slop. Time will tell, but I think LLMs will continue to improve their coding abilities and push overall quality higher.

        • orphea 23 hours ago

          Let's agree to disagree.

          What I see is that while LLMs can do real tasks, they often produce overengineered unmaintainable slop (plenty of examples where code can be reduced 10x to do the same). I hope this is not a base to continue training LLMs on.

    • jcalx 1 day ago

      There is definitely going to be some Wirth's law-like [0] effect about the asymmetry of software complexity outpacing LLMs' abilities to untangle said software. Claude 9.2 Optimus Prime might be able to wrangle 1M LoC, but somehow YC 2035 will have some Series A startup with 1B+ LoC in prod — we'll always have software companies teetering on the very edge of unmaintainability.

      [0] https://en.wikipedia.org/wiki/Wirth%27s_law

      • AlotOfReading 1 day ago

        It's the Peter principle for computers. Codebases expand to the limits of the organization's ability to manage them. If you make one person use ed to write code for a bare-metal environment, you'll get a comparatively small, laser-focused codebase. If you task a hundred modern developers with solving the same problem, you'll get a Linux box running a million lines of JavaScript.

        Same thing happens in other fields. A rich country and a poor country might build equivalent roads, but they won't pay the same price for them.

  • jimmyjazz14 1 day ago

    If that is the case market forces would likely favor hand written code and all the slop will be forgotten (unless the slop works fine and is stable).

    • devin 1 day ago

      This is wishful thinking. The force of the market is "number go up". Quality increasingly has less and less of a role in the equation. You will eat your slop, and you will like it. It will be the only choice you have.

      • sesky 1 day ago

        But the quality of code was already very bad due to market forces. Most code at large companies is notoriously poor despite the talent density, because the incentives are not there to tackle tech debt or improve code quality.

        With such a low baseline, there is an optimistic perspective that LLMs could improve the situation. LLMs can produce excellent code when prompted or reviewed well. Unlike human employees, the model does not worry about getting a 'partially meets expectations' rating or avoid the drudgery of cleaning up other people's code.

        • devin 1 day ago

          The model is optimized in a different way to "partially meet expectations". Sycophancy coupled with only really "knowing" what it has been trained on assure a different kind of mediocrity.

        • switchbak 1 day ago

          The same incentives that discourage good code in pre-AI times are still dominating now. You will be pushed to ship sub-par products in the future, just like you were in the past.

          AI certainly has the potential to make the underlying code/design a lot cleaner. We will also be working with dramatically more code, at a much higher rate of change. That alone will be a big challenge to keep sustainable.

          The ones making the decision to under-invest in design are either unaware of the real costs, or aware and deliberately choosing that path. That's not new, and I don't expect it to change.

          • demorro 1 day ago

            The only thing that has changed is that there used to be a loose correlation between capability to effect change and inherent desire for quality. This correlation barely exists anymore, so the counter-cultural acts that happened to manifest quality inside our perverse systems will occur much more rarely now.

      • tyyyy3 1 day ago

        I agree generally but there are periods where creative people show up and a whole slew of existing firms go bust/shrink due to one’s ability to envision a path toward creative destruction.

    • xantronix 1 day ago

      The market is hardly as rational as people would like to hope it is, though it does at least have its own twisted sort of internal consistency.

    • lbrito 1 day ago

      I don't think that's how money works. Enough people have poured enough money into this thing that the actual, measurable results/efficacy/ROI are of secondary importance (to put it mildly). At this point AI adoption is (at least sold as) a fait accompli.

    • demorro 1 day ago

      Absurd. Market forces don't optimize for quality, reliability or human welfare. This is religious thinking.

  • jf22 1 day ago

    First, most software is already a hot mess.

    Second, LLM code can be less of a hot mess than human written code if you put in the time to train/prompt/verify/review.

    Generating perfect well patterned SOLID and unit tested code with no warnings or anti-patterns has never been easier.

    • glouwbug 1 day ago

      Right, but it takes one to know one. Many don’t have the ability to decipher what’s good stable output or not

    • switchbak 1 day ago

      Like with a lot of things in this space, it depends where you invest your effort. If you care about quality design and good code, you can definitely get there - but that doesn't happen by default.

      With the right investment, we could certainly have tooling that creates and maintains very good designs out of the box. My bet is that we'll continue chasing quick and hacky code, mostly because that's the majority of the code that it was trained on, and because the majority of people seem to be interested in a quick result vs a long-term maintainable one.

    • yakattak 1 day ago

      The only people who are going to put in the time, are people who care enough to. The problem is you have people who didn’t care before who were equipped with a garden hose. Now that they have a fully pressurized fire hose they can make more of a mess faster.

      • risyachka 1 day ago

        This is so on point that I want to cry.

      • senordevnyc 1 day ago

        Then they should be easy to defeat. Why are you complaining?

        • yakattak 1 day ago

          Defeat in what aspect?

          • senordevnyc 1 day ago

            Compete with, for jobs, customers, investment, etc.

            • yakattak 1 day ago

              Maybe. But it depends on the metric. It seems like orgs are focused on PR count and token usage. Issues caused by poor code are often lagging indicators so it’s asymmetrical in that aspect.

              Write lots of code now and statistically look great, while the impact won’t be felt for a much larger range of time.

              With the job search and whatnot then yeah, caring becomes a lot more important. That’s true.

        • themgt 1 day ago

          As an author of fine literature, these million monkeys on typewriters simply upset my sense of dignity. And to imagine the impoverished prose so many readers shalt forthwith be perusing!

      • Daishiman 1 day ago

        Hard disagree. LLMs are fantastic for fixing bad architecture that's been around for a decade because nobody was willing to touch it. I can have it write tons and tons of sanity checks and then have it rewrite functionality piece by piece with far more verification than what I'd get from most engineers.

        It's not immediate, it still takes weeks if you want to actually do QA and roll out to prod, but it's definitely better than the pre-LLM alternatives.

        • yakattak 1 day ago

          Yeah but you care which is my exact point.

          • Daishiman 1 day ago

            How is this different from every single technological iteration?

            • salawat 1 day ago

              Because there is a certain point where barriers to entry prevent meaningful competition once winner-take-all power laws start kicking in, and stability has hitherto been predicated on having a plurality of non-interrelated competitors, so that no one man's quirks drive too much of society's theoretical output.

              AI will make this dynamic worse, and it has the extra danger that the default, banal way of applying the technology in fact encourages its application to that end.

              • Daishiman 1 day ago

                I don't really see it that way because most software companies overestimate the importance of fantastic software vs merely adequate software, and most times good sales development, support, and negotiation skills are what helps actually sell.

                I also don't think that the commodification of programming is a substitute for things like understanding your customers, having good taste for design, and designing software in a way that is maximally iterable.

    • jplusequalt 1 day ago

      >First, most software is already a hot mess.

      That the industry was already routinely dealing with fires of its own creation is not a valid reason to start cooking with gasoline.

      • jf22 1 day ago

        But we aren't cooking with gas. We are cooking with a more controlled burner than ever that can download a clean code claude skill and be committing better code than you or I could write.

        What would normally be considered overengineered gold plating is "free" now.

  • cj 1 day ago

    > Makes me want to just give up programming forever and never use a computer again.

    LLMs aren’t the first thing to come along and change how people develop applications.

    You had the rise of frameworks like Django, Rails, etc. Also the rise of SPAs. And also the rise of JS as a frontend+backend language.

    In 3-5 years we'll have adapted to the new norm, like we have in the past.

    • toraway 1 day ago

      Or, it could be like asbestos and the immediate benefits are just too appealing to listen to arguments of skeptical naysayers about some vaguely defined problems that are decades away, if they even happen.

      I use AI tools daily (because they feel like they're helping me) but it's not exactly hard to imagine scenarios where an explosion of slop piling up plus harm to learning by outsourcing all thinking results in systemic damage that actually slows the pace of technological progress given enough time.

      History of new technologies tend to average into a positive trend over a long enough time scale but that doesn't mean there aren't individual ups and downs. Including WTF moments looking back at what now seems like baffling decision-making with benefit of hindsight.

      • cj 1 day ago

        > Or, it could be like asbestos

        If it is, the fallout will be way worse than if AI ends up living up to (reasonable) expectations.

        If it doesn’t, we are going to see over a trillion dollars of capital leave the tech sector, which I think will have worse impacts on the livelihood of tech workers than if AI ends up panning out.

        This is something the naysayers need to grapple with. We’ve crossed a line where this tech needs to work simply because of the amount of money depending on that fact.

        • toraway 1 day ago

          The asbestos hypothetical is a bit different than the "bubble popping" economic crisis scenario though. In this world, AI would just continue being adopted and shoved into every nook and cranny into which it can be made to fit, with valuations only getting bigger and bigger.

          The damage would come much later, well beyond the point where it could be simply pulled out and replaced without spending massive amounts of money and would also basically necessitate training an entire new generation of engineers.

          Then the AI giants would start appearing vulnerable like cigarette companies in the 90s while an AI Superfund and interstate class action are being planned but Sam Altman would already be a centitrillionaire at that point so it would be someone else's problem.

        • lelanthran 1 day ago

          > If it doesn’t, we are going to see over a trillion dollars of capital leave the tech sector, which I think will have worse impacts on the livelihood of tech workers than if AI ends up panning out.

          I don't think it will be worse; if AI pans out the world would be able to continue without a single programmer left. If a trillion dollars leave the tech sector, all those programmers employed outside of the tech sector will still have jobs.

      • Izkata 1 day ago

        Some of us are already experiencing that. For example I handed off an initial version of something some months ago, and the AI-generated stuff they came up with was a huge buggy mess of spaghetti code neither of us understood. Months later we've detangled it, cutting it down to a third the size, making it far simpler to understand, and fixing several bugs in the process (one was even by accident, we'd made note of it, then later when we went to fix it, it was already fixed).

    • lbrito 1 day ago

      The difference between writing assembly code and Ruby code is much smaller than the difference between programming and vibe coding.

      Also, companies are pressuring employees towards adoption in novel ways. There was no such industry-wide pressure by employers in the 90s, 2000s or 2010s for engineers to use a specific tech.

      • Daishiman 1 day ago

        > Also, companies are pressuring employees towards adoption in novel ways. There was no such industry-wide pressure by employers in the 90s, 2000s or 2010s for engineers to use a specific tech.

        Companies have been enforcing technology mandates since time immemorial. In the early 2000s there were definitely a lot of mandates to move away from commercial UNIX to Linux. Lots of companies began enforcing the switch to PHP, Ruby and Python for new projects.

        • lbrito 1 day ago

          Yes, but the entire industry was not pushing any one single tool at the same time. If you disliked Django, you could go to Rails. If you disliked Rails, you had Phoenix. Etc.

          Good luck disliking LLM babysitting these days

  • genghisjahn 1 day ago

    I'm generally pro "LLM-assisted coding" or whatever you want to call it. But I do sometimes think about the Butlerian Jihad from Dune.

    https://en.wikipedia.org/wiki/Dune:_The_Butlerian_Jihad

    • hermitShell 1 day ago

      If you like sci-fi takes on software systems, check out Vernor Vinge's A Fire Upon the Deep and its sequels. I recall ship systems software being something like all the code humanity has ever written, plus centuries of LLM churn. One of the protagonists is a spacefaring software developer particularly good with legacy code.

      We are used to thinking about software like in the article, a program that runs deterministically in an OS. Where we are headed might be more like where the LLM or AI system is the OS, and accomplishes things we want through a combination of pre-written legacy software, and perhaps able to accomplish new things on the fly.

      • genghisjahn 1 day ago

        Interesting, I kinda do this. Sometimes when an LLM solves a problem for me, I have it write code so that I can reuse that exact same approach deterministically (and I check it line by line). Now I have about a dozen CLI commands that the LLM can use, and I'm reasonably (although not 100%) sure I'll get an expected outcome. Really helpful with debugging via stream pipes and connecting to read replicas.

      • Izkata 1 day ago

        Sounds like a recipe for Star Trek holodeck malfunctions.

      • DonHopkins 1 day ago

        Pham Nuwen is a master of vibe patching legacy sedimentary software.

      • genghisjahn 1 day ago

        Ordered Fire Upon the Deep. Looks interesting.

  • johnbarron 1 day ago

    There is nothing in the post to support the statement. An interesting personal confession, but it does not establish that vibe coding and agentic engineering are converging as a general phenomenon.

    As a piece of meat, I look forward to charging $10,000 an hour to fix code coming out of vibe code generation.

  • empath75 1 day ago

    > People in the future are going to wonder what the hell we were thinking, when 30 years down the line everything is a hot mess of billions of lines of code generated by LLMs that no human has read

    --

    It's just as likely that people will be surprised that we used to have billions of lines of human generated code, that no LLM ever approved.

  • throw_this_one 1 day ago

    Why does it matter, as long as it accomplishes the task?

  • zuzululu 1 day ago

    By then AI would be good enough to clean them all up... I don't get these doom scenarios; they always assume that we are going to be stuck with LLMs and that there won't be anything new coming.

    • orphea 1 day ago
        By then AI would be good enough to clean them all up...
      

      [citation needed]

      To make my comment more on-topic: why do you think this is going to be the case? What newer LLMs will be trained on?

      • zuzululu 1 day ago

        Well, you are assuming that there's not going to be any new progress and that we are going to be stuck with whatever LLM version we have currently.

        • orphea 23 hours ago

          No. I assume that

          * we're already close to the ceiling of LLM capabilities: LLM providers have probably already consumed everything they could. Plus there have been no large improvements for a while, only small and incremental ones. I believe "Mythos-too-dangerous-to-release" is marketing bullshit, until proved otherwise.

          * people generate overengineered slop at light speed. If that is the training base for future models, I doubt they're going to improve significantly; rather, quality is going to stagnate at best.

  • murukesh_s 1 day ago

    Hello from assembly programmers to present-day JavaScript folks. Jokes aside, I sometimes think about how VS Code is written in layers and layers of code - ~200MB of minified code - and Java-based IDEs were worse, with almost 1GB of code (libs/dependencies). And VS Code did beat the native editors (Sublime) of its time to dominate now - maybe because of the business model (open & free vs freemium). But it does the job quite well IMO. And it enabled swarms of startups to go to market, including billion-$ wrappers - Cursor, Antigravity, and almost all UI coding agents. I remember backend developers (Java/C++ types) looking down on JavaScript developers as if we were from an inferior planet or something.

    How many of us remember that VSCode is actually a browser wrapped inside a native frame?

    • skydhash 1 day ago

      VS Code has two things that worked well for it. Web Tech and Money. Web tech makes it easy to write plugins (you already know the stack vs learning python for sublime). And I wonder how much traction it would get if not Microsoft paying devs to wrangle Electron in a usable shape.

    • k__ 1 day ago

      To be fair, MS sent a world-class engineer to make JavaScript usable for codebases at that scale.

    • 000000000001 1 day ago

      >How many of us remember that VSCode is actually a browser wrapped inside a native frame?

      The new standard: Web Apps. Why update 3 separate binaries for Win/Lin/Mac when you can build 1 on a web framework and call it a day?

  • michelb 1 day ago

    If 30 years down the line I still have to look at code, maintain code, or even worry in the slightest about code, something went deeply wrong.

    • skydhash 1 day ago

      Code will never go away. Code was there before computer hardware and it will always be there. Code is (almost?) all of computation theory so unless we throw computers away, we shall always use code.

      • phainopepla2 1 day ago

        They're not suggesting that code will go away, but rather that it will be abstracted beneath an LLM interface, so that writing code in the future will be like writing assembly today: some people do it for fun or niche reasons, but otherwise it's not necessary, and most developers can't do it.

        Whether that happens or not is a different question, but I believe that's what they're suggesting.

        • skydhash 1 day ago

          Code is formal and there are basic axioms that ground its semantics. You can build great constructs on top of those semantics, but you can’t strip away their formality without the whole thing becoming meaningless. And if you can formalize a statement well enough to remove all ambiguity, then it will turn into code.

          Programming is taking ambiguous specs and turning them into formal programs. It’s clerical work: taking each term of the specs and each statement, ensuring that they have a single definition, and then writing that definition in a programming language. The hard work here is finding that definition and ensuring that it’s singular across the specs.

          Software Engineering is ensuring that programming is sustainable. Specs rarely stay static and are often full of unknowns. So you research those unknowns and try to keep the cost of changing the code (to match the new version of the specs) low. The former is where I spend the majority of my time. The latter is why I write code that is not necessary right now, or in a way that doesn’t matter to the computer, so that I can be flexible in the future.

          While both activities are closely related, they’re not the same. Using an LLM to formalize statements is gambling. And if your statement is already formal, what you want is a DSL or a library. Using an LLM for research can help, but mostly as a stepping stone for the real research (to eliminate hallucinations).

  • pllbnk 1 day ago

    I think it’s a mistake to think that we will be blindly going in this direction for many years and then suddenly collectively wake up and realize what have we done. It’s a great filter and a great opportunity.

    If LLMs stop improving at the pace of the last few years (I believe they already are slowing down), they will still manage to crank out billions of lines of code which they themselves won’t be able to grep and reason through, leading to a drop in quality and lost revenue for the companies that choose to go all-in with LLMs.

    But let’s be realistic - modern LLMs are still a great and useful tool when used properly so they will stay. Our goal will be to keep them on track and reduce the negative impact of hallucinations.

    As a result, the software industry will move away from large, complex, interconnected systems that have millions of features but only a few of them actively used, toward small, high-quality, targeted tools, because their work will be easier to verify and their side effects easier to control.

    • leptons 1 day ago

      I wish I got to hallucinate at work, and just get a pat on the head for constantly doing the wrong thing.

      • 2ndorderthought 1 day ago

        I mean you can do that, but the job probably doesn't pay too much. Might enrich your spirituality though.

      • pllbnk 1 day ago

        Maybe I am unlucky but I had worked with too many developers who couldn't make a good decision if their life depended on it. LLMs at least know how to convince you of their decisions with strong arguments.

        • nothinkjustai 1 day ago

          Mmm, I feel it’s more common for them to just blindly agree with whatever you say.

          Assistant: “I propose A”

          User: “Actually B is better”

          Assistant: “you’re absolutely right”

          User: “actually let’s go with C”

          Assistant: “Good choice, reasons

          User: “wait A is better”

          Assistant: “Great decision!”

      • oompydoompy74 1 day ago

        The title for that is Director, VP, or CTO at any given large enterprise company.

        • leptons 1 day ago

          People downvoted you, but I actually know a few of these people.

    • lelanthran 1 day ago

      > If LLMs stop improving at the pace of the last few years (I believe they already are slowing down)

      Depending on how you measure "improvement" they already have or they never will :-/

      Measuring capability of the model as a ratio of context length, you reach the limits at around 300k-400k tokens of context; after that you have diminishing returns. We passed this point.

      Measuring capability purely by output, smarter harnesses in the future may unlock even more improvements in outputs; basically a twist on the "Sufficiently Smart Compiler" (https://wiki.c2.com/?SufficientlySmartCompiler=)

      That's the two extremes but there's more on the spectrum in between.

      • rgbrenner 1 day ago

        300k-400k isn’t the current limit if you create modules and/or organize the code reasonably, for the same reason we do this for humans: it allows us to interact with a component without loading the internals into our context.

        You can also execute larger tasks than this by using subagents to divide the work so each segment doesn’t exceed the usable context window. I regularly execute tasks that require hundreds of subagents, for example.

        In practice the context window is effectively unlimited, or at least exceptionally high — 100m+ tokens. It just requires you to structure the work so it can be done effectively — not so dissimilar to what you would do for a person.

        • jmalicki 1 day ago

          That makes it not a context window.

          How to organize code like you said, and how agents interact with it, to keep the actual context window small is the fundamental challenge.

          • lelanthran 1 day ago

            I keep getting surprised that people who are all-in on this (" i regular execute tasks that require hundreds of subagents ") don't have any idea of what is happening even a single layer below their interface to the LLM ("in practice the context window is effectively unlimited or at least exceptionally high — 100m+ tokens.")

            I looked at that response by GP (rgbrenner) and refrained from replying because if someone is both running hundreds of agents at a time AND oblivious to what "context window" means, there is no possible sane discourse that would result from any engagement.

          • rgbrenner 1 day ago

            ok "series of context windows spread across many agents".. sure much clearer.

            Doesn't change my point: the amount of code the agent can operate on is very large, if not unlimited, as long as you put even a little bit of thought into structuring things so it can be divided along a boundary.

            If you let the codebase degrade into spaghetti, then the LLM is going to have the same problem any engineer would have with that. The rules for good code didn't disappear.

            • jmalicki 1 day ago

              Context windows don't necessarily cleanly divide. Getting each agent to be able to task within a context window is a hard problem.

              It's not as if your context window with one agent is n and each of 10 agents cleanly gets n/10 of the problem. It takes some skill, but that is also where a lot of the advances are coming in.

              • rgbrenner 1 day ago

                300k tokens--the usable context window of a single agent--is about 40k lines of code, and you can't figure out a natural breakpoint within that code to divide up the task?
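
                For what it's worth, the 40k figure follows from common rule-of-thumb ratios (assumptions, not measurements): roughly 4 characters per token and ~30 characters per average line of code.

```python
# Back-of-envelope behind "300k tokens is about 40k lines of code".
# Assumed heuristics (rules of thumb, not measurements):
CHARS_PER_TOKEN = 4   # ~4 characters per token for typical code
CHARS_PER_LINE = 30   # ~30 characters per line, including indentation


def tokens_to_loc(tokens: int) -> int:
    """Estimate how many lines of code fit in a given token budget."""
    return tokens * CHARS_PER_TOKEN // CHARS_PER_LINE


print(tokens_to_loc(300_000))  # prints 40000
```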

    • parliament32 22 hours ago

      > I believe they already are slowing down

      They certainly have. The only impactful improvement in the last year is "just run it in a loop until it gets it right" lmao

      Which, of course, only works as long as the costs are subsidized by the companies vying for market share.

  • wan23 1 day ago

    Have you ever encountered the very common real life situation where there's some software that works, and you have a binary for it but you either don't have the source code or it doesn't compile for whatever reason? This is the pre-LLM world. Now, do you think LLMs make this situation better or worse? You may not know what's wrong with your software or how to fix it, but unlike in the past you can throw compute at trying to figure it out, or replicating a subset of it, or even replicating all of it depending on what it is. I think LLMs are making this situation better not worse.

    • lelanthran 1 day ago

      I think the problem with that sort of thought is that the burgeoning sizes of output for even trivial software makes it almost a certainty that:

      a) The stuff output by the existing LLMs is too unwieldy even for them to handle, even if the product itself is a glorified chatbot.

      b) If all software is throwaway, then the value of all software drops to, effectively, the price of an AI subscription. We'll all be drowning in a market of lemons (https://en.wikipedia.org/wiki/The_Market_for_Lemons), whilst also being producers in said market.

    • kingleopold 1 day ago

      Another aspect: the amount of code LLMs can handle went from a few lines to a small codebase in a few years, so maybe the future just makes a lot bigger codebases possible?

  • Keyframe 1 day ago

    Why are we pretending everyone's code is a paragon of quality? Most software out there is probably a hot mess already. No think behind it, let alone ultrathink.

    • Maxatar 1 day ago

      Exactly. Before the rise of LLMs, it was not at all uncommon to hear people claim that their job was just to Google API calls or copy and paste code from Stack Overflow. The context back then was that companies were being picky by hiring people who could demonstrate some modicum of understanding of data structures and algorithms, because all any developer does is tweak some CSS or make some calls to a database to glue together a CRUD app... why should anyone be expected to know how to reverse a linked list, or how a basic sorting algorithm works? Just download an npm package for that stuff and glue it all together with a series of nested for loops.

      With the rise of LLMs that do all of that... those people shut up, and shut up real fast.

  • beAbU 1 day ago

    Have you ever worked on a legacy codebase with actual good code? I struggle to see the difference between your predicted future and today's reality when it comes to working with legacy disasters.

    • jeromegv 1 day ago

      Well, on a legacy code base, you still needed humans to write those lines of code. There's a maximum number of lines a human can write in a year.

      Now with LLMs we are talking about millions and millions of lines of code that could be generated in a single day. The scale of the problem might not be the same at all.

  • stronglikedan 1 day ago

    > is no longer possible for anyone to maintain neither with nor without LLMs.

    That's what the Tech-Priests are for.

    • ofjcihen 1 day ago

      <INTERROGATIVE-HAVE YOU TRIED APPLYING INCENSE AND RECITING THE SACRED TECH LITANIES?>

  • ilaksh 1 day ago

    30 years down the line a human will wake up in his climate controlled bed in an idyllic large scale people-zoo, think about what information he wants, and immediately his 900TB ferroelectric compute-in-memory exobrain will read his thoughts via his brain-computer-interface, and render a custom 3d visualization of that information floating in front of him. There will be no separate code stage, just neural rendering of data to pixels.

    • giraffe_lady 1 day ago

      Who empties the bedpan?

      • sunrunner 1 day ago

        It's a tube and it's directly connected, for efficiency. Feel free to fill in the rest of the story.

    • efnx 1 day ago

      Better not think a forbidden thought. Oh shoot! You just did! :)

      • frizlab 1 day ago

        Well thanks, I lost the game now :|

    • butlike 1 day ago

      But are the pixels hot?

    • sunrunner 1 day ago

      > custom 3d visualization of that information floating in front of him.

      Eh, what a waste. Can't we just stimulate the optic nerve? Or better yet, whatever region of the brain is responsible for me being able to 'see' anything? And perhaps we can finally get smell-o-vision too.

      • ilaksh 18 hours ago

        I was hoping it was implied that the visuals also go through the BCI, since I didn't indicate any other new technology for holograms.

  • MagicMoonlight 1 day ago

    Have you seen Windows? We already have thirty years of slop.

  • pkulak 1 day ago

    I can't get used to vibe-coded projects on Github. One that I was using for a little while is about a year old, with 40,000 commits and 15,000 PRs. And it has "lite" in its name; it's supposed to be the simple alternative. There were so many bugs. I fixed one, submitted a PR, but it was off the first page in hours. It will never be merged. I moved to a different project with a bit less... velocity, and it has been way smoother.

  • butlike 1 day ago

    People, as a rule, don't really "go backwards." We didn't really walk back the industrial revolution, and we're probably not going to walk back from this day and age's activities. It's only unsettling until the changes are accepted. The old-timers can pine for a time before "all this," when they were children and all their needs were met by their now-deceased parents, and the cycle can continue on, yet again.

treespace8 1 day ago

I feel like an outlier in all of this. But isn't this just more AI slop? How is this different from text generation or image generation?

Like many people, I have used AI to generate crap I really don't care about. I need an image. Generate something like, whatever. Great, hey, a good-looking image! Now that's done, I can move on to something I find more interesting.

But it's slop. The image does not fit the context. It's just off. And you can tell that no one really cared.

This isn't good.

  • simonw 1 day ago

    The difference is that coding agents can run the code that they produce, fix any bugs, build tests and generally demonstrate that it works.

    You can't do that for images and text.

WhereIsTheTruth 21 hours ago

Keyboards and mice have always been a bottleneck; the average person only types around 50 words per minute.

If you want to build a project, you can never shorten the actual time it takes to write it out; you are stuck at that 50-words-per-minute limit.

LLMs, agents, call it how you want, they allow us to remove that bottleneck

saltyoldman 1 day ago

For work I do agentic engineering: the code that I submit for a code review is hand-reviewed by me. I know every line and file that I submit.

My side project is 80% vibe code. Every now and then I look and see all the bad stuff, then I scold Codex a bit and it refactors it for me. So I do see the author's point.

lenerdenator 1 day ago

> I know full well that if you ask Claude Code to build a JSON API endpoint that runs a SQL query and outputs the results as JSON, it’s just going to do it right. It’s not going to mess that up. You have it add automated tests, you have it add documentation, you know it’s going to be good.

> But I’m not reviewing that code. And now I’ve got that feeling of guilt: if I haven’t reviewed the code, is it really responsible for me to use this in production?

Answer: it wholly depends upon what management has dictated be the goal for GenAI use at the time.

There seems to be a trend of people outside of engineering organizations thinking that the "iron triangle" of software (and really, all) engineering no longer holds. Fast, cheap, good: now we can pick all three, and there's no limit to the first one in particular. They don't see why you can't crank out 10x productivity. They've been financially incentivized to think that way, and really, they can't lose if they look at it from an "engineer headcount" standpoint. The outcomes are:

1) The GenAI-augmented engineer cranks out 10x productivity without any quality consequences down the line, and keeps them from having to pay other people

or

2) The GenAI-augmented engineer cranks out 10x productivity with quality consequences down the line, at which point the engineer has given another exhibit in the case as to why they should no longer be employed at that organization. Let the lawyers and market inertia deal with the big issues that exist beyond the 90-day fiscal reporting period.

Either way, they have a route to the destination of not paying engineers, and that's the end goal.

If you don't like that way of running a software engineering organization, well, you're not alone, but if nothing else, you could use GenAI to make working for yourself less risky.

rolymath 1 day ago

Simon,

Just piggy backing on this post since I'm early:

Would love to see your take on how the AI and Django worlds will collide.

_rwo 1 day ago

> But I’m not reviewing that code (...)

That's the spirit, I always say - _others_ will deal with the AI slop during code review. Eventually they will get tired and start 'reviewing' this AI stuff with AI - so it's a win-win. Right?

scuff3d 23 hours ago

That's because "agentic engineering" is, by and large, a term made up to make people feel better about the fact that they're just vibe coding.

cess11 1 day ago

"But I’m not reviewing that code. And now I’ve got that feeling of guilt: if I haven’t reviewed the code, is it really responsible for me to use this in production?"

"I know full well that if you ask Claude Code to build a JSON API endpoint that runs a SQL query and outputs the results as JSON, it’s just going to do it right. It’s not going to mess that up. You have it add automated tests, you have it add documentation, you know it’s going to be good."

This really is Wordpress and early PHP all over again, but it's the seasoned folks rather than the amateurs that buy into it.

I believe these tools will be refined and locked down and eventually turn into RAD stuff used by certified enterprise consultants, much like SAP and Salesforce and IBM solutions and so on. From this I come to the conclusion that it is not a good idea to become dependent on them at this stage, which is corroborated by the pecuniary expense as well as excruciatingly fast change in available products.

dyauspitr 1 day ago

I still don’t get what agentic engineering is. Isn’t it all just asking the same LLM what you want it to do?

gverrilla 1 day ago

> I know full well that if you ask Claude Code to build a JSON API endpoint that runs a SQL query and outputs the results as JSON, it’s just going to do it right. It’s not going to mess that up.

> Claude Code does not have a professional reputation!

how come?

  • giulianob 1 day ago

    That's a wild statement to me. Even with spending significant time making plans with Opus 4.7 and GPT 5.5 on xhigh, I still find lots of poor decisions made when it actually goes to implement it. I find the quality of PRs hasn't dramatically changed either way because the better engineers will spot the issues whereas others will find what the AI is doing acceptable.

slopinthebag 1 day ago

I agree. I'm actually generating just over 20,000 lines of code each day at my company. Part of that was the mandate and leaderboards around token usage, but they also started using pull requests as an explicit metric. What I do is usually pull around 5 or so tickets at once, spin up 5 different agents, each on their own branch, have them work until completion, and then spin up two more agents to handle the merge request.

I'm not checking the code since the code doesn't really matter anymore anyways - I just have the agent write passing tests for the changes or additions I make, and so even if something breaks I can just point to the tests.

Some days, the tickets are completed much faster than I expect and I don't hit my daily token expenditure goal, so I have my own custom harness that actually hooks up an agent to TikTok: basically it splits up the reel into 1-second increments and then feeds those frames to the LLM for its own consumption. I can easily burn 10m tokens a day on this, and Claude seems to enjoy it.

Personally I want to thank you Simon for putting me onto this "vibe engineering" concept, I really didn't expect an archaeology major like myself to become a real engineer but thanks to AI now I can be! Truly gatekeeping in tech is now dead.

  • jFriedensreich 1 day ago

    I nearly fell for it until the TikTok part, thanks for the amusing shitpost

DonHopkins 1 day ago

Instead of "vibe coding" by asking the AI to design and write code, I'm having it refine my own designs, and write code under strict supervision and guidance, that I carefully review and iterate on.

I took a rock carving course in school that really enlightened me about software engineering, and it still applies today, especially to AI. You can't just decide what you want to carve, hold the chisel in just the right spot, and whack it with a hammer just perfectly so all the rock you want falls away leaving a perfect statue behind.

"I saw the angel in the marble and carved until I set him free." -Michelangelo

It's a long drawn out iterative process of making millions of tiny little chips, and letting the statue inside find its way out, in its natural form, instead of trying to impose a pre-determined form onto it.

Vibe coding is hoping your first whack of the hammer is going to make a good statue, then not even looking at the statue before shipping it!

But AI assisted conscientious coding (or agentic engineering as Simon calls it) is the opposite of that, where you chip away quickly and relentlessly, but you still have to carefully control where you chisel and what you carve away, and have an idea in your mind what you want before you start.

gverrilla 1 day ago

"Code quality" was always a mirage imo. Logic is what matters. I've used the internet from the early days, and probably 99% of software I used always had serious bugs. Ultima online was mentioned in HN recently: it was a real bug-and-exploit-fest. Banks, AAA games, companies like Uber with 1000's of engineers - they all had serious problems (and that's still true). It would be worst if some engineers didn't have that drive to code in high quality, but we gotta admit that was not ever enough. Even now with Claude Code, I see a lot of "specifications" that are far from specified enough - and people blame the LLM.

andy_ppp 1 day ago

Honestly, I think the belief that devs will still be needed is total copium. The progress made in two years is astounding, and in two years' time they will be better at programming than 99% of programmers. It's incredible what they can do now. No, it's not perfect, but imagine where we'll be in 5 or 10 years.

  • bamboozled 1 day ago

    All of those out of work radiologists would agree \s

fzzzy 1 day ago

man i love this post

gxs 1 day ago

I grew up on construction sites with my dad. If I've done well in my career, it was from watching him operate - managing huge construction crews, figuring out who to put on what tasks, handling surprises, setbacks, all that stuff.

My dad (now retired) was always super practical about stuff. He'd tell me pretty nonchalantly things like "yeah, we're dealing with xyz constraint, we may have to cut a corner over here, but that's ok." When I asked him about it, he gave me a little spiel: you can be thoughtful about how you do things, including when you can cut a corner and, more importantly, which corners are ok to cut.

I really took that to heart - especially the "be thoughtful about the corners you cut"

If an LLM has consistently one-shotted certain tasks and they are rote/mechanical, not reviewing that code is probably ok.

Are you getting lazy and not reviewing stuff that should be reviewed even if a human wrote it? That's probably not ok

I can live with some basic code that broke because it used outdated syntax somewhere (provided the code isn't part of a mission critical application), but I can't live with it fucking JWT signing etc

0gs 1 day ago

huh. i honestly never thought they were all that different. didn't the same guy coin them both to refer to the same thing?

  • simonw 1 day ago

    Not at all. Andrej Karpathy coined vibe coding as: https://twitter.com/karpathy/status/1886192184808149383

    > where you fully give in to the vibes, embrace exponentials, and forget that the code even exists [...] It's not too bad for throwaway weekend projects, but still quite amusing. I'm building a project or webapp, but it's not really coding - I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.

    So clearly we need a term for what happens when experienced, professional software engineers use LLM tooling as part of a responsible development process, taking full advantage of their existing expertise and with a goal to produce good, reliable software.

    "Agentic engineering" is a good candidate for that.

    • dev360 1 day ago

      > as part of a responsible development process, taking full advantage of their existing expertise and with a goal to produce good, reliable software

      It's shifted so much for me. I used to think that I had a solemn duty to read every line and understand it, or to write all the test cases. Then I started noticing that tools like CodeRabbit or Cursor would find things in my code that I would rarely find myself.

      I think right now it's shifted my perception of my role to one where I am responsible for "tilting" the agentic coding loop; ultimately the goal is a matter of ensuring the agent learns from its mistakes, self-organizes, and embraces a spirit of Kaizen.

      Btw, thank you for your work on Django; the last 20 years with it were life-changing (I did .NET before).

    • 0gs 1 day ago
      • simonw 22 hours ago

        Yeah, Andrej thinks "agentic engineering" is a good candidate too. Note that he's not claiming to have coined it there:

        > Many people have tried to come up with a better name for this to differentiate it from vibe coding, personally my current favorite "agentic engineering"

        Andrej posted that on 4th of February. I first saw the term "agentic engineering" used by OpenClaw creator Peter Steinberger in October 2025 (a month before he wrote the first line of code for OpenClaw) https://steipete.me/posts/just-talk-to-it

        • 0gs 21 hours ago

          aha! got it

hirvi74 1 day ago

I'd be lying if I said I was not worried about the future. I am not necessarily worried in the sense that there will be some grave, impending doom that awaits the future of humanity.

Rather, I just feel like I have to constantly remind myself of the impermanence of all things. Like snow, from water come to water gone.

Perhaps I put too much of my identity into being a programmer. Sure, LLMs cannot replace most of us in their current state, but what about 5 years, 10 years, ..., 50 years from now? I just cannot help but feel a sense of nihilism and existential dread.

Some might argue that we will always be needed, but I am not certain I want to be needed in such a way. Of course, no one is taking hand-coding away from me. I can hand-code all I want on my own time, but occupationally that may be difficult in the future. I have rambled enough, but all in all, I do not think I want to participate in this society anymore, yet I do not know how to escape it either.

  • cortesoft 1 day ago

    If you work in any new technology field, the chances that your job will exist in the same way 50 years from now is very small.

    The job, as you have done it at least, was also not here 50 years before you started doing it.

    Did you have any of the same feelings knowing that you were doing a job that has not existed in the world very long? That seems like a strange requirement for a meaningful job, that it should remain the same for 50+ years.

    In truth, our world and what we do for our careers is entirely shaped by the time that we live in. Even people that ostensibly do the same thing people have done for centuries (farmer, teacher, etc) are very different today than 100 years ago.

Fokamul 1 day ago

Reminder: cybersecurity will be huge in the coming years.

Companies are shipping things and nobody understands what they're shipping.

jonahs197 1 day ago

What the F is "agentic" really?

xienze 1 day ago

> And that feels about right to me. I can plumb my house if I watch enough YouTube videos on plumbing. I would rather hire a plumber.

I don't buy this argument at all. I think if we could pay $20/month to a service that would send over a junior plumber/carpenter/electrician with an encyclopedic knowledge of the craft, did the right thing the majority of the time, and we could observe and direct them, we'd all sign up for that in a heartbeat. Worst case, you have to hire an experienced, expensive person to fix the mess. Yes, I can hear everyone now, "worst case is they burn your house down." Sure, but as we're reminded _constantly_ when we read stories about AI agent catastrophes -- a human could wipe your prod database too. wHy ArE yOu HoLdInG iT tO a DiFfErEnT sTaNdArD???

The business side of the house is getting to live that scenario out right now as far as software goes. Sure you've got years of expertise that an LLM doesn't have _yet_. What makes you think it can't replace that part of your job as well?

  • cortesoft 1 day ago

    I literally do pay $20 a month to have a plumber service on call.

    • xienze 1 day ago

      And that includes materials, labor, and will be there the instant you need them?

      • cortesoft 1 day ago

        Not instant, but same day yes.

        • xienze 1 day ago

          ... And labor and materials?

          • cortesoft 1 day ago

            Labor is included; I have to pay for any additional materials required.

  • wavemode 1 day ago

    You're comparing paying $20 for an AI plumber to paying hundreds/thousands for a traditional plumber.

    But that's not what the author is talking about in that passage you quoted. What he's saying is that, if you can pay $20 for an AI plumber, then it stands to reason that eventually you will be able to pay $30 to a company that manages AI plumbers for you, so that you don't even have to go to the trouble of supervising the plumber. Most people will choose the $30.

    • xienze 1 day ago

      It's in a section called "Why I’m still not afraid for my career."

      The implication here is that software engineering jobs are still safe despite basically free labor/materials being available to do said jobs, because he thinks other people would prefer to pay experienced professionals to do it right at a significantly higher cost. My point is, I think most people will take the low-stakes gamble of having the cheap AI agent do it with self-supervision[0]. He's naive in thinking people are really going to care about artisanal software built by experienced professionals in the future.

      0: Even if you subscribe to the "your job will be to supervise the agents" train of thought, you're kinda glossing over the fact that it's probably gonna involve a pretty significant pay cut and the looming problem of "how do new experienced professionals get created if they don't have to/don't need to get their hands dirty"?

      • esafak 15 hours ago

        Do humans take a pay cut when they manage other humans? Does directing agents take little technical skill that merits a pay cut?

  • techblueberry 1 day ago

    > I think if we could pay $20/month to a service that would send over a junior plumber/carpenter/electrician with an encyclopedic knowledge of the craft, did the right thing the majority of the time, and we could observe and direct them, we'd all sign up for that in a heartbeat.

    I don’t think this comparison quite works (or maybe I think it works and is wrong) and I think it has something to do with creativity or the initial ideation.

    I would do this, but I’m a jack of all trades. I built my own diner booth in my kitchen recently. But my wife, who loves the diner booth, just doesn’t really want to get over the hump of figuring out what she might want. I think most people want to offload the mental load of figuring out where to start.

    Most people aren’t just bored by coding, they’re bored or overwhelmed by the idea of thinking about software in the first place. Same with plumbing or construction, most people aren’t hiring someone to direct, they’re hiring a director.

    Even I have this about some things, sometimes I choose to outsource the full stack of something to give me more space to do creativity elsewhere.

zuzululu 1 day ago

Vibe coding is just coding now. Writing assembly used to be a thing too, until higher and higher level languages were created. LLMs are like that, except they compile English to code. This understandably scares a lot of professionals.

drfloyd51 1 day ago

It is pure arrogance to expect that machines will never be able to code as well as a skilled human.

And AI generated code should be different than human code. AI has infinite memory for details. AI doesn’t need organizational patterns like classes. Potentially AI can write code that is more performant than any human.

Will it look like garbage? Sure. Will the code be more suited to the task? Yes.

  • vehemenz 1 day ago

    I would only add one caveat to this:

    Code that is organized well and operates coherently in the first place, by an LLM or not, will be easier to iterate on, by an LLM or not.

  • tyyyy3 1 day ago

    Your post reeks of pure arrogance. You sound like the bozos at Anthropic who made an AI agent for finance and think this is somehow going to provide a huge productivity boost because all they do is a bunch of tick-boxing and spreadsheet work.

    No, just no.

  • jazzypants 1 day ago

    I find it hard to believe that code with unnecessary cruft and repetition is "more suited to the task". I've literally deleted hundreds of unnecessary or unused functions at this point. The only way I can agree is if "more suited" means, "it's wearing multiple suits for no reason".

  • tuom1s 1 day ago

    What will happen when AI companies increase the price of tokens?

    The code produced will only be understandable by AI. You could use locally hosted LLMs, but they won't be as performant as the AI run by the big players. And there is nothing stopping greedy companies from implementing some ridiculous pattern that only their model can reasonably work with.

    So what will you do in a situation where you can't understand "your" codebase and you have to make changes or fix a bug?

    • jnwatson 1 day ago

      What happens when the price of tokens goes to 0?

      The open-weight models are nipping at the heels of frontier models. The frontier labs have to make forward progress and keep tokens cheap in order to maintain market share.

      Eventually, we'll have a Mythos-level model running on integrated hardware on every PC.

      • JackSlateur 1 day ago

        Are you ready to bet your future on this?

    • pylua 1 day ago

      Eventually I would bet on AI using its own non-human-readable languages (brains?) to program in, to reduce overhead.

      It will be a black box, and the code will be generated just in time by AI for each API request.

      • recursive 18 hours ago

        If the language is unreadable for humans, we really can't trust that it does what it claims to do, except by testing. This requires more trust in the system than is warranted IMO. You can never be sure that there's no "sleeper" logic waiting to get activated. See "Reflections on Trusting Trust" by Ken Thompson. If we build systems that start relying on opaque mechanisms, it seems to be only a matter of time before things start behaving in ways contrary to what their authors intended, with no clear way to stop it other than hitting the power button, if that's even possible at that point.

    • platevoltage 1 day ago

      I think this is going to happen sooner than most people think.

    • drfloyd51 13 hours ago

      That is a pricing problem. And it is an absolute risk. That doesn’t change AI’s potential to be a better coder than 98-100% of everyone.