YZF 3 days ago

What tools have you tried? Are we talking Codex GPT 5.5 and Opus 4.7?

Would you say the project is well architected? Clear boundaries? Or ball of mud?

How large is large?

Are there AGENT.md files giving good information that helps LLMs get context when looking at a certain area of the code?

Is it all in one repo? Multiple repos?

Are there good tests?

I feel like these are some of the many variables that can make a difference.

I work on a pretty large project/code base, written mostly in Go, and I have pretty positive experience with LLMs. I take on fairly small chunks, I review and understand the changes. I also use LLMs to explore options and prototype quickly. They're also very good at fixing bugs, failing tests etc.

mittensc 3 days ago

> What tools have you tried? Are we talking Codex GPT 5.5 and Opus 4.7?

Yes, with generous budgets.

> They're also very good at fixing bugs,

Seeing opposite here too, they are like eager juniors 'oh the issue is here and here's a 5 page report why', and it's wrong... then you add more info and it goes to a different spot... repeat until you get tired and solve it yourself, it is useful as a rubber ducky I guess.

> I work on a pretty large project/code base, written mostly in Go, and I have pretty positive experience with LLMs. I take on fairly small chunks, I review and understand the changes.

Great that it's working for you, I'm just pointing out there's a massive disconnect.

I would assume your work can be done by a junior engineer without any prior knowledge (except LLM md files) with same quality but less speed?

If yes, then great, perhaps that's where the disconnect is, complexity.

Also, if yes, which would be cheaper, a junior engineer or an LLM?

  • brabel 3 days ago

    > Seeing opposite here too, they are like eager juniors 'oh the issue is here and here's a 5 page report why', and it's wrong... then you add more info and it goes to a different spot... repeat until you get tired and solve it yourself, it is useful as a rubber ducky I guess.

    It's really amazing how different people have completely different experiences. I work on a massive code base and I thought AI would not be able to fix anything in at least a few years since the application is very complex and does not use well known frameworks. I was very wrong. In my experience, it fixes bugs better than I could, at least given a short time budget (which is always the case, if we spend too much time on each bug we just fix bugs slower than they get reported and we'd enter a death spiral).

    I have worked on this code base for more than 10 years, touched every part of it, and I wrote large chunks of most systems, despite around 20 people working on it right now. Still, when I need to figure out something, now, I often ask AI as it is absolutely wonderful in understanding and explaining code, no matter how big the code base is. My team consists of 20 very senior developers, and I am their technical lead, so I think I know what I am talking about.

    A junior would require at least 6 months of guidance to become productive in our code base, unfortunately, just because it's so big and it integrates with all sorts of external services, databases etc. I do understand that saying this is not really a flex, I would've actually preferred that my code base was so good even a junior developer could be immediately productive in it, but that's sadly just not the case. But perhaps, with the help of an AI tutor, that's actually possible now?!

    If you think AI is at the level of a junior developer right now, I'm afraid you're kidding yourself.

    In case you're wondering: we use Claude Code.

    • mittensc 3 days ago

      > given a short time budget (which is always the case, if we spend too much time on each bug we just fix bugs slower than they get reported and we'd enter a death spiral).

      This is something I don't understand.

      - If you have a bug, you need to fix it well and find the proper root cause.

      - That way the bug never surfaces again and safeguards are added for that class of bugs.

      - if done well over time it builds discipline and bugs only surface from new features or integrations.

      I've never had an experience of a 'death spiral' that you mention.

      > Still, when I need to figure out something, now, I often ask AI as it is absolutely wonderful in understanding and explaining code, no matter how big the code base is.

      Sure, but you still dig into the code afterwards I assume, you don't blindly trust what the AI summarization tells you.

      > If you think AI is at the level of a junior developer right now, I'm afraid you're kidding yourself.

      It depends. Small projects with a well-defined scope, yeah, it knocks them out of the park. On what I'm working on, it's a bit disappointing, and not for lack of trying.

      Still, one other thing I'm noticing now... if my account were not anonymous I would likely need to think of possible repercussions for my 'lack of faith' and would probably post comments very similar to yours or not at all.

      So I'll stop here.

      • brabel 3 days ago

        > If you have a bug, you need to fix it well and find the proper root cause.

        Can you spend 3 months fixing a bug and doing nothing else? You always have a time budget, whether you know it or not, even for your hobby projects. Do you not have users reporting bugs regularly? Any large product will have bugs, I see the biggest companies with the best engineers maintaining open source repositories with thousands of bugs, and the list just keeps growing. Internal products are even worse. All you need for your bug list to keep growing is one bug taking longer to fix than the rate at which bugs are reported.
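
        For a rough sense of that arithmetic, here's a minimal Go sketch. The weekly numbers and the `backlogAfter` helper are made up for illustration, not real data, and it assumes a constant report rate and a constant fix capacity, which no real team has.

        ```go
        package main

        import "fmt"

        // backlogAfter returns the number of open bugs after the given number of
        // weeks, assuming a fixed number of bugs reported per week and a fixed
        // number the team can fix per week (both hypothetical figures).
        // The team cannot fix more bugs than are currently open.
        func backlogAfter(weeks, reportedPerWeek, fixedPerWeek, start int) int {
        	open := start
        	for w := 0; w < weeks; w++ {
        		open += reportedPerWeek
        		fixed := fixedPerWeek
        		if fixed > open {
        			fixed = open
        		}
        		open -= fixed
        	}
        	return open
        }

        func main() {
        	// Fixing slower than reports arrive: the list grows without bound.
        	fmt.Println(backlogAfter(52, 10, 8, 0)) // prints 104
        	// Fixing faster than reports arrive: the list stays empty.
        	fmt.Println(backlogAfter(52, 10, 12, 0)) // prints 0
        }
        ```

        With 10 reports and 8 fixes a week the backlog is 104 after a year; with capacity for 12 a week it stays at zero.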

        > if done well over time it builds discipline and bugs only surface from new features or integrations.

        Yes, and we have a whole lot of features coming out every release. We have a very large product. That's why we keep adding "bugs"! Not because we're fixing bugs that had already been badly fixed previously, if that's what you're thinking.

        You've never seen a bug spiral? I must assume you're new to this industry. Bug spirals have killed many companies. It's very common to have code that's so bad no one can touch it without introducing lots of bugs. Fix one bug, 2 new bugs are introduced.

        Luckily, where I work we have a lot of tests so it's rare that we have regressions, so the main cause of bugs is the new features, especially big ones, as it's humanly impossible to review them thoroughly enough that there are no bugs. That's where I think AI will help a lot - but we're still trying to figure out exactly how. Simply letting the AI review everything is not enough. And as I said before, humans just can't spot bugs to save their lives, me included.

        > if my account were not anonymous I would likely need to think of possible repercussions for my 'lack of faith'

        That's weird to hear, HN is about 50% AI enthusiasts, 50% AI skeptics, at least that's my impression.

        I was a skeptic until recently, but in the last few months of using Claude Code (and Copilot, but Copilot consistently performs worse), the LLM has become better than most humans IMO. I still write a bit of code by hand, though, simply because I can't help it and sometimes I know I can do things very fast anyway so why burn LLM tokens on the thing. But sometimes I try to "correct" AI code just to learn later the AI was right (normally tests pick that up - we instruct the AI to write comprehensive tests, and it does it well... I normally review mostly the test code and less so the implementation). I am almost at a level where I believe not using LLMs to write code professionally is akin to not using static type systems: you're refusing to let the computer help you for no reason. It's not about faith, it's about using the tools that make our jobs easier and our output better. I know not everyone is there yet, but I definitely feel like I am.

        • mittensc 3 days ago

          > Can you spend 3 months fixing a bug and doing nothing else?

          In what world would that be needed or accepted?

          It generally takes 1-2 days to fix harder issues like race conditions/memory corruptions. Regular bugs are much faster. All fixed correctly without AI.

          AI just goes on a random path every time and in general fails to find the root cause unless you tell it explicitly what it is...

          > I was a skeptic until recently, but in the last few months of using Claude Code (and Copilot, but Copilot consistently performs worse), the LLM has become better than most humans IMO

          great that it's working on your end

  • arw0n 3 days ago

    Could you maybe in broad strokes explain what you are working on? I think it is very plausible that the disconnect is between people writing front ends/rest apis vs people solving things like graphics.

    • YZF 3 days ago

      In my case this is not simply "rest APIs". It's a fairly complex code base. Not trivial work. But the code base is fairly clean and so localized understanding can be sufficient for many tasks.

  • YZF 3 days ago

    I would say much better than a junior without any prior knowledge. But definitely not a senior with knowledge. I.e. needs guidance.

    x200 the speed of a junior.

    It's interesting how far our experiences differ. I have heard from people working on C/C++ code bases that it's more challenging and I haven't tried the LLMs in these domains.

    I do see people getting results even internally. Sometimes it's about getting to learn the tool. It's really interesting how we have this mix of "this is garbage" and "this is really useful". From my end I don't think I'm making stuff up or looking through some rosy glasses and I've been coding for 30+ years.

    EDIT: I should add that when I use AI I already have a "shape" in my head of what I'm trying to get done. It's not like I tell AI something vague (like a user level issue) and expect it to fully understand a huge code base (though sometimes that also works). If I have a race I might have a Go race detector goroutine dump. If I'm refactoring I know where the work needs to happen. If I have a test failure I know what test failed and I usually have some idea of where to start.
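
    To make the race case concrete, here's a minimal sketch of the kind of bug where that dump helps. `unsafeCount` and `safeCount` are hypothetical names for illustration, not from our code base. Running this with `go run -race` should flag the unsynchronized increment in `unsafeCount`; the mutex version is one common fix, not the only one.

    ```go
    package main

    import (
    	"fmt"
    	"sync"
    )

    // unsafeCount increments a shared counter from n goroutines with no
    // synchronization. The result is not guaranteed to equal n, and the
    // race detector is expected to report the conflicting accesses.
    func unsafeCount(n int) int {
    	var wg sync.WaitGroup
    	total := 0
    	for i := 0; i < n; i++ {
    		wg.Add(1)
    		go func() {
    			defer wg.Done()
    			total++ // unsynchronized read-modify-write: a data race
    		}()
    	}
    	wg.Wait()
    	return total
    }

    // safeCount does the same work but guards the counter with a mutex,
    // so every increment is observed and the result is always n.
    func safeCount(n int) int {
    	var (
    		mu    sync.Mutex
    		wg    sync.WaitGroup
    		total int
    	)
    	for i := 0; i < n; i++ {
    		wg.Add(1)
    		go func() {
    			defer wg.Done()
    			mu.Lock()
    			total++
    			mu.Unlock()
    		}()
    	}
    	wg.Wait()
    	return total
    }

    func main() {
    	fmt.Println("unsafe:", unsafeCount(1000)) // may vary from run to run
    	fmt.Println("safe:  ", safeCount(1000))
    }
    ```

    Handing the tool the race report plus the function it points at is the kind of "shape" I mean: the search space is already narrow before the LLM sees it.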

    I'll also add that the resulting AI-assisted code is good. I review it as it is being written and, if there are issues (either functional or stylistic), make adjustments. All our code gets reviewed and all of it has quite extensive tests. Again, this is above junior level.

iLoveOncall 3 days ago

That's a lot of "ifs" for something supposed to revolutionize the industry.