Consider a spoon-fed spectrum for AIs working in large codebases: where on it is state-of-the-art AI?
"Here's a bug report, fix it."
"Here's a bug report, an explanation of what's triggering it, fix it."
"Here's a bug report, an explanation of what's triggering it, and ideas for what needs to change in code, fix it."
"Here's a bug report, an explanation of what's triggering it, and an exact plan for changing it, fix it."
If I have to spoon-feed as much as the last case, then I might as well just do the work myself. The second-to-last case is about the level of a fresh hire who is still ramping up and would still count as a drain under Brooks's law.
I suppose the other axis is: How much do I dread performing the resultant code review?
Put them together and you have a "spoon-fed / dread" graph of AI programmer performance.
Another thing: working on a large codebase is not even mostly about writing code; it's about verifying the change. There are plenty of tickets in our backlog that I could roll through "fixing" by just joyriding my IDE through the codebase, but verifying each of those changes can in some cases take days (I work on a platform supporting most of the company's business).
I guess the AI folks will insist that the next step is "agentic" AIs that push the changes to a test environment they keep up to date, add and modify tests while ensuring the tests still check intent, create an MR, argue with the other agents in the review, check the nightly integration report, and support the change into production.