Former Microsoft CEO Steve Ballmer on useless performance metrics[1][2]:
> In IBM there's a religion in software that says you have to count K-LOCs, and a K-LOC is a thousand lines of code. How big a project is it? Oh, it's sort of a 10K-LOC project. This is a 20K-LOCer. And this is 50K-LOCs. And IBM wanted to sort of make it the religion about how we got paid. How much money we made off OS/2, how much they did. How many K-LOCs did you do? And we kept trying to convince them - hey, if we have - a developer's got a good idea and he can get something done in 4K-LOCs instead of 20K-LOCs, should we make less money? Because he's made something smaller and faster, less K-LOC. K-LOCs, K-LOCs, that's the methodology. Ugh, anyway, that always makes my back just crinkle up at the thought of the whole thing.
Which ties in nicely to Bill Gates' quote:
> Measuring programming progress by lines of code is like measuring aircraft building progress by weight.
Maybe the question of "How do I measure developer productivity?" is too broad to be useful?
Per the article, what managers often actually want to know is "How do I detect when a developer is so unproductive that they should not be retained?".
What are some behaviors developers show when they've "checked out"? Not showing up to work or logging in is an obvious example. Lines of code might be a poor metric, but maybe being in the bottom 1% of coders on that metric is a sufficient signal of poor performance?
...at least as a detection mechanism to highlight for further investigation?
Your best programmer will likely remove thousands of lines of code from your application and replace them with just a few lines.
For a manager there is no universal metric other than the hard work of understanding what your employees are actually doing.
The manager rarely needs to detect when a developer is that unproductive - it's glaringly obvious. In practice, even if you as the manager don't notice, the team will tell you that another developer isn't pulling their weight. However, in most organizations you can't just fire someone because their teammates have a gut feeling - you need evidence as you put a case together. Further, some causes of poor performance are temporary; some indicate a lack of skill that can be remedied; some mean a good worker is in the wrong place in the organization; and sometimes a developer is just lazy. A good manager will be able to discern which it is and determine whether the solution is to change the environment in a way that brings success to the dev.
But if the case is that the developer needs to be "managed out," you need to build the case as a manager. This is where those metrics help.
Reminds me of this story about Bill Atkinson at Apple, submitting a negative 2,000 lines of code report to his manager. https://www.folklore.org/StoryView.py?story=Negative_2000_Li...
The first Turkish software export was measured in Meters :)
The story goes, a Turkish company sells software to a British client for a few hundred thousand GBP and has to go through the bureaucracy for accounting purposes.
Turns out, the customs clerk doesn't understand how software works and says he cannot certify the export of a few hundred grand's worth of goods delivered at the press of a button (the delivery was done over a 28 kbps modem connection). They bring him a disk, but this is not good enough; there is no way this piece is worth that much, he says, and rejects the application.
They end up putting the software on a tape and declare that they've sold 2000 meters of fine Turkish software to the Brits. The clerk likes that, so the first software export from Turkey ends up being measured in Meters!
[0] The sources are in Turkish but the guy who was involved in this is Ali Akurgal.
That's a great story. If I did that for my projects I'm sure I'd come off feeling as if I've built a lot more than I currently do.
As a Turk, I can confirm the story.
Also, the same story highlights the roots of the term "tape-out".
Since IC designs were stored on tapes at the time, you would write the final design to a tape and send it out to the fab; hence, you "tape out" the design.
I recall from some HN story that "tape out" is an even older term, referring to the practice of putting black line tape on the (much magnified) artwork for the photomask. Wikipedia seems to confirm this: https://en.wikipedia.org/wiki/Tape-out#History.
Thanks for the info, TIL something new.
Cheers!
I can't find a reference now, but there's an old story about an early import of some software from the UK into Ireland (in the 50s or so). The software came on punched cards, and when it arrived turned out not to load. After much investigation it turned out that some cards were missing, so the vendor sent it again. This time different cards were missing. Eventually the problem was tracked down; at the time, it was normal for customs officials to retain a small sample of any bulk material imported for post-hoc inspection. The punched cards were viewed as a bulk material...
Twenty years ago I worked for a company that had in the past paid their contractors by LOC count.
Turns out, if you base the pay on that... why create a function and call it repeatedly when you can just copy and paste the same code block over and over again?
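To illustrate (a hypothetical sketch, not code from the company in question): both functions below do exactly the same work, but the copy-pasted version racks up several times the line count, and thus several times the pay under a per-LOC scheme.

```python
def normalize(name):
    """Trim whitespace and lowercase a single name."""
    return name.strip().lower()

def concise(names):
    # One line of logic, reused through a function call.
    return [normalize(n) for n in names]

def padded(names):
    # The same logic copy-pasted once per element: far more LOC,
    # identical behavior, and more pay if pay is per line.
    result = []
    result.append(names[0].strip().lower())
    result.append(names[1].strip().lower())
    result.append(names[2].strip().lower())
    return result

print(concise([" Alice ", "BOB", "Carol "]))  # ['alice', 'bob', 'carol']
print(padded([" Alice ", "BOB", "Carol "]))   # identical output
```

Scale `padded` up to real inputs and you get exactly the duplication described above.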
As soon as a measure becomes a target, it ceases to be a good measure.
My lead would have a conniption if I ever did that.
I have seen contractors do things like write 1,000 lines of enum declarations, one for every integer from 1 to 1000. That's an extreme example.
https://en.m.wikipedia.org/wiki/Goodhart%27s_law
https://en.m.wikipedia.org/wiki/Campbell%27s_law
If you know a thing or two about evolution, it should be obvious how these shortcuts will play out.
And still, you hear about similar metrics being used all the time... All over the place.
I suspect, in most cases, it's actually bad metrics stacked on top of each other, and really the manager trying to put a metric on their own "contribution". (Parasitic isn't even the word; viral or prionic, rather...)
In the end, truly and reproducibly recognizing individual contribution is probably beyond human comprehension for any non-trivial task. Maybe your product actually benefits a lot from that one guy who is just very good at promoting a good spirit and being fun to work with, rather than being the most prolific coder. And how do you measure how hard a problem is? Does it matter who on your team perceives it as hard?
I think it's one of those things where trying to be clever will most likely make things much worse, because the manual labor and personal involvement of the past was already best fitted to the task, utilizing the human brain where it excels. Humans doing human things with humans, without being willfully ignorant of the complexity at hand.
Now think about money as a metric for human prosperity. Which creatures really roam the markets and where is mankind's place there? Is any dynamic stability of intrinsic value to us? (Not trying to be edgy, or deep, I just think those are the questions indeed usually not answered by free market enthusiasts.)
It's always fun trying to explain to non-tech management the two truths of managing software development:
1. There is no objective measure of productivity that can be applied to software development.
2. Accurate estimation of development times is impossible (not difficult, actually theoretically impossible). All estimates of development times are wrong, some by more than an order of magnitude.
The second one is especially hard to grok for non-techies. I have to explain that if you insist on accurate estimates, you will get grossly padded estimates [0]. And the work will expand to fit the time available (sometimes resulting in the task exceeding even the massively padded estimates because the work expanded with the estimate). I've had many entertaining conversations attempting to explain this.
[0] Because every developer knows that an estimate will magically become a deadline.
> 2. Accurate estimation of development times is impossible (not difficult, actually theoretically impossible). All estimates of development times are wrong, some by more than an order of magnitude.
Some are useful. The goal isn't to be right; there's no award for perfect estimates, that would be stupid. But having an estimate, especially one relative to other features, is helpful.
At one time a team I worked on used "story point cards": each person would estimate blindly and then discuss. It was interesting hearing people's reasons behind their magnitudes.
Over time the story estimates got pretty good. But the number was completely useless objectively and made no sense when comparing teams or different projects.
Yeah, there are methods of getting to a useful "it's about this big" estimate.
The real pain is turning estimates into deadlines. "About this big" doesn't equate to "it'll be finished on Thursday".
It should not be expressed as a size, then. It should be expressed in dice!
“This story is 4D6, that one about a D20”
haha, love it. "Difficulty estimate: 15. Roll a d20 every day to see if it's completed, with a +1 modifier per day"
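For fun, the joke rule above can be simulated. A toy Monte Carlo sketch, assuming the +1 modifier accumulates for each elapsed day and the task completes on the first day the roll meets the difficulty:

```python
import random

def days_to_complete(difficulty=15, rng=random):
    """Roll a d20 each day, adding +1 for every day already spent;
    return the day number on which the roll first meets the difficulty."""
    day = 0
    while True:
        day += 1
        roll = rng.randint(1, 20) + (day - 1)  # cumulative +1 per prior day
        if roll >= difficulty:
            return day

rng = random.Random(42)  # fixed seed so the experiment is repeatable
trials = [days_to_complete(15, rng) for _ in range(10_000)]
print("average days:", sum(trials) / len(trials))
print("worst case:", max(trials))  # the growing modifier caps it at day 15
```

Nicely, the accumulating modifier guarantees completion by day 15 for difficulty 15, which is more than can be said for most real estimates.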
> The second one is especially hard to grok for non-techies. I have to explain that if you insist on accurate estimates, you will get grossly padded estimates [0]. And the work will expand to fit the time available (sometimes resulting in the task exceeding even the massively padded estimates because the work expanded with the estimate). I've had many entertaining conversations attempting to explain this.
Sounds like Hofstadter's law.
"Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law."
[1]: https://en.wikipedia.org/wiki/Hofstadter%27s_law
> 2. Accurate estimation of development times is impossible (not difficult, actually theoretically impossible). All estimates of development times are wrong, some by more than an order of magnitude.
Personally, nowadays I care more about velocity. If the team can deliver the smallest added value to the system within a week, we are good. The more constant the team's velocity, the more management trusts you to produce value and get their things done without the need for massive estimation parties that almost always end up being wrong.
The sad thing is, once the velocity gets bogged down or starts wavering (org change, constantly changing directions, ...), the estimation parties are here to stay, and it's quite hard to get back to where the team was.
From my experience, and just a rule of thumb:
If you give a bug to a software developer to fix, the good software developer will probably rewrite some code and in the end the LOC count will be the same or less. The bad/lazy software developer will glue some fix on top of the bug and add additional LOCs.
Depends. Sometimes software is so convoluted that you're sparing yourself and your colleagues a lot of trouble by adding that one liner rather than doing a big refactoring, especially in the absence of tests.
KLOC can be a useful productivity metric to track at an individual developer level, given the developer works in the same domain for an extended period of time, and has the professional and skill level maturity not to game it. That relies on management that doesn't try to aggregate it or compare it in an apples to oranges fashion across developers, domains or (worse) languages.
The problem is that this is a theoretical scenario which rarely comes up in reality. The developers operating at the skill and maturity level where tracking their individual productivity and quality can be done well, generally are not the developers who really need these types of measures, and they usually aren't left grinding on the same domain for the years required to have enough data to do anything useful with it.
It is kind of a catch-22. The situations where tracking and gaining insight from KLOCs would help (mature, skilled devs working on the same thing over and over for years, which doesn't happen much) are not the situations where it is most needed (projects with hundreds of relatively inexperienced or immature developers and managers).
The result is it is applied to disastrous effect in situations where immature managers glom onto it as a silver bullet for managing the unmanageable.
I always get a bit cranky about the LoC thing.
I’d rather have an employee that spent three days, optimizing a single 100 LoC method, than one that churned out 2,000 LoC that does the same thing, in two days, that will need to be maintained, forevermore.
Fewer LoC isn't always better for maintenance. I'd certainly rather have my team maintain 1KLoC of JavaScript than 50LoC of Brainfuck.
Good point. I write Swift. It's very easy to write almost entirely inscrutable Swift.
I sometimes have to go back, and put code back in, to make it maintainable (often, by Yours Truly).
The less code the better! Usually when I see massive PRs with hundreds of lines of code, the engineer who wrote it has through lack of experience or lack of humility re-written a huge chunk of stuff that's already in the standard library.
Less code reaches a limit where readability and future maintenance costs more. The art is having just enough code to do the job without having too much or too little. Unfortunately, programmers are not often measured on their ability to write code that can be handed off to others and maintained for the next 20 years.
Did people get creative with stretching simple things into ridiculous line counts? I totally would.
Ballmer was actually insightful and philosophical, when he wasn't throwing chairs around?
Having a temper doesn't prevent one from being intelligent. Gates and Jobs had famously bad tempers too.
Was Jobs really that smart? Not sure that's the niche he filled successfully. More like Kinski: talented in some domain... but smart?! Jobs pretty much died of the Dunning-Kruger effect, no?
People can be very intelligent in one area, and not as much in others.
There are many scientists who have very high standards for proof and evidence in their professional life, but accept other theories as truth without a shred of evidence in their personal lives.
Sure, but I think it's also easy to conflate success and talent with intelligence. I think many successful scientists are not particularly smart, but rather diligent and educated; most innovation is not a leap, but steady incremental progress which may culminate in an "innovative" product at some point.
I don't know much about Jobs, that's why I asked. From what I gather, he was more of a charismatic leader with good intuition, but I wouldn't call those attributes intelligence per se. Is there any evidence of a particular intelligence of his?
What exactly do you consider evidence of intelligence?
Idk. But surely you can't make your point without it, can you? I think success and fame are not an appropriate proxy.
So what is? If the visionary genius of Steve Jobs doesn't qualify for intelligence, what does?
Wasn't he the leading advocate of stack ranking at Microsoft, though? That's not particularly insightful or philosophical, it's just lazy management.
It's almost like humans are complex creatures...
He backed the HR lead personally; I think it was their thing.
I think this might have been one of those infinite monkeys moments.
To be fair, this was nearly universally known for decades by the time he became CEO of MS.
He is simply not old enough to have worked at a time when this was new.