There is a creeping realization among heavy LLM-using firms that these technologies make their productivity problem worse, not better.
The practical consequence is that, despite their transformative power, these technologies are very much in a market bubble.
Like the web, this means the real fun starts after the bubble pops and people figure out how to actually use these things for good.
The core issue is that making one piece of your production line vastly faster while doing nothing whatsoever about downstream bottlenecks simply overwhelms them.
My practical advice for fixing firms remains "Hurl Deming books at the faces of managers like they are shoes at W".
If your firm does not have a psychotic hatred of delays and inconsistency in the production process, it is going to get run over by the ones that do.
The software production process has these issues now that LLMs have made the writing part fast:
- Time from PR to completed review has no consistency whatsoever
- Time from "issue identified" to work being scheduled is radically inconsistent
- Time from merged code to shipped code has no consistency either
In an environment where the factory's output is "lol idk", management-by-hair-fire becomes the order of the day.
The whole point of things like Six Sigma is that you need to identify the outliers so they stop happening.
Then "elmo" can feel calm enough not to start tossing out the bodies until morale improves.
Such a tranquil environment is also one where it is enormously harder to get away with shenanigans.
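To make "identify the outliers" concrete: a minimal control-chart sketch, where the function names and the cycle-time framing are mine and the 3-sigma default is the classic SPC convention rather than a magic number. You compute limits from an in-control baseline, then flag new observations that escape them:

```python
from statistics import mean, stdev

def control_limits(baseline, sigmas=3.0):
    """Upper/lower control limits from a baseline of historical cycle times."""
    mu, sd = mean(baseline), stdev(baseline)
    return mu - sigmas * sd, mu + sigmas * sd

def outliers(baseline, new_points, sigmas=3.0):
    """Flag new observations your process should treat as special-cause
    variation, i.e. the things management must dig into and eliminate."""
    lo, hi = control_limits(baseline, sigmas)
    return [x for x in new_points if x < lo or x > hi]

# e.g. review turnaround in hours: the 48 gets flagged, the 5 does not
print(outliers([4, 5, 6, 5, 4, 5, 6, 4], [5, 48]))
```

Note that the limits come from the in-control baseline, not from the batch being tested; computing them over data that already contains the outlier lets a big spike inflate the standard deviation and mask itself.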
Digging ourselves out of the hole requires we prioritize.
Getting consistent reviews
- Exterminate bikeshedding ruthlessly with auto-enforced linting standards, and demand that de gustibus matters be referred to linter annealing rather than hold up review.
- Limit the scope of reviews by putting a hard cap on the size of PRs. You might be shocked by how small your peers' "context window" is in practice.
- Use CI and make it very, very fast. The above helps by letting you run only the relevant subset of your testsuite.
- Make sure the channel by which reviewers are notified of pending reviews is a zero-noise channel, and reward/punish timely review appropriately.
Building means into your tooling to notify developers (or agents) when they start to exceed the box they need to stay in is vital.
Otherwise you will waste precious time or tokens splitting out work unnecessarily.
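A sketch of what that "box" check might look like, assuming you can feed it per-file added/deleted line counts (the kind of thing `git diff --numstat` emits); the 400-line cap is illustrative, not a standard:

```python
# Illustrative cap, not a standard; tune to your reviewers' real context window.
PR_LINE_CAP = 400

def pr_size_report(diffstat, cap=PR_LINE_CAP):
    """diffstat: {path: (lines_added, lines_deleted)}.

    Returns (total_changed, over_cap) so a pre-push hook or agent harness
    can warn *before* the PR is opened, instead of wasting a review cycle
    or tokens splitting the work out after the fact."""
    total = sum(added + deleted for added, deleted in diffstat.values())
    return total, total > cap

print(pr_size_report({"a.py": (300, 50), "b.py": (80, 10)}))
```

The point is the timing of the signal, not the arithmetic: the check has to fire while the developer or agent is still shaping the change.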
Shortening the Dev OODA loop is what you have to laser-focus on.
Capping the size of PRs also encourages terse, minimally invasive changes instead of pasta slop.
The hardest practical part of this is making your CI only run relevant things, because that requires:
- A testsuite which has enough coverage of structural, integration and acceptance concerns to give management confidence.
- A robust means of mapping which tests are relevant to any given system component that changes.
The former step is harder than anyone ever estimates, but it is doable.
The latter is even harder for things like acceptance tests and tests of nonfunctional aspects of the system.
However, this mapping is exactly the kind of thing hill-climbing algorithms are built to discover.
I am unaware of anyone yet building a "test recommender" algorithm.
Perhaps I should do so.
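In case anyone beats me to it, here is a naive sketch of the scoring core, assuming you log (changed files, failed tests) pairs from past CI runs; all the names here are hypothetical:

```python
from collections import defaultdict

def build_recommender(history):
    """history: iterable of (changed_files, failed_tests) pairs from past CI runs.

    Counts how often each test failed alongside a change to each file, then
    recommends tests for a new changeset by summed co-failure counts. Crude,
    but it is a starting point a smarter search could hill-climb from."""
    cofail = defaultdict(lambda: defaultdict(int))
    for changed, failed in history:
        for path in changed:
            for test in failed:
                cofail[path][test] += 1

    def recommend(changed_files, top_n=5):
        scores = defaultdict(int)
        for path in changed_files:
            for test, n in cofail[path].items():
                scores[test] += n
        # highest score first; ties broken alphabetically for determinism
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [test for test, _ in ranked[:top_n]]

    return recommend
```

The obvious failure mode is brand-new code with no history, so a real version would fall back to running everything touching unmapped paths.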
Actually getting it out of the door
Supposing the actual production line is running smoothly, the next place work stacks up is the loading dock.
Thankfully, automating nearly the entire process here is well understood and has been done by many firms; it's merely a lot of work.
The primary bottleneck here is interdepartmental friction.
- Is the changelog automatically updated? No? Automate plz.
- Has marketing been kept abreast of said changelog and has copy ready that doesn't sound like AI slop?
- If there are new features, have the relevant advertising channels been identified, and are we ready to exploit them?
- When it comes time to ship out the product, how many steps are in the process? If it's more than one you haven't automated enough yet.
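The changelog bullet is the easiest of these to automate. A toy sketch, assuming your commit subjects carry Conventional-Commits-style prefixes (the section names are my choice, not a standard):

```python
def changelog_from_commits(subjects):
    """Group commit subjects into changelog sections by their
    Conventional-Commits-style prefix (feat:/fix:). Anything unprefixed
    lands in 'Other' for a human (or marketing) to triage."""
    sections = {"Features": [], "Fixes": [], "Other": []}
    for subject in subjects:
        if subject.startswith("feat:"):
            sections["Features"].append(subject[len("feat:"):].strip())
        elif subject.startswith("fix:"):
            sections["Fixes"].append(subject[len("fix:"):].strip())
        else:
            sections["Other"].append(subject)
    return "\n".join(
        f"## {name}\n" + "\n".join(f"- {item}" for item in items)
        for name, items in sections.items() if items
    )

print(changelog_from_commits(["feat: dark mode", "fix: null deref"]))
```

Wire this into CI on merge and the "is the changelog updated?" question answers itself; the copy marketing rewrites from it is a separate problem.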
Getting a grip on the backlog
Now that you are able to cram this stuff out of the airlock on a fully saturated belt, you have permission to think about improving the product.
- Matching "prospective change" to "person(s) most likely to do this right" is a hard problem, but recommender algos are probably good enough for this.
- LLMs may make breaking "big, scary" changes into "digestible" pieces something that no longer requires a five-hour planning meeting.
- All the earlier steps that drove inconsistency out of time-to-done can let actuarial analysis estimate completion better than the ICs themselves.
- Quantizing ordinal concerns such as prioritization remains nontrivial, but this is why you've hired a BA, right? They need automation too.
- If any of this process requires a meeting, it needs work. Meetings are friction, period. Their goal should be how to prevent ever having another one on the matter.
Actually getting people with the program
Now we come to why none of this wonderful sounding stuff will actually happen at most firms.
- If the workers feel disposable, so is the process.
- If there is not enough slack in the process to react in a timely manner to interrupt based flows like review, everything breaks.
- These reforms never happen in the first place because what is rewarded is dousing hairfires rather than process improvement.
All of these are actively undermined by the unstoppable impulse of managers to "milk the plant".
If the board does not have the proper level of paranoia this can't be helped.
They have to assume by default that their hired managers revert to goldbricking the second the board looks away, and that every report is falsified in difficult-to-understand ways.
This is why all well-functioning organizations end up employing "shoppers" and other deep cover agents to uncover the truth.
But what about if I'm a solo dev?
There is no escaping the need for consistency.
I need to hurl Sewell's "Customers for Life" at my own head with sufficient force to make this real.
The best practical advice I have heard about this is to never, ever multitask.
Start the day deciding what you are going to do and do it all.