Companies are pushing teams to adopt AI, so we are writing a lot more code. With agentic workflows and vibe coding, more of the SDLC is becoming automated end to end. But the quality feedback loop has not caught up.
I have been looking at a few repos recently, mostly from small teams moving quickly with AI. Nothing obviously broken. No big red flags. Linters pass, tests pass, pull requests get reviewed. On any given day the codebase looks fine.
But over a couple of weeks you start to see the patterns:
- The same logic rewritten slightly differently across multiple files.
- New code paths without tests, because the team will "come back to it".
- Flows that are getting harder to follow end to end.
- Files that have quietly doubled in size.
- Dependencies added and forgotten.
None of this is a bug. None of it is a failing CI check. Static analysis will flag some of it, but that is not really the issue.
The harder problem is keeping context
As more code gets generated automatically, it becomes harder to see what is actually changing over time. You can review a pull request in isolation and conclude that everything is fine. But a human reviewer is reading 300 lines; they are not reading the other 50,000 lines of context, or tracking where this PR, combined with the last month of merges, is moving the codebase in aggregate.
The signals that matter for quality are rarely in any single diff. They are in the trend. Are files getting bigger? Is the test-to-source ratio holding up? Are we adding dependencies faster than we remove them? Is duplication rising? These are all measurable. Most teams are not measuring them.
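These trend signals are cheap to compute. As a minimal sketch (the metric names, the "test" path heuristic, and the 10% growth threshold are my own illustrative assumptions, not any particular tool's method), here is one way to snapshot a repo and flag the metrics that are moving the wrong way:

```python
from pathlib import Path

def snapshot(root: str) -> dict:
    """Take a crude quality snapshot of the Python files under a repo root."""
    src_lines = test_lines = files = 0
    sizes = []
    for path in Path(root).rglob("*.py"):
        n = sum(1 for _ in path.open(encoding="utf-8", errors="ignore"))
        files += 1
        sizes.append(n)
        # Naive heuristic: anything in a tests/ directory or test_* file counts as tests.
        if "tests" in path.parts or path.name.startswith("test_"):
            test_lines += n
        else:
            src_lines += n
    return {
        "test_ratio": test_lines / src_lines if src_lines else 0.0,
        "mean_file_len": sum(sizes) / files if files else 0.0,
    }

def drift(before: dict, after: dict) -> list[str]:
    """Compare two snapshots and flag metrics that degraded."""
    warnings = []
    if after["test_ratio"] < before["test_ratio"]:
        warnings.append("test-to-source ratio is falling")
    if after["mean_file_len"] > 1.1 * before["mean_file_len"]:  # >10% growth
        warnings.append("average file size is growing fast")
    return warnings
```

Run `snapshot` weekly (or per push), store the results, and `drift` gives you the week-over-week answer no single diff can.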
Slow drift
Slow drift is when quality is getting worse, steadily, without any single event being dramatic enough to act on. Every week the codebase is slightly harder to change than last week. No alarm fires. The PR queue keeps moving. Velocity looks healthy.
Until one day a feature takes three times longer than it should. Then another feature. Then a security finding gets escalated because nobody understood the blast radius of a change made two months ago. At that point the cost of the drift is visible, and it is expensive.
Why this is becoming a real problem quickly
Two things have changed in the last two years.
First, the volume of code landing in production has stepped up. Copilot, Cursor, Claude and agentic tools are writing a meaningful share of commits. A team of five now generates the output of a team of twenty.
Second, the implicit assumption behind code review is that humans are reading enough of the code to catch drift. When code is written at five times the rate it can be carefully reviewed, that assumption just about holds. At twenty times, it does not.
So the quality net has holes in it that were not there before. Tools that flag individual issues are still useful, but they miss the one thing that matters most: the direction the codebase is moving.
What would actually help
An automated layer that reads the whole codebase, scores it across several quality dimensions, and tracks the change over time. Not a replacement for code review. Not a replacement for static analysis. A feedback loop that answers one question honestly: is this codebase getting better or worse this month?
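The "better or worse this month" question reduces to the direction of a time series. As a sketch (the dead band and the idea of a single composite score are illustrative assumptions, not Implera's actual scoring), a least-squares slope over recent snapshots is enough to classify the trend:

```python
def trend(scores: list[float]) -> str:
    """Classify a series of periodic quality scores (e.g. weekly, 0..1)
    by their least-squares slope, with a small dead band for 'flat'."""
    n = len(scores)
    if n < 2:
        return "flat"
    xs = range(n)
    mx = sum(xs) / n
    my = sum(scores) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, scores)) \
        / sum((x - mx) ** 2 for x in xs)
    if slope > 0.01:
        return "improving"
    if slope < -0.01:
        return "declining"
    return "flat"
```

The point of the dead band is honesty: week-to-week noise should read as "flat", so that "declining" only fires when the drift is sustained.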
If the answer is "worse", you can do something about it before the cost becomes a crisis. If the answer is "better", you keep doing what is working.
This is what I have been working on with David White. We built Implera for this. It runs on every push, scores seven domains, and shows the trend line clearly enough for a team to act on it.
I am curious how other people are approaching this. A shorter version of this post first went up on LinkedIn, and the conversation there has been useful. If you run a repo you want to see the drift on, you can connect it and run an analysis in about two minutes.