How AI Coding Assistants Affect Your Codebase Health

Copilot, Cursor and Claude Code now write a meaningful share of commits on most small and medium codebases. On the teams I have looked at recently, somewhere between 40 and 80 percent of new code is "written with significant AI assistance".

That is a big change. What happens to the codebase depends less on the AI and more on how the team receives the output.

What AI is genuinely good at

  • Boilerplate. CRUD handlers, API client wrappers, type definitions, config files. Code that is tedious but predictable. AI handles this faster than most humans, and at least as well.
  • Matching existing patterns. Given a codebase's conventions, AI will generally follow them. Sometimes better than a new hire.
  • Finishing a thought. A developer writes a function signature and two test names. AI fills the middle. Productivity win.
  • Explaining unfamiliar code. Ask "what does this do?" on a legacy module. The answer is usually right.

What AI is consistently bad at

  • Architecture-level decisions. AI suggests patterns that fit local context but conflict with system-wide design. It cannot see the whole codebase at once.
  • Saying "you do not need this". AI will write the thing you asked for. It will rarely push back on an approach that is over-engineered or introduces unnecessary complexity.
  • Quality judgement on its own output. AI will generate a test that executes a function but asserts nothing meaningful. It writes "working" code that would not pass an experienced reviewer.
  • Edge cases it was not prompted about. AI optimises for plausible output. Edge cases that are not in the prompt get missed.
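To make the "asserts nothing meaningful" point concrete, here is a minimal sketch. The function `parse_price` and both tests are hypothetical, invented for illustration only.

```python
def parse_price(text: str) -> float:
    """Hypothetical function under test: turn '$1,234.50' into 1234.5."""
    return float(text.replace("$", "").replace(",", ""))


def test_parse_price_hollow() -> None:
    # Executes the function, so line coverage goes up, but asserts nothing.
    # This test still passes if parse_price returns the wrong number.
    parse_price("$1,234.50")


def test_parse_price_meaningful() -> None:
    # Pins the expected value, plus one input the prompt never mentioned.
    assert parse_price("$1,234.50") == 1234.5
    assert parse_price("0") == 0.0
```

Both tests are green and both count towards coverage. Only the second one would fail if the parsing logic broke, which is the difference a reviewer has to look for.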

The measurable effects on a codebase

This is the part that matters. What do you actually see in metrics when a team adopts AI heavily?

Volume of code rises sharply. New files, new functions, new dependencies. Commit frequency often doubles.

Test-to-source ratio falls. AI writes source faster than humans write tests. Unless you enforce test coverage in CI, the ratio drops within a month.
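One way to compute the ratio is lines of test code divided by lines of source code. The sketch below assumes that definition and a hypothetical naming convention for test files (a `tests` or `__tests__` directory, a `test_` prefix, or `.test.` / `.spec.` in the file name). Adjust both to your repository.

```python
from pathlib import PurePosixPath


def is_test_file(path: str) -> bool:
    """Assumed convention for spotting test files; not a universal rule."""
    p = PurePosixPath(path)
    return (
        "tests" in p.parts
        or "__tests__" in p.parts
        or p.name.startswith("test_")
        or ".test." in p.name
        or ".spec." in p.name
    )


def ratio_of_test_to_source(line_counts: dict[str, int]) -> float:
    """Return test lines divided by source lines for a {path: line_count} map."""
    test_lines = sum(n for path, n in line_counts.items() if is_test_file(path))
    source_lines = sum(n for path, n in line_counts.items() if not is_test_file(path))
    if source_lines == 0:
        return 0.0  # nothing to compare against
    return test_lines / source_lines


# Illustrative numbers only.
counts = {
    "src/utils/parseDate.ts": 100,
    "src/utils/parseDate.test.ts": 50,
    "src/api.py": 60,
    "tests/test_api.py": 30,
}
ratio = ratio_of_test_to_source(counts)
```

With these illustrative counts the ratio is 80 test lines over 160 source lines. The absolute value matters less than whether it is the same next month.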

Duplication creeps up. AI generates a utility in src/utils/parseDate.ts in one PR. Two weeks later it generates a slightly different version in src/lib/helpers/date.ts. Neither PR reviewer catches it because each looks reasonable in isolation.
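A rough way to surface this kind of drift is to compare snippets pairwise and flag the ones that are almost the same text. The sketch below uses `difflib.SequenceMatcher` from the Python standard library. The 0.8 threshold and the snippet contents are assumptions for illustration, and pairwise comparison only scales to a modest number of snippets.

```python
import difflib
from itertools import combinations


def normalise(source: str) -> str:
    """Collapse whitespace so formatting differences do not hide duplicates."""
    return " ".join(source.split())


def find_near_duplicates(
    sources: dict[str, str], threshold: float = 0.8
) -> list[tuple[str, str, float]]:
    """Return (path_a, path_b, similarity) for every pair at or above threshold."""
    pairs = []
    for a, b in combinations(sorted(sources), 2):
        score = difflib.SequenceMatcher(
            None, normalise(sources[a]), normalise(sources[b])
        ).ratio()
        if score >= threshold:
            pairs.append((a, b, score))
    return pairs


# Hypothetical file contents, matching the two paths mentioned above.
snippets = {
    "src/utils/parseDate.ts": (
        "export function parseDate(s: string): Date {\n"
        "  return new Date(s.trim());\n}"
    ),
    "src/lib/helpers/date.ts": (
        "export function parseDate(value: string): Date {\n"
        "  return new Date(value.trim());\n}"
    ),
    "src/utils/sum.ts": (
        "export function sum(xs: number[]): number {\n"
        "  return xs.reduce((a, b) => a + b, 0);\n}"
    ),
}
duplicates = find_near_duplicates(snippets)
```

Text similarity is a blunt instrument: it misses duplicates that were rewritten structurally. It is still enough to catch the "same helper, different folder" case that slips through per-PR review.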

Dependencies accumulate. AI suggests a package for a problem that could be solved in ten lines of existing code. The package lands. A month later three more land, each solving a similar problem with a different library.
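Dependency growth is one of the easier trends to measure. As a minimal sketch, assuming a Node-style `package.json` manifest, the count can be read straight from the `dependencies` and `devDependencies` objects. The manifest below is an invented example with placeholder versions.

```python
import json


def count_dependencies(package_json_text: str) -> dict[str, int]:
    """Count declared packages in a package.json document."""
    manifest = json.loads(package_json_text)
    return {
        "dependencies": len(manifest.get("dependencies", {})),
        "devDependencies": len(manifest.get("devDependencies", {})),
    }


# Invented manifest: three date libraries solving a similar problem.
example_manifest = """
{
  "name": "example-app",
  "dependencies": {
    "date-fns": "x.y.z",
    "dayjs": "x.y.z",
    "moment": "x.y.z"
  },
  "devDependencies": {
    "typescript": "x.y.z"
  }
}
"""
dependency_counts = count_dependencies(example_manifest)
```

Recording these two numbers on a schedule is enough to see whether packages are landing faster than they are being removed.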

Complexity rises in hotspots. Long functions with nested conditionals accumulate because AI extends them naturally. Refactoring rarely happens because the PR that added the nesting looked fine at the time.
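Nesting depth is a cheap proxy for this kind of complexity. The sketch below measures it for Python source using the standard `ast` module. It is an illustration, not a full complexity metric, and it assumes the code being measured is Python; for other languages the same idea applies with that language's parser.

```python
import ast

# Statements that open a new level of nesting in this sketch.
NESTING_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.With)


def _depth(node: ast.AST, current: int = 0) -> int:
    """Return the deepest nesting level found beneath node."""
    deepest = current
    for child in ast.iter_child_nodes(node):
        child_level = current + 1 if isinstance(child, NESTING_NODES) else current
        deepest = max(deepest, _depth(child, child_level))
    return deepest


def nesting_by_function(source: str) -> dict[str, int]:
    """Map each function name in source to its maximum nesting depth.

    Note: an `elif` is parsed as an `if` inside `else`, so it counts
    as one level deeper here.
    """
    tree = ast.parse(source)
    return {
        node.name: _depth(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }


sample = '''
def flat(x):
    return x + 1

def nested(items):
    for item in items:
        if item:
            while item > 0:
                item -= 1
'''
depths = nesting_by_function(sample)
```

Tracked per function over time, a number like this shows which hotspots keep getting extended rather than refactored.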

None of these is a bug. Every one shows up as a trend.

What the traditional quality net misses

Code review was designed around human-speed code production. At human speed, a reviewer reads a meaningful fraction of changes. When AI triples or quadruples the volume, the reviewer is reading a smaller fraction of a larger pie. Reviewers spot individual defects but miss aggregate drift.

Static analysis catches specific patterns (unparameterised SQL, missing alt text, dangerous API calls). It does not catch "duplication is rising across the codebase" or "dependencies are growing faster than they are being maintained".

Tests catch regressions you anticipated. They do not catch regressions you did not.

The gap is the trend layer. What is moving over time, and in which direction?

What works

Four practices from teams that have kept quality stable while scaling AI use.

  1. Enforce test-to-source ratio in CI. The number itself is less important than the constraint. When a team commits to "PRs must include tests for new code paths", the CI gate pushes back on PRs that skip them, and the ratio holds.
  2. Run per-PR quality gates, specifically for security, architecture and duplication. Catch regressions as they land, not six months later.
  3. Track codebase metrics over time. File size distribution, dependency count, duplication ratio, complexity. A weekly snapshot is enough. What you care about is the direction.
  4. Review AI-generated PRs with the same rigour as any other. The reviewer still has to read the code. The fact that an AI wrote it is not a reason to read less carefully. Teams that get this right treat AI as a faster typist, not a second reviewer.
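For practice 3, "direction" can be reduced to a single number per metric: the slope of a straight line fitted through the weekly snapshots. The sketch below uses an ordinary least-squares fit. The metric names and values are invented for illustration.

```python
def slope_per_week(values: list[float]) -> float:
    """Least-squares slope of values against week index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        return 0.0  # a single snapshot has no direction
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def trend_slopes(history: dict[str, list[float]]) -> dict[str, float]:
    """Compute the weekly slope for each metric in a {name: snapshots} map."""
    return {name: slope_per_week(values) for name, values in history.items()}


# Invented weekly snapshots, oldest first.
history = {
    "dependency_count": [40, 42, 45, 49],
    "duplication_pct": [5.0, 5.0, 5.0, 5.0],
    "test_to_source_ratio": [0.60, 0.55, 0.50, 0.45],
}
slopes = trend_slopes(history)
```

A positive slope on dependency count or duplication, or a negative one on test-to-source ratio, is the signal to act on, well before any single snapshot looks alarming.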

The leverage

AI tools are net positive for most teams. But the productivity gain only compounds if the quality does not degrade underneath it. Teams that accept the speed benefit while letting the trend drift find themselves, six months in, with a codebase that is busier and slightly worse, and no clear reason why.

Measure what is moving. Act on the trend. The AI writes the code. Something needs to watch the quality.

© 2026 Implera