Maintainability is a feeling. You open a file and know within ten seconds whether you want to work on it today. You bring in a new engineer, and after a week they either feel productive or lost.
Feelings do not scale. You cannot tell a team "this codebase feels bad" and expect action. But the feeling tracks real signals, and the signals are measurable.
What "maintainable" actually means
A maintainable codebase is one where the cost of making a change stays stable over time. Add a feature this month, it costs X. Add a comparable feature in six months, it still costs roughly X. If the cost keeps rising, maintainability is falling.
This is the definition that matters. Everything else is a proxy for it.
The signals that track maintainability
None of these is the whole story. Together they are a usable proxy.
File size distribution. A healthy codebase has most files under 300 lines, a long tail up to 600, and very few over 1,000. When you see files at 2,000+ lines, they are usually doing many jobs and will resist change.
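A rough sketch of how that distribution could be computed, assuming you already have a mapping of file path to line count. The bucket labels, the inclusive boundaries, and the function names are illustrative choices, not a standard.

```python
from collections import Counter

# Bucket edges mirror the thresholds in the text; labels and inclusive
# upper bounds are assumptions made for this sketch.
BUCKETS = [(300, "<=300"), (600, "301-600"), (1000, "601-1000")]


def bucket_for(line_count: int) -> str:
    """Return the label of the size bucket a file falls into."""
    for upper, label in BUCKETS:
        if line_count <= upper:
            return label
    return ">1000"


def size_distribution(line_counts: dict[str, int]) -> dict[str, float]:
    """Map each bucket label to the fraction of files that land in it."""
    if not line_counts:
        return {}
    tally = Counter(bucket_for(n) for n in line_counts.values())
    total = len(line_counts)
    return {label: count / total for label, count in tally.items()}
```

Tracking the share of files in the top bucket over time is usually more telling than any single snapshot.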
Cyclomatic complexity. The number of independent paths through a function. Functions scoring above 15 are hard to test, hard to reason about, and hard to change. Most functions in real codebases score under 10.
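To make the metric concrete, here is a simplified approximation for Python source using the standard ast module: start at 1 and add one for each branch point. It is a sketch, not a replacement for a linter rule. It ignores comprehensions and match statements, and it counts nested functions toward their parent.

```python
import ast

# Node types that each add one decision point. A simplification:
# real tools also handle comprehensions, match statements, and more.
_BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While,
                 ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(func: ast.AST) -> int:
    """Approximate complexity: 1 plus the number of decision points."""
    score = 1
    for node in ast.walk(func):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths.
            score += len(node.values) - 1
    return score


def complexity_by_function(source: str) -> dict[str, int]:
    """Return the approximate complexity of every function in `source`."""
    tree = ast.parse(source)
    return {
        node.name: cyclomatic_complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
```

A CI check would compare each value against the team's threshold and fail on any function above it.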
Nesting depth. Code three levels deep is normal. Five levels is hard to follow. Eight levels is a sign something has gone wrong.
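Nesting depth can be approximated the same way. The sketch below walks a Python syntax tree and counts how many block statements enclose the deepest line. Which node types count as "a level" is an assumption, and an elif chain is counted as nested because that is how the parser represents it.

```python
import ast

# Statements treated as opening a new nesting level (an assumption).
_NESTING_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While,
                  ast.With, ast.AsyncWith, ast.Try)


def max_nesting_depth(node: ast.AST, depth: int = 0) -> int:
    """Return the deepest level of nested block statements under `node`."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        # Entering a block statement increases the depth for its subtree.
        child_depth = depth + 1 if isinstance(child, _NESTING_NODES) else depth
        deepest = max(deepest, max_nesting_depth(child, child_depth))
    return deepest
```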
Naming consistency. Not measurable directly, but indirectly: are there naming patterns, or does every module follow its own convention? When naming is inconsistent across a codebase, readers spend cycles translating rather than understanding.
Change coupling. Files that are frequently edited together are coupled. Git log reveals this. High coupling without shared purpose means changes ripple unnecessarily.
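A minimal sketch of pulling co-change counts out of git history. It assumes the log was produced with something like `git log --name-only --pretty=format:@@%H`, where the `@@` marker is an arbitrary choice that makes commit boundaries easy to spot. The function names and the `max_files` cutoff are illustrative.

```python
from collections import Counter
from itertools import combinations


def parse_commits(log_text: str, marker: str = "@@") -> list[set[str]]:
    """Split marker-delimited `git log --name-only` output into file sets."""
    commits: list[set[str]] = []
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith(marker):
            commits.append(set())      # a new commit begins
        elif line and commits:
            commits[-1].add(line)      # a file touched by the current commit
    return commits


def co_change_counts(commits: list[set[str]], max_files: int = 20) -> Counter:
    """Count how often each pair of files appears in the same commit."""
    pairs: Counter = Counter()
    for files in commits:
        # Skip sweeping commits (mass renames, formatting) that would
        # couple everything to everything.
        if len(files) > max_files:
            continue
        pairs.update(combinations(sorted(files), 2))
    return pairs
```

Calling `co_change_counts(commits).most_common(10)` gives a short list worth reading: pairs near the top that have no obvious shared purpose are the ones to question.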
Dead code. Exports that nobody imports. Functions nobody calls. Every dead-code entry is cognitive overhead for a future reader who wonders if it matters.
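The idea behind dead-code detection fits in a few lines: collect what is defined, collect what is referenced, and report the difference. The sketch below does this for top-level Python functions by name only. It is deliberately naive (entry points, dynamic lookups, and same-named functions in different modules will confuse it), which is why dedicated tools exist.

```python
import ast


def unreferenced_functions(sources: dict[str, str]) -> set[str]:
    """Return top-level function names that no module refers to by name."""
    defined: set[str] = set()
    referenced: set[str] = set()
    for source in sources.values():
        tree = ast.parse(source)
        defined.update(
            node.name for node in tree.body
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        )
        for node in ast.walk(tree):
            if isinstance(node, ast.Name):
                referenced.add(node.id)        # plain use: helper()
            elif isinstance(node, ast.Attribute):
                referenced.add(node.attr)      # qualified use: util.helper()
            elif isinstance(node, ast.ImportFrom):
                # An import counts as a reference in this sketch.
                referenced.update(alias.name for alias in node.names)
    return defined - referenced
```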
Documentation density. Not a strict count, but the share of the public interface that has at least a one-line description. Below 30%, new engineers will struggle to navigate.
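For Python source, that ratio could be approximated as below. The sketch treats any function or class whose name does not start with an underscore as public and any non-empty docstring as a description. Both are assumptions, and other languages would need their own rules.

```python
import ast


def documentation_density(source: str) -> float:
    """Fraction of public functions and classes that carry a docstring."""
    tree = ast.parse(source)
    public = [
        node for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        and not node.name.startswith("_")
    ]
    if not public:
        return 1.0  # nothing public, so nothing is undocumented
    documented = sum(1 for node in public if ast.get_docstring(node))
    return documented / len(public)
```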
The signals that do NOT track maintainability
Lines of code. A 10,000-line codebase can be more maintainable than a 2,000-line one if the structure is clearer. Size is a weak proxy at best.
Test count. You can have 1,000 tests and still be unable to change anything safely. Coverage numbers alone lie; assertion density matters more.
Language choice. Any language can produce maintainable or unmaintainable code. The common belief that X is "easier to maintain" is usually dressed-up familiarity.
Framework choice. Same.
Commit frequency. Many commits mean activity, not maintainability.
How to measure for real
You do not need a dedicated maintainability dashboard. You need a few tools running on every PR:
- Cyclomatic complexity scan. Most linters support it as a rule. Fail CI when a function exceeds your threshold.
- File size guard. Soft warning at 300 lines, hard fail at 500. Exceptions go through a "yes, we know" mechanism, not a silent bypass.
- Circular dependency check. madge, dependency-cruiser, or equivalent. Fail on new cycles.
- Dead export detection. Tools like knip, ts-prune, or per-language equivalents. Fail when new dead code is introduced.
- Change coupling report generated weekly from git history. Not a CI gate, but a signal the team reviews.
Taken together, these checks give you a maintainability score. Not a magic number, but a trend you can track.
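As an illustration of one gate from the list above, here is a minimal sketch of the file size guard with an explicit acknowledgement list. The `acknowledged` set stands in for the "yes, we know" mechanism. How it is stored (a checked-in file, a config entry) is left open, and all names here are hypothetical.

```python
from dataclasses import dataclass

SOFT_LIMIT = 300  # warn above this many lines
HARD_LIMIT = 500  # fail above this many lines unless acknowledged


@dataclass
class GuardResult:
    warnings: list[str]
    failures: list[str]

    @property
    def exit_code(self) -> int:
        """Non-zero when CI should fail."""
        return 1 if self.failures else 0


def check_file_sizes(line_counts: dict[str, int],
                     acknowledged: set[str]) -> GuardResult:
    """Apply the soft and hard limits to a path -> line count mapping."""
    result = GuardResult(warnings=[], failures=[])
    for path, lines in sorted(line_counts.items()):
        if lines > HARD_LIMIT and path not in acknowledged:
            result.failures.append(f"{path}: {lines} lines (limit {HARD_LIMIT})")
        elif lines > SOFT_LIMIT:
            # Acknowledged oversized files still warn, so they stay visible.
            result.warnings.append(f"{path}: {lines} lines (soft limit {SOFT_LIMIT})")
    return result
```

Keeping acknowledged files in the warning output is a deliberate choice: the exception is recorded, but it never becomes invisible.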
What the trend tells you
Absolute values are hard to interpret in isolation. "Our complexity score is 7" means nothing without context.
Trends are easier. "Complexity score has risen from 5.2 to 7.8 over the last quarter while new features are landing" is clear: maintainability is deteriorating faster than the work justifies. Time to pause and refactor.
"Score has stayed at 6.5 for six months" is healthy. The team is absorbing change without losing structure.
The job is to watch the trend and intervene early. By the time maintainability feels bad, the cost to fix is high. The signals move earlier than the feeling.
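Watching the trend can be as simple as fitting a line through the recorded scores and looking at its slope. The sketch below uses an ordinary least-squares slope over equally spaced measurements. The `tolerance` value and the three labels are assumptions to tune per team.

```python
def trend_slope(scores: list[float]) -> float:
    """Least-squares slope of scores against their index (change per period)."""
    n = len(scores)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(scores) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(scores))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def classify_trend(scores: list[float], tolerance: float = 0.05) -> str:
    """Label a series as rising, falling, or stable."""
    slope = trend_slope(scores)
    if slope > tolerance:
        return "rising"
    if slope < -tolerance:
        return "falling"
    return "stable"
```

With monthly complexity readings such as `[5.2, 6.0, 7.0, 7.8]` (the middle values are made up for illustration), the series is classified as rising, while six readings of 6.5 come out as stable.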
The loop
Measure continuously. Review weekly. Intervene when a trend turns the wrong way. Not dramatic work. Just sustained attention.
Teams that do this have codebases that stay workable for years. Teams that do not end up with codebases that accumulate complexity until someone proposes a rewrite. The cost of the rewrite almost always exceeds the cost of the attention that would have prevented it.