Overview
Refactoring is the disciplined improvement of a system's internal design without intentionally changing its externally observable behavior. Examples include extracting a cohesive component, simplifying conditionals, improving names, removing duplication, or replacing a dependency through an abstraction. Refactoring differs from adding a feature, fixing a behavioral defect, optimizing for a measured bottleneck, or rewriting a system.
A refactoring strategy defines how those changes are made safely. It combines evidence about current behavior, small reversible steps, automated tests, compatibility techniques, observability, and controlled rollout. In a mature delivery process, refactoring is not a separate cleanup phase that happens only when development stops. It is part of making normal changes without allowing the design to become progressively harder to modify.
Code review is another risk-control and knowledge-sharing mechanism. A useful review checks correctness, design, security, tests, operability, maintainability, and compatibility. It should improve the change and spread context without becoming a subjective style debate or a substitute for automated analysis.
Technical debt describes the future cost created when a design or implementation makes subsequent change harder, riskier, or slower. Some debt is a deliberate trade-off made to learn or deliver sooner; some is accidental. Not every imperfection is debt, and eliminating all debt is neither realistic nor economically sensible. Teams should identify debt by its impact, prioritize it alongside product work, and pay it down where the expected reduction in risk or delivery cost justifies the investment.
Interviewers use this topic to assess engineering judgment. Strong candidates can explain not only how to improve code, but also how to preserve behavior, coordinate change across teams, review proportionately, communicate risk, and connect maintenance investment to measurable business outcomes.
Core Concepts
Refactoring, Rewriting, and Behavioral Change
Refactoring preserves behavior while changing structure. Keeping that distinction explicit reduces review ambiguity:
A commit can contain more than one kind of change, but combining them makes failures and reviews harder to reason about. Prefer a sequence in which mechanical restructuring, behavior changes, and optimization are independently visible.
Refactor in Small, Verified Steps
The basic loop is:
- Establish a green baseline.
- Make one small structural change.
- Run the most relevant fast checks.
- Commit when the code is coherent.
- Repeat.
Small steps reduce the number of possible causes when a test fails. They are easier to review, revert, and deploy. Tool-assisted transformations such as rename, extract method, and change signature are useful, but generated edits still require compilation and tests.
Characterization Tests for Legacy Code
Legacy code often lacks trustworthy tests or clear specifications. Characterization tests record what the system currently does at an observable boundary. Their first purpose is to detect accidental change, even if the existing behavior is surprising.
A practical sequence is:
- Identify the behavior and boundary that must remain stable.
- Add tests around representative success, failure, and edge cases.
- Capture production examples where privacy and security rules permit.
- Introduce a seam around unstable or external dependencies.
- Refactor behind the protected boundary.
- Replace tests that encode obsolete behavior when requirements intentionally change.
Characterization tests do not prove that current behavior is correct. They provide a safety net while the team determines which behavior must be preserved.
Branch by Abstraction
Branch by abstraction supports a long-running replacement without maintaining a long-lived source-control branch. Callers depend on a stable abstraction while old and new implementations coexist:
public interface IPricingPolicy
{
Money Calculate(Order order);
}
public sealed class LegacyPricingPolicy : IPricingPolicy
{
public Money Calculate(Order order) => LegacyPriceEngine.Calculate(order);
}
public sealed class NewPricingPolicy : IPricingPolicy
{
public Money Calculate(Order order)
{
// New implementation developed and verified incrementally.
return PricingRules.Calculate(order);
}
}
The new implementation can be selected for tests, specific tenants, or a percentage of traffic. After parity and rollout, the legacy path and temporary abstraction can be removed. The abstraction should represent a meaningful capability rather than merely mirror the old implementation.
Parallel Change and Expand-Contract
Parallel change, also called expand-contract, evolves a public API, event, or database schema without breaking existing consumers:
- Expand the interface so old and new forms work.
- Migrate producers and consumers incrementally.
- Observe adoption and compatibility.
- Contract by removing the old form after all dependents migrate.
For a database change, this may mean adding a nullable column, deploying code that can read both formats, backfilling data, switching writes, and only later removing the old column. Dual writes require special care because partial failure can create divergence. Prefer one authoritative write path plus a reliable migration or change-capture process where possible.
Refactoring at System Boundaries
Internal refactoring can still cause external regressions through:
- Changed API response fields, status codes, or error semantics.
- Event schema changes.
- Database migrations and query-plan changes.
- Authentication or authorization behavior.
- Logging, metrics, alerts, or operational runbooks.
- Timing, throughput, or resource consumption.
Contract tests, migration rehearsals, performance checks, feature flags, canary releases, and rollback plans complement unit tests at these boundaries.
What a Code Review Should Evaluate
A review should be proportionate to the risk and examine:
- Correctness: Does the change satisfy the intended behavior, including failure paths?
- Design: Are responsibilities and dependencies appropriate for the problem?
- Complexity: Is the solution understandable and no more general than necessary?
- Tests: Do tests cover meaningful behavior and fail for the expected reason?
- Security and privacy: Are trust boundaries, authorization, secrets, and sensitive data handled safely?
- Compatibility: Will existing clients, stored data, messages, and deployments continue to work?
- Operability: Are logs, metrics, alerts, configuration, rollout, and recovery adequate?
- Maintainability: Can another engineer understand and safely change the result?
Style rules that a formatter, compiler, analyzer, or linter can enforce should normally be automated. Human review attention is more valuable for behavior, architecture, risk, and clarity.
Author Responsibilities
The author should make a change reviewable:
- Self-review the final diff.
- Explain the problem, approach, scope, risks, and verification.
- Link the relevant requirement, design decision, or incident.
- Keep generated and mechanical changes separate where practical.
- Call out migration, security, data, and operational implications.
- Provide focused tests and reproduction instructions.
- Avoid unrelated cleanup that hides the essential change.
A pull request is a communication artifact, not just a request for approval.
Reviewer Responsibilities
Review comments should be specific, respectful, and actionable. Distinguish:
- Blocking: Correctness, security, data loss, broken compatibility, or an unacceptable design risk.
- Required improvement: Important maintainability or test issue that should be addressed before merge.
- Suggestion: A useful alternative that is not necessary for approval.
- Question: A request for context rather than an implied defect.
Review the code, not the author. Explain the engineering consequence of a requested change. When several valid solutions exist, optimize for a sound, maintainable result rather than the reviewer's personal preference.
Review Size and Turnaround
Smaller changes usually receive faster and more accurate reviews. A large initiative can be split into:
- Preparatory tests.
- Mechanical refactoring.
- New implementation behind a stable interface.
- Consumer migration.
- Cleanup and deletion.
Each change should leave the system buildable and coherent. Stacked changes can preserve momentum, but reviewers need clear dependencies. Teams should set expectations for review responsiveness because long queues increase context switching, merge conflicts, and delivery time.
Automation and Human Judgment
Continuous integration should handle repeatable checks such as:
- Compilation and tests.
- Formatting and static analysis.
- Dependency and secret scanning.
- Architecture rules.
- API or schema compatibility.
- License and policy checks.
Automation should produce actionable results and avoid noisy gates. Human reviewers remain responsible for intent, domain correctness, usability, trade-offs, and risks that tools cannot infer.
Technical Debt Is an Economic Concept
The debt metaphor has two useful parts:
- Principal: The cost of improving or replacing the current solution.
- Interest: The recurring delay, defects, incidents, cognitive load, or operational cost caused by leaving it in place.
A shortcut that never affects future work may be an imperfection but has little debt interest. Conversely, a confusing module changed every week can accumulate high interest even if it has few obvious code smells.
Common Types of Technical Debt
Technical debt can appear in:
- Architecture and dependency boundaries.
- Code duplication and excessive complexity.
- Missing tests or unreliable test suites.
- Fragile build and deployment pipelines.
- Outdated or unsupported dependencies.
- Data models and migrations.
- Security controls and secrets management.
- Observability, runbooks, and incident recovery.
- Documentation and ownership gaps.
Security vulnerabilities or unsupported platforms may require immediate risk treatment rather than ordinary backlog prioritization.
Record Debt as an Actionable Item
An effective debt record includes:
- The affected capability or component.
- Evidence of the problem.
- Current impact and likely future interest.
- Risk if left unresolved.
- Proposed next step and rough scope.
- Owner or owning team.
- Trigger, target window, or review date.
A large undifferentiated "cleanup" backlog becomes a graveyard. Delete obsolete items, combine duplicates, and keep only work tied to a recognizable outcome or risk.
Prioritizing Technical Debt
Useful prioritization factors include:
- Business and security risk.
- Frequency of change in the affected area.
- Defect and incident history.
- Delivery delay caused by the design.
- Number of teams or customers affected.
- Cost of waiting versus cost of remediation.
- Upcoming product work that will amplify the debt.
- Reversibility and migration complexity.
Hotspot analysis combines complexity with change frequency. It often identifies better investments than applying broad cleanup rules across stable code.
Paying Down Debt Incrementally
Debt can be addressed through:
- Refactoring the area while delivering a related feature.
- Reserving explicit capacity for cross-cutting work.
- Incident follow-up and reliability work.
- Dependency upgrade programs.
- Focused modernization milestones.
- Automated guardrails that prevent recurrence.
The "leave it better than you found it" principle is useful within scope. It should not turn every feature into an uncontrolled redesign.
Measuring Improvement
Measure outcomes rather than lines changed or debt tickets closed. Depending on the problem, useful signals include:
- Lead time and review time.
- Change failure and rollback rate.
- Escaped defects and incidents.
- Time needed to modify a hotspot.
- Build duration and test flakiness.
- Vulnerability age.
- Onboarding and support burden.
- Operational toil.
Measurements need context. A temporary increase in reported debt can mean visibility improved rather than quality declined.
Common Mistakes
- Starting a rewrite without a migration and parity strategy.
- Refactoring behavior and structure simultaneously without clear tests.
- Treating code coverage as proof of safety.
- Blocking reviews on personal style preferences.
- Approving code solely because tests pass.
- Creating massive pull requests that cannot be reviewed effectively.
- Recording debt without impact, ownership, or a next action.
- Promising to "clean it up later" without a trigger or budget.
- Refactoring stable code only to match a preferred pattern.
- Removing old paths before adoption and rollback evidence exists.
Practical Best Practices
- Protect behavior before changing structure.
- Prefer small, reversible, independently deployable steps.
- Make compatibility and data migration explicit.
- Automate objective conventions and reserve review for judgment.
- State review severity and explain consequences.
- Prioritize debt by risk and recurring interest.
- Tie maintenance work to delivery, reliability, security, or cost outcomes.
- Delete temporary migration code after the transition.
- Review whether the intervention actually improved changeability.