Performance reviews are one of the most consequential things engineering managers do. This guide explains why reviews feel so difficult, what actually helps, and how to build a process that serves both you and your team.
Why Performance Reviews Feel Harder Than They Should
Performance reviews shape compensation, promotions, team dynamics, and individual careers. Yet most managers approach review season with dread rather than confidence. Understanding why helps you address the real problems.
Fragmented Context
An engineer's work is scattered across dozens of systems. Pull requests live in GitHub. Project updates appear in Jira or Linear. Conversations happen in Slack threads, 1:1 notes, and standup summaries. Design decisions get made in documents you may never see. No single place holds a complete picture of what someone actually did. When review season arrives, you're assembling a puzzle from pieces stored in different rooms.
Memory Reconstruction
Human memory is unreliable across six-month or twelve-month review periods. You remember the incident that woke you up at 2 AM. You remember the last sprint. You remember the person who speaks up in meetings. You don't remember the quiet refactor in March that prevented three incidents. You don't remember the mentoring conversations that helped a struggling teammate turn things around. The work you didn't have to think about tends to disappear.
Visibility Bias
Some work is inherently more visible than other work. Launching a feature gets announced in Slack. Improving test coverage does not. Engineers who present in all-hands get remembered differently than engineers who unblock others in DMs. This isn't about who works harder. It's about which work leaves traces you can see.
Emotional Load on Managers
Writing reviews is emotionally taxing. You're making judgments about people you work with every day. You know your words affect their livelihood and sense of self-worth. The stakes are high, and the information you're working from feels incomplete. Most managers genuinely want to be fair. The anxiety comes from knowing how easy it is to get it wrong.
What Managers Typically Try (and Why It Doesn't Work)
Spreadsheets
Many managers maintain spreadsheets throughout the year, intending to log accomplishments as they happen. This works for about three weeks. Then a busy quarter hits, and the spreadsheet goes untouched until December. Spreadsheets require manual entry, which means they capture what you remembered to write down, not what actually happened.
Last-Minute PR Scraping
In the week before reviews are due, managers often pull up GitHub and scroll through commit histories. This gives you a list of code changes but strips away context. You see that someone merged 47 PRs. You don't see that 40 of them were fixing issues caused by unclear requirements, or that they spent significant time reviewing others' work. Raw activity metrics without context can mislead more than they inform.
Relying on Intuition
Some managers trust their gut sense of who's performing well. Intuition isn't useless—experienced managers often have good instincts. But intuition is vulnerable to recency, likability, and communication style. The engineer who explains their work well in 1:1s creates a different impression than the one who does equivalent work but struggles to articulate it. Both deserve accurate reviews.
Over-Weighting Recent Events
The last month before reviews carries disproportionate weight in most assessments. This is called recency bias, and it's nearly universal. An engineer who had a strong first half but struggled recently gets reviewed as struggling. An engineer who coasted for months but shipped something big in November gets reviewed as high-performing. Neither review reflects the full period.
What Good Review Inputs Actually Look Like
Good review inputs share a few characteristics. They're longitudinal—spanning the entire review period rather than clustering around the moments you happen to remember. They capture behavior over time, not isolated incidents. And they surface the kinds of contributions that often go unnoticed in day-to-day operations.
Patterns Over Snapshots
A single missed deadline tells you less than a pattern of missed deadlines with similar root causes. A single thoughtful code review doesn't make someone a mentor. Consistently unblocking teammates across six months does. The artifacts that matter for reviews are the ones that reveal trajectories: how someone responds to feedback over time, how they handle repeated challenges, whether their judgment is improving or static.
Collaboration Impact
Engineering is collaborative, but most review inputs emphasize individual output. Useful inputs include signal about how someone affects the people around them. Whose PRs do they review? Who comes to them with questions? Do they raise concerns early or wait until problems escalate? These patterns often live in Slack threads, standup notes, and peer feedback—not in commit histories.
Growth Trajectory vs. Raw Output
Someone stretching into a new area may produce less raw output than someone operating in their comfort zone. That's expected. Review inputs should help you distinguish between the engineer who shipped less because they were building new capabilities and the engineer who shipped less because they were coasting. This requires context about what someone was working on and why, not just how much they shipped.
Separating Visibility from Value
Some valuable work is inherently visible: launching features, presenting in all-hands, resolving high-profile incidents. Other valuable work is not: improving test coverage, mentoring in DMs, writing documentation that prevents future confusion. Good review inputs capture both. They don't reward engineers who are skilled at self-promotion over engineers who are skilled at quiet, essential work.
Concrete Artifacts
The artifacts that support fair reviews include:
- Pull requests and code reviews: Not just counts, but quality, scope, and whether someone's reviews help others improve
- 1:1 notes: Themes from conversations about challenges, accomplishments, and development areas
- Standup and check-in summaries: What did someone own? What blockers did they surface or resolve?
- Peer feedback: Structured input from collaborators who see work the manager doesn't
- Goal progress: Not just whether goals were met, but how someone responded when circumstances changed
None of these artifacts alone tells the full story. Together, they give you enough signal to write a review grounded in evidence rather than reconstruction.
How to Prepare for Review Season Before It Starts
What to Capture Continuously
The most useful information to capture includes:
- Key accomplishments tied to specific work: Not just "shipped feature X" but the context around difficulty, collaboration, and impact
- Feedback received and given: Patterns in how someone responds to feedback and supports others
- Standup and check-in themes: What challenges came up repeatedly? What did someone take ownership of?
- Growth conversations: What did you discuss in 1:1s about development areas? What did they commit to working on?
What Not to Track
Avoid tracking raw activity metrics without context. Lines of code, number of commits, and hours logged tell you very little about performance.
Also avoid tracking subjective impressions recorded in the moment without supporting evidence. "Seemed disengaged in meeting" is less useful than "Didn't participate in three consecutive planning sessions despite owning affected work."
How to Keep Notes Lightweight
The goal is not comprehensive documentation. It's having enough signal that you're not reconstructing from scratch.
Spending five minutes after each 1:1 noting key themes is more sustainable than keeping detailed transcripts. A quick tag on significant PRs or projects is more sustainable than spreadsheet logging. The best system is the one you'll actually use.
How Bias Shows Up Even with Good Intent
Quiet vs Visible Contributors
Some engineers do their best work in ways that don't generate noise. They improve reliability. They mentor junior teammates in DMs. They write documentation that prevents confusion. They do the unglamorous work that keeps systems running. Other engineers work on high-profile projects, present frequently, and communicate proactively about their impact. Both styles can represent high performance. Reviews often reward the second style disproportionately.
Availability Bias
You assess based on information that's available to you. If you can easily recall examples of someone's work, you rate them more confidently. If their contributions are harder to surface, you may underrate them simply because the evidence isn't at hand. This affects remote employees, people in different time zones, and anyone whose work doesn't cross your desk regularly.
Confirmation Bias
Once you form an impression of someone, you tend to notice evidence that confirms it and discount evidence that contradicts it. If you see someone as a strong performer, you interpret ambiguous situations favorably. If you see them as struggling, the same situations look like more evidence of struggle. This makes early impressions sticky in ways that aren't always fair.
How Some Teams Reduce Reconstruction and Bias
Several approaches help teams write more accurate reviews:
Structured 1:1 Notes
Some managers use consistent templates for 1:1s that prompt them to capture accomplishments, challenges, and growth areas. This creates a lightweight record without requiring separate logging.
Peer Feedback Systems
Gathering input from collaborators provides perspectives the manager may not have. Engineers often have visibility into each other's contributions that managers lack.
Brag Documents
Some teams ask engineers to maintain running documents of their own accomplishments. This shifts some of the memory burden to the person with the most context about their own work.
Continuous Context Capture
Some teams use tools like Vereda to continuously capture context from engineering work, standups, and check-ins so review season isn't a reconstruction exercise. This approach reduces the gap between what happened and what the manager can access.
The common thread is reducing reliance on memory and making the full review period equally visible.
What “Good” Looks Like
Reviews as Coaching Moments
The best reviews don't feel like verdicts. They feel like structured conversations about where someone is, where they're headed, and what support they need. The manager comes prepared with specific observations. The engineer feels seen and understood. This requires the manager to have confidence in their assessment—confidence that comes from having real data, not reconstructed impressions.
Confidence in Assessments
When you sit down to write a review, you should be able to answer: What did this person actually accomplish? How did they grow? Where did they struggle? What do they need next? If you can answer these questions with specifics from the entire review period, you're in good shape. If you're guessing, something in your process needs to change.
Less Stress for Managers and Engineers
Review season doesn't have to be a scramble. With continuous context and lightweight systems, writing reviews becomes an exercise in synthesis rather than archaeology. Engineers benefit too. When they know their full body of work will be considered—not just the memorable parts—they can focus on doing good work rather than performing visibility.
Fair reviews are possible. They just require better inputs than most managers currently have.
Frequently Asked Questions
How far in advance should I start preparing for performance reviews?
The honest answer is that preparation should be continuous, not concentrated. Managers who wait until review season to gather information are forced into reconstruction mode, piecing together six or twelve months from memory and scattered artifacts.
The practical minimum is maintaining lightweight notes from your 1:1s throughout the review period. After each 1:1, spend two to three minutes capturing the key themes: what the engineer accomplished since your last conversation, what challenges they're facing, and any feedback you gave or received. This takes less time than you'd spend trying to recall these details months later.
Beyond 1:1 notes, pay attention to natural checkpoints. Project completions, quarterly planning, and team retrospectives are good moments to note contributions and patterns. You don't need a formal system—a running document or a few tags in your notes app is enough.
The goal isn't comprehensive documentation. It's having enough signal that when you sit down to write, you're working from evidence rather than impressions. Even imperfect notes from throughout the year are more valuable than perfect notes from the last month.
How do I evaluate engineers whose work I don't directly see?
This is one of the hardest challenges for engineering managers, especially as teams grow or become more distributed. You're responsible for assessing someone's performance, but much of their work happens in systems you don't monitor or conversations you're not part of.
Start by identifying where the evidence lives. For technical work, this usually means pull requests, code reviews, design documents, and project trackers. For collaboration and mentorship, it means peer interactions, Slack threads, and feedback from people they work closely with.
Peer feedback is essential here. Structure it with specific questions: What has this person contributed to the team? How have they helped you or others? Where have you seen them grow or struggle? Generic questions like "How is this person doing?" produce generic answers.
Check-ins and standups also provide signal. If your team has regular syncs, the themes that emerge over time reveal patterns. Someone who consistently raises blockers and resolves them is demonstrating ownership. Someone who frequently surfaces risks is demonstrating technical judgment.
Finally, ask the engineer directly. Regular 1:1s should include discussion of what they're working on, what they're proud of, and what's been hard. Their self-assessment is one input among many, but it gives you visibility into work you wouldn't otherwise see.
How do I write a fair review for someone I have a difficult relationship with?
Every manager eventually faces this situation. You need to evaluate someone you find frustrating, someone you've had conflict with, or someone whose communication style clashes with yours. The review needs to be fair despite your personal reaction to them.
The first step is acknowledging the difficulty to yourself. You don't need to pretend you have no feelings about this person. You need to ensure those feelings don't distort your assessment of their work.
Anchor your review in specific, observable evidence. Instead of "They're difficult to work with," identify concrete patterns: "Missed three committed deadlines without proactive communication" or "Responded defensively to code review feedback on multiple occasions." This discipline forces you to separate behavior from personality.
Seek outside perspectives. Talk to their collaborators, review their peer feedback, and look at artifacts of their work. If your negative impression isn't reflected in how others experience them, that's important information. If others share similar observations, you have corroboration.
Consider whether any of your frustration stems from style differences rather than performance issues. Some engineers communicate more directly than you prefer. Some ask more questions than you think necessary. These differences aren't performance problems unless they're actually impacting outcomes.
Finally, have someone you trust review the draft. Ask them to flag any language that sounds more emotional than evidence-based. Fresh eyes often catch what you've become blind to.
What's the difference between feedback and evaluation in a performance review?
This distinction matters because conflating the two undermines both.
Evaluation is backward-looking assessment. It answers the question: How did this person perform against expectations during this review period? Evaluation informs compensation, promotion decisions, and formal ratings. It's a judgment about what happened.
Feedback is forward-looking guidance. It answers the question: What should this person do differently to grow or improve? Feedback is developmental. It's about helping someone get better.
A good performance review includes both, but they serve different purposes and should be clearly separated.
Evaluation should be grounded in evidence from the review period. It should reference specific accomplishments, specific challenges, and patterns of behavior. The person receiving the review should be able to connect your assessment to work they remember doing.
Feedback should be actionable and specific. "Communicate more proactively" is less useful than "When timelines are at risk, flag it in standup rather than waiting until you're blocked." Good feedback gives the engineer something concrete to try.
The mistake many managers make is burying evaluation in feedback or vice versa. Saying "You need to improve your technical design skills" when you mean "Your technical design skills did not meet expectations for your level this year" obscures the actual assessment. Saying "You didn't meet expectations" without explaining what they should do differently leaves them without a path forward. Separate the two in your review structure.
How do I handle recency bias when I can remember the last month clearly but not the first six months?
Recency bias is nearly universal, and the first step is accepting that you're subject to it. Even experienced managers over-weight recent events unless they take deliberate steps to counteract it.
The structural solution is having records from early in the review period that are as easy to surface as records from recent months. If you maintained notes from 1:1s throughout the year, review them before writing. If you didn't, pull what artifacts you can from project trackers, PR history, and team channels. Force yourself to look at Q1 and Q2 with the same attention you'd naturally give to Q3 and Q4.
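If you want the early quarters to be as easy to scan as the recent ones, a small script can do the bucketing for you. The sketch below queries GitHub's search API for merged PRs quarter by quarter; the username, org, review-period dates, and the GITHUB_TOKEN environment variable are placeholder assumptions, and it doesn't paginate past 100 results per quarter, so treat it as a starting point rather than a finished tool.

```python
import os
from datetime import date

import requests

# Minimal sketch: bucket an engineer's merged PRs by quarter so early months
# get the same attention as recent ones. Assumes a personal access token in
# GITHUB_TOKEN; the author, org, and dates below are hypothetical placeholders.
GITHUB_SEARCH = "https://api.github.com/search/issues"
TOKEN = os.environ["GITHUB_TOKEN"]
AUTHOR = "example-engineer"   # hypothetical GitHub username
ORG = "example-org"           # hypothetical GitHub organization

QUARTERS = [
    ("Q1", date(2024, 1, 1), date(2024, 3, 31)),
    ("Q2", date(2024, 4, 1), date(2024, 6, 30)),
    ("Q3", date(2024, 7, 1), date(2024, 9, 30)),
    ("Q4", date(2024, 10, 1), date(2024, 12, 31)),
]

for label, start, end in QUARTERS:
    # GitHub search qualifiers: is:pr, author:, org:, merged:<start>..<end>
    query = f"is:pr author:{AUTHOR} org:{ORG} merged:{start}..{end}"
    resp = requests.get(
        GITHUB_SEARCH,
        params={"q": query, "per_page": 100},
        headers={"Authorization": f"Bearer {TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()
    items = resp.json()["items"]

    print(f"{label}: {len(items)} merged PRs")
    for pr in items:
        # closed_at is an ISO timestamp; the first 10 characters are the date
        print(f"  {pr['closed_at'][:10]}  {pr['title']}")
```

Counts alone are still just counts; the point is to get February's titles and dates in front of you before you write, not to score output.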
Ask the engineer for their own retrospective of the full period. They often remember accomplishments from early months that have faded from your memory. This isn't about taking their self-assessment at face value—it's about jogging your own memory and identifying work worth examining.
You can also structure your review writing to counteract the bias. Start by listing accomplishments and challenges from the first half of the review period before you write anything about the second half. This forces you to give early months dedicated attention.
Finally, be honest with yourself about the limits of your memory. If you genuinely cannot recall what someone did in February and have no records to reference, that's a gap in your process, not a gap in their performance. Note it and improve your approach for the next cycle. Don't fill the gap with assumptions that favor or penalize them.