Structured Scorecards — Turning "I Liked Them" Into Evidence
After a round of interviews, the debrief often sounds like this: "I really liked her." "He gave me a good feeling." "Something was a bit off, I can't say what." These are honest reactions — but as the basis for a hiring decision, they're dangerously thin. A structured scorecard is how you turn that fog of impressions into something you can actually rely on, compare, and defend.
The problem with "I liked them"
"I liked them" feels like information. It isn't, really — at least not the kind you can act on safely.
It's unspecific. Liked them how? Were they good at the actual skills the job needs, or just pleasant to talk to? The statement collapses a dozen different qualities into one vague verdict, and you can't tell which ones drove it.
It's unreliable. Liking is heavily influenced by things that have nothing to do with job performance — confidence, similarity to you, who interviewed before them, your mood. Two interviewers "liking" two different candidates tells you very little about who's actually better.
And it's indefensible. If someone asks why one candidate was chosen over another, "I liked them more" doesn't hold up — not to your team, not to a candidate, not to anyone reviewing the decision later. There's nothing underneath it to point to.
The goal of a scorecard is to replace this with something concrete: not whether you liked a candidate, but how they performed against the specific things the job requires.
What a structured scorecard does
A structured scorecard breaks a role down into the specific competencies that predict success, and asks every interviewer to rate each candidate on each one, against the same defined standard.
Instead of one global "good/bad" feeling, you get a profile: strong on objection handling, average on rapport-building, weak on composure under pressure. That's far more useful than "I liked them," because it tells you what you're getting and where the risks are. It also forces a kind of honesty: you can't fold a vague unease into one overall verdict; you have to locate it in a specific competency.
Critically, it makes candidates comparable. When everyone is rated on the same dimensions against the same criteria, you can line them up meaningfully. "Candidate A scored higher on handling difficult customers; Candidate B was stronger on building quick rapport" is a real, actionable comparison. "I liked A but I'm not sure why" is not.
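To make the comparison concrete, here is a minimal sketch of how per-interviewer ratings could be rolled up into a profile for each candidate and then lined up side by side. The 1-to-5 scale, the interviewer names, the scores, and the function names `build_profiles` and `compare` are all illustrative assumptions, not part of any particular hiring tool.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ratings: (interviewer, candidate, competency, score).
# Scores use an assumed 1-5 scale; all values are made up for illustration.
ratings = [
    ("interviewer_1", "Candidate A", "difficult customers", 5),
    ("interviewer_2", "Candidate A", "difficult customers", 4),
    ("interviewer_1", "Candidate A", "quick rapport", 3),
    ("interviewer_2", "Candidate A", "quick rapport", 3),
    ("interviewer_1", "Candidate B", "difficult customers", 3),
    ("interviewer_2", "Candidate B", "difficult customers", 3),
    ("interviewer_1", "Candidate B", "quick rapport", 5),
    ("interviewer_2", "Candidate B", "quick rapport", 4),
]


def build_profiles(ratings):
    """Average each candidate's scores per competency across interviewers."""
    scores = defaultdict(lambda: defaultdict(list))
    for _interviewer, candidate, competency, score in ratings:
        scores[candidate][competency].append(score)
    return {
        candidate: {comp: mean(vals) for comp, vals in by_comp.items()}
        for candidate, by_comp in scores.items()
    }


def compare(profiles, a, b):
    """For each shared competency, name the higher-scoring candidate or 'tie'."""
    result = {}
    for comp in sorted(profiles[a].keys() & profiles[b].keys()):
        diff = profiles[a][comp] - profiles[b][comp]
        result[comp] = a if diff > 0 else b if diff < 0 else "tie"
    return result


profiles = build_profiles(ratings)
comparison = compare(profiles, "Candidate A", "Candidate B")
print(comparison)
```

Averaging is only one way to combine interviewers' scores. Keeping the individual ratings visible in the debrief is also worthwhile, because a large disagreement between interviewers on one competency is itself something to discuss.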
Building one that works
A useful scorecard has a few features.
Job-relevant competencies. Start from what actually predicts success in this role — for a support hire, things like de-escalation, ownership, clarity, composure. Don't rate generic traits that sound nice but don't connect to performance.
Defined rating standards. For each competency, describe what a low, medium, and high score actually looks like in observable behavior. This is what stops the scorecard from becoming gut feel with numbers attached. "5 = stayed calm and took ownership when the customer escalated" is anchored; "5 = great" is not.
Tied to observed behavior. A scorecard is only as good as what it's scoring. Rating "composure under pressure" from a relaxed chat is guesswork. Rating it from watching the candidate actually handle a pressured moment is evidence. The scorecard works best when it's pointed at real, observed performance rather than inference.
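The "defined rating standards" idea can be sketched as a small rubric in which every score level carries a behavioral description. Everything here is an assumption for illustration: the 1/3/5 levels, the anchor wording (apart from the de-escalation "5" quoted above), and the `unanchored` helper, which uses a deliberately crude word-count check to flag anchors like "great" that describe no behavior.

```python
# Hypothetical rubric: each competency maps score levels to observable behavior.
# The 1/3/5 levels and the anchor wording are illustrative assumptions.
RUBRIC = {
    "de-escalation": {
        1: "argued with the customer or blamed them",
        3: "stayed polite but did not take ownership",
        5: "stayed calm and took ownership when the customer escalated",
    },
    "composure": {
        1: "became visibly flustered and lost the thread",
        3: "recovered after a pause and finished the call",
        5: "great",  # not anchored: says nothing about what was observed
    },
}


def unanchored(rubric, required_levels=(1, 3, 5), min_words=4):
    """List (competency, level) pairs whose anchor is missing or too short.

    Word count is a rough stand-in for "describes a behavior"; a real
    review of the rubric still needs a human reading each anchor.
    """
    problems = []
    for competency, anchors in rubric.items():
        for level in required_levels:
            text = anchors.get(level, "")
            if len(text.split()) < min_words:
                problems.append((competency, level))
    return problems


print(unanchored(RUBRIC))
```

Running a check like this before interviews start is a cheap way to catch rubric entries that would let gut feel slip back in under a number.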
From scorecard to evidence
Here's where it comes together. A structured scorecard gives you the framework for evidence — defined competencies, consistent standards. But a framework still needs something real to score. The strongest version pairs the scorecard with a genuine demonstration of the skill: the candidate handles a realistic scenario, and you score what they actually did, ideally against a transcript you can point to.
Now "I liked them" becomes "they scored a 5 on de-escalation — here's the moment in the conversation where they turned the customer around, and here's how that compares to the other finalist." That's no longer an impression. It's evidence: specific, comparable, and defensible.
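One way to keep scores tied to evidence is to make the record itself refuse a score that has no transcript quote behind it. The sketch below assumes a hypothetical `ScoredMoment` record and `summarize` helper; the field names, the line number, and the quoted sentence are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredMoment:
    """A single score plus the transcript moment that justifies it."""
    candidate: str
    competency: str
    score: int            # assumed 1-5 scale
    transcript_line: int  # where in the transcript the behavior occurred
    quote: str            # what the candidate actually said

    def __post_init__(self):
        # Reject scores that cannot point to anything the candidate said.
        if not self.quote.strip():
            raise ValueError("a score needs a transcript quote to back it up")


def summarize(moment):
    """Render the score as a statement someone else can check."""
    return (
        f"{moment.candidate} scored {moment.score} on {moment.competency} "
        f'(transcript line {moment.transcript_line}): "{moment.quote}"'
    )


# Hypothetical example data.
moment = ScoredMoment(
    candidate="Candidate A",
    competency="de-escalation",
    score=5,
    transcript_line=42,
    quote="I understand why that is frustrating. Let me fix this with you now.",
)
print(summarize(moment))
```

The design choice is that the evidence is required at the moment of scoring, not reconstructed afterwards, so every number in the debrief arrives with something to point to.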
The shift sounds bureaucratic, but it isn't. It's the difference between hoping you chose well and knowing why you did. Structure doesn't remove your judgment; it gives your judgment something solid to stand on.