The single most-cited reason interview scorecards fail is not that the template is wrong. It is that the team uses the template as a souvenir of the interview rather than as the instrument that drives the decision. The scorecard gets filled in from memory, five minutes after the conversation ends, with the candidate already mentally filed as “yes” or “no”. The scores are reverse-engineered to match the gut-feel verdict. The document gets filed. Nobody is any the wiser.
This guide gives you a free interview scorecard template you can copy into Google Sheets, Notion, or a sheet of paper, and explains the discipline that makes it actually work. Both halves matter. The template without the discipline is theater.
Why a scorecard at all
Decades of organizational psychology research show structured interviews predict on-the-job performance substantially better than unstructured ones. The McDaniel et al. (1994) meta-analysis [1] reported criterion-related validity nearly three times higher for structured interviews (.63 vs .20). Schmidt and Hunter’s landmark 1998 synthesis of 85 years of selection research [2] reported a smaller but still substantial advantage (.51 vs .38), and Wingate et al. (2025) re-validated the same direction with modern data [3]. The cost of getting this wrong is not academic. SHRM puts the total cost of replacing an employee at 0.5x to 2x the position’s annual salary [4], with the widely cited “30% of annual salary” figure (commonly attributed to the U.S. Department of Labor) sitting just below the floor of that range [5].
Translated: for an $80,000-a-year role, a single bad hire costs the company somewhere between $40,000 and $160,000 (the SHRM 0.5x-2x range), or even more for senior and revenue-touching positions. The scorecard is the single cheapest defense against that cost.
The template
Copy this directly into a spreadsheet, a Notion page, or a doc. One scorecard per candidate per role.
Header block
| Field | Value |
|---|---|
| Candidate | (name) |
| Role | (role title) |
| Interviewer | (name) |
| Date | (date) |
| Interview round | (1, 2, panel, final) |
Criteria block
This is the heart of the scorecard. Define the criteria before you see the first candidate, not while you are filling in the form. Add or remove rows to fit the role; 4 to 6 criteria is the sweet spot.
| # | Criterion | Weight (1–5) | Evidence (quote what the candidate said) | Evidence quality (Surface / Specific / Tested) | Score (1–5) |
|---|---|---|---|---|---|
| 1 | (e.g. “Has run a B2B negotiation through to close on a deal of $50K+ in the last 24 months”) | | | | |
| 2 | (e.g. “Can explain a technical decision to a non-technical stakeholder without losing precision”) | | | | |
| 3 | | | | | |
| 4 | | | | | |
| 5 | | | | | |
| 6 | | | | | |
Decision block
| Field | Value |
|---|---|
| Weighted score | (sum of Score × Weight) |
| Strengths | (one or two sentences, citing evidence) |
| Concerns | (one or two sentences, citing evidence) |
| Open questions for the next round | (gaps where evidence was thin) |
| Recommendation | Strong yes / Yes / Mixed / No / Strong no |
| Rationale (one sentence) | (frame as evidence, not verdict) |
That is the entire template. Sixty seconds to set up. Useless without what comes next.
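The weighted score in the Decision block is a straight sum of Score × Weight over the criteria rows. In a spreadsheet that is a single SUMPRODUCT over the two columns (for example `=SUMPRODUCT(C2:C7, F2:F7)` if Weight lands in column C and Score in column F). Here is the same arithmetic as a minimal Python sketch; the criteria and numbers are hypothetical, for illustration only:

```python
# Minimal sketch of the weighted-score calculation from the Decision block.
# Criterion names, weights, and scores are hypothetical examples.
criteria = [
    # (criterion, weight 1-5, score 1-5)
    ("Closed a $50K+ B2B deal in the last 24 months", 5, 4),
    ("Explains technical decisions to non-technical stakeholders", 4, 3),
    ("Pipeline discipline", 3, 5),
]

weighted_score = sum(weight * score for _, weight, score in criteria)
max_possible = sum(weight * 5 for _, weight, _ in criteria)

print(f"Weighted score: {weighted_score} / {max_possible}")  # 47 / 60
```

Reporting the score against the maximum possible (rather than as a bare number) keeps scorecards comparable when different roles use different weights.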
How to actually use it: the discipline
1. Define criteria first, never during the interview
Open the scorecard before the first candidate is on the call. Decide, with the hiring manager, what the 4 to 6 criteria are, what weight each carries, and what evidence-strong looks like for each one. Write a one-line rubric for each: evidence-strong = the candidate cites a specific deal with named context (industry, size, objection), the action they took, and a measurable outcome. Evidence-weak = generic claim, no example, no numbers.
If you skip this step, the scorecard becomes a post-hoc rationalization. The whole point is to lock the standard before any candidate-specific bias enters the picture.
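One way to make the lock-in concrete is to write the criteria down as a frozen structure before the first interview. A minimal sketch, assuming a sales role; the criterion names, weights, and rubric lines are invented examples, not a prescribed set:

```python
# Sketch of a locked criteria definition, written before any candidate
# is seen. All names, weights, and rubric text below are hypothetical.
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen: the standard is locked before interviews
class Criterion:
    name: str
    weight: int           # 1-5, agreed with the hiring manager up front
    evidence_strong: str  # what a 4-5 answer must contain
    evidence_weak: str    # what a 1-2 answer looks like

ROLE_CRITERIA = [
    Criterion(
        name="Has closed a $50K+ B2B deal in the last 24 months",
        weight=5,
        evidence_strong="Named deal context (industry, size, objection), "
                        "the action they took, a measurable outcome",
        evidence_weak="Generic claim, no example, no numbers",
    ),
    Criterion(
        name="Explains technical decisions to non-technical stakeholders",
        weight=3,
        evidence_strong="Re-explains a real decision in plain language "
                        "without dropping the key trade-off",
        evidence_weak="Jargon, or precision lost in the simplification",
    ),
]
```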
2. Write evidence during the interview, not from memory after
The most common failure mode is filling in the scorecard from memory five minutes after the call. By that point, what survives is impression, not evidence. The candidate who spoke fluently feels stronger; the candidate who was nervous feels weaker. This is the halo effect documented in decades of cognitive psychology [6]: a single positive attribute (verbal fluency, confident posture) contaminates evaluation of every other dimension. The validity numbers earlier in this article are not theoretical: Schmidt and Hunter found unstructured interviews explain less than 15% of the variance in on-the-job performance, meaning a memory-based score is closer to a coin flip than to a measurement.
Instead: quote the candidate’s actual words in the Evidence column, while the interview is happening. Even short fragments are fine. “Said: led migration of 12-person team from monolith to microservices over 9 months, KPI was deploy frequency, went from weekly to daily.” That is evidence. “Strong technical leadership” is impression.
3. Use the evidence-quality classification
Every quote you capture goes in one of three buckets:
- Surface — the candidate made a generic claim with no specifics. (“I’m a strong leader.”)
- Specific — the candidate gave a concrete example with names, numbers, or detail. (“I led the migration from X to Y over Z months and we hit metric M.”)
- Tested — you challenged the claim and the candidate held up under pressure. (“When I asked what would have been different if the budget had been half, they walked through the trade-off they would have made and named the metric they’d have sacrificed.”)
A candidate with three Tested-quality quotes and two Surface-quality is in a different league from one with five Surface-quality quotes, even if their nominal scores look similar.
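To see why the buckets outrank nominal scores, here is a small sketch of the classification; the two candidate profiles are invented to mirror the example above:

```python
# Sketch of the three evidence-quality buckets from this section.
# The candidate quote profiles below are invented for illustration.
from collections import Counter
from enum import Enum

class EvidenceQuality(Enum):
    SURFACE = "generic claim, no specifics"
    SPECIFIC = "concrete example with names, numbers, or detail"
    TESTED = "claim challenged and held up under pressure"

# Same number of captured quotes, very different leagues.
candidate_a = [EvidenceQuality.TESTED] * 3 + [EvidenceQuality.SURFACE] * 2
candidate_b = [EvidenceQuality.SURFACE] * 5

for name, quotes in [("A", candidate_a), ("B", candidate_b)]:
    counts = Counter(q.name for q in quotes)
    print(f"Candidate {name}: {dict(counts)}")
# Candidate A: {'TESTED': 3, 'SURFACE': 2}
# Candidate B: {'SURFACE': 5}
```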
4. Score the criterion, not the candidate
When you assign a 1 to 5, you are scoring that specific criterion based on the evidence you captured, not your overall feeling about the person. This is harder than it sounds. The simplest forcing function: write the score in the Score column before you read your own Evidence column, then read the evidence and ask yourself “would a stranger reading just this evidence give the same score?”. If not, adjust the score, not the evidence.
5. Compare scorecards side by side, never one at a time
The decision is not “is this candidate good?”. The decision is “which of these candidates is best for this role?”. Open all the scorecards in one view. Compare row by row. The candidate who scored a 4 on the most-weighted criterion beats the candidate who scored a 5 on the least-weighted one, even if the second one was more charming in the room.
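A sketch of what “one view” can look like if the scorecards live in a sheet export or in code. The candidates, weights, and scores are invented, arranged so the 4 on the most-weighted criterion beats the 5 on the least-weighted one, as in the example above:

```python
# Side-by-side comparison sketch: one row per criterion, one column per
# candidate, weighted totals at the bottom. All numbers are invented.
weights = {"Deal closing": 5, "Stakeholder comms": 3, "Pipeline discipline": 2}

scores = {
    "Candidate A": {"Deal closing": 4, "Stakeholder comms": 3, "Pipeline discipline": 3},
    "Candidate B": {"Deal closing": 2, "Stakeholder comms": 3, "Pipeline discipline": 5},
}

print(f"{'Criterion':<26}" + "".join(f"{name:>14}" for name in scores))
for criterion, w in weights.items():
    label = f"{criterion} (w={w})"
    print(f"{label:<26}" + "".join(f"{s[criterion]:>14}" for s in scores.values()))

totals = {name: sum(weights[c] * v for c, v in s.items())
          for name, s in scores.items()}
print(f"{'Weighted total':<26}" + "".join(f"{totals[name]:>14}" for name in scores))
# A's 4 on the most-weighted criterion (20 points) outweighs B's 5 on the
# least-weighted one (10 points): totals are A = 35, B = 29.
```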
6. Use the Open Questions column to drive the next round
The scorecard is not just a verdict-making instrument. It is also the briefing document for the next interviewer. The Open Questions column tells the next round exactly which gaps to probe. If criterion #3 was thin in the first interview, the second interviewer goes deep there. This is how a multi-round process compounds evidence instead of duplicating it.
7. Frame the recommendation as evidence, not as verdict
The Rationale field is one sentence. It should sound like “Strong on technical depth (criteria #1, #4) with specific evidence; concerns on stakeholder communication (#2) where evidence was Surface across two probes.” It should not sound like “Smart and personable, would be a good fit.”
The first form lets a different reader (your co-founder, a future manager, a compliance audit) see exactly why the decision was made and decide whether they agree. The second form is gut feel dressed up as a decision.
The seven mistakes that make scorecards useless
Watch teams use scorecards for years and the same failure modes repeat. If your scorecard is not improving your hires, it is almost certainly one of these seven:
- Filling it in from memory. See above. The single biggest source of failure.
- Inventing criteria during the interview. Criteria need to be locked before the candidate is on the call. Otherwise you are scoring against a moving target.
- Using the same generic criteria for every role. A scorecard that does not change between a sales rep and a CTO is not actually evaluating anything specific.
- Letting one charismatic interviewer dominate the panel decision. The whole point of multiple scorecards is that they get compared. If one person’s vote always wins, you do not have a scorecard, you have a rubber stamp.
- Treating the score as the answer. The score summarizes the evidence. The evidence is what you decide on. If two candidates have the same weighted score but very different evidence quality, the evidence wins.
- Burying gaps. If criterion #4 was thin in the interview and you score it a generous 3 because you “got a good feeling”, you are making the decision impossible to defend later.
- Not keeping them. Scorecards filed away after the decision are wasted. Reviewed six months in (when the new hire is either crushing it or struggling), they are the highest-leverage learning artifact a hiring team has. Patterns emerge: which criteria predicted success, which criteria you misweighted, which kinds of evidence were misleading.
When the scorecard needs to live in the interview, not after it
The template above works. The hard part is not building it. The hard part is keeping it open and actively scored during the interview itself, when the conversation is moving fast and the candidate is in front of you.
The most common failure modes are predictable:
- The scorecard is open but the interviewer scores from memory at the end of the day, so vibes win over evidence.
- Different interviewers use different mental versions of “strong” / “mixed” / “weak”, and the comparison across candidates becomes noise.
- Evidence quotes get paraphrased instead of captured verbatim, which is the same as not capturing them; the defensibility disappears.
- The scorecard exists for hire #1 and gets reinvented from scratch for hire #2.
Each of those failures cancels the gain, and the cost of each failure is exactly what this discipline exists to prevent (SHRM puts the total cost of a single bad hire at 0.5x to 2x annual salary).
Recrutador is a Hiring Intelligence Platform that runs the scorecard discipline end to end as software. A chat-first Strategist defines the Role Blueprint (criteria + weights + rubric + probe library) and persists it across interviews. Resumes are ranked by the Blueprint. During the live interview, a desktop HUD listens, transcribes in real time, and surfaces the next probe one action at a time. At the end, the Post-Interview Memo is generated automatically with quoted evidence. Same engine for any role, any seniority.
If you want to see the methodology end to end, read What is Recrutador. If you want to try it, get started or talk to the team.
References

1. McDaniel, M. A., Whetzel, D. L., Schmidt, F. L., & Maurer, S. D. (1994). The Validity of Employment Interviews: A Comprehensive Review and Meta-Analysis. Journal of Applied Psychology, 79(4), 599-616.
2. Schmidt, F. L., & Hunter, J. E. (1998). The Validity and Utility of Selection Methods in Personnel Psychology: Practical and Theoretical Implications of 85 Years of Research Findings. Psychological Bulletin, 124(2), 262-274.
3. Wingate, T. G., et al. (2025). Evaluating interview criterion-related validity. International Journal of Selection and Assessment.
4. Society for Human Resource Management and aggregated industry estimates put the total cost of a bad hire between $17,000 and $150,000+, with management roles often costing 1 to 2 times annual salary. Updated synthesis: The Real Cost of a Bad Hire (2026).
5. The 30%-of-annual-salary figure is widely attributed to the U.S. Department of Labor and replicated across consultancy and market reviews. Accessible overviews: The Hidden Costs of Bad Hiring and The Cost of a Bad Hire.
6. Nisbett, R. E., & Wilson, T. D. (1977). The Halo Effect: Evidence for Unconscious Alteration of Judgments. Journal of Personality and Social Psychology, 35(4), 250-256. Applied syntheses: Halo Effect in Job Interviews and 7 Cognitive Biases That Distort Hiring.