Virtually every applicant tracking system uses some form of automated scoring to rank resumes. The architectures fall into two categories: keyword-based matching (used by nearly every legacy platform) and outcome-based ranking (used by CurriculoATS). The difference is not marketing. It's a structural difference in how the model reads a resume.
How keyword-based matching works
The Boolean approach
Classic keyword matching extracts tokens from the job description (Python, AWS, Kubernetes, Django, PostgreSQL) and scans the resume for the same tokens. Each found token counts toward the match score.
job_tokens = set(extract_keywords(job_description))   # {"Python", "AWS", "Kubernetes", ...}
resume_tokens = set(extract_keywords(resume_text))
overlap = job_tokens & resume_tokens                   # set intersection
score = len(overlap) / len(job_tokens)                 # fraction of job keywords found

The TF-IDF improvement
More sophisticated keyword matching uses Term Frequency-Inverse Document Frequency scoring. Instead of counting exact matches, it weights rare terms higher than common ones. This reduces the impact of generic words like “team” and gives more weight to specialized terms like “Kubernetes”.
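As a rough illustration of the mechanism (a self-contained sketch, not any vendor's actual implementation), TF-IDF weighting over a toy corpus looks like this:

```python
import math
from collections import Counter

def tfidf_weights(docs):
    """TF-IDF weight for every term in every document of a tiny corpus."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(docs)
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))  # document frequency: how many docs contain the term
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return weights

docs = [
    "python kubernetes team",
    "java team team spring",
    "python docker team",
]
w = tfidf_weights(docs)
# "team" appears in every document, so its weight collapses to zero;
# "kubernetes" appears in only one, so it carries the most signal in doc 0.
```

The generic word is weighted out automatically, with no hand-maintained stopword list, which is exactly the improvement over raw token counting.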
The semantic similarity upgrade
The most advanced keyword-based systems use word or sentence embeddings to compute semantic similarity. Instead of exact token matches, they compute vector distances between the job description and the resume in a learned embedding space. This catches synonyms. “Managed” and “led” end up close in embedding space. But the underlying methodology is still comparing text-level features rather than outcomes.
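The distance computation itself is simple. Here is a minimal cosine-similarity sketch over made-up 3-dimensional vectors (real sentence embeddings have hundreds of dimensions; these numbers are purely illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity: near 1.0 for aligned vectors, lower for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: near-synonyms land close together in the space.
managed = [0.9, 0.1, 0.3]
led = [0.8, 0.2, 0.3]
painted = [0.1, 0.9, 0.1]
```

Here `cosine(managed, led)` comes out close to 1.0 while `cosine(managed, painted)` is far lower, which is why "managed" and "led" score as matches even without a shared token.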
The three structural vulnerabilities of keyword matching
1. Adversarial resume poisoning
Any deterministic text-matching algorithm can be gamed by candidates who know the rules. If the model rewards presence of specific keywords, candidates inject those keywords in hidden white text at the bottom of the resume, in repeated bullet points, or in fake skills sections.
Keyword-based systems are structurally vulnerable to this. Even embedding-based systems are vulnerable, because a candidate who copies job description language into their summary line scores high on semantic similarity without any actual experience.
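The attack is trivial against overlap scoring. A sketch with an invented keyword list and resume text:

```python
def keyword_score(job_tokens, resume_text):
    """Fraction of job keywords present anywhere in the resume text."""
    resume_tokens = set(resume_text.lower().split())
    return len(job_tokens & resume_tokens) / len(job_tokens)

job = {"python", "aws", "kubernetes", "django", "postgresql"}

honest = "built two django services backed by postgresql"
# The same resume plus a hidden white-text keyword dump at the bottom:
stuffed = honest + " python aws kubernetes django postgresql"

print(keyword_score(job, honest))   # 0.4
print(keyword_score(job, stuffed))  # 1.0
```

Five invisible words take the candidate from a 40% match to a perfect score, and the algorithm has no way to tell the difference.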
2. False positives from density
A candidate who repeats “Python Python Python” or lists 50 technologies in their skills section can score higher than a candidate who wrote detailed descriptions of two real projects. The model rewards token count over depth.
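Raw term-frequency scoring makes the failure concrete; repetition alone raises the score (scoring function and resume text invented for illustration):

```python
from collections import Counter

def tf_score(term, resume_text):
    """Naive term-frequency score: occurrences of the term, normalized by length."""
    tokens = resume_text.lower().split()
    return Counter(tokens)[term] / len(tokens)

spammy = "python python python python"
real = "built a fraud detection pipeline in python processing 2m events daily"

print(tf_score("python", spammy))  # 1.0
print(tf_score("python", real))    # ~0.09
```

The contentless resume scores eleven times higher on the very term that the detailed one actually demonstrates.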
3. False negatives on real talent
Great candidates often describe the same outcome in different words. An engineer who wrote “migrated our real-time data pipeline to Apache Flink” gets credit for “Apache Flink” but loses credit for “streaming” if the model wanted that specific word. Word choice matters more than actual accomplishment.
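The inverse failure is just as mechanical. If the job description asks for "streaming" and the resume says "Apache Flink", exact-token overlap sees nothing (example invented for illustration):

```python
job_tokens = {"streaming", "kafka", "real-time"}
resume = "migrated our data pipeline to apache flink cutting end-to-end latency 40%"

matched = job_tokens & set(resume.lower().split())
print(matched)  # set() -- zero credit for genuinely relevant streaming work
```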
The three vulnerabilities aren’t edge cases. They’re structural features of keyword-based ranking. Any competitive candidate pool has enough adversarial behavior to make keyword scores unreliable for hiring decisions.
How outcome-based ranking works
Reading for outcomes, not tokens
CurriculoATS’s AI is trained to read resumes the way a senior engineer or operator would, extracting measurable outcomes across four categories.
The model doesn’t count token occurrences. It reads the work that was done and evaluates it against the job requirements.
Writing reasoning, not just scores
For every candidate, the outcome-based model produces both a 0-100 fit score AND a full written reasoning paragraph. The reasoning explicitly references which outcomes matched which requirements and where the candidate fell short. This makes every decision auditable. Recruiters can read the reasoning and override the score when the model makes a mistake.
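The output shape might look like the following (a hypothetical illustration; the field names and example reasoning are invented here, not CurriculoATS's actual schema):

```python
from dataclasses import dataclass

@dataclass
class FitAssessment:
    """One auditable decision: a score plus the reasoning behind it."""
    score: int      # 0-100 fit score
    reasoning: str  # plain-English paragraph tying outcomes to requirements

assessment = FitAssessment(
    score=78,
    reasoning=(
        "Led a Flink migration handling 2M events/day, matching the "
        "streaming requirement; no evidence of the required on-call "
        "ownership, which caps the score."
    ),
)
```

A recruiter reading the reasoning field can verify or override the score, and that readable trail is what the compliance discussion below hinges on.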
How outcome-based ranking handles the three vulnerabilities
Adversarial poisoning stops working
Hidden white-text keyword dumps add zero signal to an outcome-based model. The model isn’t counting keywords. It’s evaluating descriptions of work. An unused skill list doesn’t match any of the four outcome signals, so it doesn’t move the score.
False positives from density disappear
Repeating “Python Python Python” adds zero value. The model is looking for descriptions of real work done in Python, not occurrences of the word. Depth beats density.
False negatives on real talent drop dramatically
A candidate who writes detailed descriptions of specific outcomes in their own words gets credit regardless of whether they match the exact keyword the job description used. “Led migration to Apache Flink” and “managed k8s rollout for 30-service monorepo” are recognized as senior systems work even if the job description used different words.
Why this matters for compliance
Both NYC Local Law 144 (effective 2023) and the EU AI Act (2024) require algorithmic transparency and auditability for automated employment decision tools. The regulations expect that, for any candidate decision, you can explain why the decision was made.
A keyword-based system can point at a score, but the score’s origin is a set of floating-point weights and token occurrences. Recruiters can’t explain it. Auditors can only sample it statistically. Candidates can’t be told why they were rejected.
An outcome-based system with written reasoning is structurally auditable. Every score comes with a plain-English paragraph explaining which outcomes matched which requirements.
We don’t claim specific regulatory certifications (those require independent audits per deployment). What we can say is that the architecture is designed for transparency: the kind of system that can pass an audit rather than fight one.
The technical bottom line
Keyword-based matching is computationally cheap and easy to explain. It’s also structurally vulnerable to adversarial inputs, density bias, and false negatives on word-choice variation. Semantic embedding-based upgrades mitigate some of these issues but share the core limitation: they’re comparing text-level features, not outcomes.
Outcome-based ranking is computationally more expensive (requires a capable LLM reading resume content and producing structured reasoning), but fixes all three vulnerabilities. The tradeoff is cost of inference for quality of signal. For any use case where hiring quality matters more than saving milliseconds per candidate, outcome-based wins.
How to test it yourself
Import a CSV of past applicants into CurriculoATS and compare the AI reasoning against the scores your previous keyword-based ATS produced. The free plan supports this workflow, no credit card, no time limit.
Related: Impact Scoring engine, AI Resume Screening, How keyword stuffing exploits legacy ATS.