When AI Hiring Goes Wrong
Four real cases where AI recruiting tools discriminated against candidates — and what a proper ethics audit would have caught.

A Qualified Candidate Who Never Got Seen
Imagine you're a 56-year-old tutor. Decades of experience. You apply to an online education platform. You meet every qualification listed in the job posting.
A human never reads your application. The software rejects you automatically: you're over 55.
That's not a hypothetical. It's what happened at iTutorGroup, and it's the mildest case in this post.
AI hiring tools are sold on a simple premise: remove human bias from recruitment. Four cases — spanning 2018 to 2024 — show what happens when that premise meets reality. Each one escalates: from internal discovery, to regulatory enforcement, to vendor liability.
Amazon's Recruiting Tool (2018)
The canonical case. Reuters reported that Amazon built an internal AI recruiting tool trained on resumes submitted over a decade. The engineering workforce was predominantly male. The algorithm learned that "male" was a predictor of success — not because men are better engineers, but because men were the ones who had historically been hired.
It penalised resumes containing the word "women's" — as in "women's chess club captain." It downgraded graduates of two all-women's colleges.
Amazon's engineers tried to fix the bias. They couldn't guarantee neutrality. The project was scrapped entirely — never deployed at scale.
The failure: Training data reflected historical hiring patterns, not candidate quality. No bias testing against protected characteristics. No human oversight loop. The model optimised for "looks like people we already hired" rather than "qualified for the role."
What would have caught it: A pre-deployment demographic parity test. The EU AI Act now classifies employment AI as high-risk and mandates exactly this kind of bias examination; running one would have flagged the gender disparity immediately.
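As a concrete illustration, here's a minimal sketch of such a test in Python. Everything in it is assumed for the example: the column names, the toy data, and the 0.10 threshold. None of it comes from Amazon's system, and a real audit would cover every protected characteristic, not just gender.

```python
import pandas as pd

def demographic_parity_gap(df, group_col, outcome_col):
    """Gap between the highest and lowest selection rates across groups."""
    rates = df.groupby(group_col)[outcome_col].mean()
    return rates.max() - rates.min()

# Toy screening output: one row per candidate, selected=1 means the
# model advanced them. Real audits would cover every protected group.
results = pd.DataFrame({
    "gender":   ["F", "F", "F", "M", "M", "M", "M", "M"],
    "selected": [0, 0, 1, 1, 1, 0, 1, 1],
})

gap = demographic_parity_gap(results, "gender", "selected")
print(f"selection-rate gap: {gap:.2f}")  # 0.47 for this toy data

# Release gate: block deployment when the gap exceeds a chosen threshold.
if gap > 0.10:
    print("FAIL: demographic parity gap exceeds threshold; do not deploy")
```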
HireVue's Video Assessments (2019)
HireVue built a platform used by over 700 companies — Goldman Sachs, Unilever, and others — that scored job candidates through video interviews. It analysed facial expressions, word choice, and speech patterns.
The Electronic Privacy Information Center (EPIC) filed a complaint arguing the results were "biased, unprovable, and not replicable." Independent research on speech recognition systems found accuracy gaps that disadvantaged non-white applicants and deaf candidates.
HireVue dropped the facial analysis component in 2021. The speech pattern analysis, which carries the same class of bias risks, remained.
The failure: Facial expression analysis has well-documented accuracy gaps across skin tones. Speech scoring penalised accents, speech impediments, and non-native speakers. The scoring model was opaque. No third-party fairness audit before deployment at scale.
What would have caught it: Transparency requirements under the OECD AI Principles. Candidates should understand how they're being evaluated. Under the EU AI Act, employment screening requires documented risk assessment, human oversight, and post-deployment monitoring. A compliance mapping exercise would have identified these gaps before the system touched 700 companies.
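A compliance mapping exercise can start very small. The sketch below is one hypothetical shape for it: the requirement lists paraphrase the frameworks named above rather than quoting them, and `system_status` stands in for whatever evidence your system can actually produce.

```python
# Hypothetical compliance map. The requirement lists paraphrase the
# frameworks named above; they are not the statutes' literal wording.
REQUIREMENTS = {
    "EU AI Act (high-risk employment AI)": [
        "documented risk assessment",
        "human oversight",
        "post-deployment monitoring",
    ],
    "OECD AI Principles": ["transparency to candidates"],
}

# What the system can currently evidence (assumed for the example).
system_status = {"human oversight"}

for regime, requirements in REQUIREMENTS.items():
    gaps = [r for r in requirements if r not in system_status]
    if gaps:
        print(f"{regime}: missing {', '.join(gaps)}")
```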
iTutorGroup — The First EEOC AI Settlement (2023)
In August 2023, the EEOC reached its first-ever settlement involving AI hiring discrimination. iTutorGroup's screening software rejected women over 55 and men over 60 — filtering out more than 200 qualified applicants based on age alone.
Not as one factor among many. As an automatic rejection rule. Candidates who met every qualification were never seen by a human.
The settlement was $365,000 — modest by corporate standards. But the legal precedent matters: the EEOC treated the AI tool identically to any other discriminatory employment practice. "The algorithm did it" is not a defence.
What would have caught it: This one is almost embarrassingly simple. A basic input audit — someone reading the filtering rules the system was applying — would have flagged age-based cutoffs immediately. No advanced bias testing required.
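To make "input audit" concrete: when screening rules are declarative, as iTutorGroup's age cutoffs effectively were, even a crude scan surfaces them. The rule schema below is invented for illustration; the point is that no statistics are needed, just reading what the system is configured to do.

```python
# Invented rule schema for illustration; iTutorGroup's actual
# configuration format is not public.
PROTECTED_ATTRIBUTES = {"age", "date_of_birth", "gender", "race", "disability"}

screening_rules = [
    {"field": "age", "op": ">", "value": 55, "action": "reject"},
    {"field": "years_experience", "op": "<", "value": 2, "action": "reject"},
]

def audit_rules(rules):
    """Flag any rule that filters directly on a protected attribute."""
    return [r for r in rules if r["field"] in PROTECTED_ATTRIBUTES]

for rule in audit_rules(screening_rules):
    print(f"FLAG: rule filters on protected attribute '{rule['field']}': {rule}")
```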
Workday — When the Vendor Is Liable (2024)
In February 2023, a class action was filed against Workday alleging its automated resume screening discriminated on the basis of race, age, and disability. In July 2024, a California federal court allowed the case to proceed under an "agent" theory.
This is the case that changes the game. Previously, liability sat with the employer — the company that chose to use the tool. This case may establish that AI tool vendors carry employer-equivalent liability. If you build the discriminatory system, you can't hide behind "we just sell the software."
The EEOC filed an amicus brief supporting the plaintiff, signalling that federal enforcement is moving toward holding vendors accountable.
What would have caught it: Third-party algorithmic audits. If Workday had subjected its screening tools to independent bias testing — and published the results — the discriminatory patterns could have been identified before deployment across thousands of companies.
The Pattern
Each case escalates:
- Amazon (2018) — Company catches its own bias, kills the project internally
- HireVue (2019) — External advocacy group files complaint, company drops one feature but keeps others
- iTutorGroup (2023) — Federal regulator settles first-ever AI discrimination case
- Workday (2024) — Court rules AI vendor can be directly liable, not just the employer
The direction is clear. Enforcement is tightening. Liability is expanding. And "we didn't know the algorithm was biased" is becoming legally indefensible.
The Audit Checklist
All four cases share the same root failure: deployment without governance. Here's what a basic AI hiring audit should cover:
Pre-Deployment
| Check | Question |
|---|---|
| Training data | Does the dataset reflect the candidate population, not just historical hires? |
| Bias testing | Have outcomes been tested across gender, ethnicity, age, and disability? (see the sketch after this table) |
| Explainability | Can the system explain why a candidate was scored a certain way? |
| Human oversight | Is there a defined threshold where a human reviews the AI's decision? |
| Regulatory mapping | Which jurisdictions apply, and what do they require? |
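The bias-testing row is the one most often skipped, so here's a minimal sketch of it. It applies the EEOC's four-fifths rule: each group's selection rate should be at least 80% of the highest group's rate. The file name and column names are assumptions; the rule itself is standard.

```python
import pandas as pd

def four_fifths_check(df, group_col, outcome_col, threshold=0.8):
    """Flag groups whose selection rate falls below `threshold` times
    the best-performing group's rate (the EEOC four-fifths rule)."""
    rates = df.groupby(group_col)[outcome_col].mean()
    best = rates.max()
    return {group: round(rate / best, 2)
            for group, rate in rates.items() if rate / best < threshold}

# Hypothetical export: one row per candidate, with self-reported
# demographics and the model's advance/reject decision (advanced=1).
outcomes = pd.read_csv("outcomes.csv")
for col in ("gender", "ethnicity", "age_band", "disability"):
    flagged = four_fifths_check(outcomes, col, "advanced")
    if flagged:
        print(f"{col}: impact ratios below 0.8 -> {flagged}")
```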
Post-Deployment
| Check | Question |
|---|---|
| Outcome monitoring | Are hiring outcomes tracked by demographic group over time? |
| Drift detection | Has the model's behaviour changed since deployment? (see the sketch after this table) |
| Complaint mechanism | Can candidates challenge AI-driven decisions? |
| Audit trail | Are all AI-assisted decisions logged and reviewable? |
| Periodic review | Is the system re-evaluated on a defined schedule? |
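Outcome monitoring and drift detection reduce to the same comparison run on a schedule: recompute the selection rates and diff them against the rates recorded at launch. A sketch, with the file names and the 0.05 tolerance assumed for the example:

```python
import pandas as pd

def selection_rates(df, group_col, outcome_col):
    """Per-group selection rates for one batch of decisions."""
    return df.groupby(group_col)[outcome_col].mean()

def drift_report(baseline, current, tolerance=0.05):
    """Flag groups whose selection rate moved more than `tolerance`
    away from the rate recorded at deployment time."""
    delta = (current - baseline).abs()
    return delta[delta > tolerance]

# Hypothetical monthly job: baseline captured at launch, current
# drawn from the last 30 days of AI-assisted decisions.
baseline = selection_rates(pd.read_csv("launch_outcomes.csv"), "gender", "advanced")
current = selection_rates(pd.read_csv("last_30_days.csv"), "gender", "advanced")

drifted = drift_report(baseline, current)
if not drifted.empty:
    print("Drift detected; escalate to periodic review:")
    print(drifted)
```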
The Counterargument
AI hiring proponents have a point: human hiring is also biased. Interviewers favour people who look like them, went to the same schools, share the same cultural references. AI was supposed to fix that.
The problem isn't the goal. It's the assumption that automation equals objectivity. An algorithm trained on biased data doesn't remove bias — it scales it. And unlike a human interviewer, it does it consistently, silently, and across thousands of candidates simultaneously.
The answer isn't to stop using AI in hiring. It's to govern it. Test it. Audit it. And hold someone accountable when it fails.
Building the model is engineering. Deciding where and how to deploy it — that's policy. Right now, most companies are doing the first part without the second. These four cases show where that leads.
This is the second post in a series on AI ethics policy. Start with Getting Started in AI Ethics Policy for the frameworks behind these cases.
Next in this series: an AI ethics policy audit template you can apply to any AI system making decisions about people.
References
- Why Amazon's Automated Hiring Tool Discriminated Against Women — ACLU
- EPIC Complaint Against HireVue — Electronic Privacy Information Center
- iTutorGroup EEOC Settlement — U.S. Equal Employment Opportunity Commission
- Mobley v. Workday: AI Vendors and Employment Discrimination Liability — Fisher Phillips
- EU AI Act Implementation Timeline — EU Artificial Intelligence Act
- OECD AI Principles — OECD