PSB-Recruiting Benchmark
Tavi scores 97.03 on recruiting search.
Tavi was evaluated on PSB-Recruiting, the 30-query recruiting category of PeopleSearchBench. The benchmark measures whether a people-search system finds relevant candidates, returns enough qualified profiles, and provides useful information for review.
- Overall: 97.03
- Relevance: 97.26
- Coverage: 99.90
- Utility: 93.94
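
As a quick arithmetic check on the scores above, the reported Overall figure is consistent with an unweighted mean of the three component scores. The page does not state the aggregation formula, so treat the averaging below as an assumption that happens to reproduce the published number, not as the benchmark's documented method.

```python
# Component scores as reported on this page.
scores = {"Relevance": 97.26, "Coverage": 99.90, "Utility": 93.94}

# Assumption: Overall is the plain arithmetic mean of the three components.
overall = sum(scores.values()) / len(scores)

print(round(overall, 2))  # 97.03, matching the reported Overall score
```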

Methodology
How the benchmark works
Each recruiting brief is decomposed into explicit criteria. Returned candidates are checked against those criteria using web evidence and an LLM judge, then scored for ranking quality, candidate coverage, and information utility. The underlying benchmark paper is available on arXiv.
- 01 The test uses 30 recruiting searches, each written like a real hiring brief.
- 02 For each brief, Tavi can return up to 15 people, ranked from strongest fit downward.
- 03 Each returned person is converted into the same structured format the benchmark expects.
- 04 The evaluator breaks every hiring brief into explicit criteria, then checks each candidate against them.
- 05 Web search is used to gather outside evidence, so judgments are grounded in public information.
- 06 An LLM judge reviews the evidence and scores relevance, coverage, and information usefulness.
- 07 The evaluator is run 7 times, then the results are averaged to reduce judge variance.
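
The steps above can be sketched roughly as follows. This is a minimal illustration, not the benchmark's actual code: the function names (`top_candidates`, `criteria_match_rate`, `averaged_score`) and the per-run numbers are hypothetical. Only the limits of 15 results per brief and 7 evaluator runs come from the text.

```python
from statistics import mean
from typing import Callable, Sequence

MAX_RESULTS = 15  # from the text: up to 15 people per brief
NUM_RUNS = 7      # from the text: the evaluator is run 7 times


def top_candidates(ranked: Sequence[str], limit: int = MAX_RESULTS) -> list[str]:
    """Keep only the highest-ranked candidates, strongest fit first."""
    return list(ranked[:limit])


def criteria_match_rate(met: Sequence[bool]) -> float:
    """Fraction of a brief's explicit criteria that one candidate satisfies."""
    return sum(met) / len(met) if met else 0.0


def averaged_score(run_once: Callable[[int], float], runs: int = NUM_RUNS) -> float:
    """Run a (hypothetical) evaluator several times and average the results."""
    return mean(run_once(i) for i in range(runs))


# Illustrative placeholder values only, not real benchmark output.
example_runs = [96.0, 97.0, 98.0, 97.0, 96.5, 97.5, 97.0]
final_score = averaged_score(lambda i: example_runs[i])

# A candidate who meets 3 of 4 criteria in a brief.
match_rate = criteria_match_rate([True, True, False, True])
```

Averaging over repeated runs smooths out run-to-run differences in the LLM judge's verdicts, which is the stated reason for running the evaluator 7 times.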