Tavi

PSB-Recruiting Benchmark

Tavi scores 97.03 on recruiting search.

Tavi was evaluated on PSB-Recruiting, the 30-query recruiting category of PeopleSearchBench. The benchmark measures whether a people-search system finds relevant candidates, returns enough qualified profiles, and provides useful information for review.

Overall: 97.03
Relevance: 97.26
Coverage: 99.90
Utility: 93.94

Figure: Benchmark chart showing Tavi at 97.03 on PSB-Recruiting, ahead of Prism at 89.64 and other published baselines.
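
As a quick arithmetic check, the reported overall score is consistent with an unweighted mean of the three dimension scores. This is an observation from the published numbers, not a confirmed statement of how the benchmark combines them; the variable names are illustrative.

```python
# Dimension scores as reported on this page.
scores = {"relevance": 97.26, "coverage": 99.90, "utility": 93.94}

# Assumption: the overall score is the unweighted mean of the three dimensions.
overall = sum(scores.values()) / len(scores)

print(round(overall, 2))
```

If the benchmark weights the dimensions differently, the match to two decimal places would be a coincidence, so the benchmark paper remains the authority on the formula.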

Methodology

How the benchmark works

Each recruiting brief is decomposed into explicit criteria. Returned candidates are checked against those criteria using web evidence and an LLM judge, then scored for ranking quality, candidate coverage, and information utility. The underlying benchmark paper is available on arXiv.

  1. The test uses 30 recruiting searches, each written like a real hiring brief.
  2. For each brief, Tavi can return up to 15 people, ranked from strongest fit downward.
  3. Each returned person is converted into the same structured format the benchmark expects.
  4. The evaluator breaks every hiring brief into explicit criteria, then checks each candidate against them.
  5. Web search is used to gather outside evidence, so judgments are grounded in public information.
  6. An LLM judge reviews the evidence and scores relevance, coverage, and information usefulness.
  7. The evaluator is run 7 times, then the results are averaged to reduce judge variance.
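
The criteria check and the final averaging step can be sketched roughly as follows. This is a minimal illustration, not the benchmark's evaluator: the function names, the match-rate formula, and all numbers are hypothetical, and the real scoring rules are defined in the benchmark paper.

```python
from statistics import mean


def criteria_match_rate(checks: list[bool]) -> float:
    """Fraction of a brief's criteria that one candidate satisfies.

    Assumption: a simple pass/fail check per criterion; the actual
    benchmark may grade criteria differently.
    """
    return sum(checks) / len(checks) if checks else 0.0


def average_runs(runs: list[dict[str, float]]) -> dict[str, float]:
    """Average each dimension score across repeated evaluator runs."""
    dimensions = runs[0].keys()
    return {d: mean(run[d] for run in runs) for d in dimensions}


# Hypothetical per-run relevance scores from 7 evaluator runs (made-up values).
runs = [{"relevance": r} for r in (90.0, 92.0, 91.0, 93.0, 89.0, 91.0, 91.0)]

averaged = average_runs(runs)
print(averaged["relevance"])                           # 91.0
print(criteria_match_rate([True, True, False, True]))  # 0.75
```

Averaging over several runs smooths out run-to-run differences in the LLM judge's scores, which is the stated reason for repeating the evaluation 7 times.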