Independent Evaluation, Unbiased Benchmarks

Testing AI on Real-World Tasks

We benchmark the world's leading AI models on rigorous, domain-specific tasks in finance, law, software, healthcare, and more. We run all of our own evaluations and create many of our benchmarks in-house.

Vals AI Updates

Fresh updates from our testing queue

Benchmark, 06/23/2026

Legal Research Bench Released

System rank   Accuracy
1             43.75% ± 3.45
2             40.38% ± 3.41
3             38.46% ± 3.38
4             31.25% ± 3.22
5             30.77% ± 3.21
6             29.81% ± 3.18
7             27.89% ± 3.12
8             25.48% ± 3.03
9             23.08% ± 2.93
10            20.67% ± 2.81

Showing top 10 models from the benchmark. Visit the benchmark page to view more.
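
The page does not say how the ± figures are derived. One common convention is to report one standard error of the mean over per-question scores of 0 or 1. The sketch below illustrates that convention only; the function name and the counts (91 correct out of 208 questions) are hypothetical, picked for illustration, and are not published figures.

```python
import math


def accuracy_with_standard_error(correct: int, total: int) -> tuple[float, float]:
    """Return (accuracy, standard error), both in percent.

    Assumes each question is scored 0 or 1 and that the margin is one
    standard error of the mean. This is an assumption, not a documented
    Vals AI method.
    """
    if total < 2 or not 0 <= correct <= total:
        raise ValueError("need total >= 2 and 0 <= correct <= total")
    p = correct / total
    # Sample variance of 0/1 scores, using Bessel's correction (n - 1).
    sample_variance = p * (1 - p) * total / (total - 1)
    standard_error = math.sqrt(sample_variance / total)
    return 100 * p, 100 * standard_error


# Hypothetical counts, for illustration only.
acc, se = accuracy_with_standard_error(correct=91, total=208)
print(f"{acc:.2f}% ± {se:.2f}")  # prints "43.75% ± 3.45"
```

Under this reading, margins shrink as the question set grows, so two systems whose accuracies differ by less than their margins should not be treated as clearly separated.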

Industry Leaderboard

Independent benchmarks for industry-specific AI performance.

Model Performance Over Time

Tracking how foundation models improve with each release