Independent Evaluation, Unbiased Benchmarks

Testing AI on Real-World Tasks

We benchmark the world's leading AI models on rigorous, domain-specific tasks in finance, law, software, healthcare, and more. We run all of our own evaluations and create many of our benchmarks in-house.

Vals AI Updates

Fresh updates from our testing queue

benchmark
06/16/2026

Code Migration Released

Rank   Accuracy
1      55.06% ± 4.61
2      47.25% ± 4.18
3      45.16% ± 4.16
4      43.88% ± 4.22
5      39.89% ± 4.14
6      34.98% ± 4.11
7      27.77% ± 4.14
8      26.75% ± 4.10
9      26.20% ± 4.04
10     25.77% ± 4.09

Showing top 10 models from the benchmark. Visit the benchmark page to view more
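
As a rough reading aid for the accuracy figures above, the sketch below checks whether two entries' ranges overlap. It assumes each ± value is a symmetric margin in percentage points around the reported accuracy; the page does not say how the margin is computed, so treat this as an illustration rather than our evaluation method. The names `scores`, `interval`, and `overlaps` are hypothetical.

```python
# Assumption: each "±" value is a symmetric margin, in percentage points,
# around the reported accuracy. The source does not say how it is computed.
scores = [
    (55.06, 4.61),
    (47.25, 4.18),
    (45.16, 4.16),
    (43.88, 4.22),
    (39.89, 4.14),
    (34.98, 4.11),
    (27.77, 4.14),
    (26.75, 4.10),
    (26.20, 4.04),
    (25.77, 4.09),
]


def interval(accuracy, margin):
    """Return the (low, high) range implied by accuracy ± margin."""
    return accuracy - margin, accuracy + margin


def overlaps(a, b):
    """True if the ranges of two (accuracy, margin) entries intersect."""
    a_low, a_high = interval(*a)
    b_low, b_high = interval(*b)
    return a_low <= b_high and b_low <= a_high


# Compare each entry with the one ranked directly below it.
adjacent_overlaps = [
    overlaps(scores[i], scores[i + 1]) for i in range(len(scores) - 1)
]
```

Under that assumption, every entry's range overlaps the range of the entry ranked directly below it, while the first and third entries do not overlap. Small gaps between neighboring ranks are therefore best read with caution.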

Industry Leaderboard

Independent benchmarks for industry-specific AI performance.

Industry
Benchmark