Independent Evaluation, Unbiased Benchmarks

Testing AI on Real-World Tasks

We benchmark the world's leading AI models on rigorous, domain-specific tasks in finance, law, software, healthcare, and more. We run all of our own evaluations and create many of our benchmarks in-house.

Vals AI Updates

Fresh updates from our testing queue

benchmark
05/04/2026

Vals Index and Multimodal Index Methodology Update



Rank  Accuracy
1     72.21% ± 1.95
2     71.00% ± 2.18
3     66.13% ± 2.17
4     66.00% ± 2.16
5     65.55% ± 2.14
6     64.52% ± 2.23
7     62.22% ± 2.30
8     59.33% ± 2.46
9     58.11% ± 2.19
10    57.92% ± 2.25

Showing the top 10 models from the benchmark. Visit the benchmark page to view more.

Industry Leaderboard

Independent benchmarks for industry-specific AI performance.


Model Performance Over Time

Tracking how foundation models improve with each release