Independent Evaluation, Unbiased Benchmarks

Testing AI on Real-World Tasks

We benchmark the world's leading AI models on rigorous, domain-specific tasks in finance, law, software, healthcare, and more. We run all of our own evaluations and create many of our benchmarks in-house.

Vals AI Updates

Fresh updates from our testing queue

model
06/02/2026

Alibaba's Qwen 3.7 Plus evaluated on the Vals Index

Benchmarks

Accuracy          Ranking
52.33% ± 1.63     13/23
46.39% ± 4.61     17/55
52.81% ± 0.65     13/22
Contact us
Or send us an email at contact@vals.ai
Proprietary Benchmarks (contact us to get access)
Academic Benchmarks

Read about our methodology.

Industry Leaderboard

Independent benchmarks for industry-specific AI performance.

Industry
Benchmark