We evaluated Gemini 3.1 Flash Lite Preview across our full benchmark suite. The model is Google’s fast and cost-efficient offering.
- The model does well on two of our multimodal benchmarks: SAGE (ranked 5th with 49.5% accuracy) and Mortgage Tax (ranked 7th with 67.8% accuracy).
- It ranks 14th on MMLU Pro with 86.2% accuracy.
- On coding tasks, the model still has room for improvement: it ranks 29th on both Live Code Bench and Terminal Bench 2 and 28th on SWE-bench, and it scores 0% on Vibe Code Bench.
- Overall, it places 15th/20 on the Vals Multimodal Index and 22nd/31 on the Vals Index.
- The cost savings compared to other models in the Gemini 3 series or Gemini 2.5 are dramatic: roughly 5–20x cheaper per test across benchmarks, while maintaining respectable accuracy. For example, on Finance Agent it costs $0.072 per test vs $0.370 for Gemini 3 Flash, while performing comparably.
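
As a quick sanity check on the cost claim, the Finance Agent figures quoted above can be turned into a ratio. This is only an illustrative sketch: the helper name `cost_ratio` and the variable names are our own, and the only inputs are the two per-test costs stated in this post.

```python
def cost_ratio(baseline_cost: float, model_cost: float) -> float:
    """Return how many times cheaper model_cost is than baseline_cost."""
    if model_cost <= 0:
        raise ValueError("model_cost must be positive")
    return baseline_cost / model_cost


# Per-test costs on Finance Agent, in USD, as quoted in this post.
gemini_3_flash_cost = 0.370
flash_lite_cost = 0.072

ratio = cost_ratio(gemini_3_flash_cost, flash_lite_cost)
print(f"{ratio:.1f}x cheaper per test")  # prints "5.1x cheaper per test"
```

A ratio of about 5.1x sits at the low end of the 5–20x range, so Finance Agent is one of the more conservative examples of the savings.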
Our results show that the model does not perform as well as other models in the Gemini 3 series. However, it is fast and quite cost-efficient relative to those models, making it a good choice for applications that demand scale, speed, or cost-efficiency.
Evaluations were run with a temperature of 1.0 and a “high” thinking level, via the official Google API.