Welcome to the Vitruvian Benchmarks page, an overview of how our models perform across a growing set of evaluation tasks. Here we track results on key benchmarks covering general knowledge, reasoning, instruction following, and Italian language understanding.
This page is updated continuously as we release new models and expand our evaluation coverage. Our goal is to provide a clear, consistent view of Vitruvian's capabilities over time, helping users, developers, and partners understand where our models are strong and where there is room to improve.
Browse the benchmarks below to explore detailed results and follow our progress.
Massive Multitask Language Understanding – Italian.
Model Name | Score |
---|---|
Vitruvian_Scientist-14B | 74.50% |
Vitruvian_Explainer-14B | 74.50% |
Qwen_2.5_14B | 74.00% |
Vitruvian_Smart-12B | 67.10% |
Mistral_small_22B | 65.16% |
Italian Language Instructional Comprehension
Model Name | ORTHOGRAPHY | SYNTAX | LITERATURE | CIVIC EDUCATION | ART HISTORY | LEXICON | GEOGRAPHY | TOURISM | MORPHOLOGY | CURRENT EVENTS | SYNONYMS | HISTORY |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Vitruvian_Scientist-14B | 65.30% | 64.20% | 73.50% | 73.70% | 67.70% | 82.70% | 76.90% | 71.90% | 52.10% | 79.30% | 85.20% | 77.00% |
Vitruvian_Explainer-14B | 66.8% | 68.1% | 71.40% | 71.70% | 68.40% | 82.70% | 78.70% | 68.70% | 56.40% | 82.60% | 85.30% | 77.90% |
Vitruvian_Smart-12B | 63.70% | 65.20% | 74.40% | 77.00% | 72.20% | 84.40% | 81.60% | 72.90% | 52.90% | 80.40% | 85.50% | 77.70% |
LLamAntino-3-8B | 52.83% | 54.47% | 67.68% | 67.32% | 68.67% | 76.61% | 76.92% | 70.82% | 40.00% | 79.35% | 76.42% | 74.85% |
Llama-3.1-8b-Ita | 53.04% | 53.65% | 67.17% | 71.22% | 70.10% | 81.51% | 79.26% | 71.73% | 52.14% | 82.61% | 81.15% | 77.40% |
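
For readers who want a single number per model, the per-category scores above can be collapsed into an unweighted mean. The sketch below is only an illustration: the page does not report an aggregate score for this benchmark, the equal weighting of the twelve categories is an assumption, and the names `CATEGORIES`, `ILIC_SCORES`, and `macro_average` are hypothetical helpers introduced here.

```python
from statistics import fmean

# Category order follows the table header (spellings normalized to English).
CATEGORIES = [
    "Orthography", "Syntax", "Literature", "Civic Education", "Art History",
    "Lexicon", "Geography", "Tourism", "Morphology", "Current Events",
    "Synonyms", "History",
]

# Per-category scores in percent, copied from the table above.
ILIC_SCORES = {
    "Vitruvian_Scientist-14B": [65.30, 64.20, 73.50, 73.70, 67.70, 82.70,
                                76.90, 71.90, 52.10, 79.30, 85.20, 77.00],
    "Vitruvian_Explainer-14B": [66.80, 68.10, 71.40, 71.70, 68.40, 82.70,
                                78.70, 68.70, 56.40, 82.60, 85.30, 77.90],
    "Vitruvian_Smart-12B": [63.70, 65.20, 74.40, 77.00, 72.20, 84.40,
                            81.60, 72.90, 52.90, 80.40, 85.50, 77.70],
    "LLamAntino-3-8B": [52.83, 54.47, 67.68, 67.32, 68.67, 76.61,
                        76.92, 70.82, 40.00, 79.35, 76.42, 74.85],
    "Llama-3.1-8b-Ita": [53.04, 53.65, 67.17, 71.22, 70.10, 81.51,
                         79.26, 71.73, 52.14, 82.61, 81.15, 77.40],
}


def macro_average(scores):
    """Return the unweighted mean of one model's per-category scores.

    Assumption: every category counts equally. This is not an official
    metric of the benchmark, only a convenience summary.
    """
    if len(scores) != len(CATEGORIES):
        raise ValueError("expected one score per category")
    return fmean(scores)


if __name__ == "__main__":
    # Print one summary line per model, in table order.
    for model, scores in ILIC_SCORES.items():
        print(f"{model}: {macro_average(scores):.2f}%")
```

Because the categories may differ in size and difficulty, an average like this is best read alongside the full table rather than in place of it.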