ITALIC
ITALIC is a benchmark developed by the University of Milano-Bicocca to evaluate how well language models can understand and follow instructions in Italian. It includes tasks like question answering, summarization, translation, and reasoning, with a focus on zero-shot and few-shot settings. The goal is to test models in realistic, instruction-based scenarios that reflect practical language use.
Italian Language Instructional Comprehension
Model Name | ORTHOGRAPHY | SYNTAX | LITERATURE | CIVIC EDUCATION | ART HISTORY | LEXICON | GEOGRAPHY | TOURISM | MORPHOLOGY | CURRENT EVENTS | SYNONYMS | HISTORY |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Vitruvian_Scientist-14B | 65.30% | 64.20% | 73.50% | 73.70% | 67.70% | 82.70% | 76.90% | 71.90% | 52.10% | 79.30% | 85.20% | 77.00% |
Vitruvian_Explainer-14B | 66.80% | 68.10% | 71.40% | 71.70% | 68.40% | 82.70% | 78.70% | 68.70% | 56.40% | 82.60% | 85.30% | 77.90% |
Vitruvian_Smart-12B | 63.70% | 65.20% | 74.40% | 77.00% | 72.20% | 84.40% | 81.60% | 72.90% | 52.90% | 80.40% | 85.50% | 77.70% |
LLamAntino-3-8B | 52.83% | 54.47% | 67.68% | 67.32% | 68.67% | 76.61% | 76.92% | 70.82% | 40.00% | 79.35% | 76.42% | 74.85% |
Llama-3.1-8b-Ita | 53.04% | 53.65% | 67.17% | 71.22% | 70.10% | 81.51% | 79.26% | 71.73% | 52.14% | 82.61% | 81.15% | 77.40% |
maestrale-chat-v0.4 | 54.17% | 55.40% | 70.93% | 70.20% | 67.35% | 78.65% | 76.61% | 69.08% | 44.29% | 80.43% | 70.34% | 66.22% |
Almavave-Velvet-14B | 44.08% | 45.63% | 69.21% | 72.56% | 67.86% | 77.32% | 76.81% | 71.43% | 42.86% | 83.70% | 70.03% | 66.40% |
iGenius-Italia-9b | 32.13% | 31.35% | 54.47% | 52.00% | 55.20% | 59.14% | 62.82% | 59.18% | 27.14% | 65.22% | 43.46% | 50.00% |
Fastweb-MIIA-7B | 40.31% | 44.60% | 62.70% | 59.92% | 57.96% | 63.13% | 65.58% | 57.55% | 29.29% | 71.74% | 52.68% | 55.34% |
Minerva-7B | 25.75% | 27.13% | 40.35% | 44.50% | 46.33% | 45.45% | 49.13% | 51.33% | 32.86% | 52.17% | 38.31% | 42.66% |
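
The table reports per-category accuracy only. For a single summary number per model, one option is an unweighted mean over the twelve categories. The sketch below is illustrative rather than an official ITALIC metric: the benchmark may weight categories by question count, and the helper names `parse_pct` and `macro_average` are hypothetical. The parser accepts either a comma or a dot as the decimal separator.

```python
def parse_pct(cell: str) -> float:
    """Convert a cell such as '65,30%' or '65.30%' to a float (65.3)."""
    return float(cell.strip().rstrip("%").replace(",", "."))


def macro_average(row: str) -> float:
    """Unweighted mean of the pipe-separated percentage cells in one row.

    Assumption: every category counts equally, which the source does not state.
    """
    cells = [c for c in row.split("|") if c.strip()]
    scores = [parse_pct(c) for c in cells]
    return sum(scores) / len(scores)


# Scores from the Minerva-7B row of the table (model name omitted).
minerva = (
    "25.75% | 27.13% | 40.35% | 44.50% | 46.33% | 45.45% | "
    "49.13% | 51.33% | 32.86% | 52.17% | 38.31% | 42.66% |"
)

print(f"Minerva-7B unweighted mean: {macro_average(minerva):.2f}%")
```

The same function can be applied to any other row, which makes it easy to rank models by this summary. A ranking obtained this way remains an approximation as long as the category sizes are unknown.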
Question Example