Webinar: Evaluating LLMs Across Languages: Values, Reasoning, and Hallucinations

24 February 2026, 12:00-13:00 CET

Annika Simonsen, Freja Thoresen, Hafsteinn Einarsson.

Annika Simonsen (University of Iceland), Freja Thoresen (Alexandra Institute), Hafsteinn Einarsson (University of Iceland)

What values drive LLMs? Can they navigate through mazes more easily in English than in Icelandic? How can we measure hallucinations? This TrustLLM webinar will be about evaluating LLMs across languages.

The presentation is split into three topics:

As LLMs enter high‑stakes applications, it becomes crucial to understand not just what they can do, but also which norms and values their answers implicitly rely on. Annika Simonsen (University of Iceland) will present the first topic of this TrustLLM webinar: ValEU, a European values benchmark for assessing how closely LLMs align with shared cultural values. ValEU offers a transparent way to probe the moral and societal assumptions embedded in LLMs across countries, topics, and demographic groups.

Large language models often struggle when reasoning outside of English. Hafsteinn Einarsson (University of Iceland) will present the second topic of this TrustLLM webinar: benchmarking model performance on math problems and maze navigation. Icelandic Math Eval contains math‑competition problems from 1984 to 2025, while MazeEval tests the navigation skills of LLMs across different languages.

Hallucinations are an inherent risk when working with large language models. Freja Thoresen (Alexandra Institute, Copenhagen) will present the third topic of this TrustLLM webinar: hallucinations. Hallucinations come in two main types: faithfulness and factuality. Freja will discuss strategies for detecting them, including token‑level classifiers and multilingual benchmark suites.

Please note that this webinar will be recorded!
