TrustLLM work packages

WP1: Project management

LiU, LSP, TNO

This work package is responsible for the strategic coordination and management of the project activities, ensuring that the project objectives and all contractual obligations are met. It will also ensure smooth internal and external communication, as well as financial follow-up and reporting to the EC. In addition to project management, the WP will lead data and IPR management activities.

WP2: Multilingual Dataset Creation

LSP, AXI, FHG, UOI, MID, FZJ, NTNU, TNO

This work package is responsible for collecting, processing, and providing large-scale datasets for model training. Its objectives are to provide:

1) a collection of crawled and compiled raw text data in the relevant Germanic languages,

2) a processing pipeline that includes formatting, quality filtering, and deduplication,

3) a pre-processed training dataset for building the LLMs.

This work package will also collaborate and exchange data and tools with other data-centric European initiatives such as HPLT.
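
As an illustration of the three pipeline stages named in this work package (formatting, quality filtering, deduplication), the following is a minimal Python sketch. It is not the project's pipeline: the function names, the word-count and character-ratio thresholds, and the choice of exact-match hashing for deduplication are all assumptions made for the example.

```python
import hashlib
import unicodedata


def normalize(text: str) -> str:
    """Formatting step: Unicode NFC normalization and whitespace collapsing."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def passes_quality(text: str, min_words: int = 5, min_alpha_ratio: float = 0.6) -> bool:
    """Quality filter. Both thresholds are illustrative assumptions."""
    if len(text.split()) < min_words:
        return False
    # Share of characters that are letters or spaces.
    alpha = sum(ch.isalpha() or ch.isspace() for ch in text)
    return alpha / len(text) >= min_alpha_ratio


def process(raw_docs):
    """Yield formatted, filtered, exact-deduplicated documents in input order."""
    seen = set()
    for raw in raw_docs:
        doc = normalize(raw)
        if not passes_quality(doc):
            continue
        # Exact deduplication on a case-folded hash (a simplifying assumption;
        # near-duplicate detection would need a different technique).
        digest = hashlib.sha256(doc.casefold().encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        yield doc
```

Calling `list(process(docs))` on an iterable of raw strings returns the surviving documents. Because `process` is a generator, it can also be applied to a stream of documents without loading the whole corpus into memory, although the set of seen hashes still grows with the number of unique documents.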

WP3: Development of Factual LLMs

FHG, LiU, MID, UCPH, TNO

This work package explores and develops methods that enhance the factual reliability of LLMs. We address this objective by:

1) developing retriever-based approaches that improve the factual correctness of LLMs,

2) developing time-aware language models,

3) improving the multi-step reasoning capabilities of LLMs.

The solutions will be provided as well-documented Python code and via APIs.

WP4: Multilingual LLM Training and Transfer Learning for Low-resource Languages

LiU, AXI, MID, LSP, UCPH, FZJ

This work package will develop methods for training and adapting LLMs for low-resource languages using cross-lingual transfer from typologically related languages.

WP5: Trustworthy LLM Tuning/Alignment

FHG, AXI, LiU, UOI, MID, FZJ

WP5 will develop and implement techniques that enhance the trustworthiness, alignment, and overall performance of the proposed model. Its objectives encompass creating multilingual datasets tailored for instruction fine-tuning and RLHF, streamlining the annotation process for RLHF data, establishing rules and principles for Constitutional AI, translating existing evaluation datasets into Germanic languages, and addressing grammatical correctness, bias, and data imbalance in pre-training.

By accomplishing these objectives, we aim to ensure the practical value and reliability of the model across a wide range of applications.

WP6: Efficient LLM Training and Usage

FZJ, UOI, LSP

This work package aims to provide the technical groundwork for large-scale training in the European HPC environment. On the one hand, this includes providing software frameworks and deploying them for efficient usage. On the other hand, it means contributing to technological progress that makes LLMs as efficient as possible in training and application. In this work package, we will develop algorithmic improvements that increase the model’s capabilities while reducing the computational footprint in both training and evaluation. Furthermore, we will study methods to increase the data efficiency of the training procedure, reducing the data requirements for high-quality models as far as possible. We will also study scaling laws and the scaling behaviour of the trained models. All efficiency considerations apply equally to computational cost, power consumption, and CO2 footprint.
This work package will connect the activities towards foundation models throughout the project and will coordinate the creation of D6.6, a final summary report of the LLMs released since the beginning of the project, including their capabilities, performance, and trustworthiness.

WP7: Multilingual and Multi-metric LLM Evaluation

AXI, FHG, LiU, LSP, UCPH, UOI, MID, NTNU

This work package will create a collection of evaluation objectives, together with corresponding benchmarking datasets and evaluation protocols, collectively called the Germanic Unified Language Benchmark (GULB). GULB will consist of two main categories: first, a measure of the intrinsic language modelling performance of the models; second, an evaluation of the truthfulness and bias of the model, as well as how well the model is aligned with European values. Besides the benchmarking datasets, this work package will also create an open-source modular evaluation framework that lets anybody use GULB and other benchmarks in an online environment.

WP8: Use Cases and Applications

TNO, MID, FHG, UOI, FZJ, LSP, NTNU

The overall objective of this work package is to illustrate the transferability and generalizability of LLMs and to demonstrate their applicability in different domains by designing, technically realizing, and showcasing applications. The LLMs' capabilities will be demonstrated and assessed (using the evaluation tools from WP7) with a focus on their factual, trustworthy, and bias-aware aspects.

WP9: Communication, Dissemination and Stakeholder Management

LiU, NTNU, LSP, UCPH, AXI, UOI, MID, FHG, FZJ, TNO, AKI

This WP will carry out strategic communication and dissemination, with particular attention to the project’s pathway to impact. It will craft a detailed plan for communication and dissemination and provide all necessary tools for these activities. It will address all foreseen target groups at the scientific, economic, and societal levels. It will establish several communication channels and enable a European ecosystem for LLMs and future Foundation Models (multimodal, multilingual) based on the actions described in the LEAM concept.

WP10: Ethics

LiU, all partners

This work package is responsible for setting up appropriate routines for ethical best practices in TrustLLM, as well as routines for monitoring compliance. We will also develop recommendations for managing potential conflicts of interest with respect to ethical aspects and strive to continuously improve awareness. The WP will define how the ALTAI assessment tool will be applied and will facilitate the formation and work of the External Ethics Advisory Board (EEAB).