Brussels / 3 & 4 February 2024


Building Open Source Language Models

LINAGORA, as a leader in the Open LLM France community, has made it a priority to pull the curtain off of the process of building Large Language Models (LLMs). While most LLMs in use today – even the “open” ones – reveal few to no details about their training, and especially the data on which they are trained, we have decided to share it all. In this talk, we discuss why using an open model trained on traceable data is important for business and research alike and examine some of the difficulties involved in pursuing an open strategy for LLMs. We bring to the table our experience with data collection and training of LLMs, including the Claire family of language models.

For links to the Claire models:

Dataset & Code:


Photo of Julie Hunter Julie Hunter