Workshop on Language Models
Introduction
In natural language processing, a language model is a statistical model of the distribution of sequences of words, or more generally of sequences of discrete symbols (letters, phonemes, words), in a natural language. A language model can, for example, predict the word that follows a given sequence of words. BERT and GPT-3 are language models. (Translated from the French Wikipedia definition.)
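The next-word prediction described above can be illustrated with the simplest kind of language model, a bigram model over a tiny toy corpus. This is a hedged sketch for illustration only (the corpus and function names are invented here); neural models such as BERT or GPT-3 work very differently, learning continuous representations from billions of words.

```python
from collections import Counter, defaultdict

# Toy corpus; a real language model is trained on billions of words.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count, for each word, which words follow it (bigram counts).
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Return the word most often observed after `word` in the corpus."""
    return bigrams[word].most_common(1)[0][0]

print(predict_next("sat"))  # "on" follows "sat" in both toy sentences
```

Even this crude counting model captures the core idea of the definition: assigning probabilities to word sequences and using them to predict what comes next.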
These models, now ubiquitous in Natural Language Processing, pose several problems: they incorporate many biases (representation biases linked to gender, age, or origin, for example); they require large amounts of data and are therefore only available in a few languages; and finally, they require intensive training and therefore have a very high environmental cost.
Program
Meeting of the “Ethics and Artificial Intelligence” group, held online on Wednesday, July 1, 2021:
- Benoît Sagot (INRIA): Neural language models: representativeness and representation bias (the presentation will be put online as soon as it is received)
- Karine Gentelet (UQO): Digital technology (and AI): a relevant tool in the political and identity strategies of the Indigenous Peoples of Canada
- Daniel Andler (IJN, IUF): Who Speaks?
The meeting took place online, on Zoom.