IT-University of Copenhagen

01/24/2024 | Press release | Distributed by Public on 01/24/2024 03:39

Grant: ITU researcher working for transparency in training data for AI

Assistant Professor at IT University of Copenhagen's Computer Science Department, Anna Rogers, has just secured a prestigious 940 000-euro Villum Young Investigator grant for a research project aimed at developing new methods for identifying references to language model training data.

Anna Rogers, Computer Science Department, Research, algorithms, artificial intelligence, grants

Written 24 January, 2024 09:26 by Theis Duelund Jensen

Last year, the Danish Language Council named ChatGPT the word that best characterised 2023. It can write speeches and party songs and adopt a Danish style and tone of voice, if you know how to communicate with it. ChatGPT belongs to the so-called generative form of artificial intelligence, which can produce new content from the data it has been fed. But it does not specify which sources it bases its answers on, which makes those answers difficult to verify.

With the project PlagAIrism: getting generative AI to provide references to its training data, which has just been awarded approximately 940 000 euro from the Villum Young Investigator programme, Anna Rogers, researcher at the IT University of Copenhagen, wants to develop new methods to identify and list the sources of the data that chatbots like ChatGPT have been trained on. The project will contribute to transparency, reliability and respect for copyright in the development of generative AI systems.

"This grant will allow me to start a new research lab, working on the core principles of NLP model interpretability via the lens of model training data. It will support a postdoc and two PhD students, as well as a new GPU server. In collaboration with Allen Institute for AI, Carnegie Mellon University, and Hugging Face, this project will develop new methods for identifying references to training data of language models. Potentially, such methods could change the conversation about how the creators of texts used for training could expect to be credited and compensated," says Anna Rogers.

This year the Villum Foundation received 98 applications for the programme. The 22 researchers who made the cut have been through a process of academic evaluation and interviews with the foundation's scientific committee, as well as final approval from the board of the foundation.

Further information

Theis Duelund Jensen, Press Officer, tel: 2555 0447, email: [email protected]