
We Must Fix the Lack of Transparency Around the Data Used to Train Foundation Models

Published on May 31, 2024

Abstract

Access to information about the data used to train foundation AI models is vital for many tasks. Despite progress made by parts of the AI community, there remains a general lack of transparency about the content and sources of training datasets. This has to change, whether through voluntary initiatives by firms or through regulatory intervention.

Keywords: artificial intelligence, machine learning, foundation models, training data, transparency, trust





©2023 Jack Hardinges, Elena Simperl, and Nigel Shadbolt. This article is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the article.
