Open-Source AI Definition Finally Gets Its First Release Candidate (zdnet.com)
(Wednesday October 09, 2024 @11:30PM (BeauHD)
from the so-far-so-good dept.)
- Reference: 0175222055
- News link: https://news.slashdot.org/story/24/10/09/2048207/open-source-ai-definition-finally-gets-its-first-release-candidate
- Source link: https://www.zdnet.com/article/open-source-ai-definition-finally-gets-its-first-release-candidate-and-a-compromise/
An anonymous reader quotes a report from ZDNet:
> Getting open-source and artificial intelligence (AI) on the same page isn't easy. Just ask the Open Source Initiative (OSI). The OSI, the open-source definition steward organization, has been working on creating an open-source artificial intelligence definition for two years now. The group has been making progress, though. Its Open Source AI Definition has [1]now released its first release candidate, RC1. The [2]latest definition aims to clarify the often contentious discussions surrounding open-source AI. It specifies four fundamental freedoms that an AI system must grant to be considered open source: the ability to use the system for any purpose without permission, to study how it works, to modify it for any purpose, and to share it with or without modifications. So far, so good.
>
> However, the OSI has opted for a compromise regarding training data. Recognizing it's not easy to share full datasets, the current definition requires "sufficiently detailed information about the data used to train the system" rather than the full dataset itself. This approach aims to balance transparency with practical and legal considerations. That last phrase is proving difficult for some people to swallow. From their perspective, if all the data isn't open, then AI large language models (LLM) based on such data can't be open-source. The OSI summarized these arguments as follows: "Some people believe that full, unfettered access to all training data (with no distinction of its kind) is paramount, arguing that anything less would compromise full reproducibility of AI systems, transparency, and security. This approach would relegate Open-Source AI to a niche of AI trainable only on open data."
The OSI acknowledges that the definition of open-source AI isn't final and may need significant rewrites, but the focus is now on fixing bugs and improving documentation. The final version of the Open Source AI Definition is scheduled for release at the [3]All Things Open conference on October 28, 2024.
[1] https://www.zdnet.com/article/open-source-ai-definition-finally-gets-its-first-release-candidate-and-a-compromise/
[2] https://opensource.org/blog/the-open-source-ai-definition-v-1-0-rc1-is-available-for-comments
[3] https://allthingsopen.org/
The courts need to revisit Fair Use Doctrine. (Score:2)
It seems like that could use some expansion for modern times.
*How much of someone else's work can I use without getting permission?
Under the fair use doctrine of the U.S. copyright statute, it is permissible to use limited portions of a work including quotes, for purposes such as commentary, criticism, news reporting, and scholarly reports. There are no legal rules permitting the use of a specific number of words, a certain number of musical notes, or percentage of a work. Whether a particular use qualifies