• Home
  • Blog
  • Android
  • Cars
  • Gadgets
  • Gaming
  • Internet
  • Mobile
  • Sci-Fi
Tech News, Magazine & Review WordPress Theme 2017
Blog - Creative Collaboration

The New York Times prohibits AI vendors from devouring its content

August 14, 2023

Benj Edwards / Getty Images

In early August, The New York Times updated its terms of service (TOS) to prohibit scraping its articles and images for AI training, reports Adweek. The move comes at a time when tech companies have continued to monetize AI language apps such as ChatGPT and Google Bard, which gained their capabilities through massive unauthorized scrapes of Internet data.

The new terms prohibit the use of Times content—which includes articles, videos, images, and metadata—for training any AI model without express written permission. In Section 2.1 of the TOS, the NYT says that its content is for the reader’s “personal, non-commercial use” and that non-commercial use does not include “the development of any software program, including, but not limited to, training a machine learning or artificial intelligence (AI) system.”

Further down, in Section 4.1, the terms say that without NYT’s prior written consent, no one may “use the Content for the development of any software program, including, but not limited to, training a machine learning or artificial intelligence (AI) system.”

NYT also outlines the consequences for ignoring the restrictions: “Engaging in a prohibited use of the Services may result in civil, criminal, and/or administrative penalties, fines, or sanctions against the user and those assisting the user.”

As threatening as that sounds, restrictive terms of use have not previously stopped the wholesale gobble of the Internet into machine learning data sets. Every large language model available today—including OpenAI’s GPT-4, Anthropic’s Claude 2, Meta’s Llama 2, and Google’s PaLM 2—has been trained on large data sets of materials scraped from the Internet. Using a process called unsupervised learning, the web data was fed into neural networks, allowing AI models to gain a conceptual sense of language by analyzing the relationships between words.

The controversial nature of using scraped data to train AI models, which has not been fully resolved in US courts, has led to at least one lawsuit that accuses OpenAI of plagiarism due to the practice. Last week, the Associated Press and several other news organizations published an open letter saying that “a legal framework must be developed to protect the content that powers AI applications,” among other concerns.

OpenAI likely anticipates continued legal challenges ahead and has begun making moves that may be designed to get ahead of some of this criticism. For example, OpenAI recently detailed a method that websites could use to block its AI-training web crawler using robots.txt. This led to several sites and authors publicly stating they would block the crawler.
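
As a rough illustration of how such a robots.txt block works, the sketch below uses Python's standard `urllib.robotparser` to check a sample rule set. It assumes the crawler identifies itself with the user-agent token `GPTBot`, which the article does not name; the sample rules, the `example.com` URL, the `SomeOtherBot` name, and the `is_allowed` helper are all hypothetical and exist only for the demonstration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one AI-training crawler, allow everyone else.
# "GPTBot" is assumed here to be the crawler's user-agent token.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URL, not a real article address.
ARTICLE_URL = "https://example.com/articles/some-story"

gptbot_allowed = is_allowed(ROBOTS_TXT, "GPTBot", ARTICLE_URL)
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", ARTICLE_URL)

print("GPTBot allowed:", gptbot_allowed)
print("SomeOtherBot allowed:", other_allowed)
```

Note that robots.txt is advisory: the file only states the site's wishes, and the block holds only if the crawler chooses to read and honor it.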

For now, what has already been scraped is baked into GPT-4, including New York Times content. We may have to wait until GPT-5 to see whether OpenAI or other AI vendors respect content owners’ wishes to be left out. If not, new AI lawsuits—or regulations—may be on the horizon.

    © CC Startup, Powered by Creative Collaboration. © 2020 Creative Collaboration, LLC. All Rights Reserved.
