Interesting times!
The potential suit would contend that OpenAI did not have permission to scrape the NYT's article database to train ChatGPT, and in doing so violated the NYT's terms of service.
From the Ars article (an Arsicle?): "Weeks after The New York Times updated its terms of service (TOS) to prohibit AI companies from scraping its articles and images to train AI models, it appears that the Times may be preparing to sue OpenAI. The result, experts speculate, could be devastating to OpenAI, including the destruction of ChatGPT's dataset and fines up to $150,000 per infringing piece of content."
And: "This speculation comes a month after Sarah Silverman joined other popular authors suing OpenAI over similar concerns, seeking to protect the copyright of their books."
But here's the biggie: "NPR reported that OpenAI risks a federal judge ordering ChatGPT's entire data set to be completely rebuilt—if the Times successfully proves the company copied its content illegally and the court restricts OpenAI training models to only include explicitly authorized data. OpenAI could face huge fines for each piece of infringing content, dealing OpenAI a massive financial blow just months after The Washington Post reported that ChatGPT has begun shedding users, "shaking faith in AI revolution." Beyond that, a legal victory could trigger an avalanche of similar claims from other rights holders.
"Unlike authors who appear most concerned about retaining the option to remove their books from OpenAI's training models, the Times has other concerns about AI tools like ChatGPT. NPR reported that a 'top concern' is that ChatGPT could use The Times' content to become a 'competitor' by 'creating text that answers questions based on the original reporting and writing of the paper's staff.'"
Fair use is quite an issue here. I quote news sites all the time, just like the excerpts above. I make no claim that it is my content: it is clearly delineated what is quoted from the article and what is my commentary or additional content. And I am in no way making any money from this. Things are a little different when you have AI/LLM systems hoovering up all the content they can find for training. The makers of those systems want to spend as little money as possible on training because their energy costs are absolutely huge! I posted an article a month or so ago about a new supercomputer running an AI system that will consume as much power as either 3,000 or 30,000 houses (I saw both numbers). If these guys can get training data for free, they'll go for it. But authors are pushing back: if people have to buy their books to read them (excluding libraries, where people can borrow for free), then why should AI companies get a free read?
If an art-generating AI wants to use my photos, I would like to be compensated! If you want to use one of my photos for a desktop wallpaper or screen saver, I'm honored. If you sell my photos for profit, then we have an issue! I've spent over four decades developing my craft and I'm pretty decent at it. I'd like some acknowledgement and compensation for it, rather than having it stolen for an AI system's use, as they've been doing.
https://arstechnica.com/tech-policy/2023/08/report-potential-nyt-lawsuit-could-force-openai-to-wipe-chatgpt-and-start-over/