OpenAI accidentally deleted potential evidence in NY Times copyright lawsuit (updated)

Lawyers for The New York Times and Daily News, which are suing OpenAI for allegedly scraping their works to train its AI models without permission, say OpenAI engineers accidentally deleted data potentially relevant to the case.

Earlier this fall, OpenAI agreed to provide two virtual machines so that counsel for The Times and Daily News could perform searches for their copyrighted content in its AI training sets. (Virtual machines are software-based computers that exist within another computer’s operating system, often used for the purposes of testing, backing up data, and running apps.) In a letter, attorneys for the publishers say that they and experts they hired have spent over 150 hours since November 1 searching OpenAI’s training data.

But on November 14, OpenAI engineers erased all the publishers’ search data stored on one of the virtual machines, according to the aforementioned letter, which was filed in the U.S. District Court for the Southern District of New York late Wednesday.

OpenAI tried to recover the data — and was mostly successful. However, because the folder structure and file names were “irretrievably” lost, the recovered data “cannot be used to determine where the news plaintiffs’ copied articles were used to build [OpenAI’s] models,” per the letter.

“News plaintiffs have been forced to recreate their work from scratch using significant person-hours and computer processing time,” counsel for The Times and Daily News wrote. “The news plaintiffs learned only yesterday that the recovered data is unusable and that an entire week’s worth of its experts’ and lawyers’ work must be re-done, which is why this supplemental letter is being filed today.”

The plaintiffs’ counsel makes clear that they have no reason to believe the deletion was intentional. But they do say the incident underscores that OpenAI “is in the best position to search its own datasets” for potentially infringing content using its own tools.

An OpenAI spokesperson declined to provide a statement.

But late Friday, November 22, counsel for OpenAI filed a response to the letter sent by lawyers for The Times and Daily News on Wednesday. In their response, OpenAI’s attorneys unequivocally denied that OpenAI deleted any evidence, and instead suggested that the plaintiffs were to blame for a system misconfiguration that led to a technical issue.

“Plaintiffs requested a configuration change to one of several machines that OpenAI has provided to search training datasets,” OpenAI’s counsel wrote. “Implementing plaintiffs’ requested change, however, resulted in removing the folder structure and some file names on one hard drive — a drive that was supposed to be used as a temporary cache … In any event, there is no reason to think that any files were actually lost.”

In this case and others, OpenAI has maintained that training models using publicly available data — including articles from The Times and Daily News — is fair use. In other words, in creating models like GPT-4o, which “learn” from billions of examples of e-books, essays, and more to generate human-sounding text, OpenAI believes that it isn’t required to license or otherwise pay for the examples — even if it makes money from those models.

That being said, OpenAI has inked licensing deals with a growing number of news publishers, including the Associated Press, Business Insider owner Axel Springer, Financial Times, People parent company Dotdash Meredith, and News Corp. OpenAI has declined to make the terms of these deals public, but one content partner, Dotdash, is reportedly being paid at least $16 million per year.

OpenAI has neither confirmed nor denied that it trained its AI systems on any specific copyrighted works without permission.

Update: Added OpenAI’s response to the allegations.

