Microsoft and OpenAI have allegedly trained their AI models, including ChatGPT, on the copyrighted works of nonfiction authors, according to a proposed class action reported by Reuters.
The lawsuit claims that OpenAI illegally copied tens of thousands of nonfiction books to train its large language models to respond to human text prompts.
The first to name Microsoft, OpenAI's biggest investor to date, as a defendant, the lawsuit claims that the company was "deeply involved" in the models' training and development and is therefore also accountable for copyright infringement.
Leading the lawsuit, which was reportedly filed in federal court in New York on Tuesday, is Julian Sancton, author of the narrative nonfiction book 'Madhouse at the End of the Earth' and a Hollywood Reporter editor.
"While OpenAI and Microsoft refuse to pay nonfiction authors, their AI platform is worth a fortune," claimed Sancton's attorney, Justin Nelson. "The basis of OpenAI is nothing less than the rampant theft of copyrighted works."
Microsoft's Involvement
According to a separate report by the Hollywood Reporter, the lawsuit emphasizes Microsoft's "key role" in providing "critical assistance" in the creation of unlicensed copies of authors' works to be used as training data and the commercialization of GPT-based technology.
"Microsoft's Azure provided the cloud computing systems that powered the training process, and continues to power OpenAI's operations to this day," according to the lawsuit. "Without these bespoke computing systems, OpenAI would not have been able to execute and profit from the mass copyright infringement alleged herein."
The report also stated that Microsoft's knowledge of OpenAI randomly crawling the internet for copyrighted material to train its models is one of the reasons it is named as a defendant in the newest lawsuit.
The suit claims that, based on an interview with Microsoft CEO Satya Nadella, it can be concluded that the company was closely involved in developing, maintaining, and supporting OpenAI's supercomputing system.
The lawsuit was reportedly referring to a CNBC interview in which the CEO stated, "beneath what OpenAI is putting out as large language models, remember, the heavy lifting was done by the Azure team to build the compute infrastructure."
According to Sancton, Microsoft should have been aware of its partner's "large scale copyright infringement," in breach of intellectual property rules, given that involvement and its decision to invest $13 billion in the company.
ChatGPT's Consequences for Authors
Sancton argues in the lawsuit that the alleged wrongdoing was "manifestly unfair use," since readers might, instead of buying his book, turn to ChatGPT's information on his work to learn from his writing style.
He also claims that OpenAI and Microsoft have denied writers possible licensing arrangements, citing agreements with content providers such as the Associated Press and other undisclosed organizations.
The lawsuit further emphasizes that, if not for the alleged infringement, blanket licensing procedures would be available through an institution such as the Copyright Clearance Center.
Due to active litigation, an OpenAI spokeswoman declined to comment on the Tuesday complaint. Microsoft representatives did not immediately reply to a request for comment.
The case is one of several brought against OpenAI and other tech companies by groups of copyright owners, including writers John Grisham, George R.R. Martin, and Jonathan Franzen, over the alleged exploitation of their work to train AI systems. The companies have disputed the allegations.