Wiley has partnered with AI research company Anthropic to introduce standards for how AI is integrated into research content. As part of the partnership, Wiley will adopt Anthropic’s Model Context Protocol, an open standard that enables integration of peer-reviewed content and AI tools across multiple platforms. The partnership also aims “to establish standards for how AI tools properly integrate scientific journal content into results while providing appropriate context for users, including author attribution and citations.” Josh Jarrett, senior vice president of AI Growth at Wiley, says in a release, “Through this partnership, Wiley is not only setting […]
AI
AI Translation Company GlobeScribe Launches
The founders of Bloodhound Books, who will leave the company in August, have launched GlobeScribe.ai, which translates fiction using AI. The service is aimed at independent publishers and authors, and founders Fred Freeman and Betsy Reavley plan to “break down the barriers of language, opening up access to global markets for books that may have otherwise never been translated due to cost, time, or demand limitations,” the Bookseller reports. According to the website, GlobeScribe is “tailored for fiction, designed to handle dialogue, rhythm and nuance — not just literal meaning.” Currently in beta, the service translates from English into Spanish, German, Italian, Portuguese, and […]
Authors Sue Microsoft Over LLM Copyright Infringement
A group of authors (Kai Bird, Jia Tolentino, Eloisa James, Hampton Sides, Victor LaValle, Mary Bly, Jonathan Alter, Eugene Linden, Daniel Okrent, Rachel Vail, and Simon Winchester) filed suit against Microsoft in New York’s Southern District, arguing that the tech company’s use of their books to train its Megatron LLM is copyright infringement. The plaintiffs argue that Microsoft knew it needed licenses to use books because it entered a licensing deal with HarperCollins last year. But they say that Microsoft used the Books3 pirate database of nearly 200,000 books, knowing that doing so was infringement. “The end result is a computer model […]
A Different California Judge Believes LLMs Are Likely Infringing Much of the Time, But Authors Made the Wrong Argument So Meta Case Is Dismissed
In another copyright infringement case brought in California’s Northern District, this time against Meta and filed by 13 prominent authors, Judge Vince Chhabria issued a surprising ruling. He suggested that “in cases involving uses like Meta’s, it seems like the plaintiffs will often win,” but in this particular case the plaintiffs made the wrong arguments and thus “the Court has no choice but to grant summary judgment to Meta on the plaintiffs’ claim that the company violated copyright law by training its models with their books.” That said, “In the grand scheme of things, the consequences of this ruling are limited. This […]
Anthropic’s Use of Legally Acquired Books “Was Exceedingly Transformative” And Fair Use; Stealing and Using 7 Million Pirated Books Was Not
AI company Anthropic won a limited victory in the copyright infringement case brought against it by authors Andrea Bartz, Charles Graeber and Kirk Wallace Johnson in the Northern District of California. Federal District Judge William Alsup granted Anthropic summary judgment for its use of copyrighted books to train its service Claude and its predecessors, finding it “was exceedingly transformative and was a fair use under Section 107 of the Copyright Act.” At the same time, however, Judge Alsup wrote that Anthropic’s downloading of millions of illegal, pirated copies of books for its central library and AI training was not remotely […]
Libraries Digitize Works for AI Training, Public Use
Harvard University and the Boston Public Library are releasing collections of digitized materials to tech companies for AI training, the AP reports. Harvard is providing nearly one million books in 254 languages from the past 600 years, in a dataset called Institutional Books 1.0. The BPL will give collections of old newspapers and government documents. “It is a prudent decision to start with public domain data because that’s less controversial right now than content that’s still under copyright,” Microsoft deputy general counsel Burton Davis said. The library material includes original sources, which are lacking in much of the AI companies’ […]