Recent news coverage and lawsuits have addressed the use of artificial intelligence in creative work: whether AI-produced work can be copyrighted (it can’t) and whether using books to train machine learning models is a copyright violation. Much of the conversation within the industry is focused on contract language, where agents and authors are hoping to limit the use of AI without permission and block any training on their material, while publishers are trying to retain flexibility for the future and avoid making promises they can’t fulfill. Quieter, perhaps, is how publishing companies are using the technology in their day-to-day operations, and […]
AI
Still Another Lawsuit: Authors Guild Sues OpenAI In New York
Following three similar suits filed in San Francisco (and thus in California’s sometimes unpredictable Ninth Circuit), on Tuesday the Authors Guild and a roster of well-known authors filed suit in New York’s Southern District against OpenAI for the “flagrant and harmful infringements of plaintiffs’ registered copyrights in written works of fiction” in training its large language models. The Guild and named plaintiffs David Baldacci, Mary Bly, Michael Connelly, Sylvia Day, Jonathan Franzen, John Grisham, Elin Hilderbrand, Christina Baker Kline, Maya Shanbhag Lang, Victor LaValle, George R.R. Martin, Jodi Picoult, Douglas Preston, Roxana Robinson, George Saunders, Scott Turow and Rachel Vail […]
New KDP Guidelines Require Participants to Acknowledge AI-Generated Content
Amazon recently revised its KDP publishing content guidelines to add a section that requires participants to indicate when they have used AI tools to create elements of any submitted book. The service distinguishes between “AI-assisted” work, which does not need to be disclosed, and “AI-generated content,” which does need to be acknowledged. The guidelines “define AI-generated content as text, images, or translations created by an AI-based tool. If you used an AI-based tool to create the actual content (whether text, images, or translations), it is considered ‘AI-generated,’ even if you applied substantial edits afterwards.” In contrast, AI-assisted work is […]
Your Books Trained Those Large Language Models
As is already being contested in a number of lawsuits seeking class action status, the core datasets on which all of the major large language models have been trained rely on stolen, copyrighted books. As a new Atlantic magazine article by Alex Reisner puts it, “Pirated books are being used as inputs for computer programs that are changing how we read, learn, and communicate. The future promised by AI is written with stolen words.” BookCorpus was stolen from Smashwords authors. Books3 is a body of between 150,000 and 190,000 books from established publishers and authors. The Atlantic piece extracts the […]
Online Book Analyzer Taken Down After Backlash
Literature analysis site Prosecraft shut down on Monday after many authors expressed concern online that their books were included without their consent. Prosecraft was a product of Shaxpir, a cloud-based word processor. According to a blog post, Prosecraft launched in 2017 after founder Benji Smith estimated the word count of his favorite books while writing his memoir. Prosecraft automated that process, analyzing text for word count and number of adverbs, as well as “vividness” and “passive voice,” and sharing sample pages of the latter. The venture used “techniques [that] were originally developed by computational linguists at the UVM Computational […]
Open Road Launches Metadata Service
Open Road is launching an enterprise SaaS program, offering metadata-as-a-service to all publishers on an annual pay-per-title basis. The initiative is designed to “continually optimize metadata and maximize title discovery” by using machine learning and proprietary data on Open Road’s consumer audiences “to automatically and continually optimize metadata.” (The service is a good example of how machine learning can sometimes provide an unambiguous benefit without encroaching on anyone’s rights.) The service is also being provided at no additional cost to publishers who participate in Open Road’s ebook backlist marketing service. But the SaaS offering gives Open Road a product that […]