A US district court judge has ruled that a lawsuit by authors accusing Nvidia of infringing their copyrights to train the chipmaker’s large language models can continue, while granting part of Nvidia’s motion to dismiss. The lawsuit hinges on three main claims: direct, contributory, and vicarious copyright infringement. Judge Jon S. Tigar agreed to let two of them play out in court. The authors assert that Nvidia trained the LLMs in its “Megatron family” on the dataset The Pile, which includes Books3, a collection of more than 196,000 pirated books. Nvidia claims that Megatron was trained on “portions of The […]