Dentons US LLP

10/02/2024 | News release | Distributed by Public on 10/02/2024 08:35

Copyright infringement by AI training

October 2, 2024

On 27 September 2024, the Hamburg Regional Court (LG Hamburg) delivered a significant ruling in case 310 O 227/23, regarding copyright infringement in connection with AI training datasets. The court ruled against the plaintiff, who claimed that the defendant had infringed his copyright by using a photo in a dataset for training generative artificial intelligence (AI).

Facts of the case

The plaintiff, who claimed to be the author of the photo in question, argued that the defendant's actions constituted an unlawful reproduction of the photo, violating his rights under § 16 UrhG. The plaintiff asserted that the reproduction was not covered by the exceptions in §§ 44a, 44b, and 60d UrhG. Specifically, the plaintiff argued that the reproduction was not a temporary act within the meaning of § 44a UrhG, did not qualify as text or data mining under § 44b UrhG and was not conducted for scientific research purposes within the meaning of § 60d UrhG.

Furthermore, the plaintiff referred to a disclaimer on the image agency's website, which stated: "RESTRICTIONS YOU MAY NOT: (...) Use automated programs, applets, bots or the like to access the [...] .com website or any content thereon for any purpose, including, by way of example only, downloading [c]ontent, indexing, scraping or caching any content on the website."

The defendant, a nonprofit organization that created and made publicly available a dataset of 5.85 billion image-text pairs for AI training, argued that the reproduction was covered by the statutory exceptions for temporary acts of reproduction (§ 44a UrhG), text and data mining (§ 44b UrhG) and scientific research (§ 60d UrhG). The defendant also argued that the photo was freely accessible on the internet and that such reproduction was common practice in AI training.

Court's findings

The court found that the reproduction was neither temporary nor incidental, as required by § 44a UrhG, so that exception did not apply. The reproduction could in principle fall under the text and data mining exception in § 44b UrhG, so the court examined whether the disclaimer was machine-readable, as required by § 44b para. 3 sentence 2 UrhG. The court interpreted "machine-readable" as "machine-understandable," meaning that the reservation must be automatically processable by software. On this reading, a usage reservation written in natural language can also be machine-understandable, especially since modern technologies such as AI applications can comprehend natural-language text. The court also found that the reservation was sufficiently clear and met the requirements for an explicit declaration under Art. 4 para. 3 DSM Directive. Because the reservation on the image agency's website was effective and machine-readable, the defendant's reproduction of the photo was not covered by the text and data mining exception in § 44b UrhG.
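To make the idea of a natural-language reservation being "automatically processable by software" more concrete, the following is a minimal sketch. It is not the method used by the court or by either party. It scans a terms-of-use text for phrases that commonly signal a ban on automated access. The function name `find_reservation_phrases`, the keyword list and the sample text (an abbreviated paraphrase of the clause quoted above) are all assumptions made for illustration, and a simple keyword match is far cruder than the AI-based text comprehension the court had in mind.

```python
import re

# Hypothetical keyword patterns, chosen for illustration only.
RESERVATION_PATTERNS = [
    r"\bautomated programs?\b",
    r"\bbots?\b",
    r"\bscraping\b",
    r"\bindexing\b",
    r"\bcaching\b",
]


def find_reservation_phrases(terms_text: str) -> list[str]:
    """Return the patterns that match the given terms-of-use text."""
    lowered = terms_text.lower()
    return [p for p in RESERVATION_PATTERNS if re.search(p, lowered)]


# Abbreviated paraphrase of a restriction clause (assumption, not a quote).
sample_terms = (
    "RESTRICTIONS YOU MAY NOT: Use automated programs, applets, bots or "
    "the like to access the website or any content thereon for any purpose, "
    "including downloading content, indexing, scraping or caching any "
    "content on the website."
)

matches = find_reservation_phrases(sample_terms)
if matches:
    print("Possible usage reservation found; flag for review before mining.")
```

A hit from such a filter would only be a signal that the terms deserve closer review. It does not establish whether a reservation is legally effective under § 44b para. 3 UrhG.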

Since the exception under § 44b UrhG did not apply due to the disclaimer, the court examined whether the reproduction was covered by the exception under § 60d UrhG, which permits reproduction for scientific research by research organizations. The court concluded that it was, because the defendant's activities were considered non-commercial scientific research and the dataset was made publicly available free of charge.

In summary, the effective and machine-readable disclaimer on the image agency's website played a crucial role in excluding the application of the exception for text and data mining and influenced the court's decision in favor of applying the exception for scientific research under § 60d UrhG.

Practical advice

Given the remaining uncertainties, companies should adopt a cautious approach. It is advisable to:

  • Ensure that internal policies on data usage and copyright compliance are up to date and reflect the latest legal developments.
  • Formulate clear and explicit disclaimers regarding the use of copyrighted material and ensure that these are machine-readable where possible.
  • Develop and implement robust protocols for the deletion of data after analysis to avoid potential legal pitfalls.
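One widely used way to express an access restriction in a form that crawlers can evaluate automatically is a robots.txt file. The sketch below uses Python's standard `urllib.robotparser` module to show how a crawler could check such a file before downloading an image. The crawler name `ExampleAIBot`, the example.com URLs and the helper `may_fetch` are hypothetical. Whether a robots.txt entry satisfies the machine-readability requirement of § 44b para. 3 UrhG is not addressed in the ruling summarized above, so this is an illustration of the technical mechanism only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: one named crawler is barred from the
# image directory, while all other crawlers are left unrestricted.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /images/

User-agent: *
Disallow:
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether the given crawler may fetch the URL under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


photo_url = "https://example.com/images/photo.jpg"
print("ExampleAIBot allowed:", may_fetch(ROBOTS_TXT, "ExampleAIBot", photo_url))
print("OtherBot allowed:", may_fetch(ROBOTS_TXT, "OtherBot", photo_url))
```

A site operator relying on this mechanism would still want a clear natural-language clause in the terms of use, since the two forms of reservation can complement each other.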