Meta and Torrents: True, False, or Rationalization?

February 26, 2025

AIs gobble datasets for training. It is another fact that many LLMs and datasets contain biased information, are incomplete, or plain stink. One ethical but cumbersome way to train algorithms would be to notify people that their data, creative content, or other information will be used to train AI. Offering to pay for the right to use the data would be a useful step some argue.

Will this happen? Obviously not.

Why?

Because it’s sometimes easier to take instead of asking. According to Toms Hardware, “Meta Staff Torrented Nearly 82TB Of Pirated Books For AI Training-Court Records Reveal Copyright Violations.” The article explains that Meta pirated 81.7 TB of books from the shadow libraries Anna’s Archive, Z-Library, and LibGen. These books were then used to train AI models. Meta is now facing a class action lawsuit about using content from the shadow libraries.

The allegations arise from Meta employees’ written communications. Some of these messages provide insight into employees’ concern about tapping pirated materials. The employees were getting frown lines, but then some staffers’ views rotated when they concluded smart software helped people access information.

Here’s a passage from the cited article I found interesting:

“Then, in January 2023, Mark Zuckerberg himself attended a meeting where he said, “We need to move this stuff forward… we need to find a way to unblock all this.” Some three months later, a Meta employee sent a message to another one saying they were concerned about Meta IP addresses being used “to load through pirate content.” They also added, “torrenting from a corporate laptop doesn’t feel right,” followed by laughing out loud emoji. Aside from those messages, documents also revealed that the company took steps so that its infrastructure wasn’t used in these downloading and seeding operations so that the activity wouldn’t be traced back to Meta. The court documents say that this constitutes evidence of Meta’s unlawful activity, which seems like it’s taking deliberate steps to circumvent copyright laws.”

If true, the approach smacks of that suave Silicon Valley style. If false, my faith in a yacht owner with gold chains might be restored.

Whitney Grace, February 26, 2025

Comments

Got something to say?





  • Archives

  • Recent Posts

  • Meta