“Even easy things are hard.” | The Verge

Posted Jun 12, 2025 at 1:10 PM UTC

Astute AI copyright observer Michael Weinberg raises some good questions about the Common Pile, an AI training dataset billed as being composed of only “openly licensed text”:

On one hand, this is an interesting effort to build a new type of training dataset that illustrates how even the “easy” parts of this process are actually hard. On the other hand, I worry that some people read “openly licensed training dataset” as the equivalent of (or very close to) “LLM free of copyright issues.”

Does an AI Dataset of Openly Licensed Works Matter?

[michaelweinberg.org]

Adi Robertson
