> License what? Every available copyrighted work? Even getting a tiny fraction i...

dijksterhuis · on Feb 12, 2025

> the basis that figuring out what IP went into what model output is just too hard, so instead they just agree to distribute it to whomever is on the New York Times best seller list at any given moment.

the long tail exists, and there will always be a threshold for payments due to rights holders.

it used to be (like 10 years ago so i might not remember the details exactly) that if you earned less than £1 from youtube performing music rights in a quarter then any money you earned was put back into the pot and redistributed to those earning over £1.

it just wasn’t worth the cost to keep track of £0.00001 earnings for all the rights holders in the bottom of the long tail each quarter, or to pay the bank fees when they eventually earn £0.01 that can be paid to them.
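a rough sketch of that threshold rule, for illustration only. the £1 cutoff is from the description above; the function name, the use of `Decimal`, and the pro rata split of the pot are assumptions, since how the pot was actually divided isn't stated.

```python
from decimal import Decimal


def redistribute_below_threshold(earnings, threshold=Decimal("1.00")):
    """Fold sub-threshold earnings back into the pot and share the pot
    among rights holders at or above the threshold.

    ASSUMPTION: the pot is shared pro rata to each payee's own earnings.
    The original scheme's exact split is not known.
    """
    # Rights holders who clear the threshold get paid this quarter.
    paid = {name: amount for name, amount in earnings.items() if amount >= threshold}
    # Everything under the threshold goes back into the pot.
    pot = sum(
        (amount for amount in earnings.values() if amount < threshold),
        Decimal("0"),
    )
    total_paid = sum(paid.values(), Decimal("0"))
    if total_paid == 0:
        # Nobody cleared the threshold, so there is nobody to pay out.
        return {}
    return {name: amount + pot * amount / total_paid for name, amount in paid.items()}


quarter = {"a": Decimal("3.00"), "b": Decimal("1.00"), "c": Decimal("0.40")}
payouts = redistribute_below_threshold(quarter)
```

here `c` falls under the threshold, so its £0.40 is shared between `a` and `b` in proportion to what they already earned.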

definitely not perfect, but at least some people were getting paid, instead of none.

also, the data youtube gave us was fairly shit (video title, url), so that didn’t help. nor did the lack of compute/data processing infrastructure/skills. it was historically a manual spreadsheet job trying to work out who to cut.

i had to do it a few times :/

edit —

> The biggest AI companies could even run the enforcement cartels ala BMI/ASCAP to compute and collect royalties owed.

what could happen, for music at least, is the same thing that happened with youtube, mashed up with live music analogies.

a licensing negotiation with BMI/ASCAP/PRS, and maybe major publishers directly if they get frustrated with the PROs. then PROs will use sampling of other revenue streams to work out what the likely popular things are for AI. then divvy up whatever the lump sum is between the most popular songs.
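as a minimal sketch of that kind of sampled split: take a sample of plays from some other revenue stream, keep the most popular songs, and divide the lump sum between them. the function name, the `top_n` cutoff and the proportional weighting are all assumptions for illustration, not how any PRO actually does it.

```python
from collections import Counter


def split_lump_sum(sampled_plays, lump_sum, top_n=3):
    """Divide a negotiated lump sum between the most-sampled songs.

    ASSUMPTIONS: popularity is measured by raw counts in the sample,
    only the top_n songs are paid, and the split is proportional to
    each song's share of those counts.
    """
    top = Counter(sampled_plays).most_common(top_n)
    total = sum(count for _, count in top)
    if total == 0:
        # Empty sample: nothing to base a distribution on.
        return {}
    return {song: lump_sum * count / total for song, count in top}


# Hypothetical sample drawn from a proxy revenue stream (e.g. radio logs).
sample = ["song_a"] * 6 + ["song_b"] * 2 + ["song_c"]
shares = split_lump_sum(sample, lump_sum=1000, top_n=2)
```

anything outside the top of the sample gets nothing, which is the long tail problem again.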

we used to do this for live music. i had to generate the sampled dataset in microsoft access each year and weed out all the radio stings.

sorry for costing you a million pounds that one year ed sheeran :/

DrillShopper · on Feb 12, 2025

> figuring out what IP went into what model output is just too hard

Check out this one cool trick companies found for skirting copyright restrictions.

Lawyers HATE them!