Anna’s Archive Backed Up Spotify, Plans to Release 300TB Music Archive

Source: TorrentFreak

Article note: The mix of "That's how Spotify itself started" and "I'm told massive copyright infringement is OK as long as it's to train my competing simulacra generator" responses (and my general ire for the state of copyright) makes this entirely deserving of the celebratory jokes it's receiving.

Anna’s Archive is generally known as a meta-search engine for shadow libraries, helping users find pirated books and other related resources.

However, its archival ambitions don’t stop at text. This weekend, the site announced that it had successfully backed up Spotify, which must come as a shock to the music industry.

“A while ago, we discovered a way to scrape Spotify at scale. We saw a role for us here to build a music archive primarily aimed at preservation,” Anna’s Archive volunteer “ez” writes.

The site acknowledges that there have been many successful music preservation initiatives, particularly among torrenting audiophiles at dedicated private trackers. However, a dedicated preservation archive for music is not generally available, at least not yet.

300TB of Music

With its latest scraping effort, Anna’s Archive aims to fill this gap. While Spotify doesn’t have all the music in the world, the streaming service does have an impressive 256 million tracks from more than 15 million artists, spanning 58 million albums.

Anna’s Archive says it has archived roughly 86 million music files, almost 300 TB in total. Relatively popular songs are stored in their original 160kbit/s OGG Vorbis quality, while the rest use 75kbit/s to save hundreds of terabytes of storage. Altogether, these tracks represent 99.6% of all Spotify listens.
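As a rough sanity check on those figures, the average file size can be worked out from the two totals. This is only a back-of-envelope sketch: it assumes decimal units (1 TB = 10^12 bytes) and treats "almost 300 TB" as exactly 300 TB, so the results are upper bounds. The function names are illustrative and do not come from Anna’s Archive.

```python
# Back-of-envelope figures taken from the article.
# Assumption: decimal units, 1 TB = 10**12 bytes and 1 MB = 10**6 bytes.
TOTAL_BYTES = 300 * 10**12   # "almost 300 TB", so treat this as an upper bound
FILE_COUNT = 86_000_000      # "roughly 86 million music files"


def average_file_size_mb(total_bytes: int, file_count: int) -> float:
    """Mean size per file in megabytes."""
    return total_bytes / file_count / 10**6


def seconds_at_bitrate(size_bytes: float, kbit_per_s: float) -> float:
    """Playback time a file of this size would hold at a constant bitrate."""
    return size_bytes * 8 / (kbit_per_s * 1000)


avg_mb = average_file_size_mb(TOTAL_BYTES, FILE_COUNT)
avg_seconds = seconds_at_bitrate(avg_mb * 10**6, 160)

print(f"average file size: {avg_mb:.2f} MB")
print(f"equivalent playtime at 160 kbit/s: {avg_seconds:.0f} s")
```

Under these assumptions the average comes to about 3.5 MB per file, which is just under three minutes of audio at 160kbit/s. Since part of the collection is stored at 75kbit/s, the real mix of sizes and durations will differ.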

This music heist will be shared in a single torrent file. Unlike books, these tracks will not be available as individual downloads, although that could change if there’s enough interest.

At the time of writing, no music has been released. The first torrent focuses on metadata instead, releasing 199.9GB of compressed artist, album, and track metadata in one go. The next stage will include music files.

For now, the metadata release is being shared by more than 200 people, which means that there is plenty of interest. And we suspect that this will pick up further when the music archives are released.

That said, seeding 300TB will be a significant challenge, as most people don’t have 300TB of free storage space. Therefore, it makes sense that these music archives will be released in batches.

The AI Angle

The metadata is a goldmine for archivists and audio researchers. In a blog post, Anna’s Archive shares a series of charts and graphs comparing key statistics, such as the top music genres by artist count or the distribution of tracks by duration.

The massive data repositories, including the music itself, will also be very appealing to tech companies developing AI models. However, after many U.S. tech giants were sued for actively sharing Anna’s Archive’s text data, they will likely be wary of crossing this line again.

Of course, foreign AI companies may have fewer reservations. In fact, Anna’s Archive already offers high-speed access to its data for groups training Large Language Models (LLMs) in exchange for donations.

Spotify Responds

Spotify, meanwhile, is aware of the reported breach and has launched an investigation to find out how it was possible.

“An investigation into unauthorized access identified that a third party scraped public metadata and used illicit tactics to circumvent DRM to access some of the platform’s audio files. We are actively investigating and mitigating the incident,” the company told Billboard.

Anna’s Archive volunteer ‘ez’, meanwhile, stresses that they are ‘merely’ trying to safeguard musical heritage with this scraping effort.

“With your help, humanity’s musical heritage will be forever protected from destruction by natural disasters, wars, budget cuts, and other catastrophes,” ‘ez’ notes.
