Hacktivists scrape 86M Spotify tracks, claim their aim is to preserve culture
- Reference: 1766424289
- News link: https://www.theregister.co.uk/2025/12/22/hacktivists_scrape_songs_spotify/
The scraping appears to have been carried out by people associated with Anna's Archive, a shadow-library site that focuses on preserving media - traditionally books and academic papers - by aggregating metadata and distributing large datasets rather than directly hosting copyrighted works. In practice, Anna's Archive functions more like a metadata search engine, allowing users to find the content they want and connecting them with downloads, usually via torrent, from other sources to reduce legal liability.
In a Saturday [1]blog post, the group said it couldn't pass up an opportunity "outside of text" to scrape Spotify at scale, claiming to have archived roughly 86 million music files, which it says account for about 99.6 percent of listens on the platform.
"A while ago, we discovered a way to scrape Spotify at scale," Anna's Archive said. "We saw a role for us here to build a music archive primarily aimed at preservation."
Anna's Archive justified its Spotify scraping by describing it as a "humble attempt to start a 'preservation archive' for music" in order to protect "humanity's musical heritage" from "destruction by natural disasters, wars, budget cuts, and other catastrophes."
In particular, the Anna's Archive team said that it wants to get around some of the most common problems in other music preservation initiatives, namely an "over-focus on the most popular artists," an "over-focus on the highest possible quality" and a lack of authoritative torrent lists representing "all music ever produced."
Those noble claims fall apart quickly upon further reading of Anna's Archive's blog post, though.
While 300 TB comprising roughly 86 million music files, which the group claims represent about 99.6 percent of Spotify’s listens, is a vast amount of audio, it falls well short of the platform’s full catalog. Anna’s Archive says Spotify contains around 256 million tracks in total, meaning the audio files it archived cover only about a third of the catalog, with the remaining tracks represented only in metadata rather than preserved as music files.
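The "about a third" figure can be checked with quick arithmetic. The sketch below uses only the approximate numbers quoted above (86 million files, 256 million tracks, 300 TB); the function names are illustrative, and the average file size assumes decimal terabytes, which the blog post may or may not intend.

```python
# Figures as quoted in the article; all are approximate claims by Anna's Archive.
ARCHIVED_FILES = 86_000_000      # music files reportedly archived
CATALOG_TRACKS = 256_000_000     # tracks Spotify is said to contain in total
ARCHIVE_BYTES = 300 * 10**12     # 300 TB, assuming decimal terabytes


def catalog_share(archived: int, total: int) -> float:
    """Fraction of the catalog covered by archived audio files."""
    return archived / total


def mean_file_size_mb(total_bytes: int, files: int) -> float:
    """Average size per archived file, in decimal megabytes."""
    return total_bytes / files / 10**6


share = catalog_share(ARCHIVED_FILES, CATALOG_TRACKS)
avg_mb = mean_file_size_mb(ARCHIVE_BYTES, ARCHIVED_FILES)

print(f"Share of catalog archived as audio: {share:.1%}")
print(f"Average size per archived file: {avg_mb:.1f} MB")
```

On those figures the audio covers roughly 33.6 percent of the catalog, at an average of about 3.5 MB per file.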
By not bothering with all the musical chaff in Spotify's catalog, the Anna's Archive team is apparently content to let those less popular songs languish despite their claim to want to avoid focusing on just the most popular artists.
It's not clear how the Archive intends to break up the 300 TB worth of music into torrent files, or if it intends to release one massive file (we reached out to the team but didn't hear back), but the blog notes that it's only going to be making it available via "a torrents-only archive aimed at preservation" that "can easily be mirrored by anyone with enough disk space."
In short, a preservation-only goal is at odds with letting users simply download the individual Spotify tracks they want, as that would just be plain old piracy.
As with its broader preservation rhetoric, Anna's Archive's claim of benevolence in releasing the collection strictly for archival purposes is undercut later in the blog post.
"If there is enough interest, we could add downloading of individual files to Anna's Archive," volunteer member "ez" wrote on the blog. "Please let us know if you'd like this."
[7]Anti-piracy messaging may just encourage more piracy
[8]AI slop hits new high as fake country artist goes to #1 on Billboard digital songs chart
[9]Denmark takes a Viking swing at VPN-enabled piracy
[10]Lawyer's 6-year-old son uses AI to build copyright infringement generator
If Anna's Archive intended to return to Spotify's servers to scrape the remaining songs, it may be too late, according to a Spotify spokesperson's comment to The Register.
"Spotify has identified and disabled the nefarious user accounts that engaged in unlawful scraping," the company told us. "We've implemented new safeguards for these types of anti-copyright attacks and are actively monitoring for suspicious behavior."
No mention was made by either Spotify or Anna's Archive of how the scrapers managed to bypass Spotify's digital rights management software.
While Spotify didn't respond to questions about Anna's Archive's supposed preservation motivations, the company did note that it views the theft of tens of millions of pieces of intellectual property from its servers as a simple act of piracy, regardless of whether Spotify itself is a bit of an IP pirate that [11]doesn't fairly pay its artists.
"Since day one, we have stood with the artist community against piracy, and we are actively working with our industry partners to protect creators and defend their rights," Spotify said.
For now, metadata covering nearly all of Spotify's roughly 256 million tracks is available to download from Anna's Archive. The music files themselves aren't out yet, but the Archive claims that it's planning to release them - in order of popularity - sometime in the future. ®
[1] https://annas-archive.li/blog/backing-up-spotify.html
[7] https://www.theregister.com/2022/08/02/antipiracy_messaging_piracy/
[8] https://www.theregister.com/2025/11/10/ai_country_artist_hits_number_one/
[9] https://www.theregister.com/2025/12/15/denmark_vpn_ban/
[10] https://www.theregister.com/2025/12/03/ai_has_made_ip_violations/
[11] https://www.pbs.org/newshour/show/musicians-push-back-on-dwindling-payments-from-streaming-services
I'm only downloading to train my AI
Sometimes, if I find an artist I don't like, I download one of their tracks and have it play on repeat overnight while I sleep, costing that artist untold millions of dollars in lost plays. I've stolen billions of dollars doing this, and I have no intention of ever stopping.
You are in grave danger.
Perhaps you have heard of the learning technique which involves being subjected to taped audio instruction whilst asleep? According to your remarks, you have exposed yourself to countless examples of inanity. That's in part from the trite characteristics of 'backing' music, but mainly from repetitions of so-called 'lyrics'; it's a mystery to me where the connection lies between 'pop' singers - many of whom make a virtue from their inability reliably to hit a note - and the notion of the lyrical.
Although your intention to deny income to would-be artistes (aka 'artists'), and even larger sums of money from the owners of the performing monkeys, is admirable, you should look to your own mental health.
That 300TB is all Never Gonna ...
The ultimate Rick Roll.
Spotify?
Who cares? It's Spotify. Who are well known for cheating musicians.
Re: Spotify?
ah, not only musicians. There is a certain blogger who also does some audio recordings of her works. Her works have been on Spotify, but she has not agreed to Spotify distributing them. Essentially, they stole from her (Girl on the Net, if you must know). I think Spotify does have some contracts with some labels, but they pay a pittance to the musicians (and sometimes, as noted, they just steal - pot, kettle, black).
It's a plebeian subset of music culture
Word usage in the article suggests the author and the spokesmen for Anna's Archive are focused upon the 'popular' music which is Spotify's staple (and profitable) fare.
There is talk about 'songs', this apparently meaning individual 'works'. There are 'tracks'; generally in the 'pop' world a 'track' (originating from a throwback to the organisation of material on a vinyl record) contains one 'song', which is equivalent to the capacity of 'one side' of a standard vinyl 'pop single'. Mention of 'artists' rather than artistes.
When referring to music of lesser vapidity, one might categorise it by composer and performers. A broader consideration is archiving differing interpretations of a composer's works by varying combinations of performers. This latter almost cannot occur with 'pop' music for two reasons.
First, the transient nature of the 'pop scene' makes it unlikely other performers would want to disinter previously extant works.
Second, infernal copyright is deployed differingly between the genres. For substantial works, the musical score, and sung words, are protected; that including works long out of copyright by the expedient of claiming 'rights' over the typefaces of notation, layout, added information, and so forth; in addition broadcast/recorded performances are protected.
The bulk of revenue from 'pop' music comes from recorded performances. Live performance can be highly profitable, but serves primarily as a marketing ploy. For deeper genres, live performance is the essence. The spontaneity of performances by the same artistes on differing occasions, and of varying artistes (and combinations, as with a symphony orchestra) each offering their own interpretations, is the heart of the matter. Recordings could be said to help individual artistes and ensembles market their live performances. In this context, curated collections of differing performances of the same work are invaluable for enabling people unable frequently to attend concerts to sample spontaneity, despite most recordings being 'touched up' during (e.g. retakes of sections) or after (when processed into the master copy).
The Spotify collection, I believe, is encoded as MP3. That's adequate for popular music, which tends to lack nuance with respect to instrumental complexity and to dynamic range. Aficionados of other musical types are better served from catalogues containing multiple interpretations and high technical recording quality available at the time of performance. Of the various BitTorrent catalogues I have come across, RuTRacker stands out as best with regard to non-trivial music 'content', with respect to varying interpretation, and for offering, when available, choice among recording technologies.
Re: It's a plebeian subset of music culture
Artistes like clannad, fainne lasta, you like those artistes? Their music? Of course you do.
Re: It's a plebeian subset of music culture
I am not familiar with them, but they appear to be folk musicians. An acceptable genre, so long as not 'owned' by record 'labels'.
A Christmas present?
Please, may I have more down-votes? Make my day.
Artists
Most artists have to scrape by.