How I Curated the Music Pool: From 500 Hours of Archive.org to 60 Tracks

Before the migration to Epidemic Sound earlier this year, the entire music pool for the 24/7 stream was Creative Commons material curated from Archive.org and a handful of smaller open-license archives. The pool peaked at around sixty-four tracks. To get to those sixty-four, I listened to roughly five hundred hours of source material across about eight months. Most of it did not survive the cut. This essay is about how the cut worked, what I learned about what does and does not function as background music, and which artists deserve every play they get.

I am writing this in part because the curation process turned out to be more interesting than I expected, and in part because the artists whose work carried the stream for months deserve named credit somewhere on this site — even now that the live pool has moved to a commercial license.

The starting funnel

Archive.org’s music collections are among the largest publicly accessible repositories of legally redistributable music on the internet. The Internet Archive’s Live Music Archive, the Netlabels collection, and the various creator-owned uploads, together with the separate Free Music Archive (FMA), cumulatively contain hundreds of thousands of tracks. The legal status varies: some tracks are Public Domain or CC0 (no rights reserved), some are CC-BY (attribution required), and some are CC-BY-NC (non-commercial use only) or CC-BY-SA (derivatives must carry the same license). For a commercial-adjacent project like a streaming channel with ads, only CC0 and CC-BY tracks are reliably safe to use, and even within those there are edge cases (some artists later removed their CC license, some files were mislabelled by uploaders).

My funnel looked like this:

Start with all CC0 and CC-BY tracks from a few known-good uploaders (Monplaisir, Komiku, Loyalty Freak Music, Scott Buckley, and a long tail of smaller artists who had uploaded under matching licenses).
Discard anything with vocals in any language. Background music needs to be instrumental; lyrics compete for attention with whatever the listener is reading or writing.
Discard anything tagged as “battle,” “boss,” “epic,” “trailer,” “action,” or any other label indicating high energy. Lofi-adjacent ambient is the target; orchestral chase music is not.
Discard anything with a tempo over approximately 110 BPM. Faster tracks read as foreground content rather than background.
Discard anything with abrupt dynamic shifts. Music that goes from quiet to loud is great for cinema; it is awful for a study soundtrack that has to sit at the same perceived volume all day.
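
As a rough illustration, the funnel steps above can be written as a single filter function. This is a minimal sketch, not necessarily the tooling used for the stream: the Track fields, the has_vocals and abrupt_dynamics flags (both assumed to be set by ear during listening), and the exact tag list are assumptions made for the example.

```python
from dataclasses import dataclass, field

# Assumed constants, taken from the funnel description above.
ALLOWED_LICENSES = {"CC0", "CC-BY"}
HIGH_ENERGY_TAGS = {"battle", "boss", "epic", "trailer", "action"}
MAX_BPM = 110


@dataclass
class Track:
    """Hypothetical record for one candidate track."""
    title: str
    artist: str
    license: str
    bpm: float
    has_vocals: bool
    tags: set[str] = field(default_factory=set)
    abrupt_dynamics: bool = False  # assumed to be judged by ear


def passes_funnel(track: Track) -> bool:
    """Return True only if the track survives every funnel step."""
    if track.license not in ALLOWED_LICENSES:
        return False
    if track.has_vocals:
        return False
    # Compare tags case-insensitively against the high-energy list.
    if {tag.lower() for tag in track.tags} & HIGH_ENERGY_TAGS:
        return False
    if track.bpm > MAX_BPM:
        return False
    if track.abrupt_dynamics:
        return False
    return True
```

The steps run cheapest-first: license and tag checks need only metadata, while the vocals and dynamics flags require someone to have actually listened to the track.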

The funnel cuts a vast amount of material. From the roughly five hundred hours that survived the initial keyword filter, perhaps eighty hours survived the manual listening pass. From those eighty hours, the final pool of sixty-four tracks was what passed the “could I leave this on for six hours of work without it once becoming distracting” test, which is the only test that actually matters.

The artists who carried the pool

A handful of artists ended up over-represented in the final cut, not because I was lazy in curation but because they happened to produce exactly the right kind of music for this use case. They deserve naming.

Monplaisir is the alias of a French composer who uploaded several albums of slow, melodic, gently-melancholy piano and acoustic instrumental pieces to the Free Music Archive under CC0. The release titled Chill and ambient songs from Monplaisir & cie’s projects under Creative Commons 0 and Public Domain contributed roughly a third of the surviving tracks at peak. Monplaisir’s compositions are unusually consistent in mood: there is no track in the entire collection that I would not pair with study work, which is a remarkable level of mood discipline. If you have time, his discography on Bandcamp and Archive.org rewards attention.

Komiku is a similar story: French, prolific, releasing under CC0, and producing game-style and ambient instrumental music with a specific knack for unobtrusive melodies. Some of Komiku’s tracks were originally written for hypothetical video game soundtracks, but the ambient pieces, particularly the slower, piano-forward releases, work perfectly as study music.

Loyalty Freak Music uploaded several albums of CC0 instrumental music across a range of styles (lofi, ambient, soft electronica). The output varies more than Monplaisir’s, but the best Loyalty Freak Music tracks were among the most-played in the pool.

Scott Buckley is an Australian composer who releases under CC-BY. His output skews more cinematic and orchestral than is strictly ideal for background study music, but his slower piano-and-strings pieces fit the brief perfectly and added some of the most distinctive tracks in the pool — pieces with enough emotional weight to be noticed without being so dramatic that they pull attention.

These four were the load-bearing artists. Beyond them, the long tail included a few dozen smaller contributors — individual album uploads from artists whose names I had to recover from the metadata, often with no other public web presence. For those, the attribution was preserved in our stream title rotation while we used CC-BY material, even when the artist seemed to have no easy way to be contacted.

What does not work as background music (the hard-won list)

If you are ever in the position of curating a music pool for a long-form stream, podcast background, or study playlist, here is the list of categories that sound like they should work but do not:

Anything with prominent piano arpeggios. The best-known classical form is the Alberti bass, the rolling left-hand broken-chord pattern in a lot of Mozart and other Classical-era keyboard music. It sounds calm at first but turns out to be very attention-grabbing over long periods, because the predictable repetition starts to feel mechanical and your brain begins focusing on the pattern rather than on your work.

Music labelled “study” or “focus” by the uploader. These are often the worst tracks in any given collection because they have been written to a brief — “needs to sound like study music” — rather than emerging from a genuine creative intent. The brief-driven tracks tend to be generic, formulaic, and weirdly hollow. The good tracks for studying are almost never labelled that way; they are just calm music that happens to also work for studying.

Music with sudden silences. A pause in the middle of a track is dramatic in a foreground context and disastrous in a background context. The silence pulls attention every time, and after a few hours of work it becomes counterproductive. Tracks with very gradual dynamic changes and minimal abrupt transitions are the only ones that survive long-form use.

Tracks under two minutes. Short tracks force more frequent transitions in the playlist, which means more frequent moments where the music changes its character — exactly the moments where the listener is most likely to be pulled out of focus. The pool eventually skewed strongly toward tracks in the three-to-six minute range.

Tracks over eight minutes. The opposite problem: very long tracks tend to drag in their middle sections and become noticeable through fatigue. The sweet spot is medium-length tracks that resolve gracefully into the next track without either dragging or jolting.

Anything with prominent vocals, even non-semantic ones. This was a surprise; I expected non-semantic vocal samples (the “oohs” and “aahs” of ambient music) to be safe. They are not, at least for verbal work. Even non-semantic vocals seem to recruit some language-processing circuitry and become subtly distracting during reading or writing. The pool was eventually stripped down to almost entirely vocal-free instrumentation.
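
Two of the categories above, track length and abrupt changes in level, are mechanical enough to pre-screen before a listening pass. The following is a hedged sketch only: it assumes you already have each track's duration in seconds and a list of per-window loudness readings in dB from some measurement tool, and the 10 dB jump threshold is an illustrative guess rather than a tested value.

```python
# Length limits taken from the list above: nothing under two or over eight minutes.
MIN_SECONDS = 2 * 60
MAX_SECONDS = 8 * 60


def length_ok(duration_seconds: float) -> bool:
    """True if the track falls inside the two-to-eight minute window."""
    return MIN_SECONDS <= duration_seconds <= MAX_SECONDS


def has_abrupt_change(loudness_db: list[float], max_jump_db: float = 10.0) -> bool:
    """True if any two consecutive loudness readings differ by more than max_jump_db.

    A sudden silence shows up as a large drop followed by a large rise,
    so it is caught by the same check as a sudden loud passage.
    """
    return any(
        abs(later - earlier) > max_jump_db
        for earlier, later in zip(loudness_db, loudness_db[1:])
    )
```

A check like this can only rule tracks out. Everything that passes still has to survive the six-hour listening test.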

The metadata problem

A separate hard problem in curating music from Archive.org is that the metadata is wildly inconsistent. Tracks come with embedded ID3 tags that sometimes match the actual track, sometimes match a different track from the same album, sometimes contain album art that crashes ffmpeg’s concat demuxer, and sometimes report track lengths that disagree with the actual file by several seconds. Roughly a quarter of my eventual workflow was metadata-cleanup scripts: stripping album art (which solves a specific ffmpeg bug), re-encoding to consistent bitrates, normalising perceived loudness across the pool with ffmpeg’s loudnorm filter, and rewriting ID3 tags to match a clean source-of-truth dictionary I maintained in YAML.
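
To make the cleanup steps concrete, here is a minimal sketch of how one such ffmpeg invocation could be assembled from Python. It is an illustration under assumptions, not the actual scripts: the function names, the MP3 output via libmp3lame, the 192k bitrate, and single-pass loudnorm with its default targets are all choices made for the example, and the tags dictionary stands in for one entry of the YAML source of truth.

```python
import subprocess
from pathlib import Path


def build_cleanup_command(src, dst, tags, bitrate="192k"):
    """Build an ffmpeg argument list that cleans up one track.

    tags is a plain dict such as {"title": "...", "artist": "..."}.
    """
    cmd = [
        "ffmpeg", "-y",
        "-i", str(src),
        "-map", "0:a",          # keep audio streams only, which drops embedded album art
        "-map_metadata", "-1",  # discard the original, unreliable tags
        "-af", "loudnorm",      # loudness normalisation with the filter's default targets
        "-c:a", "libmp3lame",   # assumed output codec
        "-b:a", bitrate,        # consistent bitrate across the pool
    ]
    # Write the clean tags from the source-of-truth entry.
    for key, value in tags.items():
        cmd += ["-metadata", f"{key}={value}"]
    cmd.append(str(dst))
    return cmd


def clean_track(src: Path, dst: Path, tags: dict) -> None:
    """Run the cleanup; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_cleanup_command(src, dst, tags), check=True)
```

Building the command separately from running it makes it easy to print and inspect the arguments before touching any files. For tighter loudness matching, loudnorm also supports a two-pass mode, which this sketch does not attempt.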

The metadata-cleanup work is the part of music curation that nobody talks about and that almost nobody enjoys. It is also where the bulk of the actual hours went. If you ever try to do something similar, budget five to ten hours of cleanup for every ten hours of music that survives the listening pass.

Where we ended up

In early May we migrated the live stream’s music pool to a commercial Epidemic Sound subscription. The pool is now around four hundred tracks, all licensed for streaming on our YouTube and Twitch channels, and the variety is dramatically wider than the Creative Commons pool ever was. The migration was the right call for the project, since the variety reduces the cumulative listener fatigue that the smaller CC pool was starting to produce, but it does change what the project is, slightly. The first year of the stream was carried by named open-license artists; the second year is carried by Epidemic Sound’s commercial roster.

This is one reason I want to keep this essay on the site. The names matter. If you have ever heard a track on the 24/7 stream and wondered who made it, there is a real chance it was Monplaisir, Komiku, Loyalty Freak Music, Scott Buckley, or one of the long-tail contributors who built the foundation of the project for free. They have not been replaced exactly; they have been moved to a quieter shelf. They are still the reason the channel exists.

If you make music in the calm-instrumental space and you want to release it under CC0 or CC-BY for projects like this to use, the Free Music Archive and Archive.org are still the obvious places to do it. We will probably never go back to a fully Creative Commons pool — the operational complexity of curating one is higher than I can sustainably maintain alongside the rest of the project — but other projects will, and they need that source pool to keep growing.

— Dario
