Over a decade ago, Douglas Rushkoff wrote a book titled Program or Be Programmed. One of the blurbs read: "The debate over whether the Net is good or bad for us fills the airwaves and the blogosphere. The real question is, do we direct technology, or do we let ourselves be directed by it and those who have mastered it?"
I have been "coding" since I designed my first web page in 1998. The pervasive feeling at that time was that the internet would give us control over whatever we perceived to be controlling us, and give us freedoms we didn’t have before. Everyone wanted to have a website and we painstakingly designed them on our own.
But 12 years after Rushkoff's directive, who really likes coding when the task at hand doesn't require that skill? If you're going to write a book, why write it in HTML or XML? If you're writing a song, why be concerned with the key signature, let alone write it out? Writing is understood to be a "WYSIWYG" activity, but perhaps we should be writing in a markup language (or music notation) because it gives us more control, or at least an understanding of what is going on under the hood. Even here on Substack, the ability to write in HTML is very limited; you can't so much as set a font color, something people loved doing in 1998. I recall conversations about web design that treated it more like interior decorating or filmmaking. It was all about the aesthetics, and WYSIWYG software like Microsoft's FrontPage or Macromedia's Dreamweaver was the ultimate in DIY. But that DIY has given way to AI and neural nets, primarily designed to help people find the music they like through collaborative filtering, a technology now over 20 years old. While we may want to be relieved of code, content creators should know something about it and have a way to report errors.
What has happened since 2010 (and even back to the late 90s) is that creativity itself can now be largely driven by data sets. Think of Beethoven's Tenth Symphony as completed with AI: how can we verify that the data was purely Beethoven and not intermingled with Haydn data? And if we could, why should we? Does it matter if it's all classical-period material anyway? Why go to the trouble of verifying it?
Musical Data Points
Once you release and distribute digital music, bots scrape the musical metadata from the top-level distributors (Spotify, Apple Music, Amazon) without input from the composer and/or publisher (as far as I know), and in most cases the result is inaccurate. Spotify generally uses what is called the "raw audio model," which essentially crawls the web for descriptors in natural language; in the absence of standardized data such as "key signature," it falls back on the raw data and creates fields that primarily support search, such as "acousticness," "valence," "danceability," and so on.
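For the curious, these derived fields are exposed through Spotify's Web API. Here is a minimal sketch using the spotipy client, assuming you have registered API credentials in your environment; the track ID is a placeholder, not a real one:

```python
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# Assumes SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET are set in the environment.
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials())

# "TRACK_ID" is a placeholder for an actual Spotify track ID.
features = sp.audio_features(["TRACK_ID"])[0]

# The derived, search-oriented fields described above:
for field in ("acousticness", "valence", "danceability", "liveness", "instrumentalness"):
    print(field, features[field])
```

Note that these values arrive fully formed, with no field for the composer or publisher to confirm or dispute them.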
For example, on SongData.io, the key of my ambient piece Atacama is given as Eb Major. Since it's ambient music, it isn't in any key; if anything, it's closer to Eb minor. It was written to be a sonic abstraction of the Atacama Desert, and is not a "song." The tempo is listed at 139; again, it's ambient and has no rhythm, tempo, or meter. The "acousticness" is set at 100%, but I didn't use any acoustic instruments. Classical music also scores 100% on acousticness because it doesn't typically use electronic instruments, yet my piece is 100% electronic. I'm not sure what "liveness" is, but nothing was recorded with a microphone, and nothing was performed live. Danceability is 14% (I'd love to see this choreographed!). "Instrumentalness" is 98%, which is correct in that the piece has no lyrics, but I wonder which part of it the remaining 2% perceived as having words.

It's an interesting thought experiment to take all these metrics and have a machine "compose" music based on the scores. I suppose that's the upshot of all this: both interesting and depressing. The only creativity is running the numbers and scoring the results. It's very much a gaming activity, something I'm personally opposed to because it sits outside what we have known to be a function of the humanities. (While computer science has never been considered part of the humanities, perhaps it is moving in that direction. If you want to be a designer in the Metaverse, like a Web designer circa 1998, knowing how to code might be essential, just as it eventually became essential to web design. I'm not averse to doing things that are considered "techy" when they become essential.)
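A publisher-side sanity check, of the kind I suggest in the takeaways below, could be as simple as comparing the inferred fields against what the creator actually knows. A minimal sketch, using the values reported for Atacama above; the field names and threshold are my own, purely illustrative:

```python
# Values reported for "Atacama" (as quoted above), proportions normalized to 0-1.
reported = {
    "key": "Eb Major",
    "tempo_bpm": 139,
    "acousticness": 1.00,
    "danceability": 0.14,
    "instrumentalness": 0.98,
}

# What the composer actually knows about the piece.
known = {
    "has_key": False,               # ambient: no functional key
    "has_tempo": False,             # no rhythm, tempo, or meter
    "acoustic_instruments": False,  # 100% electronic
}

# Flag inferred values that contradict the creator's knowledge.
if not known["has_key"] and reported["key"]:
    print("key: inferred value contradicts creator input; flag for review")
if not known["has_tempo"] and reported["tempo_bpm"]:
    print("tempo: inferred value contradicts creator input; flag for review")
if not known["acoustic_instruments"] and reported["acousticness"] > 0.5:
    print("acousticness: inferred value contradicts creator input; flag for review")
```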
[In terms of metrics, a book I recommend is The Tyranny of Metrics. In my view, ratings attached to various data points misrepresent music, which is long-durational and exists in larger contexts, like albums. If Joni Mitchell's Blue were released in 2022, its metrics would probably kill the spirit of it. No one would ever get to savor its emotionality unless algorithms had first assigned it a high "valence."]
Sometimes it is a matter of incorrect linking. For example, on Shazam, my solo piano piece Naturally in Nature is linked to a video of a completely different work from a different album: Sea of Horizontals, a solo bass piece, not piano. https://www.shazam.com/track/461233631/naturally-in-nature That's not uninteresting in itself, in terms of creating "collisions" or happy accidents. But in my experience, data accidents are usually not happy ones; they result in misinformation or misattribution.
Google Search is notorious for indexing gaffes. Since there is another Lee Barry whose genre is Dance/Electronic, that's where I get filed, when in fact most of my work is Ambient and Prog Rock, if not Jazz.
XML Schemas
What we're now starting to see is the Semantic Web, one of the early ideas for the Web. The way in which Google categorizes songs and albums ("From sources across the web") is an example of semantic data. But the problem is that it is done entirely with machines and is unreliable. Humans aren't checking for accuracy; they simply can't, because of the sheer volume of data. (AI could save the day, but who's checking the AI?)
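For a sense of what creator-verified semantic data could look like, here is a minimal sketch that emits a schema.org MusicRecording record, the kind of structured data that panels like Google's typically consume. The values are drawn from this article; everything else is illustrative, not pulled from any real index:

```python
import json

# A hand-verified semantic record for a track, using schema.org vocabulary.
# MusicRecording, byArtist, and genre are real schema.org terms; the values
# here come from this article rather than any distributor's database.
record = {
    "@context": "https://schema.org",
    "@type": "MusicRecording",
    "name": "Atacama",
    "byArtist": {"@type": "MusicGroup", "name": "Lee Barry"},
    "genre": "Ambient",
}
print(json.dumps(record, indent=2))
```

The point is less the format than the provenance: a few lines like this, published by the creator, would outrank a thousand machine guesses if anyone bothered to prefer them.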
A few days ago someone on Quora asked whether there was a way to identify songs written in odd meters (5/4, 5/8, 7/8, 11/4, and so on). I said it would be impossible, given that this data has never been codified in a database. One could presume that pop music written in odd meters was most common from the 1970s through the early 1990s and make some kind of rough estimate, but you'd never have enough good data to run machine learning on it. There are tools for adding metadata (such as Kid3), but as far as I know the standard tag fields don't include musical rudiments, and hand-tagging would generate very little data. It would be more accurate data, but far too little to model.
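To make the gap concrete, here is a sketch of hand-tagging a meter into an MP3 with Python's mutagen library, using ID3's user-defined text frame (TXXX). The file path is a placeholder, and the field name "TIMESIG" is my own invention; no standard frame for meter exists, which is exactly the problem:

```python
from mutagen.id3 import ID3, TXXX

# "track.mp3" is a placeholder path to an existing MP3 with ID3 tags.
tags = ID3("track.mp3")

# TXXX is ID3's user-defined text frame; "TIMESIG" is an ad-hoc field name,
# since no standardized frame for time signature exists. encoding=3 is UTF-8.
tags.add(TXXX(encoding=3, desc="TIMESIG", text=["5/4"]))
tags.save()
```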
I was thinking perhaps MusicXML could be used.
XML looks complicated but is actually quite simple, especially compared with how complex music notation itself can appear. It's very basic code, though browsers don't render it as notation. One way to encode proper musical schemas is with the Music Ontology Project's vocabulary, but as far as I know it doesn't [yet] have a way to encode key signatures, meter, and tempo. Currently, such data is parsed by machines and is unreliable or simply not standardized, as with Spotify's raw audio model, which makes rough guesses about keys.
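MusicXML, by contrast, already encodes exactly these rudiments. Here is a sketch using a trimmed, hypothetical fragment (six flats, 5/4, a tempo of 139) and Python's standard library, showing how key, meter, and tempo could be read straight out of the notation:

```python
import xml.etree.ElementTree as ET

# A trimmed MusicXML fragment (not schema-complete; values are hypothetical).
fragment = """<score-partwise version="3.1">
  <part id="P1">
    <measure number="1">
      <attributes>
        <key><fifths>-6</fifths><mode>minor</mode></key>
        <time><beats>5</beats><beat-type>4</beat-type></time>
      </attributes>
      <direction><sound tempo="139"/></direction>
    </measure>
  </part>
</score-partwise>"""

root = ET.fromstring(fragment)
print("fifths:", root.findtext(".//key/fifths"))   # -6, i.e. six flats (Eb minor)
print("meter:", root.findtext(".//time/beats"),
      "/", root.findtext(".//time/beat-type"))     # 5 / 4
print("tempo:", root.find(".//direction/sound").get("tempo"))  # 139
```

No guessing, no "valence": the key, meter, and tempo are simply stated, because a human who knew the music wrote them down.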
This is why I like standard music notation so much: it is such a good standard. One "models" it simply by knowing it. Knowing goes a long way, and so obviates modeling. Music is in fact "code," but only in the sense that it can be finely tweaked to create the desired result.
Takeaways:
For creators, check the metadata, and if possible, correct it on your own personal website. (Program Or Be Programmed)
Sites that use machine learning should verify data points before releasing them into the wild by having the publisher sign off on them.