Observant readers will have noticed that we here at Angry Metal Guy are a bunch of complete nerds. As the sysadmin, I am definitely one of the worst offenders. I’m also, like a suspicious number of the other staff, a trained research scientist (though in my case, I left academia and work in tech). On a completely unrelated note, I’ve been meaning to play with neural networks for a while, and a few months ago the idea of running some of the current generation of rather impressive image classification neural networks against album art occurred to me.
Metal album art is very tropey—in most cases you can guess a lot about what a record is going to sound like from the art. I was curious to what extent a neural network could be trained to pick up on those tropes. For those unfamiliar, an artificial neural network is a machine learning tool inspired by biological brains, and they’ve demonstrated successes recently in everything from playing board games to text generation to image classification—that is, given an image, working out what it contains. They can still be pretty easily confused by a lot of things, but current results are pretty impressive, and helpfully you can download a pre-trained general purpose image classification neural network from the internet and quickly retrain it for a specific task.
Now, as with any machine learning approach, the other thing we’re going to need is a lot of data. Fortunately, this very website has an archive of thousands of reviews with album art and other metadata attached, so I’m using that as the training data for my neural network. It’s very much not perfect—some albums are hard to classify, some are tagged with a whole pile of genres, and some are plain misclassified, so you’ll see some oddities in the data occasionally—but as long as most of it is accurate enough, we should be okay.
We’ll start simple: we’ll see if we can train the neural network to tell the difference between brvtal/trve and unbrvtal album art. I think most fans can do this pretty easily, so let’s try the computer. For the purposes of this test, we’ll use the genre tags from the reviews, and define brvtal metal as anything which is death, black, doom, or grind, and unbrvtal metal as anything which is heavy, progressive, power, melodic, folk, post-, thrash, or a couple of others (don’t @ me about how trve your favorite genre is: this is the division that gets the best results).
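For the curious, the tag-to-label step might look something like the sketch below. This is a guess at the general shape, not the code actually used: the function name `label_from_tags`, the substring matching, and the decision to drop records whose tags land in both groups (or neither) are all assumptions, and the "couple of others" in the unbrvtal group are left out because the post doesn't name them.

```python
# Keyword groups taken from the split described above.
BRVTAL = {"death", "black", "doom", "grind"}
UNBRVTAL = {"heavy", "progressive", "power", "melodic", "folk", "post-", "thrash"}


def _matches(tags, keywords):
    """True if any keyword appears inside any (lower-cased) tag."""
    return any(k in t.lower() for t in tags for k in keywords)


def label_from_tags(tags):
    """Return "brvtal", "unbrvtal", or None when the tags are ambiguous.

    Assumption: a record matching both groups, or neither, is dropped
    rather than forced into a class.
    """
    is_brvtal = _matches(tags, BRVTAL)
    is_unbrvtal = _matches(tags, UNBRVTAL)
    if is_brvtal == is_unbrvtal:
        return None
    return "brvtal" if is_brvtal else "unbrvtal"


print(label_from_tags(["Death Metal"]))          # brvtal
print(label_from_tags(["Power Metal"]))          # unbrvtal
print(label_from_tags(["Melodic Death Metal"]))  # None (matches both groups)
```

Dropping the ambiguous records costs some data, but it keeps the two classes clean, which matters more when the labels are already a bit noisy.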
To my slight surprise, the neural network does very well here: with a data set of 3000 records split approximately equally into brvtal and unbrvtal categories, I got an accuracy of over 70%. The handy thing about the way this network is set up is that the classification result is a number between 0 and 1, where 0 is “not at all brvtal” and 1 is “completely brvtal”. This means we get a brvtality score and can ask it, say, what the most brvtal album art we’ve ever seen is.
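Once every record has a score, both the accuracy figure and the "most brvtal" lists are just a threshold and a sort. Here is a minimal sketch, assuming each record is a `(title, score, label)` tuple with label 1 for brvtal and 0 for unbrvtal, and assuming a cut-off of 0.5; the titles and scores are made up for illustration and are not results from the real data.

```python
def accuracy(scored, threshold=0.5):
    """Fraction of records whose thresholded score agrees with the tag label."""
    hits = sum((score >= threshold) == bool(label) for _, score, label in scored)
    return hits / len(scored)


def most_brvtal(scored, n=3):
    """The n records with the highest brvtality score."""
    return sorted(scored, key=lambda r: r[1], reverse=True)[:n]


def least_brvtal(scored, n=3):
    """The n records with the lowest brvtality score."""
    return sorted(scored, key=lambda r: r[1])[:n]


# Placeholder titles and invented scores, purely for demonstration.
demo = [
    ("Album A", 0.97, 1),
    ("Album B", 0.81, 0),
    ("Album C", 0.40, 1),
    ("Album D", 0.08, 0),
]

print(accuracy(demo))                            # 0.5
print([t for t, _, _ in most_brvtal(demo, 2)])   # ['Album A', 'Album B']
print([t for t, _, _ in least_brvtal(demo, 2)])  # ['Album D', 'Album C']
```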
Or the least. You’ll note that there’s a Jørn record in this set.
We can also check which records the network was most wrong about—where it classified something as extremely brvtal when it was not, and vice versa. The following albums it thought were brvtal while the tagging indicated they’re not.
These albums on the other hand it thought were not brvtal, while the tagging indicated they were.
Here, I’m inclined to agree with the network’s calls more often than not, rather than with the labels I got from the genre tags, especially on the albums it identified as brvtal, so the network is doing pretty damn well.
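Finding the records the network was most wrong about comes down to sorting by the gap between the score and the tagged label. A rough sketch, again assuming `(title, score, label)` tuples with label 1 for brvtal and 0 for unbrvtal; the helper name `most_wrong` and the data are invented for illustration.

```python
def most_wrong(scored, tagged_label, n=3):
    """Most confident misses among records tagged `tagged_label` (0 or 1).

    With tagged_label=0 this returns unbrvtal-tagged records the network
    scored as highly brvtal; with tagged_label=1, the reverse.
    """
    subset = [r for r in scored if r[2] == tagged_label]
    return sorted(subset, key=lambda r: abs(r[1] - r[2]), reverse=True)[:n]


# Placeholder titles and invented scores, purely for demonstration.
demo_scores = [
    ("Album A", 0.97, 1),
    ("Album B", 0.81, 0),
    ("Album C", 0.40, 1),
    ("Album D", 0.08, 0),
]

print(most_wrong(demo_scores, tagged_label=0, n=1))  # [('Album B', 0.81, 0)]
print(most_wrong(demo_scores, tagged_label=1, n=1))  # [('Album C', 0.4, 1)]
```

Splitting by tagged label keeps the two directions of error in separate lists, which is how they are presented above.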
Let’s try something a bit harder: can it tell death metal apart from black metal? I’m not convinced I can do that consistently. We’ll use the same scale as before, but this time 0 is death metal and 1 is black metal. Surprisingly, the neural network is also pretty good at this. It’s definitely harder, but my results are between 60 and 65% accurate. Once again we’ll get the most strongly classified albums first. These are the most black metal.
And these are the most death metal.
Definitely some recurring themes there. Here are the records where it was most wrong in each direction. First, the death metal records it thought were black. Apparently Insomnium art is super black metal.
And now, black metal records it thought were death.
At this point I decided to try something harder. I think this was the original challenge from the Angry Metal Staff water cooler chatter: Can I teach the neural network to identify good records, as defined by the infallible Angry Metal Rating Scale? Unfortunately, the answer appears to be no. I played around a lot with how I partitioned the data and with the parameters of the model, but I couldn’t get any results I was satisfied were significant.
I did manage to build one model which produced results that passed a naive significance test (p of around 0.023), but this is Angry Metal Guy, not some sort of major academic journal. We don’t profit off the unpaid labor of our contributors, and we won’t stand for p-hacking!
It’s also worth noting at this point that one of the other disadvantages of neural networks is that it’s hard to tell where their results are coming from. What does the network identify in those album covers that marks them as brvtal? We can make some guesses by looking at the results, but there’s no obvious way of getting an actual explanation out of it.
Let’s focus on the successes, though. We now have a tool which can produce a completely unbiased, objective measure of the brvtality level of literally anything we can make an image of. Questions which previously were philosophical, debatable, or indeed entirely ineffable can now, through the power of computers, be definitively effed. All that remains is to decide what questions to ask. We could, for example, ask which Metallica albums are brvtal?
But that’s insufficiently ambitious, and I’m pretty sure nobody cares about Metallica anyway. We can ask anything. For example, a question I’m sure has been lurking unanswered in the back of everyone’s minds forever: which Tay-Tay album is the most brvtal?
Now you know. Taylor Swift: surprisingly brvtal. Or is Lady Gaga more trve?
I think that about covers music, so now we can get onto the important stuff: cats. There’s a great Twitter/Facebook account called Black Metal Cats. However, I began to suspect that a conspiracy might be afoot: what if some of the cats are, in fact, death metal cats? Fortunately, this is a question we are now equipped to answer. Here are the most black metal of Black Metal Cats.
Sure enough, though, here are some death metal cats. What a scandal! At press time, Black Metal Cats had not responded to our request for comment.
Finally, there’s only one thing more important than cats: Which Angry Metal Guy staff member is the most brvtal?
And with that question settled, my work here is done.
Technical notes (the rest of you can stop reading here): The above very serious research uses fastai and PyTorch for the neural networks and Jupyter notebooks for data presentation. The network used is the pre-trained ResNet-18 model. Significance of results was checked with a simple chi-square test on the binary classifications; both trained networks presented are at least five sigma. My code and data are on GitHub and are presented without warranty, and with the warning that this is research code, so it’s terrible and poorly documented.
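For anyone who wants to reproduce the flavor of that significance check without installing anything, here is a sketch of a Pearson chi-square test on a 2x2 table of tagged class versus predicted class. The exact form of the test used above isn't spelled out, so treat the table layout, the lack of a continuity correction, and the function name `chi_square_2x2` as assumptions; the counts in the example are invented and are not the real results.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table

        [[a, b],   # tagged brvtal:   predicted brvtal, predicted unbrvtal
         [c, d]]   # tagged unbrvtal: predicted brvtal, predicted unbrvtal

    Returns (statistic, p_value) for 1 degree of freedom.
    All four row and column totals must be non-zero.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Invented counts: 100 held-out records, 70 classified in line with their tags.
stat, p_value = chi_square_2x2(35, 15, 15, 35)
print(stat)     # 16.0
print(p_value)  # well below 0.001
```

A small p-value here only says the predictions are not independent of the tags; it says nothing about whether the tags themselves are right.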