Nine Years of Angry: A Retrospective in Numbers

We here at Angry Metal Guy are a bunch of nerds. Writers at Angry Metal Guy, both past and present, hold a surprising number of degrees and we get pretty nerdy about the site. We haven’t quite done diffusion analyses of opinions among writers, but we got dangerously close a few years ago. As one might expect, it means that we have a pretty nerdy readership and one such nerd just dropped a bunch of stats on us. We couldn’t help but bask narcissistically in its glow and, it being written, we thought that we should definitely pass it on to you to enjoy. But the beauty of this piece isn’t just that you can enjoy it. The beauty of this piece is that you can lord it over fans of other websites, showing that we don’t just rubber-stamp everything. And with Beelzebub as my witness, we’re going to hold on to that moral high ground as long as we continue to get treated like we’re a Blogspot instead of one of the most prolific and well-read metal review sites on the Internetz.1

We chose the 28th of May 2018 as the day to publish this stuff, then, because today is the nine-year anniversary of the existence of Angry Metal Guy. As crazy as it might seem, we have lasted for a much longer time than I think anyone could have ever guessed when I started the site. I am grateful to all the people who work here and who have worked here during these nine long years. First and foremost, Steel Druhm, who has been a stalwart companion through the vast majority of the site’s existence. Second, the team who runs things now, made up of Madam X and Dr. A.N. Grier, in addition to Steel Druhm, who valiantly keep things moving from day to day. Third, all of the writers who have contributed through the years and whose contributions can be seen mapped out below (see: Fig. 2). Fourth, contributors like the guys over at Metal-Fi, who have made this blog both better and more unique, and companions like No Clean Singing and Metal Bandcamp, who have been big supporters of what we do. Fifth, to all of the labels and bands who have entrusted us with promo materials, particularly in the early days before we got to the point where we had such a huge and dedicated following. We’ve had some great promo people in our time and excellent contacts with them. And lastly, but hardly least, to the readers of the site, who have helped make a ridiculously awesome community o’ metal nerds. Thanks!

Author: tbwester

A few years ago, a friend of mine, who you may know as Kronos, began writing album reviews for a little site which you may know as Angry Metal Guy. I started reading AMG regularly after that, and have enjoyed following the site ever since. The staff here are superb writers, and the community that’s been built around the reviews, made manifest through comments, often has uniquely thoughtful insights not found on your typical music critique website.

Praise aside, I am a man of hard science, and damn it, I want to be sure that whatever enjoyment I’m getting out of this site comes from 100% farm-fed, home-grown and statistically unbiased methodology. To this end, I decided to look for clues in the cold, sterile numbers of AMG’s history. The following report details my findings.

The Data

I collected information from each review posted on the site back to the “first” review of 2009 (Amorphis // Skyforger). There is one review from 2004 that was posted and scored (Orphaned Land // Mabool),2 but I did not include it in the main body of works due to its temporal isolation from the rest of the articles. I kept track of each review’s author, tags, score, and publication date. In total, I found over 3,300 reviews of scored albums (i.e. excluding unscored albums and special reviews like TYMHM, Yer Metal is Olde, live reviews, etc.), which, as far as statistics go, is a gold mine for analysis.

Most of the information from each review can be found in its webpage metadata. Specifically, I used the official scores listed in each review page’s tag. Some care had to be taken here since not all reviews have this tag. In these cases, I had to fill the scores in manually, but in total there were only about 80 of these articles.

I also found three albums that were scored twice: Resumed // Alienations, Deafheaven // New Bermuda and Leprous // Malina. These reviews are added twice to the data set, but, as you will soon see, their scores are averaged together in the analysis.
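As a rough illustration of that averaging step, here is a minimal sketch. The row layout, the function name `average_duplicates` and the scores are all made up for the example; the article does not say how the merge was actually implemented.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (band, album, score) rows; names and scores are invented.
reviews = [
    ("Band A", "Album One", 3.5),
    ("Band A", "Album One", 2.5),  # the same album, reviewed a second time
    ("Band B", "Album Two", 4.0),
]


def average_duplicates(rows):
    """Collapse repeated (band, album) pairs into one averaged score."""
    scores = defaultdict(list)
    for band, album, score in rows:
        scores[(band, album)].append(score)
    return {key: mean(values) for key, values in scores.items()}


merged = average_duplicates(reviews)
```

Keying on the (band, album) pair means a twice-reviewed album contributes a single data point, so it does not count double in any later average.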

Fun Facts

There have been 23 reviews that have been deemed “perfect” (score = 5.0) on this site. I show their historical occurrences on the timeline below, along with the total number of reviews posted each month. One thing to note is that the density of 5.0 albums seems to have decreased in recent times. I’ll discuss this trend more thoroughly in the next section.

Figure 1. The Historical Occurrence of 5.0 Scores over Time and the Frequency of Reviews

I also wanted to see the comings and goings of site staff. I show the number of reviews posted by site reviewers on the heat map below (brighter colors mean more reviews posted by a reviewer in a particular month). A surprising fact for me was that clearly site staff are recruited in groups, as indicated by the blocky nature of reviewers popping up out of the sea of nothingness. Most regular contributors write a few articles per month, with the exception of one over-achiever…

Figure 2. A heat map of review frequency by a given author over time

Now, the age-old question: What’s the average score awarded on AMG? Below I show the distribution of scores for all 3,300+ reviews, and you can see that it is slightly skewed compared to your off-the-shelf bell curve. The average score is 3.07.

Figure 3. The review of Angry Metal Guy’s scoring curve. Eat that, metal review sites I secretly scorn!

You might see this and think “Wow, that distribution is skewed! 3.0 is far above a fair 2.5 average!”3 Well, astute reader, read on and you’ll soon find out that things become more interesting when looking at trends over time.

I also looked at the score distributions for some common genres I see frequently listed in reviews. I present a table of “average scores for albums tagged with [genre]”. Now, genres are definitely a bit fuzzy, and any specific album may appear in multiple categories if it contains multiple genre tags. Plus, some genres are underrepresented on this site. Bear these caveats in mind as you feast your eyes on the following table:

Reviews tagged with… Average score
Death Metal 3.10 +/- 0.04
Black Metal 3.11 +/- 0.04
Doom Metal 3.16 +/- 0.05
Progressive Metal 3.24 +/- 0.06
Folk Metal 3.32 +/- 0.09
Thrash Metal 3.01 +/- 0.06
Heavy Metal 3.09 +/- 0.05
Hardcore 3.11 +/- 0.11
Power Metal 3.00 +/- 0.07
Hard Rock 2.94 +/- 0.11

I think in general we can see that all averages are roughly the same. Quantitatively, they are all within one or two uncertainties (the number after the +/-) of each other. The uncertainties depend on the number of albums that contribute to each average score: the more albums, the smaller the uncertainty. For example, we have a lot of death metal albums, so we can be pretty confident in the average. In contrast, there aren’t many hard rock reviews, so even though the average is lower, it is inherently a less precise number.
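For the curious, a table like this one could be produced along the following lines. This is a sketch under assumptions: the article does not state how the uncertainties were computed, so the standard error of the mean (sample standard deviation divided by the square root of the number of albums) is my guess at a reasonable choice, and the function name `tag_averages` and the scores are invented.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def tag_averages(reviews):
    """Return {tag: (mean, standard_error, n)} for (score, tags) pairs."""
    by_tag = defaultdict(list)
    for score, tags in reviews:
        for tag in tags:  # an album with several tags counts once per tag
            by_tag[tag].append(score)
    result = {}
    for tag, scores in by_tag.items():
        n = len(scores)
        # Assumed uncertainty: standard error of the mean (undefined for n = 1).
        sem = stdev(scores) / sqrt(n) if n > 1 else float("nan")
        result[tag] = (mean(scores), sem, n)
    return result


# Invented scores, purely for illustration.
reviews = [
    (3.0, ["Death Metal"]),
    (3.5, ["Death Metal", "Hard Rock"]),
    (2.5, ["Death Metal"]),
    (3.0, ["Death Metal"]),
    (2.5, ["Hard Rock"]),
]
stats = tag_averages(reviews)
```

In this toy data both genres have the same spread of scores around the same average, but the four death metal albums give a smaller standard error than the two hard rock albums, which is the effect described above.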


With nine years of data, there certainly must be trends to uncover. Below I show a plot of average monthly scores awarded by all reviewers over the entire site’s history. Immediately obvious is the downward trend. To be more quantitative, I ran a weighted linear regression, using weights proportional to the total number of reviews in each month (represented by error bars). The regression result is the red line in the figure, and indeed it has a downward trend that is statistically significant.4 One could interpret this in a few ways. A less-likely interpretation (albeit more entertaining) is that the AMG staff are perfect arbiters of musical quality, and that music has simply been getting worse over time. A more plausible interpretation might be that the AMG staff have refined their expectations of good albums over the years and are indeed approaching the coveted unbiased 2.5 average, as one might expect.
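The weighted regression described above can be sketched in a few lines. The monthly numbers below are made up, the function name `weighted_linear_fit` is mine, and the sketch only estimates the slope and intercept; it does not reproduce the significance test.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x.

    Minimizes sum(w_i * (y_i - intercept - slope * x_i) ** 2).
    """
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Invented data: month index, average score, number of reviews that month.
months = [0, 1, 2, 3, 4, 5]
avg_scores = [3.4, 3.3, 3.5, 3.1, 3.0, 3.1]
n_reviews = [10, 25, 5, 30, 40, 35]

slope, intercept = weighted_linear_fit(months, avg_scores, n_reviews)
```

Using the review counts as weights means a month with 40 reviews pulls the line much harder than a month with 5, so a single sparse month cannot drag the trend around.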

This plot also reveals what may be the worst month for music on AMG. In December 2013, I found 5 reviews, all with scores less than 3.0. They are: Netherbird // The Ferocious Tides of Fate (2.5); Alcest // Shelter (2.0); Thrall // Aokigahara Jukai (2.0); Boston // Life, Love & Hope (1.0); and Sheol // Sepulchral Ruins below the Temple (2.5). Ouch.

Figure 4. Average monthly review score over time

Something else we can do is look for genre effects. In other words, do the AMG staff favor particular kinds of music? This might be worth knowing if you, the reader, are particularly concerned about staff favoritism on your review site. I investigated this question by separating the average score versus month data above by the most popular genre tags used in the reviews (Death Metal, Folk Metal, etc.). I then performed the same linear regressions on each of these subsets, and I plot the fitted slope (i.e. change in average score vs. time) for each genre below.5 Data points below the blue line trend to lower scores over time, while data points above the blue line trend to higher scores. You can also see the total site trend as a red bar (from the slope of the line in Figure 4) on this axis. Once again, the size of the error bars depends on the number of reviews of each genre: fewer reviews mean larger error bars.

Figure 5. Change in average score per month by genre

If AMG were biased towards particular genres, you might expect the average score to decrease due to one or more genres, but stay the same or increase for the favored genres. I am sorry to report the somewhat boring conclusion that there is no evidence of any bias in this plot.6 Not only do all of the genres (except one, which I will discuss momentarily) trend down, they are all trending down at roughly the same rate. This last point can be quantified by looking at the overlap in error bars between all genres. Hardcore, while trending upwards, has very few reviews associated with it, so its error bars are large, and furthermore, its error bars overlap with 0 (i.e. no trend).
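The overlap check itself is simple enough to write down. The slopes and uncertainties below are hypothetical placeholders, not the values from the plot, and the helper names are mine.

```python
def interval(value, error):
    """Return the (low, high) range covered by value +/- error."""
    return value - error, value + error


def overlaps(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical slopes (score change per month) and uncertainties.
slopes = {
    "Death Metal": (-0.004, 0.001),
    "Black Metal": (-0.005, 0.001),
    "Hardcore": (0.003, 0.005),
}

# A genre whose error bar reaches 0 is consistent with "no trend".
consistent_with_zero = {
    genre: overlaps(interval(s, e), (0.0, 0.0))
    for genre, (s, e) in slopes.items()
}

# Two genres whose error bars overlap are trending at compatible rates.
same_rate = overlaps(interval(*slopes["Death Metal"]),
                     interval(*slopes["Black Metal"]))
```

With these placeholder numbers the upward Hardcore slope is still consistent with zero because its error bar is wide, which mirrors the argument made about the real data.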

Trends are nice, but patterns are better. One natural thing to look for in a data set like this one is so-called “seasonal effects”. A seasonal effect in this context might be: Do AMG reviewers give higher scores during the [summer/fall/winter/spring]? To address this question, we can look at the autocorrelation of the data set. Autocorrelation is a measure of the extent to which an individual data point depends on previous data points in a series. I plot the autocorrelation function (ACF) below for AMG’s de-trended average monthly scores,7 as a function of the lag (i.e. the number of months back one looks). If there were repeating patterns, one would see spikes in the autocorrelation around a lag of 12, which would mean that average scores from one year are correlated with average scores from the same season in the previous year. All of the values are low and random; anything above 0.2 might otherwise be cause for investigation, and the spike at lag 0 is not relevant, since any series is perfectly correlated with itself. We can therefore conclude that there are no seasonal trends in AMG’s review process, so if you are a band looking to submit material for a review on AMG, rest assured that your timing will not affect your score.8
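To make the idea concrete, here is a minimal sketch of a de-trended ACF. It runs on a synthetic series that deliberately contains a 12-month cycle, so it shows what a seasonal effect would look like if one existed; it is not AMG data. As a simplification, the de-trending uses an ordinary (unweighted) straight-line fit, and the function names are mine.

```python
import math


def detrend(y):
    """Subtract the least-squares line fitted against the index 0..n-1."""
    n = len(y)
    x_bar = (n - 1) / 2
    y_bar = sum(y) / n
    sxx = sum((i - x_bar) ** 2 for i in range(n))
    slope = sum((i - x_bar) * (yi - y_bar) for i, yi in enumerate(y)) / sxx
    return [yi - (y_bar + slope * (i - x_bar)) for i, yi in enumerate(y)]


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    m = sum(series) / n
    dev = [v - m for v in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


# Synthetic monthly scores: a downward trend plus a yearly cycle.
scores = [3.5 - 0.005 * t + 0.2 * math.sin(2 * math.pi * t / 12)
          for t in range(96)]
r = acf(detrend(scores), 24)
```

For this synthetic series the value at lag 12 is large and positive and the value at lag 6 is negative, which is exactly the signature that is absent from the real plot.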

Figure 6. Statistical magic in action. Wow.



Delving into the AMG archives has convinced me of one thing: The AMG reviewers are very self-consistent. There is no evident favoritism towards particular genres, and scores do not seem to be inflated or skewed when looking at recent years. Furthermore, over the years, the consistency is increasing. I suspect in the next few years, we will see the average site score each month level out to around 2.5. [see AMG’s response to this in footnote 3 below. – Ed] At that point, it will be interesting to look at above-average- and below-average-score months to get a robust sense of the metal scene. All this said, I can now sleep easy knowing that my beloved metal review site appears satisfactorily and statistically fair.

The original post for this article can be found—unfootnoveled—by clicking here. And we thank the author tbwester for his contribution to our humble little site. 

  1. No offense to Blogspot owners, I’m sure you’re all very nice. – AMG
  2. This is a backdated review from when I was but the student and Al Kikuras was the master. I also posted a few other reviews that I had written for forums and stuff, to just fill out content early on. – AMG
  3.  One note of caution that I would like to add about our rating curve. While the 2.5 average would make the site ‘normal,’ it would probably actually indicate that we’re scoring low. There are two reasons for this. First, we should anticipate a positivity bias in our ratings. We are not attempting to be purely ‘objective’ in our ratings by grabbing random albums that we don’t have any indication whether we’ll like or not. Instead, we choose albums we expect to like. Even when we assign random albums, we do so with the assumption that the listener will have the possibility to enjoy the album. For example, I reviewed Amaranthe albums anticipating the possibility of enjoying them (or, well, at least the first one). I like power metal and I’m not opposed to cheese, either. Alas. Second, I think it is fair to assume that labels still serve the purpose of weeding out bad records and bands. By going through a selection process of which band to sign and support, the majority of bands on labels should be assumed to be talented enough that if given to someone who is a fan of their particular style, one should assume that the average album being released by the industry should be a 3.0. We can, therefore, have a debate about whether we should normalize the scale to quality—that is, the base talent assumed makes “good” records a 2.5/5.0 on our scale—but we don’t. Therefore, assuming that A) there’s a selection bias of what gets released and B) there’s a selection bias in what gets reviewed, a curve this close to ‘normal’ would suggest to me that we are firm but fair scorers and generally consistent over time. – AMG
  4. If this trend continues, in 40 years’ time, AMG will give all albums scores of 0, which is a humorous yet cautionary tale about the dangers of extrapolation.
  5. This is addressing “Simpson’s Paradox”, which isn’t really a paradox, but describes instances when a regression over an entire data set trends one direction while subsets of the data trend another. This is often the case when there are inherent differences in groups of data one hasn’t considered, e.g. data sets containing men and women, or metal album genres.
  7. Detrending (or looking at the regression residuals) is important here since the whole data set trends downward. You won’t see seasonal patterns in the ACF plot looking at the raw data, because the slope means that each data point is necessarily correlated with the previous point. We want to see if there are any additional patterns in spite of the overall trend.
  8. Another interesting thing here is that record labels tend to chunk big releases at certain times of the year. For example, they tend to not release lots of big new releases during the summer festival season. Does this mean that we don’t score big releases better than small ones? That wouldn’t surprise me. – AMG