For as much rage as is often directed at it, much of the process behind Metacritic, the popular game review score aggregation site, remains frustratingly opaque. A group of researchers at Full Sail University sought to shed some light on the service, with results from their study presented by course director Adams Greenwood-Ericksen at GDC 2013.
Greenwood-Ericksen says his team started by reading various criticisms of Metacritic around the web, many of which came down to a single issue: All of the value of a game is reduced to a single number, which in turn is used by some publishers and studios to determine bonuses, hiring, firing and more. A strong streak of review scores hitting Metacritic can even impact stock prices.
Greenwood-Ericksen noted that there is some transparency on Metacritic's process buried on its website but also some frustrating ambiguities. Most notoriously, Metacritic openly weights each publication's score based on "their quality and overall stature," but that weight is not published. There are also some issues that Metacritic doesn't take into account, such as reviewer bias, the influence of previous entries in a series and missed or untracked scores.
One interesting untracked data point on Metacritic is how reviewers are influenced by other published scores. The Full Sail team's studies revealed that variability in scores collapses after the first day. In other words, there may be room for a wide variety of scores on the first day that a review hits, but once a number of reviews for a game are already up, other reviews tend to fall in line with that initial range.
Greenwood-Ericksen also noted that converting a wide range of review methods into Metacritic's 100-point scale can be messy. For example, some websites use a scholastic rating system (A through F), but in schools that system uses only part of the available scale: anything below 60 is an F. When a game review site uses that scale, Metacritic needs to determine if an F equals a 60 or a zero, and the original site's intent isn't always clear.
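To make the ambiguity concrete, here is a minimal Python sketch of the two readings of a letter grade. The function name `to_score`, the five-letter scale and the linear spacing are all assumptions made for illustration; this is not Metacritic's published conversion.

```python
# Hypothetical illustration only: two ways to place letter grades on a
# 0-100 scale. Neither is confirmed as Metacritic's actual conversion.

GRADES = ["F", "D", "C", "B", "A"]  # ordered worst to best


def to_score(letter: str, floor: int) -> float:
    """Map a letter grade linearly onto the range [floor, 100]."""
    step = (100 - floor) / (len(GRADES) - 1)
    return floor + GRADES.index(letter) * step


# School-style reading: F sits at 60, so only the top of the scale is used.
compressed = {g: to_score(g, floor=60) for g in GRADES}

# Full-scale reading: F sits at zero and the grades spread across 0-100.
stretched = {g: to_score(g, floor=0) for g in GRADES}

for g in reversed(GRADES):
    print(g, compressed[g], stretched[g])
```

Under these assumptions, the same "C" review lands at 80 on one reading and 50 on the other, which is the kind of gap the conversion has to resolve.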
While its weaknesses are clear, Greenwood-Ericksen and his team wanted to determine if there is value to the Metacritic rating in spite of them. There's no reliable way to translate quality to quantitative data, so instead they measured a game's economic value by comparing sales to scores.
The results? It turns out that the Metacritic score is a great indicator of financial success. Sales rise sharply for games listed as 83 or higher on Metacritic. Greenwood-Ericksen determined there was a .72 correlation between the Metacritic score and sales, which is considered extraordinarily high — though he emphasized that this means the two are closely related, not that one directly causes the other.
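For readers unfamiliar with the statistic, a correlation like this is typically a Pearson coefficient, which ranges from -1 to 1. The sketch below shows how such a figure is computed; the talk as reported does not say which coefficient the team used, and the scores and sales figures here are made up for illustration, not Full Sail's data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations, and the spread of each series around its mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Made-up example data (NOT from the study): aggregate scores and unit sales.
scores = [62, 70, 75, 83, 88, 91]
sales = [40_000, 55_000, 90_000, 300_000, 650_000, 900_000]

print(round(pearson(scores, sales), 2))
```

A value near 1 says that higher scores and higher sales tend to appear together. It says nothing about which one drives the other.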
After determining that Metacritic has value in indicating economic success, Greenwood-Ericksen and his team wanted to figure out the mysterious publication weights that Metacritic will not reveal. He noted that they determined this using publicly available information; it's not official or confirmed by Metacritic itself, but the data they uncovered seems to be reliable.
Greenwood-Ericksen discovered that a very heavily weighted publication, such as IGN, carries as much as five times the weight of a lightly weighted publication in the overall Metacritic score. The highest-weighted publications on his list included IGN, GameTrailers, Game Informer, New York Times, and Wired. Lower-weighted publications included many fan websites but also a few major publications such as Giant Bomb and Official PlayStation 2 Magazine U.K. You can check out the full list of publications and their weights at Gamasutra.
(Note: Full Sail's data was cut off around 18 months ago. As such, Polygon was not included in its results.)
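To show what a five-to-one weighting could mean in practice, here is a minimal sketch that assumes the aggregate is a plain weighted arithmetic mean. Metacritic does not publish its formula or its weights, so the function `weighted_score`, the weights and the review scores below are all hypothetical.

```python
def weighted_score(reviews):
    """Weighted arithmetic mean of (score, weight) pairs."""
    total_weight = sum(weight for _, weight in reviews)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(score * weight for score, weight in reviews) / total_weight


# Hypothetical reviews: one heavily weighted outlet (weight 5.0) and three
# lightly weighted outlets (weight 1.0). None of these numbers are real.
reviews = [(90, 5.0), (70, 1.0), (70, 1.0), (70, 1.0)]

# Same scores with every outlet counted equally, for comparison.
equal = [(score, 1.0) for score, _ in reviews]

print(weighted_score(equal))    # plain average of the four scores
print(weighted_score(reviews))  # heavily weighted outlet pulls the result up
```

Under these assumptions, the single heavily weighted 90 lifts the aggregate from 75 to 82.5, even though three of the four reviews gave the game a 70.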
Greenwood-Ericksen concluded his presentation with an oft-used phrase about the United Nations: "If it didn't exist, we'd have to invent something like it." He says his team at Full Sail is working on presentations digging deeper into some of Metacritic's problems as well as some potential solutions to help improve its usefulness, but these initial results speak for themselves: Whatever problems it might have, Metacritic serves a purpose.