Preface: I wind up wanting to write this article each time I check out the reviews for a game that I like that has recently been released. The game in this case is Lucifer Within Us and I was given a press key from Kitfox Games for promotional purposes. As much as I enjoy the game, and as cool as the people at Kitfox seem to be, a promotional key buys my coverage, not my opinion. I mention this because while the point is not limited to that particular game, this is probably the kind of thing that is best disclosed up front.
Steam’s review system is terrible and is terrible in a way that maximizes the misery for almost everyone involved. The only case I’ve really seen in favour of the Steam review system is that Epic doesn’t have user reviews. Humble reports the Steam reviews, Epic uses professional reviews, and Itch.io, Uplay, Battle.net, and Origin don’t report reviews at all. The only other storefront that has user reviews is GOG, which offers a 5 star scale. The Steam system is unique in minimizing the amount of information conveyed while maximizing the costs to developers for the lack of information.
A user with recorded playtime can submit a review with a ‘recommend’ or ‘not recommend’ rating, a mandatory text entry, and the option to enable comments on the review. Reviews on Steam are aggregated into two ratings, one for the past 30 days, and one for the product’s lifetime. All reviews can be read on a game’s store page and filtered by different characteristics (including reviews of the reviews, which Steam calls helpfulness). The aggregates of ratings from Steam purchases fall into the following categories: overwhelmingly positive, very positive, mostly positive, mixed, mostly negative, very negative, overwhelmingly negative.
An important threshold is when 70% of reviews from Steam purchases are recommendations, as this will produce a ‘mostly positive’ rating with friendly blue text, while falling below this value will produce the dreaded orange (or even red for lower tiers) text. A handful of developers have investigated and reported improved sales when moving up in review categories, but the most consistent and notable observation is the substantial drop in sales when moving into the ‘mixed’ rating. Only Valve will know the full effects of Steam’s review system, but the fear of this drop in sales has embedded itself enough to provoke trigger warnings about orange text during talks.
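To make the threshold concrete, here is a minimal sketch in Python. Only the 70% cutoff comes from the description above; the function name, the sample review counts, and the simplification into two buckets (blue versus orange or red) are my own assumptions for illustration, and the boundaries between the other tiers are not modelled.

```python
from fractions import Fraction

# The 70% threshold described above; everything else here is illustrative.
MOSTLY_POSITIVE_CUTOFF = Fraction(70, 100)


def text_colour(recommended: int, total: int) -> str:
    """Return 'blue' at or above the cutoff, 'orange or red' below it."""
    if total <= 0:
        raise ValueError("need at least one review")
    share = Fraction(recommended, total)  # exact, avoids float rounding
    return "blue" if share >= MOSTLY_POSITIVE_CUTOFF else "orange or red"


# Hypothetical small game with 20 reviews from Steam purchases.
print(text_colour(14, 20))  # 14/20 is exactly 70%
print(text_colour(13, 20))  # 13/20 is 65%
```

For a game with only 20 counted reviews, a single reviewer changing their mind is the difference between the two colours.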
The first major problem is the idea that an aggregate of recommendations will reflect the quality of the game in the first place. One appeal of independent and low budget games is that they can afford to take risks and appeal to a narrow audience. Unfortunately, this type of game is the most likely to be penalized through aggregation. Niche games do not appeal to an average player by definition and so, without interventions in the form of marketing or recommendations, these games will be poorly rated. The fact that an average player won’t like the game is informative in a way, since not everyone wants to spend time looking through niche games to find something they really love, but this information does not merit the prominent placement that Steam gives it.
The problems posed by the niche game example show how Steam’s treatment of reviews is at odds with the general direction the storefront has been heading. Steam has been trending towards the kind of advantages that are present in gaming subscription services like EA Play. Subscription services have a low search cost for finding a new favourite game (essentially a couple of hours in games that didn’t connect), and the reported increase in playtime indicates that the quality of play is increasing. Steam’s own search costs are low (maximum two hours and some minor inconvenience of requesting a refund), and the introduction of Steam Direct was specifically designed to create a deep pool of games in which to conduct that search.
The review system as it currently stands runs counter to the spirit of opening the storefront to all kinds of titles. Given that a status of ‘Mixed’ or below will reduce sales (which can still be refunded within 2 hours), fewer people will experiment with a game that did not immediately find its audience. The aggregation of reviews also introduces risk for other marketing methods, like free weekends, that should be suited to niche games. Users who try the game and find it is not for them will accurately report that they do not recommend it, but repeating this fact ceases to be informative past a certain point. The review system rewards games that are just appealing enough to push the largest number of players over the recommend threshold, not ones that make bold but polarizing decisions.
Valve’s treatment of which reviews have standing also runs counter to an accurate portrayal of the quality of the game. If the argument is that a larger number of reviews will wash out the individual quirks of any given review, it doesn’t make much sense to turn around and say that reviews from purchases made outside the storefront do not count towards the aggregate. It is possible that Valve does this as a form of ‘verified purchase’ to prevent review manipulation on the part of a developer, but it appears they’re able to identify the source of a given key (for example, if it came from Humble) and so it is hard to believe that this is the only form of prevention. It is also inconsistent with how Valve treats negative reviews. Steam will report when there is unusual review activity occurring, but it is left to the user to identify the cause and interpret the chart. If the inclusion of reviews from buyers outside of Steam creates such a threat of distortion, presumably an audience that is discerning enough to parse through a spike in negative reviews can do the same with positive reviews. The possibility that crowdfunding supporters, players who buy directly from the developer, or holders of other external keys may be inclined to review the game more favourably shouldn’t be disqualifying on its own, but this assumption isn’t even supported by experience. Backers lose no time in expressing their displeasure with a finished product that isn’t what they were expecting, and Steam reviews are one of the safest places a reviewer can express their true opinions of a game (in contrast to a YouTube review, which carries the threat of a vindictive copyright claim). An irony of the Steam review system is that it eliminates professional opinions from its most prominent measure of game quality.
This brings me to the second major problem with the Steam review system. The content of the reviews is terrible. I would not trust the collective user base of Steam to get a pizza in the oven, let alone inform me about my entertainment options. There are a few genres of reviews, from the stylized templates that nobody has the time or inclination to read but maybe act as a dress rehearsal for a job at PC Gamer, to the guy who’s obviously mad the main character’s a woman but thinks he’s put in some clever subterfuge, to the reviewer who is oh so very knowledgeable about game development and is going to school the creator on the basics. Since a written review is mandatory, it’s up to the user to navigate the corpus using Steam’s provided filters, assuming they want to expose themselves to brain poison. Steam’s own view seems to be that the recommendation is for the buyers and the text is for the developers. The documentation cites user reviews as one channel of feedback among many for developers. While this is true, the system is clearly intended for more than developer feedback, since developer feedback alone doesn’t explain why the aggregate of recommendations is prominently displayed.
Valve is even more specific in its positioning of the review system for developers. They state that the reviews reflect how well the game meets players’ expectations. This is the real problem I have with the review system. It is easy to see how often these expectations are unreasonable and have little to do with the game. Lucifer Within Us was sitting at Mixed on release (and is currently at the 70% mark) with at least 10% of the reviews saying the game is good but they disagree with the price (and in compiling these numbers I was fairly restrictive in choosing ones that had no other substantial complaints, though this percentage represents a fixed point in time and will not age well). These are not informative reviews. The price is known to the user considering the game, and these reviews are not going to be updated if the game gets discounted. The propensity for a user to give Lucifer Within Us a chance based on its quality quite literally hinged on people who felt it was a quality game but did not report it as such because they felt their preferences for price were universal.
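The arithmetic behind that last point can be sketched in a few lines of Python. The counts below are hypothetical round numbers chosen to match the percentages above (70% recommended, with 10% of all reviews being price-only complaints); they are not the game’s actual review data, and the function and variable names are my own.

```python
def recommend_share(positive: int, negative: int) -> float:
    """Percentage of reviews that recommend the game."""
    return 100 * positive / (positive + negative)


# Hypothetical counts per 100 reviews, matching the percentages in the text.
positive = 70
negative = 30
price_only_negative = 10  # "good game, wrong price" reviews among the negatives

as_reported = recommend_share(positive, negative)

# Same reviews, but counting the price-only complaints by what they say
# about the quality of the game rather than by the thumbs down.
if_judged_on_quality = recommend_share(
    positive + price_only_negative,
    negative - price_only_negative,
)

print(as_reported, if_judged_on_quality)
```

Under these assumed numbers, the same set of opinions about the game’s quality reads as 80% rather than 70%, clear of the threshold instead of sitting on it.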
Overall, the current form of the review system seems to take the worst parts of each of its components. The part for the players either forces them to dig deeper to decide if the rating is justified or leaves them with a false impression of the quality of the game. The part for the developers has the authors of the feedback labouring under the assumption that other players care about their opinions beyond the thumbs up. The most prominent measure of quality penalizes niche choices, even as the storefront purports to be lowering barriers to entry. Qualitative judgements are difficult, even when you’re not trying to build an automated system to produce them, but it’s hard to say that a system that can make a creator uneasy about putting their game in front of a new audience is the best that it can be. Through outsourcing reviews to its user base, Steam is very much getting what it pays for.