We do not know how useful most games analysis is. More developers and analysts are sharing their analysis publicly but regularly miss one measure that would make the results significantly more helpful. This measure is the variability of the estimate.

It helps to think about what is being measured when applying statistics. Most of the time we want to learn something we don’t know about a population (buyers of games for instance). It is not practical to measure the population and so we use a sample to make an estimate of the thing we want to learn. Often we want to learn about the mean, which is a measure of central tendency. It is a summary of all the values the population takes on once weighted for their probability. When wondering if someone will buy a game, each individual customer will only ever be in one of two states: bought or did not buy, but the mean will summarize the population by reporting the probability that a given person bought the game.
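As a minimal sketch of that last point, assume a small, entirely hypothetical sample of customers coded as 1 (bought) or 0 (did not buy). The mean of those values is simply the share of buyers, which is our estimate of the purchase probability in the population.

```python
from statistics import mean

# Hypothetical sample: 1 = bought the game, 0 = did not buy.
sample = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

# The mean of a 0/1 variable is the share of buyers in the sample,
# which we use as an estimate of the purchase probability.
purchase_rate = mean(sample)
print(f"Estimated purchase probability: {purchase_rate:.2f}")
```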

Not every reader of games analysis will have been exposed to concepts like populations or samples, but we take a lot of this for granted each time an article reports a mean. Reporting all of the values in a sample is unwieldy, and so a summary measure is more informative and easier to understand. However, populations are often characterized by two values: the mean and the variance. While we are interested in the mean, the variance allows us to make meaningful statements about the mean.

The variance tells us something about the spread of values. To see why the variance is important, consider the two distributions below. The vertical axis measures probability, while the horizontal axis measures the values. Both of these distributions have the same mean, but the first one is much more concentrated around the mean than the other. We would expect to see values close to the mean appear much more frequently from the first distribution than the second.
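The same idea can be shown with numbers. The two lists here are made-up values chosen only so that the means match; they are not drawn from any real games data.

```python
from statistics import mean, pvariance

# Two hypothetical sets of values with the same mean but different spread.
narrow = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

# Both means are 50, so the mean alone cannot tell the two apart.
print(mean(narrow), mean(wide))

# The variances are 2 and 800: values from `narrow` sit close to the mean,
# while values from `wide` are scattered far from it.
print(pvariance(narrow), pvariance(wide))
```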

Suppose you were offered an estimate of how much needed to be spent on marketing in order to achieve a minimum number of sales. If an analyst came back with an estimate of $5,000, this would probably be good news. Now suppose the analyst came back with a range that said the required value was between $3,500 and $6,500. Not a catastrophe, and you might even hedge your bets by spending $6,500 rather than the mean (which is still $5,000). What about $0 to $10,000? The top of the range might still be feasible, but at some point the validity of the estimate is going to be called into question.

“You mean to tell me we may need to spend a fraction of someone’s salary or nothing at all to sell this thing?” This might even be the time we seek out another analyst, and yet, provided this range is reported accurately, the analyst is doing us a favour. This range exists for every estimate, even if it is not reported. The current state of games analysis is the analyst who reports $5,000 but does not report, and possibly doesn’t even bother to check, whether the variance is high.

By reporting either a variance or a confidence interval, most games analysis can become considerably more informative. Analysts may find it costly to report a measure that undermines the certainty that was implied by the single authoritative mean, but consumers of analysis can and should demand more. There are far greater errors present in most games analysis (including my own) because of the limitations of the data we are using, but quantifying the uncertainty allows readers to use the results intelligently, instead of relying on ‘gut checks’ or the reputation of the analyst.
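To show how little extra work this is, here is a rough sketch of a confidence interval for a mean. The function name `mean_confidence_interval` and the spend figures are hypothetical, and the interval uses a normal approximation, which is only a reasonable shortcut for larger samples; with a handful of observations a t-based interval would be wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(values, confidence=0.95):
    """Return (mean, lower, upper) using a normal approximation."""
    n = len(values)
    centre = mean(values)
    # Standard error of the mean: sample standard deviation over sqrt(n).
    standard_error = stdev(values) / sqrt(n)
    # Critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return centre, centre - z * standard_error, centre + z * standard_error

# Hypothetical marketing spend (in dollars) needed to hit a sales target.
spend = [4200, 5100, 4800, 5600, 5300, 4700, 5000, 5400, 4600, 5300]

centre, lower, upper = mean_confidence_interval(spend)
print(f"Mean ${centre:,.0f}, 95% interval ${lower:,.0f} to ${upper:,.0f}")
```

Reporting the interval alongside the mean costs two extra numbers and tells the reader how much weight the estimate can bear.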

Reporting variance would also allow analyses to be meaningfully compared with each other. Consider the case of the Boxleiter number. This number has been estimated on a number of occasions, and it has been suggested that it has decreased over time. It’s possible, but we don’t need to guess. If the estimates of the Boxleiter number had reported a variance, there is a test that can be performed to determine whether the results are statistically different from each other. You don’t even need the underlying data. If you’re a Python user, stats.ttest_ind_from_stats in SciPy will do the trick with just the mean, the standard deviation (which is the square root of the variance), and the number of observations for each estimate. There are also more complicated tests for when the assumptions of the t-test aren’t satisfied. For want of a single number we have lost the ability to answer an interesting and important question.
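For readers who want to see what such a test involves, the sketch below computes Welch's version of the two-sample t-test from summary statistics alone, using only the Python standard library. The helper name `welch_from_stats` and every number passed to it are hypothetical; they are not real Boxleiter estimates. The p-value is a normal approximation, so it is only a rough guide; as far as I know, calling SciPy's stats.ttest_ind_from_stats with equal_var=False gives the exact t-based version of the same test.

```python
from math import sqrt
from statistics import NormalDist

def welch_from_stats(mean1, std1, n1, mean2, std2, n2):
    """Welch's t statistic, degrees of freedom, and an approximate p-value."""
    # Squared standard error of each mean.
    v1 = std1 ** 2 / n1
    v2 = std2 ** 2 / n2
    t_stat = (mean1 - mean2) / sqrt(v1 + v2)
    # Welch-Satterthwaite degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Two-sided p-value from a normal approximation (an assumption made
    # here to avoid a SciPy dependency; it understates p for small samples).
    p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))
    return t_stat, df, p_approx

# Hypothetical summary statistics for an earlier and a later estimate.
t_stat, df, p_approx = welch_from_stats(
    mean1=60, std1=25, n1=40,
    mean2=45, std2=20, n2=50,
)
print(f"t = {t_stat:.2f}, df = {df:.1f}, approximate p = {p_approx:.4f}")
```

A small p-value would suggest the two estimates really do differ; a large one would mean the apparent decrease could easily be noise.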

Developers and analysts are under no obligation to meet a particular standard, and some of these articles are intended to support other marketing efforts, but a useful article is a shareable article. It is a shame to find an exciting result, only to realize that it’s less certain than we thought once the variance is taken into account, but these results are still useful. Sometimes precision is just a matter of getting a larger sample, and honestly reporting the challenges in the original work can sometimes be the key to getting in touch with someone who can access one.

Any data-driven article can be enhanced by quantifying the uncertainty of an estimate, whether it’s a direct report of the variance, a confidence interval, or whatever measure is appropriate for what is being reported. The additions to an article amount to a few more numbers at most and possibly some discussion. The readers walk away more informed and, more importantly, the articles that do report their uncertainty can be meaningfully compared, making them more shareable. It is a benefit for both the analyst and their readers.