You can make stats say pretty much anything.


Jared Quinert pointed out to me that I missed an opportunity to work in a good point about testing in my entry supporting the WGA. I’ve recently been struggling with the task of coming up with meaningful stats to report to my handlers to help justify the existence of my team and anyone with a brain cell would have noticed that talking about prodco execs massaging stats of WGA writers would have been the perfect segue into this. Apparently I still suck at the blogging thing. I’m sorry.

Let’s pretend for a moment that I was clever enough to put two and two together. Here’s what I might have said.

You can make statistics say pretty much whatever you want. This is potentially dangerous because in my experience, more often than not, your audience is passive and won’t care how you arrived at your final figures as long as the layman’s explanation sounds reasonable.

A tester worth their salt is different. They’re conditioned to ask ‘why’ of almost everything. But I have a sneaking suspicion that there are many people out there who accept any given number at face value as long as it’s not obviously outrageous – maybe because they’re afraid of numbers, maybe because it simply doesn’t occur to them to question why.

Let’s take the following statement on the AMPTP’s website:

According to WGAw, 4,434 of its working film and television members earned a combined $905.8 million in 2006. The average member earned $204,295 and over half earned at least $104,750. The WGA noted that these numbers are based on earnings reported for dues purposes and thus do not fully reflect above-scale payments.

In case you’re wondering where they got their information, they got it from a report like this one.

It would be very easy to read that statement and conclude that at least half of screenwriters earned over 100K last year – not a bad paypacket in most people’s book. Certainly the AMPTP seem quite content to leave it at that.

Let’s have a closer look though. One key word that jumps out at me is ‘working’. ‘4,434 of its working film and television members…’ – meaning that there is an unstated number of non-working film and television guild members. What is the guild’s total membership? According to the report I linked to above, it’s 7313 +/- 400.

So when they say ‘The average member earned $204,295’, they mean roughly $910m divided by the 4,434 working members, which is how they arrive at their average figure of around $200K. Let’s factor in the conservative figure for total membership instead – 6,913 (7,313 – 400).

910m / 6913 = $131,636

Very different to their figure of $205K. Still, $131K is still a pretty good wicket to be on, right? Maybe, but remember, we’re now averaging across the entire membership, and at least 2,479 of those people (36%) did not work at all during the fiscal year stated.

Let’s add a little more contextual information. As a writer, you might work steadily for a season on TV, or be paid for a script during one year, and then not work again for 12 months or more. The 36% of writers that didn’t work this year are predominantly not going to be the same people who don’t work next year (assuming the strike is resolved by then).

Taking our average figure of 131.6K and stretching that across two years, we’re suddenly looking at 65K and this is before we factor in tax. An agent will typically take 10 percent which further eats away at your nest egg. Suddenly the princely sum the AMPTP are touting is not looking nearly so princely.

Even if you were to say that the average writer works 64% of the time, then you come to a sum of $84,224, less agent’s fees – $75,800, less tax (let’s be generous and tax them 33%) – and you arrive at almost exactly $50,000. And remember, I’ve been using conservative figures.
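The chain of arithmetic above can be sketched as a few lines of Python. All the inputs come from the AMPTP statement and the WGA report quoted earlier; the rounding (using $910m rather than $905.8m, and a flat 64% working rate) follows the post, so treat this as an illustration of the reasoning rather than an exact accounting:

```python
# Inputs from the quoted AMPTP statement and WGA report
total_earnings = 910_000_000       # $905.8m, rounded up as in the post
working_members = 4_434            # working film and TV members
total_members_low = 7_313 - 400    # conservative end of 7,313 +/- 400

# The AMPTP-style average: earnings spread over working members only
ampt_avg = total_earnings / working_members        # ~ $205K

# Average over the whole membership, conservative count
all_avg = total_earnings / total_members_low       # ~ $131.6K

# Assume a writer works 64% of the time, then subtract the
# agent's 10% cut and a (generous) 33% tax rate
gross = all_avg * 0.64                             # ~ $84K
after_agent = gross * 0.90                         # ~ $76K
after_tax = after_agent * (1 - 0.33)               # ~ $50K

print(f"AMPTP average: ${ampt_avg:,.0f}")
print(f"All-member average: ${all_avg:,.0f}")
print(f"Net after agent and tax: ${after_tax:,.0f}")
```

Same headline dataset, two very different answers – which is the whole point: the number you get depends entirely on which denominator and deductions you choose to mention.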

Are the AMPTP being deceitful? I’ll leave that for you to conclude. But they’re certainly not giving us the entire picture.

As software testers, we’re taught to look for multiple explanations for problems. The map is not the territory. How often has your first conclusion turned out to be incorrect? If you’re anything like me, the answer is: very often.

I don’t think examining stats is very different in this regard. You need to look closely and think critically, whether or not there’s a neat little summary next to the numbers that tells you what you’re supposed to think. What the numbers don’t say may be just as important as what they do.

What contextual information may be missing? Can you drill further into the numbers and look more closely? Is there a heavy skew and a long tail hidden by a mean average? What does that mean?

Abstracting a little more, you can look at who has put the statistics together and why they might have chosen to present them the way they did. Do they have a particular bias or motive that may make them want to present statistics in a particular light? Someone with a KPI bonus for keeping bug counts low might present a set of figures very differently than someone with a bonus based on the number of bugs found.

I am not at all surprised at the number of war stories one hears about statistics being misused and abused. It’s too easy to accept numbers at face value when the numbers look as good as you want them to, or present a slam-dunk answer to an argument.

I am going to do my best, however, to make a point of being the guy who, when presented with stats, digs deeper and shakes things to see what falls out. It might be a bit more effort, but it might also become a lot more interesting, not to mention valuable.
