How many sheep? Mis-counting productivity.

[Image: how-many-sheep.png]
Count ’em: 1…?

Counting is as easy as 1,2,3. Sometimes.

Tell that to people who count productivity. Here, the apparently simple act of counting is of course nothing of the sort. When tracking economic change or comparing performance, it is another deep source of uncertainty – often underestimated and unacknowledged, as we’ll discover.

To see how it begins, a simple question: are there two sheep here, or one? 

Does the lamb count? 

No? A lamb isn’t a sheep, then? Ok, how about one and a half?

But not so fast… the ewe is about to give birth, imminently (I made that up, but let’s pretend). 

So, what about one and two halves? That makes two, like some of you said, even if you didn’t mean that kind of two. Or maybe one, plus a half, plus a quarter? 

And no, the white thing just visible above the one sheep that’s most incontestably a sheep doesn’t count: that’s a rock.

Counting: As easy as 1,2,3.

You take the point: counting higher than one just got difficult, and the reason is all to do with definition.  Whenever we count something, we define it. And if we want to count two, we’re saying they’re in some important way the same thing. 

The trivial idea is that definitional ambiguity leads to uncertainty: how many sheep are there if we don’t agree what a sheep is / what season to count them in (maybe not in Spring) / etc? 

What we measure, using what definitions and methods, can be highly contestable, and in much of our public data these things can and do change. This means that even if our counting itself were perfectly accurate, miraculously free of sampling problems and the like, there would still be uncertainty.

Counting productivity

Sheep are at the relatively easy end of the problem and – you’ll gather – not really what this post is about. We have bigger statistical fish to fry (don’t you love metaphors?), like comparisons of rates of cancer or other disease between countries, for example, or comparisons of productivity or GDP growth. 

It’s the latter that are the focus of this blog. Vital to our understanding of what’s going on in the economy and capable of rattling politics with one bad set of numbers, these too are full of definitional headaches.

The headaches matter for two main reasons: A) because the effects of definitional changes in economics can be huge; and B) because, despite this, the implications for the reliability of any given economic number are often ignored and, with that, the implications for any policy based on it.

Occasionally, though, the issue is forced into the open.  Take UK productivity. In the long run, said the economist Paul Krugman, productivity is almost everything. As we produce more value for each hour worked (in the case of labour productivity) a modern economy grows richer. 

Among numerous current concerns about UK productivity is the apparent fact that growth has slowed dramatically in recent years but we’re not entirely sure why – known as the productivity puzzle – and that labour productivity per hour is strikingly lower than in many other large economies. You don’t have to look far to find people who are worried. This, from The Times in January, was fairly typical.    

[Image: times-on-productivity.png]

But the UK is not lagging 20% behind. At least we don’t think so. And not because it suddenly caught up. About a month earlier in fact, the best estimate of the productivity gap fell overnight from about 20% to about 10% – an enormous correction. The OECD, working with the UK’s Office for National Statistics, said it had spotted differences in the way countries counted how many hours people worked. And if hours worked are counted differently, then output per hour changes. 

The UK had been making very few adjustments to reported working hours. In France, by contrast, those reported hours are statistically adjusted for holidays, strikes, sick leave and so on. This means French working hours appear shorter, and output per hour higher, than they would under the old UK formula. On a more like-for-like basis, it turns out that the notoriously overworked Brits actually work shorter hours than the notoriously long-lunching French – and the productivity gap between the two countries roughly halves.*
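To see why the way hours are counted moves the gap, here is a minimal sketch in Python. Every figure in it is hypothetical, chosen only so the arithmetic is easy to follow; none of it is OECD or ONS data, and the names `output_per_hour` and `gap_percent` are mine, not anyone's official method.

```python
def output_per_hour(output, hours):
    """Labour productivity: value of output divided by hours worked."""
    return output / hours


def gap_percent(home, other):
    """How far `home` falls short of `other`, as a percentage of `other`."""
    return (other - home) / other * 100


# Hypothetical figures, for illustration only (not OECD or ONS data).
output_a, output_b = 72_000, 80_000   # annual output per worker, any currency
reported_hours_a = 1_800              # country A: hours as reported, unadjusted
adjusted_hours_a = 1_600              # country A: hours after adjusting for holidays, sickness etc.
hours_b = 1_600                       # country B: hours already adjusted

productivity_b = output_per_hour(output_b, hours_b)

gap_before = gap_percent(output_per_hour(output_a, reported_hours_a), productivity_b)
gap_after = gap_percent(output_per_hour(output_a, adjusted_hours_a), productivity_b)

print(f"Gap with unadjusted hours: {gap_before:.0f}%")
print(f"Gap on a like-for-like basis: {gap_after:.0f}%")
```

With these made-up numbers the gap halves, from 20% to 10%, although nothing about either economy has changed. Only the count of hours has.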

The gap in labour productivity with the United States is also around 8 percentage points smaller than previously thought – closing from 24% to 16%. With Germany, the gap shrinks from 22% to 14%. UK productivity leapfrogs Italy. 

Clearly, the gaps with France, Germany and the US haven’t disappeared. They are still, some might argue, embarrassing – though there might also be benign reasons for some of the gap in some cases. For example, France has higher unemployment and lower employment than the UK (UK about 4% unemployment, France about 9%, according to the OECD), and many of the UK’s extra jobs are low-wage, meaning low productivity. That’s no reflection on the importance of these jobs or how hard people work, it’s simply the way economists assess the cash-value of their output. But it means we could probably raise average productivity per hour in the UK if we sacked a few of those workers. No one thinks we should. So, being behind might not all be bad.
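The point about low-wage jobs is a composition effect, and a rough sketch shows how it works. The workforce below is entirely hypothetical (two groups, with invented hours and output per hour), and `average_productivity` is an illustrative helper of my own.

```python
# Hypothetical workforce, for illustration only: (hours worked, output per hour).
jobs = {
    "higher-paid": (80, 50.0),
    "low-wage": (20, 20.0),
}


def average_productivity(jobs):
    """Total output divided by total hours across all groups of jobs."""
    total_output = sum(hours * per_hour for hours, per_hour in jobs.values())
    total_hours = sum(hours for hours, _ in jobs.values())
    return total_output / total_hours


with_everyone = average_productivity(jobs)
without_low_wage = average_productivity({"higher-paid": jobs["higher-paid"]})

print(f"Average output per hour, everyone employed: {with_everyone:.1f}")
print(f"Average output per hour, low-wage jobs cut: {without_low_wage:.1f}")
```

In this toy case the average rises from 44 to 50 once the low-wage jobs are cut, even though total output falls and nobody who is still working has become any more productive.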

But put that to one side. No matter how we interpret the gap – for good or bad – this was unquestionably a massive correction, implying that we should have had serious doubts about how big the gap really was.

Confidence built on uncertainty

And yet, there was little sign of doubt in the presentation of that old data, nor in those who discussed it. When The Times picked up the figure of 20%, presumably from an ONS press release or website, the published data had not yet been revised to take account of this latest methodological correction, and the paper appears to have been unaware of the OECD’s findings, or that a correction was on the way. Yet with hindsight we see that because of the potential for methodological or definitional differences we should not have accepted those old numbers with anything like confidence. 

The problem extends further. Labour productivity has two components: hours worked, and how much value we crank out of them. Definitional problems apply to both. Output (usually GDP) also depends on a thousand-and-one definitions, some of which are subjects of intense research and argument about what might have changed in an increasingly digital and service-based economy where output is far from easy to define. Some commentators suggest that the current definitions are failing to capture significant chunks of that output. Presumably, these definitions might also be revised. 

The extensive doubts about what’s going on in these cases are essentially caused by definitional ambiguity, incorporated into methodology. What should we count when we count hours worked? When we compare counts in one place with those in another, are we sure we have every potential variation in method nailed down? Are we looking at sheep, or lambs?

As for differences between places, so for differences in time, which can also introduce methodological glitches. Are we sure that our methods, whatever they are, don’t need to be changed in some way to account for changing behaviour in the workplace, for example, or changing patterns of work or output?

No, we are not sure, because we can’t be sure. We can’t know all the problems that might lurk in the data, old, new or emerging, as a result of our choice of method or definition, or with what potential magnitude of effect. 

Then what do we do? 

Do we assume we now have this latest productivity comparison fixed? Do we pretend, in other words, that everything’s good enough? Or could there be something else going on? Should we just whack on an extra ten percentage point potential error just in case, in either direction? What if definitions of output do also change – as seems likely at some point – when we find new ways to describe value in a service-based, digital economy, and what if these changes have varied effects on different economies which have different balances of economic activity – some more manufacturing based, others more service heavy (like the UK)? Then productivity will change again, and so will the gaps, even though real productivity – whatever that is – will be chugging along in its own sweet way wholly unaffected by any of these changes to what we say it’s doing. 

When I discussed the productivity correction with the Winton Centre’s David Spiegelhalter, he said he was amazed how little attention it attracted. 

‘This is so important, and it is generally glossed over,’ he said. ‘I think politicians and journalists are deeply disturbed at realising the frailty of these concepts that they use in arguments, and would simply rather not know.’

I think he’s right that few are ready to face up to the frailties. On the other hand, I suspect a lot of us sort of know deep down that these things couldn’t be other than frail. So, we finish up with a classic cognitive dissonance: we know it’s dodgy, but we’d really, really like it to be reliable, and the way that we talk about the measurements, obsessing over small changes, implies that we simultaneously think there’s a solid ‘productivity’ or ‘GDP’ out there – and that the number is somehow true. The problem, then, is how to force the chatter out of its denial about what many know are the awkward limitations. 

Beyond uncertainty?

But David also invites a question, namely whether we should think of definitional problems as uncertainty:

‘I would not even call this uncertainty, since what is it we are uncertain of? Unless you think there is some true, platonic, ‘productivity’ out there, this is all convention, and so it is all ambiguity.’

‘“Uncertain” should mean there is at least a theoretical possibility of being certain, either through additional knowledge or simply waiting to see what happens. The hint is in the term – “uncertain”.’

From a purely practical point of view, I guess what most matters is whether the economy – including productivity – is going in a good direction for the right reasons at a reasonable rate, so that if it’s not we can try to change policy. In that sense, I think we use the word ‘uncertain’ to qualify our confidence in what we think is a reasonable judgement based on the data available. 

But when we’re altogether uncertain how uncertain we should be, or what the thing is that we’re uncertain of, perhaps the problem goes beyond that. 

The least we can do is acknowledge it – which journalism almost never does, and even the ONS doesn’t do much. I think we should be open to ideas about what a suitable qualification to the data should look like in these cases, if it’s possible to frame one. Meanwhile, we could use examples like the OECD’s revisions of labour productivity to remind ourselves that numbers we might be inclined to take for granted can be badly wrong in ways we hadn’t even previously thought of. 

*Do the French really work less than the Brits? Average working hours might not be the best measure. If a larger part of the working population in the UK than in France is part-time (UK about 23%, France about 14%, according to the OECD), then average UK hours will look low, even though full-time UK employees might be bigger slaves to the desk. A definition of working hours that might sound sensible for the purposes of assessing overall productivity might be less good at telling us what we want to know about a long working hours culture. 
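A rough sketch of that footnote's arithmetic, in Python. The part-time shares are the approximate OECD figures quoted above; the weekly hours are hypothetical, picked only to show the effect, and `average_hours` is an illustrative helper of my own.

```python
def average_hours(full_time_hours, part_time_hours, part_time_share):
    """Average weekly hours across full-time and part-time workers."""
    return (1 - part_time_share) * full_time_hours + part_time_share * part_time_hours


# Part-time shares: rough OECD figures from the text (UK 23%, France 14%).
# Weekly hours: hypothetical, for illustration only.
uk_full_time, fr_full_time = 40, 38   # assumed full-time weekly hours
part_time = 20                        # assumed part-time weekly hours in both countries

uk_average = average_hours(uk_full_time, part_time, 0.23)
fr_average = average_hours(fr_full_time, part_time, 0.14)

print(f"UK average weekly hours: {uk_average:.2f}")
print(f"France average weekly hours: {fr_average:.2f}")
```

Under these assumptions the UK average comes out slightly lower than the French one, even though the full-time UK worker in the example puts in two more hours a week.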

(A version of this post originally appeared in February 2019 as a blog for the Winton Centre for the Communication of Evidence and Uncertainty.)
