Wednesday, 14 September 2016

The importance of local data

Last week, I found myself in the unusual position of rowing publicly with the Office for National Statistics.

The reason for the row? The ONS had published the annual figures for civil partnerships in England and Wales. Only this year, a bit of the data was missing: the breakdown by local authority.

When I asked when it would be available, I was told it wasn't being published this time.

Now, the reason the row was unusual is that I'm generally a huge fan of the ONS. The Trinity Mirror data unit, which I run, relies heavily on the fact the ONS are so good at publishing fine-grained data, broken down by local authority, or super output area, or whatever. We serve scores of local and regional titles, as well as national ones. One dataset can contain multiple stories for multiple titles.

I don't really want to re-hash the row. Suffice to say the ONS has now published the data, and I'm very grateful for that.

Andy Dickinson, in a very fair summary of the issues, mentioned that some onlookers found the whole issue 'odd', linking to this tweet by James Ball:

While it is true that some ONS data is traditionally issued at national level, or in fairly abstract form, it is pretty much unprecedented for a dataset which has always been published with a local breakdown to suddenly not be published with a local breakdown.

That needs to be resisted, and not just by local or regional journalists. And that's what I want to focus on here: why local data matters, and matters to everyone.

It's fairly obviously true, I think, that local data is more likely to contain information of greater interest to local audiences than national data.

Let's look at that civil partnership data, for instance. Yes, the national figure - civil partnerships falling from 1,683 in 2014 to 861 in 2015, largely because of the availability of same-sex marriage - is interesting to everyone. But if you live in Teesside, you might be more interested to know that there were no civil partnerships at all in your region in 2015. If you live in Brighton, you might be more interested to know there were more civil partnerships there than anywhere else in 2015, and more than in the whole of Yorkshire. If you live in Dorset, you might be more interested to know that your county bucked the national trend, with 11 civil partnerships in 2015 compared to just eight the year before. And if you live in Blaenau Gwent, you might be more interested to know that your area has seen fewer civil partnerships than anywhere else in England or Wales since they first became available, with just 15 in eight years (compared to 1,449 in Brighton).

But it isn't just that local data is more interesting to local audiences. Often the local data contains facts that should be more interesting to everyone.

Take another dataset that came out last week: the number of deaths, related to drugs, in England and Wales. This dataset was broken down by local authority.

Because it was, we were instantly able to see that Blackpool has nearly twice as many drug deaths per head as anywhere else in England and Wales.

Just take a moment to consider how astonishing - and alarming - that fact is.

We were also able to plot the rate of drug deaths as a map, which revealed that a remarkable number of the places with the highest incidence of drug-deaths were coastal towns.

When my colleague Patrick Scott (who did the analysis) tweeted this, a number of hypotheses were put forward by readers. Maybe these places also had high deprivation, or unemployment, or a lack of 'fulfilling' jobs or careers. Maybe they were particularly physically (as well as socially) isolated. Maybe it was 'incomers', heading to the coast to party, who were the people doing the dying.

This brings me to the key point about local data. Local data provides the building blocks of analysis. It is where we start if we want to try to understand what is actually going on.

We can, for example, test the theory about drug deaths being correlated with deprivation, since deprivation figures are available at local authority level, too.

It's only because of that fact that we can plot a graph like this:

Sure, there's some degree of correlation. But there's something else going on in Blackpool. It isn't a simple pattern. There is more to investigate here.
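For anyone who wants to try this sort of check themselves, here is a minimal sketch of how the strength of a relationship like this can be measured. It computes a Pearson correlation coefficient in plain Python. The function name and the numbers are placeholders invented for illustration; they are not the ONS figures or the deprivation scores used in the analysis above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the mean (unnormalised covariance)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder values only: one entry per (hypothetical) local authority.
deprivation_score = [10.0, 15.0, 20.0, 25.0, 30.0]
drug_deaths_per_100k = [2.0, 3.0, 3.5, 5.0, 12.0]

r = pearson_r(deprivation_score, drug_deaths_per_100k)
print(f"Correlation: {r:.2f}")
```

A value near 1 or -1 suggests a strong linear relationship and a value near 0 a weak one. A single coefficient can hide an outlier, though, so it is still worth plotting the points.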

Look, data can be a dangerous thing. A lot of people don't know what they are doing with statistics, and make unwarranted claims. So, sadly, do a lot of journalists - a fact that can make people who do know what they are doing with data reluctant to put it in the hands of those who might not. That, however, is to conflate two separate issues: data literacy (which we need to improve), and data availability (which we need to increase).

Ultimately, the battle is won when more data is available and more people can accurately find patterns, trends and information which are interesting to them. The more local the data, the better the chance it has relevance to our lives, and the better the chance we will be able to use it to make connections that make sense of the world.

Surely that's something worth a little row from time to time?

Sunday, 22 May 2016

Why that Express splash about '12m Turks' is so mathematically illiterate

"12M TURKS SAY THEY'LL COME TO THE UK," shouted the Sunday Express today. The story was based, they said, on an exclusive opinion poll carried out in Turkey, asking people whether they would want to come to the UK if we stay in the EU, and Turkey joins.

My first thought was: well, that sounds like kind of a lot. Twelve million. The entire population of London and Greater Manchester combined.

My second thought was: I hope they have included a link to the full opinion poll so I can judge the figures for myself.

They did. It is here. And it proves the 12 million figure is utterly unjustified, and the Express has (presumably unwittingly) made some dreadful errors with the statistics.

The 12 million figure is based on a simple calculation. Take the percentage of people who answered 'yes' to the pollsters' question about coming to the UK (15.8%). Apply that to the total population of Turkey (80 million or so). Hey presto, you get about 12 million.

And that's fine.

But let's look at the question that was asked:


Notice the word 'consider'. Not 'would you intend to move', or 'would you probably move', but 'would you consider'. Personally I consider doing a lot of things I'm unlikely to end up doing. I've considered moving to New York. I've considered going vegetarian. When I was buying a car, I considered very many models. 

I only ended up buying one.

But the real problem is that other bit: 'or any member of your family'.

This skews the figures completely and renders the 12 million claim utterly null and void.

To understand the point, imagine if Turkey had a population of just 100, neatly divided into 10 separate families.

Now imagine that each of those families had just ONE member who would consider moving to the UK.

It doesn't take a maths genius to realise that the number of people who would consider moving to the UK is 10, and that these 10 make up 10% of the entire population.

But what if we applied Express logic to this group of people?

Let's imagine pollsters choose one person from each family and ask them the question: "Would you, or any member of your family, consider moving to Britain?"

Now, NONE of the people polled would themselves consider moving to the UK. But ALL of them have a family member who would.

So they would all have to answer 'yes' to the question as put.

We would end up with 10 out of 10 people, 100%, saying 'yes'.

In fact, whichever one of the 100 people you polled, they would all have to say 'yes'. Even though 90% have no intention of even considering moving to the UK.

Applying Express logic, and generalising these figures to the population as a whole, we'd end up with a headline saying that EVERY SINGLE PERSON IN TURKEY was going to come to the UK. And yet clearly only 10 out of 100 were even considering it.
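If you'd rather see the thought experiment run than take my word for it, here is a small sketch in Python. It builds the imaginary population of 100 people in 10 families, with exactly one person per family who would consider moving, and compares the true share with the share who would have to answer 'yes' to the question as worded. The variable and function names are mine, made up for the illustration.

```python
# 100 people in 10 families of 10; one member per family would consider moving.
FAMILY_COUNT = 10
FAMILY_SIZE = 10

# Each person is a pair: (family id, would this person consider moving?)
population = [
    (family, member == 0)
    for family in range(FAMILY_COUNT)
    for member in range(FAMILY_SIZE)
]

# Families containing at least one person who would consider moving
families_with_a_mover = {family for family, considers in population if considers}


def answers_yes(person):
    """'Would you, or any member of your family, consider moving?'"""
    family, considers = person
    return considers or family in families_with_a_mover


true_share = sum(considers for _, considers in population) / len(population)
poll_share = sum(answers_yes(person) for person in population) / len(population)

print(f"Actually considering a move: {true_share:.0%}")
print(f"Answering 'yes' to the question: {poll_share:.0%}")
```

The gap between the two figures depends entirely on how the would-be movers are spread across families, which is exactly why the 'yes' percentage can't simply be multiplied by the population.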

Look at it this way: I imagine as near as dammit 100% of Turks have at least one family member who has considered NOT moving to the UK if Turkey enters the EU. Maybe they are considering moving somewhere else; maybe they are at least considering staying in Turkey.

You wouldn't therefore try to construct a splash saying "ABSOLUTELY NO TURKS WILL ENTER THE UK IF WE MAKE THEM PART OF EUROPE".

That would be equally stupid.

It is, in short, a terrible way to phrase the question if your honest intent is to find out the number of people who are genuinely likely to move to the UK in the event of Turkey joining the EU.

(A final note: quite a lot of people on Twitter are criticising the poll for 'only' asking 2,600 people for their opinion. Well, it's a poll. That's how polls work. It's a decent sample size with a tolerable margin of error. There are many things wrong with the poll, but that isn't one of them.)
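For the sceptics, here is a rough sketch of the margin of error on a sample of that size. It uses the standard formula for a proportion at the 95% level and assumes a simple random sample, which is my assumption; I don't know the poll's actual design or weighting, so treat the result as a ballpark only.

```python
from math import sqrt


def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p from a simple
    random sample of size n (z=1.96 gives the 95% level)."""
    return z * sqrt(p * (1 - p) / n)


# 15.8% answered 'yes'; roughly 2,600 people were polled.
moe = margin_of_error(0.158, 2600)
print(f"Margin of error: +/- {moe:.1%}")
```

That works out at roughly one and a half percentage points either way, which is why the sample size is not the problem here.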

Sunday, 21 February 2016

How Labour lost Middle England

Ah, Middle England. A mythical place of cafetieres, Scandinavian mini-series and pilates. Where Nissan Qashqais roam the land, carpets are shoe-free zones, and the wine rack is always full of reasonably-priced supermarket reds.

The place - legend has it - where general elections are won or lost.

I grew up in Middle England. At least, I was always pretty sure I had. Detached house in semi-rural Lincolnshire. Went to school at a good local comprehensive. Played cricket and tennis as well as football. That sort of thing.

But perhaps you think you also grew up in Middle England, and perhaps your experience was quite different to mine. That's the problem with Middle England: it's so bloody ill-defined. We all think we know what it is; we just never actually spell it out.

These days it doesn't really matter. We don't need to compare notes from school days, or the labels on our clothes. Demographic data is plentiful, and easy enough to analyse. We can measure the middle of England - and in very precise terms.

Middle England, then, is not a mythical concept at all. It's a mathematical one. Middle England becomes Median England, or Modal England, and there's nothing very mysterious about that.

Let's say Middle England really does decide elections. What did it think in 2015?

First let's get our parliamentary constituencies in some sort of ranked order, from most to least affluent. There are various ways of doing this. For now, let's use a simple measure: house prices, which are available for constituencies here.

Do this, and something immediately becomes apparent: London's housing market is ridiculous and skews the figures dramatically. Nineteen of the 20 constituencies with the highest median house prices in 2014 were in the capital. In Kensington, the average house price was an eye-watering £1.2 million.

So we'll leave London out of it.

That leaves us with 460 English constituencies, from Esher and Walton in Surrey (median house price: £465,000) to Liverpool Wavertree (median house price: £78,000).

You'd probably expect the richest areas to vote Conservative. You might not realise the extent to which that happens. In 2015, 34 of the 35 constituencies with the highest median house prices voted Tory.

The one that didn't? Cambridge.

Similarly, 42 of the 43 places with the lowest median house price voted Labour. Pendle, held by the Conservatives' Andrew Stephenson, was the exception.

What, then, of Middle England?

To visualise what happened, let's divide the 460 non-London English constituencies into ten groups of 46. The wealthiest areas, in terms of house price, will be group one. The poorest will be group 10.
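The grouping itself is simple enough to sketch in a few lines of Python: rank the constituencies by median house price, most expensive first, then cut the list into ten equal slices. The function name and the placeholder data below are invented for illustration; the real input would be the constituency house price figures.

```python
def split_into_groups(constituencies, group_count=10):
    """Rank (name, median_price) pairs from most to least expensive and
    split them into equal groups. Group 1 is the most expensive."""
    ranked = sorted(constituencies, key=lambda item: item[1], reverse=True)
    size, remainder = divmod(len(ranked), group_count)
    if remainder:
        raise ValueError("constituencies do not divide evenly into groups")
    return {
        group + 1: [name for name, _ in ranked[group * size:(group + 1) * size]]
        for group in range(group_count)
    }


# Placeholder data: 460 made-up constituencies with made-up prices.
placeholder = [(f"Constituency {i}", 78_000 + i * 1_000) for i in range(460)]

groups = split_into_groups(placeholder)
print(len(groups), "groups of", len(groups[1]))
```

With 460 constituencies and ten groups, each group holds 46 seats, so counting the winning party within each group gives the breakdown directly.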

Here's what happened at the general election:

What's potentially concerning for Labour here is that even among the seventh group, a majority elected a Conservative MP.

If we take Middle England to be groups 4, 5, 6 and 7, the Tories won 143 seats in Middle England. Labour won 37.

In the old days, of course, Labour didn't really need to win Middle England. It had Scotland, too, where it always did disproportionately well compared to the Conservatives. The SNP wipeout last year removed that particular crutch. There's no suggestion in the polls it's coming back any time soon.

Labour still does well in Wales, with its 36 MPs, and London, with 54. But if the party doesn't need to win Middle England outright, it certainly needs to be fighting for it.

In 2015, that was a fight Labour lost - and lost badly.