Wednesday, 14 September 2016

The importance of local data

Last week, I found myself in the unusual position of rowing publicly with the Office for National Statistics.

The reason for the row? The ONS had published the annual figures for civil partnerships in England and Wales. Only this year, a bit of the data was missing: the breakdown by local authority.

When I asked when it would be available, I was told it wasn't being published this time.

Now, the reason the row was unusual is that I'm generally a huge fan of the ONS. The Trinity Mirror data unit, which I run, relies heavily on the fact that the ONS is so good at publishing fine-grained data, broken down by local authority, or super output area, or whatever. We serve scores of local and regional titles, as well as national ones. One dataset can contain multiple stories for multiple titles.

I don't really want to re-hash the row. Suffice to say the ONS has now published the data, and I'm very grateful for that.

Andy Dickinson, in a very fair summary of the issues, mentioned that some onlookers found the whole issue 'odd', linking to this tweet by James Ball:


While it is true that some ONS data is traditionally issued at national level, or in fairly abstract form, it is pretty much unprecedented for a dataset which has always been published with a local breakdown to suddenly not be published with a local breakdown.

That needs to be resisted, and not just by local or regional journalists. And that's what I want to focus on here: why local data matters, and matters to everyone.

It's fairly obvious, I think, that local data is more likely than national data to contain information of interest to local audiences.

Let's look at that civil partnership data, for instance. Yes, the national figure - civil partnerships falling from 1,683 in 2014 to 861 in 2015, largely because of the availability of same-sex marriage - is interesting to everyone. But if you live in Teesside, you might be more interested to know that there were no civil partnerships at all in your region in 2015. If you live in Brighton, you might be more interested to know there were more civil partnerships there than anywhere else in 2015, and more than in the whole of Yorkshire. If you live in Dorset, you might be more interested to know that your county bucked the national trend, with 11 civil partnerships in 2015 compared to just eight the year before. And if you live in Blaenau Gwent, you might be more interested to know that your area has seen fewer civil partnerships than anywhere else in England or Wales since they first became available, with just 15 in eight years (compared to 1,449 in Brighton).

But it isn't just that local data is more interesting to local audiences. Often the local data contains facts that should be more interesting to everyone.

Take another dataset that came out last week: the number of drug-related deaths in England and Wales. This dataset was broken down by local authority.

Because it was, we were instantly able to see that Blackpool has nearly twice as many drug deaths per head as anywhere else in England and Wales.

Just take a moment to consider how astonishing - and alarming - that fact is.
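For anyone wondering what "per head" involves in practice, here is a minimal sketch in Python. The area names and numbers are made-up placeholders, not ONS figures, and the function name `rate_per_100k` is just my own label. The point is only that raw counts have to be divided by population before areas can be compared fairly.

```python
def rate_per_100k(deaths, population):
    """Return deaths per 100,000 residents."""
    return deaths * 100_000 / population


# Placeholder figures for illustration only (NOT real ONS data):
# area name -> (number of deaths, resident population)
areas = {
    "Area A": (30, 150_000),
    "Area B": (12, 120_000),
    "Area C": (45, 900_000),
}

# Convert each raw count into a rate so areas of different sizes are comparable.
rates = {name: rate_per_100k(d, p) for name, (d, p) in areas.items()}

# Rank areas from the highest rate to the lowest.
ranked = sorted(rates.items(), key=lambda item: item[1], reverse=True)

for name, rate in ranked:
    print(f"{name}: {rate:.1f} deaths per 100,000 people")
```

In this toy example the area with the most deaths in absolute terms (Area C) ends up with the lowest rate, which is exactly the kind of thing a national total, or a table of raw counts, would hide.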

We were also able to plot the rate of drug deaths as a map, which revealed that a remarkable number of the places with the highest incidence of drug deaths were coastal towns.



When my colleague Patrick Scott (who did the analysis) tweeted this, a number of hypotheses were put forward by readers. Maybe these places also had high deprivation, or unemployment, or a lack of 'fulfilling' jobs or careers. Maybe they were particularly physically (as well as socially) isolated. Maybe it was 'incomers', heading to the coast to party, who were the people doing the dying.

This brings me to the key point about local data. Local data provides the building blocks of analysis. It is where we start if we want to try to understand what is actually going on.

We can, for example, test the theory about drug deaths being correlated with deprivation, since deprivation figures are available at local authority-level, too.

It's only because of that fact that we can plot a graph like this:


Sure, there's some degree of correlation. But there's something else going on in Blackpool. It isn't a simple pattern. There is more to investigate here.
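Here is a rough sketch of how that kind of check can be done in Python, using only the standard library. Again, the area names, deprivation scores and death rates are invented placeholders rather than real figures, and the helper names `pearson` and `fit_line` are my own. It calculates the correlation between the two measures, fits a straight line, and then looks for the area that sits furthest above that line.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Placeholder figures for illustration only (NOT real data).
names = ["Area A", "Area B", "Area C", "Area D", "Area E", "Area F"]
deprivation = [10, 15, 20, 25, 30, 35]   # hypothetical deprivation score
death_rate = [2, 3, 4, 5, 6, 14]         # hypothetical deaths per 100,000

r = pearson(deprivation, death_rate)
slope, intercept = fit_line(deprivation, death_rate)

# Residual = actual rate minus the rate the fitted line would predict.
residuals = {
    name: y - (slope * x + intercept)
    for name, x, y in zip(names, deprivation, death_rate)
}

# The area furthest above the line is the one deprivation explains least well.
outlier = max(residuals, key=residuals.get)

print(f"Correlation: {r:.2f}")
print(f"Furthest above the trend line: {outlier}")
```

A high correlation with one area still sitting well above the line is the pattern described above: deprivation explains some of the picture, but not all of it. None of this is possible without figures for each local authority.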

Look, data can be a dangerous thing. A lot of people don't know what they are doing with statistics, and make unwarranted claims. So, sadly, do a lot of journalists - a fact that can make people who do know what they are doing with data reluctant to put it in the hands of those who might not. That, however, is to conflate two separate issues: data literacy (which we need to improve), and data availability (which we need to increase).

Ultimately, the battle is won when more data is available and more people can accurately find patterns, trends and information which are interesting to them. The more local the data, the better the chance it has relevance to our lives, and the better the chance we will be able to use it to make connections that make sense of the world.

Surely that's something worth a little row from time to time?