How Energy Performance Certificate data could be really useful, but isn't

One of the frustrating things about discussing housing in this country is that we have historically lacked some key data which would allow us to compare ourselves more accurately with other countries. For example, there are no good statistics on the size of homes we're building now. It is commonly accepted that we build very small homes compared to other countries, but some of the evidence for this is very out of date (see this Policy Exchange report from 2005 which cites these EU statistics from 2002 which cite English House Condition Survey data from 1996). It may well be true that we are building small homes at the moment, but we just don't have good enough data to say for sure. [Update: I completely forgot about RIBA's excellent research on this very topic. Thanks to Rebecca for reminding me. So the data gap isn't quite as large as I thought, though much of the below still applies.]

Relatedly, we can't really compare our house prices with those in other countries because the simplest consistent comparison, price per square foot or square metre, is not available to us. This matters to people who make housing policy, but it also matters to people thinking of moving house between different countries.

There is a solution in sight, however. The law requires an Energy Performance Certificate to be produced for every house that is sold or rented out. An EPC is drawn up by an expert after looking over the house, and captures key information about the energy efficiency of the house. But it also captures other information, notably the type of house, its size in square metres, and its exact address. There are now about 7 million domestic EPCs, all held on a single register and as of today searchable by address. I just looked up the EPC for a house down the road from me, which is the same kind of Victorian mid-terrace as the one I share with friends. It pretty much confirms what I thought, which is that our house retains heat about as well as a sieve.

To get back to my point though, what this means is that we've got a huge and growing database of home sizes. And because the Land Registry has recently started releasing its data on house prices, again with the exact address provided, it should be possible to link the two datasets together to calculate the average price per square metre in different parts of the country, for new as well as old houses. More sophisticated analysis could also reveal how much more people are willing to pay for more energy-efficient homes.
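To make the idea concrete, here is a rough sketch in Python of what that linking might look like. It is an illustration under assumptions, not a description of how either dataset is actually published: the field names (address, postcode, floor_area_m2, price) and the sample rows are made up, and the address matching is deliberately crude. Real addresses are far messier and would need proper cleaning.

```python
import re
from collections import defaultdict
from statistics import median


def address_key(address, postcode):
    """Build a crude join key: uppercase, drop punctuation, collapse spaces."""
    text = f"{address} {postcode}".upper()
    text = re.sub(r"[^A-Z0-9 ]", "", text)
    return " ".join(text.split())


def price_per_sqm_by_area(epc_rows, sale_rows):
    """Median price per square metre, grouped by the first half of the postcode.

    Field names are hypothetical, and postcodes are assumed to be written
    with a space between the two halves (e.g. "SE15 3AB").
    """
    # Index floor areas from the EPC records by normalised address.
    floor_area = {
        address_key(row["address"], row["postcode"]): row["floor_area_m2"]
        for row in epc_rows
    }

    by_area = defaultdict(list)
    for sale in sale_rows:
        area_m2 = floor_area.get(address_key(sale["address"], sale["postcode"]))
        if not area_m2:
            continue  # no matching EPC (or no usable floor area), so skip
        district = sale["postcode"].split()[0].upper()
        by_area[district].append(sale["price"] / area_m2)

    return {district: median(values) for district, values in by_area.items()}


# Invented sample data, purely for illustration.
epc_rows = [
    {"address": "12 Example Road", "postcode": "SE15 3AB", "floor_area_m2": 80.0},
    {"address": "14 Example Road", "postcode": "SE15 3AB", "floor_area_m2": 100.0},
    {"address": "3 Sample Street", "postcode": "M14 5XY", "floor_area_m2": 90.0},
]
sale_rows = [
    {"address": "12, Example Road", "postcode": "se15 3ab", "price": 320000},
    {"address": "14 EXAMPLE ROAD", "postcode": "SE15 3AB", "price": 450000},
    {"address": "3 Sample Street", "postcode": "M14 5XY", "price": 180000},
    {"address": "99 Nowhere Lane", "postcode": "M14 5XY", "price": 200000},
]

result = price_per_sqm_by_area(epc_rows, sale_rows)
print(result)
```

I've used the median rather than the mean here so that a handful of mansions don't skew the figure for an area, but that is a choice, and the same join would support either.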

I don't know whether anyone in government is working on this. As far as I can see they're not, and that wouldn't surprise me as the key department (Communities and Local Government) is these days shedding statisticians and generally doing less analytical work.

But it should be possible for academics and laypeople to analyse the data in this way. The problem is that the government has decided that EPC data should only be available in bulk to certain organisations and only if they are prepared to stump up the money for it. The costs range from 1p to 10p per record depending on how much detail you want, but in any case this quickly mounts up if you want any kind of comprehensive database at local or regional level. At those rates, the 7 million or so records on the register would cost somewhere between £70,000 and £700,000.

The government says these prices are to cover the costs of disseminating the data. Maybe that's fair enough and maybe it isn't, but it does mean that the kind of useful analysis I've described above can't be performed by anyone outside central government. So if CLG is determined to ration access to the EPC data by price, I think it should really be doing its own analysis and making the most of this data on our behalf.


  1. A related problem is that the EPC data seems to be all stored as PDFs rather than useful text (particularly bad as there's so much boilerplate copy and imagery in the documents, bloating a few dozen KB of information to close to 200 KB). This has the effect of making the data both expensive to store and deliver and almost totally useless for the kind of analysis you suggest might be useful (or at least the task would require significant and expensive processing resources and be error-prone).

    I hope that they might be keeping the data in raw form somewhere and that one day someone with a clue might liberate it, but experience tells me that they're probably not.


  2. Hey Tom,

    I've been assuming that the 'bulk data' service I mentioned in the 2nd from last paragraph consists of tabular data, as opposed to just PDFs. You're absolutely right, in PDF form they would be completely unusable for any kind of aggregate analysis.

  3. Hi Tom,

    The data is stored in tabular form on the servers of the accrediting bodies who own the software that produces the EPCs, but I'm not sure if this data is collated centrally by Landmark, or if they just have the PDFs.

    I have been trying to find out what percentage of EPCs fall into each of the A to G bands, which you would think was basic information, but no one seems to be able to provide it, even the accrediting bodies who must have the data available.

    I run a national EPC company and have used my contacts in several of the accrediting organisations to try to source this info. They were unable to provide it, and passed us on to DECC, who passed us on to OFGEM, who passed us on to Landmark, who told us to ask the accrediting bodies!


