4 A future without a census?
Existing alternatives to the census
51. As already indicated, the Government and other
public sector bodies carry out a number of surveys of areas such
as employment, health, household incomes and young people.[55]
The National Data Strategy recognises the potential value
of such 'administrative data' for research purposes.[56]
The strategy outlines the steps necessary for this data, along
with commercial[57] and
tracking data,[58] to
be useful for research. It states: "The [Economic and Social
Research Council] will investigate the appetite for a collaborative
programme of research involving both private and public sector
organisations in the use of transactions data - information
collected and retained by organisations in the conduct of their
business".[59]
52. We earlier noted the difficulties that may arise
in using data collected for one clear (commercial) purpose to
provide information for other purposes (see paragraph 26). Our
witnesses were sharply divided about the potential for, and the
degree of difficulty of, combining different existing administrative
databases to produce information comparable to that derived from
the census. The importance of getting this right is thrown into
relief when one considers, for example, that, "crudely, 75%
of local authority funding is centrally funded with allocation
based to a significant degree on population".[60]
53. Professor Les Mayhew, Cass Business School, explained
that he was able to carry out surveys that were more timely and
more accurate on a small scale than the census by combining data
sets gathered by local providers of health, local authority and
other services:[61]
There is no dataset that completely covers all
the population or is 100% reliable, so you have to combine them
in some way. We link the population to property registers. We
have a set of rules by which you can confirm or not confirm, based
on whether they are on more than one dataset and other rules,
which I could explain.
Because I think I probably have as much experience
as anybody in using all these datasets, I want to put on record
the fact that you can look at household composition using administrative
data by linking people to their addresses.[62]
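By way of illustration only, the kind of confirmation rule Professor Mayhew describes might be sketched as follows. The evidence does not set out his actual rules, so the rule used here (a person is treated as confirmed at an address when at least two datasets place them there), the function name `confirm_residents` and the sample records are all assumptions made for the example.

```python
from collections import defaultdict

def confirm_residents(datasets, min_sources=2):
    """Group people by address, keeping only person/address pairs
    that appear in at least `min_sources` datasets (hypothetical rule)."""
    # (person_id, address) -> set of dataset names that contain the pair
    sightings = defaultdict(set)
    for name, records in datasets.items():
        for person_id, address in records:
            sightings[(person_id, address)].add(name)

    households = defaultdict(list)
    for (person_id, address), sources in sightings.items():
        if len(sources) >= min_sources:
            households[address].append(person_id)
    return {address: sorted(people) for address, people in households.items()}

# Invented sample data: three local administrative sources.
datasets = {
    "gp_register": [("p1", "1 High St"), ("p2", "1 High St"), ("p3", "2 Low Rd")],
    "council_tax": [("p1", "1 High St"), ("p3", "9 Old Ln")],
    "school_roll": [("p2", "1 High St")],
}
households = confirm_residents(datasets)
```

In this invented data, p1 and p2 are each placed at the same address by two sources and are retained, while p3 appears at two different addresses in single sources and is left unconfirmed. A real scheme would need further rules for such conflicts.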
54. He explained that:
The data base produced may be contrasted with
information available from the Census
- The results are timelier than the Census
which can take several years to process before being released
and be more than 12 years out of date before it is refreshed.
Based on the 2011 Census basic population aggregates will not
be released before July 2012 and more detailed data will not follow
in some cases until much later.
- On Census day this year we obtained and
processed snapshots of local administrative data covering the
six Olympic Boroughs in London who commissioned the work. Fully
geo-referenced databases down to household level by age and sex
were completed and handed over to local authorities inside six
months.
- The databases contain much greater granularity
than is possible using Census data. Each database is able to produce
statistics for any size or shape of geographical area or administrative
unit and unlike the Census is not constrained by any pre-determined
geographical boundaries.[63]
55. While acknowledging the existence and importance
of alternative sources of data,[64]
other witnesses raised a number of concerns about the robustness
of such administrative data, and the ease of comparison of data
from different sources. For example, Professor David Blane commented:
civil servants always laugh and say, 'You have
seen nothing until you have faced the problems in administrative
data.' A real ace like Professor Mayhew, who is motivated and
skilled, can cut through a lot of problems, but I worry that most
things are going to be done by people doing routine work. They
are not going to be as motivated and skilled as Professor Mayhew.
The potential for introducing error into the data is enormous.[65]
56. The British Library was particularly concerned
that the loss of the census would lead to the fragmentation of
information about the UK population:
incomplete administrative and sample-based survey
data cannot:
substitute for complete and impartial census data;
provide micro-level neighbourhood data required by
local authorities and third sector organisations; and,
support analysis of long-term trends due to methodological
changes.[66]
57. Dr Eldin Fahmy, from the School for Policy Studies
in the University of Bristol, believed that the decision to axe
the decennial population census after 2011 would "substantially
undermine the capacity of UK social science to analyse and understand
social processes at a small scale".[67]
TWRI Policy and Research suggested that, whilst no census is perfect,
"the absence of a 2021 census will be extremely detrimental
to the understanding of changes in society in the decade 2011
to 2021".[68]
58. Moreover, other sources of data have restrictions
on their use or general availability: we were told, for example,
that the Labour Force Survey Annual Local Area Data Series and
the Unitary Authority/Local Authority (UA/LA series) had been
withdrawn on instruction from the Office for National Statistics (ONS),
due to confidentiality issues.[69]
The Joseph Rowntree Foundation explained that, although these
databases had always been anonymised when released for analysis
by outside bodies to ensure that users could not identify any
respondent with the information given, advances in technology
and software had made it easier to link survey records to either
other survey files or other administrative or commercial databases.
The ONS had therefore concluded: "Although the risk for most
respondents is very small, there remains a risk of identification
for people with unusual combinations of personal circumstances.
Thus the release outside the central government statistical services
of social survey databases with small area identifiers, alongside
a national database with detailed coding, has now been ceased".[70]
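The risk the ONS describes, that people with unusual combinations of circumstances can be identified, can be illustrated with a simple count of how many records share each combination of attributes. This is a sketch under stated assumptions, not the ONS's disclosure-control method: the function name `risky_records`, the threshold `k` and the sample records are hypothetical.

```python
from collections import Counter

def risky_records(records, keys, k=2):
    """Return records whose combination of `keys` is shared by
    fewer than `k` records (hypothetical threshold)."""
    def combo(record):
        return tuple(record[key] for key in keys)

    counts = Counter(combo(record) for record in records)
    return [record for record in records if counts[combo(record)] < k]

# Invented sample data with a small-area identifier and two attributes.
records = [
    {"id": 1, "area": "A1", "age_band": "30-39", "occupation": "teacher"},
    {"id": 2, "area": "A1", "age_band": "30-39", "occupation": "teacher"},
    {"id": 3, "area": "A1", "age_band": "80-89", "occupation": "pilot"},
]
flagged = risky_records(records, ["area", "age_band", "occupation"])
```

Here only the third record is flagged, because no other record in the same small area shares its age band and occupation. The finer the area identifier and the coding, the more records fall below any given threshold.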
59. Administrative
data is often collected without consideration of potential wider
application and use, and so is often fit for only a single purpose.
There is reluctance on the part of research scientists and government
social scientists to utilise it for other purposes due to the
difficulties in its reuse. The existing National Data Strategy
should provide good practice and guidance on expanding the number
of uses and the longevity of data collected at public expense.
60. We also
recommend that the ONS seek to remove bureaucratic burdens currently
hindering the broader use of data. Too often the Data Protection
Act is used as an excuse for not reusing data collected at considerable
cost to the public purse. We consider it entirely possible that
data could be collected in a way to facilitate better public administration
that would not contravene the principles of the Data Protection
Act. We would like the Government to indicate how it plans to
more broadly use data from sources, such as the Labour Force Survey,
as part of their response to this report.
61. The Local Government Association indicated that
they would like to see better and more available data in a number
of areas:
Health data on the ageing population and disability
The availability of data on income and taxation
International migrants.[71]
This would suggest that currently neither the census
nor other administrative datasets are adequate in these areas.
62. Even while arguing that expanding the system
of utilising local administrative data adopted by his company
to cover the country would cost roughly one tenth or less of the
present cost of the Census, enable results to be available within
six months and allow the exercise to be repeated at more regular intervals,
Professor Mayhew acknowledged: "There is always scope to
improve, and a population register would probably enable further
improvements"; and "there is a demand out there from
the policy community which is not quite the same as the requirements
of the academic community".[72]
63. We consider
Professor Mayhew's evidence as confirmation that there is a credible
alternative to the census for the purposes of local government.
However, we note that local government are not the only users
of census data, and, because of their ad hoc nature, Professor
Mayhew's surveys would not substitute for the census in terms
of being able to derive a snapshot of the whole nation at one
time, with very widespread coverage (because of the mandatory
nature of the census process) and the ability to make direct comparisons
over time. The academic community would clearly lose more than
the public sector by the ending of the census.
64. Furthermore,
we are concerned that there would need to be a level of expertise
not currently widely available amongst organisations collecting
data in order to achieve results comparable with those obtained
from census data. We recommend that the Government use the time
until the next census is due to ensure administrative data is
better able to supplement or replace census data. This will require
a considerable investment, and possibly the production of a list
of approved providers for local authorities, health bodies, etc,
to ensure that the data produced is both robust and comparable
across authority boundaries and devolved administrations across
the whole of the UK.
65. Francis
Maude, the Minister for the Cabinet Office and Paymaster General,
wrote to us that "while cost is a driver, the real issue
is ensuring that the best possible approach is taken". We
are not persuaded that local and frequent surveys could provide
an adequate substitute for census data despite the potential advantage
of providing more up-to-date information unless they were designed
and implemented to a high standard. We are therefore not convinced
that the use of administrative data would be a cheaper
option over a ten year census cycle.
66. However,
if standards could be set to facilitate integration with administrative
sources, we consider it possible that obligations could be imposed
on privatised utilities to produce and provide government with
access to useful social data.
Concerns about the future availability and reliability of administrative sources
67. The Royal Statistical Society was concerned that
exclusive reliance on administrative data would create a further problem:
the overall data infrastructure would become
more dependent on the policies of a range of government departments
and organizations whose primary objective is not data collection.
From a social science perspective this creates a risk that data
series may change in unplanned ways and that comparable datasets
through time may be difficult to achieve.[73]
68. In a future without a census there would be a
need to ensure that social data was not compromised by policy
decisions taken at local or national levels. The Royal Statistical
Society suggested to us that government departments supplying
information to the statistical system replacing the census should
be required to consult with the National Statistician before implementing
major changes to their data collections.[74]
We think this suggestion has considerable merit. We
recommend that the ONS, if they decide to discontinue the census,
should consider how administrative data might be collected over
a sustained period without falling hostage to political considerations.
69. In this
context, there is a particular problem in relation to ensuring
the robustness of longitudinal studies by providing a benchmark
against which the representative nature of the surviving cohort
may be measured. Soundly-based longitudinal studies are a particular
strength of the UK at present, and are vital in particular in
relation to research into health and educational outcomes. We
expect the ONS to pay particular attention to ensuring that any
alternatives to the census enable the continuance of such studies.
70. Professor Ceri Peach warned that using a multitude
of unplanned, diverse surveys and studies to replace a monolithic
census where surveying methods are standardised and changes planned
in advance posed the risk that slight changes in sampling methods
could lead to unnoticed influences on the data outputs.[75]
71. Academic witnesses stated that such a risk was
mitigated by the existence of a central reference point.[76]
This is currently provided by the census which, by its simple
existence, influences the data collected in surveys and how that
data is categorised. Many other surveys and studies employ terminology
and definitions used in the census due to the obvious benefit
of being more compatible with the data in the census. In the absence
of the census, disparate surveys could become increasingly incompatible.[77]
Professor Martin of the Royal Statistical Society told us that
it would be "exceedingly difficult to mandate organisations
of different sizes and shapes, with different biases inherent
in their populations, to produce something that you knew was using
the same methodology in every place".[78]
72. We asked whether the ESRC should set standards
by which data collected by ESRC-funded research should be reported,
but the academics we questioned rejected this.[79]
However, the National Data
Strategy exists and we judge that this strategy would provide
a vehicle through which greater coherence of data collection,
both administrative and research, could be achieved in future.
73. Concerns were also raised that because the census
is used as a standard against which academic studies are compared
to test the accuracy of the sampling methods used,[80]
its loss might render a number of other studies less useful. The
Joseph Rowntree Foundation suggested:
The main problem however with [the new UK Household
Longitudinal Study, Understanding Society (USoc)] as an alternative
to the census, is that it's longitudinal, so over time because
of wave on wave attrition (where the number of respondents decreases
over time due to loss to follow up, emigrating or death) it becomes
less representative of the population. Although it is large and
robust, when you get down to small area statistics which census
produces, then USoc is not appropriate.[81]
74. When looking at a central reference the key consideration
was held to be something that reliably and consistently linked
individuals to addresses.[82]
Professor Mayhew told us that address databases did exist but
that to make the whole process more efficient "the first
thing that could be done is to link all administrative records
to [a Unique Property Reference Number], and that would make a
huge difference to the quality of the data and the processing
of the data in future".[83]
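The linkage Professor Mayhew proposes can be sketched, purely for illustration, as attaching a property reference to each administrative record by matching on a normalised address. The gazetteer contents, the reference numbers, the field names and the simple normalisation rule below are all assumptions for the example; real address matching is considerably harder than an exact lookup.

```python
def normalise(address):
    """Upper-case the address, drop commas and collapse whitespace."""
    return " ".join(address.upper().replace(",", " ").split())

def attach_uprn(records, gazetteer):
    """Split records into those matched to a property reference and
    those left unmatched. `gazetteer` maps reference -> address."""
    lookup = {normalise(address): uprn for uprn, address in gazetteer.items()}
    linked, unmatched = [], []
    for record in records:
        uprn = lookup.get(normalise(record["address"]))
        if uprn is None:
            unmatched.append(record)
        else:
            linked.append({**record, "uprn": uprn})
    return linked, unmatched

# Invented gazetteer and records; the reference numbers are made up.
gazetteer = {
    "100000000001": "1 High Street, Anytown",
    "100000000002": "2 High Street, Anytown",
}
records = [
    {"source": "gp_register", "address": "1 high street  anytown"},
    {"source": "council_tax", "address": "3 High Street, Anytown"},
]
linked, unmatched = attach_uprn(records, gazetteer)
```

Once every source carries the same property reference, records from different datasets can be joined on that key rather than on free-text addresses. The unmatched list shows where manual review or better address data would still be needed.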
75. Problems were raised with respect to the
use and combination of administrative data. Professor Blane was
worried about the sensitivities of the British public to
cross-referencing government-collected data:
I am aware that within the civil service there
are problems of linking data. The example I know about is with
the ONS Longitudinal Study where, for about 20 years, there has
been talk of linking in people's benefit records from the Department
for Work and Pensions. Some years it is on and some years it is
off. There is a big problem about the Data Protection Act and
whether a civil service department will release data to another
civil service department because of the implications under the
Data Protection Act. It could be that the culture in Britain is
different from that in Scandinavia - that this relatively legitimised
linkage in Scandinavia is foreign to the culture of Britain and
that it would not work.[84]
76. Professor Joshi, President of the Society for
Lifecourse and Longitudinal Studies, raised what might be a more
pertinent point: that while addresses may be unique, "people
may not be uniquely associated with addresses".[85]
Any central database would have to ensure that it caught those
people, like "visitors, second homes and children moving
between parents", if it was to be as useful as the census.[86]
Several current sources of data, such as the Child Benefit database,[87]
the Local Land and Property Gazetteer[88]
and the General Practice Research Database,[89]
were mentioned as useful during our oral evidence sessions but
none of them had all of the key features of the census.
77. There is
a danger that, if the census is not repeated, there will be no
equivalent large-scale collection of trusted data that can be
used to correct smaller surveys. We are convinced of the need
to have a national reference point that other datasets might use
as a benchmark for their own parameters. We recommend that the
ONS consider how this might be achieved in the absence of a census;
it appears to us that making an existing dataset better would
be more advantageous than adding a new one.
78. We were also concerned to establish the degree
to which government departments will be able to continue to collect
data on the scale which they have done recently. As our predecessor
Committee found,[90]
it is difficult to track research and development (R&D) spend
by government departments accurately over time using published
statistics in the R&D scoreboard. Nor do the statistics readily
show what proportion of R&D is social science related. There
have been concerns, however, that government research and development
will see a significant cutback given the stringent budgetary cuts
faced by government departments. In 2007, Lord Sainsbury's review
of research spending in government[91]
recommended better identification of and protection for departmental
R&D budgets but the Campaign for Science and Engineering in
the UK considers that, while the Government accepted this recommendation,
there has been little progress.[92]
The joint heads of the Government Social Research Service (Jenny
Dibden and Richard Bartholomew) told us that they did not believe
that social science was being unduly affected by the budget cuts
within departments:
Heads of Analysis undertook an exercise, which
Sir John Beddington has spoken to in the House of Lords Science
and Technology Committee, that looked at how Departments were
dealing with the spending review settlement specifically in relation
to analysts and what they spent on the research budget. The conclusion
that we reached on the early returns was that analysis was not
being disproportionately affected, which was important. There
were indications that in some Departments spending would be preserved,
or potentially increased, and in other Departments there would
be reductions.[93]
79. The Government
spends significant sums on R&D, though it is not clear what
percentage of this relates to social science. However, we accept
the assurance of the joint heads of the Government Social Research
Service that spending is, in the main, being protected even in
this difficult economic climate. That same climate makes it even
more vital that the Government ensures such expenditure, firmly
based on evidence, achieves the maximum benefit possible.
80. We also have concerns about the future availability
of data to the third sector, especially volunteer and community-based
organisations with limited resources. Any move away from the provision
of data free of charge under the Neighbourhood Statistics service
would clearly have a detrimental effect on their ability to continue
to provide services needed locally. We
are not convinced of the value of government collecting data simply
because it has happened in the past and we consider that the responsibility
to maintain particular datasets should rest with those bodies
most interested in the dataset. However, we recognise the difficulties
that local charities and support groups may have in accessing
information if the census is discontinued. We regard it as essential
that the Government recognise these needs and confirm that appropriate
steps would be taken to ensure these groups have continued
free access to whatever alternative data is gathered and shared
by public bodies in order to avoid detriment to the valuable local
services provided by them. We anticipate a central data repository
from which all publicly funded social data, not subject to legal
or commercial restrictions, would be made available.
Future developments
81. The majority of submissions to the inquiry were
more concerned about the loss of the census as a source of social
data than with anticipating new data sources. Despite this, various
resources were proffered as potential sources of census-style
information, such as a Swedish-style population register,[94]
expanding the data associated with the National Insurance number
(for example, associating the address or information on characteristics
of the dwelling where an individual is living),[95]
and sharing government administrative databases more widely.[96]
None of these options was seen as a ready replacement and some
effort would be necessary to make them easier to use, more comprehensive
or more tightly associated with individuals over time.
82. Professor Blane told us that the collection of
data really depended on the principle of informed consent. In
Sweden, every citizen has a unique ID that appears on every official
document relating to that citizen, making cross-referencing of
information very easy. This is balanced by the fact that the use
of the data is controlled through an ethics committee and the
system is subject to renewal through a referendum every ten years
to ensure that the state still has a mandate for that level of
data co-ordination by government.[97]
Our witnesses doubted that, in a larger and more heterogeneous
society such as the UK, this degree of "information swapping"
would be as acceptable, despite the ready access to computing
power making the sharing of even the largest datasets more feasible.[98]
In this context, we note the difficult data protection issues
already being raised. Therefore,
although we do not rule out the development of new sources of
data in the future, we consider it would be wrong to discontinue
the census simply hoping that new developments will provide a
solution to the gaps caused by the loss of census data. ONS must
be sure that the tools used to collect data will be adequate.
One key concern is that we have not identified any dataset that
will really enable social scientists/historians to follow individuals
over time. Most public sector data-gathering is focused on the
size of specified groups, rather than details of individuals,
and private sector databases (such as those for loyalty cards)
cover only parts of the population and are of little relevance
for many of the economically excluded and the poorer sections
of society.
83. We are convinced
that the social science benefits of the census are valuable and
that they outweigh the financial costs. However, we are also convinced
that there remain significant benefits to be gained in terms of
improving the consistency, currency and availability of administrative
data to government planners. Although we put forward these conclusions
to assist in the ONS's 'Beyond 2011' project, we consider it essential
that the Government not only retain access to the breadth and
quality of data it collects but also seek to improve its currency
and consistency.
84. The regular
conduct of a census in the UK has provided Government and social
scientists with an almost unique dataset with which to examine
the changing nature of UK society over the past 200 years. We
consider that good evidence-based social policy is founded on
such data and that the Government needs to ensure future access
to high quality social data.
55 Ev 52
56 "UK Strategy for Data Resources for Social and Economic Research", UK Data Forum, 2011, www.esrc.ac.uk/funding-and-guidance/tools-and-resources/research-resources/data-services/NDS/index.aspx
57 Data generated by organisations that operate on a 'for-profit' basis, for example, loyalty cards.
58 Data generated by watching traffic, real or virtual, for example CCTV or visitors to a web page.
59 "UK Strategy for Data Resources for Social and Economic Research", section 6.4, UK Data Forum, 2011, www.esrc.ac.uk/funding-and-guidance/tools-and-resources/research-resources/data-services/NDS/index.aspx
60 Ev 58
61 Ev 57
62 Q 14
63 Ev 57
64 For example, Ev 42
65 Q 18 [Professor Blane]
66 Ev w37
67 Ev w1
68 Ev w15
69 Ev 55
70 Ev 55-56
71 Ev 60
72 Ev 58 and Q 14
73 Ev 42
74 Ibid.
75 Ev w3
76 Q 18
77 Q 112
78 Q 57
79 Q 42
80 For example, Ev w28
81 Ev 55
82 Q 29
83 Q 30 [Professor Mayhew]
84 Q 30 [Professor Blane]
85 Q 34
86 Q 34
87 Q 39
88 Q 29
89 Q 18
90 Science and Technology Committee, Sixth Report of Session 2009-10, The impact of spending cuts on science and scientific research, HC 335-I
91 HM Treasury, October 2007, The Race to the Top: A Review of Government's Science and Innovation Policies. Recommendation 8.4
92 "Government departmental R&D spending", Campaign for Science and Engineering in the UK, http://sciencecampaign.org.uk/?p=7144
93 Q 120 [Jenny Dibden]
94 Ev w16
95 Ev w12
96 For example, Ev 35
97 Q 3
98 For example, Q 30 [Professor Blane] and Q 133 [Richard Bartholomew]