The Census and social science - Science and Technology Committee


4  A future without a census?

Existing alternatives to the census

51. As already indicated, the Government and other public sector bodies carry out a number of surveys of areas such as employment, health, household incomes and young people.[55] The National Data Strategy recognises the potential value of such 'administrative data' for research purposes.[56] The strategy outlines the steps necessary for this data, along with commercial[57] and tracking data,[58] to be useful for research. It states: "The [Economic and Social Research Council] will investigate the appetite for a collaborative programme of research involving both private and public sector organisations in the use of transactions data—information collected and retained by organisations in the conduct of their business".[59]

52. We earlier noted the difficulties that may arise in using data collected for one clear (commercial) purpose to provide information for other purposes (see paragraph 26). Our witnesses were sharply divided about the potential for, and the degree of difficulty of, combining different existing administrative databases to produce information comparable to that derived from the census. The importance of getting this right is thrown into relief when one considers, for example, that, "crudely, 75% of local authority funding is centrally funded with allocation based to a significant degree on population".[60]

53. Professor Les Mayhew, Cass Business School, explained that he was able to carry out surveys that were more timely and more accurate on a small scale than the census by combining data sets gathered by local providers of health, local authority and other services:[61]

    There is no dataset that completely covers all the population or is 100% reliable, so you have to combine them in some way. We link the population to property registers. We have a set of rules by which you can confirm or not confirm, based on whether they are on more than one dataset and other rules, which I could explain.

    Because I think I probably have as much experience as anybody in using all these datasets, I want to put on record the fact that you can look at household composition using administrative data by linking people to their addresses.[62]
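The kind of confirmation rule Professor Mayhew describes can be illustrated with a minimal sketch. It is not his actual method: the dataset names, person keys, property references and the "seen in at least two sources" threshold are all hypothetical, chosen only to show how people might be confirmed at an address and then grouped into households.

```python
from collections import defaultdict

# Hypothetical extracts: each source lists (person key, property reference) pairs.
SOURCES = {
    "gp_register": [("p1", "A1"), ("p2", "A1"), ("p3", "A2")],
    "council_tax": [("p1", "A1"), ("p3", "A2"), ("p4", "A3")],
    "school_roll": [("p2", "A1"), ("p4", "A9")],
}

# Hypothetical property register: the set of known property references.
PROPERTY_REGISTER = {"A1", "A2", "A3"}


def confirm_residents(sources, property_register, min_sources=2):
    """Confirm a person at an address only if enough sources agree.

    The threshold is an assumed rule; the evidence says only that
    confirmation depends on presence in more than one dataset
    "and other rules".
    """
    seen = defaultdict(set)
    for source_name, rows in sources.items():
        for person, address in rows:
            # Ignore addresses that are not on the property register.
            if address in property_register:
                seen[(person, address)].add(source_name)

    households = defaultdict(set)
    for (person, address), names in seen.items():
        if len(names) >= min_sources:
            households[address].add(person)
    # Sorted lists make the household composition easy to read and compare.
    return {address: sorted(people) for address, people in households.items()}


households = confirm_residents(SOURCES, PROPERTY_REGISTER)
```

In this sketch a person recorded at an address by only one source, or at an address missing from the property register, is left unconfirmed rather than guessed at, which reflects the point that no single dataset is complete or fully reliable.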

54. He explained that:

    The data base produced may be contrasted with information available from the Census

    -  The results are timelier than the Census which can take several years to process before being released and be more than 12 years out of date before it is refreshed. Based on the 2011 Census basic population aggregates will not be released before July 2012 and more detailed data will not follow in some cases until much later.

    -  On Census day this year we obtained and processed snapshots of local administrative data covering the six Olympic Boroughs in London who commissioned the work. Fully geo-referenced databases down to household level by age and sex were completed and handed over to local authorities inside six months.

    -  The databases contain much greater granularity than is possible using Census data. Each database is able to produce statistics for any size or shape of geographical area or administrative unit and unlike the Census is not constrained by any pre-determined geographical boundaries.[63]

55. While acknowledging the existence and importance of alternative sources of data,[64] other witnesses raised a number of concerns about the robustness of such administrative data, and the ease of comparison of data from different sources. For example, Professor David Blane commented:

    civil servants always laugh and say, 'You have seen nothing until you have faced the problems in administrative data.' A real ace like Professor Mayhew, who is motivated and skilled, can cut through a lot of problems, but I worry that most things are going to be done by people doing routine work. They are not going to be as motivated and skilled as Professor Mayhew. The potential for introducing error into the data is enormous.[65]

56. The British Library was particularly concerned that the loss of the census would lead to the fragmentation of information about the UK population:

    incomplete administrative and sample-based survey data cannot:

    -  substitute for complete and impartial census data;

    -  provide micro-level neighbourhood data required by local authorities and third sector organisations; and,

    -  support analysis of long-term trends due to methodological changes.[66]

57. Dr Eldin Fahmy, from the School for Policy Studies in the University of Bristol, believed that the decision to axe the decennial population census after 2011 would "substantially undermine the capacity of UK social science to analyse and understand social processes at a small scale".[67] TWRI Policy and Research suggested that, whilst no census is perfect, "the absence of a 2021 census will be extremely detrimental to the understanding of changes in society in the decade 2011 to 2021".[68]

58. Moreover, other sources of data have restrictions on their use or general availability: we were told, for example, that the Labour Force Survey Annual Local Area Data Series and the Unitary Authority/Local Authority (UA/LA) series had been withdrawn on instruction from the Office for National Statistics (ONS), due to confidentiality issues.[69] The Joseph Rowntree Foundation explained that these databases had always been anonymised when released for analysis by outside bodies, to ensure that users could not identify any respondent with the information given. However, advances in technology and software had made it easier to link survey records to other survey files or to other administrative or commercial databases. The ONS had therefore concluded: "Although the risk for most respondents is very small, there remains a risk of identification for people with unusual combinations of personal circumstances. Thus the release outside the central government statistical services of social survey databases with small area identifiers, alongside a national database with detailed coding, has now been ceased".[70]

59. Administrative data is often collected without consideration of its potential wider application and use, and is therefore often fit for only a single purpose. Research scientists and government social scientists are reluctant to use it for other purposes because of the difficulties of reuse. The existing National Data Strategy should provide good practice and guidance on expanding the number of uses and the longevity of data collected at public expense.

60. We also recommend that the ONS seek to remove bureaucratic burdens currently hindering the broader use of data. Too often the Data Protection Act is used as an excuse for not reusing data collected at considerable cost to the public purse. We consider it entirely possible that data could be collected in a way that facilitates better public administration without contravening the principles of the Data Protection Act. We would like the Government to indicate, as part of its response to this report, how it plans to make broader use of data from sources such as the Labour Force Survey.

61. The Local Government Association indicated that they would like to see better and more available data in a number of areas:

    Health data on the ageing population and disability

    The availability of data on income and taxation

    Second jobs

    Broadband take-up

    International migrants.[71]

This would suggest that currently neither the census nor other administrative datasets are adequate in these areas.

62. Professor Mayhew argued that extending the system of using local administrative data adopted by his company to cover the whole country would cost roughly one tenth or less of the present cost of the Census, would enable results to be available within six months, and would allow the exercise to be repeated at more regular intervals. Even so, he acknowledged: "There is always scope to improve, and a population register would probably enable further improvements"; and "there is a demand out there from the policy community which is not quite the same as the requirements of the academic community".[72]

63. We consider Professor Mayhew's evidence as confirmation that there is a credible alternative to the census for the purposes of local government. However, we note that local government are not the only users of census data, and—because of their ad hoc nature—Professor Mayhew's surveys would not substitute for the census in terms of being able to derive a snapshot of the whole nation at one time, with very widespread coverage (because of the mandatory nature of the census process) and the ability to make direct comparisons over time. The academic community would clearly lose more than the public sector by the ending of the census.

64. Furthermore, we are concerned that there would need to be a level of expertise not currently widely available amongst organisations collecting data in order to achieve results comparable with those obtained from census data. We recommend that the Government use the time until the next census is due to ensure administrative data is better able to supplement or replace census data. This will require a considerable investment, and possibly the production of a list of approved providers for local authorities, health bodies, etc, to ensure that the data produced is both robust and comparable across authority boundaries and devolved administrations across the whole of the UK.

65. Francis Maude, the Minister for the Cabinet Office and Paymaster General, wrote to us that "while cost is a driver, the real issue is ensuring that the best possible approach is taken". We are not persuaded that local and frequent surveys, despite the potential advantage of providing more up-to-date information, could provide an adequate substitute for census data unless they were designed and implemented to a high standard. We are therefore not convinced that the use of administrative data would be a cheaper option over a ten year census cycle.

66. However, if standards could be set to facilitate integration with administrative sources, we consider it possible that obligations could be imposed on privatised utilities to produce and provide government with access to useful social data.

Concerns about the future availability and reliability of administrative sources

67. The Royal Statistical Society was concerned that exclusive reliance on administrative data would raise a further problem:

    the overall data infrastructure would become more dependent on the policies of a range of government departments and organizations whose primary objective is not data collection. From a social science perspective this creates a risk that data series may change in unplanned ways and that comparable datasets through time may be difficult to achieve.[73]

68. In a future without a census there would be a need to ensure that social data was not compromised by policy decisions taken at local or national levels. The Royal Statistical Society suggested to us that government departments supplying information to the statistical system replacing the census should be required to consult with the National Statistician before implementing major changes to their data collections.[74] We think this suggestion has considerable merit. We recommend that the ONS, if they decide to discontinue the census, should consider how administrative data might be collected over a sustained period without falling hostage to political considerations.

69. In this context, there is a particular problem in relation to ensuring the robustness of longitudinal studies by providing a benchmark against which the representative nature of the surviving cohort may be measured. Soundly-based longitudinal studies are a particular strength of the UK at present, and are vital in particular in relation to research into health and educational outcomes. We expect the ONS to pay particular attention to ensuring that any alternatives to the census enable the continuance of such studies.

70. Professor Ceri Peach warned that using a multitude of unplanned, diverse surveys and studies to replace a monolithic census where surveying methods are standardised and changes planned in advance posed the risk that slight changes in sampling methods could lead to unnoticed influences on the data outputs.[75]

71. Academic witnesses stated that such a risk was mitigated by the existence of a central reference point.[76] This is currently provided by the census which, by its simple existence, influences the data collected in surveys and how that data is categorised. Many other surveys and studies employ terminology and definitions used in the census due to the obvious benefit of being more compatible with the data in the census. In the absence of the census, disparate surveys could become increasingly incompatible.[77] Professor Martin of the Royal Statistical Society told us that it would be "exceedingly difficult to mandate organisations of different sizes and shapes, with different biases inherent in their populations, to produce something that you knew was using the same methodology in every place".[78]

72. We asked whether the ESRC should set standards by which data collected by ESRC-funded research should be reported, but the academics we questioned rejected this.[79] However, the National Data Strategy exists and we judge that this strategy would provide a vehicle through which greater coherence of data collection, both administrative and research, could be achieved in future.

73. Concerns were also raised that because the census is used as a standard against which academic studies are compared to test the accuracy of the sampling methods used,[80] its loss might render a number of other studies less useful. The Joseph Rowntree Foundation suggested:

    The main problem however with [the new UK Household Longitudinal Study, Understanding Society (USoc)] as an alternative to the census, is that it's longitudinal, so over time because of wave on wave attrition (where the number of respondents decreases over time due to loss to follow up, emigrating or death) it becomes less representative of the population. Although it is large and robust, when you get down to small area statistics which census produces, then USoc is not appropriate.[81]

74. When looking at a central reference the key consideration was held to be something that reliably and consistently linked individuals to addresses.[82] Professor Mayhew told us that address databases did exist but that to make the whole process more efficient "the first thing that could be done is to link all administrative records to [a Unique Property Reference Number], and that would make a huge difference to the quality of the data and the processing of the data in future".[83]

75. There were problems raised with respect to the use and combination of administrative data. Professor Blane was worried about the sensitivities of the British public to cross referencing government collected data:

    I am aware that within the civil service there are problems of linking data. The example I know about is with the ONS Longitudinal Study where, for about 20 years, there has been talk of linking in people's benefit records from the Department for Work and Pensions. Some years it is on and some years it is off. There is a big problem about the Data Protection Act and whether a civil service department will release data to another civil service department because of the implications under the Data Protection Act. It could be that the culture in Britain is different from that in Scandinavia - that this relatively legitimised linkage in Scandinavia is foreign to the culture of Britain and that it would not work.[84]

76. Professor Joshi, President of the Society for Lifecourse and Longitudinal Studies, raised what might be a more pertinent point: that while addresses may be unique, "people may not be uniquely associated with addresses".[85] Any central database would have to ensure that it captured such people, like "visitors, second homes and children moving between parents", if it was to be as useful as the census.[86] Several current sources of data, such as the Child Benefit database,[87] the Local Land and Property Gazetteer[88] and the General Practice Research Database,[89] were mentioned as useful during our oral evidence sessions, but none of them had all of the key features of the census.

77. There is a danger that, if the census is not repeated, there will be no equivalent large-scale collection of trusted data that can be used to correct smaller surveys. We are convinced of the need to have a national reference point that other datasets might use as a benchmark for their own parameters. We recommend that the ONS consider how this might be achieved in the absence of a census; it appears to us that making an existing dataset better would be more advantageous than adding a new one.

78. We were also concerned to establish the degree to which government departments will be able to continue to collect data on the scale which they have done recently. As our predecessor Committee found,[90] it is difficult to track research and development (R&D) spend by government departments accurately over time using published statistics in the R&D scoreboard. Nor do the statistics readily show what proportion of R&D is social science related. There have been concerns, however, that government research and development will see a significant cutback given the stringent budgetary cuts faced by government departments. In 2007, Lord Sainsbury's review of research spending in government[91] recommended better identification of and protection for departmental R&D budgets but the Campaign for Science and Engineering in the UK considers that, while the Government accepted this recommendation, there has been little progress.[92] The joint heads of the Government Social Research Service (Jenny Dibden and Richard Bartholomew) told us that they did not believe that social science was being unduly affected by the budget cuts within departments:

    Heads of Analysis undertook an exercise, which Sir John Beddington has spoken to in the House of Lords Science and Technology Committee, that looked at how Departments were dealing with the spending review settlement specifically in relation to analysts and what they spent on the research budget. The conclusion that we reached on the early returns was that analysis was not being disproportionately affected, which was important. There were indications that in some Departments spending would be preserved, or potentially increased, and in other Departments there would be reductions.[93]

79. The Government spends significant sums on R&D, though it is not clear what percentage of this relates to social science. However, we accept the assurance of the joint heads of the Government Social Research Service that spending is, in the main, being protected even in this difficult economic climate. That same climate makes it even more vital that the Government ensures such expenditure, firmly based on evidence, achieves the maximum benefit possible.

80. We also have concerns about the future availability of data to the third sector, especially volunteer and community-based organisations with limited resources. Any move away from the provision of data free of charge under the Neighbourhood Statistics service would clearly have a detrimental effect on their ability to continue to provide services needed locally. We are not convinced of the value of government collecting data simply because it has happened in the past and we consider that the responsibility to maintain particular datasets should rest with those bodies most interested in the dataset. However, we recognise the difficulties that local charities and support groups may have in accessing information if the census is discontinued. We regard it as essential that the Government recognise these needs and confirm that appropriate steps would be taken to ensure that these groups have continued free access to whatever alternative data is gathered and shared by public bodies, in order to avoid detriment to the valuable local services they provide. We anticipate a central data repository from which all publicly funded social data, not subject to legal or commercial restrictions, would be made available.

Future developments

81. The majority of submissions to the inquiry were more concerned about the loss of the census as a source of social data than with anticipating new data sources. Despite this, various resources were proffered as potential sources of census-style information, such as a Swedish-style population register,[94] expanding the data associated with the National Insurance number (for example, associating the address or information on characteristics of the dwelling where an individual is living),[95] and sharing government administrative databases more widely.[96] None of these options was seen as a ready replacement and some effort would be necessary to make them easier to use, more comprehensive or more tightly associated with individuals over time.

82. Professor Blane told us that the collection of data really depended on the principle of informed consent. In Sweden, every citizen has a unique ID that appears on every official document relating to that citizen, making cross referencing of information very easy. This is balanced by the fact that the use of the data is controlled through an ethics committee and the system is subject to renewal through a referendum every ten years to ensure that the state still has a mandate for that level of data co-ordination by government.[97] Our witnesses doubted that, in a larger and more heterogeneous society such as the UK, this degree of "information swapping" would be as acceptable, despite the ready access to computing power making the sharing of even the largest datasets more feasible.[98] In this context, we note the difficult data protection issues already being raised. Therefore, although we do not rule out the development of new sources of data in the future, we consider it would be wrong to discontinue the census simply hoping that new developments will provide a solution to the gaps caused by the loss of census data. ONS must be sure that the tools used to collect data will be adequate. One key concern is that we have not identified any dataset that will really enable social scientists/historians to follow individuals over time. Most public sector data-gathering is focused on the size of specified groups, rather than details of individuals, and private sector databases (such as those for loyalty cards) cover only parts of the population and are of little relevance for many of the economically excluded and the poorer sections of society.

83. We are convinced that the social science benefits of the census are valuable and that they outweigh the financial costs. However, we are also convinced that there remain significant benefits to be gained in terms of improving the consistency, currency and availability of administrative data to government planners. Although we put forward these conclusions to assist in the ONS's 'Beyond 2011' project, we consider it essential that the Government not only retain access to the breadth and quality of data it collects but also seek to improve its currency and consistency.

84. The regular conduct of a census in the UK has provided Government and social scientists with an almost unique dataset with which to examine the changing nature of UK society over the past 200 years. We consider that good evidence-based social policy is founded on such data and that the Government needs to ensure future access to high quality social data.


55   Ev 52

56   "UK Strategy for Data Resources for Social and Economic Research", UK Data Forum, 2011, www.esrc.ac.uk/funding-and-guidance/tools-and-resources/research-resources/data-services/NDS/index.aspx

57   Data generated by organisations that operate on a 'for-profit' basis, for example, loyalty cards.

58   Data generated by watching traffic, real or virtual, for example CCTV or visitors to a web page.

59   "UK Strategy for Data Resources for Social and Economic Research", section 6.4, UK Data Forum, 2011, www.esrc.ac.uk/funding-and-guidance/tools-and-resources/research-resources/data-services/NDS/index.aspx

60   Ev 58

61   Ev 57

62   Q 14

63   Ev 57

64   For example, Ev 42

65   Q 18 [Professor Blane]

66   Ev w37

67   Ev w1

68   Ev w15

69   Ev 55

70   Ev 55-56

71   Ev 60

72   Ev 58 and Q 14

73   Ev 42

74   Ibid.

75   Ev w3

76   Q 18

77   Q 112

78   Q 57

79   Q 42

80   For example, Ev w28

81   Ev 55

82   Q 29

83   Q 30 [Professor Mayhew]

84   Q 30 [Professor Blane]

85   Q 34

86   Q 34

87   Q 39

88   Q 29

89   Q 18

90   Science and Technology Committee, Sixth Report of Session 2009-10, The impact of spending cuts on science and scientific research, HC 335-I

91   HM Treasury, October 2007, The Race to the Top: A Review of Government's Science and Innovation Policies. Recommendation 8.4

92   "Government departmental R&D spending", Campaign for Science and Engineering in the UK, http://sciencecampaign.org.uk/?p=7144

93   Q 120 [Jenny Dibden]

94   Ev w16

95   Ev w12

96   For example, Ev 35

97   Q 3

98   For example, Q 30 [Professor Blane] and Q 133 [Richard Bartholomew]


 


© Parliamentary copyright 2012
Prepared 21 September 2012