Variation within households in consent to link survey data to administrative records: Evidence from the UK Millennium Cohort Study

Tarek Mostafa

This article was first published in the NCRM newsletter.

Longitudinal surveys face significant challenges from rising survey costs, attrition over time, and non-coverage of the target population. A promising solution to some of these problems is linking survey data to administrative data. Such linkage leads to shorter interviews, less respondent burden and an overall reduction in costs (1), and it also yields valuable additional information on respondents. However, access to administrative data is restricted by consent. Non-consent occurs when respondents refuse permission to link their administrative records to their survey data. It results in smaller samples, and possibly in sample bias if the likelihood of consent is related to the characteristics of respondents.

This study aims to advance our knowledge about consent by analysing adult respondents’ behaviour when consenting to link their own administrative records, in contrast to their behaviour when consenting to link someone else’s records (i.e. those of the cohort member in the Millennium Cohort Study). These variations in consent behaviour have not been explored in the past: previous studies focused on respondents consenting to link their own records but not those of other members of their household (2, 3). The paper uses data from the UK Millennium Cohort Study (MCS) and focuses on consent to link the cohort member’s health and educational records and the main respondent’s health and economic records (all consents were sought in wave 4 of MCS). The study attempts to answer the following research questions:

Do respondents behave differently when consenting to link their own administrative records in comparison to consenting to link those of their children?

Does respondents’ consent behaviour vary according to the domain of consent, e.g. health, economic, education records?

What is the impact of interviewers on consent outcomes and can interviewer effects be separated from the impact of an interviewer’s geographical assignment?

In summary, the findings show that main respondents (MRs) behave differently when consenting to link their own records and when consenting on behalf of the cohort members (CMs). For instance, parents of children with high cognitive skills are more likely to consent to linking their children’s educational records. In contrast, the child’s cognitive skills do not affect the parents’ likelihood of consenting to link their own health and economic records. Moreover, being a private person has a more significant effect on the MR’s outcomes than on those of the CM. When it comes to loyalty to the survey, respondents who have missed a wave in the past are found to be less likely to consent irrespective of the outcome. Only partial evidence was found in support of an impact of the respondent’s past relationship with the agency holding the administrative data. Among the sociodemographic characteristics of respondents, ethnicity was found to have the strongest impact irrespective of the outcome: non-white respondents are less likely to consent. The cross-equation correlations measured through the multivariate probit models showed that the highest level of association is between outcomes sought for the same respondent (i.e. MRs consenting to link their own records vs. MRs consenting to link the CM’s records). When interviewer effects were included through the use of fixed effects models, the explanatory power of the models increased three- to fourfold. This indicates that interviewers’ characteristics and behaviour have a large effect on consent.

In terms of fieldwork practices, the findings suggest that it is possible to identify the respondents who are less likely to consent (ethnic minorities, respondents with higher privacy concerns, and respondents who have dropped out of the survey in the past). Interviewers have a strong impact on consent. Therefore, where consent rates are low, matching interviewers to respondents and allocating interviewers with more survey experience to difficult cases might improve consent rates. The findings also indicate that the linked administrative data are likely to suffer from sample composition bias due to non-consent. This is of particular interest to MCS data users. For instance, the linked MCS and educational records are likely to lose children with lower cognitive skills. Similarly, the large and significant impact of ethnicity means that samples are likely to lose non-white minorities. Since ethnicity is highly correlated with educational, health and economic outcomes, the data contained in the linked administrative records will be affected by non-consent. However, the total level of bias depends both on non-consent and on the extent of non-linkage (the failure to link data even if consent was given), which might alleviate or exacerbate the initial non-consent bias.
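
The sample composition bias described above can be illustrated with a small simulation. The sketch below is purely illustrative and does not use MCS data: the standardised cognitive score, the baseline consent probability of 0.7 and the slope of 0.1 are assumed values chosen for the example, not estimates from the study.

```python
import random


def simulate_linked_sample(n=20000, seed=0):
    """Simulate a survey sample and the subset that consents to linkage.

    All numbers here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    full, linked = [], []
    for _ in range(n):
        # Hypothetical standardised cognitive score of the child.
        score = rng.gauss(0.0, 1.0)
        # Assumed consent probability: rises with the score,
        # clipped so that it stays between 0.05 and 0.95.
        p_consent = min(0.95, max(0.05, 0.7 + 0.1 * score))
        full.append(score)
        if rng.random() < p_consent:
            linked.append(score)
    return full, linked


def mean(values):
    return sum(values) / len(values)


full, linked = simulate_linked_sample()
consent_rate = len(linked) / len(full)
# Difference between the linked-sample mean and the full-sample mean.
bias = mean(linked) - mean(full)

print(f"consent rate: {consent_rate:.2f}")
print(f"mean score, full sample:   {mean(full):.3f}")
print(f"mean score, linked sample: {mean(linked):.3f}")
print(f"composition bias:          {bias:.3f}")
```

Under these assumptions the linked sample has a higher mean score than the full sample, because children with lower scores are less likely to end up in it. Non-linkage after consent, which the simulation ignores, could shift this difference in either direction.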

A full paper on this topic was published in the International Journal of Social Research Methodology: Mostafa, T. (2015). Variation within Households in Consent to Link Survey Data to Administrative Records: Evidence from the UK Millennium Cohort Study. International Journal of Social Research Methodology. 579.2015.1019264


1 Sakshaug, J., Couper, M., Ofstedal, M., & Weir, D. (2012). Linking survey and administrative records: Mechanisms of consent. Sociological Methods and Research, 41, 535–569

2 Korbmacher, J., & Schroeder, M. (2013). Consent when linking survey data with administrative records: The role of the interviewer. Survey Research Methods, 7, 115–131

3 Sala, E., Burton, J., & Knies, G. (2012). Correlates of obtaining informed consent to data linkage: Respondent, interview, and interviewer characteristics. Sociological Methods and Research, 41, 414–439