Note: This research was funded in part by Wellcome. NHS England is the data controller, TPP is the data processor, and the key researchers on OpenSAFELY are acting on behalf of NHS England. This implementation of OpenSAFELY is hosted within the TPP environment, which is accredited to the ISO 27001 information security standard and is NHS IG Toolkit–compliant (24); patient data have been pseudonymized for analysis and linkage using industry standard cryptographic hashing techniques; all pseudonymized data sets transmitted for linkage onto OpenSAFELY are encrypted; access to the platform is via a virtual private network connection and is restricted to a small group of researchers; the researchers hold contracts with NHS England and only access the platform to initiate database queries and statistical models; all database activity is logged; and only aggregate statistical outputs leave the platform environment, following best practice for anonymization of results, such as statistical disclosure control for low cell counts (25). The OpenSAFELY research platform adheres to the obligations of the U.K. General Data Protection Regulation and the Data Protection Act 2018. In March 2020, the Secretary of State for Health and Social Care used powers under the U.K. Health Service (Control of Patient Information) Regulations 2002 to require organizations to process confidential patient information for the purposes of protecting public health, providing health care services to the public, and monitoring and managing the COVID-19 outbreak and incidents of exposure; this sets aside the requirement for patient consent (26). This was extended in November 2022 for the NHS England OpenSAFELY COVID-19 research platform (27). In some cases of data sharing, the common law duty of confidence is met using, for example, patient consent or support from the Health Research Authority Confidentiality Advisory Group (28). Taken together, these provide the legal bases to link patient data sets on the OpenSAFELY platform. General practitioner practices, from which the primary care data are obtained, are required to share relevant health information to support the public health response to the pandemic and have been informed of the OpenSAFELY analytics platform. This study was approved by the Health Research Authority (REC reference 20/LO/0651) and by the London School of Hygiene and Tropical Medicine Ethics Board (reference 21863). Dr. Sterne is the guarantor of the study.
Disclaimer: The views expressed in this article are those of the authors and not necessarily those of the National Institute for Health and Care Research (NIHR), NHS England, Public Health England, or the Department of Health and Social Care.
Acknowledgment: The authors are grateful for the support received from the TPP Technical Operations team throughout this work and for the generous assistance from the information governance and database teams at NHS England and the NHS England Directorate.
Financial Support: This study was supported by the COVID-19 Longitudinal Health and Wellbeing National Core Study, funded by UK Research and Innovation (UKRI) Medical Research Council (MRC) (grant reference MC_PC_20059), and the COVID-19 Data and Connectivity National Core Study, led by Health Data Research UK in partnership with the Office for National Statistics and funded by UKRI MRC (MC_PC_20058). The OpenSAFELY Platform is supported by grants from the Wellcome Trust (222097/Z/20/Z); UKRI MRC (MR/V015757/1, MR/W016729/1); National Institute for Health and Care Research (NIHR135559, COV-LT2-0073), and Health Data Research UK (HDRUK2021.000, 2021.0157). TPP provided technical expertise and infrastructure within its data center pro bono in the context of a national emergency. Drs. Horne and Sterne are funded by the NIHR Bristol Biomedical Research Centre. Dr. Sterne is funded by Health Data Research UK South-West. The funders had no role in the design of the study; collection, analysis, or interpretation of the data; writing of the report; or the decision to submit the manuscript for publication.
Reproducible Research Statement: Study protocol, statistical code, and data set: Access to the underlying identifiable and potentially reidentifiable pseudonymized electronic health record data is tightly governed by various legislative and regulatory frameworks and restricted by best practice. The data in OpenSAFELY are drawn from general practice data across England, where TPP is the data processor. TPP developers initiate an automated process to create pseudonymized records in the core OpenSAFELY database, which are copies of key structured data tables in the identifiable records. These pseudonymized records are linked onto key external data resources that have also been pseudonymized via SHA-512 1-way hashing of NHS numbers using a shared salt. Bennett Institute for Applied Data Science developers and principal investigators holding contracts with NHS England have access to the OpenSAFELY pseudonymized data tables as needed to develop the OpenSAFELY tools. These tools in turn enable researchers with OpenSAFELY data access agreements to write and execute code for data management and data analysis without direct access to the underlying raw pseudonymized patient data and to review the outputs of this code. All code for the full data management pipeline, from raw data to completed results for this analysis, and for the OpenSAFELY platform as a whole is available for review at https://github.com/OpenSAFELY. All study code is available for inspection and reuse under MIT license at https://github.com/opensafely/covid-vaccine-effectiveness-sequential-vs-single.
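The linkage step described above, in which each data source hashes NHS numbers with SHA-512 and a shared salt, can be illustrated with a minimal sketch. It is not the OpenSAFELY or TPP implementation: the function name `pseudonymize`, the whitespace normalization, the choice to prepend the salt to the identifier, and the example salt and identifier are all assumptions made for illustration, because the statement does not say how the salt and the NHS number are combined.

```python
import hashlib


def pseudonymize(nhs_number: str, shared_salt: bytes) -> str:
    """Return a hex SHA-512 digest of a salted identifier.

    Assumption: the salt is prepended to the identifier and spaces are
    stripped first. The source does not describe these details.
    """
    normalized = nhs_number.replace(" ", "")
    digest = hashlib.sha512(shared_salt + normalized.encode("utf-8"))
    return digest.hexdigest()


# Hypothetical salt and made-up identifier, for illustration only.
salt = b"example-shared-salt"
key_from_primary_care = pseudonymize("999 000 0001", salt)
key_from_external_source = pseudonymize("9990000001", salt)

# Two sources that use the same salt derive the same pseudonymous key,
# so their records can be joined without exchanging the NHS number.
print(key_from_primary_care == key_from_external_source)
```

Because the hash is one-way, the key cannot be reversed to recover the NHS number directly, and a source that uses a different salt produces keys that do not match.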
Corresponding Author: Jonathan A.C. Sterne, PhD, Population Health Sciences, University of Bristol, Oakfield House, Oakfield Grove, Bristol, BS8 2BN United Kingdom; e-mail, [email protected].
Correction: This article was amended on 6 June 2023 to correct the key in Figure 2. A correction has been published (doi:10.7326/L23-0217).
Author Contributions: Conception and design: W.J. Hulme, E. Williamson, H.I. McDonald, C.T. Rentsch, L. Tomlinson, S.J.W. Evans, L. Smeeth, M.A. Hernán, J.A.C. Sterne.
Analysis and interpretation of the data: W.J. Hulme, E. Williamson, E.M.F. Horne, A. Green, H.I. McDonald, C.T. Rentsch, L. Tomlinson, I.J. Douglas, S.J.W. Evans, L. Smeeth, T. Palmer, M.A. Hernán, J.A.C. Sterne.
Drafting of the article: W.J. Hulme, E.M.F. Horne, E. Williamson, T. Palmer, M.A. Hernán, J.A.C. Sterne.
Critical revision for important intellectual content: W.J. Hulme, E. Williamson, H.I. McDonald, K. Bhaskaran, C.T. Rentsch, S.J.W. Evans, L. Smeeth, T. Palmer, M.A. Hernán, J.A.C. Sterne.
Final approval of the article: W.J. Hulme, E. Williamson, E.M.F. Horne, A. Green, H.I. McDonald, A.J. Walker, H.J. Curtis, C.E. Morton, B. MacKenna, R. Croker, A. Mehrkar, S. Bacon, D. Evans, P. Inglesby, S. Davy, K. Bhaskaran, A. Schultze, C.T. Rentsch, L. Tomlinson, I.J. Douglas, S.J.W. Evans, L. Smeeth, T. Palmer, B. Goldacre, M.A. Hernán, J.A.C. Sterne.
Provision of study materials or patients: W.J. Hulme, A.J. Walker, C.E. Morton, R. Croker, S. Bacon, D. Evans, P. Inglesby, S. Davy.
Statistical expertise: W.J. Hulme, E. Williamson, E.M.F. Horne, K. Bhaskaran, C.T. Rentsch, S.J.W. Evans, T. Palmer, J.A.C. Sterne.
Obtaining of funding: L. Tomlinson, L. Smeeth, B. Goldacre, J.A.C. Sterne.
Administrative, technical, or logistic support: W.J. Hulme, A. Green, H.J. Curtis, C.E. Morton, B. MacKenna, A. Mehrkar, S. Bacon, D. Evans, P. Inglesby, S. Davy, A. Schultze.
Collection and assembly of data: W.J. Hulme, E. Williamson, E.M.F. Horne, A. Green, A.J. Walker, C.E. Morton, S. Bacon, D. Evans, P. Inglesby, S. Davy, I.J. Douglas.
This article was published at Annals.org on 2 May 2023.
The CEMP Measure: Overcoming Challenges in COVID-19 Research
Hulme et al. rightly stress the importance of addressing selection effects when assessing the effectiveness of COVID-19 vaccines using observational data (1). They offer two proposals for doing so. However, their methods require high-quality, population-level health data on both vaccinated and unvaccinated persons. Those data were available for their U.K.-based study but are generally unavailable in the U.S. We have developed and demonstrated a novel, robust approach to addressing selection effects when studying vaccine effectiveness (VE) against mortality, using only death and vaccination records (2-4). Our approach relies on the idea that the predictors of non-COVID natural death are good proxies for background health and risk of COVID death. We measure VE and relative mortality risk (RMR = 1 – VE) using a novel outcome measure, the COVID Excess Mortality Percentage (CEMP). CEMP is the number of COVID deaths divided by the number of non-COVID natural deaths in a group of decedents. The group can be defined by any variables available from death records, including age, gender, race/ethnicity, area-level socio-economic status, and education. The CEMP denominator controls for differences in underlying population health between two groups, such as vaccinated versus unvaccinated adults, two-dose versus three-dose vaccinees, or Pfizer versus Moderna vaccinees. RMR between two groups can be retrieved as an odds ratio, in either a univariate comparison between two groups or a multivariate logistic analysis, although in our experience, a simple univariate comparison within an age range is sufficient to address known confounders. The CEMP-based approach for studying VE relies only on death certificates, which are available for all decedents, linked to vaccination data. Racial/ethnic differences in COVID-19 mortality rates can be measured using only death certificate data. Using data only on decedents avoids challenges in estimating the population at risk.
For example, population statistics may miscount some racial/ethnic groups. Race/ethnicity may also be inaccurately captured in death certificates, but comparisons between groups will be unbiased unless miscoding differs between COVID-19 decedents and decedents from other natural causes. Miscoding of COVID-19 as cause of death will bias VE estimates only if miscoding differs between groups. The CEMP-based approach can be used to compare VE for 1-vs-2-vs-3 doses of a vaccine (2), study waning over time (2), compare vaccines (3), and compare mortality rates between racial/ethnic groups (4). It can complement and be usefully compared to other approaches, including those proposed by Hulme et al.
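The univariate comparison described above can be sketched in a few lines. This is an illustration under stated assumptions, not the code used in references 2 to 4: the class and function names are hypothetical, and the counts are invented to show the arithmetic. CEMP is returned as a ratio rather than scaled to a percentage. The ratio of two CEMPs equals the odds ratio of the 2x2 table of cause of death (COVID versus non-COVID natural) by group, which is how RMR is read off here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecedentCounts:
    """Death counts for one group of decedents (for example, one age range)."""
    covid_deaths: int
    non_covid_natural_deaths: int


def cemp(counts: DecedentCounts) -> float:
    """COVID deaths divided by non-COVID natural deaths in the group."""
    if counts.non_covid_natural_deaths <= 0:
        raise ValueError("CEMP needs at least one non-COVID natural death")
    return counts.covid_deaths / counts.non_covid_natural_deaths


def relative_mortality_risk(group: DecedentCounts,
                            reference: DecedentCounts) -> float:
    """Ratio of the two CEMPs, which equals the odds ratio of the 2x2 table
    (COVID vs. non-COVID natural death) by (group vs. reference)."""
    reference_cemp = cemp(reference)
    if reference_cemp == 0:
        raise ValueError("reference group has no COVID deaths")
    return cemp(group) / reference_cemp


def vaccine_effectiveness(vaccinated: DecedentCounts,
                          unvaccinated: DecedentCounts) -> float:
    """VE = 1 - RMR, with the unvaccinated group as the reference."""
    return 1.0 - relative_mortality_risk(vaccinated, unvaccinated)


# Invented counts within a single age range, for illustration only.
vaccinated = DecedentCounts(covid_deaths=50, non_covid_natural_deaths=1000)
unvaccinated = DecedentCounts(covid_deaths=200, non_covid_natural_deaths=800)

rmr = relative_mortality_risk(vaccinated, unvaccinated)
ve = vaccine_effectiveness(vaccinated, unvaccinated)
print(f"RMR = {rmr:.2f}, VE = {ve:.2f}")  # prints "RMR = 0.20, VE = 0.80"
```

In this invented example, the vaccinated group has a CEMP of 0.05 and the unvaccinated group 0.25, so RMR is 0.20 and VE is 0.80. No population denominator enters the calculation, which is the point of restricting the analysis to decedents.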
References
1. Hulme WJ, Williamson E, Horne EMF, et al. Challenges in Estimating the Effectiveness of COVID-19 Vaccination Using Observational Data. Ann Intern Med. 2023;176:685-693. https://doi.org/10.7326/M21-4269
2. Atanasov V, Barreto N, Whittle J, et al. Understanding COVID-19 Vaccine Effectiveness Against Death Using a Novel Measure: COVID Excess Mortality Percentage. Vaccines. 2023;11(2):379. https://www.mdpi.com/2076-393X/11/2/379
3. Atanasov V, Barreto N, Whittle J, et al. Selection Effects and COVID-19 Mortality Risk after Pfizer vs. Moderna Vaccination: Evidence from Linked Mortality and Vaccination Records. Vaccines. 2023;11(5):971. https://doi.org/10.3390/vaccines11050971
4. Yuan AY, Atanasov V, Barreto N, et al. Racial/Ethnic Disparities in COVID-19 Mortality: National Evidence from Death Certificates (working paper 2023), at http://ssrn.com/abstract=4317312