Research and Reporting Methods
6 August 2013


    A universal challenge in studies that quantify the accuracy of diagnostic tests is establishing whether each participant has the disease of interest. Ideally, the same preferred reference standard would be used for all participants; however, for practical or ethical reasons, alternative reference standards, which are often less accurate, are frequently used instead. The use of different reference standards across participants in a single study is known as differential verification.

    Differential verification can cause severely biased accuracy estimates of the test or model being studied. Many variations of differential verification exist, but not all introduce the same risk of bias. A risk-of-bias assessment requires detailed information about which participants receive which reference standards and an estimate of the accuracy of the alternative reference standard. This article classifies types of differential verification and explores how they can lead to bias. It also provides guidance on how to report results and assess the risk of bias when differential verification occurs and highlights potential ways to correct for the bias.
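
    As a rough illustration of how this bias can arise, the sketch below computes the expected accuracy estimates in one hypothetical scenario: participants with a positive index test are verified with a perfect preferred reference standard, and participants with a negative index test are verified with an imperfect alternative. The function name, the parameter names, the numeric values, and the assumption that errors of the alternative reference standard are independent of the index test given true disease status are all illustrative choices, not taken from the article.

```python
def observed_accuracy(prevalence, se, sp, se_alt, sp_alt):
    """Expected (sensitivity, specificity) of an index test under one
    hypothetical differential-verification pattern.

    Assumptions (illustrative only):
      - index-test positives are verified by a perfect preferred reference;
      - index-test negatives are verified by an alternative reference with
        sensitivity se_alt and specificity sp_alt;
      - alternative-reference errors are independent of the index test
        result given the true disease status.
    """
    # True cell proportions of the 2x2 table (index test vs. true status).
    tp = prevalence * se
    fn = prevalence * (1 - se)
    fp = (1 - prevalence) * (1 - sp)
    tn = (1 - prevalence) * sp

    # Index-test negatives are classified by the imperfect alternative,
    # so some true cases are missed and some non-cases are labeled diseased.
    obs_fn = fn * se_alt + tn * (1 - sp_alt)
    obs_tn = fn * (1 - se_alt) + tn * sp_alt

    # Index-test positives keep their true classification (perfect reference).
    obs_sensitivity = tp / (tp + obs_fn)
    obs_specificity = obs_tn / (obs_tn + fp)
    return obs_sensitivity, obs_specificity


if __name__ == "__main__":
    true_se, true_sp = 0.80, 0.90
    # Hypothetical alternative reference that misses half of the true cases.
    est_se, est_sp = observed_accuracy(0.20, true_se, true_sp,
                                       se_alt=0.50, sp_alt=1.00)
    print(f"true sensitivity {true_se:.3f}, expected estimate {est_se:.3f}")
    print(f"true specificity {true_sp:.3f}, expected estimate {est_sp:.3f}")
```

    In this scenario the alternative reference standard misses diseased participants among the index-test negatives, so false negatives are undercounted and both accuracy estimates are inflated. Other verification patterns, or errors that are correlated with the index test, can bias the estimates in different directions, which is why the assessment depends on who received which reference standard and how accurate the alternative is.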

