Research and Reporting Considerations for Observational Studies Using Electronic Health Record Data
Electronic health records (EHRs) are an increasingly important source of real-world health care data for observational research. Analyses of data collected for purposes other than research require careful consideration of data quality as well as the general research and reporting principles relevant to observational studies. The core principles for observational research in general also apply to observational research using EHR data, and these are well addressed in prior literature and guidelines. This article provides additional recommendations for EHR-based research. Considerations unique to EHR-based studies include assessment of the accuracy of computer-executable cohort definitions that can incorporate unstructured data from clinical notes and management of data challenges, such as irregular sampling, missingness, and variation across time and place. Principled application of existing research and reporting guidelines alongside these additional considerations will improve the quality of EHR-based observational studies.
Observational research helps to advance clinical knowledge and inform the practice of medicine. Electronic health records (EHRs) contain large quantities of health care data that are captured during care and are an increasingly important resource for conducting observational health research (1). The potential value of these data relates to the large volume of data drawn from real-world practice that may include more diverse patients and conditions than are feasible to include in studies that rely on primary data collection (2, 3). Although EHRs typically provide larger quantities of clinical data than are available from surveys, registries, and clinical trials, the quality of these data—which were not collected for research purposes—raises important research and reporting considerations.
The core considerations for observational research are the same whether the research uses data collected primarily for research purposes or EHR data collected during the course of care. These core considerations are well described in reporting guidelines, such as STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) (4). The RECORD (REporting of studies Conducted using Observational Routinely-collected health Data) (5) guidelines extend the STROBE guideline with related recommendations for studies using routinely collected health data, which are directly relevant to EHR-based studies.
This article is intended to complement existing guidelines by describing additional research and reporting issues that should be considered when conducting, reporting, and interpreting EHR-based studies. Issues encountered in our own prior research (6–8) and discussed by collaborative groups, such as the Observational Health Data Sciences and Informatics (OHDSI) (6) initiative, inform our recommendations. Issues that we address include assessment of the accuracy of algorithmic cohort definitions and electronic phenotyping that can incorporate unstructured data, such as that from clinical notes (9, 10), and managing common irregularities of EHR data that can bias study results, such as irregular sampling, missingness, and nonstationarity across time and place (11–13). We use 2 examples to illustrate some of these issues.
Example 1: Identifying Primary Care Patients With High-Risk Opioid Use
We conducted a study to quantify the prevalence of chronic opioid use and determine whether primary care prescribing guidelines could decrease it (7). Because primary data collection or manual chart abstraction would be prohibitively expensive, we used EHR data collected in the context of routine primary care. We initially sought to identify a cohort of patients with “prescription opioid misuse”; behaviors of interest included breach of opioid pain contracts (14), medication diversion, premature refills, and chronic use of high dosages. Unfortunately, we quickly found it difficult to develop an accurate computer-executable definition of “prescription opioid misuse” because formal diagnostic codes were sparse and inconsistent. This is commonly the case for clinical data not closely linked to billing or compliance incentives (15, 16). For many medical conditions, less than 10% of the affected individuals' EHRs contain the respective International Classification of Diseases (ICD) diagnosis codes (17). Diagnostic coding accuracy also varies with setting, provider type, and whether a billing code specialist assigned the code (16, 18–20).
Documentation and workflow variability introduced additional challenges. Notably, clinicians did not use a standardized electronic note template for screening questionnaires that could facilitate simple text recognition of such terms as “opioid contract” or “prescription drug monitoring program.” Thus, we had to define our cohort on the basis of alternate structured elements, such as total quantity of opioids prescribed within a given time window, while excluding patients with any history of a cancer-related diagnosis. Subsequent studies by other researchers illustrate that using algorithmic natural-language processing to refine and validate cohort definitions can identify one-third more patients with opioid misuse than identified by diagnosis codes alone (21, 22).
Example 2: Predicting Diagnostic Test Results
In another study, we sought to identify low-yield diagnostic tests by using EHR data available at the time of test ordering to predict whether common inpatient laboratory tests, such as magnesium, sodium, creatinine, and blood cultures, would yield abnormal results (23, 24). The values of common vital signs and laboratory tests were identified as important predictors of subsequent test results, but so were the mere existence and the number of such measurements, so our model included these counts as predictors.
Issues to Consider When Conducting Observational Studies of EHR Data
Developing “Executable” Cohorts in EHRs
Algorithmic approaches to using EHR data to identify patient cohorts expand the feasibility of large-scale observational research but require validation. These algorithms are referred to using such terms as “cohort definitions,” “health outcomes of interest,” “inclusion/exclusion criteria,” or “phenotypes.” A key step in “electronic phenotyping” is translating human-understandable descriptions into computer-executable definitions (25, 26). This step may involve simple logic that combines structured elements. Example 1 used this approach when we identified patients receiving chronic pain care as those who received opioid prescriptions from primary care providers while excluding patients with opioid prescriptions from oncology providers because these prescriptions may be for palliative care. Other approaches use probabilistic algorithms to estimate the likelihood that a patient belongs to a cohort of interest on the basis of patterns of data observed in other similar patients.
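As a minimal sketch of the simple-logic approach described above, the chronic opioid cohort rule from example 1 might be expressed in executable form as follows. All field names, thresholds, and department labels here are illustrative assumptions, not a specific EHR schema or the exact definition used in the study.

```python
from datetime import date

def in_chronic_opioid_cohort(prescriptions, window_days=90, min_prescriptions=3):
    """Flag a patient as receiving chronic opioid therapy from primary care.

    prescriptions: list of dicts with hypothetical keys 'drug_class',
    'department', and 'date'. Rule: at least min_prescriptions opioid
    prescriptions from primary care within any window_days span, excluding
    patients with any opioid prescription from oncology (which may reflect
    palliative care rather than chronic pain management).
    """
    opioids = [p for p in prescriptions if p["drug_class"] == "opioid"]
    if any(p["department"] == "oncology" for p in opioids):
        return False  # exclusion criterion: possible palliative-care use
    pc_dates = sorted(p["date"] for p in opioids
                      if p["department"] == "primary care")
    # Look for min_prescriptions fills inside any window_days-long window.
    for i in range(len(pc_dates) - min_prescriptions + 1):
        if (pc_dates[i + min_prescriptions - 1] - pc_dates[i]).days <= window_days:
            return True
    return False

rx = [
    {"drug_class": "opioid", "department": "primary care", "date": date(2019, 1, 5)},
    {"drug_class": "opioid", "department": "primary care", "date": date(2019, 2, 3)},
    {"drug_class": "opioid", "department": "primary care", "date": date(2019, 3, 1)},
]
print(in_chronic_opioid_cohort(rx))  # three fills within 90 days -> True
```

The value of writing the definition this way is that every inclusion and exclusion choice is explicit and can be reviewed, versioned, and rerun against new data.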
Augmenting electronic phenotyping algorithms by including additional content from clinical notes is a popular approach, but it is not a cure-all because there can be gross documentation inconsistencies from copy-and-paste templates (27, 28) and notes may ultimately only provide incremental information beyond deliberate use of more consistently available structured data elements (29). These additional layers of complexity require their own evaluation, consistent with recommendations 6.1 and 6.2 from RECORD (5). For a sample of the cases considered, a reference standard must be established for whether they meet the cohort definition. This often requires manual chart review by multiple domain experts, with assessment of interrater reliability (for example, kappa score) (30). The algorithmic approach can then be evaluated relative to the reference standard in terms of diagnostic and information retrieval metrics (31) of precision (positive predictive value) and recall (sensitivity). This allows researchers and reviewers to assess whether the algorithmic cohort definition can be extrapolated to larger samples with satisfactory results. Such projects as Phenotype KnowledgeBase (32) and OHDSI support these efforts by collecting a growing number of publicly available, human-understandable, and computer-executable definitions.
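The evaluation metrics named above can be computed directly once chart-review labels are in hand. The following sketch shows Cohen's kappa for interrater reliability between two reviewers, and precision (positive predictive value) and recall (sensitivity) of an algorithmic cohort definition against the adjudicated reference standard; the binary label lists are illustrative.

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters' binary labels (1 = in cohort)."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each rater's marginal positive rate.
    p_a = sum(labels_a) / n
    p_b = sum(labels_b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)

def precision_recall(algorithm, reference):
    """Compare algorithmic cohort membership against the reference standard."""
    tp = sum(1 for y, r in zip(algorithm, reference) if y and r)
    fp = sum(1 for y, r in zip(algorithm, reference) if y and not r)
    fn = sum(1 for y, r in zip(algorithm, reference) if not y and r)
    precision = tp / (tp + fp)  # positive predictive value
    recall = tp / (tp + fn)     # sensitivity
    return precision, recall

# Illustrative labels for 5 reviewed charts.
algorithm = [1, 1, 1, 0, 0]
reference = [1, 1, 0, 1, 0]
print(precision_recall(algorithm, reference))
```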
EHR Data Irregularities
Confounding, a well-recognized challenge in all observational research, is magnified when studies use broadly available EHR data collected by individuals providing care rather than by those curating data for research or billing purposes. For example, because sicker patients tend to receive more testing and treatment, confounding by indication (33) can bias the predictive value of laboratory results (13). Strategies for addressing such confounding are an important topic that is well covered in existing literature (34–42).
Missing data are another challenge in observational studies, one that can be magnified when EHR data are used. Data in an EHR are often missing not at random (43, 44). Gaps in a patient's record may result from loss to follow-up or transition to another care provider or insurer. Alternatively, data may be missing because of errors in populating a database record or incomplete linkage of different records belonging to one patient. When data are missing for reasons related to patient- or provider-specific factors, such as the patient being too sick to seek health care, the missing-at-random assumption is violated. Statistical methods generally used to handle missing data include multiple imputation and inverse probability weighting (43, 45, 46) and have been applied to studies using EHR data (11, 44). Another challenge is “nondata” generated by copying and pasting of note information or by inappropriate carry-forward of discontinued medications or of resolved diagnoses or symptoms. In some situations, it is possible to discern the presence of the workflow producing the nondata (such as audit logs for copied text), but distinguishing true data from nondata can be challenging.
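To make the inverse probability weighting idea concrete, the following sketch estimates a population mean laboratory value when the chance of being measured depends on patient characteristics. The records and the modeled observation probabilities are entirely hypothetical; in practice the probabilities would come from a fitted model of the measurement process.

```python
def ipw_mean(records):
    """Inverse-probability-weighted mean of observed values.

    records: list of dicts with 'value' (None if never measured) and
    'p_observed' (a modeled probability that the value was measured).
    Each observed value is weighted by 1 / p_observed, so rarely measured
    patients count for more, offsetting informative missingness.
    """
    num = sum(r["value"] / r["p_observed"]
              for r in records if r["value"] is not None)
    den = sum(1.0 / r["p_observed"]
              for r in records if r["value"] is not None)
    return num / den

records = [
    {"value": 2.0, "p_observed": 0.9},   # sicker patients: measured often
    {"value": 2.1, "p_observed": 0.9},
    {"value": 1.0, "p_observed": 0.2},   # healthier patients: rarely measured
    {"value": None, "p_observed": 0.2},  # unmeasured healthy patient
]
# The naive mean over-weights the frequently measured (sicker) group.
naive = sum(r["value"] for r in records if r["value"] is not None) / 3
print(naive, ipw_mean(records))
```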
Temporal Data Complexity
Electronic health records can provide high-resolution, time-stamped longitudinal data. Yet, misinterpretation of such time stamps can inadvertently “leak” future data into predictive models. For example, observational analysis may indicate that length of hospital stay is associated with growth of resistant bacteria in blood cultures, but length of stay would not be useful for point-of-care predictions because it is future information. More insidious are misleading EHR time stamps, such as clinical progress notes whose contents may carry a time stamp corresponding to note initiation rather than to the timing of the clinical events they describe. Clinical care decisions, note initiation, and note completion may be separated by many hours or even days, so the content of a note may reflect knowledge obtained after its nominal time stamp. Similarly, using a hospital diagnosis-related group (DRG) for sepsis is unlikely to be valid for intrahospital bacteremia predictions, because DRG codes are routinely assigned after hospitalization by coders reviewing completed documentation (16, 47). These irregularities warrant clear specification of the source and timing of available data elements in EHR-based studies, including whether those elements would actually be available in the live clinical settings where the results are intended to apply.
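One practical guard against this kind of leakage is to filter candidate features by the time each datum became *available* in the record (for example, a result-finalized or note-signed time stamp), not by its nominal clinical time stamp. A minimal sketch, with illustrative event fields:

```python
from datetime import datetime

def features_available_at(events, index_time):
    """Return only features visible in the record at prediction time.

    events: list of dicts with 'name', 'value', and 'available_at', the
    hypothetical time stamp at which the datum became visible in the EHR.
    Filtering on availability time, rather than on the nominal clinical
    time stamp, prevents future information from leaking into the model.
    """
    return {
        e["name"]: e["value"]
        for e in sorted(events, key=lambda e: e["available_at"])
        if e["available_at"] <= index_time
    }

events = [
    {"name": "creatinine", "value": 1.1,
     "available_at": datetime(2020, 3, 1, 8, 0)},
    # A discharge note signed days later must be excluded from a
    # day-1 prediction even if it describes day-1 events.
    {"name": "discharge_note_sepsis", "value": True,
     "available_at": datetime(2020, 3, 6, 17, 0)},
]
print(features_available_at(events, datetime(2020, 3, 1, 12, 0)))
```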
Care captured in EHRs changes over time, often rapidly, as a result of the introduction of new tests and therapies, new clinical evidence, changing incentives, and EHR infrastructure alterations (such as changing vendors, modules, or naming standards). In one study predicting future hospital practices, the relevance of EHR data decayed with a half-life of about 4 months for overall practice trends (48). For individual patient charts, static clinical information can be outdated within a matter of hours (49). In another example, time variation had a strong effect on the performance of wound healing prediction models (50). Such change represents nonstationarity, in which the data-generating process changes over time (51). However, observational studies often report findings from a single snapshot of a data set in time.
Changes in coding and documentation practices or introduction of new EHR software versions also drive data nonstationarity. As a result, study variable definitions developed by using historical data or data from a different source (such as a different health system) may find fewer subjects or the wrong subjects, and associations between treatment and effect may not hold when analyses are replicated with different data (52). Nonstationarity can similarly affect the calibration and clinical utility of predictive models (53). Diagnostics summarizing how longitudinal EHR data sets change over time can support observational study reporting, such as year-over-year descriptive statistics on the prevalence of categories of data (for example, laboratory records, procedure records, and mortality data) as well as of specific data values (for example, the frequency of specific diagnosis codes).
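Such a year-over-year diagnostic can be as simple as tabulating record counts per data category per year, so that abrupt shifts in coding or documentation practice become visible. A sketch, using illustrative record fields:

```python
from collections import Counter

def yearly_counts(records):
    """Tabulate record counts per category per year.

    records: iterable of dicts with hypothetical 'year' and 'category'
    keys. Returns ({category: [count per year]}, sorted list of years).
    Abrupt jumps or drops in a category's series flag possible
    nonstationarity in the data-generating process.
    """
    counts = Counter((r["year"], r["category"]) for r in records)
    years = sorted({y for y, _ in counts})
    categories = sorted({c for _, c in counts})
    table = {c: [counts.get((y, c), 0) for y in years] for c in categories}
    return table, years

records = (
    [{"year": 2017, "category": "lab"}] * 5
    + [{"year": 2018, "category": "lab"}] * 6
    + [{"year": 2018, "category": "icd10_dx"}] * 4  # new coding appears
)
table, years = yearly_counts(records)
print(years, table)  # a jump from 0 to 4 diagnosis records flags a shift
```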
In example 2 (laboratory diagnostic prediction), validation on “future” data may better reflect whether the models will be generalizable to future data streams than would random cross-validation or hold-out test sets. In other words, researchers should develop models on early years of data while evaluating on later years of data. Furthermore, nonstationarity indicates that models and cohort definitions based on EHR data probably will need to be regularly updated to match current data structures and processes.
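The temporal validation scheme described here amounts to splitting on time rather than shuffling patients across the study period. A minimal sketch, with a hypothetical year field on each record:

```python
def temporal_split(records, cutoff_year):
    """Split records into (train, test) by calendar year.

    Models are developed on records before cutoff_year and evaluated on
    records from cutoff_year onward, so evaluation reflects performance
    on genuinely future data rather than on a random hold-out that mixes
    future observations into training.
    """
    train = [r for r in records if r["year"] < cutoff_year]
    test = [r for r in records if r["year"] >= cutoff_year]
    return train, test

records = [{"year": y, "label": y % 2} for y in range(2014, 2020)]
train, test = temporal_split(records, cutoff_year=2018)
print(len(train), len(test))  # 4 training years, 2 evaluation years
```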
Multisite Data Variability and Common Data Models
Reproducibility and replication are well-accepted principles for high-quality observational research but can raise particular challenges for EHR-based studies when different clinical sites use different EHR vendors. Even with a common EHR vendor or otherwise interoperable data structures (for example, Fast Healthcare Interoperability Resources [FHIR]), the idiosyncrasies of local implementation will probably require a laborious, manual, and potentially ambiguous mapping of semantic meaning of data elements. In our laboratory diagnostics example, we wanted to assess reproducibility across multiple sites (Stanford University; University of California, San Francisco; and University of Michigan), requiring manual reconciliation between each site's slightly different data representations. For example, one site may use the term “WBC,” another “white blood cells,” and yet another “white cells.” Other data have less clear reconciliation options, such as one site consolidating aerobic and anaerobic blood culture tests into a single result while another separates the 2 types of tests, preventing directly comparable results across sites.
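In practice, this reconciliation often takes the form of hand-curated mappings from each site's local labels to a shared canonical variable, with unmapped elements flagged for review rather than silently pooled. A sketch, where the site names, mappings, and canonical name are illustrative:

```python
# Hand-curated, per-site mappings from local lab-test labels to a shared
# canonical variable name (all entries here are illustrative examples).
SITE_TO_CANONICAL = {
    "site_a": {"WBC": "leukocyte_count"},
    "site_b": {"white blood cells": "leukocyte_count"},
    "site_c": {"white cells": "leukocyte_count"},
}

def harmonize(site, local_name):
    """Return the canonical variable name for a site's local label.

    Returns None for unmapped elements so they can be flagged for expert
    review instead of being pooled as if they were comparable.
    """
    return SITE_TO_CANONICAL.get(site, {}).get(local_name)

print(harmonize("site_b", "white blood cells"))   # maps to leukocyte_count
print(harmonize("site_a", "blood culture, aerobic"))  # unmapped: needs review
```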
Adopting standard terminologies and common data models (CDMs), such as the Observational Medical Outcomes Partnership (OMOP) CDM, can facilitate multisite observational studies (55). Distributing executable analysis code can in turn provide the most explicit documentation of subtle study design choices and embedded assumptions that may be unclear in the methods sections of study reports. Provision of code enables review by external experts and can promote replication. The use of CDMs can also enable researchers to use turnkey tools for EHR data diagnostics and analysis developed within the respective research communities.
Even if CDMs are used, the processes that convert raw EHR data to research variables require careful consideration and documentation because they may introduce unexpected and unquantified variation in data sets, affecting downstream analyses. For example, following OHDSI conventions to convert EHR data to the OMOP CDM involves mapping source diagnosis codes (such as ICD codes) to Systematized Nomenclature of Medicine (SNOMED) codes, but individual sites may define custom mappings such that different ICD codes may be mapped to the same SNOMED code. Such tools as OHDSI Automated Characterization of Health Information at Large-scale Longitudinal Evidence Systems (ACHILLES) (56) provide a mechanism to generate reports on data quality by flagging potential errors, such as implausible dates or missing data fields. Other research collaboratives, such as the National Patient-Centered Clinical Research Network (PCORnet) or Sentinel Initiative, have developed related approaches and frameworks for data quality assessment (57–59). In cases where site-to-site variability is directly measurable, such as that introduced by different mappings between terminologies, researchers should consider analyses to quantify and report the effect of site variability on measured associations.
In conclusion, EHRs contain large quantities of real-world health care data and are an increasingly important data resource for observational research. Yet, analysis of data collected for nonresearch purposes requires consideration of data quality and observational research and reporting principles. Most of the important considerations for observational research using EHRs are the same as for observational research using other data and are well addressed by existing recommendations. In the Table, we summarize the considerations for EHR-based observational research that we discussed in this article and provide suggestions for reporting on these issues. Our hope is that the principled application of existing research and reporting guidelines alongside these additional considerations will improve the quality of EHR-based observational studies that drive continuously learning health care systems (60).
References
1. Chute CG. Invited commentary: observational research in the age of the electronic health record. Am J Epidemiol. 2014;179:759-61. [PMID: 24488512] doi: 10.1093/aje/kwt443
2. Sherman RE, Anderson SA, Dal Pan GJ, et al. Real-world evidence—what is it and what can it tell us? N Engl J Med. 2016;375:2293-2297. [PMID: 27959688]
3. Bartlett VL, Dhruva SS, Shah ND, et al. Feasibility of using real-world data to replicate clinical trial evidence. JAMA Netw Open. 2019;2:e1912869. [PMID: 31596493] doi: 10.1001/jamanetworkopen.2019.12869
4. von Elm E, Altman DG, Egger M, et al; STROBE Initiative. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Ann Intern Med. 2007;147:573-7. [PMID: 17938396]
5. Benchimol EI, Smeeth L, Guttmann A, et al; RECORD Working Committee. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS Med. 2015;12:e1001885. [PMID: 26440803] doi: 10.1371/journal.pmed.1001885
6. Hripcsak G, Duke JD, Shah NH, et al. Observational Health Data Sciences and Informatics (OHDSI): opportunities for observational researchers. Stud Health Technol Inform. 2015;216:574-8. [PMID: 26262116]
7. Chen JH, Hom J, Richman I, et al. Effect of opioid prescribing guidelines in primary care. Medicine (Baltimore). 2016;95:e4760. [PMID: 27583928] doi: 10.1097/MD.0000000000004760
8. Vashisht R, Jung K, Schuler A, et al. Association of hemoglobin A1c levels with use of sulfonylureas, dipeptidyl peptidase 4 inhibitors, and thiazolidinediones in patients with type 2 diabetes treated with metformin: analysis from the Observational Health Data Sciences and Informatics initiative. JAMA Netw Open. 2018;1:e181755. [PMID: 30646124] doi: 10.1001/jamanetworkopen.2018.1755
9. Newton KM, Peissig PL, Kho AN, et al. Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network. J Am Med Inform Assoc. 2013;20:e147-54. [PMID: 23531748] doi: 10.1136/amiajnl-2012-000896
10. Shivade C, Raghavan P, Fosler-Lussier E, et al. A review of approaches to identifying patient phenotype cohorts using electronic health records. J Am Med Inform Assoc. 2014;21:221-30. [PMID: 24201027] doi: 10.1136/amiajnl-2013-001935
11. Wells BJ, Chagin KM, Nowacki AS, et al. Strategies for handling missing data in electronic health record derived data. EGEMS (Wash DC). 2013;1:1035. [PMID: 25848578] doi: 10.13063/2327-9214.1035
12. Madden JM, Lakoma MD, Rusinak D, et al. Missing clinical and behavioral health data in a large electronic health record (EHR) system. J Am Med Inform Assoc. 2016;23:1143-1149. [PMID: 27079506] doi: 10.1093/jamia/ocw021
13. Agniel D, Kohane IS, Weber GM. Biases in electronic health record data due to processes within the healthcare system: retrospective observational study. BMJ. 2018;361:k1479. [PMID: 29712648] doi: 10.1136/bmj.k1479
14. Hariharan J, Lamb GC, Neuner JM. Long-term opioid contract use for chronic pain management in primary care practice. A five year experience. J Gen Intern Med. 2007;22:485-90. [PMID: 17372797]
15. Wright A, McCoy AB, Hickman TT, et al. Problem list completeness in electronic health records: a multi-site study and assessment of success factors. Int J Med Inform. 2015;84:784-90. [PMID: 26228650] doi: 10.1016/j.ijmedinf.2015.06.011
16. Fisher ES, Whaley FS, Krushat WM, et al. The accuracy of Medicare's hospital claims data: progress has been made, but problems remain. Am J Public Health. 1992;82:243-8. [PMID: 1739155]
17. Wei WQ, Teixeira PL, Mo H, et al. Combining billing codes, clinical notes, and medications from electronic health records provides superior phenotyping performance. J Am Med Inform Assoc. 2016;23:e20-7. [PMID: 26338219] doi: 10.1093/jamia/ocv130
18. Reker DM, Hamilton BB, Duncan PW, et al. Stroke: who's counting what? J Rehabil Res Dev. 2001;38:281-9. [PMID: 11392661]
19. Chescheir N, Meints L. Prospective study of coding practices for cesarean deliveries. Obstet Gynecol. 2009;114:217-23. [PMID: 19622980] doi: 10.1097/AOG.0b013e3181ad9533
20. Al Achkar M, Kengeri-Srikantiah S, Yamane BM, et al. Billing by residents and attending physicians in family medicine: the effects of the provider, patient, and visit factors. BMC Med Educ. 2018;18:136. [PMID: 29895287] doi: 10.1186/s12909-018-1246-7
21. Carrell DS, Cronkite D, Palmer RE, et al. Using natural language processing to identify problem usage of prescription opioids. Int J Med Inform. 2015;84:1057-64. [PMID: 26456569] doi: 10.1016/j.ijmedinf.2015.09.002
22. Manning CD, Schütze H. Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Pr; 1999.
23. Levinson W, Born K, Wolfson D. Choosing wisely campaigns: a work in progress. JAMA. 2018;319:1975-1976. [PMID: 29710232] doi: 10.1001/jama.2018.2202
24. Xu S, Hom J, Balasubramanian S, et al. Prevalence and predictability of low-yield inpatient laboratory diagnostic tests. JAMA Netw Open. 2019;2:e1910967. [PMID: 31509205] doi: 10.1001/jamanetworkopen.2019.10967
25. Richesson RL, Sun J, Pathak J, et al. Clinical phenotyping in selected national networks: demonstrating the need for high-throughput, portable, and computational methods. Artif Intell Med. 2016;71:57-61. [PMID: 27506131] doi: 10.1016/j.artmed.2016.05.005
26. Banda JM, Seneviratne M, Hernandez-Boussard T, et al. Advances in electronic phenotyping: from rule-based definitions to machine learning models. Annu Rev Biomed Data Sci. 2018;1:53-68. [PMID: 31218278] doi: 10.1146/annurev-biodatasci-080917-013315
27. Berdahl CT, Moran GJ, McBride O, et al. Concordance between electronic clinical documentation and physicians' observed behavior. JAMA Netw Open. 2019;2:e1911390. [PMID: 31532513] doi: 10.1001/jamanetworkopen.2019.11390
28. Hirschtick RE. A piece of my mind. Copy-and-paste. JAMA. 2006;295:2335-6. [PMID: 16720812]
29. Marafino BJ, Dudley RA, Shah NH, et al. Accurate and interpretable intensive care risk adjustment for fused clinical data with generalized additive models. AMIA Jt Summits Transl Sci Proc. 2018;2017:166-175. [PMID: 29888065]
30. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;33:159-74. [PMID: 843571]
31. Manning CD, Raghavan P, Schütze H. Evaluation in information retrieval. In: Introduction to Information Retrieval. Cambridge, UK: Cambridge Univ Pr; 2008.
32. Kirby JC, Speltz P, Rasmussen LV, et al. PheKB: a catalog and workflow for creating electronic phenotype algorithms for transportability. J Am Med Inform Assoc. 2016;23:1046-1052. [PMID: 27026615] doi: 10.1093/jamia/ocv202
33. Freemantle N, Marston L, Walters K, et al. Making inferences on treatment effects from real world data: propensity scores, confounding by indication, and other perils for the unwary in observational research. BMJ. 2013;347:f6409. [PMID: 24217206] doi: 10.1136/bmj.f6409
34. Stuart EA, DuGoff E, Abrams M, et al. Estimating causal effects in observational studies using electronic health data: challenges and (some) solutions. EGEMS (Wash DC). 2013;1. [PMID: 24921064] doi: 10.13063/2327-9214.1038
35. Stuart EA. Matching methods for causal inference: a review and a look forward. Stat Sci. 2010;25:1-21. [PMID: 20871802]
36. Cafri G, Wang W, Chan PH, et al. A review and empirical comparison of causal inference methods for clustered observational data with application to the evaluation of the effectiveness of medical devices. Stat Methods Med Res. 2019;28:3142-3162. [PMID: 30203707] doi: 10.1177/0962280218799540
37. Pearl J. Causal inference in statistics: an overview. Stat Surv. 2009;3:96-146.
38. Austin PC. A comparison of 12 algorithms for matching on the propensity score. Stat Med. 2014;33:1057-69. [PMID: 24123228] doi: 10.1002/sim.6004
39. Austin PC. An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behav Res. 2011;46:399-424. [PMID: 21818162]
40. Toh S, García Rodríguez LA, Hernán MA. Confounding adjustment via a semi-automated high-dimensional propensity score algorithm: an application to electronic medical records. Pharmacoepidemiol Drug Saf. 2011;20:849-57. [PMID: 21717528] doi: 10.1002/pds.2152
41. Arbogast PG, Ray WA. Performance of disease risk scores, propensity scores, and traditional multivariable outcome regression in the presence of multiple confounders. Am J Epidemiol. 2011;174:613-20. [PMID: 21749976] doi: 10.1093/aje/kwr143
42. Biondi-Zoccai G, Romagnoli E, Agostoni P, et al. Are propensity scores really superior to standard multivariable analysis? Contemp Clin Trials. 2011;32:731-40. [PMID: 21616172] doi: 10.1016/j.cct.2011.05.006
43. Perkins NJ, Cole SR, Harel O, et al. Principled approaches to missing data in epidemiologic studies. Am J Epidemiol. 2018;187:568-575. [PMID: 29165572] doi: 10.1093/aje/kwx348
44. Beaulieu-Jones BK, Lavage DR, Snyder JW, et al. Characterizing and managing missing structured data in electronic health records: data analysis. JMIR Med Inform. 2018;6:e11. [PMID: 29475824] doi: 10.2196/medinform.8960
45. Wood AM, White IR, Thompson SG. Are missing outcome data adequately handled? A review of published randomized controlled trials in major medical journals. Clin Trials. 2004;1:368-76. [PMID: 16279275]
46. Eekhout I, de Boer RM, Twisk JW, et al. Missing data: a systematic review of how they are reported and handled. Epidemiology. 2012;23:729-32. [PMID: 22584299] doi: 10.1097/EDE.0b013e3182576cdb
47. Averill RF, Goldfield N, Hughes JS, et al. All patient refined diagnosis related groups (APR-DRGs). Version 20.0. Methodology overview. Wallingford, CT: 3M Health Information Systems; 2003. Accessed at www.hcup-us.ahrq.gov/db/nation/nis/APR-DRGsV20MethodologyOverviewandBibliography.pdf on 20 December 2019.
48. Chen JH, Alagappan M, Goldstein MK, et al. Decaying relevance of clinical data towards future decisions in data-driven inpatient clinical order sets. Int J Med Inform. 2017;102:71-79. [PMID: 28495350] doi: 10.1016/j.ijmedinf.2017.03.006
49. Rosenbluth G, Jacolbia R, Milev D, et al. Half-life of a printed handoff document. BMJ Qual Saf. 2016;25:324-8. [PMID: 26558826] doi: 10.1136/bmjqs-2015-004585
50. Jung K, Shah NH. Implications of non-stationarity on predictive modeling using EHRs. J Biomed Inform. 2015;58:168-174. [PMID: 26483171] doi: 10.1016/j.jbi.2015.10.006
51. Moreno-Torres JG, Raeder T, Alaiz-Rodríguez R, et al. A unifying view on dataset shift in classification. Pattern Recognit. 2012;45:521-30.
52. Lindenauer PK, Lagu T, Shieh MS, et al. Association of diagnostic coding with trends in hospitalizations and mortality of patients with pneumonia, 2003-2009. JAMA. 2012;307:1405-13. [PMID: 22474204] doi: 10.1001/jama.2012.384
53. Walsh CG, Sharman K, Hripcsak G. Beyond discrimination: a comparison of calibration methods and clinical usefulness of predictive models of readmission risk. J Biomed Inform. 2017;76:9-18. [PMID: 29079501] doi: 10.1016/j.jbi.2017.10.008
54. Mandel JC, Kreda DA, Mandl KD, et al. SMART on FHIR: a standards-based, interoperable apps platform for electronic health records. J Am Med Inform Assoc. 2016;23:899-908. [PMID: 26911829] doi: 10.1093/jamia/ocv189
55. CommonDataModel. GitHub. Accessed at https://github.com/OHDSI/CommonDataModel on 16 October 2019.
56. Huser V, DeFalco FJ, Schuemie M, et al. Multisite evaluation of a data quality tool for patient-level clinical data sets. EGEMS (Wash DC). 2016;4:1239. [PMID: 28154833] doi: 10.13063/2327-9214.1239
57. Brown JS, Kahn M, Toh S. Data quality assessment for comparative effectiveness research in distributed data networks. Med Care. 2013;51:S22-9. [PMID: 23793049] doi: 10.1097/MLR.0b013e31829b1e2c
58. Callahan TJ, Bauck AE, Bertoch D, et al. A comparison of data quality assessment checks in six data sharing networks. EGEMS (Wash DC). 2017;5:8. [PMID: 29881733] doi: 10.5334/egems.223
59. Huser V, Kahn MG, Brown JS, et al. Methods for examining data quality in healthcare integrated data repositories. Pac Symp Biocomput. 2018;23:628-633. [PMID: 29218922]
60. Smith M, Saunders R, Stuckhardt L, et al. Best Care at Lower Cost: The Path to Continuously Learning Health Care in America. Washington, DC: Institute of Medicine, Committee on the Learning Health Care System in America; 2012.
Author, Article and Disclosure Information
Center for Biomedical Informatics Research, School of Medicine, Stanford University (A.C., N.H.S.)
Division of Hospital Medicine, School of Medicine, Stanford University (J.H.C.)
Financial Support: Drs. Callahan and Shah are supported by the National Institutes of Health (NIH) National Library of Medicine under award 5R01LM01136906 and the NIH National Institute of General Medical Sciences under award 5R01GM10143005. Dr. Chen is supported by the NIH Big Data to Knowledge initiative via the National Institute of Environmental Health Sciences under award K01ES026837.
Disclosures: Dr. Chen is co-owner of Reaction Explorer LLC, which sells chemistry education software not directly related to clinical research or applications. Authors not named here have disclosed no conflicts of interest. Disclosures can also be viewed at www.acponline.org/authors/icmje/ConflictOfInterestForms.do?msNum=M19-0873.
Corresponding Author: Alison Callahan, MISt, PhD, Room X231, Medical School Office Building, Stanford University, 1265 Welch Road, Stanford, CA 94305; e-mail, [email protected]
Current Author Addresses: Dr. Callahan: Room X231, Medical School Office Building, Stanford University, 1265 Welch Road, Stanford, CA 94305.
Dr. Shah: Room X235, Medical School Office Building, Stanford University, 1265 Welch Road, Stanford, CA 94305.
Dr. Chen: Room X213, Medical School Office Building, Stanford University, 1265 Welch Road, Stanford, CA 94305.
Author Contributions: Conception and design: A. Callahan, N.H. Shah.
Drafting of the article: A. Callahan, N.H. Shah, J.H. Chen.
Critical revision for important intellectual content: A. Callahan, N.H. Shah, J.H. Chen.
Final approval of the article: A. Callahan, N.H. Shah, J.H. Chen.
Administrative, technical, or logistic support: N.H. Shah.
This article is part of the Annals supplement “Implementing, Studying, and Reporting Health System Improvement in the Era of Electronic Health Records.” The Moore Foundation (contract number 7107) provided funding for publication of this supplement. Andrew D. Auerbach, MD, MPH (University of California, San Francisco); David W. Bates, MD, MSc (Brigham and Women's Hospital, Harvard Medical School, and Harvard School of Public Health); Jaya K. Rao, MD, MHS (Annals Deputy Editor); and Christine Laine, MD, MPH (Annals Editor in Chief), served as editors for this supplement.