Original Research
4 April 2025

Comparison of Initial Artificial Intelligence (AI) and Final Physician Recommendations in AI-Assisted Virtual Urgent Care Visits

Publication: Annals of Internal Medicine
Volume 178, Number 4
Visual Abstract. Comparison of Initial Artificial Intelligence (AI) and Final Physician Recommendations in AI-Assisted Virtual Urgent Care Visits
This study compares the concordance and quality of initial artificial intelligence (AI) diagnoses and management recommendations to the diagnoses and recommendations of physicians who had access to the AI recommendations but may or may not have viewed them. Adults with common acute symptoms seen in a virtual urgent care clinic comprised the study population.

Abstract

Background:

Whether artificial intelligence (AI) assistance is associated with quality of care is uncertain.

Objective:

To compare initial AI recommendations with final recommendations of physicians who had access to the AI recommendations and may or may not have viewed them.

Design:

Retrospective cohort study.

Setting:

Cedars-Sinai Connect, an AI-assisted virtual urgent care clinic with intake questions via structured chat. When confidence is sufficient, AI presents diagnosis and management recommendations (prescriptions, laboratory tests, and referrals).

Patients:

461 physician-managed visits with AI recommendations of sufficient confidence and complete medical records for adults with respiratory, urinary, vaginal, eye, or dental symptoms from 12 June to 14 July 2024.

Measurements:

Concordance of diagnosis and management recommendations of initial AI recommendations and final physician recommendations. Physician adjudicators scored all nonconcordant and a sample of concordant recommendations as optimal, reasonable, inadequate, or potentially harmful.
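Because every nonconcordant visit was scored but only a sample of concordant visits was, cohort-level estimates need each scored visit weighted by the inverse of its sampling probability. The following is a minimal sketch of that idea, not the authors' statistical code (which is posted in the supplement). The function names, stratum sizes, and ratings are hypothetical toy values, not study data.

```python
def sampling_weight(stratum_size, n_scored):
    """Inverse-probability weight for a stratum in which n_scored of
    stratum_size visits were adjudicated."""
    return stratum_size / n_scored


def weighted_share(records, target):
    """Weighted share of scored visits whose rating equals `target`.

    Each record is a (rating, weight) pair.
    """
    total_weight = sum(weight for _, weight in records)
    matching = sum(weight for rating, weight in records if rating == target)
    return matching / total_weight


# Hypothetical toy data (assumption, for illustration only):
# 4 nonconcordant visits, all scored, and 10 concordant visits, 2 scored.
w_non = sampling_weight(4, 4)    # every nonconcordant visit represents itself
w_con = sampling_weight(10, 2)   # each sampled concordant visit represents 5

records = [
    ("optimal", w_non),
    ("reasonable", w_non),
    ("inadequate", w_non),
    ("optimal", w_non),
    ("optimal", w_con),
    ("reasonable", w_con),
]

share_optimal = weighted_share(records, "optimal")
print(f"Weighted share rated optimal: {share_optimal:.1%}")
```

The weights sum to the full cohort size (14 in this toy case), which is why the abstract can report results "among the 461 weighted visits" even though not every visit was adjudicated.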

Results:

Initial AI and final physician recommendations were concordant for 262 visits (56.8%). Among the 461 weighted visits, AI recommendations were more frequently rated as optimal (77.1% [95% CI, 72.7% to 80.9%]) than treating physician decisions (67.1% [CI, 62.9% to 71.1%]). Quality scores were equal in 67.9% (CI, 64.8% to 70.9%) of cases, better for AI in 20.8% (CI, 17.8% to 24.0%), and better for treating physicians in 11.3% (CI, 9.0% to 14.2%).
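The headline figures can be checked with simple arithmetic. The short sketch below uses only the numbers reported above; the variable names are illustrative assumptions.

```python
# Figures taken from the Results paragraph.
total_visits = 461
concordant_visits = 262

# Concordance rate and the implied number of nonconcordant visits.
concordance_pct = round(100 * concordant_visits / total_visits, 1)
nonconcordant_visits = total_visits - concordant_visits

# The three quality-comparison categories should partition all weighted visits.
quality_shares = {
    "equal": 67.9,
    "better_for_ai": 20.8,
    "better_for_physician": 11.3,
}
shares_total = round(sum(quality_shares.values()), 1)

print(f"Concordance: {concordance_pct}% ({nonconcordant_visits} nonconcordant)")
print(f"Quality shares sum to {shares_total}%")
```

Note that the difference in optimal ratings (77.1% vs. 67.1%) is not the same quantity as the share of visits where AI scored better (20.8%), because the latter compares ratings visit by visit across all four rating levels.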

Limitations:

Single-center retrospective study. Adjudicators were not blinded to the source of recommendations. It is unknown whether physicians viewed AI recommendations.

Conclusion:

When AI and physician recommendations differed, AI recommendations were more often rated as higher quality. Findings suggest that AI performed better in identifying critical red flags and supporting guideline-adherent care, whereas physicians were better at adapting recommendations to changing information during consultations. Thus, AI may have a role in assisting physician decision making in virtual urgent care.

Primary Funding Source:

K Health.




Comments

Carlos M Swanger, MD 4 April 2025
Erosion of physician clinical acumen by over-reliance on AI

Have you considered measuring the impact of AI support on physicians' engagement in clinical problem solving over time? Might not some physicians begin to abdicate their own decision-making process in favor of just following along with what AI finds and recommends? We would then lose the advantage of human clinical acumen which led to your finding that 11% of the time physicians outperformed AI.

Published In

Annals of Internal Medicine
Volume 178, Number 4, April 2025
Pages: 498-506

History

Published online: 4 April 2025
Published in issue: April 2025

Authors

Affiliations

Dan Zeltzer, PhD
Tel Aviv University, Tel Aviv, Israel (D.Z.)
Zehavi Kugler, MD
K Health, New York, New York (Z.K., L.H., T.Brufman, R.I.B., K.L., T.Beer, I.F., R.S.)
Lior Hayat, MD
K Health, New York, New York (Z.K., L.H., T.Brufman, R.I.B., K.L., T.Beer, I.F., R.S.)
Tamar Brufman, MD
K Health, New York, New York (Z.K., L.H., T.Brufman, R.I.B., K.L., T.Beer, I.F., R.S.)
Ran Ilan Ber, PhD
K Health, New York, New York (Z.K., L.H., T.Brufman, R.I.B., K.L., T.Beer, I.F., R.S.)
Keren Leibovich, PhD
K Health, New York, New York (Z.K., L.H., T.Brufman, R.I.B., K.L., T.Beer, I.F., R.S.)
K Health, New York, New York (Z.K., L.H., T.Brufman, R.I.B., K.L., T.Beer, I.F., R.S.)
Ilan Frank, MSc
K Health, New York, New York (Z.K., L.H., T.Brufman, R.I.B., K.L., T.Beer, I.F., R.S.)
Ran Shaul, BASc
K Health, New York, New York (Z.K., L.H., T.Brufman, R.I.B., K.L., T.Beer, I.F., R.S.)
Caroline Goldzweig, MD, MSHS
Department of Medicine, Cedars-Sinai Medical Center, Los Angeles, California (C.G., J.P.).
Joshua Pevnick, MD, MSHS
Department of Medicine, Cedars-Sinai Medical Center, Los Angeles, California (C.G., J.P.).
Acknowledgment: The authors acknowledge Yael Steuerman, PhD; Ishay Bitton; Michal Hershkovitz; Oron Mozes; Yaniv Cohen; and Zachary Siegel for technical support and Kevin Stephens, MD; David Morley, MD; and Neil Brown, MD, for medical support.
Financial Support: By K Health.
Disclosures: Disclosure forms are available with the article online.
Reproducible Research Statement: Study protocol: Available upon request from the corresponding author. Statistical code: Posted in Supplement 3. Data set: The data used in this study contain identifiable patient information and are not publicly available to protect privacy. However, deidentified data are available for replication purposes upon request, subject to approval by the Cedars-Sinai Institutional Review Board and measures to protect patient confidentiality. Researchers seeking access should contact the corresponding author for details on the data request process.
Corresponding Author: Dan Zeltzer, PhD, Berglas School of Economics, Tel Aviv University, Tel Aviv, Israel 6997801; e-mail, [email protected].
Author Contributions: Conception and design: D. Zeltzer, Z. Kugler, L. Hayat, T. Brufman, R. Ilan Ber, K. Leibovich, R. Shaul, C. Goldzweig.
Analysis and interpretation of the data: D. Zeltzer, Z. Kugler, L. Hayat, T. Brufman, R. Ilan Ber, K. Leibovich, T. Beer, I. Frank, C. Goldzweig, J. Pevnick.
Drafting of the article: D. Zeltzer, Z. Kugler, K. Leibovich, C. Goldzweig.
Critical revision of the article for important intellectual content: D. Zeltzer, Z. Kugler, T. Brufman, R. Ilan Ber, K. Leibovich, R. Shaul, C. Goldzweig, J. Pevnick.
Final approval of the article: D. Zeltzer, Z. Kugler, L. Hayat, T. Brufman, R. Ilan Ber, K. Leibovich, T. Beer, I. Frank, R. Shaul, C. Goldzweig, J. Pevnick.
Statistical expertise: D. Zeltzer, R. Ilan Ber, K. Leibovich.
Obtaining of funding: R. Shaul.
Administrative, technical, or logistic support: L. Hayat, T. Beer, I. Frank, J. Pevnick.
Collection and assembly of data: L. Hayat, K. Leibovich, T. Beer.
This article was published at Annals.org on 4 April 2025.

Citation: Dan Zeltzer, Zehavi Kugler, Lior Hayat, et al. Comparison of Initial Artificial Intelligence (AI) and Final Physician Recommendations in AI-Assisted Virtual Urgent Care Visits. Ann Intern Med. 2025;178:498-506. [Epub 4 April 2025]. doi:10.7326/ANNALS-24-03283