Estimation of Breast Cancer Overdiagnosis in a U.S. Breast Screening Cohort
Estimation of Breast Cancer Overdiagnosis in a U.S. Breast Screening Cohort. Ann Intern Med.2022;175:471-478. [Epub 1 March 2022]. doi:10.7326/M21-3577
Overtreatment, not Overdiagnosis
We read the study by Ryser et al. with great interest. We applaud the authors’ focus on patient-centered, evidence-based care and their use of Bayesian algorithms to quantify a difficult-to-measure metric. All women should be empowered to make informed decisions about their health care, and these data are important for understanding the risks and benefits. The implementation of screening mammography has contributed to a significant reduction in breast cancer morbidity and mortality, and it remains one of our best tools for detecting treatable breast cancer (1, 2). While 1 in 7 cancers may be overdiagnosed and overtreated, 6 in 7 cancers are not (3). Moreover, two-thirds of overdiagnosed cancers were progressive and are considered overdiagnosed only because the patient died of other causes (3). Reducing screening for these reasons would increase the likelihood of missing clinically significant cancers. We believe it is important to remember that without screening, clinicians do not know whether a cancer, indolent or progressive, is present until patients become symptomatic. Rather than reducing our screening for all breast cancers, we should focus on improving our management of indolent cancers. As Ryser et al. note, most of the overdiagnosed malignancies were DCIS. This is likely due in part to the high overall survival of patients with DCIS and to the complexity of DCIS management, given its varying degrees of aggressiveness (4). To this end, we look forward to the results of several trials, including the LORD, LORIS, COMET, and LORETTA trials, which seek to assess outcomes of active surveillance with or without endocrine therapy for low-grade DCIS (5). We believe breast cancer screening provides valuable information. We should not forgo this resource because it detects possibly indolent cancers, but rather leverage it to better identify which cancers will become clinically significant.
Therefore, we believe breast cancer screening remains an important service to our patients and an important tool for clinicians and that future efforts should be focused on reducing morbidity through improved management practices.
REFERENCES:
1. Morris E, Feig SA, Drexler M, Lehman C. Implications of Overdiagnosis: Impact on Screening Mammography Practices. Popul Health Manag. 2015;18 Suppl 1(Suppl 1):S3-S11. doi: 10.1089/pop.2015.29023.mor. PubMed PMID: 26414384.
2. Monticciolo DL, Helvie MA, Hendrick RE. Current Issues in the Overdiagnosis and Overtreatment of Breast Cancer. American Journal of Roentgenology. 2017;210(2):285-91. doi: 10.2214/AJR.17.18629.
3. Ryser MD, Lange J, Inoue LYT, O’Meara ES, Gard C, Miglioretti DL, et al. Estimation of Breast Cancer Overdiagnosis in a U.S. Breast Screening Cohort. Annals of Internal Medicine. 2022. doi: 10.7326/M21-3577.
4. Mukhtar RA, Wong JM, Esserman LJ. Preventing Overdiagnosis and Overtreatment: Just the Next Step in the Evolution of Breast Cancer Care. Journal of the National Comprehensive Cancer Network. 2015;13(6):737-43. doi: 10.6004/jnccn.2015.0088.
5. Kanbayashi C, Thompson AM, Hwang E-SS, Partridge AH, Rea DW, Wesseling J, et al. The international collaboration of active surveillance trials for low-risk DCIS (LORIS, LORD, COMET, LORETTA). Journal of Clinical Oncology. 2019;37(15_suppl):TPS603-TPS. doi: 10.1200/JCO.2019.37.15_suppl.TPS603.
Incorrect assumption risks underestimation of breast cancer overdiagnosis
Ryser et al.'s study (1) is admirable for its high-quality Breast Cancer Surveillance Consortium (BCSC) data, best Bayesian practices, and identifiability validation. However, it also depends on an improbable assumption: that screening has equal sensitivity for indolent and progressive tumors.
The assumption is not plausible because indolent tumors are typically smaller than progressive tumors, and because smaller tumors are harder to detect on a mammogram (2, 3). Indeed, indolent tumors are thought to be most common at the smallest sizes, where sensitivity is lowest. Ryser et al. assign 81% sensitivity to all tumors, indolent and progressive (1), but actual sensitivity is much lower at these small sizes (e.g., 10% (3) or 26% (2) at 5 mm).
To investigate further, we first confirmed that we could reproduce Ryser et al.’s results by simulating data exactly according to their model. Next, we simulated data following their model in all respects, except that we reduced sensitivity for indolent tumors. With our parameters, the simulated data had 42% overdiagnosis (4). Yet Ryser et al.’s methods estimated only 12% overdiagnosis for the same data. Their methods therefore greatly underestimated overdiagnosis, owing to the equal-sensitivity assumption. We worry that the same thing happens when they apply their methods to real data.
We suggest they proceed as follows: Modify their model to allow different indolent and progressive sensitivities (β_indolent ≠ β). Refit to BCSC data multiple times while fixing β_indolent at values spanning the possible range (e.g., 0.02 to 0.72 in steps of 0.1). Plot β_indolent versus the resulting overdiagnosis estimates, with uncertainty intervals. If results stay similar to their published 15.4% overdiagnosis estimate (1), our concern is unfounded. We suggest these steps because Ryser and colleagues make innovative, major advancements that are worth building on even if alterations are desirable.
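The steps above can be sketched in code. The following is only a minimal illustration of the proposed grid, not Ryser et al.'s model: `fit_fn` is a hypothetical placeholder for whatever routine refits the natural-history model with the indolent sensitivity held fixed, and it is assumed to return a point estimate with lower and upper interval bounds. The function names are ours, chosen for illustration.

```python
from typing import Callable, List, Tuple

# (point estimate, lower bound, upper bound) of the overdiagnosis fraction
Estimate = Tuple[float, float, float]


def beta_indolent_grid(start: float = 0.02,
                       stop: float = 0.72,
                       step: float = 0.1) -> List[float]:
    """Fixed indolent-sensitivity values spanning the suggested range."""
    n_points = int(round((stop - start) / step)) + 1
    # Round to 2 decimals to avoid floating-point drift such as 0.32000000000000006.
    return [round(start + i * step, 2) for i in range(n_points)]


def run_sensitivity_analysis(
    fit_fn: Callable[[float], Estimate],
    grid: List[float],
) -> List[Tuple[float, float, float, float]]:
    """Refit once per grid value and collect rows ready for plotting.

    fit_fn is a HYPOTHETICAL user-supplied function: it should refit the
    model with beta_indolent fixed at the given value and return
    (estimate, lower, upper). No real fitting is performed here.
    """
    rows = []
    for beta_indolent in grid:
        estimate, lower, upper = fit_fn(beta_indolent)
        rows.append((beta_indolent, estimate, lower, upper))
    return rows


grid = beta_indolent_grid()
print(grid)
```

Each returned row pairs one fixed β_indolent value with its estimate and interval, so the rows can be passed directly to any plotting routine to produce the suggested curve.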
Other assumptions used by Ryser et al. might be realistic, but little evidence is available for them. For example, Ryser et al. assume that the incidences of latent progressive and indolent cancers are proportional throughout ages 40 to 75, a 35-year period, without supporting evidence (1). In fairness, however, all overdiagnosis estimates have similar issues, including our own. We assumed that, were it not for screening, breast cancer incidence would be proportional at screened and unscreened ages during 1975-2009 (another 35-year period). Our estimate was 31% population overdiagnosis (5). However, this fell to 28% (95% CI, 23% to 34%) or 24% (95% CI, 13% to 33%) after allowing for assumption violations of the sizes seen in earlier decades (5).
Because all overdiagnosis estimates are assumption-dependent, we believe our approach and Ryser et al.’s both have value. Goals should focus on testing and reducing assumptions.
References
Disclosures:
Funding for C.H.'s participation in this work was provided by Exergen, Corp. (Watertown, MA), a manufacturer of thermometers that has no conflicts of interest in breast cancer screening.
Re: Estimation of Breast Cancer Overdiagnosis in a U.S. Breast Screening Cohort.
Ryser et al. [1] used a lead-time model to predict that 1 in 7 screen-detected breast cancers will be overdiagnosed in a contemporary United States setting. They argue that the alternative excess incidence model depends on a reliable estimate of breast cancer incidence in the absence of screening. This is correct, but Ryser et al. do not consider that variation in the time of introduction of breast screening in the Nordic countries has provided exactly that [2]. This allows for a study design as close to a randomized trial as possible. Estimates of overdiagnosis provided by excess incidence models [2] are higher than those derived from the randomized trials [3], while estimates from lead-time models are lower.
The Independent UK Panel [3] estimated that 19% of cancers detected during the time a screening program is running are overdiagnosed. Accounting for interval cancers, this means about 1 in 3 screen-detected cancers, carcinoma in situ and invasive cases combined, were overdiagnosed. Increased sensitivity of the mammography equipment has likely increased overdiagnosis since the trials rather than decreased it. Ryser et al. [1] note that their predicted level of overdiagnosis is considerably higher than in previous modelling studies but it is still substantially lower than the estimate derived from the randomized trials [3] and excess-incidence studies with a contemporary, non-screened reference group [2]. Why?
Sojourn time is the period during which tumors can be detected only with mammography. Ryser et al. [1] estimated the mean sojourn time (MST) to be 6.6 years and sensitivity to be 81.4%. These are extreme values. The longest published estimate of MST from the randomized trials that we have found is 3.3 years (with 71% sensitivity) [4]. The authors also refer to a Norwegian modelling study reporting an MST of about 7 years; however, sensitivity was forced to be about 60%, which is unrealistic.
One may question whether such a long MST is medically justifiable. The mammographic and clinical detection thresholds are 8.0 and 10.0 mm, respectively, so the sojourn time corresponds to about 1 volume doubling time [5]. The observed median volume doubling time for tumors of these sizes is about 260 days [5]. Why do the authors assume the MST to be 6.6/(260/365) = 9.3 times as long as the clinically observed sojourn time?
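The arithmetic behind this comparison can be checked with a short sketch. It assumes, for illustration only, that tumor volume scales with the cube of the diameter (roughly spherical growth); that assumption is not stated in the letter, and the variable names are ours.

```python
import math

MAMMO_THRESHOLD_MM = 8.0      # mammographic detection threshold
CLINICAL_THRESHOLD_MM = 10.0  # clinical detection threshold
DOUBLING_TIME_DAYS = 260.0    # observed median volume doubling time
MODEL_MST_YEARS = 6.6         # mean sojourn time estimated by Ryser et al.


def volume_doublings(d_start_mm: float, d_end_mm: float) -> float:
    """Volume doublings needed to grow between two diameters.

    ASSUMPTION: volume is proportional to diameter cubed.
    """
    return 3.0 * math.log2(d_end_mm / d_start_mm)


doublings = volume_doublings(MAMMO_THRESHOLD_MM, CLINICAL_THRESHOLD_MM)
doubling_time_years = DOUBLING_TIME_DAYS / 365.0

# Sojourn time implied by the growth data under the cubic assumption.
implied_sojourn_years = doublings * doubling_time_years

# Ratio quoted in the text: model MST versus one doubling time.
ratio = MODEL_MST_YEARS / doubling_time_years

print(f"volume doublings from 8 to 10 mm: {doublings:.2f}")
print(f"implied sojourn time: {implied_sojourn_years:.2f} years")
print(f"model MST / doubling time: {ratio:.1f}")
```

Under the cubic assumption, growth from 8 to 10 mm is just under one volume doubling, which is why the comparison treats one doubling time (260 days) as the clinically observed sojourn time.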
Like other modelling studies, Ryser et al. provide an educated prediction that is highly sensitive to its underlying assumptions, with a much lower estimate of overdiagnosis than the randomized trials.
References
Estimation of Breast Cancer Overdiagnosis in a U.S. Breast Screening Cohort: a step in the right direction, but still an underestimation.
Ryser et al. presented a well-crafted modeling study on breast cancer screening (1). The study does show advances in relation to other previously published models, such as the ability to stratify overdiagnosis cases into indolent and progressive cancers. By accommodating an indolent fraction of preclinical disease, the authors underestimated the overdiagnosis to a lesser extent compared to previous modeling studies (2).
Like any modeling study, the article has many assumptions and brings indirect evidence, which reduces the strength of the evidence for clinical decision-making. The sojourn time estimate is much higher than those derived from clinical trials. Moreover, each woman received 2.3 screening tests on average, 51.3% had a single test and 92.1% received 5 tests or fewer. In contrast, the authors are modeling annual and biennial screening between ages 50-74. With so few screening rounds, the actual overdiagnosis in the cohort is certainly much less than what would be found if the model's screening protocols had been followed.
In the only two randomized controlled trials in which mammographic screening was not offered to the control group after the end of the study, it is possible to derive reliable estimates of overdiagnosis if the calculation methods are harmonized (2). These estimates are higher than those presented by Ryser et al., but they still underestimate overdiagnosis in clinical practice, owing to the lower sensitivity of mammograms at the time and the contamination of the control groups.
Although the academic discussion on overdiagnosis is generally restricted to the Nordic countries and North America, low- and middle-income countries (LMICs), where approximately eighty percent of the world's population lives, have experienced increasing trends in breast cancer incidence (3) and diffusion of mammographic screening in recent decades. Screening in the 70-74 age group does not even have conclusive evidence of efficacy (4). Furthermore, the absolute benefit, if any, would be lower in LMICs because of the lower risk of breast cancer. On the other hand, overdiagnosis in this age group tends to be higher in LMICs because of lower life expectancy, which often does not reach 20 years in women aged 60 years (5), which has decreased even more with the COVID-19 pandemic, and which tends to continue to decline in countries with low vaccination coverage. Health professionals and managers should consider this balance between harms and benefits when interpreting the results of the modeling study presented here.
References
1. Ryser MD, Lange J, Inoue LY, O’Meara ES, Gard C, Miglioretti DL, Bulliard JL, Brouwer AF, Hwang ES, Etzioni RB. Estimation of breast cancer overdiagnosis in a U.S. breast screening cohort. Ann Intern Med. 2022. doi: 10.7326/M21-3577
2. Migowski A, Nadanovsky P, Vianna CM de M. Estimation of Overdiagnosis in Mammographic Screening: a Critical Assessment. Revista Brasileira de Cancerologia. 2021; 67(2): e-151281. doi: 10.32635/2176-9745.RBC.2021v67n2.1281
3. Sung H, Ferlay J, Siegel RL, Laversanne M, Soerjomataram I, Jemal A, Bray F. Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin. 2021 May;71(3):209-249. doi: 10.3322/caac.21660.
4. Migowski A, Silva GAE, Dias MBK, Diz MDPE, Sant'Ana DR, Nadanovsky P. Guidelines for early detection of breast cancer in Brazil. II - New national recommendations, main evidence, and controversies. Cad Saude Publica. 2018 Jun 21;34(6):e00074817. doi: 10.1590/0102-311X00074817.
5. World Health Organization. The global health observatory [internet]. WHO; [cited 2022 Apr 25]. Available from: https://www.who.int/data/gho/data/indicators/indicator-details/GHO/life-expectancy-at-age-60-(years).
Authors' Response to Comments
Dr. Migowski is correct to note that the rate of overdiagnosis depends on the frequency of screening. However, our overdiagnosis estimate is not specific to the screening pattern of our study cohort. Rather, the cohort data were used only to learn about the natural history of breast cancer. Given the learned natural history, we then predicted schedule-specific overdiagnosis rates, i.e., the frequency of overdiagnosis among screen-detected cancers under biennial mammography between the ages of 50 and 74. The ability to predict the rate of overdiagnosis for an arbitrary screening schedule is a key feature of our approach.
Drs. Zahl and Jørgensen, as well as Dr. Migowski, are concerned that our mean sojourn time (MST) estimate of 6.6 years is substantially longer than estimates derived from historic trials. Our MST estimate was derived through Bayesian inference from the study data, using a non-informative prior distribution. As a Bayesian estimate, it is naturally accompanied by posterior uncertainty (the MST could be as low as 4.9 years) and is correlated with other model parameters, especially the test sensitivity and the fraction of indolent tumors. Because of this parametric uncertainty and these correlations, it is preferable not to focus on point estimates of individual parameters but rather to consider their joint posterior distribution.
It is unclear whether MST estimates from historic trials are comparable to contemporary screening cohorts. By its very definition as the screen-detectable preclinical period, the sojourn time is not an absolute quantity, and depends on both the accuracy of the screening apparatus and the recall and biopsy referral practices. Considerable improvements in screening technology in addition to more frequent supplemental screening have extended the time window during which a breast tumor is screen-detectable, and thus likely also the MST.
Sensitivity analyses showed that informative priors for the MST had little to no effect on the overdiagnosis estimate. As such, they refute the proposition by Drs. Zahl and Jørgensen that the allegedly inflated MST estimate may have resulted in an underestimate of the overdiagnosis rate when compared with previous excess incidence studies. To the contrary, a more plausible explanation for this discrepancy is the documented bias of the excess incidence method itself, which leads to inflated overdiagnosis estimates in most settings (1).
We agree with Dr. Harding and colleagues that indolent and progressive tumors may have different screening sensitivities, and that the proposed sensitivity analyses are valuable in exploring the potential impact of such differences. Yet we are unsure about their claim that indolent tumors should have a lower sensitivity compared to progressive ones. DCIS lesions are more likely to be indolent than invasive cancers, yet the mammographic sensitivity for DCIS is higher than that for invasive cancers (2, 3). More evidence is needed to render the proposed sensitivity analyses interpretable.
Finally, we concur with Dr. Chiu and colleagues that mammography screening is an important service to patients, and that the identification and management of indolent tumors, especially among DCIS patients, should be a priority. This is, as noted by Dr. Migowski, ever more important as breast screening is expanded among a growing number of low- and middle-income countries.
References