Original Research
19 July 2011

Communicating Data About the Benefits and Harms of Treatment

A Randomized Trial



    Background:

    Despite limited evidence, it is often asserted that natural frequencies (for example, 2 in 1000) are the best way to communicate absolute risks.

    Objective:

    To compare comprehension of treatment benefit and harm when absolute risks are presented as natural frequencies, percents, or both.

    Design:

    Parallel-group randomized trial with central allocation and masking of investigators to group assignment, conducted through an Internet survey in September 2009. (ClinicalTrials.gov registration number: NCT00950014)

    Setting:

    National sample of U.S. adults randomly selected from a professional survey firm's research panel of about 30 000 households.

    Participants:

    2944 adults aged 18 years or older (all with complete follow-up).

    Intervention:

    Tables presenting absolute risks in 1 of 5 numeric formats: natural frequency (x in 1000), variable frequency (x in 100, x in 1000, or x in 10 000, as needed to keep the numerator >1), percent, percent plus natural frequency, or percent plus variable frequency.

    Measurements:

    Comprehension as assessed by 18 questions (primary outcome) and judgment of treatment benefit and harm.

    Results:

    The average number of comprehension questions answered correctly was lowest in the variable frequency group and highest in the percent group (13.1 vs. 13.8; difference, 0.7 [95% CI, 0.3 to 1.1]). The proportion of participants who “passed” the comprehension test (≥13 correct answers) was lowest in the natural and variable frequency groups and highest in the percent group (68% vs. 73%; difference, 5 percentage points [CI, 0 to 10 percentage points]). The largest format effect was seen for the 2 questions about absolute differences: the proportion correct in the natural frequency versus percent groups was 43% versus 72% (P < 0.001) and 73% versus 87% (P < 0.001).

    Limitation:

    Even when data were presented in the percent format, one third of participants failed the comprehension test.

    Conclusion:

    Natural frequencies are not the best format for communicating the absolute benefits and harms of treatment. The more succinct percent format resulted in better comprehension: Comprehension was slightly better overall and notably better for absolute differences.

    Primary Funding Source:

    Attorney General Consumer and Prescriber Education grant program, the Robert Wood Johnson Pioneer Program, and the National Cancer Institute.

    • Optimal ways of communicating the potential benefits and harms of treatments are not clear.

    • This randomized trial involving 2944 adults compared 5 numeric formats for presenting outcomes that could occur with drug treatment. People seemed to best understand information that was presented as a simple percentage. Even with the percent format, however, about one third of participants had difficulty understanding data about benefit and harm risks.

    • The researchers studied adults who agreed to participate in a survey about 2 hypothetical drug treatments rather than patients facing actual medical decisions.

    • Presenting probable treatment outcomes in a simple percentage format might improve comprehension.

    —The Editors

    To make informed decisions, patients need to understand what is likely to happen with and without treatment. It is widely accepted that natural frequencies (for example, 2 in 1000 persons) are the best way to communicate these absolute risks. Major organizations, such as the Cochrane Collaboration (1, 2), the International Patient Decision Aid Standards Collaboration (3), and the Medicines and Healthcare products Regulatory Agency (4) (the United Kingdom's equivalent of the U.S. Food and Drug Administration), all recommend using natural frequencies to present absolute risks.

    The evidence behind these recommendations, however, is limited and is extrapolated from studies in a very specialized context: Bayesian probability revisions in diagnostic testing. On the basis of the only randomized trial identified (which included 60 students) (5), a 2004 systematic review recommended natural frequencies over percents (6). A 2009 systematic review (7) also considering only probability revisions did not recommend use of natural frequencies because it identified an additional series of randomized trials refuting the superiority of natural frequencies (8). Most important, the only 2 direct tests of absolute risk formats for communicating treatment effects found small differences favoring percents (9, 10). Because these trials tested only simple, artificial scenarios in highly educated, self-selected convenience samples (people actively seeking information on Harvard University's “Your Cancer Risk” Web site [www.diseaseriskindex.harvard.edu/update/]), the findings might not hold in more typical settings.

    We conducted a randomized trial comparing comprehension of the benefits and harms of drugs when absolute risks are presented as natural frequencies, percents, or both. To test the formats among typical people facing typical decisions, we used familiar conditions, presented multiple absolute risks (because treatments have multiple benefits and harms), and recruited a nationally representative sample of U.S. adults.

    Study Design

    This parallel-group randomized trial, conducted and completed in September 2009, compared numerical formats for presenting absolute risks (Figure 1). Participants were members of a nationally representative research panel who had previously signed privacy statements agreeing to receive e-mail invitations to participate in surveys. Invitations included a link that let them see the study's purpose (our link read, “ways to provide information about prescription drugs”), the time required to complete the survey, and a reminder that participation was voluntary; for the text of the survey, see the Supplement. Participants were allocated in a 1:1 ratio to receive absolute risks in 1 of 5 numeric formats; they were not told about the testing of alternative formats. The Committee for the Protection of Human Subjects at Dartmouth Medical School approved the study. The protocol was registered with ClinicalTrials.gov before recruitment began (NCT00950014).

    Figure 1. Study flow diagram.

    Setting and Participants

    Participants were recruited from a research panel created by Knowledge Networks (Menlo Park, California), a professional survey firm. The panel consists of about 30 000 households recruited by probability methods (random-digit dialing of U.S. residential landlines, supplemented by address-based sampling to capture cell phone–only households). In return for free Internet access and cash incentives (or necessary computer equipment), panelists agreed to receive e-mail invitations to participate in surveys. The ability of the panel to produce nationally representative samples has been demonstrated in head-to-head studies in comparison with random-digit dialing techniques (11–13). The National Science Foundation uses the Knowledge Networks panel for its grant program involving general population experiments (14).

    For our study, the survey firm invited a simple random sample to participate (Figure 1). Eligibility criteria (established before recruitment began) were age 18 years or older and ability to complete the survey on a computer (<9% of panel members were excluded because they only had Web TV, which has insufficient resolution to display the images in a readable format). Over the next 2 weeks, nonrespondents received e-mail reminders or an automated telephone call. Of the 4316 persons invited, 2944 (68%) agreed to participate.

    Randomization and Intervention

    Participants were randomly assigned to 1 of the 5 numeric presentation groups (without stratification or blocking) by using a central computerized random-number generator to ensure allocation concealment. The survey firm programmed the random-number generation and allocation to happen immediately after participants clicked on the link to participate in the survey.

    All data were presented in a standard layout (drug facts boxes)—a single-page summary including a narrative description of what the drug is for, along with a data table with absolute risks and differences for the beneficial and harmful outcomes in the drug and placebo groups (15). Participants were shown boxes for 2 hypothetical drugs that are used to treat familiar conditions—a heartburn drug that dramatically reduces a common symptom and a cholesterol drug that reduces uncommon events (the chance of dying of a heart attack) and has a very uncommon side effect (muscle breakdown)—which allowed us to present absolute risks across a broad range of magnitude. To make the boxes realistic, we adapted data from trials of actual drugs (lansoprazole and simvastatin) and masked the drug identities with false names (Paxcid and Questor).

    The drug boxes for each study group were identical except for the numerical formats used to express the absolute risks (Figure 2):

    Figure 2. Data presentations in the 3 numeric format groups.

    The percent plus natural frequency and variable frequency–alone format groups are not shown but can be constructed on the basis of the numbers in the figure.

    1. Natural frequency: Absolute risks and differences were expressed as whole numbers per 1000 (for example, 20 in 1000). Because natural frequencies are the most commonly recommended format, we used them as the reference group.

    2. Variable frequency: Absolute risks and differences were expressed as frequencies in which the denominator was adjusted so that it is the smallest multiple of 10 necessary to keep the numerator greater than 1. For example, 2% is expressed as 2 in 100, 0.2% is 2 in 1000, and 0.02% is 2 in 10 000. To minimize confusion, denominators varied only between table rows.

    3. Percent: Absolute risks and differences were expressed as percents rounded to whole numbers, unless decimals were needed to see the absolute difference (for example, 3.3% [placebo group] vs. 2.5% [drug group] = 0.8%).

    4. Percent plus natural frequency: Absolute risks were expressed as both percent and a natural frequency (x in 1000). To avoid data overload, absolute differences were expressed only as percents.

    5. Percent plus variable frequency: Absolute risks were expressed as both percent and a variable frequency (as explained above). To avoid data overload, absolute differences were expressed as percents.
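    The five formats above can all be derived from one underlying risk. The following Python sketch is illustrative only and is not the software used in the trial. The function names are hypothetical, the parenthesized layout of the combined format is an assumption, and the code assumes that a numerator of exactly 1 is acceptable in the variable frequency format and that the only denominators are 100, 1000, and 10 000.

```python
from fractions import Fraction

# Denominators available to the variable frequency format, smallest first.
DENOMINATORS = {100: "100", 1000: "1000", 10_000: "10 000"}


def natural_frequency(risk_percent: str) -> str:
    """Express a risk given in percent as a whole number per 1000."""
    per_thousand = Fraction(risk_percent) * 10  # x% equals 10x in 1000
    return f"{round(per_thousand)} in 1000"


def variable_frequency(risk_percent: str) -> str:
    """Use the smallest listed denominator whose numerator is at least 1."""
    risk = Fraction(risk_percent) / 100
    for denominator, label in DENOMINATORS.items():
        numerator = risk * denominator
        if numerator >= 1:  # assumption: a numerator of exactly 1 is allowed
            return f"{round(numerator)} in {label}"
    raise ValueError("risk is below 1 in 10 000")


def percent_plus(risk_percent: str, frequency: str) -> str:
    """Combined format (assumed layout): percent, then a frequency."""
    return f"{risk_percent}% ({frequency})"


if __name__ == "__main__":
    for risk in ("2", "0.2", "0.02"):
        print(f"{risk}% -> {variable_frequency(risk)}")
```

    Risks are passed as strings so that `Fraction` can parse them exactly; this avoids floating-point rounding when a value such as 0.02% is scaled up to a whole-number numerator.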

    Outcomes and Follow-up

    The online survey, which took about 20 minutes to complete, measured comprehension and judgments of drug benefits and harms and the helpfulness of data (screen shots are shown in the Supplement). Because our goal was to test understanding rather than recall, a drug box remained on the screen with each question.

    We conducted 2 pilot tests to ensure that the online process worked. The first pilot (42 participants) tested the recruitment and randomization procedures, debugged the survey, and identified questions with high item nonresponse. The second pilot (106 participants) assessed whether questions were understandable by including open-ended responses after the questions (for example, “Please explain why you answered false”). No pretest data were included in the final analyses.

    Primary Outcome Measure

    The primary outcome was comprehension of data presented in the drug boxes, assessed with 18 questions (9 per drug box); these were a mixture of true-or-false and fill-in-the-blank questions. Some questions involved judging the direction of an effect (for example, true or false: “Paxcid and placebo work equally well at completely relieving heartburn”), and others involved quantifying effects (for example, true or false: “People given Questor were twice as likely to have bothersome muscle aches as people given placebo”). Twelve questions were identical for all 5 format groups. The other 6 questions varied only in that the numeric format in the question matched that of the presentation (for example, “2% more people” in the percent group compared with “20 in 1000 more” in the natural frequency group).

    We scored comprehension in 3 ways: mean number of correct answers, the proportion of persons who “passed” the test (the score closest to 70%, or ≥13 correct responses out of 18 questions), and the proportion of persons who got an “A” grade (the score closest to 90%, or ≥16 correct responses out of 18 questions).
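    The two cutoffs follow from choosing the whole-number score whose share of 18 questions is closest to the target percentage. A minimal Python sketch of that arithmetic follows; the function and constant names are hypothetical and are not taken from the study materials.

```python
def closest_cutoff(total_questions: int, target_fraction: float) -> int:
    """Return the score whose share of the total is closest to the target."""
    return min(
        range(total_questions + 1),
        key=lambda score: abs(score / total_questions - target_fraction),
    )


PASS_CUTOFF = closest_cutoff(18, 0.70)  # 13 of 18 is about 72%
A_CUTOFF = closest_cutoff(18, 0.90)     # 16 of 18 is about 89%


def grade(num_correct: int) -> str:
    """Classify a score; an "A" also counts as passing."""
    if num_correct >= A_CUTOFF:
        return "A"
    if num_correct >= PASS_CUTOFF:
        return "pass"
    return "fail"
```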

    Statistical Analysis

    We used the results from an earlier study involving similar questions assessing comprehension of drug box data (15) as the basis for our sample size calculations; that study demonstrated that about 80% of persons who were shown drug boxes in the percent plus natural frequency format achieved a “passing” grade. We asserted that a 10–percentage point change in the proportion of persons who passed would be clinically important. To be conservative (that is, to maximize the required sample size), we used pass rates of 80% versus 70% in our calculation. To account for multiple comparisons, we set the 2-sided α value to 0.0125 because comparing each data format against natural frequency required 4 pairwise comparisons (0.05 ÷ 4 = 0.0125). We needed about 550 people in each group to have 90% power to detect a difference of 10 percentage points or greater.
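    The sample size reasoning can be approximated with a standard normal-approximation formula for comparing 2 proportions. The sketch below is an assumption about the method, not a reproduction of the calculation actually performed: the formula choice (pooled variance under the null hypothesis, no continuity correction) and the function name are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1: float, p2: float, alpha: float, power: float) -> int:
    """Approximate sample size per group for a 2-sided test of 2 proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    # Variance terms under the null (pooled) and alternative hypotheses
    null_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return ceil((null_term + alt_term) ** 2 / (p1 - p2) ** 2)


# Bonferroni-style adjustment for 4 pairwise comparisons
alpha = 0.05 / 4
n = n_per_group(0.80, 0.70, alpha, 0.90)

if __name__ == "__main__":
    print(f"alpha = {alpha}, participants per group = {n}")
```

    Under these assumptions the result is somewhat above 530 per group, in the same range as the figure of about 550 stated above; adding a continuity correction would raise the estimate further.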

    For the primary outcome measure (comprehension), missing answers were considered incorrect; for judgments and helpfulness ratings, missing answers were considered to represent “no opinion.” The proportion of missing answers ranged from less than 1% to 5%, and sensitivity analyses (complete case analysis) yielded nearly identical results. The natural frequency group was the reference category for all comparisons. We used t tests for differences in means, 2-sample tests of proportions to test for differences in proportions, and chi-square tests for differences in ordinal variables. All comparisons were 2-sided and were considered statistically significant at P values less than 0.01. All analyses were done in Stata, version 11 (StataCorp, College Station, Texas).

    Role of the Funding Source

    This study was supported by the Attorney General Consumer and Prescriber Education grant program, the Robert Wood Johnson Pioneer Program, and the National Cancer Institute. The funding sources had no role in study design, data collection and analysis, manuscript preparation, or the decision to submit the manuscript for publication.


    Demographic characteristics were similar across the 5 numeric format groups (Table 1). The mean participant age was 47 years (range, 18 to 93 years), and 53% were women. Six percent of participants had less than a high school education; 38% had a college degree or higher. There were no crossovers during the trial.

    Table 1. Participant Characteristics

    Comprehension of Benefits and Harms

    Figure 3 shows that the mean number of comprehension questions answered correctly was lowest in the variable frequency group and highest in the percent group (13.1 vs. 13.8; difference, 0.7 [95% CI, 0.3 to 1.1]; P < 0.001). Comprehension in the natural frequency group was nominally lower than that in the percent group: 13.4 vs. 13.8 correct (difference, 0.4 [CI, 0.1 to 0.8]; P = 0.03). The natural frequency and variable frequency groups had the lowest proportion of persons who passed the comprehension test; the percent group had the highest. The same pattern held for the proportion of participants who got an “A” grade.

    Figure 3. Comprehension in the 5 numeric format groups overall and among participants with low numeracy.

    There were 2944 participants overall and 1037 with low numeracy. “Low numeracy” is defined as answering 0 or 1 of the 3 numeracy questions correctly. The error bars represent the upper bound of the 95% CI.

    In 4 of the 18 questions, the absolute difference across the formats exceeded 10 percentage points (the threshold that we asserted was important). In the first question—whether serious muscle breakdown is much less common than minor liver inflammation with the cholesterol drug—the variable frequency group did worse than the natural frequency group (44% vs. 58% correct; P < 0.001).

    For the other 3 questions in which format mattered the most, comprehension was substantially higher in the percent and both percent plus frequency groups than in both frequency-only groups (Appendix Table). Two questions were about absolute differences (for example, “How many fewer people had a heart attack with the cholesterol drug than with placebo?”). For these 2 questions, the proportion answering correctly for the natural frequency versus the percent group was 43% versus 72% (difference, 29 percentage points [CI, 23 to 34 percentage points]; P < 0.001) and 73% versus 87% (difference, 14 percentage points [CI, 9 to 18 percentage points]; P < 0.001). The third question involved identifying a particular data item.
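    A confidence interval for a difference in proportions like those above can be approximated as follows. This Python sketch uses a simple Wald interval and assumes about 589 participants per group (2944 divided evenly among 5 groups); the actual group sizes and the exact interval method are not stated in this text, so the output should be read as an approximation of the reported interval rather than a reproduction of it.

```python
from math import sqrt
from statistics import NormalDist

# Assumption: equal group sizes; the true sizes are not given in the text.
ASSUMED_N_PER_GROUP = round(2944 / 5)


def difference_ci(p1: float, n1: int, p2: float, n2: int, level: float = 0.95):
    """Wald confidence interval for p2 - p1 from two independent groups."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = p2 - p1
    standard_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * standard_error, diff + z * standard_error


# Natural frequency group (43% correct) versus percent group (72% correct)
diff, lower, upper = difference_ci(
    0.43, ASSUMED_N_PER_GROUP, 0.72, ASSUMED_N_PER_GROUP
)

if __name__ == "__main__":
    print(f"difference = {diff:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```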

    Appendix Table. Proportion of Participants Who Correctly Answered the 18 Comprehension Questions


    Among the 7 questions about low probability events (those with <1% chance of occurring), comprehension did not differ between the natural frequency and percent groups.

    Comprehension, by Numeracy and Education

    The finding that comprehension of natural frequency was somewhat lower than comprehension of percents was consistent across levels of numeracy and education. For example, Figure 3 (bottom) shows that the pattern of findings for the 1037 respondents with low numeracy (those who answered ≤1 of 3 numeracy questions correctly [16]) was similar to that in the sample as a whole; as expected, all comprehension measures were lower for this subgroup than in the overall sample. A similar pattern was evident among the 909 people with a high school education or less: The proportion of persons who passed was lower in the natural frequency group than in the percent group (53% vs. 63%; difference, 10 percentage points [CI, 0 to 20 percentage points]), but the proportion of persons who got an “A” grade did not differ (18% vs. 17%; difference, −1 percentage point [CI, −9 to 7 percentage points]). The same pattern was also evident among those with the highest level of numeracy and education (those with a postgraduate degree).

    Judgments of Benefit and Harm

    When the absolute differences were numerically small, the natural frequency group judged benefits and harms as being larger than the percent group did (Table 2). For example, 43% of the natural frequency group said that the side effect of the heartburn drug (expressed as “40 in 1000 more” had diarrhea) was moderate or larger, whereas only 26% of the percent group (in which this information was expressed as “4% more” had diarrhea) came to the same conclusion (P < 0.001). Consistent with this, fewer persons in the natural frequency group than in the percent group thought that the benefits of the heartburn drug were definitely worth the side effects (35% vs. 47%; P = 0.002).

    Table 2. Perception of Drug Benefit and Harm in the 5 Numeric Format Groups

    Helpfulness Ratings

    Each group rated the data similarly in how they helped them understand drug benefit: The proportion that responded “helped me a lot” ranged from 56% in the percent group to 61% in both of the percent plus frequency groups. Ratings of harm data were similar: The proportion that chose “helped me a lot” ranged from 58% in the percent group to 64% in the percent plus variable frequency group. None of the differences was statistically significant.


    We found no evidence to support the assertion that natural frequency is the best format for communicating the benefits and harms of treatment. In fact, the percent format had slightly higher comprehension overall and at each level of numeracy and education. The combined percent plus natural frequency format was no better than the percent format alone. Comprehension of the variable frequency format was consistently lowest.

    The use of natural frequencies instead of percents to communicate absolute risks has been promoted largely on intuitive and evolutionary grounds (the human mind developed the ability to learn over thousands of years by observing and counting things; in contrast, the science of probability is only a few hundred years old) (5). However, the evidence supporting natural frequencies over percents for communicating to patients (based on 2 systematic reviews [6, 7] and our English-language MEDLINE searches to April 2011) is limited to trials testing a specific skill: the ability to use conditional probabilities when interpreting diagnostic test results. In fact, the 2 trials testing absolute risk formats for communicating treatment effects found small differences favoring percents over natural frequencies (9, 10). Nevertheless, even iconic “evidence-based” organizations have issued guidance promoting the use of natural frequencies over percents for communicating treatment effects.

    Our findings challenge such guidance. They also refute the common assumption that percents should be avoided for expressing small probabilities (for example, <1%). We previously made this assumption after repeatedly finding that study participants had the most difficulty converting “1 in 1000” to “0.1%” in our 3-item numeracy test (17)—a finding also observed in the current trial. The fact that comprehension of the 7 questions about low-probability events was the same in the percent group and the 2 natural frequency groups argues that the difficulty in converting between formats reflects trouble with manipulating decimal points rather than a comprehension problem.

    Our study also highlights known problems with frequency formats. People get confused when the denominator changes—for example, deciding whether 1 in 130 or 1 in 236 is a larger number (9, 18, 19). We wondered whether limiting denominator changes to orders of magnitude (for example, 100, 1000, and 10 000) and keeping denominators constant within rows of tables would minimize confusion and enhance the ability to discriminate between varying probabilities. Unfortunately, this format was still confusing.

    Variable frequency formats may be confusing because the larger number in the denominator means a smaller probability. Another reason for confusion is “denominator neglect” (20): People tend to focus on the numerator of a frequency and ignore the denominator. This problem is best illustrated with the variable frequency format in the cholesterol drug table, where the chance of serious muscle breakdown was 4 in 10 000 and the chance of liver inflammation was 1 in 100. Only 40% of participants in that group correctly identified serious muscle breakdown as the less common event. These incorrect responses probably reflect comparison of numerators (4 vs. 1) without considering the denominators (10 000 vs. 100).
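    The comparison described above can be made concrete with exact fractions. In this short Python sketch (function names are hypothetical), comparing numerators alone gives the wrong ordering, whereas comparing the full fractions shows that liver inflammation is 25 times as likely as serious muscle breakdown.

```python
from fractions import Fraction

# (numerator, denominator) as shown in the variable frequency table
muscle_breakdown = (4, 10_000)
liver_inflammation = (1, 100)


def more_common_by_numerator(a: tuple, b: tuple) -> bool:
    """Mimic denominator neglect: compare numerators only."""
    return a[0] > b[0]


def more_common(a: tuple, b: tuple) -> bool:
    """Compare the actual probabilities."""
    return Fraction(*a) > Fraction(*b)


# How many times as likely liver inflammation is than muscle breakdown
ratio = Fraction(*liver_inflammation) / Fraction(*muscle_breakdown)

if __name__ == "__main__":
    print(more_common_by_numerator(muscle_breakdown, liver_inflammation))
    print(more_common(muscle_breakdown, liver_inflammation))
    print(ratio)
```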

    Denominator neglect may cause problems even when the denominator is held constant, as in our natural frequency format (always x in 1000), because it magnifies numerically small effects. For example, the increase in diarrhea with the heartburn drug looked bigger to the natural frequency group (presented as “40 in 1000”) than to the percent group (presented as “4%”). Heightened perception of adverse effects may explain why the natural frequency group had less enthusiasm for the heartburn drug than the percent group did. Denominator neglect matters when it leads people away from a good intervention because of a format distortion rather than a balanced weighing of benefits and harms.

    In theory, combined formats should be best because they give people options and reinforce understanding by presenting the same data in different ways. We found that combined formats generally worked better than frequency formats alone: Comprehension was higher, and there was no evidence of denominator neglect. However, they worked no better than percents alone, and they are therefore probably not worth the additional visual clutter (they triple the number of values presented).

    Our trial has several important strengths. We used a rich set of comprehension questions to assess understanding of both relative and absolute differences (including small and large magnitudes) within and across rows of a complex table. There were no study dropouts (randomization occurred after potential participants agreed to complete the online survey), and item nonresponse was low (≤5%). In contrast to much of the prior research using convenience samples, our participants were recruited from a large national research panel that, by design, can be weighted so that results are nationally representative (that is, they account for the sampling strategy and panel recruitment). Weighting accounts for study nonparticipation and adjusts the demographic characteristics of the panel members to match the U.S. population on age, sex, race, education, region, and metropolitan residence. Weighted and unweighted results were nearly identical. We chose to present unweighted results to preserve the simplicity of the randomized trial. The negligible effect of weighting does suggest that the unweighted results are nationally generalizable.

    Our findings should be interpreted in light of several limitations. First, comprehension was tested in a survey rather than in the setting of actual medical decisions. In addition, there is no clear standard for the level of comprehension needed to make an informed decision. That is why we judged comprehension in 3 ways: mean score, proportion of persons who “passed” the test, and the proportion that received an “A” grade. Because setting thresholds is inherently arbitrary, we adapted a familiar external benchmark—school grades (“passing” is >70% correct, and an “A” grade is >90%), a strategy we used previously (21).

    Finally, there is room for improvement. Even with the percent format, about one third of participants failed the comprehension test. This may in part reflect that participants were facing hypothetical decisions. Patients facing real decisions might have been more engaged and done better. However, part of the problem undoubtedly reflects a poor understanding of numbers. While it is tempting to conclude that none of the formats is adequate, it is important to consider the complexity of the tasks involved. Participants had to navigate complex tables to find and compare numbers. The National Assessment of Adult Literacy considers such tasks to be among the most difficult that it assesses (22). In the 2003 survey, only 53% of respondents could use a simple table to find and compare bank interest rates (requiring either subtracting or dividing 2 numbers). The fact that pass rates in our trial increased directly with education (ranging from 62% for persons with high school education or less to 85% for those with a postgraduate degree) and with numeracy (ranging from 56% for persons with low numeracy to 92% for those with the highest numeracy) highlights that although data formats matter, the main underlying need is for better education. Fortunately, evidence indicates that even a simple educational intervention can help (21). It is also possible that comprehension would improve over time through regular exposure to absolute risks in standardized formats. Leaders in risk communication believe that it is incumbent on policymakers to move in this direction to help the public make decisions in their own interest (23).

    People who are trying to communicate data about benefit and harm to the public, patients, physicians, and policymakers must choose a format for absolute risks. Our trial shows that they should avoid variable frequencies and that they should no longer accept the assertion that natural frequencies are the best format. On the basis of our findings, we believe that the percent format is probably best. It is more succinct (requiring one half as many numbers) and slightly better than the natural frequency format, particularly for communicating the most basic data needed to compare treatment effects: absolute differences.


    • 1. Rosenbaum SE, Glenton C, Oxman AD. Summary-of-findings tables in Cochrane reviews improved understanding and rapid retrieval of key information. J Clin Epidemiol. 2010;63:620-6. [PMID: 20434024]
    • 2. Schünemann HJ, Oxman AD, Higgins JP, Vist GE, Glasziou P, Guyatt GH. Presenting results and “Summary of findings” tables. In: The Cochrane Handbook. 2009. Accessed at www.mrc-bsu.cam.ac.uk/cochrane/handbook/ on 24 November 2010.
    • 3. Elwyn G, O'Connor A, Stacey D, Volk R, Edwards A, Coulter A, et al; International Patient Decision Aids Standards (IPDAS) Collaboration. Developing a quality criteria framework for patient decision aids: online international Delphi consensus process. BMJ. 2006;333:417. [PMID: 16908462]
    • 4. Medicines and Healthcare products Regulatory Agency. Guidance on communication of risks and benefits in patient information leaflets. 2005. Accessed at www.mhra.gov.uk/Howweregulate/Medicines/Medicinesregulatorynews/CON049410 on 31 May 2011.
    • 5. Gigerenzer G, Hoffrage U. How to improve Bayesian reasoning without instruction: frequency formats. Psychol Rev. 1995;102:684-704.
    • 6. Trevena LJ, Davey HM, Barratt A, Butow P, Caldwell P. A systematic review on communicating with patients about evidence. J Eval Clin Pract. 2006;12:13-23. [PMID: 16422776]
    • 7. Visschers VH, Meertens RM, Passchier WW, de Vries NN. Probability information in risk communication: a review of the research literature. Risk Anal. 2009;29:267-87. [PMID: 19000070]
    • 8. Girotto V, Gonzalez M. Solving probabilistic and statistical problems: a matter of information structure and question form. Cognition. 2001;78:247-76. [PMID: 11124351]
    • 9. Cuite CL, Weinstein ND, Emmons K, Colditz G. A test of numeric formats for communicating risk probabilities. Med Decis Making. 2008;28:377-84. [PMID: 18480036]
    • 10. Waters EA, Weinstein ND, Colditz GA, Emmons K. Formats for improving risk communication in medical tradeoff decisions. J Health Commun. 2006;11:167-82. [PMID: 16537286]
    • 11. Baker LB, Singer S, Wagner T. Validity of the Survey of Health and Internet and Knowledge Network's Panel and Sampling. Stanford: Health Economics Resource Center; 2003. Accessed at www.knowledgenetworks.com/ganp/reviewer-info.html on 31 May 2011.
    • 12. Chang L, Krosnick J. National surveys via RDD telephone interviewing versus the internet: comparing sample representativeness and response quality. Public Opin Q. 2009;73:641-78.
    • 13. Krotki K, Dennis JM. Probability-based survey research on the internet. Presented at the 53rd Conference of the International Statistical Institute, 22–29 August 2001, Seoul, Korea. Accessed at www.knowledgenetworks.com/ganp/docs/ISI-2001-confernce-paper.pdf on 16 May 2011.
    • 14. Time-sharing Experiments for the Social Sciences (TESS). Accessed at tess.experimentcentral.org/ on 13 December 2010.
    • 15. Schwartz LM, Woloshin S, Welch HG. Using a drug facts box to communicate drug benefits and harms: two randomized trials. Ann Intern Med. 2009;150:516-27. [PMID: 19221371]
    • 16. Schwartz LM, Woloshin S, Black WC, Welch HG. The role of numeracy in understanding the benefit of screening mammography. Ann Intern Med. 1997;127:966-72. [PMID: 9412301]
    • 17. Gigerenzer G, Gaissmaier W, Kurz-Milcke E, Schwartz L, Woloshin S. Helping doctors and patients make sense of health statistics. Psychological Science in the Public Interest. 2008;8:53-96.
    • 18. Grimes DA, Snively GR. Patients' understanding of medical risks: implications for genetic counseling. Obstet Gynecol. 1999;93:910-4. [PMID: 10362153]
    • 19. Woloshin S, Schwartz LM, Byram S, Fischhoff B, Welch HG. A new scale for assessing perceptions of chance: a validation study. Med Decis Making. 2000;20:298-307. [PMID: 10929852]
    • 20. Yamagishi K. When a 12.86% mortality is more dangerous than a 24.14%: implication for risk communication. Appl Cogn Psychol. 1997;11:495-506.
    • 21. Woloshin S, Schwartz LM, Welch HG. The effectiveness of a primer to help people understand risk: two randomized trials in distinct populations. Ann Intern Med. 2007;146:256-65. [PMID: 17310049]
    • 22. National Center for Education Statistics, U.S. Department of Education. National Assessment of Adult Literacy (NAAL). 2003. Accessed at nces.ed.gov/naal/index.asp on 9 March 2011.
    • 23. Fischhoff B. Questions of competence: the duty to inform and limits to choices. In: The Behavioral Foundations of Policy. Princeton: Princeton Univ Pr; 2010.


    Steven Woloshin
    4 October 2011
    Choice of Methods in Presenting Scenarios to Patients

    Dr. Nardone's suggestion is intended to help people develop a sense of numbers using concrete images.

    For this technique to work, though, it would be important to use images familiar to the target audience. We doubt that most Americans know how many soldiers are in a battalion (we don't), the capacity of the field for the local AAA baseball team (we don't even know if we have a AAA team), or even the number of US Senators (we knew this one).

    Even with familiar images, though, there may still be problems. Changing the denominators to accommodate chances of different magnitude may undermine communication. In our trial, people had the most trouble understanding the variable frequency format, where denominators changed by orders of magnitude (e.g., 100, 1,000, 10,000).

    While Dr. Nardone's approach may be useful to teach concepts, it would not be feasible for the kinds of applications we envision: efficiently summarizing the multiple benefits and harms of medical interventions.

    Steven Woloshin, MD, MS and Lisa M. Schwartz, MD, MS

    VA Outcomes Group, White River Jct., VT and the Dartmouth Institute for Health Policy and Clinical Practice

    Conflict of Interest:

    None declared