Details of Literature Search and Selection Criteria
We used a broad literature search strategy to identify relevant publications from 1 January 1980 through 8 October 2003. We tested 3 search strategies, as MeSH terms in this area are imprecise. First, we tested an inclusive strategy by using the following MEDLINE indexing terms: intraoperative complications, postoperative complications, preoperative care, intraoperative care, anesthesia, or analgesia. The result was an unmanageable 63 000 citations with an estimated yield less than 2%. The second approach was more tailored, using any of the following terms and specifying them as the article's primary focus: intraoperative complications, postoperative complications, preoperative care, intraoperative care, and postoperative care, plus perioperative complications as text in the title or abstract. This strategy produced 15 466 citations. Third, we tested a narrow strategy by combining the second strategy with terms specific to pulmonary diseases and complications and identified 2860 potentially relevant citations.
We determined the degree of agreement between the second and third searches. Title review of the second, broader strategy results indicated 395 potentially relevant citations of 15 466 (2%) citations. Of these, the third, narrow strategy identified only 157 citations, leaving 238 potentially relevant citations that were missed by the narrow strategy. We reviewed the abstracts of the 238 missed citations in more detail, and 120 of them were potentially relevant. We then reviewed MeSH terms for these 120 citations to identify additional terms to improve the specificity of the second search without sacrificing sensitivity. These included terms for pulmonary, respiratory, or cardiopulmonary diseases, conditions, or complications and terms for oxygenation, chest roentgenography, and lung expansion modalities, such as incentive spirometry.
On the basis of recent systematic reviews and seminal randomized trials, we performed 7 additional focused searches that were tailored to specific topics: preoperative chest radiography, preoperative spirometry, laparoscopic versus open major abdominal procedures, general versus spinal or epidural anesthesia, intraoperative neuromuscular blockade, management of postoperative pain, and postoperative lung expansion techniques. We identified additional references by reviewing bibliographies of retrieved studies.
We limited the review of intervention strategies to randomized, controlled trials and previously published systematic reviews. We subsequently updated searches for studies of risk factors that included multivariable results and randomized, controlled trials of interventions to prevent postoperative pulmonary complications through 30 June 2005. We included only English-language publications and excluded publications without primary data from detailed abstraction (that is, letters, editorials, case reports, conference proceedings, and narrative reviews). We excluded 1) studies with fewer than 25 participants per study group; 2) studies that used only administrative data (for example, ICD-9-CM codes) because of recent data showing poor validity of administrative data for postoperative complications (150-152); 3) studies from developing countries because of potential differences in respiratory and intensive care technology (according to lists from the Organisation for Economic Co-operation and Development and the International Monetary Fund) (153, 154); 4) studies that lacked explicit criteria or definitions for pulmonary complications; 5) studies of ambulatory surgery; 6) studies in which outcomes were physiologic rather than clinical (for example, lung volumes and flow, oximetry); 7) studies of gastric pH manipulation; 8) studies of complications unique to a particular type of surgery (for example, upper airway obstruction after uvulectomy); 9) studies of cardiopulmonary or pediatric surgery; and 10) studies of organ transplantation because of profoundly immunosuppressive drugs. For studies using multivariable logistic regression analysis, we required at least 5 outcome occurrences for each covariate entered into the model. We based this criterion on evidence for minimum thresholds for model stability and reliability when estimating odds ratios and CIs (155). We excluded the few studies that used discriminant analysis because we could not compare the results with odds ratios generated by logistic regression. We did not require eligible studies to provide explicit boundary criteria for risk factors (for example, severity of chronic obstructive pulmonary disease) or for severity of postoperative pulmonary complications (for example, severity of atelectasis). When studies provided such information, we included this in our summary of study characteristics (Appendix Tables 1, 2, and 8).
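The events-per-covariate requirement for logistic regression models can be illustrated with a minimal sketch. The function name, its parameters, and the example counts are hypothetical and are not taken from the review; only the threshold of 5 outcome occurrences per covariate comes from the text.

```python
def meets_events_per_covariate(n_events: int, n_covariates: int,
                               minimum: float = 5.0) -> bool:
    """Return True when a model has at least `minimum` outcome events
    for each covariate entered (threshold of 5 as stated in the text)."""
    if n_covariates <= 0:
        raise ValueError("n_covariates must be positive")
    return n_events / n_covariates >= minimum


# Hypothetical studies (illustrative counts only, not from the review).
print(meets_events_per_covariate(n_events=40, n_covariates=8))  # 5.0 events per covariate
print(meets_events_per_covariate(n_events=30, n_covariates=8))  # 3.75 events per covariate
```

Under this sketch, a study is screened on the ratio of outcome events to covariates and not on its total sample size, which is the quantity the criterion describes.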
An investigator evaluated each citation according to the following strategy: title and abstract review, then review of the full reference if necessary. If a reviewer was uncertain, we made the final decision by consensus.
Of 16 959 citations identified by the search, 1223 were duplicates and 14 793 were not relevant on title and abstract review (Figure). Of the remaining 943 potentially relevant citations, we excluded 626 citations after review of the full publication, abstracted 145 citations in detail, and used 172 citations as background references. We systematically abstracted data from eligible studies into standardized electronic data forms. Eligible studies varied in their definitions of the postoperative period. Most commonly, authors defined the postoperative period as the hospital stay, ranging from 4 hours to 3 months after surgery.
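The citation counts reported above can be cross-checked with simple arithmetic. The short sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
# Counts as reported in the text.
identified = 16959
duplicates = 1223
not_relevant = 14793

# Citations left after removing duplicates and irrelevant titles/abstracts.
potentially_relevant = identified - duplicates - not_relevant

# Disposition of the potentially relevant citations.
excluded_full_text = 626
abstracted = 145
background = 172

print(potentially_relevant)                         # 943
print(excluded_full_text + abstracted + background)  # 943
```

Both totals equal the 943 potentially relevant citations stated in the text.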
Assessing Study Quality
We used the USPSTF criteria for assigning hierarchy of research design and grading a study's internal validity as our basis for assessing study quality (12). A good-quality cohort or case-series study, at a minimum, adjusted for key confounders of age, chronic obstructive pulmonary disease, and surgical type; showed little or no differential loss to follow-up; explicitly masked outcome assessment; and provided explicit definitions for what constituted a postoperative pulmonary complication. A fair-quality cohort or case-series study adjusted for key confounders, showed little or no differential loss to follow-up, and provided a clear definition for a postoperative pulmonary complication but was unclear about masking of outcome assessment. A poor-quality cohort or case-series study did not include 1 or more of the key confounders, showed statistically significant differential loss to follow-up, provided vague or no definitions for a postoperative pulmonary complication, explicitly did not mask outcome assessments, or a combination of these criteria. We assigned summary strength of recommendations for each risk factor and laboratory test according to modified criteria proposed by the USPSTF (12). We modified the criteria for the review on preoperative risk stratification to reflect the absence of a risk–benefit equation when considering a risk factor rather than an intervention. When both univariate and multivariate data were available about a potential risk factor, we considered most strongly the effect of the multivariate data when assigning strength of recommendations. When the effect of a risk factor was based on only 1 multivariate study or was limited to univariate data, we considered the evidence to be insufficient to determine that the factor contributed to postoperative pulmonary complication risk.
A good-quality randomized, controlled trial met all of the following criteria: comparable groups assembled initially and maintained throughout the study, follow-up of at least 80% of participants, reliable and valid measurement instruments applied equally to all groups, clearly described interventions, consideration of important and relevant outcomes, appropriate attention to confounders in analysis, and intention-to-treat analyses. We graded studies as fair if any of the following problems occurred: generally comparable groups assembled initially but some (although not major) possible differences occurring in follow-up, generally acceptable (although not the best) measurement instruments generally applied equally, some but not all important outcomes considered, some but not all potential confounders accounted for in analysis, and intention-to-treat analyses. Studies were poor if any of the following “fatal” flaws occurred: sufficiently comparable groups not assembled initially nor maintained throughout the study, unreliable or invalid measurement instruments, measurement instruments not applied equally among groups during follow-up (including unblinded outcome assessment), follow-up of less than 80% of participants, little or no attention to key confounders, and intention-to-treat analyses not done.
We also graded systematic reviews as good, fair, or poor on the basis of extent of literature searched, inclusion or exclusion of non–English-language publications, statements of inclusion and exclusion criteria, protocols for appraisal of study quality and data abstraction, data synthesis methods, presentation of results, and discussion of clinical inferences and future research needs. Components of good quality included searching of MEDLINE plus other important databases (for example, EMBASE, Cochrane Library, or Clinical Trials Registry), inclusion of non–English-language publications, and a clear statement of acceptable inclusion criteria (for example, population, intervention, primary outcomes, study design, and assessment of agreement among reviewers). Good-quality reviews had good protocols for appraisal of study quality (for example, randomization, allocation concealment, blinding, independent assessment by ≥ 2 reviewers, assessment of interreviewer agreement, and process for resolution of disagreement stated) and data abstraction (for example, independent assessment by ≥ 2 reviewers, interreviewer agreement, resolution process for disagreement, and standardized data abstraction forms). Components of good-quality quantitative synthesis included random-effects models, assessment of statistical heterogeneity, handling of missing data, rationale for a priori sensitivity and subgroup analyses, and assessment of publication bias. Good-quality presentation of results included a flow diagram for results of the literature search with numbers and reasons for exclusions, adequate reporting of characteristics of included studies (for example, study design, participant characteristics, quality score, details of intervention, outcome definitions, and assessment of clinical heterogeneity), and summary results with effect sizes and CIs.
A good-quality discussion included summarization of key findings, clinical inferences based on internal and external validity of studies, interpretation of results on the totality of the evidence, potential biases, and a future research agenda. We graded systematic reviews and meta-analyses as fair if they were of fair to good quality on most components and as poor if they achieved only poor quality on most components.
Statistical Methods
Our literature search yielded primarily unadjusted estimates for most laboratory factors of interest. Limited multivariable, adjusted estimates were available for albumin level less than 30 g/L and elevated blood urea nitrogen level. However, rather than attempt to compute a potentially biased summary estimate, we provided narrative descriptions of the pattern of results for these potential risk factors.
The eligible multivariable risk factor studies varied considerably in the number and type of competing risks and confounders included in the analyses. Extensive use of prescreening methods and variable selection algorithms often limited reporting to the subset of risk factors that were determined as statistically significant in a given sample. The result is the introduction of a subtle form of publication bias, which we verified by examination of the funnel plots for each risk factor.
We extracted odds ratios from each study, along with their respective SEs, 95% confidence limits, or both. When necessary, we estimated SEs from the 95% confidence limits (156). We used the I² statistic (13) and the Cochran Q statistic (14) to assess study heterogeneity. We also recomputed pooled estimates with and without studies that produced extreme results. The I² statistic is the proportion of the total variance in the pooled estimate that is attributable to between-study variance. It is the maximum of (0, (Q − df)/Q), where the degrees of freedom (df) are the number of studies minus 1. This situation occurs when we have only 2 studies. An I² statistic of 50% or greater indicates substantial heterogeneity among study estimates. We used the DerSimonian–Laird method to compute random-effects estimates when the set of studies was heterogeneous (15). In cases where 3 or more studies contributed estimates for a risk factor, we used the trim-and-fill method to adjust pooled estimates of a risk factor's effect on postoperative pulmonary complications for publication bias (16).
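The pooling steps described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes the confidence limits are symmetric on the log odds ratio scale, uses the standard inverse-variance form of the Cochran Q statistic and the usual method-of-moments DerSimonian–Laird estimator, and omits the trim-and-fill adjustment. The function names and the three example studies are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for two-sided 95% limits


def log_or_and_se(odds_ratio, lower, upper):
    """Log odds ratio and its SE recovered from 95% confidence limits,
    assuming the limits are symmetric on the log scale."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return math.log(odds_ratio), se


def heterogeneity(effects, ses):
    """Cochran Q, degrees of freedom, and I^2 = max(0, (Q - df) / Q)."""
    weights = [1 / se ** 2 for se in ses]
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i2


def dersimonian_laird(effects, ses):
    """Random-effects pooled estimate, its SE, and tau^2 (between-study variance)."""
    weights = [1 / se ** 2 for se in ses]
    q, df, _ = heterogeneity(effects, ses)
    sum_w = sum(weights)
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_star = [1 / (se ** 2 + tau2) for se in ses]
    pooled = sum(w * y for w, y in zip(w_star, effects)) / sum(w_star)
    return pooled, math.sqrt(1 / sum(w_star)), tau2


# Hypothetical studies: (odds ratio, lower 95% limit, upper 95% limit).
studies = [(2.0, 1.2, 3.3), (1.5, 0.9, 2.5), (3.1, 1.6, 6.0)]
effects, ses = zip(*(log_or_and_se(*s) for s in studies))

q, df, i2 = heterogeneity(effects, ses)
pooled, se_pooled, tau2 = dersimonian_laird(effects, ses)
print(f"Q = {q:.2f} on {df} df, I^2 = {100 * i2:.0f}%")
print(f"Pooled OR = {math.exp(pooled):.2f} "
      f"({math.exp(pooled - Z_95 * se_pooled):.2f} to "
      f"{math.exp(pooled + Z_95 * se_pooled):.2f})")
```

All pooling is done on the log odds ratio scale and exponentiated only for reporting. When the estimated between-study variance is 0, the random-effects weights reduce to the fixed-effect inverse-variance weights.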
For the review on intervention strategies, we performed simple means and chi-square testing when eligible studies did not provide CIs or P values for statistical significance. We did not perform quantitative pooling or meta-analyses because we identified previously published meta-analyses and found insufficient additional evidence to warrant repooling. In other cases, studies were too few or were too clinically heterogeneous for meta-analysis.