Interobserver Reliability of the Berlin ARDS Definition and Strategies to Improve the Reliability of ARDS Diagnosis

Published: December 14, 2017


      Failure to reliably diagnose ARDS may be a major driver of negative clinical trials and of underrecognition and undertreatment in clinical practice. We sought to examine the interobserver reliability of the Berlin ARDS definition and to evaluate strategies for improving the reliability of ARDS diagnosis.


      Two hundred five patients with hypoxic respiratory failure from four ICUs were reviewed independently by three clinicians, who recorded whether each patient had ARDS, their confidence in that diagnosis, whether each individual ARDS criterion was met, and the time at which the criteria were met.


      Interobserver reliability of an ARDS diagnosis was “moderate” (kappa = 0.50; 95% CI, 0.40-0.59). Differences in how chest imaging studies were interpreted explained 67% of diagnostic disagreements between clinicians reviewing the same patient, with other ARDS criteria contributing less (identification of an ARDS risk factor, 15%; exclusion of cardiac edema/volume overload, 7%). Combining the independent reviews of three clinicians can increase reliability to “substantial” (kappa = 0.75; 95% CI, 0.68-0.80). When a clinician diagnosed ARDS with “high confidence,” all other clinicians agreed with the diagnosis in 72% of reviews. Clinicians agreed closely on the time at which a patient met all ARDS criteria if ARDS developed within the first 48 hours of hospitalization (median difference, 5 hours).
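
      As an illustration of how a chance-corrected agreement statistic of this kind can be computed when several clinicians review the same patients, the sketch below implements Fleiss' kappa in Python. This is an assumption made for illustration: the abstract does not say which kappa estimator the authors used, and the ratings in the usage example are invented toy data, not study data.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table of shape (subjects, categories).

    counts[i][j] is the number of raters who placed subject i in category j.
    Every row must sum to the same number of raters.
    """
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fall in each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Per-subject agreement: fraction of rater pairs that agree.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_obs = sum(p_subj) / n_subjects   # mean observed agreement
    p_exp = sum(p * p for p in p_cat)  # agreement expected by chance
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical toy data: 4 patients, 3 reviewers, columns = [ARDS, not ARDS].
toy_counts = [[3, 0], [0, 3], [2, 1], [1, 2]]
print(round(fleiss_kappa(toy_counts), 3))
```

      In the toy table the reviewers agree unanimously on two patients and split 2-to-1 on the other two, so observed agreement is well above chance but far from perfect. The function does not handle the degenerate case in which every rating falls in a single category (expected agreement of 1).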


      The reliability of the Berlin ARDS definition is moderate, driven primarily by differences in chest imaging interpretation. Combining independent reviews by multiple clinicians or improving methods to identify bilateral infiltrates on chest imaging are important strategies for improving the reliability of ARDS diagnosis.
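
      One way to see why pooling independent reviews helps is the Spearman-Brown prediction formula, which estimates the reliability of a combined rating from the reliability of a single rater. The sketch below is an assumption made for illustration: the abstract does not state that this formula was the authors' method, and treating kappa as the single-reviewer reliability is a simplification. Under those assumptions, a single-reviewer value of 0.50 predicts 0.75 for three reviewers, which is consistent with the reported figures.

```python
def spearman_brown(single_rater_reliability: float, n_raters: int) -> float:
    """Predicted reliability of the combined rating of n_raters raters.

    Spearman-Brown prediction formula: k * r / (1 + (k - 1) * r),
    where r is the reliability of one rater and k the number of raters.
    """
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    r = single_rater_reliability
    return n_raters * r / (1 + (n_raters - 1) * r)


# Predicted reliability as more independent reviewers are combined,
# starting from the single-reviewer value of 0.50 reported in the abstract.
for k in (1, 2, 3, 5):
    print(k, round(spearman_brown(0.50, k), 2))
```

      The formula shows diminishing returns: each added reviewer improves predicted reliability by less than the one before, which matters when weighing reviewer time against diagnostic consistency.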

      Key Words


      ICC (intraclass correlation coefficient)



      Linked Article

      • ARDS Cannot Be Accurately Differentiated From Cardiogenic Pulmonary Edema Without Systematic Tissue Doppler Echocardiography
        CHEST Vol. 154 Issue 1
        • In Brief
          We read with interest the article by Sjoding et al¹ in a recent issue of CHEST (February 2018). They found “moderate” interobserver agreement among clinicians in diagnosing ARDS using the Berlin criteria. As shown in the e-Tables, the ARDS criteria adopted were based, among others, on exclusion of cardiogenic pulmonary edema (CPE). Variance explained by the “chest imaging” criteria was 60%, whereas that explained by the “exclusion of CPE” criteria was very low. However, the prevalence-adjusted bias-adjusted kappa was similar between the “ARDS risk factor” criteria and the “exclusion of CPE” criteria (ie, 0.65 and 0.70).
      • Response
        CHEST Vol. 154 Issue 1
        • In Brief
          We thank Dr Vassallo and colleagues for their interest in our recent study¹ evaluating the inter-rater reliability of the Berlin ARDS definition. We agree that accurately identifying cardiogenic pulmonary edema (CPE) in patients with acute respiratory failure is essential for ensuring patients receive adequate treatment. Evidence suggests that missed diagnosis of CPE is common in this setting and may be associated with higher mortality.² New approaches to identify patients with acute heart failure and CPE are needed, and increased adoption of bedside echocardiography in routine practice is one potential solution.