Clinical staging of non-small cell lung cancer (NSCLC) helps determine the prognosis and treatment of patients, yet few data exist on the accuracy of clinical staging and its impact on treatment and survival. We assessed whether participant or trial characteristics were associated with clinical staging accuracy and whether staging accuracy affected survival.
We used individual participant data from randomized controlled trials (RCTs), supplied for a meta-analysis of preoperative chemotherapy (± radiotherapy) vs surgery alone (± radiotherapy) in NSCLC. We assessed agreement between clinical TNM (cTNM) stage at randomization and pathologic TNM (pTNM) stage, for participants in the control group.
Results are based on 698 patients who received surgery alone (± radiotherapy) with data for cTNM and pTNM stage. Forty-six percent of cases were cTNM stage I, 23% were cTNM stage II, and 31% were cTNM stage IIIa. cTNM stage disagreed with pTNM stage in 48% of cases, with 34% clinically understaged and 14% clinically overstaged. Agreement was not associated with age (P = .12), sex (P = .62), histology (P = .82), staging method (P = .32), or year of randomization (P = .98). Poorer survival in understaged patients was explained by the underlying pTNM stage. Clinical staging failed to detect T4 disease in 10% of cases and misclassified nodal disease in 38%.
This study demonstrates suboptimal agreement between clinical and pathologic staging. Discrepancies between clinical and pathologic T and N staging could have led to different treatment decisions in 10% and 38% of cases, respectively. There is therefore a need for further research into improving staging accuracy for patients with stage I-IIIa NSCLC.
The clinical staging of non-small cell lung cancer (NSCLC) is of paramount importance in determining a patient’s prognosis, guiding treatment decisions, and defining clinical trial eligibility, as well as allowing comparison between clinical trials. Incorrect staging of NSCLC may result in inaccurate prognostic information for patients and errors in patient treatment. After extrathoracic metastases have been excluded, tumor and nodal staging are critical in making treatment decisions, as patients with N0 and N1 involvement are generally candidates for surgery. Patients with ipsilateral mediastinal disease (N2) are a heterogeneous group and may be offered chemoradiation therapy or surgery (with preoperative or postoperative chemotherapy). Patients with contralateral (N3) mediastinal (or supraclavicular) nodal disease are offered chemoradiation therapy or palliative treatment options. Therefore, clinical understaging, that is, staging that misses mediastinal metastases or mediastinal invasion of the primary lesion, may risk the patient undergoing radical treatment of the primary lesion for no benefit. Conversely, incorrect clinical overstaging of mediastinal disease may result in surgery being denied to an otherwise operable patient. The current guidance from the Union for International Cancer Control
to treat early-stage NSCLC in medically inoperable patients has further highlighted the importance of accurate clinical staging. Applying local nonsurgical treatments without the benefit of systematic lymph node dissection runs the risk of being futile if there is clinical understaging with unrecognized mediastinal or systemic disease.
Although the importance of accurate clinical staging is clear and the performance characteristics of individual tests in lung cancer staging are known, fewer data exist on the accuracy of clinical staging of NSCLC and how this relates to the staging techniques employed. Three studies that have been reported all show high levels of inaccurate clinical staging; however, none have demonstrated the impact of erroneous staging on clinical outcome. A prospective study of 383 patients with potentially resectable NSCLC demonstrated that clinically unsuspected N2 disease was found in 14% of patients. Despite routine use of positron emission tomography-computed tomography (PET-CT) scanning,
a post-hoc analysis of 67 patients from the control arm of the Medical Research Council LU22 trial of preoperative chemotherapy suggested that nodal staging was inaccurate in 25% (95% CI, 15%-36%) of patients who underwent PET-CT scanning and mediastinoscopy.
showed that only 54% of patients were clinically staged accurately, and no comment could be made on whether this impacted on patient survival outcomes. Thus, to investigate further, we used individual participant data (IPD) from trials supplied for a systematic review and meta-analysis of preoperative chemotherapy in non-small cell lung cancer to assess the accuracy of clinical staging, factors that may affect inaccuracy, and how inaccuracy might impact on treatment decisions and survival.
To be eligible for inclusion in the original IPD meta-analysis,
trials should have randomized patients with NSCLC to preoperative chemotherapy followed by surgery (± postoperative radiotherapy) vs surgery (± postoperative radiotherapy). Full details of the methods are presented elsewhere.
However, only data from patients from the control arm in these trials were used in this analysis, to ensure that any difference between clinical and pathologic staging could not have been influenced by preoperative chemotherapy. Included randomized controlled trials (RCTs) used different editions of TNM staging, and these changes over time were taken into account (e-Table 1).
Data on age, sex, clinical staging techniques, clinical TNM stage, extent of resection, pathologic TNM stage, histology, performance status, treatment group and dates of randomization, last follow-up, and death were collected. We approached study investigators for permission to use these data for these analyses and for clarification where staging methods were unclear in the original trial protocol or manuscript.
To assess agreement between clinical TNM stage (cTNM) and pathologic TNM stage (pTNM), a simple percentage agreement was calculated. Agreement between clinical and pathologic stage was also calculated using a weighted Cohen’s κ, which takes into account both agreement by chance and the degree of disagreement. κ statistics were categorized as < 66% = low agreement, ≥ 66% = fair agreement, and ≥ 90% = good agreement.
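The two agreement measures described above can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used in the study: the function names, the toy stage lists, and the choice of linear disagreement weights are all assumptions (the text does not state whether linear or quadratic weights were applied).

```python
from collections import Counter


def percent_agreement(clinical, pathologic):
    """Share of patients whose clinical and pathologic stage match exactly."""
    matches = sum(c == p for c, p in zip(clinical, pathologic))
    return matches / len(clinical)


def weighted_kappa(clinical, pathologic, categories):
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1).

    categories must list the stages in increasing order of severity.
    Linear weighting is an assumption made for this sketch.
    """
    k = len(categories)
    index = {cat: i for i, cat in enumerate(categories)}
    n = len(clinical)
    observed = Counter((index[c], index[p]) for c, p in zip(clinical, pathologic))
    row = Counter(index[c] for c in clinical)    # clinical stage totals
    col = Counter(index[p] for p in pathologic)  # pathologic stage totals

    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed_disagreement += weight * observed[(i, j)] / n
            expected_disagreement += weight * (row[i] / n) * (col[j] / n)
    if expected_disagreement == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical toy data for six patients (not taken from the trials).
stages = ["I", "II", "III"]
c_stage = ["I", "I", "I", "II", "II", "III"]
p_stage = ["I", "II", "III", "II", "III", "III"]

print(percent_agreement(c_stage, p_stage))                  # 0.5
print(round(weighted_kappa(c_stage, p_stage, stages), 3))   # 0.333
```

Unlike the simple percentage, the weighted κ penalizes a stage I versus stage III disagreement more heavily than a stage I versus stage II disagreement, and it discounts the agreement expected by chance from the marginal stage distributions.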
To assess whether patient and trial characteristics might be associated with any cTNM staging inaccuracy, age, sex, histology, year of randomization, and staging method were included in a multivariate logistic regression model. Histology was classified into adenocarcinoma, squamous, and other/unknown. Staging methods were classified as CT scan with or without a chest radiograph or CT scan plus any other staging method, as there were insufficient data to do this in more detail. Staging method correlated strongly with year of randomization, so we included only the former in our primary analysis. However, a sensitivity analysis was also performed, where staging method was replaced with year of randomization. We generated Kaplan-Meier curves for overall survival based on patients who were clinically understaged, clinically overstaged, and for those whose cTNM and pTNM agreed, and compared these using a log-rank test, stratified by trial and subsequently also by pathologic stage. The accuracy of clinical T stage and nodal status were considered separately to help pinpoint which disagreements could have influenced treatment decisions.
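As an illustration of the survival curves mentioned above, the Kaplan-Meier product-limit estimate can be computed directly from follow-up times and death indicators. The sketch below is a plain-Python version written for clarity; the function name and the five-patient data set are hypothetical and are not drawn from the trials.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times  -- follow-up time per patient, in any consistent unit
    events -- 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            # Patients censored at t still count as at risk at t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months; 0 marks a censored observation.
months = [2, 3, 3, 5, 8]
died = [1, 0, 1, 1, 0]
for t, s in kaplan_meier(months, died):
    print(t, round(s, 2))
```

In an analysis like the one described, one such curve would be estimated for each group (understaged, overstaged, and agreeing cTNM and pTNM) and the curves then compared with a log-rank test.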
Fifteen RCTs were included in the original IPD systematic review and meta-analysis of preoperative chemotherapy followed by surgery vs surgery alone. Nine trials
All accessible hilar (level 10) lymph nodes must be dissected …A complete mediastinal lymph node sampling should be performed…for right-sided lesions, this includes 2R, 4R, 7, 8, and 9. For left-sided lesions, this includes 4L, 5, 6, 7, 8, and 9
(which recruited patients between 1987 and 1993) used chest radiography and mediastinoscopy only. More recent trials used CT scans and PET-CT imaging, but no trial utilized PET-CT scanning routinely, such that only 67 patients included in the analysis underwent PET-CT imaging. There was also variation among trials in the surgical methods used (Table 1).
Of the 698 patients included, 318 (46%) were cTNM stage I (83% of whom were Ia), 160 (23%) were cTNM stage II (91% of whom were IIa), and 218 (31%) were cTNM stage IIIa (Table 2). Only two patients were classed as cTNM stage IIIb, and were therefore not included in the regression or survival analyses. A more detailed breakdown is given in e-Figure 1.
Table 2. Agreement Between Clinical and Pathologic TNM Stage Data
Agreement between cTNM and pTNM staging was low (52%; weighted Cohen’s κ = 0.35; 95% CI, 0.30-0.40) (Table 2). In 34% of cases, patients were clinically understaged, and in 14% of cases, patients were clinically overstaged (Table 2). In the main regression analysis, age (P = .12), sex (P = .62), histology (P = .82), and the staging method (P = .32) were not significantly associated with the accuracy of cTNM staging, and in a sensitivity analysis there was no association with year of randomization (P = .98; e-Table 2).
Survival varied with the accuracy of cTNM staging. In particular, patients who were clinically understaged appeared to have poorer survival than those who were clinically overstaged or those for whom cTNM and pTNM staging agreed (log-rank test stratified by trial P < .0001) (Fig 1). However, this is driven by the underlying pTNM stage (log-rank test stratified by trial and pathologic stage P = .54), which is more clearly illustrated in Figure 2. In particular, 44% of patients classed as cTNM stage I were pTNM stage II-IV, and 33% of patients classed as cTNM stage II were pTNM stage III-IV, explaining their lower survival (Fig 2).
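The stratified comparison reported here can be illustrated with a small log-rank implementation. The sketch below is a simplification under stated assumptions: it handles only two groups (for example, understaged patients versus all others) rather than the three groups compared in the study, and the function name and arguments are hypothetical. Observed-minus-expected deaths and their variance are accumulated within each stratum and then pooled, which is what stratifying by trial, or by trial and pathologic stage, amounts to.

```python
import math
from collections import defaultdict


def stratified_logrank(times, events, groups, strata):
    """Two-group log-rank test pooled across strata.

    times   -- follow-up time per patient
    events  -- 1 if death was observed, 0 if censored
    groups  -- 0 or 1 per patient
    strata  -- any hashable label per patient (e.g. trial, or (trial, stage))
    Returns (chi_square, p_value) on 1 degree of freedom.
    """
    by_stratum = defaultdict(list)
    for t, e, g, s in zip(times, events, groups, strata):
        by_stratum[s].append((t, e, g))

    obs_minus_exp = 0.0
    variance = 0.0
    for rows in by_stratum.values():
        death_times = sorted({t for t, e, _ in rows if e})
        for t in death_times:
            at_risk = [r for r in rows if r[0] >= t]
            n = len(at_risk)
            n1 = sum(1 for r in at_risk if r[2] == 1)
            d = sum(1 for r in at_risk if r[0] == t and r[1])
            d1 = sum(1 for r in at_risk if r[0] == t and r[1] and r[2] == 1)
            # Expected deaths in group 1 if both groups shared one hazard.
            obs_minus_exp += d1 - d * n1 / n
            if n > 1:
                variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("no informative event times")
    chi_square = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value
```

Adding pathologic stage to the stratum label restricts each comparison to patients with the same pTNM stage. A survival difference that disappears under that stratification, as reported above, is one that the underlying pathologic stage accounts for.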
Agreement was low between clinical and pathologic T stage (65%; weighted Cohen’s κ = 0.33; 95% CI, 0.27-0.39) (Table 3) and N stage (62%, weighted Cohen’s κ = 0.42; 95% CI, 0.37-0.48) (Table 4). Specifically, clinical staging failed to detect T4 disease in 10% of patients (Table 3), and nodal disease in 19% of patients. In addition, 12% were judged erroneously to have node-positive disease (Table 4).
Table 3. Agreement Between Clinical and Pathologic T Stage Data
We found that cTNM stage disagreed with pTNM stage in about one-half of patients and that this disagreement was not clearly associated with age, sex, histology, the staging method used, or year of randomization. The discrepancies between clinical and pathologic T staging and N staging could have led to different treatment decisions in 10% and 38% of cases, respectively.
To our knowledge, this is the first time IPD from major RCTs have been combined to assess the accuracy of staging in stage I-III NSCLC. While the randomized controlled trials included did not intend to evaluate staging, with the agreement of those who provided the data, this novel methodology provided us with a valuable opportunity to investigate more reliably the accuracy of clinical TNM staging. We could take advantage of per-protocol clinical staging and surgery and rigorous documentation of clinical and pathologic TNM stage for each patient. Also, data from randomized trials are less susceptible to the selection biases that can affect cohort studies.
Point: Clinical stage IA non-small cell lung cancer determined by computed tomography and positron emission tomography is frequently not pathologic IA non-small cell lung cancer: the problem of understaging.
Using IPD has enabled us to restrict the analysis to the control arms of these trials, thus avoiding confounding by treatment received and, in particular, potential downstaging from use of preoperative chemotherapy.
For the first time, to our knowledge, this study also demonstrates the impact of the inaccuracy of clinical staging on patient survival outcomes. Importantly, the impact of staging accuracy on clinical decision making is also demonstrated using unselected data. The poorer survival seen in clinically understaged patients was explained by the underlying pTNM stage.
Over time the trials included here used increasingly sophisticated staging methods, but surprisingly, a significant improvement in accuracy was not seen. However, many of the staging methods utilized in the included trials may now be considered suboptimal.
Earlier studies employed CT scanning and mediastinoscopy while the most recent trial used additional PET-CT imaging, but none used endosonography. Despite this, our staging accuracy results are remarkably similar to those from the audit of the quality of staging in Dutch patients,
which included routine use of PET-CT imaging and endosonography and included patients from January 2013 to December 2014. Indeed, of the patients in our analysis who did undergo PET-CT imaging, one-quarter were still understaged; this is discussed elsewhere.
for patients with stage IA disease, which does not recommend the use of PET imaging or endosonography. Although it is difficult to generalize, assuming the trial population reflects routine practice, the data here suggest that 44% of patients with clinical stage I disease might have more advanced disease diagnosed postoperatively. A further limitation is that intraoperative pathologic staging protocols may have varied and are unlikely to be as comprehensive as currently recommended.
However, incomplete pathologic staging would only serve to reduce the extent of nodal staging inaccuracy.
The advent of stereotactic radiotherapy and radiofrequency ablation for the treatment of early-stage NSCLC has highlighted the importance of accurate nodal staging. These newer techniques are used for the treatment of early-stage lung cancer but, in contrast to surgery, do not provide pathologic staging information. In a study of relapse of NSCLC following stereotactic radiotherapy or surgery, there were twice as many recurrences in local lymph nodes in patients undergoing stereotactic radiotherapy compared with surgery,
emphasizing the importance of accurate nodal staging prior to stereotactic ablative radiotherapy (SABR).
When surgery is undertaken and pathologic staging is available, prior invasive mediastinal sampling may take on less significance if we assume that surgery followed by adjuvant chemotherapy is at least as effective as chemoradiation. When considering stage II and III disease, inaccurate clinical staging may reduce the efficacy of surgery by failing to detect multistation N2 or N3 disease. For patients undergoing radical radiotherapy, imprecise clinical staging can result in an incorrect radiation field.
The most likely explanation for the low level of accuracy of clinical staging for patients with operable NSCLC is the sensitivity of the diagnostic tools employed. Patients being considered for treatment with curative intent typically undergo CT and PET-CT imaging as well as mediastinal sampling when required. Using a 10-mm short-axis cutoff for significance of mediastinal nodes, the sensitivity of CT scanning in detecting mediastinal metastases is 55%.
are adhered to and PET-positive findings are clarified by invasive sampling. More recently the introduction of endobronchial and endoscopic ultrasound has improved the clinical staging of patients with NSCLC, resulting in a reduction in futile surgery
when employed routinely for patients with stage I-III disease.
These findings have implications for the care of patients with NSCLC, as well as appropriate selection of suitable patients for inclusion in clinical trials. Understaging the T stage may mean that the patient undergoes surgery without the surgeon knowing the full extent of the primary disease, which may result in an incomplete resection. Ten percent of patients in our analysis were found to have previously unexpected T4 disease. Erroneous nodal staging in patients without metastatic disease can similarly result in inappropriate treatment decisions, which can significantly impact on patient outcomes. Patients with nodal disease undetected by clinical staging methods may undergo futile surgery (or SABR) whereas chemoradiotherapy may have been the preferred initial treatment of clinicians and patients with full knowledge of nodal involvement. Conversely, if clinical staging overestimates the extent of nodal disease (114 patients [15%] in this meta-analysis), then this may mean patients are denied potentially curative surgery. The data for this analysis were obtained from patients in controlled clinical trials, generally from centers with lung cancer expertise. Therefore, clinical staging accuracy in the wider population could well be worse.
The results of this analysis highlight some flaws in the clinical care of patients with NSCLC and emphasize the need for further research into techniques for improving staging accuracy for patients with stage I-III NSCLC.
Author contributions: S. B. and D. F. had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. S. B., D. F., J. F. T., R. J. S., and N. N. contributed substantially to the study design, data analysis and interpretation, and writing of the manuscript.
*NSCLC Meta-analysis Collaborative Group: Project Management Group: Sarah Burdett, Larysa H. M. Rydzewska, and Jayne F. Tierney (MRC Clinical Trials Unit at UCL, London, UK); Anne Auperin, Thierry Le Chevalier, Cécile Le Pechoux, and Jean-Pierre Pignon (Gustave-Roussy, Villejuif, France). International Advisory Group: Rodrigo Arriagada (Karolinska Institutet, Stockholm, Sweden), (Gustave-Roussy, Villejuif, France), David H. Johnson (Southwestern Medical Center, University of Texas, Dallas, TX), Jan van Meerbeeck (MOCA-Thoracic Oncology, University Hospital Antwerp, Antwerp, Belgium), Mahesh K. B. Parmar (MRC Clinical Trials Unit at UCL, London, UK); Richard J. Stephens (MRC Clinical Trials Unit at UCL, London, UK [retired]); and Lesley A. Stewart (Centre for Reviews and Dissemination, York, UK). Collaborators who supplied individual participant data: Paul A. Bunn (University of Colorado Cancer Center, Aurora, CO); Bertrand Dautzenberg (Service de Pneumologie et Réanimation, Groupe Hospitalier Pitié-Salpêtrière, Paris, France); David Gilligan (Addenbrooke’s Hospital, Cambridge, UK); Harry Groen (Universitair Medisch Centrum Groningen, Groningen, The Netherlands); Aija Knuuttila (Helsinki University Central Hospital, Helsinki, Finland); Eric Vallieres (Swedish Cancer Institute, Seattle, WA); Rafael Rosell (Catalan Institute of Oncology, Hospital Germans Trias i Pujol, Barcelona, Spain); Jack Roth (University of Texas M.D. Anderson Cancer Center, Houston, TX); Giorgio Scagliotti (University of Turin, San Luigi Hospital, Turin, Italy); Masahiro Tsuboi (National Cancer Center Hospital East, Kashiwanoha, Kashiwa-shi, Japan); David Waller (Glenfield Hospital, Leicester, UK); Virginie Westeel (Centre Hospitalier Universitaire, Besançon, France); and Yi-Long Wu and Xue-Ning Yang (Guangdong Lung Cancer Institute, Guangdong General Hospital and Guangdong Academy of Medical Sciences, Guangzhou, China).
Role of sponsors: The sponsor had no role in the design of the study, the collection and analysis of the data, or the preparation of the manuscript.
Other contributions: The authors thank all the patients who took part in all the trials included in these analyses. Publication is on behalf of the Non-Small Cell Lung Cancer Collaborative Group.
Additional information: The e-Figure and e-Tables can be found in the Supplemental Materials section of the online article.
FUNDING/SUPPORT: Funded by the UK Medical Research Council [Grant MC_UU_12023/28]. This work was in part undertaken at UCLH/UCL, which received a proportion of funding from the United Kingdom Department of Health’s NIHR Biomedical Research Centre’s funding scheme (N. N.).
In a previous issue of CHEST (March 2019), Navani and colleagues1 presented results of a meta-analysis focused on the accuracy of clinical staging of non-small cell lung cancer (NSCLC) in clinical trial patients. In their analysis, clinical staging was accurate in just 52% of cases, with 34% of patients clinically understaged based on the pathologic TNM stage. This effect was seen across all subgroups stratified by stage. However, the meta-analysis did not take into account the important potential confounder of elapsed time between clinical staging and pathologic staging, which may contribute to understaging and progression of disease.
We thank Dr Gard and Dr Voskoboynik for their interest in our article1 and for their thoughtful comments. We agree that timeliness of care appears to be important for patients undergoing lung cancer diagnosis and treatment and is the subject of an ongoing systematic review.2 A previous randomized trial of endobronchial ultrasound by our group3 resulted in faster diagnosis and was also associated with improved survival. Reduced time to treatment may also improve patient experience and result in fewer patients reporting a drop in performance status.