The Accuracy of Clinical Staging of Stage I-IIIa Non-Small Cell Lung Cancer

Background: Clinical staging of non-small cell lung cancer (NSCLC) helps determine the prognosis and treatment of patients; few data exist on the accuracy of clinical staging and the impact on treatment and survival of patients. We assessed whether participant or trial characteristics were associated with clinical staging accuracy as well as impact on survival.
Methods: We used individual participant data from randomized controlled trials (RCTs), supplied for a meta-analysis of preoperative chemotherapy (± radiotherapy) vs surgery alone (± radiotherapy) in NSCLC. We assessed agreement between clinical TNM (cTNM) stage at randomization and pathologic TNM (pTNM) stage, for participants in the control group.
Results: Results are based on 698 patients who received surgery alone (± radiotherapy) with data for cTNM and pTNM stage. Forty-six percent of cases were cTNM stage I, 23% were cTNM stage II, and 31% were cTNM stage IIIa. cTNM stage disagreed with pTNM stage in 48% of cases, with 34% clinically understaged and 14% clinically overstaged. Agreement was not associated with age (P = .12), sex (P = .62), histology (P = .82), staging method (P = .32), or year of randomization (P = .98). Poorer survival in understaged patients was explained by the underlying pTNM stage. Clinical staging failed to detect T4 disease in 10% of cases and misclassified nodal disease in 38%.
Conclusions: This study demonstrates suboptimal agreement between clinical and pathologic staging. Discrepancies between clinical and pathologic T and N staging could have led to different treatment decisions in 10% and 38% of cases, respectively. There is therefore a need for further research into improving staging accuracy for patients with stage I-IIIa NSCLC.

The clinical staging of non-small cell lung cancer (NSCLC) is of paramount importance in determining a patient's prognosis, guiding treatment decisions, and defining clinical trial eligibility, as well as allowing comparison between clinical trials. Incorrect staging of NSCLC may result in inaccurate prognostic information for patients and errors in patient treatment. After extrathoracic metastases have been excluded, tumor and nodal staging are critical in making treatment decisions, as patients with N0 and N1 involvement are generally candidates for surgery. Patients with ipsilateral mediastinal disease (N2) are a heterogeneous group and may be offered chemoradiation therapy or surgery (with preoperative or postoperative chemotherapy). Patients with contralateral (N3) mediastinal (or supraclavicular) nodal disease are offered chemoradiation therapy or palliative treatment options. Therefore, clinical understaging, that is, staging that misses mediastinal metastases or mediastinal invasion of the primary lesion, may risk the patient undergoing radical treatment of the primary lesion for no benefit. Conversely, incorrect clinical overstaging of mediastinal disease may result in surgery being denied to an otherwise operable patient. The current guidance from the Union for International Cancer Control 1 suggests that when there is doubt about stage, the less advanced, or lower category should be chosen.
The emergence of techniques such as stereotactic body radiotherapy 2 (SABR) and radiofrequency ablation 3 to treat early-stage NSCLC in medically inoperable patients has further highlighted the importance of accurate clinical staging. Applying local nonsurgical treatments without the benefit of systematic lymph node dissection runs the risk of being futile if there is clinical understaging with unrecognized mediastinal or systemic disease.
Although the importance of accurate clinical staging is clear and the performance characteristics of individual tests in lung cancer staging are known, fewer data exist on the accuracy of clinical staging of NSCLC and how this relates to the staging techniques employed. The three studies that have been reported all show high levels of inaccurate clinical staging; however, none has demonstrated the impact of erroneous staging on clinical outcome. A prospective study of 383 patients with potentially resectable NSCLC demonstrated that clinically unsuspected N2 disease was found in 14% of patients, despite routine use of positron emission tomography-computed tomography (PET-CT) scanning. 4 A post-hoc analysis of 67 patients from the control arm of the Medical Research Council LU22 5 trial of preoperative chemotherapy suggested that nodal staging was inaccurate in 25% (95% CI, 15%-36%) of patients who underwent PET-CT scanning and mediastinoscopy. 6 A study comparing clinical and pathologic TNM data, collected for 2,336 patients included in the Dutch Lung Surgery Audit, 7 showed that only 54% of patients were clinically staged accurately, and no comment could be made on whether this affected patient survival outcomes. Thus, to investigate further, we used individual participant data (IPD) from trials supplied for a systematic review and meta-analysis of preoperative chemotherapy in non-small cell lung cancer to assess the accuracy of clinical staging, factors that may affect inaccuracy, and how inaccuracy might affect treatment decisions and survival.

Methods
To be eligible for inclusion in the original IPD meta-analysis, 8 trials had to have randomized patients with NSCLC to preoperative chemotherapy followed by surgery (± postoperative radiotherapy) vs surgery (± postoperative radiotherapy). Full details of the methods are presented elsewhere. 8 IPD were collected for 15 eligible randomized controlled trials and included 2,385 patients with non-small cell lung cancer. 8 However, only data from patients in the control arm of these trials were used in this analysis, to ensure that any difference between clinical and pathologic staging could not have been influenced by preoperative chemotherapy. Included randomized controlled trials (RCTs) used different editions of TNM staging, and these changes over time were taken into account (e-Table 1).
Data on age, sex, clinical staging techniques, clinical TNM stage, extent of resection, pathologic TNM stage, histology, performance status, treatment group and dates of randomization, last follow-up, and death were collected. We approached study investigators for permission to use these data for these analyses and for clarification where staging methods were unclear in the original trial protocol or manuscript.

Statistical Analysis
To assess agreement between clinical TNM stage (cTNM) and pathologic TNM stage (pTNM), a simple percentage agreement was calculated. Agreement between clinical and pathologic stage was also calculated using a weighted Cohen's κ, which takes into account both agreement by chance and the degree of disagreement. κ statistics were categorized as < 66% = low agreement, ≥ 66% = fair agreement, and ≥ 90% = good agreement. 9,10 To assess whether or not patient and trial characteristics might be associated with any cTNM staging inaccuracy, age, sex, histology, year of randomization, and staging method were included in a multivariate logistic regression model. Histology was classified into adenocarcinoma, squamous, and other/unknown. Staging methods were classified as CT scan with or without a chest radiograph or CT scan plus any other staging method, as there were insufficient data to do this in more detail. Staging method correlated strongly with year of randomization, so we included only the former in our primary analysis. However, a sensitivity analysis was also performed, where staging method was replaced with year of randomization. We generated Kaplan-Meier curves for overall survival for patients who were clinically understaged, for those who were clinically overstaged, and for those whose cTNM and pTNM agreed, and compared these using a log-rank test, stratified by trial and subsequently also by pathologic stage. The accuracy of clinical T stage and nodal status were considered separately to help pinpoint which disagreements could have influenced treatment decisions.
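The agreement measures described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the integer coding of stages (0 = lowest stage), the choice of linear weights as the default, and the toy data are all assumptions made for illustration, since the text does not state which weighting scheme was used.

```python
from collections import Counter


def percent_agreement(clinical, pathologic):
    """Proportion of patients whose clinical and pathologic stage are identical."""
    pairs = list(zip(clinical, pathologic))
    return sum(c == p for c, p in pairs) / len(pairs)


def staging_direction(clinical, pathologic):
    """Count agreement, clinical understaging (cTNM < pTNM) and overstaging (cTNM > pTNM)."""
    counts = Counter()
    for c, p in zip(clinical, pathologic):
        if c == p:
            counts["agree"] += 1
        elif c < p:
            counts["understaged"] += 1
        else:
            counts["overstaged"] += 1
    return counts


def weighted_kappa(clinical, pathologic, n_categories, weights="linear"):
    """Weighted Cohen's kappa for ordinal stages coded 0..n_categories-1.

    Disagreement weights grow with the distance between the two stages
    (linearly or quadratically); the weighting scheme is an assumption.
    """
    k = n_categories
    n = len(clinical)
    observed = [[0] * k for _ in range(k)]
    for c, p in zip(clinical, pathologic):
        observed[c][p] += 1
    row_totals = [sum(observed[i]) for i in range(k)]
    col_totals = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    # Weighted disagreement actually observed, and expected under chance.
    obs = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k)) / n
    exp = sum(weight(i, j) * row_totals[i] * col_totals[j]
              for i in range(k) for j in range(k)) / (n * n)
    if exp == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1 - obs / exp


def categorize_kappa(kappa):
    """Apply the thresholds quoted in the text (expressed as proportions)."""
    if kappa >= 0.90:
        return "good"
    if kappa >= 0.66:
        return "fair"
    return "low"


# Hypothetical toy data: 0 = stage I, 1 = stage II, 2 = stage III.
toy_clinical = [0, 0, 0, 1, 1, 2, 2, 2]
toy_pathologic = [0, 1, 2, 1, 2, 2, 2, 1]
toy_kappa = weighted_kappa(toy_clinical, toy_pathologic, 3)
```

Unlike the simple percentage, the weighted statistic penalizes a stage I vs stage III disagreement more heavily than a stage I vs stage II disagreement, which is why both measures are reported.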

Results
Fifteen RCTs were included in the original IPD systematic review and meta-analysis of preoperative chemotherapy followed by surgery vs surgery alone. Nine trials 5,11-18 (randomizing 1,586 patients in total) included data on both cTNM and pTNM stage, providing 698 control-arm patients for analysis (Table 1). These RCTs accrued patients between 1987 and 2005.
Clinical staging protocols varied among the trials (Table 1). One trial 11 (which recruited patients between 1987 and 1993) used chest radiography and mediastinoscopy only. More recent trials used CT scans and PET-CT imaging, but no trial utilized PET-CT scanning routinely, such that only 67 patients included in the analysis underwent PET-CT imaging. There was also variation among trials in the surgical methods used (Table 1).
Of the 698 patients included, 318 (46%) were cTNM stage I (83% of which were Ia), 160 (23%) were cTNM stage II (91% of which were IIa), and 218 (31%) were cTNM stage IIIa (Table 2). Only two patients were classed as cTNM stage IIIB, and were therefore not included in the regression or survival analyses. A more detailed breakdown is given in e-Figure 1.
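As a quick consistency check on the stage distribution reported above, the following Python sketch recomputes the percentages from the counts given in the text. The variable names are illustrative only; the counts themselves are taken directly from the paragraph.

```python
# Counts of control-arm patients by clinical stage, as reported in the text.
stage_counts = {"I": 318, "II": 160, "IIIa": 218, "IIIB": 2}

total = sum(stage_counts.values())

# Percentages rounded to whole numbers, as in the text.
stage_percent = {stage: round(100 * n / total) for stage, n in stage_counts.items()}

# Patients remaining once the two stage IIIB cases are set aside.
analysed = total - stage_counts["IIIB"]
```

The four counts sum to the 698 patients analyzed, and the rounded percentages for stages I, II, and IIIa match the reported 46%, 23%, and 31%.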
Survival varied with the accuracy of cTNM staging. In particular, patients who were clinically understaged appeared to have poorer survival than those who were clinically overstaged or those for whom cTNM and pTNM staging agreed (log-rank test stratified by trial P < .0001) (Fig 1). However, this is driven by the underlying pTNM stage (log-rank test stratified by trial and pathologic stage P = .54), which is more clearly illustrated in Figure 2. In particular, 44% of patients classed as cTNM stage I were pTNM stage II-IV, and 33% of patients classed as cTNM stage II were pTNM stage III-IV, explaining their lower survival (Fig 2).
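The survival curves referred to above are Kaplan-Meier estimates. A minimal Python sketch of the estimator follows, assuming follow-up times and a death indicator per patient. The function name and the toy data are hypothetical, and the sketch covers only the curve itself, not the stratified log-rank test used to compare groups.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times  : follow-up duration for each patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= removed
    return curve


# Hypothetical toy data (years of follow-up, death indicator).
toy_times = [1, 2, 2, 3, 4]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

In an analysis like the one described, a curve of this kind would be computed separately for the understaged, overstaged, and correctly staged groups before the groups are compared.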

Results Summary
We found that cTNM stage disagreed with pTNM stage in about one-half of patients, and was not clearly associated with age, sex, histology, the staging method used, or year of randomization. The discrepancies between clinical and pathologic T staging and N staging could have led to different treatment decisions in 10% and 38% of cases, respectively.

Strengths
To our knowledge, this is the first time IPD from major RCTs have been combined to assess the accuracy of staging in stage I-III NSCLC. While the randomized controlled trials included did not intend to evaluate staging, with the agreement of those who provided the data, this novel methodology provided us with a valuable opportunity to investigate more reliably the accuracy of clinical TNM staging. We could take advantage of per-protocol clinical staging and surgery and rigorous documentation of clinical and pathologic TNM stage for each patient. Also, data from randomized trials are less susceptible to the selection biases that can affect cohort studies. 19,20 Using IPD has enabled us to restrict the analysis to the control arms of these trials, thus avoiding confounding by treatment received and, in particular, potential downstaging from use of preoperative chemotherapy. For the first time, to our knowledge, this study also demonstrates the impact of the inaccuracy of clinical staging on patient survival outcomes. Importantly, the impact of staging accuracy on clinical decision making is also demonstrated using unselected data. The poorer survival seen in clinically understaged patients was explained by the underlying pTNM stage.

Limitations
Over time the trials included here used increasingly sophisticated staging methods, but surprisingly, a significant improvement in accuracy was not seen. However, many of the staging methods utilized in the included trials may now be considered suboptimal. 21 Earlier studies employed CT scanning and mediastinoscopy while the most recent trial used additional PET-CT imaging, but none used endosonography. Despite this, our staging accuracy results are remarkably similar to those from the audit of the quality of staging in Dutch patients, 7 which included routine use of PET-CT imaging and endosonography and included patients from January 2013 to December 2014. Indeed, of the patients in our analysis who did undergo PET-CT imaging, one-quarter were still understaged; this is discussed elsewhere. 6 While PET-CT imaging or endosonography was not routinely utilized in the trials included in this meta-analysis, this practice reflects current American College of Chest Physicians guidance 22 for patients with stage IA disease, which does not recommend the use of PET imaging or endosonography. Although it is difficult to generalize, assuming the trial population reflects routine practice, the data here suggest that 44% of patients with clinical stage I disease might have more advanced disease diagnosed postoperatively. A further limitation is that intraoperative pathologic staging protocols may have varied and are unlikely to be as comprehensive as currently recommended. 23 However, incomplete pathologic staging would only serve to reduce the extent of nodal staging inaccuracy.

Context
The advent of stereotactic radiotherapy and radiofrequency ablation for the treatment of early-stage NSCLC has highlighted the importance of accurate nodal staging. These newer techniques are used for the treatment of early-stage lung cancer but, in contrast to surgery, do not provide pathologic staging information.
In a study of relapse of NSCLC following stereotactic radiotherapy or surgery, there were twice as many recurrences in local lymph nodes in patients undergoing stereotactic radiotherapy compared with surgery, 24 emphasizing the importance of accurate nodal staging prior to SABR.
When surgery is undertaken and pathologic staging is available, prior invasive mediastinal sampling may take on less significance if we assume that surgery followed by adjuvant chemotherapy is at least as effective as chemoradiation. When considering stage II and III disease, inaccurate clinical staging may reduce the efficacy of surgery by failing to detect multistation N2 or N3 disease. For patients undergoing radical radiotherapy, imprecise clinical staging can result in an incorrect radiation field.
The most likely explanation for the low level of accuracy of clinical staging for patients with operable NSCLC is the sensitivity of the diagnostic tools employed. Patients being considered for treatment with curative intent typically undergo CT and PET-CT imaging as well as mediastinal sampling when required. Using a 10-mm short-axis cutoff for significance of mediastinal nodes, the sensitivity of CT scanning in detecting mediastinal metastases is 55%. 22 PET-CT imaging has a sensitivity of 77% to 81% 25 and may vary according to brand of scanner and histology. In a systematic pooled analysis of 9,267 patients, mediastinoscopy had a sensitivity of 78%. 22 Overstaging may occur with PET-CT imaging unless current guidelines 22
The data for this analysis were obtained from patients in controlled clinical trials, generally from centers with lung cancer expertise. Therefore, clinical staging accuracy in the wider population could well be worse.

Conclusions
The results of this analysis highlight some flaws in the clinical care of patients with NSCLC and emphasize the need for further research into techniques for improving staging accuracy for patients with stage I-III NSCLC.
2. Timmerman R, Paulus R, Galvin J, et al. Stereotactic body radiation therapy for