Cancer Management and Research, Volume 13

Construction and Validation of a Nomogram to Predict Overall Survival in Very Young Female Patients with Curatively Resected Breast Cancer

Authors Li N, Feng LW, Li ZN, Wang J, Yang L 

Received 4 June 2021

Accepted for publication 20 July 2021

Published 6 August 2021 Volume 2021:13 Pages 6181–6190

DOI https://doi.org/10.2147/CMAR.S321917

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Professor Seema Singh



Ning Li,1,* Li-Wen Feng,2,* Zuo-Nong Li,1 Jin Wang,1 Lu Yang1,3

1Department of Breast Oncology, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, 510060, People’s Republic of China; 2Department of Breast Surgery, Zhongshan Torch Development Zone Hospital, Zhongshan, 528403, People’s Republic of China; 3Department of Radiotherapy, Guangdong Provincial People’s Hospital and Guangdong Academy of Medical Sciences School of Medicine, South China University of Technology, Guangzhou, 510080, People’s Republic of China

*These authors contributed equally to this work

Correspondence: Lu Yang; Jin Wang Email [email protected]; [email protected]

Purpose: Young age is an independent negative predictor of breast cancer (BC) survival and correlates with the risk of local recurrence and contralateral BC. We aimed to design an effective and comprehensive nomogram to predict prognosis in very young patients with curatively resected BC.
Methods: Female patients with a diagnosis of BC aged ≤35 years at presentation were identified from the SEER database as a training cohort. The validation cohort consisted of 1002 consecutive women with BC aged ≤35 years who had received curative resection at the Sun Yat-sen University Cancer Center. A nomogram was built based on the variables identified in a multivariate Cox proportional hazards model. The performance of the nomogram was quantified using Harrell’s concordance index (C-index) and calibration curves.
Results: Overall, 10,872 very young female patients who underwent surgery for BC were enrolled in the training cohort, while 1002 very young female BC patients formed an independent validation cohort. Eight covariables (age, race, grade, ER status, PR status, HER2 status, T stage, and N stage) were identified and incorporated into the nomogram. The C-index values of the nomogram for overall survival (OS) were 0.727 (95% CI: 0.714–0.740) and 0.722 (95% CI: 0.666–0.778) in the training and validation cohorts, respectively. The calibration curves showed a high degree of agreement between the predicted and observed survival rates in both the training and validation cohorts. The nomogram displayed good calibration and acceptable discrimination. Based on the total prognostic scores (TPS) of the nomogram model for OS, patients were divided with the X-tile program into 3 risk groups, which were clearly separated in survival analyses for OS.
Conclusion: We have successfully constructed an effective nomogram to predict survival outcomes for young female patients with curatively resected BC, which may provide individual survival prediction to benefit prognosis evaluation and individualized therapy.

Keywords: young, breast cancer, prognosis, nomogram, SEER

Introduction

Female breast cancer (BC) surpassed lung cancer as the most frequently diagnosed cancer globally in 2020, accounting for about 1 in 4 of all new cancer diagnoses and 1 in 6 of all cancer deaths in women.1 In China, BC also remains the most common malignancy in women, and accounts for approximately 15% of total cancer cases and 7% of total cancer deaths.2 Although women in China have a lower risk of BC than do women in western countries, this disease occurs at a younger median age in Chinese women than in western White women.3 Typically, BC in young females exhibits certain pathological differences, including a more aggressive phenotype, less favorable prognosis, and higher risk of recurrence compared with older patients.4,5 Beyond China, the GRELL study in Europe reported that the incidence of BC in young women has increased by 1.2% annually, especially for women <35 years of age.6 There was also a small but statistically significant increase in the incidence of young BC with distant involvement in the United States.7 Therefore, BC in young women has become a growing concern in clinical practice worldwide.

Clinically, the tumor-node-metastasis (TNM) staging system is a tool commonly used by oncologists to predict disease prognosis.8,9 However, the TNM classification alone is insufficient to predict the long-term outcomes of all BC cases, especially for very young BC patients. Thus, effective prediction models for BC in very young female patients are warranted. A nomogram is a reliable tool to quantify individual risk that incorporates multiple important prognostic factors.10,11 Previously, we established effective prognostic nomogram models for very young patients with breast cancer in a single-center retrospective analysis.11 However, the sample size of that analysis was too small to allow subgroup analysis.

The definition of “BC in young women” in most literature is BC patients ≤40 years of age.12 In our study we included only patients aged ≤35 years, as in a recent study.13 We obtained population-based data from the Surveillance, Epidemiology, and End Results (SEER) database, and developed a nomogram to predict the survival of BC in very young women. We also validated the nomogram using a retrospective cohort from our center. We hypothesized that a nomogram could be designed by combining important clinical and pathological variables in a multivariate model to predict the postoperative long-term prognosis of BC in very young women. The large size of the SEER database combined with data from our center allows us to investigate clinical predictors of BC in very young patients and provides a detailed description of BC characteristics in this population.

Methods

Patient Selection and Data Acquisition

Patients with a diagnosis of BC between 2005 and 2015 who were aged ≤35 years at presentation were identified from the SEER database (covering 18 registries) using SEER*Stat version 8.3.9 (https://seer.cancer.gov/) as a training cohort. Variables selected from the SEER database were as follows: age, sex, race, histologic type ICD-O-3, laterality, grade, T stage, N stage and TNM stage (AJCC stage group 6th edition), ER status, PR status, HER2 recode (after 2010), survival months, and vital status. Patients were included according to the following criteria: (I) female patients, (II) age ≤35 years, (III) diagnosis confirmed by histology, (IV) received resection surgery, (V) complete data available, with more than 0 days of survival. Patients were excluded for the following reasons: (I) cases with a diagnosis according to clinical or imaging findings or autopsy, (II) cases with unknown variables, (III) incomplete data available or complete data available but 0 days of survival (Figure 1).
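The inclusion criteria above can be sketched as a simple record filter. This is only an illustration: the field names (`sex`, `age`, `histologically_confirmed`, `resection`, `survival_time`, and the entries of `REQUIRED_FIELDS`) are hypothetical and do not correspond to actual SEER*Stat column names, and a missing value is represented here as `None`.

```python
# Hypothetical field names; not actual SEER*Stat variable names.
REQUIRED_FIELDS = ("race", "grade", "t_stage", "n_stage",
                   "er_status", "pr_status", "survival_time", "vital_status")

def is_eligible(record):
    """Apply inclusion criteria (I)-(V) to one patient record (a dict)."""
    if record.get("sex") != "female":                 # (I) female patients
        return False
    age = record.get("age")
    if age is None or age > 35:                       # (II) age <= 35 years
        return False
    if not record.get("histologically_confirmed"):    # (III) histology
        return False
    if not record.get("resection"):                   # (IV) resection surgery
        return False
    if any(record.get(f) is None for f in REQUIRED_FIELDS):
        return False                                  # (V) complete data
    return record["survival_time"] > 0                # (V) survival > 0

# Toy records (hypothetical, not study data)
base = {"sex": "female", "age": 33, "histologically_confirmed": True,
        "resection": True, "race": "White", "grade": "II", "t_stage": "T2",
        "n_stage": "N1", "er_status": "positive", "pr_status": "positive",
        "survival_time": 48, "vital_status": "alive"}
too_old = dict(base, age=36)
zero_survival = dict(base, survival_time=0)
missing_grade = dict(base, grade=None)
cohort = [r for r in (base, too_old, zero_survival, missing_grade)
          if is_eligible(r)]
```

Only the first toy record passes all five criteria in this sketch.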

Figure 1 Flow diagram of patient selection.

The validation cohort consisted of 1002 consecutive women with BC aged ≤35 years and who had received curative resection for BC at Sun Yat-sen University Cancer Center between 1 July 2002 and 31 July 2018. All patients were restaged by the sixth TNM classification system for BC.14

Statistical Analysis

The Chi-square test was used to compare categorical variables. The primary endpoint was overall survival (OS), which was defined as the interval from the date of surgery to the date of death from any cause. Kaplan-Meier curves were drawn for OS, and differences were compared by the log-rank test. The Cox proportional hazards model was used to perform univariate and multivariate analyses. Variables reaching a significance level of 0.1 in univariate analyses were included in the multivariate analysis.
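For readers unfamiliar with the Kaplan-Meier method mentioned above, the following is a minimal pure-Python sketch of the product-limit estimator. It is not the authors' code (the analyses were run in R); the function name and the toy data are assumptions made for illustration.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up time from surgery (any consistent unit)
    events : 1 if death from any cause was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)      # each patient leaves the risk set at own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]     # remove deaths and censored cases at t
    return curve

# Toy data (hypothetical, not from the study)
toy_times = [6, 12, 12, 20, 30, 36]
toy_events = [1, 0, 1, 1, 0, 0]
km_curve = kaplan_meier(toy_times, toy_events)
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is why the estimate only changes at the three observed death times in the toy data.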

The nomograms for predicting 3-, 5-, and 10-year OS were formulated based on the corresponding independent prognostic factors in multivariate analysis. The discrimination of the nomogram models was evaluated by Harrell’s concordance index (C-index). For an informative model, the value of the C-index ranges from 0.5 to 1.0, with 0.5 implying random chance and 1.0 indicating perfect prediction. Calibration curves of the nomogram models for OS were plotted to measure the agreement between predicted and actual outcomes. In addition, the optimal cutoff values for the nomogram scores in terms of OS were determined with X-tile software,15 and patients were divided into three risk groups (high, intermediate, low) according to total prognostic scores (TPS). To further assess the performance of the nomogram model, we also evaluated the nomograms in the validation cohort. Statistical analyses were performed using R (version 3.6.2, http://www.r-project.org/). Two-sided P-values of <0.05 were considered statistically significant.
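As an illustration of how Harrell's C-index measures discrimination, the sketch below counts concordant pairs directly. It is a simplified version written for this explanation, not the routine used in the study: pairs tied in follow-up time are skipped, ties in risk score count as half, and the function name and toy data are assumptions.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the patient who died earlier
    had the higher risk score (for example, a higher nomogram total)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue            # a pair must be anchored on an observed death
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data (hypothetical): the patient censored at 15 has a slightly higher
# score than the patient who died at 10, giving one discordant pair out of 5.
c_toy = harrell_c_index([5, 10, 15, 20], [1, 1, 0, 1], [300, 250, 260, 200])
```

A pair is comparable only when the shorter follow-up ends in an observed death, so censoring reduces the number of usable pairs rather than biasing the count.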

Results

Baseline Characteristics

The workflow of the SEER data extraction used in our study is illustrated in Figure 1. A total of 10,872 very young (age ≤35 years) female patients who underwent surgery for BC and were registered in the SEER database from 2005 to 2015 were enrolled in our study. These patients were used as the training cohort. Meanwhile, a total of 1002 very young female patients who underwent surgery for BC between 2003 and 2018 were selected from our center. The patients from our center were used as the validation cohort. The detailed demographics and clinicopathological characteristics of all cases are summarized in Table 1.

Table 1 Patient Characteristics of the Training Cohort and the Validation Cohort

More than half of the patients were aged between 31 and 35 years in both cohorts. About seven in ten patients (70.1%) in the training cohort were of White race, while all patients in the validation cohort were Asian. In both cohorts, most patients were diagnosed with tumor grade II/III/IV disease, and the most common histological type was ductal cancer.

Independent Predictors in the Training Cohort

The hazard ratios (HRs) for OS according to all variables in the univariate and multivariate Cox proportional hazards models are shown in Figure 2. In univariate analysis, age, race, histology, grade, ER status, PR status, HER2 status, disease stage, T stage, and N stage were identified as significant prognostic factors for OS (Figure 2A). When those variables were further analyzed in the multivariate analysis, age (P=0.015), race (P<0.001), grade (P<0.001), ER status (P<0.001), PR status (P<0.001), HER2 status (P<0.001), T stage (P<0.001), and N stage (P<0.001) remained statistically significant, indicating that they are independent predictors of OS (Figure 2B). The associations between several important predictors (race, grade, T stage, N stage) and OS are further illustrated in Figure 3. The results showed that OS was significantly shorter for very young BC patients of Black race than for other races (Figure 3A). The survival curves for OS stratified by grade, T stage, and N stage separated quite well, with high grade, high T stage, and high N stage having the worst OS (Figure 3B–D).

Figure 2 (A) Univariate and (B) multivariate analysis of overall survival for the training cohort.

Figure 3 Kaplan-Meier curves for overall survival stratified by (A) race, (B) grade, (C) T stage, and (D) N stage in the training cohort.

Prognostic Nomogram Building and Validation

Based on the independent predictors of OS identified in the multivariate analysis of the training cohort, nomograms were formulated to predict 3-, 5-, and 10-year OS (Figure 4). The model’s explanatory covariables consisted of age, race, grade, ER status, PR status, HER2 status, T stage, and N stage. Each level of each variable was assigned a score on the points scale. By adding the scores for the selected variables, a total score was obtained for each patient. The 3-, 5-, and 10-year survival probabilities of each patient could then be read from the corresponding scales at the position of the total score. The nomogram showed that N stage and T stage contributed the most to prognosis, followed by grade, race, PR status, ER status, and age. Higher total scores corresponded to inferior OS. For instance, for a White woman aged 21–30 with a T2N1, grade III, ER-negative, PR-negative and HER2-unknown BC, the total score for all variables was 303, which corresponded to 3-, 5-, and 10-year OS rates of about 88.6%, 80.5%, and 66.2%, respectively.
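The link between a nomogram total score and the predicted survival probability follows from the general form of the Cox proportional hazards model. The expressions below are the standard textbook form, given here only as background; the symbols $h_0$, $S_0$, $\beta_k$, and $x_k$ are notation introduced for this illustration and the article does not report the fitted coefficients or baseline survival.

```latex
\begin{aligned}
h(t \mid \mathbf{x}) &= h_0(t)\,\exp\!\Big(\sum_{k} \beta_k x_k\Big),\\
S(t \mid \mathbf{x}) &= S_0(t)^{\exp\left(\sum_{k} \beta_k x_k\right)},
\qquad t \in \{3, 5, 10\}\ \text{years}.
\end{aligned}
```

Here each $x_k$ is an indicator for one level of a covariable (for example, a given T stage), and $\sum_k \beta_k x_k$ is the linear predictor. Nomogram points are typically a linear rescaling of each term $\beta_k x_k$, so a larger total score corresponds to a larger linear predictor and therefore a lower $S(t \mid \mathbf{x})$.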

Figure 4 Nomograms predicting 3-, 5-, and 10-year overall survival for the training cohort.

The predictive accuracy of the nomogram system was evaluated by calculating Harrell’s C-index. The C-index values of the nomogram were 0.727 (95% confidence interval [CI]: 0.714–0.740) and 0.722 (95% CI: 0.666–0.778) for OS in the training and validation cohorts, respectively, both of which exceeded 0.7, the value expected for a system that predicts OS accurately. The C-index of the nomogram in the training cohort is shown in Supplementary Figure 1. The results showed that the C-index values for OS at 3, 5, and 10 years were 0.802, 0.735, and 0.697, respectively.

The calibration curves for the probability of OS at 3, 5, and 10 years in the training and validation cohorts showed good agreement between the nomogram predictions and actual observations (Figure 5). Next, we divided patients into the following 3 groups based on the TPS of the nomogram model for OS in the training cohort using the X-tile program: low-risk (TPS, 156–239, 3032 patients), intermediate-risk (TPS, 240–290, 5252 patients), and high-risk (TPS, 291–405, 2588 patients) groups. The 10-year OS rates for the low-risk, intermediate-risk, and high-risk groups were 92.1%, 81.7%, and 60.4%, respectively. Survival analyses for OS demonstrated significant discrimination between these three groups (P<0.001, Figure 6A). For the validation cohort, the patients were divided into the same 3 groups based on the TPS of the nomogram model for OS: low-risk group (495 patients), intermediate-risk group (396 patients) and high-risk group (111 patients). Significant OS differences were also observed among the three subgroups, with 10-year OS rates of 93.5%, 76.8%, and 61.0% for the low-risk, intermediate-risk, and high-risk groups, respectively (P<0.001, Figure 6B).
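The risk stratification described above amounts to applying two cut points to the total prognostic score. The sketch below encodes the reported TPS ranges and group sizes; the function and constant names are assumptions, and applying the same cut points to scores outside the observed training range (156–405) is an extrapolation made only for this illustration.

```python
# Upper bounds of the low- and intermediate-risk TPS ranges reported for
# the training cohort (156-239 and 240-290; high risk is 291-405).
LOW_MAX = 239
INTERMEDIATE_MAX = 290

def risk_group(tps):
    """Map a total prognostic score to the reported risk group."""
    if tps <= LOW_MAX:
        return "low"
    if tps <= INTERMEDIATE_MAX:
        return "intermediate"
    return "high"

# Group sizes reported in the article
TRAINING_COUNTS = {"low": 3032, "intermediate": 5252, "high": 2588}
VALIDATION_COUNTS = {"low": 495, "intermediate": 396, "high": 111}

# The worked nomogram example in the text has a total score of 303
example_group = risk_group(303)
```

Under these cut points the worked example with a total score of 303 falls in the high-risk group, and the reported group sizes add up to the full cohorts of 10,872 and 1002 patients.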

Figure 5 Calibration plots of the nomogram for 3-, 5-, and 10-year overall survival prediction in the training cohort (A–C), and 3-, 5-, and 10-year overall survival prediction in the validation cohort (D–F).

Figure 6 Kaplan-Meier curves for overall survival stratified by risk groups based on total prognostic scores from the nomogram model for (A) the training cohort and (B) the validation cohort.

Discussion

BC occurs at a younger median age in Chinese women than in western women. The reasons may largely relate to genetic differences and risk factors.3 BC in young women often presents at more advanced stages of disease at diagnosis, which might be due to the lack of screening programs for this age group. BC in young women is often diagnosed as triple-negative or HER2-positive disease. Furthermore, young age is an independent negative predictor of BC survival.4,5 Young age also correlates with the risk of local recurrence and contralateral BC.5,16 With increasing BC awareness, the proportion of BC diagnosed in young women in China is expected to increase. BC in young women presents a clinical challenge; thus, it is necessary to establish a model to predict prognosis among very young patients, to aid in personalized treatment for these patients. In the present study, we constructed a comprehensive nomogram model to better predict the prognosis of very young BC patients based on the SEER database. We believe that, with the inclusion of the SEER database, this nomogram based on 8 variables (age, race, grade, ER status, PR status, HER2 status, T stage, and N stage) allows a more accurate assessment and prediction of outcomes for very young BC patients who have received curative resection surgery.

The 5-year survival of patients with BC aged ≤35 years is 75–80%, while the 5-year survival for patients aged >35 years is 80–85%.17 In our study, the 5-year survival was 88.1% in the training cohort and 91.5% in the validation cohort. The favorable prognosis in our study can be explained by the early stage at diagnosis, since only patients who underwent surgery were included in the analysis. The incidence of BC appears to be lower in women aged <30 years and even lower in women aged <20 years, although survival was shorter among the former and much shorter among the latter. These results suggest that the youngest BC patients have the lowest survival rate.

In our study, Black BC patients of very young age had significantly inferior OS compared with very young patients of other races. This result is consistent with the results from other studies, which showed that Black women are more likely to die from BC at every age.18,19 Possible reasons include lower awareness of symptoms, late diagnosis, genetic differences, and other unidentified cultural factors. Previous studies have shown that HER2-positive and triple-negative subtypes are associated with the shortest survival.20 In our study, however, HER2-positive status was associated with a better prognosis than HER2-negative status in very young patients with operable BC. This result might be explained by the routine use of HER2-targeted therapy for the HER2-positive subtype21 and the confounding effect of unknown HER2 status.

As expected, the traditional critical prognostic factors including T stage, N stage, and grade showed strong correlations with survival outcomes in our nomogram. Likewise, we found that ER-negative status and PR-negative status were negative predictors of survival in very young BC patients, although some patients with unknown ER or PR status were included in the analysis. Adjuvant tamoxifen therapy substantially improved long-term survival in women with ER-positive tumors and in women with ER-unknown tumors. After 5 years of tamoxifen therapy, the proportional reductions in mortality were 3%, 21% and 23% in ER-poor, ER-unknown, and ER-positive tumors, respectively.22 PR was also a prognostic marker and a strong predictor of tamoxifen treatment response.23 In univariate analysis, histology was identified as a significant prognostic factor for OS. However, our multivariate analysis failed to identify histology as a significant predictor of OS. The 5th edition of the World Health Organization classification subdivided BC into more than 20 distinctive histological subtypes based on cell morphology, growth, and histological patterns.24 Several histological subtypes are associated with an extremely favorable prognosis, such as mucinous carcinoma, tubular carcinoma, papillary carcinoma, adenoid cystic carcinoma, and cribriform carcinoma.25 The main reason that histology was not retained in the multivariate model might be confounding by other factors.

To the best of our knowledge, this is the first study to construct a nomogram predicting the overall survival of BC in very young patients based on a large dataset extracted from the SEER cohort. We also validated the constructed nomogram using an independent retrospective cohort from our center. The calibration curves showed a high degree of agreement between the predicted and observed survival rates in both the training and the validation cohorts, indicating that the nomogram established in this study is reliable.

Inevitably, some limitations of our study exist. First, this study was retrospective in nature and may have generated inevitable biases. Second, the data from the SEER database used in our study did not contain data about recurrence or treatment, which may affect survival outcomes. Another limitation is that other important factors such as BRCA1/2 mutation26 and the body mass index27 were not included in the database. Those important variables should be considered in future research.

In conclusion, based on a large-scale population from the SEER database, we have constructed a nomogram that accurately predicts survival outcomes for very young female patients with curatively resected BC. The nomogram constructed in this study showed excellent performance in both training and validation cohorts and may serve as an efficient tool for clinicians to predict the 3-, 5-, and 10-year OS of these patients and ultimately help guide individualized treatment.

Patient Data Confidentiality

We confirmed that all the data was anonymized and maintained with confidentiality in this study.

Data Sharing Statement

The authenticity of this article has been validated by uploading the key raw data onto the Research Data Deposit public platform (www.researchdata.org.cn), with the approval RDD number as RDDA2021001996.

Ethics Approval and Informed Consent

Since the SEER database is freely available to the public, use of the data from SEER received exemption from approval by the ethics committee. Ethics approval for retrospective data was obtained from the Medical Ethics Committee of our center. This study was conducted according to the ethical standards of the Declaration of Helsinki. Due to the retrospective nature of this study, informed consent was waived.

Author Contributions

All authors contributed to data analysis, drafting or revising the article, gave final approval of the version to be published, agreed to the submitted journal, and agree to be accountable for all aspects of the work.

Funding

The National Natural Science Foundation of China (No. 81903170) supported this work.

Disclosure

The authors declare no competing interests.

References

1. Sung H, Ferlay J, Siegel RL, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71(3):209–249. doi:10.3322/caac.21660

2. Chen W, Zheng R, Baade PD, et al. Cancer statistics in China, 2015. CA Cancer J Clin. 2016;66(2):115–132. doi:10.3322/caac.21338

3. Fan L, Strasser-Weippl K, Li JJ, et al. Breast cancer in China. Lancet Oncol. 2014;15(7):e279–289. doi:10.1016/S1470-2045(13)70567-9

4. Anders CK, Hsu DS, Broadwater G, et al. Young age at diagnosis correlates with worse prognosis and defines a subset of breast cancers with shared patterns of gene expression. J Clin Oncol. 2008;26(20):3324–3330. doi:10.1200/JCO.2007.14.2471

5. Narod SA. Breast cancer in young women. Nat Rev Clin Oncol. 2012;9(8):460–470. doi:10.1038/nrclinonc.2012.102

6. Leclere B, Molinie F, Tretarre B, et al. Trends in incidence of breast cancer among women under 40 in seven European countries: a GRELL cooperative study. Cancer Epidemiol. 2013;37(5):544–549. doi:10.1016/j.canep.2013.05.001

7. Johnson RH, Chien FL, Bleyer A. Incidence of breast cancer with distant involvement among women in the United States, 1976 to 2009. JAMA. 2013;309(8):800–805. doi:10.1001/jama.2013.776

8. Plichta JK, Ren Y, Thomas SM, et al. Implications for breast cancer restaging based on the 8th Edition AJCC staging manual. Ann Surg. 2020;271(1):169–176. doi:10.1097/SLA.0000000000003071

9. Giuliano AE, Edge SB, Hortobagyi GN. Eighth edition of the AJCC cancer staging manual: breast cancer. Ann Surg Oncol. 2018;25(7):1783–1785. doi:10.1245/s10434-018-6486-6

10. Iasonos A, Schrag D, Raj GV, Panageas KS. How to build and interpret a nomogram for cancer prognosis. J Clin Oncol. 2008;26(8):1364–1370. doi:10.1200/JCO.2007.12.9791

11. Li N, Zhong QQ, Yang XR, et al. Prognostic value of hepatitis B virus infection in very young patients with curatively resected breast cancer: analyses from an endemic area in China. Front Oncol. 2020;10:1403. doi:10.3389/fonc.2020.01403

12. Azim HA Jr, Partridge AH. Biology of breast cancer in young women. Breast Cancer Res. 2014;16(4):427. doi:10.1186/s13058-014-0427-5

13. Eiriz IF, Vaz Batista M, Cruz Tomas T, Neves MT, Guerra-Pereira N, Braga S. Breast cancer in very young women-a multicenter 10-year experience. ESMO Open. 2021;6(1):100029. doi:10.1016/j.esmoop.2020.100029

14. Singletary SE, Connolly JL. Breast cancer staging: working with the sixth edition of the AJCC cancer staging manual. CA Cancer J Clin. 2006;56(1):37–47; quiz 50–31. doi:10.3322/canjclin.56.1.37

15. Camp RL, Dolled-Filhart M, Rimm DL. X-tile: a new bio-informatics tool for biomarker assessment and outcome-based cut-point optimization. Clin Cancer Res. 2004;10(21):7252–7259. doi:10.1158/1078-0432.CCR-04-0713

16. Reiner AS, Watt GP, John EM, et al. Smoking, radiation therapy, and contralateral breast cancer risk in young women. J Natl Cancer Inst. 2021. doi:10.1093/jnci/djab047

17. Anders CK, Johnson R, Litton J, Phillips M, Bleyer A. Breast cancer before age 40 years. Semin Oncol. 2009;36(3):237–249. doi:10.1053/j.seminoncol.2009.03.001

18. DeSantis CE, Ma J, Goding Sauer A, Newman LA, Jemal A. Breast cancer statistics, 2017, racial disparity in mortality by state. CA Cancer J Clin. 2017;67(6):439–448. doi:10.3322/caac.21412

19. Kmietowicz Z. Young black UK women are less likely than young white women to survive breast cancer. BMJ. 2013;347:f6413. doi:10.1136/bmj.f6413

20. Sorlie T, Perou CM, Tibshirani R, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A. 2001;98(19):10869–10874. doi:10.1073/pnas.191367098

21. Goutsouliak K, Veeraraghavan J, Sethunath V, et al. Towards personalized treatment for early stage HER2-positive breast cancer. Nat Rev Clin Oncol. 2020;17(4):233–250. doi:10.1038/s41571-019-0299-9

22. Early Breast Cancer Trialists’ Collaborative Group. Tamoxifen for early breast cancer: an overview of the randomised trials. Lancet. 1998;351(9114):1451–1467. doi:10.1016/S0140-6736(97)11423-4

23. Stendahl M, Ryden L, Nordenskjold B, Jonsson PE, Landberg G, Jirstrom K. High progesterone receptor expression correlates to the effect of adjuvant tamoxifen in premenopausal breast cancer patients. Clin Cancer Res. 2006;12(15):4614–4618. doi:10.1158/1078-0432.CCR-06-0248

24. Yang WT, Bu H. [Updates in the 5(th) edition of WHO classification of tumours of the breast]. Zhonghua Bing Li Xue Za Zhi. 2020;49(5):400–405. (Chinese).

25. Zhang H, Zhang N, Moran MS, et al. Special subtypes with favorable prognosis in breast cancer: a registry-based cohort study and network meta-analysis. Cancer Treat Rev. 2020;91:102108. doi:10.1016/j.ctrv.2020.102108

26. Kehl KL, Giordano SH. BRCA1 and BRCA2 testing among young breast cancer survivors. JAMA Oncol. 2016;2(5):688–689. doi:10.1001/jamaoncol.2016.0976

27. Ligibel JA, Cirrincione CT, Liu M, et al. Body mass index, PAM50 subtype, and outcomes in node-positive breast cancer: CALGB 9741 (Alliance). J Natl Cancer Inst. 2015;107(9). doi:10.1093/jnci/djv179

Creative Commons License © 2021 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.