Journal of Inflammation Research, Volume 18
The Persistent Threat of Chronic Inflammation on the Mortality Among Cervical Cancer Survivors: A Mendelian Randomization and Machine Learning Analysis Using UK Biobank and Chinese Cohort Data
Authors Wang J, Chen Z, Guan M, Ma Z, Peng L, Chen J, Fiori PL, Carru C, Capobianco G, Coradduzza D, Zhou L
Received 25 March 2025
Accepted for publication 22 July 2025
Published 30 July 2025 Volume 2025:18 Pages 10267—10282
DOI https://doi.org/10.2147/JIR.S528121
Checked for plagiarism Yes
Review by Single anonymous peer review
Peer reviewer comments 4
Editor who approved publication: Dr Felix Marsh-Wakefield
Jing Wang,1,2,* Zhichao Chen,3,* Mingfei Guan,4 Zebiao Ma,4 Lin Peng,5 Jiongyu Chen,5 Pier Luigi Fiori,2 Ciriaco Carru,2 Giampiero Capobianco,6 Donatella Coradduzza,2 Li Zhou4
1Department of Obstetrics and Gynecology, Second Affiliated Hospital of Shantou University Medical College, Shantou, People’s Republic of China; 2Department of Biomedical Sciences, University of Sassari, Sassari, Italy; 3Department of Cardiology, Second Affiliated Hospital of Shantou University Medical College, Shantou, People’s Republic of China; 4Department of Gynecologic Oncology, Cancer Hospital of Shantou University Medical College, Shantou, People’s Republic of China; 5Department of Central Laboratory, Cancer Hospital of Shantou University Medical College, Shantou, People’s Republic of China; 6Gynecologic and Obstetric Clinic, Department of Medicine, Surgery and Pharmacy, University of Sassari, Sassari, Italy
*These authors contributed equally to this work
Correspondence: Li Zhou, Department of Gynecologic Oncology, Cancer Hospital of Shantou University Medical College, Shantou, People’s Republic of China, Email [email protected]
Purpose: The association between inflammatory dysregulation and cervical carcinogenesis and progression has not yet been fully elucidated. We aimed to comprehensively evaluate the genetic association between inflammation and cervical cancer, and construct an accurate prognosis model based on circulating inflammatory parameters and indexes with machine learning (ML) algorithms.
Patients and Methods: We tested the genetic association between circulating inflammatory molecules (CIMs; 91 circulating inflammatory cytokines and 10 inflammatory cell counts) and cervical cancer using summary data retrieved from the UK Biobank (cases = 1659; controls = 381,902) with two-sample Mendelian randomization (MR) and colocalization analyses. Nine ML models and a logistic regression (LR) model were developed for prognosis prediction in 1042 subjects with cervical cancer (randomly allocated to training and validation cohorts at a 6:4 ratio).
Results: Three potentially causal CIMs for cervical cancer were identified via two-sample MR. However, neither reverse MR nor Bayesian colocalization analyses supported shared causal variation. After feature selection with three algorithms (LASSO regression, Boruta, and support vector machine), the gradient boosting machine (GBM) model outperformed the other models, achieving an area under the curve (AUC) of 0.930 and a Brier score of 0.027 in 1-year overall survival (OS) prediction. Similarly, the GBM model delivered the best overall performance in 5-year OS prediction, with an AUC of 0.893 and a Brier score of 0.089. According to Shapley Additive Explanations (SHAP), the lymphocyte monocyte ratio, neutrophil count, platelet count, and platelet lymphocyte ratio were associated with 1-year OS, while the systemic immune-inflammation index, platelet neutrophil ratio, and monocyte count were significantly related to 5-year OS.
Conclusion: No substantial causal associations were observed between CIMs and cervical cancer. The cohort study findings reveal the persistent impact of inflammation on cervical cancer prognosis, highlighting the crucial role of chronic inflammation when investigating the biomarkers of cervical cancer progression and developing pharmacological interventions. The GBM model consistently achieved satisfactory performance in cervical cancer prognosis prediction with demographics and CIMs, meriting further validation and potential clinical implementation.
Keywords: cervical cancer, Mendelian randomization, colocalization analysis, machine learning, inflammation, overall survival
Introduction
Cervical cancer has been recognized as the most prevalent female reproductive malignancy and the third leading cause of cancer mortality among young women.1 In 2024, the American Cancer Society reported 13,820 new cervical cancer cases and 4360 deaths within the United States.2 Owing to HPV vaccination and screening for precancerous lesions, the incidence of cervical cancer has declined significantly over recent decades.3 Nevertheless, African cancer statistics show that cervical carcinoma was the leading malignant tumor among women in 19 countries in 2020.4 Despite tremendous progress in the prevention and treatment of cervical cancer over the last decades, concerted efforts are still required to combat the disease.
Emerging studies suggest that inflammation is an important contributor to the mutation of dormant cells and to the initiation of cervical carcinogenesis, leading to cancer development. In particular, in individuals infected with high-risk types of human papillomavirus (HPV), chronic and persistent inflammation disrupts the cell cycle of cervical squamous and columnar cells. This disruption has been well characterized and may result in genetic mutations, including oncogene amplification and chromosomal instability.5,6 However, only 3% of women infected with high-risk HPV will ultimately progress to cervical cancer.7 Thus, in addition to HPV infection, inflammation and immune response may directly or indirectly participate in the oncogenesis of cervical cancer.8 Cytokines and chemokines are major effectors in the underlying mechanisms of tumorigenesis, invasion, and metastasis during chronic systemic inflammatory responses.9 Recent cellular and molecular experiments revealed that interleukin 6 (IL-6) regulates the survival and proliferation of cervical cancer cells via the Ras-MAPK signaling pathway.10 Prospective investigation has also associated serum IL-6 and IL-1β levels with cervical cancer risk. Cytokines such as IL-1β can induce cell-mediated immunity and tumor development. IL-1β promotes chronic inflammation, epithelial-mesenchymal transition, immune evasion, and tumor progression by activating critical signaling pathways such as NF-κB and MAPK, thereby contributing to the pathogenesis of cervical cancer.11
The role of inflammation in cervical cancer prognosis remains a challenge, with a large number of studies focusing on inflammation-induced cancer mechanisms to assess the impact of inflammatory factors on prognosis.12–14 Current research has concentrated on putative molecular markers for cervical cancer, including microRNAs, circRNAs, long noncoding RNAs, DNA methylation, and exosomes.15 Nevertheless, these molecular markers are not widely used in clinics because of high testing costs and scarce large-scale validation. Routine blood test parameters and the indexes calculated from them are low-cost and better suited to clinical needs. In this study, we aimed to construct survival prediction models using a series of circulating inflammatory cells and factors obtained from routine blood tests, so that patients diagnosed with cervical cancer can receive appropriate immunotherapy and achieve a better prognosis.
Machine learning (ML) has been widely applied to various fields of clinical practice for its statistical power and data integration capacity. For massive electronic medical records, ML has been acknowledged as a highly valid method for decision-making and prognostic prediction.16,17 We systematically searched PubMed and Web of Science on Dec 10, 2024, for articles containing the terms (“machine learning”, “artificial intelligence”, or “deep learning”) AND “cervical cancer” AND “prognostic model” without date or language restrictions, and found that studies mining the prognostic value of inflammatory parameters or indexes in cervical cancer were still rare. Therefore, we sought to develop models to predict 1-year and 5-year overall survival among individuals with cervical cancer by integrating circulating inflammatory molecules (CIMs) via machine learning algorithms. In addition, it was unclear from observational and retrospective studies whether the observed associations reflected a causal effect of the exposure or arose from confounding factors such as environmental or socio-economic factors. MR analyses use genetic variants associated with the exposure as instrumental variables. Since genetic variants are randomly allocated, the study design should be less susceptible to confounding, reverse causation, and other forms of bias.18 Thus, we also conducted a two-sample Mendelian randomization (MR) analysis, as a complement to the retrospective cohort study, to explore the potential causal association between circulating inflammatory molecules and cervical cancer.
Materials and Methods
Two-Sample MR Analysis
We performed two-sample MR analyses to evaluate the causal effects of 91 circulating inflammatory cytokines and 10 inflammatory cell counts (Supplementary Data 1 and https://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/) on the risk of cervical cancer. Within the UK Biobank (https://www.ukbiobank.ac.uk), summary data were accessible for 1659 cervical cancer cases and 381,902 controls from the United Kingdom.19 All single-nucleotide polymorphisms (SNPs) related to the inflammatory process at a significance threshold of P < 5 × 10⁻⁶ were selected (Supplementary Data 2). MR analyses were conducted using the inverse variance weighted (IVW) method as the principal approach, supplemented by four additional methods.20–23 We assessed heterogeneity and directional pleiotropy using leave-one-out, Cochran’s Q, MR-PRESSO, and Egger intercept tests to check whether the MR assumptions held.24 To further examine reverse causality and potentially causal SNPs, reverse MR and colocalization analyses were performed using the methods described earlier.25 To verify the findings, we conducted a replication analysis using cervical cancer data from another data set (FinnGen consortium).26 The significance threshold was P < 0.05 (two-tailed). Because multiple relationships were tested, a Bonferroni-corrected threshold (P < 0.05/101) was also applied.27 For colocalization analysis, a posterior probability for H4 (PP4) > 80% was deemed evidence of colocalization between inflammatory biomarkers and cervical cancer.25
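To illustrate the principal estimator, the following is a minimal sketch of a fixed-effect IVW estimate and the accompanying Cochran's Q statistic computed from SNP-level summary statistics. The function name `ivw_estimate` and the input values are hypothetical and are not taken from the study, which presumably relied on established MR software rather than hand-written code.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse variance weighted (IVW) MR estimate.

    Each SNP yields a Wald ratio (beta_outcome / beta_exposure); the ratios
    are pooled with first-order weights beta_exposure**2 / se_outcome**2.
    """
    weights = [bx ** 2 / sy ** 2 for bx, sy in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    beta_ivw = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se_ivw = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviation of each ratio from the pooled estimate
    q_stat = sum(w * (r - beta_ivw) ** 2 for w, r in zip(weights, ratios))
    return beta_ivw, se_ivw, q_stat


# Hypothetical SNP-level summary statistics (NOT from the study)
beta_x = [0.10, 0.20, 0.15]    # SNP-exposure effects
beta_y = [0.03, 0.06, 0.045]   # SNP-outcome effects (log odds)
se_y = [0.01, 0.02, 0.015]     # standard errors of SNP-outcome effects

beta, se, q = ivw_estimate(beta_x, beta_y, se_y)
odds_ratio = math.exp(beta)    # causal estimate on the odds ratio scale
```

In this toy input every SNP has the same Wald ratio, so the pooled estimate equals that ratio and Q is essentially zero; heterogeneous instruments would inflate Q.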
Study Population
This retrospective, longitudinal cohort study analyzed 1042 cases of cervical cancer admitted to the Cancer Hospital of Shantou University Medical College. All patients had a pathologically confirmed diagnosis of cervical cancer and were recruited from January 1, 2014, through December 31, 2021. To reduce interference with the outcomes, we excluded subjects meeting any of the following criteria: a history of acute inflammatory disease within one month or chronic inflammatory disease (n = 36), HIV infection (n = 1), loss to follow-up (n = 17), no available inflammatory parameter tests (n = 2), or chemotherapy or radiotherapy received before laboratory testing (n = 7). The final enrolled individuals were randomly allocated to the training set (n = 626) and validation set (n = 416) in a 60:40 ratio (Figure 1).
Figure 1 Study flowchart.
Ethical Statement and Protocols
The study was conducted in accordance with the Declaration of Helsinki. The study protocols were approved by the ethics committees of the Cancer Hospital of Shantou University Medical College (Approval No. 2025009). All patient identities were anonymized and replaced with hospitalization numbers; therefore, the requirement for written informed consent was waived by the ethics committees. This study strictly followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines.28 Development of the models adhered to the TRIPOD checklist, which is provided as Supplementary Table 1.29
Exposure, Outcome, and Other Variables
Cervical cancer was defined according to the International Classification of Diseases, 10th Edition (ICD-10), code C53. Laboratory tests, including C-reactive protein (CRP), white blood cell count (WBC), platelet count (PLT), lymphocyte count (LC), neutrophil count (NC), mean platelet volume (MPV), and platelet distribution width (PDW), were performed at the Department of Clinical Biochemistry, the Cancer Hospital of Shantou University Medical College, and were retrieved directly from the electronic medical system. All laboratory items included in the cohort are described in Supplementary Table 2. The relevant inflammatory indices were calculated using the following formulas:
Neutrophil lymphocyte ratio (NLR) = NC / LC;
Platelet lymphocyte ratio (PLR) = PLT / LC;
Lymphocyte monocyte ratio (LMR) = LC / MC, where MC is the monocyte count;
Platelet neutrophil ratio (PNR) = PLT / NC;
Systemic immune-inflammation index (SII) = PLT × NLR.30
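As a worked illustration, the sketch below derives the ratio-based indices from routine blood counts, assuming the conventional definitions of NLR, PLR, LMR, and PNR together with SII = PLT × NLR as stated above. The function name `inflammatory_indices` and the input values are hypothetical, not patient data.

```python
def inflammatory_indices(plt, nc, lc, mc):
    """Derive ratio-based inflammatory indices from routine blood counts.

    All counts are assumed to share the same unit (e.g. 10^9 cells/L).
    """
    nlr = nc / lc
    return {
        "NLR": nlr,        # neutrophil lymphocyte ratio
        "PLR": plt / lc,   # platelet lymphocyte ratio
        "LMR": lc / mc,    # lymphocyte monocyte ratio
        "PNR": plt / nc,   # platelet neutrophil ratio
        "SII": plt * nlr,  # systemic immune-inflammation index = PLT x NLR
    }


# Hypothetical blood counts (not patient data)
indices = inflammatory_indices(plt=250.0, nc=4.0, lc=2.0, mc=0.5)
```

With these inputs, NLR is 2.0, so SII is 250 × 2.0 = 500.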
Multiple clinical features were collected for model construction, including age, body mass index (BMI, kg/m²), age at menarche, menopause, education level, history of hypertension and diabetes, length of hospital stay, FIGO stage, histological type of tumor, lymph node metastases, recurrence and metastases, lines of therapy (LOTs), types of lumpectomy, bilateral salpingo-oophorectomy (BSO) and pelvic lymphadenectomy.
1-year overall survival (OS) and 5-year OS were considered primary outcomes. Survival status was determined from medical records, telephone interviews, and any legal proof of death. The end of follow-up was defined as the latest date of confirmation of vital status, from 2017 to July 2024. The median follow-up duration for the included patients was 68 months.
Data Preprocessing
The 16 clinical features together with 14 CIMs were input as candidates for model construction. For variables with <5% missing data, we performed multiple imputation by chained equations (MICE): each incomplete variable was iteratively regressed on the other covariates across five imputed datasets until convergence, with pooled estimates derived per Rubin’s rules.31 The Synthetic Minority Over-sampling Technique (SMOTE) was used to address class imbalance; it generates synthetic minority-class samples through linear interpolation between existing minority instances and their k-nearest neighbors, preserving the data distribution while enhancing classifier performance.32 The details of missing data and proportions for all variables are provided in Supplementary Figure 1.
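The interpolation step behind SMOTE can be sketched in a few lines. This is a simplified illustration only, not the implementation used in the study; the function name `smote_sample`, its parameters, and the sample points are hypothetical.

```python
import math
import random


def smote_sample(minority, k=2, n_new=4, seed=0):
    """Generate synthetic minority samples by linear interpolation.

    For each new sample: pick a minority point, pick one of its k nearest
    minority neighbours, and interpolate at a random position between them.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority points
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical 2-feature minority-class samples (not study data)
minority_points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sample(minority_points)
```

Because every synthetic point lies on a segment between two real minority samples, it stays within the region already occupied by the minority class.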
Feature Selection and Model Construction
We used logistic regression (LR) to rank the prognostic accuracy of each CIM in predicting the survival of cervical cancer patients, deriving receiver operating characteristic (ROC) curves and area under the curve (AUC) values. The following three feature mining methods were then employed sequentially to identify important features: the least absolute shrinkage and selection operator (LASSO), Boruta, and support vector machine recursive feature elimination (SVM-RFE).
LASSO reduces prediction error by shrinking some of the penalized regression coefficients to zero and selecting the variables whose coefficients remain non-zero after shrinkage. In this study, we tuned the model parameters using 5-fold cross-validation and 100 bootstrap iterations followed by a randomized hyperparameter search to find the optimal lambda value.33
Boruta is a random forest (RF) based feature selection algorithm that eliminates statistically irrelevant features using an iterative strategy. The Boruta algorithm consists of the following major steps:
1. Extend the information system with shadow copies of all variables, ensuring that at least 5 shadow attributes are included.
2. Shuffle the added attributes to remove their correlations with the response.
3. Apply an RF classifier to the extended information system and calculate Z scores.
4. Determine the maximum Z score among shadow attributes (MZSA) and assign a hit to any attribute that outperforms it.
5. For each attribute of undetermined importance, perform a two-sided significance test against the MZSA.
6. Mark attributes whose importance is significantly lower than the MZSA as unimportant and permanently remove them from the information system.
7. Mark attributes whose importance is significantly higher than the MZSA as important.
8. Remove the shadow attributes.
9. Repeat until all attributes are assigned an importance or the algorithm reaches the set limit of RF runs.34
SVM-RFE is a feature selection process used to remove variables in a dataset that are statistically uncorrelated with the categorical outcomes. The SVM-RFE algorithm is fundamentally a backward elimination process: the least relevant variables are eliminated first and the highest-ranked variables last. For instance, the p-ranked variable represents the least relevant feature within a model containing the 1 to p-ranked variables.35 The final feature set was obtained as the intersection of the variables identified by the three feature selection algorithms (LASSO, Boruta, and SVM-RFE), visualized using a Venn diagram. Supplementary Table 3 provides the basic principles and characteristics of the machine learning algorithms employed in this study.
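The final intersection step can be expressed directly with set operations. In the sketch below, the three selected-feature sets are hypothetical placeholders rather than the outputs obtained in the study.

```python
# Hypothetical outputs of the three selectors (feature names are illustrative only)
lasso_selected = {"age", "FIGO_stage", "NLR", "PLT", "LMR", "CRP"}
boruta_selected = {"age", "FIGO_stage", "NLR", "PLT", "LMR", "MPV"}
svm_rfe_selected = {"FIGO_stage", "NLR", "PLT", "LMR", "PDW"}

# Keep only features chosen by all three algorithms
# (the central region of the three-set Venn diagram)
final_features = sorted(lasso_selected & boruta_selected & svm_rfe_selected)
```

Taking the intersection is a conservative choice: a feature is retained only if all three methods agree, at the cost of discarding features favored by a single method.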
Ten models, comprising nine ML algorithms (support vector machine (SVM), gradient boosting machine (GBM), neural network (NNT), random forest (RF), XGBoost, K-nearest neighbor (KNN), AdaBoost, LightGBM (LGBM), and CatBoost) and traditional logistic regression (LR), were constructed to predict OS in cervical cancer. The optimal hyperparameters were determined through 5-fold cross-validation and grid search on the training set, maximizing prediction accuracy. Subsequently, model generalizability was assessed on the testing set with 100 bootstrap resamples for internal validation. Variable importance was ultimately derived from the best-performing model following the internal validation.
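The bootstrap step can be sketched as follows, assuming that a performance metric (accuracy here) is recomputed on each of 100 resamples of the held-out set drawn with replacement. The function name `bootstrap_accuracy`, the labels, and the predictions are hypothetical.

```python
import random


def bootstrap_accuracy(y_true, y_pred, n_resamples=100, seed=42):
    """Accuracy recomputed on bootstrap resamples of a held-out set."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_resamples):
        # Draw n indices with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        correct = sum(1 for i in idx if y_true[i] == y_pred[i])
        scores.append(correct / n)
    return scores


# Hypothetical outcomes and model predictions (not study data)
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1, 1, 0]

scores = bootstrap_accuracy(y_true, y_pred)
mean_accuracy = sum(scores) / len(scores)
ordered = sorted(scores)
# Approximate 95% percentile interval from 100 resamples
lower, upper = ordered[2], ordered[97]
```

The spread of the resampled scores gives a rough indication of how stable the metric is on a set of this size.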
Model Evaluation and Validation
To verify the effectiveness of models in the validation cohort, we assessed the model performance using AUC value, accuracy, specificity, sensitivity, Brier score, calibration curve, and decision curve analyses. The following formulas were applied to calculate the relevant indicators:
$\mathrm{Accuracy} = \dfrac{TP + TN}{TP + TN + FP + FN}$;
$\mathrm{Sensitivity} = \dfrac{TP}{TP + FN}$;
$\mathrm{Specificity} = \dfrac{TN}{TN + FP}$;
$\mathrm{Brier\ score} = \dfrac{1}{N}\sum_{i=1}^{N}\left(p_i - o_i\right)^2$.
In the formulas, TP represents true positive, FP stands for false positive, TN is true negative, and FN means false negative. In calculating the Brier score, $N$ stands for the sample size, $p_i$ represents the predicted probability of OS for the $i$-th patient, and $o_i$ is the actual OS outcome.36 Shapley Additive Explanations (SHAP), proposed by Lundberg et al, provide a theoretical framework for interpreting ML prediction models.37 The SHAP value quantifies the impact of features on prediction outcomes by assessing the marginal contribution of each feature to the model output. It decomposes the prediction into a linear combination of feature contributions:
$f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i$
In this formula, $f$ stands for the prediction model, $M$ is the number of input indicators, $\phi_0$ represents a constant, and $\phi_i$ denotes the SHAP value of the $i$-th feature.
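A minimal sketch of the threshold-based indicators and the Brier score, using their standard definitions, is shown below. The function name `classification_metrics`, the 0.5 threshold, and the toy data are assumptions for illustration only.

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, specificity and Brier score for binary outcomes.

    y_true: observed outcomes (1 = event, 0 = no event)
    y_prob: predicted probabilities of the event
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = len(y_true)
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        # Mean squared difference between predicted probability and outcome
        "brier": sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / n,
    }


# Hypothetical outcomes and predicted probabilities (not study data)
observed = [1, 0, 1, 0, 0]
predicted = [0.9, 0.2, 0.4, 0.1, 0.6]
metrics = classification_metrics(observed, predicted)
```

Unlike accuracy, sensitivity, and specificity, the Brier score does not depend on the classification threshold; lower values indicate better-calibrated and more accurate probabilities.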
Statistical Analysis
Categorical variables are presented as frequencies (percentages) and continuous variables as medians (interquartile ranges). The Wilcoxon rank-sum test was used to compare continuous variables with non-normal distributions between the two groups.38 Statistical significance was defined as a two-tailed P-value < 0.05. All statistical analyses were performed using IBM SPSS Statistics, version 26.0, and R statistical software, version 4.3.0.
Results
Two Sample MR Study and Colocalization Analysis
In total, 4558 inflammation-related SNPs with F-statistics ranging from 18.65 to 4261.05 per instrumental variable (IV) were selected (see Supplementary Data 3). With IVW as the primary method, we found suggestive evidence of causal relationships between overall cervical cancer risk and IL-8 (odds ratio [OR] = 1.413, P = 0.036), leukemia inhibitory factor (LIF) (OR = 1.400, P = 0.044), and WBC (OR = 0.774, P = 0.003) (Figures 2 and 3A, Supplementary Data 3). The sensitivity analyses (Cochran’s Q, Egger intercept, MR-PRESSO, and leave-one-out tests) showed little evidence of heterogeneity or pleiotropy (Supplementary Data 4 and 11). We did not observe a reverse causal relationship between cervical cancer (as exposure) and the CIMs (IL-8, LIF, and WBC) (see Supplementary Data 5). However, in contrast to the findings from traditional MR methods, colocalization analysis revealed no shared causal variation (PP4 < 0.8) between the three genetically predicted CIMs and overall cervical cancer risk (Figure 3 and Supplementary Data 6). Moreover, the suggestive causal associations between CIMs and cervical cancer risk were no longer significant after Bonferroni correction (threshold P < 0.0005). Replication using summary genetic data from the FinnGen consortium yielded similar estimates, indicating no significant genetic association between CIMs and cervical cancer risk. The relevant results of the replication analysis are provided in Supplementary Data 7–10.
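The effect of the multiple-testing correction on the three suggestive associations can be checked with simple arithmetic. The sketch below uses the P values reported above and the 101 exposures tested; the variable names are illustrative.

```python
# IVW P values reported above for the three suggestive associations
reported_p = {"IL-8": 0.036, "LIF": 0.044, "WBC": 0.003}

n_tests = 101                      # 91 cytokines + 10 inflammatory cell counts
bonferroni_threshold = 0.05 / n_tests

# Which associations pass the nominal and the corrected thresholds?
nominal = {name: p < 0.05 for name, p in reported_p.items()}
corrected = {name: p < bonferroni_threshold for name, p in reported_p.items()}
```

The corrected threshold is about 0.000495, so all three associations are nominally significant but none survives the Bonferroni correction.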
Figure 2 Circos plots of Mendelian randomization estimates for the genetic association between inflammation and cervical cancer. (A) 91 circulating inflammatory cytokines; (B) 10 inflammatory cells.
Figure 3 MR analysis forest plot (A) and colocalization analysis results of the causal relationships between inflammatory factors WBC (B), IL-8 (C), LIF (D), and cervical cancer.
Cohort Study
Patient Characteristics
Between January 1, 2014, and December 31, 2021, 1042 patients diagnosed with cervical cancer met the inclusion criteria and were included (Figure 1). The training cohort included 626 subjects, while 416 were randomly allocated to the validation cohort. The median age of the training cohort was 50.0 years old at laboratory testing. Of the 626 cases, a minority were comorbid with hypertension (18.8%) and diabetes mellitus (8.1%). The cancer staging was relatively early, since 53.2% of the cohort were classified as stage I, and 27.6% were classified with stage II according to FIGO 2009 staging. Surgery was performed on 419 individuals (66.9%), and 307 (49%) had a bilateral salpingo-oophorectomy (BSO). Given the lines of therapies (LOTs), more than 40% of patients received a combination of surgery, chemotherapy, and radiation. However, 153 patients (24.4%) were diagnosed with cervical cancer along with lymph node metastases, and 50 (8.0%) sustained tumor recurrence and metastases during follow-up. The one-year and five-year mortality was 3.1% and 16.9% during a median follow-up time of 69 months (Table 1). The patient characteristics were roughly the same in the validation cohort. All laboratory tests and statistics for the 14 inflammatory parameters are listed in Supplementary Table 4.
Table 1 Baseline Characteristics of Included Population
Feature Characteristics
Correlations between the CIMs were assessed using Pearson correlation coefficients and illustrated in a heat map (Supplementary Figure 2). As expected, positive correlations between different inflammatory indicators were observed (WBC and NC, MC and PNR). There were also negative correlations between PLT, PDW, and MPV. We then assessed the predictive value of each single inflammatory marker for overall survival in cervical cancer. The strongest predictor of 1-year OS was NLR (AUC = 0.635), whereas CRP (AUC = 0.602) was the best predictive parameter for 5-year OS. The predictive AUCs of single CIMs ranged from 0.511 to 0.635 for 1-year OS and from 0.467 to 0.602 for 5-year OS (Figure 4).
Development of Machine-Learning Models
For the machine-learning models, three feature mining algorithms (LASSO regression, Boruta, and SVM-RFE) were applied to optimize model performance. Supplementary Figure 3A demonstrates LASSO’s prediction error reduction through coefficient shrinkage, with some penalized regression coefficients set to zero. Using 5-fold cross-validation, 100 bootstrap iterations, and randomized hyperparameter tuning, 29 features significantly associated with 1-year OS were selected at the optimal lambda value. Supplementary Figure 3B displays the Boruta selection process, with 30 features ultimately identified through iterative random forest comparisons between original features and shadow counterparts. Supplementary Figure 3C and D present the SVM-RFE feature selection process, with recursive elimination of low-importance features guided by 5-fold cross-validation of accuracy and error rates, yielding 24 optimal features. The optimal features for 1-year OS model construction were derived from the intersection of results across all three feature selection algorithms (Supplementary Figure 3E and F). Twelve inflammatory markers and twelve clinical features were identified as significantly related to 1-year OS. Similarly, using the intersection of the feature mining algorithms, 25 optimal parameters were selected for 5-year OS, including 11 inflammatory markers and 14 clinical characteristics (Supplementary Figure 4A–F).
We assessed the models’ overall performance across a variety of metrics in the validation cohort. For 1-year OS, all models showed favorable efficacy in predicting survival, with AUCs ranging from 0.820 to 0.970 (Figure 5A). The corresponding Brier scores ranged from 0.027 to 0.283 for the ten models. Within the validation cohort, the gradient boosting machine (GBM) demonstrated better overall performance, with accuracy (0.930), AUC (0.939), sensitivity (0.846), specificity (0.942), and Brier score (0.027), in comparison to the logistic regression (LR) model and the other ML models (Table 2). The GBM model was therefore selected for variable importance analysis. The final rankings of prognostic features for 1-year overall survival prediction are presented in Supplementary Figure 5.
Table 2 Performance of Models for Predicting 1-Year and 5-Year OS in Patients with Cervical Cancer in Validation Cohorts
As for 5-year OS, the GBM model also showed high accuracy (0.888), AUC (0.893), sensitivity (0.771), specificity (0.912), and a low Brier score (0.089) in the validation cohort (Figure 5D). Despite the competitive performance of the traditional LR and random forest models, the GBM model still demonstrated more stable and preferable performance than the competing methods, especially in accuracy and Brier score. Based on the GBM model, we further analyzed the feature importance of inflammatory markers and major clinical characteristics within the entire population. The details of the prognostic variable importance ranking for 5-year OS prediction are given in Supplementary Figure 6. Both the mean SHAP value and the individual SHAP values identified the important contribution of inflammatory markers to the prognosis of cervical cancer. LMR, PLT, NC, and PLR were associated with 1-year OS, while PNR, SII, MC, NLR, and PDW significantly affected 5-year OS (Figure 5).
Discussion
In the current study, we explored the impact of inflammation on the carcinogenesis and prognosis of cervical cancer in a 1042-case retrospective observational cohort and a two-sample Mendelian randomization study. The MR results for IL-8 (OR = 1.413; P = 0.036) and LIF (OR = 1.400; P = 0.044) indicated a suggestive positive association with cervical cancer risk. In contrast, a suggestive negative association was found between WBC (OR = 0.774; P = 0.003) and carcinogenesis risk. However, random error from multiple hypothesis testing cannot be ruled out, and the genetic associations were no longer statistically significant after Bonferroni correction (P < 0.05/101). These findings imply that the potential causal associations indicated by MR analyses should be interpreted cautiously, as they provide exploratory rather than confirmatory evidence for the involvement of inflammatory cytokines in the pathogenesis of cervical cancer. Further studies are necessary to validate these findings and clarify the underlying mechanisms. In an observational study based on a general oncology institute serving the population of eastern Guangdong Province, China, we compared the performance of the classical LR model with that of ML algorithms in predicting the prognosis of patients with cervical cancer. Using the features from the intersection of three feature screening methods, the GBM achieved more consistent and better performance in both 1-year (AUC = 0.930, accuracy = 0.939, Brier score = 0.027) and 5-year OS prediction (AUC = 0.893, accuracy = 0.888, Brier score = 0.089). In addition, the SHAP values indicated that LMR was a risk factor for 1-year survival, while PLT, NC, and PLR were positively associated with better prognosis. Similarly, PNR, SII, and PDW were found to be negative factors for 5-year OS, whereas MC and NLR were protective factors.
A previous study using a two-sample MR approach found artemin (β: 0.0024, P = 0.002), CCL13 levels (β: 0.0010, P = 0.016), IL-18 (β: −0.0010, P = 0.029), and IL-22RA1 (β: −0.0021, P = 0.046) to be associated with risk of cervical cancer.39 However, instruments were selected with a lax threshold of P < 1 × 10⁻⁵ in that study, which may have violated the first MR assumption and introduced bias in the outcome estimates. Moreover, the previous MR study pooled effects for various cytokines without multiple-testing correction (such as the Bonferroni correction), so those estimates potentially suffer from type I error. In our investigation, no causal association between CIMs and cervical cancer was observed after correction, because the initially significant results might be chance findings caused by multiple testing. The development of cervical cancer is a gradual, multi-stage, complex process involving viral infection, immunological modulation, and genetic determinants. The MR results suggest that inflammatory cytokines are not the root cause of cervical cancer. Nevertheless, inflammatory cytokines may play a significant role in its progression: they activate critical signaling pathways in cervical cancer development and reshape the immunosuppressive microenvironment, promoting tumor cell proliferation, immune evasion, and metastasis. The Bonferroni correction effectively controls false-positive results. Notably, Bayesian colocalization together with a validation MR analysis using summary data from the FinnGen Cervical Cancer Database provided converging evidence that CIMs have no significant effect on the onset of cervical cancer. However, the predictive value of CIMs identified by ML models for cervical cancer patients remains significant, suggesting that they may function as prognostic indicators rather than causal factors.
Our findings align with a previous MR study on the role of 41 inflammatory factors in cervical cancer.40 While the MR study did not provide evidence for a causal relationship between inflammatory cytokines and cervical cancer, they contributed to clarifying the causal nature of their epidemiological associations. Moreover, the findings remain valuable in guiding resource allocation and optimizing future research directions.
Although the MR findings do not support robust causal relationships between IL-8, LIF, and cervical cancer, their potential involvement in the cervical cancer microenvironment remains biologically plausible. IL-8 has been reported to promote tumor progression in various cancers, including cervical cancer, by promoting angiogenesis, epithelial-mesenchymal transition (EMT), and immune cell recruitment.41,42 Similarly, LIF is considered to regulate tumor immune evasion and tumor-stromal interactions. This discrepancy suggests that IL-8 and LIF may influence the occurrence and development of cervical cancer through non-genetic mechanisms. Therefore, future studies should investigate the specific mechanisms of IL-8 and LIF in cervical cancer and validate their potential as therapeutic targets.43,44
The prevalence of cervical cancer is declining globally.3 However, more efforts are warranted in China, one of the countries with a high global burden of cervical cancer.45 The relatively high prevalence and incidence of cervical cancer in China have multiple causes and are primarily attributable to widespread HPV infection, inadequate HPV vaccination, insufficient screening, and significant regional disparities in healthcare spending.46,47 Most current studies of cervical cancer focus on prevalence, prevention, risk factors, staging, and therapies. Studies on the association between inflammation and short-term and long-term survival in patients with cervical cancer remain limited. In the healthcare system, laboratory blood testing is often the first examination used to assess a patient, but clinicians often fail to identify the connection between circulating inflammatory molecules and prognosis, which may delay cervical cancer treatment. Our study constructed an ML model to help clinicians identify patients at risk of adverse prognosis early enough for timely warning and intervention, with the aim of improving the survival of patients with cervical cancer.
The contribution of CIMs to cervical cancer prognosis also offers a better understanding of the mechanisms linking inflammation and tumor progression. Previous epidemiological evidence has shown that differences in the LMR, an indicator regulated by the anti-tumor immune response, are associated with patient prognosis in colorectal, pancreatic, and esophageal carcinomas.48,49 A reduced LMR may indicate increased monocyte activity. Similar to NC, increased monocyte activity may contribute to tumor growth and inhibit anti-tumor immune responses. Monocytes are attracted to tumor tissue, where they differentiate into tumor-associated macrophages (TAMs). TAMs promote tumor growth by secreting multiple growth factors and cytokines related to angiogenesis and the inhibition of anti-tumor immune responses.50 In contrast, we observed that an increased LMR before LOTs was associated with worse 1-year OS. The reason for this association remains unclear. Given the correlation between a decreased MC level and hematologic toxicity, it may be attributed to poor tolerance of LOTs, eventually leading to poor clinical outcomes.51
The SII, which acted as a long-term prognostic indicator for patients with cervical cancer in our study, was first proposed in 2014 as a novel indicator derived from host lymphocyte, platelet, and neutrophil counts.52 Its predictive capability in tumors can be understood through the pathophysiological functions of those three cell types. Lymphocytes are crucial in tumor defense because they induce apoptosis in tumor cells via immune surveillance, thereby slowing cancer cell proliferation, invasion, and metastasis. Consequently, lymphopenia may impair immunosurveillance against carcinogenesis, which in turn has a negative effect on cancer prognosis.53 Neutrophils (NEs) and platelets, in contrast to lymphocytes, exert a considerable tumor-promoting effect. NEs can attract myeloid-derived suppressor cells (MDSCs) within the tumor microenvironment. MDSCs are associated with the production of reactive oxygen species and T-lymphocyte inhibition, resulting in weak tumor immunity.54 NEs also secrete a wide range of factors, including vascular endothelial growth factor (VEGF), matrix metalloproteinase-9, TNF-β, and IL-6, which promote tumor invasion and metastasis.55 Platelets can bind to the surface of tumor cells to form microaggregates, in which the platelets act as a physical barrier protecting tumor cells from immune attack. Platelets also secrete soluble molecules that promote tumor cell activity, such as transforming growth factor β (TGF-β) and platelet factor 4 (PF4).56 Those findings collectively underscore the signaling pathways implicated in inflammation and cancer progression.
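As an illustration of how these composite indexes are derived from a routine blood count, the sketch below computes the SII using its commonly cited definition (platelets × neutrophils / lymphocytes) and the LMR (lymphocytes / monocytes). The function names and the example counts are hypothetical, all counts are assumed to be in 10⁹/L, and the exact computation used in this study is the one given in its Methods, which may differ in detail.

```python
def systemic_immune_inflammation_index(platelets, neutrophils, lymphocytes):
    """SII = platelets * neutrophils / lymphocytes (all counts in 10^9/L)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return platelets * neutrophils / lymphocytes


def lymphocyte_to_monocyte_ratio(lymphocytes, monocytes):
    """LMR = lymphocytes / monocytes (both counts in 10^9/L)."""
    if monocytes <= 0:
        raise ValueError("monocyte count must be positive")
    return lymphocytes / monocytes


# Hypothetical blood count for a single patient (not study data).
sii = systemic_immune_inflammation_index(platelets=250.0, neutrophils=4.0, lymphocytes=2.0)
lmr = lymphocyte_to_monocyte_ratio(lymphocytes=2.0, monocytes=0.5)

print(sii)  # 500.0
print(lmr)  # 4.0
```

Because lymphocytes sit in the denominator of the SII and platelets and neutrophils in the numerator, lymphopenia, thrombocytosis, and neutrophilia all push the index upward, in line with the opposing roles of these cells described above.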
ML prediction models for cervical cancer diagnosis have been reported in previous studies. Al Mudawi et al developed five ML models to extract diagnostic information about cervical cancer from 32 clinical features, including demographics and medical history. Some of these ML algorithms based on clinical characteristics were reported to detect cervical cancer with better accuracy than the LR model.57 Tseng et al reported an advanced ML model (C5.0) for predicting recurrence among patients with cervical cancer, with optimal performance and an accuracy of 0.920.58 However, studies using ML to assess the predictive value of inflammatory markers for cervical cancer prognosis are extremely rare.
We employed the SHAP method to address the inherent opacity of machine learning models, providing population-level and individual-case explanations that clarify how CIMs affect cervical cancer prognosis. We also comprehensively compared the predictive performance of multiple machine learning models relating CIM levels to patient outcomes. The internal validation demonstrated that the GBM model offers superior prognostic value for cervical cancer patients. Moreover, all predictive variables incorporated in the model are routinely collected during hospitalization, ensuring clinical practicality, while the public availability of the GBM algorithm facilitates widespread implementation. Importantly, the ML model enables personalized prognostic assessment for patients with chronic inflammation, allowing physicians to identify patients who may derive extra benefit from enhanced surveillance, prompt intervention, or experimental therapies. Finally, through the combined application of Mendelian randomization and machine learning methods, this research provides novel insights into the role of chronic inflammation in cervical cancer pathogenesis and progression. These insights may benefit future investigations that use multimodal data fusion to examine the complex interplay between inflammation and oncologic outcomes.
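SHAP attributions are based on Shapley values, which split the difference between a patient's prediction and a reference prediction among the input features. The sketch below computes exact Shapley values by brute-force enumeration for a toy risk function. It is only an illustration of the principle: the toy model, its coefficients, the feature values, and the baseline are all hypothetical and carry no clinical meaning, and the code uses neither the authors' GBM nor the SHAP library.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are set to their baseline value.
    The cost grows exponentially with the number of features.
    """
    names = list(x)
    n = len(names)
    phi = {}
    for i in names:
        others = [f for f in names if f != i]
        total = 0.0
        for k in range(n):
            # Weight of every coalition of size k that excludes feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_i = {f: x[f] if (f in subset or f == i) else baseline[f] for f in names}
                without_i = {f: x[f] if f in subset else baseline[f] for f in names}
                total += weight * (model(with_i) - model(without_i))
        phi[i] = total
    return phi


def toy_risk(features):
    """Hypothetical linear risk score; coefficients are arbitrary."""
    return 0.002 * features["SII"] - 0.1 * features["LMR"] + 0.01 * features["age"]


patient = {"SII": 900.0, "LMR": 2.0, "age": 60.0}   # hypothetical patient
baseline = {"SII": 500.0, "LMR": 4.0, "age": 50.0}  # hypothetical reference values

phi = shapley_values(toy_risk, patient, baseline)
print(phi)
```

The attributions sum to the difference between the patient's score and the baseline score, which is the property that lets SHAP present an individual prediction as a set of additive feature contributions.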
This study has several limitations. First, it is based on a retrospective single-center cohort design, and the constructed ML models were not validated in an external cohort. Thus, the generalizability of our findings needs further investigation, and clinical application should be cautious until large-scale validation has been performed. Second, owing to the single-center and retrospective study design, selection bias may arise from limited representativeness of the patient population and incomplete records. The collection and measurement of demographic features could lead to information bias due to variability in measurement approaches. Changes over time in diagnostic classifications or therapies may introduce additional validity concerns. Third, multimodal features such as HPV infection status, imaging, and genetic testing were not included. Some of these features have the potential to enhance prediction efficiency. Finally, we must recognize that the practical application of ML models remains challenging due to racial inequities, diversity in healthcare system composition, and attribution of responsibility. Long-term validation and guideline delineation are required. In addition, there are several limitations regarding the study's MR analysis. First, although we rigorously screened the instrumental variables and conducted replication analyses in different cervical cancer populations, we could not eliminate the impact of weak instrumental variable bias. Second, there are potential differences in genetic and environmental factors across populations and regions, and the present study was limited to the European population, so caution is warranted in extrapolating these findings to other populations.
Conclusion
In conclusion, MR analyses in large-scale samples do not support the causal inference between inflammation and cervical cancer risk. The gradient boosting machine based on clinical characteristics and inflammatory markers achieved excellent prognostic performance. It outperformed traditional LR and 8 other ML models in predicting the survival outcome for cervical cancer patients. The circulating inflammatory markers contributed to the prediction of short-term and long-term overall survival among those patients. Prospective studies and basic research are further warranted to investigate the mechanisms underlying inflammation and cancer prognosis.
Data Sharing Statement
The original data for analysis are presented in the text and supplementary materials. Further original data supporting the results of our study are available from the corresponding author upon reasonable request. The summary statistics from the UK Biobank data for inflammatory cytokines and cervical cancer are openly accessible at https://www.ukbiobank.ac.uk.
Ethics Approval and Consent to Participate
The North West Multi-centre Research Ethics Committee granted ethical approval to the UK Biobank study (reference number: 16/NW/0274), and all participants provided informed consent with electronic signatures. The Chinese cohort study protocols were approved by the ethics committees of the Cancer Hospital of Shantou University Medical College (Approval No. 2025009). All patient identities were substituted and anonymized with a hospitalization number. Therefore, the ethics committees waived written informed consent.
Acknowledgments
We express our sincere appreciation to all participants and staff who have contributed to the UK Biobank and FinnGen consortium. We are also grateful to all cervical cancer patients in the Cancer Hospital of Shantou University Medical College for their invaluable collaboration. We are very grateful to Prof. Stanley Lin for the language editing effort.
Author Contributions
All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.
Funding
This study was supported by the Science and Technology Planning Project of Shantou, Guangdong province (No. STKJ2024065, and No. STKJ2024077) and Basic and Applied Basic Research Foundation – Enterprise Joint Fund General Program of Guangdong Province, China (No. 2022A1515220128).
Disclosure
The authors report no conflicts of interest in this work.
References
1. Cohen PA, Jhingran A, Oaknin A, Denny L. Cervical cancer. Lancet. 2019;393(10167):169–182. doi:10.1016/S0140-6736(18)32470-X
2. Siegel RL, Giaquinto AN, Jemal A. Cancer statistics, 2024. CA Cancer J Clin. 2024;74(1):12–49. doi:10.3322/caac.21820
3. Singh D, Vignat J, Lorenzoni V, et al. Global estimates of incidence and mortality of cervical cancer in 2020: a baseline analysis of the WHO global cervical cancer elimination initiative. Lancet Glob Health. 2023;11(2):e197–e206. doi:10.1016/s2214-109x(22)00501-0
4. Bray F, Parkin DM, Gnangnon F, et al. Cancer in sub-Saharan Africa in 2020: a review of current estimates of the national burden, data gaps, and future needs. Lancet Oncol. 2022;23(6):719–728. doi:10.1016/s1470-2045(22)00270-4
5. Crosbie EJ, Einstein MH, Franceschi S, Kitchener HC. Human papillomavirus and cervical cancer. Lancet. 2013;382(9895):889–899. doi:10.1016/s0140-6736(13)60022-7
6. Kusakabe M, Taguchi A, Sone K, Mori M, Osuga Y. Carcinogenesis and management of human papillomavirus-associated cervical cancer. Int J Clin Oncol. 2023;28(8):965–974. doi:10.1007/s10147-023-02337-7
7. Nelson CW, Mirabello L. Human papillomavirus genomics: understanding carcinogenicity. Tumour Virus Res. 2023;15:200258. doi:10.1016/j.tvr.2023.200258
8. de Freitas AC, Gurgel AP, Chagas BS, Coimbra EC, Do Amaral CM. Susceptibility to cervical cancer: an overview. Gynecol Oncol. 2012;126(2):304–311. doi:10.1016/j.ygyno.2012.03.047
9. Singh N, Baby D, Rajguru J, Patil P, Thakkannavar S, Pujari V. Inflammation and cancer. Ann Afr Med. 2019;18(3):121. doi:10.4103/aam.aam_56_18
10. Wei LH, Kuo ML, Chen CA, et al. Interleukin-6 promotes cervical tumor growth by VEGF-dependent angiogenesis via a STAT3 pathway. Oncogene. 2003;22(10):1517–1527. doi:10.1038/sj.onc.1206226
11. Vitkauskaite A, Urboniene D, Celiesiute J, et al. Circulating inflammatory markers in cervical cancer patients and healthy controls. J Immunotoxicol. 2020;17(1):105–109. doi:10.1080/1547691x.2020.1755397
12. Kim SC, Glynn RJ, Giovannucci E, et al. Risk of high-grade cervical dysplasia and cervical cancer in women with systemic inflammatory diseases: a population-based Cohort Study. Ann Rheum Dis. 2015;74(7):1360–1367. doi:10.1136/annrheumdis-2013-204993
13. Hemmat N, Bannazadeh Baghi H. Association of human papillomavirus infection and inflammation in cervical cancer. Pathog Dis. 2019;77(5). doi:10.1093/femspd/ftz048
14. Michels N, van Aart C, Morisse J, Mullee A, Huybrechts I. Chronic inflammation towards cancer incidence: a systematic review and meta-analysis of epidemiological studies. Crit Rev Oncol Hematol. 2021;157:103177. doi:10.1016/j.critrevonc.2020.103177
15. Najafi S. Circular RNAs as emerging players in cervical cancer tumorigenesis; a review to roles and biomarker potentials. Int J Biol Macromol. 2022;206:939–953. doi:10.1016/j.ijbiomac.2022.03.103
16. Deo RC. Machine learning in medicine. Circulation. 2020;142(16):1521–1523. doi:10.1161/circulationaha.120.050583
17. Swanson K, Wu E, Zhang A, Alizadeh AA, Zou J. From patterns to patients: advances in clinical machine learning for cancer diagnosis, prognosis, and treatment. Cell. 2023;186(8):1772–1791. doi:10.1016/j.cell.2023.01.035
18. Sanderson E, Glymour MM, Holmes MV, et al. Mendelian randomization. Nat Rev Meth Primers. 2022;2(1). doi:10.1038/s43586-021-00092-5
19. UK Biobank GWAS Results [http://www.nealelab.is/uk-biobank/]. 1400 EHR-derived broad PheWAS codes for 20 million imputed variants in 400,000 white British individuals. 2020. Available from: https://pheweb.org/UKB-SAIGE/.
20. Indurkhya A, Gardiner JC, Luo Z. The effect of outliers on confidence interval procedures for cost‐effectiveness ratios. Stat Med. 2001;20(9–10):1469–1477. doi:10.1002/sim.683
21. Bowden J, Davey Smith G, Haycock PC, Burgess S. Consistent estimation in Mendelian randomization with some invalid instruments using a weighted median estimator. Genet Epidemiol. 2016;40(4):304–314. doi:10.1002/gepi.21965
22. Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol. 2015;44(2):512–525. doi:10.1093/ije/dyv080
23. Hartwig FP, Davey Smith G, Bowden J. Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int J Epidemiol. 2017;46(6):1985–1998. doi:10.1093/ije/dyx102
24. Zheng J, Baird D, Borges MC, et al. Recent developments in Mendelian randomization studies. Curr Epidemiol Rep. 2017;4(4):330–345. doi:10.1007/s40471-017-0128-6
25. Giambartolomei C, Vukcevic D, Schadt EE, et al. Bayesian test for colocalisation between pairs of genetic association studies using summary statistics. PLoS Genet. 2014;10(5):e1004383. doi:10.1371/journal.pgen.1004383
26. Kurki MI, Karjalainen J, Palta P, et al. FinnGen provides genetic insights from a well-phenotyped isolated population. Nature. 2023;613(7944):508–518. doi:10.1038/s41586-022-05473-8
27. Curtin F, Schulz P. Multiple correlations and Bonferroni’s correction. Biol Psychiatry. 1998;44(8):775–777. doi:10.1016/s0006-3223(98)00043-2
28. von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Lancet. 2007;370(9596):1453–1457. doi:10.1016/s0140-6736(07)61602-x
29. Collins GS, Reitsma JB, Altman DG, Moons KG. Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): the TRIPOD statement. BMJ. 2015;350:g7594. doi:10.1136/bmj.g7594
30. Citu C, Gorun F, Motoc A, et al. The predictive role of NLR, d-NLR, MLR, and SIRI in COVID-19 mortality. Diagnostics. 2022;12(1):122. doi:10.3390/diagnostics12010122
31. van Buuren S, Groothuis-Oudshoorn K. mice: multivariate imputation by chained equations in R. J Stat Softw. 2011;45(3). doi:10.18637/jss.v045.i03
32. Han H, Wang WY, Mao BH. Borderline-SMOTE: a new over-sampling method in imbalanced data sets learning. Lect Notes Comput Sci. 2005;878–887. doi:10.1007/11538059_91
33. Fonti V, Belitser E. Feature selection using lasso. VU Amsterdam Res Paper in Bus Anal. 2017;30:1–25.
34. Kursa MB, Rudnicki WR. Feature selection with the Boruta package. J Stat Softw. 2010;36(11). doi:10.18637/jss.v036.i11
35. Sanz H, Valim C, Vegas E, Oller JM, Reverter F. SVM-RFE: selection and visualization of the most relevant features through non-linear kernels. BMC Bioinf. 2018;19(1). doi:10.1186/s12859-018-2451-4
36. Rufibach K. Use of Brier score to assess binary predictions. J Clin Epidemiol. 2010;63(8):938–939. doi:10.1016/j.jclinepi.2009.11.009
37. Lundberg S, Lee SI. A unified approach to interpreting model predictions. In: NIPS. New York: Curran Associates;2017:4765–4774. doi:10.48550/arXiv.1705.07874
38. Li H, Johnson T. Wilcoxon’s signed-rank statistic: what null hypothesis and why it matters. Pharm Stat. 2014;13(5):281–285. doi:10.1002/pst.1628
39. Li Q, Kaidong L, Tian Z, et al. Association of inflammatory factors with cervical cancer: a bidirectional Mendelian randomization. J Inflamm Res. 2024;17:10119–10130. doi:10.2147/jir.s493854
40. Dang C, Liu M, Liu P, et al. Causal relationship between inflammatory factors and gynecological cancer: a Bayesian Mendelian randomization study. Sci Rep. 2024;14(1). doi:10.1038/s41598-024-80747-x
41. Meier C, Brieger A. The role of IL-8 in cancer development and its impact on immunotherapy resistance. Eur J Cancer. 2025;218:115267. doi:10.1016/j.ejca.2025.115267
42. Watrowski R, Schuster E, Polterauer S, et al. Genetic variants of interleukin-8 and interleukin-16 and their association with cervical cancer risk. Life. 2025;15(2):135. doi:10.3390/life15020135
43. Ma W, Yan H, Ma H, et al. Roles of leukemia inhibitory factor receptor in cancer. Int J Cancer. 2025;156(2):262–273. doi:10.1002/ijc.35157
44. Hu H, Zhao Q, Sang Y, et al. Role and mechanism of leukemia inhibitory factor receptor in cervical cancer invasion and metastasis. J Int Med Res. 2023;51(6):3000605231182557. doi:10.1177/03000605231182557
45. Tan N, Wu Y, Li B, Chen W. Burden of cancers in six female organs in China and worldwide. Chin Med J. 2024;137(18):2190–2201. doi:10.1097/cm9.0000000000003293
46. Liu Y, Guo J, Zhu G, Zhang B, Feng XL. Changes in rate and socioeconomic inequality of cervical cancer screening in northeastern China from 2013 to 2018. Front Med. 2022;9. doi:10.3389/fmed.2022.913361
47. Lin W, Wang Y, Liu Z, et al. Inequalities in awareness and attitude towards HPV and its vaccine between local and migrant residents who participated in cervical cancer screening in Shenzhen, China. Cancer Res Treat. 2020;52(1):207–217. doi:10.4143/crt.2019.053
48. Hu G, Liu G, Ma J, Hu R. Lymphocyte-to-monocyte ratio in esophageal squamous cell carcinoma prognosis. Clin Chim Acta. 2018;486:44–48. doi:10.1016/j.cca.2018.07.029
49. Hu R, Ma J, Hu G. Lymphocyte-to-monocyte ratio in pancreatic cancer: prognostic significance and meta-analysis. Clin Chim Acta. 2018;481:142–146. doi:10.1016/j.cca.2018.03.008
50. Benner B, Scarberry L, Suarez-Kelly LP, et al. Generation of monocyte-derived tumor-associated macrophages using tumor-conditioned media provides a novel method to study tumor-associated macrophages in vitro. J Immunother Cancer. 2019;7(1). doi:10.1186/s40425-019-0622-0
51. Shimanuki M, Imanishi Y, Sato Y, et al. Pretreatment monocyte counts and neutrophil counts predict the risk for febrile neutropenia in patients undergoing TPF chemotherapy for head and neck squamous cell carcinoma. Oncotarget. 2018;9(27):18970–18984. doi:10.18632/oncotarget.24863
52. Hu B, Yang XR, Xu Y, et al. Systemic immune-inflammation index predicts prognosis of patients after curative resection for hepatocellular carcinoma. Clin Cancer Res. 2014;20(23):6212–6222. doi:10.1158/1078-0432.ccr-14-0442
53. Ménétrier-Caux C, Ray-Coquard I, Blay JY, Caux C. Lymphopenia in cancer patients and its effects on response to immunotherapy: an opportunity for combination with cytokines? J Immunother Cancer. 2019;7(1). doi:10.1186/s40425-019-0549-5
54. Gabrilovich DI. Myeloid-derived suppressor cells. Cancer Immunol Res. 2017;5(1):3–8. doi:10.1158/2326-6066.cir-16-0297
55. Rosales C. Neutrophils at the crossroads of innate and adaptive immunity. J Leukoc Biol. 2020;108(1):377–396. doi:10.1002/jlb.4mir0220-574rr
56. Li S, Lu Z, Wu S, et al. The dynamic role of platelets in cancer progression and their therapeutic implications. Nat Rev Cancer. 2023;24(1):72–87. doi:10.1038/s41568-023-00639-6
57. Al Mudawi N, Alazeb A. A model for predicting cervical cancer using machine learning algorithms. Sensors. 2022;22(11):4132. doi:10.3390/s22114132
58. Tseng CJ, Lu CJ, Chang CC, Chen GD. Application of machine learning to predict the recurrence-proneness for cervical cancer. Neural Comput Appl. 2013;24(6):1311–1316. doi:10.1007/s00521-013-1359-1
© 2025 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.

