International Journal of Women's Health, Volume 17
Identification of the Shared Gene Signatures and Pathways Between Polycystic Ovary Syndrome and Endometrial Cancer Using Bioinformatics and Mendelian Randomization Analyses
Authors Ye D, Yu Y, Xu C, Fu Z, Zhong F, Shen H
Received 21 July 2025
Accepted for publication 9 December 2025
Published 26 December 2025 Volume 2025:17 Pages 5669—5687
DOI https://doi.org/10.2147/IJWH.S555274
Checked for plagiarism Yes
Review by Single anonymous peer review
Peer reviewer comments 3
Editor who approved publication: Dr Matteo Frigerio
Dan Ye,1,* Yi Yu,2,* Chengjie Xu,3 Zhongpeng Fu,4 Fangfang Zhong,5 Haoran Shen1
1Department of Gynecology, Obstetrics & Gynecology Hospital of Fudan University, Shanghai Key Lab of Reproduction and Development, Shanghai Key Lab of Female Reproductive Endocrine Related Diseases, Shanghai, 200433, People’s Republic of China; 2Medical Center of Diagnosis and Treatment for Cervical Diseases, Obstetrics & Gynecology Hospital of Fudan University, Shanghai, 200011, People’s Republic of China; 3Department of Intelligence Science, Obstetrics & Gynecology Hospital of Fudan University, Shanghai, 200011, People’s Republic of China; 4Department of Ultrasonography, Obstetrics & Gynecology Hospital of Fudan University, Shanghai, 200011, People’s Republic of China; 5Department of Pathology, Obstetrics & Gynecology Hospital of Fudan University, Shanghai, 200011, People’s Republic of China
*These authors contributed equally to this work
Correspondence: Haoran Shen, Department of Gynecology, Obstetrics & Gynecology Hospital of Fudan University, Shanghai Key Lab of Reproduction and Development, Shanghai Key Lab of Female Reproductive Endocrine Related Diseases, Shanghai, 200433, People’s Republic of China, Tel +86-021-33189900, Email [email protected]; Fangfang Zhong, Department of Pathology, Obstetrics & Gynecology Hospital of Fudan University, No. 419 Fangxie Road, Shanghai, 200011, People’s Republic of China, Tel +86-021-33189900, Email [email protected]
Aim: Polycystic ovary syndrome (PCOS) is a common endocrine disorder with high incidence. Patients with PCOS have been reported to be at increased risk of developing endometrial cancer (EC). This study aimed to analyze the shared gene signatures and biological mechanisms between PCOS and EC.
Methods: The datasets of PCOS and EC were downloaded from the Gene Expression Omnibus (GEO) database. Weighted gene co-expression network analysis (WGCNA), protein–protein interaction (PPI) network analysis, functional enrichment analysis, and miRNA and transcription factor prediction were applied to select key genes and pathways. In addition, Mendelian randomization (MR) was performed to analyze the association of PCOS with EC.
Results: Through WGCNA and PPI network analysis, 8 key genes, namely IL-10, CXCL8, IFNG, MMP9, PECAM1, CYBB, MYD88 and IRF4, were identified. Functional enrichment analysis indicated that the type I interferon signaling pathway was the most important common pathway for PCOS and EC. Furthermore, bidirectional MR analysis found a causal effect of EC on PCOS (inverse variance weighted, p < 0.05).
Conclusion: This study, for the first time, systematically investigated the potential association between PCOS and EC through an integrative approach combining bioinformatics analysis and MR analysis. The type I interferon signaling pathway played a key regulatory role in PCOS and EC. Eight genes, such as MMP9, PECAM1 and CYBB, may be key markers linking PCOS and EC.
Keywords: polycystic ovary syndrome, endometrial cancer, gene, pathway, Mendelian randomization
Introduction
Polycystic ovary syndrome (PCOS) is a prevalent endocrine condition that impacts 4–12% of women in their reproductive age.1,2 PCOS is characterized by oligo/anovulation, hyperandrogenism, and the presence of polycystic ovaries. It is associated with a diverse range of clinical manifestations, including menstrual irregularities, infertility, hirsutism, and insulin resistance.3 Endometrial cancer (EC) is the most frequently encountered gynecological malignancy in women. According to global data from 2020, EC accounts for 4.5% of all cancers affecting women.4 Investigating the underlying pathological mechanisms and exploring treatment options for EC has been a key area of interest for scholars and experts.
Accumulating evidence demonstrates that women with PCOS exhibit a significantly elevated risk of developing EC.5–7 Epidemiological data indicate that PCOS-diagnosed women have a threefold increased likelihood of EC incidence compared to undiagnosed counterparts;8 however, the molecular mechanisms underlying this association remain poorly understood. Recent research has highlighted molecular mechanisms linking PCOS and EC, such as dysregulated steroid hormone signaling, chronic inflammation, and metabolic dysfunction. For instance, elevated androgen levels in PCOS may promote endometrial hyperplasia through aberrant activation of estrogen receptors and progesterone signaling.9 Additionally, characteristics such as obesity and anovulation in PCOS can increase estrogen levels and progesterone resistance, leading to endometrial hyperplasia and eventually EC.10 However, prior investigations into the PCOS-EC link have primarily relied on observational studies or single-omics approaches, which are limited in their ability to disentangle causal relationships or identify shared molecular pathways.
Gene expression profiling and bioinformatic analyses have emerged as pivotal approaches for identifying characteristic gene expression patterns, dysregulated biological pathways, and gene interaction networks. For instance, Surleen Kaur et al demonstrated through transcriptomic profiling that differentially expressed genes (DEGs) in PCOS tissues were significantly enriched in metabolic disorder and oxidative stress pathways, with potential implications in carcinogenesis.11 Another integrative genomic study identified 36 significantly dysregulated genes, among which 10 exhibited co-expression features across EC, ovarian cancer, and breast cancer, primarily involved in cell proliferation regulation, hormone response, and endogenous stimulus reactions.12 Despite these advancements, it is crucial to acknowledge that reliance solely on bioinformatic predictions and machine learning models presents inherent limitations in elucidating the complex molecular interconnections between PCOS and EC.
The rapid advances in genetics have led to the emergence of Mendelian randomization (MR) in medical research. MR analysis leverages genetic variants as instrumental variables to infer causal relationships between exposures and outcomes, minimizing confounding biases inherent in observational studies.13 Integrating MR with multi-omics approaches, such as weighted gene co-expression network analysis (WGCNA) and protein-protein interaction (PPI) modeling, enables a comprehensive exploration of shared genetic architectures and pathway-level mechanisms. In this study, we present a systematic investigation of the molecular links between PCOS and EC through an integrative framework combining bioinformatics, MR, and experimental validation. This study is the first to: (1) systematically map the molecular overlap between PCOS and EC using multi-omics data integration; (2) identify the type I interferon signaling pathway as a central regulatory mechanism in both conditions; and (3) establish a causal relationship between EC and PCOS through bidirectional MR analysis. Our findings provide novel insights into the genetic basis of PCOS-EC comorbidity and highlight potential biomarkers for further mechanistic and clinical investigation.
Data and Methods
Data Acquisition and Preprocessing
The datasets of PCOS (GSE95728, GSE34526 and GSE137684) and EC (GSE17025 and GSE63678) were downloaded from GEO database. The inclusion criteria were as follows: (1) datasets were obtained by searching with disease-specific keywords and consisted of human clinical biospecimens; (2) each dataset included both disease and healthy control groups, with a minimum of three biological replicates per group; and (3) consistency in sample source type and sequencing methodology was maintained across compared samples. The detailed information of datasets is displayed in Table 1.
Table 1 Information of NCBI GEO Data Set
Raw CEL files were normalized using the Robust Multi-array Average (RMA) algorithm in R “affy” package to correct background signals and perform quantile normalization. After downloading the preprocessed, standardized and log2 transformed probe expression matrices, the probe-to-gene annotation was performed using the platform-specific annotation files to ensure consistency across datasets. Probes from each dataset were aligned to reference genome annotations, with those failing to map to any annotated gene being excluded. For genes represented by multiple probes, the median expression value was calculated to derive a representative expression measure. To eliminate systematic technical variation across integrated datasets, batch effect correction was performed using the ComBat algorithm implemented in the “sva” R package.14 This empirical Bayes framework-based approach effectively adjusts for non-biological variation while maintaining biologically meaningful differences. Principal Component Analysis (PCA) was employed both pre- and post-batch correction to visually assess the efficacy of the ComBat normalization procedure.
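The probe-to-gene step described above (dropping unmapped probes and taking the median across probes of the same gene) can be illustrated with a minimal Python sketch. This is not the authors' R code; the function name, probe IDs, and expression values are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from statistics import median


def collapse_probes_to_genes(probe_expr, probe_to_gene):
    """Collapse probe-level expression to gene level using the per-sample median.

    probe_expr: dict mapping probe ID -> list of expression values (one per sample)
    probe_to_gene: dict mapping probe ID -> gene symbol (unmapped probes are absent)
    """
    grouped = defaultdict(list)
    for probe, values in probe_expr.items():
        gene = probe_to_gene.get(probe)
        if gene:  # probes that do not map to an annotated gene are excluded
            grouped[gene].append(values)
    # zip(*rows) walks over samples; the median is taken across probes per sample
    return {gene: [median(col) for col in zip(*rows)] for gene, rows in grouped.items()}


# Hypothetical toy data: three probes for one gene, one for another, one unmapped
probe_expr = {
    "p1": [5.0, 6.0],
    "p2": [7.0, 8.0],
    "p3": [6.0, 10.0],
    "p4": [3.0, 4.0],
    "p5": [9.0, 9.0],
}
probe_to_gene = {"p1": "MMP9", "p2": "MMP9", "p3": "MMP9", "p4": "IRF4"}
gene_expr = collapse_probes_to_genes(probe_expr, probe_to_gene)
```

In this toy input, probe p5 has no annotation and is therefore absent from the gene-level result.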
Disease-Related Module Selection by WGCNA
Based on the merged dataset of PCOS and the GSE17025 dataset of EC, the genes with large variation were selected, and the input genes were analyzed using R package WGCNA 1.61,15 so as to identify gene set modules with high covariation. When executing the WGCNA algorithm, the module partition thresholds were set as: minModuleSize=50, and MEDissThres=0.3.
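To make the role of the soft threshold and of MEDissThres concrete, the following Python sketch shows the two underlying calculations: raising the absolute correlation to a power to obtain a network adjacency, and merging two modules when the dissimilarity of their eigengenes (1 minus their correlation) falls below the threshold. It assumes an unsigned network, which the text does not state, and it is a simplified illustration rather than the WGCNA implementation; all function names and data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def soft_threshold_adjacency(x, y, power):
    """Unsigned adjacency (assumed network type): |cor(x, y)| ** power."""
    return abs(pearson(x, y)) ** power


def should_merge(eigengene_a, eigengene_b, me_diss_thres=0.3):
    """Merge two modules when eigengene dissimilarity (1 - cor) is below the cut."""
    return (1 - pearson(eigengene_a, eigengene_b)) < me_diss_thres


# Hypothetical expression profiles across four samples (correlation is 0.8)
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [1.0, 3.0, 2.0, 4.0]
adjacency = soft_threshold_adjacency(gene_a, gene_b, power=4)
merge = should_merge(gene_a, gene_b, me_diss_thres=0.3)
```

With a correlation of 0.8, a power of 4 shrinks the adjacency to about 0.41, which shows how the soft threshold suppresses moderate correlations; the same pair would be merged at MEDissThres=0.3 because its dissimilarity is 0.2.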
Then the module genes related to the two diseases were intersected to obtain the common pathogenic genes. Gene Ontology (GO)16 and KEGG17 pathway enrichment analysis were performed on the obtained common pathogenic genes using clusterProfiler 3.8.1,18 and adjusted p < 0.05 was considered as a significant enrichment result. Due to the redundancy of GO BP results, R package simplifyEnrichment 1.4.019 was used to process the GO BP result. The package divided the GO similarity matrix through the binary cut method.
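Over-representation analysis of this kind is commonly based on a hypergeometric test followed by multiple-testing correction. The sketch below shows that calculation in Python under the assumption that Benjamini-Hochberg adjustment is used (the text says only "adjusted p"); it is not the clusterProfiler code, and the counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_query, n_overlap):
    """P(X >= n_overlap) when n_query genes are drawn from n_background genes,
    of which n_term are annotated to the term."""
    upper = min(n_term, n_query)
    total = comb(n_background, n_query)
    favourable = sum(
        comb(n_term, i) * comb(n_background - n_term, n_query - i)
        for i in range(n_overlap, upper + 1)
    )
    return favourable / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 10 background genes, 5 in the term, 5 queried, all 5 overlap
p_single = hypergeom_enrichment_p(10, 5, 5, 5)
adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
```

A term would then be reported as enriched when its adjusted p-value is below 0.05, matching the threshold stated above.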
Differential Expression Analysis
The DEGs between disease and control groups were analyzed using the empirical Bayes method provided by the limma package (version 3.10.3).20 Genes with p < 0.05 and |logFC| > 0.5 were selected as DEGs.
Then, the DEGs for PCOS and EC were intersected to obtain the common DEGs.
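The thresholding and intersection steps can be sketched as follows. The example keeps up- and down-regulated genes separate before intersecting, which follows the description of common up- and down-regulated DEGs given in the Results; the logFC and p values are hypothetical and the function name is illustrative only.

```python
def select_degs(results, p_cut=0.05, lfc_cut=0.5):
    """Split genes into up- and down-regulated DEG sets.

    results: dict mapping gene -> (logFC, p-value)
    """
    up = {g for g, (lfc, p) in results.items() if p < p_cut and lfc > lfc_cut}
    down = {g for g, (lfc, p) in results.items() if p < p_cut and lfc < -lfc_cut}
    return up, down


# Hypothetical differential expression results for the two diseases
pcos = {"MMP9": (1.2, 0.001), "CYBB": (0.8, 0.02), "GENE_X": (-0.9, 0.01), "GENE_Y": (0.3, 0.001)}
ec = {"MMP9": (2.0, 0.0001), "CYBB": (0.6, 0.20), "GENE_X": (-1.1, 0.03), "GENE_Y": (0.9, 0.001)}

pcos_up, pcos_down = select_degs(pcos)
ec_up, ec_down = select_degs(ec)

# Only genes regulated in the same direction in both diseases are kept
common_up = pcos_up & ec_up
common_down = pcos_down & ec_down
```

In this toy input, CYBB fails the p-value threshold in EC and GENE_Y fails the |logFC| threshold in PCOS, so neither is counted as a common DEG.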
PPI Network and Function Analysis
The PPI network analysis was conducted using the STRING database21 (Version 11.5; https://string-db.org/) to investigate functional associations among DEGs and identify key regulatory hubs involved in disease pathogenesis. We constructed a DEG-derived PPI network with interaction confidence scores filtered at a composite score threshold >0.4 to ensure statistical significance. Network visualization and topological analysis were performed using Cytoscape software22 (Version 3.7.2; https://cytoscape.org/). The Molecular Complex Detection (MCODE) algorithm,23 implemented as a Cytoscape plugin, was employed to detect densely connected functional modules within the PPI network. Default MCODE parameters were applied: degree cutoff = 2, node score cutoff = 0.2, k-core = 2, and maximum depth = 100. The module with the highest MCODE score (>5) was selected as key module genes.
GO and KEGG pathway enrichment analyses were performed on the key module genes.
In addition, we used the CytoNCA plugin24 for topological analysis of the network nodes, including Degree, Betweenness, Closeness, Eigenvector, Subgraph, Information, LAC and Network, with the parameter set to “without” weight. The top 30 genes under each topological attribute were intersected.
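The selection of key genes by intersecting top-ranked nodes across several centrality measures can be sketched in Python as below. Only degree is actually computed here; the second score table is a hypothetical stand-in for any other CytoNCA measure, the alphabetical tie-breaking is an assumption, and the node names and edges are toy data.

```python
from collections import Counter


def degree_centrality(edges):
    """Count the number of interaction partners for each node in an undirected edge list."""
    counts = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)


def top_k(scores, k):
    """Return the k highest-scoring nodes (ties broken alphabetically, an assumption)."""
    ranked = sorted(scores, key=lambda g: (-scores[g], g))
    return set(ranked[:k])


def intersect_top_k(score_tables, k=30):
    """Nodes that rank in the top k under every centrality measure."""
    return set.intersection(*(top_k(s, k) for s in score_tables.values()))


# Hypothetical toy network and a hypothetical second centrality table
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
score_tables = {
    "degree": degree_centrality(edges),
    "other_measure": {"A": 0.6, "D": 0.5, "B": 0.1, "C": 0.0, "E": 0.0},
}
key_nodes = intersect_top_k(score_tables, k=2)
```

With k=30 and all eight measures supplied, the same intersection would correspond to the gene selection described above.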
Analysis of Key Function and Pathways
The common pathogenic genes obtained by WGCNA and the common DEGs obtained by differential expression analysis were further intersected to obtain the common differentially expressed (DE)-pathogenic genes. These genes were then subjected to GO function and KEGG pathway analyses using the Cytoscape plugin ClueGO + CluePedia.25 The threshold of significance was set to adjusted p ≤ 0.01, and Cytoscape was used to construct the functional network.
Validation of Key Genes and Functional Pathways
The validation sets of PCOS and EC were also analyzed for DEGs using the empirical Bayes method provided by the limma package. Genes with p < 0.05 and |logFC| > 0.5 were selected as DEGs. The overlapping DEGs were then identified and subjected to GO and KEGG pathway enrichment analysis using clusterProfiler. An adjusted p value < 0.05 was considered a significant enrichment result. In addition, for the key genes obtained in the previous stage, the expression distribution in the validation sets was shown using box plots.
PCOS and EC Shared miRNA Prediction Analysis
Using the HMDD V3.0 database,26 the microRNAs (miRNAs) related to PCOS and EC were retrieved and intersected to select the miRNAs common to the two diseases. The common miRNAs were then subjected to GO analysis using DIANA-miRPath v3.0.27
Predictive Analysis of Transcription Factors for Key Genes
The online database TRRUST V2.028 was used to predict the upstream transcription factors (TFs) of the key genes obtained from the above analysis. TFs with Q value < 0.05 were selected. Based on their targeting relationships with the key genes, a TF-target network was constructed using Cytoscape.
Mendelian Randomization Analysis
A bidirectional two-sample MR analysis was conducted using the “TwoSampleMR” package29 (v0.5.6) in R to explore the causal relationships between PCOS and EC. Genome-wide association study (GWAS) data for EC (ukb-b-13545, European ancestry population, sample size = 462,933, cases/controls = 1,151/461,782) and PCOS (finngen_R11_E4_PCOS, European ancestry population, sample size = 243,907, cases/controls = 1,909/241,998) were obtained from the IEU OpenGWAS database30 (https://gwas.mrcieu.ac.uk/) and FinnGen database31 (https://www.finngen.fi/en), respectively. Instrumental variables (IVs) were selected based on the following criteria: (1) Single nucleotide polymorphisms (SNPs) associated with the exposure at a relaxed threshold of p < 1 × 10−5 (due to insufficient SNP counts under the stringent threshold of p < 5 × 10−8); (2) Removal of SNPs in linkage disequilibrium (LD) using clumping parameters (r² = 0.001, kb = 10,000); (3) Exclusion of weak instruments with F-statistics < 10; (4) Adjustment for potential confounding phenotypes using PhenoScanner. Five algorithms were applied: inverse-variance weighted (IVW), MR-Egger, weighted median, simple mode, and weighted mode. All analyses adhered to the three core assumptions of MR: (1) Relevance (genetic variants are strongly associated with the exposure); (2) Independence (genetic variants are unrelated to confounders); (3) Exclusion restriction (genetic variants affect the outcome only through the exposure).
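For readers unfamiliar with the calculations behind these criteria, the following Python sketch shows a per-SNP F-statistic and a fixed-effect IVW estimate. Two points are assumptions rather than details taken from the text: the F-statistic is approximated as (beta / SE)² for the SNP-exposure association, and the IVW standard error uses the simple fixed-effect formula, which may differ from the TwoSampleMR implementation. All SNP effect sizes are hypothetical.

```python
import math


def f_statistic(beta_exposure, se_exposure):
    """Approximate instrument strength for one SNP: (beta / SE)^2 (assumed formula)."""
    return (beta_exposure / se_exposure) ** 2


def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect inverse-variance weighted causal estimate.

    Returns (beta, standard error, two-sided p-value from a normal approximation).
    """
    weights = [1 / s ** 2 for s in se_out]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_exp, beta_out))
    den = sum(w * bx * bx for w, bx in zip(weights, beta_exp))
    beta = num / den
    se = math.sqrt(1 / den)
    p_value = math.erfc(abs(beta / se) / math.sqrt(2))
    return beta, se, p_value


# Hypothetical SNP-exposure and SNP-outcome effects for two instruments
beta_exp = [0.1, 0.2]
se_exp = [0.02, 0.03]
beta_out = [0.05, 0.1]
se_out = [0.01, 0.01]

strong = [f_statistic(b, s) >= 10 for b, s in zip(beta_exp, se_exp)]
beta, se, p_value = ivw_estimate(beta_exp, beta_out, se_out)
odds_ratio = math.exp(beta)  # for a binary outcome, beta is on the log-odds scale
```

An odds ratio above 1 with p < 0.05 under IVW corresponds to the criterion used here for a positive causal effect.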
Heterogeneity was assessed using the mr_heterogeneity test (Q_pval > 0.05 indicated no significant heterogeneity), and horizontal pleiotropy was evaluated via MR-PRESSO (1000 simulations, p > 0.05 indicated no confounding pleiotropy). Robustness was further confirmed through leave-one-out analysis to exclude outlying SNPs. Causal effects were primarily determined by the IVW model (p < 0.05), with results visualized using scatter plots, forest plots, and funnel plots. Sensitivity analyses ensured the reliability of causal estimates, confirming no significant bias from individual SNPs or pleiotropic effects.
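The heterogeneity statistic and the leave-one-out procedure can be sketched in the same spirit. The code below computes Cochran's Q around a fixed-effect IVW estimate and re-estimates the IVW effect after dropping each SNP in turn; it is a simplified illustration rather than the mr_heterogeneity or MR-PRESSO implementation, it omits the chi-square p-value, and the SNP effects are hypothetical.

```python
def ivw_beta(beta_exp, beta_out, se_out):
    """Fixed-effect IVW slope through the origin."""
    num = sum(bx * by / s ** 2 for bx, by, s in zip(beta_exp, beta_out, se_out))
    den = sum(bx * bx / s ** 2 for bx, s in zip(beta_exp, se_out))
    return num / den


def cochran_q(beta_exp, beta_out, se_out):
    """Cochran's Q: weighted squared residuals around the IVW line.

    Returns (Q, degrees of freedom); Q is compared with a chi-square distribution.
    """
    b = ivw_beta(beta_exp, beta_out, se_out)
    q = sum((by - b * bx) ** 2 / s ** 2 for bx, by, s in zip(beta_exp, beta_out, se_out))
    return q, len(beta_exp) - 1


def leave_one_out(beta_exp, beta_out, se_out):
    """IVW estimate recomputed with each SNP removed in turn."""
    estimates = []
    for i in range(len(beta_exp)):
        keep = [j for j in range(len(beta_exp)) if j != i]
        estimates.append(
            ivw_beta(
                [beta_exp[j] for j in keep],
                [beta_out[j] for j in keep],
                [se_out[j] for j in keep],
            )
        )
    return estimates


# Hypothetical instruments whose SNP-outcome effects are exactly half the SNP-exposure effects
beta_exp = [0.1, 0.2, 0.4]
beta_out = [0.05, 0.1, 0.2]
se_out = [0.01, 0.02, 0.02]
q, df = cochran_q(beta_exp, beta_out, se_out)
loo = leave_one_out(beta_exp, beta_out, se_out)
```

When every SNP implies the same causal ratio, as in this toy input, Q is essentially zero and each leave-one-out estimate matches the overall estimate; a single influential SNP would show up as one estimate departing from the rest.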
Patients
After approval from the Obstetrics and Gynecology Hospital of Fudan University, we retrospectively identified, via text searches, cases with a final diagnosis of atypical endometrial hyperplasia (AEH) or endometrial cancer (EC) (G1, stage IA, FIGO 2009) accessioned from January to December 2024 in this hospital. Patients with and without PCOS were then identified within each of the two groups. A specialty pathologist reviewed H&E slides, and a specialty sonographer re-examined the ultrasonic images for each case to verify the original diagnoses. Finally, six patients were selected from each of the PCOS-AEH group, PCOS-EC group, AEH group, and EC group for analysis. This study protocol adheres to the ethical principles outlined in the World Medical Association Declaration of Helsinki and has been formally reviewed and approved by the Institutional Review Board of Obstetrics & Gynecology Hospital of Fudan University (Approval No.: kyy2024-85).
Measurement of Serum Hormones and Metabolism Indicators
Concentrations of serum follicle-stimulating hormone (FSH), luteinizing hormone (LH), total testosterone (TT) and sex-hormone binding globulin (SHBG) were measured by radioimmunoassay according to the kit instructions (Siemens DPC). Serum fasting insulin (FINS) and fasting blood glucose (FBG) were quantified with a Beckman Coulter AU5800 instrument, and serum thyroid-stimulating hormone (TSH) was quantified with a Roche Cobas E801 instrument. Serum anti-Müllerian hormone (AMH) was detected with the UNION immune analyzer’s AMH detection kit (single test strip) (YHLO, Shenzhen, China). The free androgen index (FAI) was calculated as follows: (testosterone (nM) × 100)/SHBG (nM). Body mass index (BMI) and the homeostasis model assessment of insulin resistance (HOMA-IR) index were calculated, and metabolic syndrome (MS) criteria were evaluated as reported previously.32 According to the standards for Chinese adults, obesity was defined as a BMI of ≥ 28 kg/m². The HOMA-IR index (fasting blood glucose [mmol/L] × fasting insulin [μU/mL]/22.5) was used to evaluate insulin resistance (IR); patients with HOMA-IR ≥ 2.5 were considered to have IR.
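The derived indices above are simple arithmetic and can be written out as a short Python sketch. The formulas and cut-offs are those stated in the text; BMI is taken as weight in kilograms divided by height in metres squared, and the function names and patient values are hypothetical.

```python
def free_androgen_index(testosterone_nmol_l, shbg_nmol_l):
    """FAI = testosterone (nM) x 100 / SHBG (nM)."""
    return testosterone_nmol_l * 100 / shbg_nmol_l


def body_mass_index(weight_kg, height_m):
    """BMI in kg/m^2."""
    return weight_kg / height_m ** 2


def homa_ir(fasting_glucose_mmol_l, fasting_insulin_uU_ml):
    """HOMA-IR = fasting glucose (mmol/L) x fasting insulin (uU/mL) / 22.5."""
    return fasting_glucose_mmol_l * fasting_insulin_uU_ml / 22.5


# Hypothetical patient values
fai = free_androgen_index(2.0, 40.0)
bmi = body_mass_index(70.0, 1.60)
homa = homa_ir(5.0, 11.25)

is_obese = bmi >= 28            # cut-off for Chinese adults stated in the text
has_insulin_resistance = homa >= 2.5
```

For this hypothetical patient, the BMI of about 27.3 kg/m² falls just below the obesity cut-off, while the HOMA-IR of 2.5 meets the threshold for insulin resistance.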
Immunohistochemistry
For CXCL8, IFN-γ, IL-10, and MMP9, staining protocols previously validated for clinical testing were performed in the clinical pathology laboratory on an automatic immunohistochemical staining machine, BondIII (M-211668 and M-212599, Leica, Germany). The following primary antibodies were used: CXCL8 (rabbit polyclonal, PA5-79913, Thermo Scientific, Waltham, MA, USA), IFN-γ (rabbit polyclonal, PA5-95560, Thermo Scientific, Rockford, IL, USA), IL-10 (rabbit polyclonal, PA5-85660, Invitrogen, San Diego, USA) and MMP9 (rabbit polyclonal, PA5-13199, Thermo Scientific, Waltham, MA, USA), with antigen retrieval performed in low pH (6.0) for β-catenin and high pH (9.0) Tris/EDTA solution (Agilent) for the other markers at 97°C for 20 minutes.
Statistical Analysis
Statistical analysis was conducted using SPSS 26.0 software (IBM Corporation, USA) and GraphPad Prism 9.0 software (GraphPad Software Inc., USA). Measurement data were expressed as mean ± standard deviation (Mean ± SD). Prior to analysis, normality (Kolmogorov–Smirnov test) and homogeneity of variance (Levene’s test) were assessed. For intergroup comparisons between two normally distributed groups with equal variances, independent samples t-test was applied. One-way analysis of variance (ANOVA) followed by Tukey’s post-hoc test was employed for multiple group comparisons. Statistical significance was defined as a p-value less than 0.05.
Results
Data Preprocessing
As described in the Methods, batch effect correction was first performed on the two PCOS datasets, and a density distribution map was drawn (Figure S1). The sample distributions of the datasets differed markedly before batch effect removal, indicating the presence of a batch effect. After the batch effect was removed, the data distributions of the two datasets became consistent, with similar mean and variance.
WGCNA for Disease-Related Modules Screening
WGCNA was performed on the PCOS dataset using the expression values of the top 50% most variable genes, and a soft threshold value of 4 was first selected (Figure 1A). Next, based on clustering and dynamic tree cutting, highly correlated genes were grouped into modules, and these modules were then clustered. Modules with a correlation coefficient greater than 0.7, that is, a dissimilarity coefficient less than 0.3, were merged, and 6 modules were finally identified (Figure 1B).
Further, the correlation between the eigengene of each module and the phenotype was calculated. The turquoise module (2865 genes) showed the most significant positive correlation with PCOS (correlation coefficient r = 0.85; p < 0.001). Additionally, the yellow module (2531 genes) also showed a significant positive correlation with PCOS (correlation coefficient r = 0.67; p < 0.001). Thus, the genes of these two modules were regarded as the key module genes related to PCOS (Figure 1C).
For EC, the soft threshold value was 3 (Figure 1D), and 11 modules were identified (Figure 1E). The green module (593 genes) showed the most significant positive correlation with EC (correlation coefficient r=0.63; p < 0.001). Additionally, the midnightblue module (73 genes) also showed a significant positive correlation with EC (correlation coefficient r = 0.43; p < 0.001). Thus, these two module genes were regarded as the key module genes related to EC (Figure 1F).
Then, the key module genes of the two diseases were intersected, and a total of 211 common pathogenic genes were obtained (Figure 1G). GO enrichment analysis of the 211 common pathogenic genes identified 239 biological processes that were clustered into 27 clusters through similarity clustering (Figure 1H). Four KEGG pathways were obtained, including type I interferon signaling pathway, genomic instability, cellular senescence, cell cycle, viral protein interaction with cytokine and cytokine receptor, and cytokine-cytokine receptor interaction (Figure 1I).
Differential Expression Analysis
According to the thresholds, 1403 up-regulated and 1129 down-regulated DEGs were obtained from PCOS, and 2783 up-regulated and 3789 down-regulated DEGs were obtained from EC. The volcano plots are shown in Figure 2A and B. By intersecting the DEGs that were consistently up- or down-regulated in the two diseases, 287 common up-regulated DEGs and 205 common down-regulated DEGs were obtained (Figure 2C).
PPI Network Analysis
Further, we constructed the PPI network for the above common DEGs and obtained 1374 interaction pairs among 406 proteins (Table S1), indicating close interactions between these genes, which may play an important role in disease progression (Figure 2D). Further mining of the PPI network identified 3 sub-modules with score > 5 (Figure 2E).
GO and KEGG pathway enrichment analyses were performed on these three sub-modules successively. Module 1 was mainly enriched in type I interferon signaling pathway and NOD-like receptor signaling pathway. Module 2 was mainly enriched in cell chemotaxis, cell adhesion molecules, IL-17 signaling pathway, cellular senescence, TGF-β signaling pathway, etc. Module 3 was mainly enriched in neutrophil degranulation, acute inflammatory response, ferroptosis, HIF-1 signaling pathway (Figure 2F).
Finally, we used the CytoNCA plug-in to analyze the topological properties of the nodes. The top 30 genes under each attribute were selected for intersection. Eight genes, namely IL-10, CXCL8, interferon gamma (IFNG, IFN-γ), matrix metalloproteinase-9 (MMP9), platelet/endothelial cell adhesion molecule 1 (PECAM1), CYBB, MYD88 and IRF4, were considered to be the key genes (Figure 2G).
Analysis of Key Functional Pathways
In order to further determine which functions or pathways were the key influencing mechanisms of the two diseases, the intersection of the shared WGCNA module genes and common up/down-regulated genes obtained from the above analysis was extracted, and 76 shared DE pathogenic genes were finally obtained (Figure 3A).
Further GO function and KEGG pathway analysis of the 76 DE-pathogenic genes revealed that the type I interferon signaling pathway was significantly enriched and occupied the largest proportion among all functional pathways (Figure 3B).
Key Pathway and Gene Validation Analysis
For the validation sets, differential gene expression analysis was also performed for PCOS and EC respectively, and the volcano maps are shown in Figure 4A. Venn analysis identified 28 common up-regulated DEGs and 22 common down-regulated DEGs (Figure 4B). GO and KEGG pathway enrichment analyses of these common up/down-regulated DEGs also identified the type I interferon signaling pathway (Figure 4C), further suggesting that type I interferon signaling pathway was a key regulatory pathway.
To verify the expression levels of the key genes, box plots of the expression distribution of the 8 key genes in the PCOS and EC validation sets were drawn (Figure 4D). Except for CYBB, which showed no significant difference in the EC validation set, the genes were significantly up-regulated, consistent with the previous analysis results.
miRNAs and TFs Prediction
Based on the HMDD database, 36 miRNAs related to PCOS and 7 miRNAs related to EC were detected, among which hsa-miR-146a was associated with both diseases. Therefore, we focused on the functional pathway analysis of this miRNA, and the results showed that the type I interferon signaling pathway was significantly enriched again, demonstrating the importance of this pathway (Figure 5A).
Additionally, 26 TFs were predicted for the 8 key genes, and the TF-target network was constructed (Figure 5B).
Two-Sample MR Analysis
Analysis of Causal Relationship Between EC and PCOS
With EC as the exposure factor, PCOS as the outcome variable, and independent SNPs screened by the harmonise_data function in the R package TwoSampleMR as instrumental variables, MR analysis was performed using the mr function in TwoSampleMR. Five algorithms (MR Egger, weighted median, inverse variance weighted (IVW), simple mode, and weighted mode) were applied. The results showed a significant causal relationship between EC and PCOS (p < 0.05 and OR > 1 for IVW), indicating that EC was a risk factor for PCOS. Sensitivity analysis revealed neither heterogeneity nor horizontal pleiotropy. Leave-one-out analysis showed no serious bias, suggesting that the results were reliable (Figure 6A).
Figure 6 Two-sample MR analysis. (A) Leave-one-out forest plot for EC–PCOS. (B) Leave-one-out forest plot for PCOS–EC.
Analysis of Causal Relationship Between PCOS and EC
With PCOS as the exposure factor, EC as the outcome variable, and independent SNPs screened by harmonise_data function in the R-package TwoSampleMR as instrumental variable, MR analysis was performed using mr function in TwoSampleMR. According to the IVW result, there was no significant causal relationship between PCOS and EC (p > 0.05). Sensitivity analysis revealed that the results were reliable (Figure 6B).
Expression Level of Key Genes
A total of 24 patients were included in this study. The baseline characteristics of patients are shown in Table 2.
Table 2 Clinical Characteristics of Atypical Endometrial Hyperplasia (AEH), Endometrial Cancer (EC), PCOS-AEH and PCOS-EC
The protein levels of IL-10, CXCL8, IFN-γ and MMP9 in AEH, EC, PCOS-AEH and PCOS-EC groups were detected by immunohistochemistry. As shown in Figure 7, the expression levels of IFN-γ and MMP9 in AEH, EC, PCOS-AEH and PCOS-EC showed a gradually increasing trend. For IL-10, its expression level in EC and PCOS-EC groups was significantly higher than that in AEH group. These results were consistent with the results of bioinformatics analysis.
Discussion
The association between PCOS and EC is well established; however, inconsistencies in the current evidence mean that the strength of this association remains unclear. In this study, WGCNA and PPI network analysis were applied to identify hub genes for PCOS and EC, and 8 key genes, namely IL-10, CXCL8, IFNG, MMP9, PECAM1, CYBB, MYD88 and IRF4, were identified. Functional enrichment analysis indicated that the type I interferon signaling pathway was the most important common pathway for PCOS and EC. Furthermore, bidirectional MR found a causal effect of EC on PCOS (IVW, p < 0.05). Heterogeneity and sensitivity analyses showed no pleiotropy in the causal estimates in either direction, and the estimates were not driven by any single SNP, which further supports the reliability of the results.
Type I interferons (IFN-I) are cytokines that affect the expression of thousands of genes, resulting in profound cellular changes.33 IFN-I was initially recognized as an antiviral agent and has been reported to play essential roles in establishing and modulating host defense against microbial infection via induction of IFN-stimulated genes through the JAK-STAT signaling pathway.34 It has since become clear that cellular activation by IFN-I is much broader than just fighting viruses. A recent study reported that IFN signaling is essential in the implantation process in normal endometrium, and dysregulation of IFN signaling is associated with endometriosis progression.35 The chronic low-grade inflammatory state has become one of the factors affecting PCOS. Some alterable immune factors in PCOS, such as IL-15 and IL-1, have been identified to be related to androgen synthesis and insulin resistance in PCOS. IFN-I plays a critical role in the regulation of inflammation and is associated with various inflammatory diseases.36 To the best of our knowledge, our study is the first to suggest an important role of the type I interferon signaling pathway in EC and PCOS. Further studies are needed to explore the detailed mechanism of this pathway in PCOS and EC.
In addition to this pathway, we also identified 8 key genes (IL-10, CXCL8, IFNG, MMP9, PECAM1, CYBB, MYD88 and IRF4), which may play key roles in the pathological process of PCOS combined with EC. Among the 8 genes, 5 were found to be enriched in the type I interferon signaling pathway: IL-10, CXCL8, CYBB, MYD88 and IRF4. We therefore speculate that these 5 genes may be involved in the pathological process of PCOS through the type I interferon signaling pathway. The exact mechanisms of action need further exploration.
For the remaining genes, MMP9 is a zinc-dependent enzyme, which has been documented to play a pivotal part in various diseases. Curry et al37 reported that MMP9 was involved in tissue remodeling of the ovaries and uterus and responsible for follicle growth. Elevated MMP9 expression has been reported to be associated with trophoblast invasion during pregnancy.38 Previous studies have hypothesized that elevated levels of MMP9 may be associated with irregular menstruation and increased risk of cardiovascular disease in patients with PCOS.37,39 Functional studies have shown that MMP9 is expressed in proliferative phase endometrium, hyperplastic endometrium and EC.40,41 Moreover, the expression level of MMP9 increases with the development of endometrial disease.42 IFNG is common in the tumor microenvironment and in inflammation, and is primarily produced by T cells and NK cells in response to various inflammatory or immune stimuli.43 On the one hand, it acts as an immunogenicity enhancer by up-regulating the expression of genes required for MHC and antigen processing.44 On the other hand, IFNG can bind to PD-1 to act on tumor infiltrating T cells and inhibit tumor immune regulation.45 A recent study has reported that the IFNG related lncRNA signature predicts prognosis and indicates immune microenvironment infiltration in EC.46 PECAM1 is a soluble signaling molecule involved in inflammation and angiogenesis with predictive value for endothelial dysfunction in patients at risk. It has been reported that soluble PECAM1 is increased in PCOS and related to endothelial dysfunction.47 However, its role in EC has not been fully investigated.
Several previous observational studies have examined the relationship between PCOS and EC. For instance, a meta-analysis demonstrated that PCOS was associated with a higher risk of EC.7 However, the included studies were of moderate or low quality and did not adjust their results for confounding factors. Two other studies found no relationship between PCOS and the risk of EC after adjusting for BMI.48,49 These contradictory conclusions may stem from confounding factors. We therefore used MR to analyze the association of PCOS with EC. Compared with conventional observational studies, MR can provide relatively strong evidence because it reduces the influence of confounding factors.13 Our results revealed a causal effect of EC on PCOS (IVW, p < 0.05). Heterogeneity and sensitivity analyses detected no pleiotropy in the causal estimate, and the result was not driven by any single SNP, supporting the reliability of the results.
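For readers less familiar with the inverse-variance weighted (IVW) estimate, Cochran's Q heterogeneity statistic and the leave-one-out check referred to above, the following is a minimal Python sketch of how such quantities are typically computed from per-SNP summary statistics. It is an illustration only and not the pipeline used in this study: the function names `ivw` and `leave_one_out` are hypothetical, the weights use the common first-order approximation, and the input numbers are made up.

```python
import math


def ivw(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate from per-SNP summary statistics.

    Returns (estimate, standard error, two-sided p-value, Cochran's Q).
    """
    # Per-SNP Wald ratio: SNP-outcome effect divided by SNP-exposure effect.
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    # First-order weights: inverse variance of each Wald ratio, (bx / se_y)^2.
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exp, se_out)]
    total = sum(weights)
    est = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    # Two-sided p-value from a normal approximation.
    p = math.erfc(abs(est / se) / math.sqrt(2.0))
    # Cochran's Q: weighted spread of the per-SNP ratios around the pooled estimate.
    q = sum(w * (r - est) ** 2 for w, r in zip(weights, ratios))
    return est, se, p, q


def leave_one_out(beta_exp, beta_out, se_out):
    """Recompute the IVW estimate with each SNP dropped in turn."""
    n = len(beta_exp)
    estimates = []
    for i in range(n):
        keep = [j for j in range(n) if j != i]
        est, _, _, _ = ivw(
            [beta_exp[j] for j in keep],
            [beta_out[j] for j in keep],
            [se_out[j] for j in keep],
        )
        estimates.append(est)
    return estimates


# Made-up summary statistics for four hypothetical SNPs (not study data).
beta_exp = [0.10, 0.20, 0.15, 0.25]
beta_out = [0.05, 0.11, 0.07, 0.12]
se_out = [0.02, 0.03, 0.02, 0.04]

est, se, p, q = ivw(beta_exp, beta_out, se_out)
print(f"IVW estimate = {est:.3f}, SE = {se:.3f}, p = {p:.3g}, Q = {q:.3f}")
print("Leave-one-out:", [round(e, 3) for e in leave_one_out(beta_exp, beta_out, se_out)])
```

If the leave-one-out estimates stay close to the pooled estimate, no single SNP is driving the result; a Q value that is large relative to the number of SNPs minus one would instead point to heterogeneity among the instruments.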
The identification of shared molecular signatures between PCOS and EC offers potential clinical applications. The type I interferon signaling pathway, a key regulator in both conditions, could serve as a biomarker for risk stratification or disease monitoring in PCOS patients. For instance, elevated expression of interferon-related genes (eg, IFNG, CXCL8) in endometrial tissues might indicate heightened EC risk and guide personalized screening protocols. Additionally, the causal link from EC to PCOS (identified via MR) suggests that EC patients may benefit from endocrine evaluation to detect subclinical PCOS features. Therapeutically, targeting interferon signaling or its downstream effectors (eg, MMP9, MYD88) could offer dual benefits, addressing metabolic and inflammatory dysregulation in PCOS while mitigating EC progression. However, these hypotheses require validation in prospective clinical trials.
This study has several limitations. First, the bioinformatics analysis relied on publicly available datasets with relatively small sample sizes. Second, while MR analysis mitigates confounding, the GWAS data used (European populations) could introduce population bias, restricting conclusions to individuals of similar ancestry. Third, the validation cohort included only 24 patients, and protein expression was assessed for four genes (IL-10, CXCL8, IFNG, MMP9), leaving other key genes unverified. Finally, the functional interpretation of shared pathways remains correlative; mechanistic experiments (eg, gene knockdown or overexpression in endometrial/ovarian cell lines) are needed to establish causality in cellular or animal models.
Conclusion
In conclusion, our study is the first to explore the association between PCOS and EC using both bioinformatics and MR analyses. The type I interferon signaling pathway played a key regulatory role in PCOS and EC. Eight genes, including MMP9, PECAM1 and CYBB, may be key markers linking PCOS and EC. Our study offers a clear perspective on the genetic connection between EC and PCOS. Further experimental and clinical studies should be conducted to verify the identified molecular signatures.
Data Sharing Statement
All data generated or analysed during this study are included in this published article and its supplementary information files.
Ethics Approval and Consent to Participate
Informed consent was obtained from all subjects involved in the study. This study protocol adheres to the ethical principles outlined in the World Medical Association Declaration of Helsinki and has been formally reviewed and approved by the Institutional Review Board of Shanghai Obstetrics & Gynecology Hospital, Fudan University (Approval No.: kyy2024-85).
Acknowledgments
Dan Ye and Yi Yu contributed equally to this work and are co-first authors.
Funding
This work was supported by the Shanghai Educational Science Research Project (C2023160).
Disclosure
The authors report no conflicts of interest in this work.
References
1. Harris HR, Terry KL. Polycystic ovary syndrome and risk of endometrial, ovarian, and breast cancer: a systematic review. Fertil Res Pract. 2016;2(14):016–0029. doi:10.1186/s40738-016-0029-2
2. Okamura Y, Saito F, Takaishi K, et al. Polycystic ovary syndrome: early diagnosis and intervention are necessary for fertility preservation in young women with endometrial cancer under 35 years of age. Reprod Med Biol. 2016;16(1):67–71. doi:10.1002/rmb2.12012
3. Fauser B. Revised 2003 consensus on diagnostic criteria and long-term health risks related to polycystic ovary syndrome. Fertil Steril. 2004;81(1):19–25. doi:10.1016/j.fertnstert.2003.10.004
4. Sung H, Ferlay J, Siegel RL, et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2021;71(3):209–249. doi:10.3322/caac.21660
5. Scicchitano P, Dentamaro I, Carbonara R, et al. Cardiovascular risk in women with PCOS. Int J Endocrinol Metabolism. 2012;10(4):611. doi:10.5812/ijem.4020
6. Shafiee MN, Chapman C, Barrett D, et al. Reviewing the molecular mechanisms which increase endometrial cancer (EC) risk in women with polycystic ovarian syndrome (PCOS): time for paradigm shift? Gynecologic Oncol. 2013;131(2):489–492. doi:10.1016/j.ygyno.2013.06.032
7. Barry JA, Azizia MM, Hardiman PJ. Risk of endometrial, ovarian and breast cancer in women with polycystic ovary syndrome: a systematic review and meta-analysis. Human Reproduction Update. 2014;20(5):748–758. doi:10.1093/humupd/dmu012
8. Haoula Z, Salman M, Atiomo W. Evaluating the association between endometrial cancer and polycystic ovary syndrome. Hum Reprod. 2012;27(5):1327–1331. doi:10.1093/humrep/des042
9. Park JC, Lim SY, Jang TK, et al. Endometrial histology and predictable clinical factors for endometrial disease in women with polycystic ovary syndrome. Clin Exp Reprod Med. 2011;38(1):42–46. doi:10.5653/cerm.2011.38.1.42
10. Huang X, Zhong R, He X, et al. Investigations on the mechanism of progesterone in inhibiting endometrial cancer cell cycle and viability via regulation of long noncoding RNA NEAT1/microRNA-146b-5p mediated Wnt/β-catenin signaling. IUBMB Life. 2019;71(2):223–234. doi:10.1002/iub.1959
11. Kaur S, Archer KJ, Devi MG, et al. Differential gene expression in granulosa cells from polycystic ovary syndrome patients with and without insulin resistance: identification of susceptibility gene sets through network analysis. J Clin Endocrinol Metab. 2012;97(10):E2016–21. doi:10.1210/jc.2011-3441
12. Yumiceba V, López-Cortés A, Pérez-Villa A, et al. Oncology and pharmacogenomics insights in polycystic ovary syndrome: an integrative analysis. Front Endocrinol. 2020;11:585130. doi:10.3389/fendo.2020.585130
13. Emdin CA, Khera AV, Kathiresan S. Mendelian Randomization. JAMA. 2017;318(19):1925–1926. doi:10.1001/jama.2017.17219
14. Leek JT, Johnson WE, Parker HS, et al. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinformatics. 2012;28(6):882–883. doi:10.1093/bioinformatics/bts034
15. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinf. 2008;9(559):1471–2105. doi:10.1186/1471-2105-9-559
16. Ashburner M, Ball CA, Blake JA, et al. Gene ontology: tool for the unification of biology. The gene ontology consortium. Nat Genet. 2000;25(1):25–29. doi:10.1038/75556
17. Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28(1):27–30. doi:10.1093/nar/28.1.27
18. Yu G, Wang L-G, Han Y, et al. clusterProfiler: an R package for comparing biological themes among gene clusters. Omics. 2012;16(5):284–287. doi:10.1089/omi.2011.0118
19. Gu Z, Hübschmann D. simplifyEnrichment: a bioconductor package for clustering and visualizing functional enrichment results. Genomics Proteomics Bioinf. 2023;21(1):190–202. doi:10.1016/j.gpb.2022.04.008
20. Smyth GK. Limma: Linear Models for Microarray Data, in Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Springer; 2005:397–420.
21. Szklarczyk D, Franceschini A, Wyder S, et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015;43(Database issue):28. doi:10.1093/nar/gku1003
22. Shannon P, Markiel A, Ozier O, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–2504. doi:10.1101/gr.1239303
23. Bader GD, Hogue CW. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinf. 2003;4(2):1471–2105. doi:10.1186/1471-2105-4-2
24. Tang Y, Li M, Wang J, et al. CytoNCA: a cytoscape plugin for centrality analysis and evaluation of protein interaction networks. Biosystems. 2015;127:67–72. doi:10.1016/j.biosystems.2014.11.005
25. Bindea G, Mlecnik B, Hackl H, et al. ClueGO: a Cytoscape plug-in to decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics. 2009;25(8):1091–1093. doi:10.1093/bioinformatics/btp101
26. Huang Z, Shi J, Gao Y, et al. HMDD v3.0: a database for experimentally supported human microRNA-disease associations. Nucleic Acids Res. 2019;47(D1):D1013–D1017. doi:10.1093/nar/gky1010
27. Vlachos IS, Zagganas K, Paraskevopoulou MD, et al. DIANA-miRPath v3.0: deciphering microRNA function with experimental support. Nucleic Acids Res. 2015;43(W1):W460–W466. doi:10.1093/nar/gkv403
28. Han H, Cho J-W, Lee S, et al. TRRUST v2: an expanded reference database of human and mouse transcriptional regulatory interactions. Nucleic Acids Res. 2018;46(D1):D380–D386. doi:10.1093/nar/gkx1013
29. Hemani G, Zheng J, Elsworth B, et al. The MR-Base platform supports systematic causal inference across the human phenome. elife. 2018;7:e34408. doi:10.7554/eLife.34408
30. Elsworth B, Lyon M, Alexander T, et al. The MRC IEU OpenGWAS data infrastructure. BioRxiv. 2020;
31. Miller EC, Kauko A, Tom SE, et al. Risk of midlife stroke after adverse pregnancy outcomes: the FinnGen Study. Stroke. 2023;54(7):1798–1805. doi:10.1161/STROKEAHA.123.043052
32. Yang B, Xie L, Zhang H, et al. Insulin resistance and overweight prolonged fertility-sparing treatment duration in endometrial atypical hyperplasia patients. J Gynecol Oncol. 2018;29(3):19. doi:10.3802/jgo.2018.29.e35
33. Schreiber G. The molecular basis for differential type I interferon signaling. J Biol Chem. 2017;292(18):7285–7294. doi:10.1074/jbc.R116.774562
34. Chen K, Liu J, Cao X. Regulation of type I interferon signaling in immunity and inflammation: a comprehensive review. J Autoimmun. 2017;83:1–11. doi:10.1016/j.jaut.2017.03.008
35. Park Y, Han SJ. Interferon signaling in the endometrium and in endometriosis. Biomolecules. 2022;12(11):1554. doi:10.3390/biom12111554
36. Ji L, Li T, Chen H, et al. The crucial regulatory role of type I interferon in inflammatory diseases. Cell Biosci. 2023;13(1):023–01188.
37. Curry TE, Osteen KG. The matrix metalloproteinase system: changes, regulation, and impact throughout the ovarian and uterine reproductive cycle. Endocr Rev. 2003;24(4):428–465. doi:10.1210/er.2002-0005
38. Staun-Ram E, Goldman S, Gabarin D, et al. Expression and importance of matrix metalloproteinase 2 and 9 (MMP-2 and −9) in human trophoblast invasion. Reprod Biol Endocrinol. 2004;2(59):1477–7827. doi:10.1186/1477-7827-2-59
39. Lewandowski KC, Komorowski J, O’Callaghan CJ, et al. Increased circulating levels of matrix metalloproteinase-2 and −9 in women with the polycystic ovary syndrome. J Clin Endocrinol Metab. 2006;91(3):1173–1177. doi:10.1210/jc.2005-0648
40. Graesslin O, Cortez A, Fauvet R, et al. Metalloproteinase-2, −7 and −9 and tissue inhibitor of metalloproteinase-1 and −2 expression in normal, hyperplastic and neoplastic endometrium: a clinical-pathological correlation study. Ann Oncol. 2006;17(4):637–645. doi:10.1093/annonc/mdj129
41. Zhang X, Qi C, Lin J. Enhanced expressions of matrix metalloproteinase (MMP)-2 and −9 and vascular endothelial growth factors (VEGF) and increased microvascular density in the endometrial hyperplasia of women with anovulatory dysfunctional uterine bleeding. Fertil Steril. 2010;93(7):2362–2367. doi:10.1016/j.fertnstert.2008.12.142
42. Li X, Zha L, Li B, et al. Clinical significance of MMP-9 overexpression in endometrial cancer: a PRISMA-compliant meta-analysis. Front Oncol. 2022;12:925424.
43. Kobayashi SD, Malachowa N, DeLeo FR. Neutrophils and bacterial immune evasion. J Innate Immun. 2018;10(5–6):432–441. doi:10.1159/000487756
44. Dunn GP, Old LJ, Schreiber RD. The immunobiology of cancer immunosurveillance and immunoediting. Immunity. 2004;21(2):137–148. doi:10.1016/j.immuni.2004.07.017
45. Zaidi MR. The interferon-gamma paradox in cancer. J Interferon Cytokine Res. 2019;39(1):30–38. doi:10.1089/jir.2018.0087
46. Gu C, Lin C, Zhu Z, et al. The IFN-γ-related long non-coding RNA signature predicts prognosis and indicates immune microenvironment infiltration in uterine corpus endometrial carcinoma. Front Oncol. 2022;12(955979). doi:10.3389/fonc.2022.955979
47. Pepene CE. Soluble platelet/endothelial cell adhesion molecule (sPECAM)-1 is increased in polycystic ovary syndrome and related to endothelial dysfunction. Gynecol Endocrinol. 2012;28(5):370–374. doi:10.3109/09513590.2011.632792
48. Fearnley EJ, Marquart L, Spurdle AB, et al. Polycystic ovary syndrome increases the risk of endometrial cancer in women aged less than 50 years: an Australian case-control study. Cancer Causes Control. 2010;21(12):2303–2308. doi:10.1007/s10552-010-9658-7
49. Zucchetto A, Serraino D, Polesel J, et al. Hormone-related factors and gynecological conditions in relation to endometrial cancer risk. Eur J Cancer Prev. 2009;18(4):316–321. doi:10.1097/CEJ.0b013e328329d830
© 2025 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
