Integrated Datasets of Proteomic and Metabolomic Biomarkers to Predict Its Impacts on Comorbidities of Type 2 Diabetes Mellitus



Objective:
The objective of the current study is to carry out a comparative exploration of the biological roles of differentially dysregulated genes (DRGs) in type 2 diabetes mellitus (T2DM). The study aimed to determine the impact of these DRGs on the biological pathways and networks related to the disorders and complications associated with T2DM and to predict their role as prospective biomarkers.
Methods:
Datasets obtained from metabolomic and proteomic profiling were used to investigate the differential expression of the genes. A subset of DRGs was entered into IPA software to explore their biological pathways, related diseases, and regulation in T2DM. Upon entry into IPA, only 94 of the DRGs were recognized, mapped, and matched within the database.
Results:
The study identified networks that reveal the dysregulation of several functions: cellular components, such as degranulation of cells; molecular transport and metabolism of cellular proteins; and inflammatory responses. The top disorders associated with DRGs in T2DM are related to organ injuries such as renal damage, connective tissue disorders, and acute inflammatory disorders. Upstream regulator analysis predicted the role of several transcription factors of interest, such as STAT3 and HIF alpha, as well as kinases such as the JAK kinases, which affect the gene expression of the dataset in T2DM. Interleukin 6 (IL6) is the top regulator of the DRGs, followed by leptin (LEP). Monitoring the dysregulation of the coupled expression of TNF, IL6, LEP, AGT, APOE, F2, SPP1, and INS suggests that these could be used as potential prognostic biomarkers.
Conclusion:
The integration of data obtained by advanced metabolomic and proteomic technologies has made it possible to advance the understanding of the role of these biomarkers in identifying the significant biological processes, pathways, and regulators associated with T2DM and its comorbidities.
Keywords: type 2 diabetes mellitus, pathway analysis, regulators, biomarkers, disorders, bioinformatics

Plain Summary
In summary, incorporating metabolomic and proteomic data through integrative pathway analysis with different tools helps in understanding the role of various biomarkers in identifying the biological processes, pathways, upstream regulators, and pathophysiology associated with type 2 diabetes mellitus (T2DM) and its comorbidities. Such a study could therefore help to recognize patients at higher risk for a specific complication and to anticipate their response to a particular class of anti-diabetic drugs. This study could support personalized medicine for T2DM.

Introduction
Recent decades have seen a remarkable upsurge in the prevalence of diabetes mellitus (DM) worldwide, particularly of type 2 diabetes. 1 Type 2 diabetes mellitus (T2DM) is the most commonly diagnosed form of diabetes and is characterized by insulin resistance, impaired β-cell function, hyperglycemia, and comorbidities that include obesity and cardiovascular disease. 2 The impact of diabetes on health, the health care system, financial cost, and life expectancy is expected to increase in the coming years. Optimum treatment of T2DM requires multiple measures to manage hyperglycemia and hyperlipidemia and to address the risk factors for the array of diabetic complications.
Identification of biomarkers for T2DM and its complications is challenging because of the diverse nature of the disease. Several factors contribute to the heterogeneity of T2DM, such as glycemic control, treatment response, duration of the disease, age of onset, biochemical profile, body mass index, and variations in environmental exposures. 3 Biomarkers are needed for the evaluation of chemical profiles, disease status, target validation, and treatment regimens. Advances in proteomic and metabolic profiling have increased the screening for experimental biomarkers. Serum biomarkers currently exist for T2DM, but it remains a challenge to evaluate pathophysiology on a patient-by-patient basis. Therefore, novel biomarkers based on the integration of different profiles, such as metabolomics, proteomics, and transcriptomics, are needed; they would better reflect the regulation of gene expression and the biological processes in diabetes and would support preventative strategies that lessen complications.
Metabolites represent intermediate and end products of metabolic pathways and reflect the physiology and dysfunction of metabolic processes and disorders. Recent technology allows metabolites to be assessed, opening new opportunities to study changes in biochemical pathways and to gain insight into the biological mechanisms of disorders such as T2DM and its comorbidities. 4 This integration constitutes the promise of personalized medicine (PM).
Personalized medicine could help in screening subjects at risk of developing T2DM, as well as one or all of its complications, whether microangiopathies, such as retinopathy, neuropathy, and nephropathy, or macroangiopathy (large-vessel disease). It also has the potential to direct treatment planning with regard to personalized goal setting, choice of treatments, and treatment prioritization. 5 In this study, the aim is to carry out a comparative and integrative investigation of metabolomic and proteomic datasets of gene expression to identify differentially dysregulated genes (DRGs) as potential predictive biomarkers associated with type 2 diabetes and its complications. The integrated dataset of DRGs is used to reveal significant pathways, biological functions, and diseases that are relevant to understanding the pathogenesis of the comorbidities and complications of diabetes. The study sought to recognize the biological processes and metabolic pathways of DRGs related to T2DM that were differentially up- or downregulated in comparison to healthy controls. The overall goal of the current study is therefore to recognize patients at higher risk for a disorder or complication associated with T2DM by understanding the biological networks and pathways that underlie these diseases and that could respond better to management and drug treatment.

Study Selection and Sample Collection
Datasets from two previously published studies on T2DM by our group were selected for gene expression integration. 5,6 Each gene was described by its fold change. All data were collected from serum/plasma samples of patients with T2DM at the Hamad Medical Hospital, Qatar (HMC) and of a healthy control group. All subjects were unrelated Arab subjects from different countries, to exclude familial relatedness. A total of 140 subjects were involved in the present study, of whom 85 were T2DM subjects and 55 were healthy, non-diabetic controls. T2DM was diagnosed by the medical team of the diabetic unit (HMC) according to the American Diabetes Association (ADA) criteria, consisting of fasting plasma glucose ≥126 mg/dL (≥6.993 mmol/L), 2-h plasma glucose ≥200 mg/dL (≥11.1 mmol/L) during an oral glucose tolerance test, and/or HbA1c ≥6.5%. 7 The oral glucose tolerance test was applied following the ADA diagnostic criteria to ensure that subjects were classified correctly and that the controls were neither pre-diabetic nor diabetic. The diabetic patients were older than the controls; this was accounted for in the current study by comparing the data using AUC (see Supplementary Table 1). The study was approved by the Institutional Review Boards of the Hamad Medical Corporation, Qatar University, and Georgetown University (HMC approval number 8249/08, QU-IRB-06/09, and 2008-538, respectively). The study was performed according to the principles expressed in the Declaration of Helsinki. Written informed consent was obtained from each subject after a full explanation of the purpose, nature, and risk of all procedures used.
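The diagnostic thresholds quoted above can be expressed as a short check. The following Python sketch is illustrative only and is not the procedure used by the clinical team; the function names are hypothetical, and the conversion factor assumes a molar mass of 180.16 g/mol for glucose.

```python
# Illustrative sketch: function names are hypothetical, thresholds are the
# ADA values quoted in the text (FPG >= 126 mg/dL, 2-h OGTT >= 200 mg/dL,
# HbA1c >= 6.5%).

# Assumption: glucose molar mass 180.16 g/mol, so 1 mmol/L = 18.016 mg/dL.
GLUCOSE_MG_DL_PER_MMOL_L = 18.016


def mg_dl_to_mmol_l(mg_dl: float) -> float:
    """Convert a plasma glucose concentration from mg/dL to mmol/L."""
    return mg_dl / GLUCOSE_MG_DL_PER_MMOL_L


def meets_ada_criteria(fpg_mg_dl=None, ogtt_2h_mg_dl=None, hba1c_pct=None) -> bool:
    """Return True if any available measurement reaches its threshold.

    The text joins the criteria with "and/or", so one positive criterion
    is treated as sufficient here. Missing measurements are passed as None.
    """
    return any([
        fpg_mg_dl is not None and fpg_mg_dl >= 126,
        ogtt_2h_mg_dl is not None and ogtt_2h_mg_dl >= 200,
        hba1c_pct is not None and hba1c_pct >= 6.5,
    ])


# Hypothetical subject, not taken from the study data.
print(round(mg_dl_to_mmol_l(126), 3))
print(meets_ada_criteria(fpg_mg_dl=131, hba1c_pct=6.1))
```

Treating missing measurements as "not positive" is a design choice of this sketch; a clinical workflow would normally require confirmation testing.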
After overnight fasting, venous blood was collected as previously described. 5 Serum, plasma, and buffy coat were separated from the whole blood and stored at −80°C within 4 hours of collection. For maximum longevity and to avoid repeated freeze-thaw cycles, the plasma and serum samples were aliquoted extensively and stored at −80°C until further use. 5 For the metabolomic and lipidomic profiling experiment, levels of high-density lipoprotein cholesterol, total cholesterol, and triglycerides were assayed by automated clinical laboratory methods using a diagnostic analyzer. Low-density lipoprotein cholesterol levels were estimated using the Friedewald formula: LDL-C = TC - HDL-C - (TG/5). 8 Serum aminotransferase, albumin, alkaline phosphatase, and creatinine were also assayed using a diagnostic analyzer at HMC as previously published. 9,10 Metabolites were extracted from plasma by adding 175 µL of 66% acetonitrile (in water) containing internal standards to 25 µL of plasma. The samples were incubated on ice for 15 minutes and centrifuged at 14,000 rpm at 4°C for 20 minutes. The supernatant was transferred to a fresh tube and dried under vacuum. The dried samples were resuspended in 100 µL of solvent A (98% water and 2% acetonitrile) for UPLC-ESI-Q-TOF-MS analysis. To increase metabolome coverage, plasma lipidomics was performed by extracting lipids using a previously described method. 11 For the protein expression profiling experiment, serum samples were delipidated according to the protocol described by Cham and Knowles in preparation for iTRAQ analysis. 12 ProteoExtract Albumin/IgG (Calbiochem) and Vivapure anti-HSA/IgG kits were used to evaluate the efficiency of high-abundance protein depletion from serum samples. Total protein concentration was determined by the Bradford assay. The Vivapure anti-HSA/IgG kit was used for the iTRAQ experiment.
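As a worked illustration of the Friedewald estimate mentioned above, the Python sketch below computes LDL-C from total cholesterol, HDL-C, and triglycerides, all in mg/dL. The function name and the example values are hypothetical; the guard on triglycerides reflects the commonly stated validity limit of the formula (below 400 mg/dL) and is an addition of this sketch, not something stated in the text.

```python
def friedewald_ldl(tc_mg_dl: float, hdl_mg_dl: float, tg_mg_dl: float) -> float:
    """Estimate LDL cholesterol (mg/dL) as TC - HDL-C - TG/5.

    TG/5 approximates VLDL cholesterol when all values are in mg/dL.
    """
    if tg_mg_dl >= 400:
        # Assumption of this sketch: refuse to estimate above the usual limit.
        raise ValueError("Friedewald estimate is not considered valid for TG >= 400 mg/dL")
    return tc_mg_dl - hdl_mg_dl - tg_mg_dl / 5.0


# Hypothetical lipid panel, not taken from the study data.
print(friedewald_ldl(tc_mg_dl=200, hdl_mg_dl=50, tg_mg_dl=150))  # 120.0
```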

Data Processing and Metabolite Identification
For the metabolomic and lipidomic profiling experiment, centroided and integrated UPLC-TOFMS data were preprocessed using the XCMS software and normalized to the ion intensity of the respective internal standards. 13 Multivariate analyses were performed to delineate significantly altered metabolites. The metabolites were identified via accurate mass-based search using the Madison Metabolomics Consortium Database (MMCD), 14 the Human Metabolome Database (HMDB), 15 and LIPID MAPS. The lipids with a significant fold change in T2DM compared to the control group were identified via spectral matching-based lipid identification software, SimLipid v 3.0, and LIPID MAPS, 16 and confirmed against the fragmentation pattern of standards. Metabolite identifications were confirmed by comparing the retention time under the same chromatographic conditions and by matching the fragmentation pattern of the parent ion from the biological sample to that of the standard metabolite using tandem mass spectrometry (UPLC-TOFMS/MS).

Nano-UPLC-MS/MS and iTRAQ Analysis
Nano UPLC-MS/MS analysis was conducted on an electrospray quadrupole time-of-flight (ESI-QTOF) mass spectrometer coupled with a Nano-Acquity-UPLC system. Relative abundance quantitation and peptide and protein identifications were performed using ProteinPilot software 3.0 (AB SCIEX). Data were analyzed with MMTS as a fixed modification of cysteine, and the database was searched with a 95% confidence threshold for protein identifications. High-confidence peptides of the target proteins exhibiting a rich product ion spectrum were selected for multiple reaction monitoring (MRM) assays. MRM data were processed using TargetLynx 2.0, while GraphPad Prism v 5.0 was used for statistical analysis and to generate the receiver operating characteristic (ROC) curves. The Wilcoxon test was used for the comparison of each peptide.
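The area under the ROC curve (AUC) is closely related to the rank-based comparison named above: it equals the probability that a randomly chosen case has a higher value than a randomly chosen control, with ties counted as one half. The Python sketch below computes the AUC directly from this definition. It is a minimal illustration, not the GraphPad Prism procedure, and the function name and peptide intensities are hypothetical.

```python
def roc_auc(case_values, control_values):
    """AUC as the fraction of case/control pairs in which the case is higher.

    Ties contribute 0.5. This equals U / (n_cases * n_controls), where U is
    the Mann-Whitney U statistic for the case group.
    """
    if not case_values or not control_values:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical peptide intensities, not taken from the study data.
t2dm = [3.0, 4.0, 5.0]
controls = [1.0, 2.0, 3.0]
print(roc_auc(t2dm, controls))
```

The pairwise loop is quadratic in the number of subjects, which is acceptable for a cohort of this size; a rank-based implementation would be preferable for large datasets.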

Luminex Analysis
The serum samples of the study subjects (diabetics and controls) were used to evaluate different panels of biomarkers, including inflammatory, adipokine, oxidative stress, metabolic, CVD, and bone markers. The following multiplex assay kits were used to evaluate these panels: HMHMAG-34, HCVD1-67AK, APOMAG-62K, HADK1MAG-61k-03, HBN1A-51K, and HCVD2MAG-67K. All assays were performed in triplicate according to the manufacturer's instructions (Merck Millipore, Billerica, MA, USA). The assays were performed using a Luminex200 (Austin, TX, USA). Of note, principal component analysis (PCA) was performed for all datasets of the study to obtain DRGs, as we previously published. 5,6
Statistical Analyses
Clinical and biochemical data are expressed as mean ± SD. All statistical analyses were performed using the SPSS program for Windows (version 21 statistical software: Texas instruments, IL, USA). Differences between control and T2DM subjects were assessed using Student's t-test or the Mann-Whitney/Wilcoxon test, as appropriate. ANCOVA with age as a covariate was used to compare the biochemical data of the study subjects and AUC (see Supplementary Table 1). A two-tailed p value < 0.05 was considered significant.

Bioinformatic Analysis
Ingenuity Pathway Analysis (IPA; http://www.ingenuity.com/) was performed to identify the canonical pathways, diseases and functions, and gene networks that are most significant to the dataset and to categorize differentially dysregulated genes into specific diseases and functions. We also used Pathway Studio 9 (Elsevier) for integrated pathway mapping. For network generation, a dataset containing the gene/protein/metabolite identifiers with the corresponding fold changes and UniProt IDs was uploaded into the application. Each protein/metabolite ID was mapped to its corresponding gene/protein/metabolite in the Ariadne ResNet Mammalian database. The metabolites were grouped based on their fold change and used to develop networks based on regulation and connectivity.
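The grouping by fold change described above can be sketched as follows. This Python example is a minimal illustration under stated assumptions: fold change is taken to be a positive T2DM/control ratio, the function name is hypothetical, and the values are invented for illustration (only the directions for TNF, IL6, and INS follow the text). It does not reproduce the input format of IPA or Pathway Studio.

```python
import math


def group_by_fold_change(fold_changes, neutral=1.0):
    """Split identifiers into up- and downregulated groups.

    fold_changes maps an identifier to a positive T2DM/control ratio
    (an assumption of this sketch). Values are reported as log2 ratios,
    so that up- and downregulation are symmetric around zero.
    """
    groups = {"up": {}, "down": {}, "unchanged": []}
    for identifier, ratio in fold_changes.items():
        if ratio <= 0:
            raise ValueError(f"fold change for {identifier} must be positive")
        if ratio > neutral:
            groups["up"][identifier] = math.log2(ratio)
        elif ratio < neutral:
            groups["down"][identifier] = math.log2(ratio)
        else:
            groups["unchanged"].append(identifier)
    return groups


# Hypothetical ratios; only the direction of change follows the text.
example = {"TNF": 2.0, "IL6": 4.0, "INS": 0.5}
groups = group_by_fold_change(example)
print(sorted(groups["up"]), sorted(groups["down"]))
```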

Clinical Data of the Study Subjects
The characteristics of the study population are described in Table 1. Diabetic subjects have significantly higher values for age, BMI, glucose, HbA1c%, estimated average glucose (eAG), triglycerides, ALT, and CRP than healthy control subjects. Diabetic subjects have significantly lower values for C-peptide and insulin than healthy control subjects. Other variables are not significantly different between the two groups. Table 2 displays the 94 differentially expressed proteins; of note, few proteins (9) are downregulated, while the majority (85) are upregulated.

Bioinformatic Analysis
The differentially expressed proteins were further evaluated to obtain details of the biological processes, cellular functions, networks, and signaling pathways related to the comorbidities and diseases associated with T2DM. The dataset was entered into IPA and Pathway Studio 9 for the analysis of molecular pathways and networks.

Biological Pathways
Canonical Pathway Analysis
The dataset was analyzed using IPA core analysis to obtain a fundamental profile of the molecular processes underlying T2DM and its complications. The differentially expressed genes were categorized by related canonical pathways. Categorization was based on a multiple-testing-corrected p-value of less than 10⁻² and on the ratio of genes in the present dataset to the number of genes in the IPA knowledge base for each pathway. Figure 1 illustrates the top 15 significantly enriched canonical pathways in T2DM patients (see Supplementary Table S2 for details). The top 5 enriched signaling pathways, in rank order, were LXR/RXR Activation, FXR/RXR Activation, Acute Phase Response Signaling, Atherosclerosis Signaling, and Clathrin-mediated Endocytosis, as shown in Table 3.
The LXR/RXR Activation pathway, which involves the retinoid X receptors (RXRs), ranked first among the signaling pathways and included 31 upregulated and 3 downregulated genes of the 128 genes in the IPA knowledge base (Supplementary Table S3). RXRs are nuclear receptors that mediate the biological effects of retinoids through retinoic acid-mediated gene expression, which affects biological functions such as lipid metabolism, molecular transport, and small molecule biochemistry. The second pathway is FXR/RXR Activation, involving the farnesoid X receptor (FXR), which included 29 upregulated and 3 downregulated genes of the 137 genes in the IPA knowledge base (Table S4). FXR is a member of the nuclear receptor family and has a fundamental function in the regulation of numerous metabolic pathways, such as bile acid metabolism and its control (Figure 2). The third pathway is Acute Phase Response Signaling, in which 34 DRGs of the dataset were detected, of which 1 was downregulated and 33 were upregulated, out of the 188 genes in the IPA knowledge base (Table S5). This is a cytokine signaling pathway that is activated as a protective response to tissue injury, trauma, surgery, cancer, immunologic disorders, and microorganisms. The fourth pathway is Atherosclerosis Signaling, in which 20 DRGs of the dataset were detected, of which 2 were downregulated and 18 were upregulated, out of the 127 genes in the IPA knowledge base (Table S6). This pathway represents a specific form of chronic inflammatory process involving cell-to-cell signaling and interaction and cellular movement in the cardiovascular system. The fifth pathway is Clathrin-mediated Endocytosis Signaling, in which 19 DRGs of the dataset were detected, of which 2 were downregulated and 17 were upregulated, out of the 196 genes in the IPA knowledge base (Table S7).
Clathrin-mediated Endocytosis Signaling is involved in endocytosis, the principal route by which nutrients, hormones, and other signaling molecules move from the extracellular space across the plasma membrane into intracellular structures. Other canonical pathways included IL-12 Signaling and Production in Macrophages, in which 16 DRGs of the dataset were detected, of which 1 was downregulated and 15 were upregulated, out of the 136 genes in the IPA knowledge base (Table S2). IL-12 is produced primarily by dendritic cells, macrophages, and monocytes and affects the Th1 immune response and Th17 cell activation. This was followed by Production of Nitric Oxide and Reactive Oxygen Species in Macrophages, in which 16 DRGs of the dataset were detected, of which 1 was downregulated and 15 were upregulated, out of the 194 genes in the IPA knowledge base (Table S2). This pathway is central to the control of microbial infection.
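The pathway counts reported above can be turned into overlap ratios (dataset genes divided by pathway size), and an enrichment p-value of the kind IPA reports is commonly described as a right-tailed Fisher's exact test. The Python sketch below computes both. It is illustrative only: the function names are hypothetical, the p-value uses the equivalent hypergeometric tail, and the background size is an assumed placeholder because the reference set used by IPA is not stated in the text. The resulting p-values are therefore not expected to match Table 3, and no multiple-testing correction is applied.

```python
from math import comb

# (dataset genes in pathway, pathway size in the IPA knowledge base),
# as reported in the text.
PATHWAY_COUNTS = {
    "LXR/RXR Activation": (34, 128),
    "FXR/RXR Activation": (32, 137),
    "Acute Phase Response Signaling": (34, 188),
    "Atherosclerosis Signaling": (20, 127),
    "Clathrin-mediated Endocytosis Signaling": (19, 196),
}

DATASET_SIZE = 94  # DRGs mapped by IPA, as reported in the text
ASSUMED_BACKGROUND = 20000  # hypothetical reference-set size, not from the text


def overlap_ratio(hits: int, pathway_size: int) -> float:
    """Fraction of the pathway's genes that appear in the dataset."""
    return hits / pathway_size


def right_tailed_p(hits: int, pathway_size: int, dataset_size: int, background_size: int) -> float:
    """P(X >= hits) for a hypergeometric draw of dataset_size genes."""
    upper = min(pathway_size, dataset_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, dataset_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / comb(background_size, dataset_size)


for name, (hits, size) in PATHWAY_COUNTS.items():
    ratio = overlap_ratio(hits, size)
    p = right_tailed_p(hits, size, DATASET_SIZE, ASSUMED_BACKGROUND)
    print(f"{name}: ratio={ratio:.3f}, p={p:.2e} (assumed background)")
```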

Diseases Associated with DRGs
Using the IPA knowledge base, we detected several significant diseases associated with the dataset of the current study (Tables S8 and S9). The top five affected disease categories, ranked by multiple-testing-corrected p-value, are presented in Table 4. These are Neurologic Diseases, Organismal Injury and Abnormalities, Psychological Disorders, Inflammatory Response Disorders, and Metabolic Disorders. We further analyzed the top significant disorders within each of these categories, as shown in Table 5.

Neurologic and Psychological Diseases Category
Within this category, progressive neurological disorders involved 3 DRGs of the dataset, of which APOE and SERPING1 are upregulated and INS is downregulated, which increases the predicted likelihood of progressive neurological disorders, as shown in Table 5. Dementia is a neurological and psychological disorder that involved 2 DRGs of the dataset: APOE is upregulated and INS is downregulated, which increases the predicted likelihood of dementia (Table 5). Alzheimer's disease is a neurological and psychological disorder that likewise involved 2 DRGs of the dataset: APOE is upregulated and INS is downregulated, which increases the predicted likelihood of Alzheimer's disease (Table 5). Eating disorder is a significant psychological problem in diabetic subjects. As displayed in Table 5, eating disorder is predicted to increase with six DRGs: LEP, IL6, GCG, SERPINE1, TNF, and INS.

Organismal Injury and Abnormalities Category
Amyloidosis is one of the Organismal Injury and Abnormalities disorders. As displayed in Table 5, INS and IL6 increase, while APOE and APOA1 decrease, the prediction of amyloidosis.
Apoptosis of endothelial cells is another Organismal Injury and Abnormalities disorder and is predicted to be increased (z-score 3.29). As presented in Table 5, apoptosis of endothelial cells is predicted to increase with 12 DRGs, which are PLG, RBP4, KNG1, LEP, SERPINA3, AGT, SERPINE1, LUM, SERPINC1, TNF, and SPP1.
Damage of the genitourinary system is predicted to be increased (z-score 2.203). As displayed in Table 5, damage of the genitourinary system is predicted to increase with seven DRGs, which are CFB, AGT, IL6, C4A/C4B, ICAM1, TNF, and GC.

Inflammatory Response Disorders Category
Degranulation of cells is predicted to be increased, with an activation z-score of 2.60. As displayed in Table 5, degranulation of cells is predicted to increase with seven genes, which are PPBP, CCL2, F2, LEP, GCG, C4A/C4B, and TNF (Figure S1). Activation of leukocytes is predicted to be increased, with an activation z-score of 2.359. As displayed in Table 5, activation of leukocytes is predicted to increase with 17 DRGs, which are RBP4, KLKB1, KNG1, CCL2, F2, LEP, AGT, IL6, APOA1, IGHG3, ATP, SERPINF1, FN1, ICAM1, TNF, GC, and VTN.

Metabolic Disorders Category
Hypertriglyceridemia ranks first among the metabolic disorders and cardiovascular diseases (CVD) associated with T2DM, with a z-score of 2.035. Hypertriglyceridemia is predicted to increase with the upregulated genes APOA2, APOB, APOC3, APOE, IL6, SERPINF1, and TNF and the downregulated gene INS, as shown in Table 5. Dyslipidemia is a metabolic disorder commonly associated with T2DM, with a z-score of 1.260. Dyslipidemia is predicted to increase with the genes APOA2, APOB, APOC3, IL6, INS, and TNF, as shown in Table 5. See Supplementary Figure S1.
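For orientation, the activation z-scores quoted in the disease categories above summarize how consistently the observed direction of each gene agrees with a predicted increase or decrease of the disease or function. In its simplest unweighted form, as described in the published account of IPA's causal analysis, the score is z = (N+ - N-)/√N, where N+ and N- are the numbers of genes whose observed change is consistent or inconsistent with activation and N = N+ + N-. The Python sketch below implements only this simplified form; the function name is hypothetical, and the values reported in Table 5 come from IPA itself and may include weighting or bias corrections that are not modeled here.

```python
from math import sqrt


def activation_z_score(directions):
    """Unweighted activation z-score from per-gene direction calls.

    directions is a sequence of +1 (observed change consistent with
    activation) or -1 (consistent with inhibition).
    """
    calls = list(directions)
    if not calls:
        raise ValueError("at least one gene is required")
    if any(call not in (1, -1) for call in calls):
        raise ValueError("each direction must be +1 or -1")
    return sum(calls) / sqrt(len(calls))


# Hypothetical example: three genes support activation, one opposes it.
print(activation_z_score([1, 1, 1, -1]))  # 1.0
```

In IPA practice, an absolute z-score of about 2 or more is usually taken as a significant prediction of activation or inhibition.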

Functional Disorders Associated with Dataset in T2DM
Furthermore, we analyzed the common functional disorders associated with the DRGs of the dataset using the IPA knowledge base. The top 5 functions affected are Cellular Compromise, Protein Synthesis, Molecular Transport, Lipid Metabolism, and Small Molecule Biochemistry. Table 6 illustrates the genes associated with the top functions.
We then explored the top five functions in each category associated with T2DM and identified the DRGs of the dataset for each function, as shown in Table 7. The most affected function in the cellular component category is the degranulation of cells (Table 7; see Table S10 for details).
We also explored some specific disorders associated with organ injuries (Figure 4A and B).

Regulation of the Dysregulated Gene Expressions of the Data Set and Their Impact on Biological Functions and Diseases
Upstream Regulator Analysis of the Transcription Regulators and Kinases Factors
We then used IPA upstream regulator analysis to explain the observed changes in gene expression in terms of downstream targets. The aim was to understand the underlying regulation of the expression changes seen in the T2DM dataset. Transcription factors are proteins that control the rate of RNA transcription, regulating gene expression (up or down) according to cell state and organ activity to maintain cell homeostasis. The five most significant upstream transcription regulators, ranked by z-score, were signal transducer and activator of transcription 3 (STAT3; Figure 5), signal transducer and activator of transcription 1 (STAT1), hypoxia-inducible factor 1-alpha (HIF1A), CCAAT/enhancer-binding protein alpha (CEBPA), and CCAAT/enhancer-binding protein beta (CEBPB), as presented in Table 8 and displayed in Supplementary Figure S3. We also observed the top 5 activated kinases, ATM, JAK2, JAK1, MTOR, and AKT1, as upstream regulators that affect the expression of some genes of the dataset, as shown in Table 9 and Supplementary Figure S4.

Top Regulator Effects
Moreover, we looked for the master regulators among the proteins of the dataset, that is, proteins that regulate other proteins, using IPA upstream regulator analysis. This analysis also identifies potential mechanisms linked with phenotype changes, such as disease or functional disorders, and explains the biological role of an upstream regulator through its regulation of a gene or set of genes. Table 10 illustrates the top three regulator proteins that regulate other proteins of the dataset. Among the top downregulated proteins of the dataset, INS targets 10 molecules (Figure 6A) and SPP1 targets 9 DRGs (Figure S5a), while among the upregulated proteins, TNF targets 32, IL6 targets 29 (Figure 6B), LEP targets 19, AGT targets 14, and APOE targets 13 molecules of the dataset (Table S11 and Figure S5b).

Mechanistic Networks of Top Regulators
To understand the mechanistic pathway by which a single protein affects a downstream target protein of the dataset, we performed a mechanistic network analysis. For example, insulin is the primary hormone underlying the pathogenesis of T2DM and its comorbidities, and it targets TNF, AGT, APOA1, GCG, GHRL, ICAM1, IGKC, IL6, and LEP. It was of interest to investigate the interaction of INS with upregulated target proteins such as TNF, IL6, and LEP, which themselves regulate other proteins, in order to understand the mechanisms of their interactions and the crosstalk within the T2DM dataset. INS upregulates LEP expression directly or indirectly through intermediates such as D-glucose, POMC, and the PI3 complex, which in turn affect transcription factors such as FOXO1, STAT3, SIRT1, and EP300, and through effects on ligand-dependent nuclear receptors such as PPARG and NR3C1 (Figure S6). INS upregulates TNF gene expression directly or indirectly through intermediates such as D-glucose, SIRT1, EP300, FOS, and POMC. The protein-protein interactions involve many hubs, such as kinases, ligand-dependent nuclear receptors, transcription factor regulators, endogenous chemicals, enzymes, and growth factors. As displayed in Figure (

Network Analysis
We investigated the interactions among the DRGs, pathways, regulators, and other molecules in type 2 diabetics and identified 12 eligible networks associated with various biological processes, functions, and diseases, as displayed in Table S12. connected to the hubs of nuclear complexes NCOR-LXR-Oxysterol-RXR-9 cis linked to retinoid signaling activation, and FXR-ligand FXR-Retinoic linked to FXR-RXR activation and bile acid metabolism. The networks display proteins involved in blood hemostasis and coagulation, such as F9, KLKB1, SERPINC1, and HPX1, which crosstalk with each other and are connected to apolipoproteins via PON1. ERK1/2, the extracellular signal-regulated kinases, act as a hub that is regulated by RBR4, RETN, and HPX, as shown in Figure 9B. Network 3 displays the top functions and diseases Developmental Disorder, Humoral Immune Response, and Inflammatory Response, which comprise 13 DRGs. The network displays two canonical pathways. The first is acute phase response signaling, which includes AFM, TNF, CFB, CRB, SERPING1, and CIS, with the NFKB family as a hub. The other crucial canonical pathway (CP) is the complement pathway, which includes genes such as CF1, CRB, SERPING1, CIS1, CIR, and CFB, with complement component 1 as a hub. The network displays the extensive crosstalk of TNF, SERPING1, and complement. Moreover, the figure displays molecules involved in two crucial disorders associated with this network in T2DM, immunodeficiency and rheumatoid diseases (RD). In RD, many molecules are involved, such as complement component 1, FCN, TNF, CFB, C1Q, and C1Q, as shown in Figure 9C. Network 4 displays the associated top functions and diseases Cell-To-Cell Signaling and Interaction, Cellular Movement, and Inflammatory Response, which comprise 9 DRGs. The top 3 CPs are Hepatic Fibrosis, Atherosclerosis Signaling, and the GP6 Signaling Pathway.
The immune response of macrophages, as a cell-to-cell signaling disorder, involves 3 DRGs (IL6, LUM, and VTN); the cellular movement disorders include IL6 and VTN; and the connective tissue disorders include collagen type11, IL6, LUM, PLG, and PRG4. Moreover, severe injury such as cardiac hypertrophy is displayed, which indicates the involvement of several DRGs (IL6, LUM, LRG1, and PLG) together with other molecules such as alpha actinin, PDGF, Tgf beta, TLR2, and TLR4. Such toxic injuries of the liver, heart, and kidney are severe complications of long-term and uncontrolled T2DM, as shown in Figure 9D. Other networks are involved in different diseases and functions related to T2DM and its comorbidities, such as Cancer, Cardiovascular Disease, and Organismal Injury and Abnormalities (network 5).

Discussion
Identifying the specific biomarkers and pathways involved in the pathogenesis of T2DM and its associated comorbidities remains challenging. Diabetes is a chronic polygenic disorder resulting from several biological processes that interact in a dense network, rather than from an abnormality of a single effector gene product. Since biological functions are the result of molecular interactions, the functional annotation of differentially expressed genes should include the effect of many genes on different pathways and their interactions across the biological processes and networks that have a potential impact on T2DM. Understanding the biological pathways and their network information, through the integration of proteomic and metabolomic datasets, is useful in predicting risk and understanding the progress of the disease.
In the current study, IPA software was used to integrate the fold expression of DRGs for the development of molecular pathways and networks in T2DM. The data reveal significant pathways in T2DM subjects compared with control subjects and provide further insight into the pathogenesis of diabetic complications. This study identified differentially dysregulated genes as potential prognostic biomarkers involved in critical biological processes and protein pathways that are associated with T2DM comorbidities. The most important findings of this study are the identification of upstream regulators that affect gene expression, such as the transcription factors STAT3, STAT1, and HIF1A and cytoplasmic kinases such as the JAK kinases, and the characterization of the mechanistic actions through which they affect expression. Furthermore, the present study identified the most commonly expressed genes that could be used as potential prognostic biomarkers: Tumor Necrosis Factor (TNF), Interleukin 6 (IL6), Leptin (LEP), Angiotensinogen (AGT), Apolipoprotein E (APOE), Coagulation Factor II, Thrombin (F2), Secreted Phosphoprotein 1 (SPP1), Resistin (RETN), and Insulin (INS). The data showed that IL6 is the top regulator of the DRGs, followed by LEP. The study identified several networks that reveal the dysregulation of several functions, including cell components and molecular transport processes associated with inflammatory responses that modify the insulin pathway. The top comorbidities and complications associated with the DRGs are neurological and psychological disorders, organ injuries such as renal damage, connective tissue disorders, and acute inflammatory disorders.
In this study, we identified 94 genes that were differentially dysregulated in Arab subjects with T2DM compared to healthy, non-diabetic controls. To explore the mechanisms causing these changes in gene expression and to gain biological insight into them, we identified upstream regulators. The top upstream regulator identified in the dataset was IL6, followed by transcription factors and kinases. IL6 is the master regulator: it controls 63 genes of the dataset and exerts its effects on the observed gene expression via 23 regulatory mechanisms, with an activation z-score of 3.175. IL6 interacts directly with the PI3 (Peptidase Inhibitor 3) complex, P38MAPK (stress signaler p38 mitogen-activated protein kinase), and AKT (Serine/Threonine Kinase), and affects TNF gene expression; it also acts through other regulatory molecules such as STAT3 (Signal Transducer And Activator Of Transcription 3), NFKBIB (NF-kappa-B inhibitor beta), and FOXO1 (forkhead box O1) (Figures 6 and 7B). In total, the set of 23 regulators connects to the 63 dataset genes. IL6 is a growth factor with cytokine activity and protein binding. Interleukin 6 (IL-6) is a multifunctional cytokine that has been linked to the pathogenesis of T2DM. 17 An increased level of circulating IL-6 is a predictive biomarker of T2DM, especially in obese subjects, and could be involved in the development of inflammation and insulin resistance. 18 Furthermore, IL6 downregulates the expression of the insulin gene, which contributes to the pathogenesis of T2DM and its related comorbidities. IL6 participates in numerous biological processes: the acute inflammatory response, one of the top canonical pathways, which is associated with inflammatory processes such as activation of leukocytes and degranulation of cells; injury of different organs, such as renal damage; and connective tissue disorders, such as rheumatoid arthritis. 
It also affects many molecular functions, such as fat and protein metabolism. The complex signal transduction mechanism of IL-6/STAT3 may explain the widespread effects of IL6 as a cytokine. 19 Monitoring of IL6 in T2DM is of clinical significance, as it is involved in many related complications such as atherosclerosis (Figure 8), hypertension (Figure 8), renal damage (Figure 4), metabolic and inflammatory disorders, and organ and tissue damage (Table 5).  indicates its pleiotropic cellular effects such as expression, proliferation, apoptosis, growth, differentiation, and migration. In the current study, STAT3 regulates the expression of AGT, AHSG (Alpha 2-HS Glycoprotein), APOA4 (Apolipoprotein A4), ATP, CCL2 (C-C Motif Chemokine Ligand 2), CFB (Complement Factor B), FN1 (Fibronectin 1), HP (Haptoglobin), ICAM1 (Intercellular Adhesion Molecule 1), and IL6 (Table 8, Figure 5). The current findings on STAT3 and STAT1 as TFs indicate their significant role in the acute-phase response and inflammation, which are hallmarks of the biological processes and disorders associated with the comorbidities of T2DM, such as kidney damage, which is enriched in the dataset. Previous studies indicated a role for the STAT3 protein in insulin resistance, diabetes, and related disorders such as kidney damage, degranulation of cells, and apoptosis of endothelial cells in microangiopathy. 20 Such exploration of the transcription factors and their downstream target genes could explain the numerous complications associated with T2DM.
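The activation z-score quoted above for IL6 (3.175) summarizes how consistently the observed expression changes of a regulator's targets match the direction expected if that regulator were active. As a rough illustration only, the sketch below assumes the commonly described form z = (N+ - N-)/sqrt(N), where N+ and N- count targets whose observed change agrees or disagrees with the expected direction. The function name and the toy target list are hypothetical; they are not taken from the study's dataset or from the IPA software itself.

```python
import math


def activation_z_score(targets):
    """Return (n_consistent - n_inconsistent) / sqrt(n) for signed targets.

    `targets` is a list of (expected_sign, observed_sign) pairs, each +1 or -1.
    This is an assumed, simplified form of an activation z-score.
    """
    if not targets:
        raise ValueError("at least one target with a known direction is required")
    # +1 when the observed change agrees with the expected direction, else -1.
    votes = [1 if expected == observed else -1 for expected, observed in targets]
    return sum(votes) / math.sqrt(len(votes))


# Hypothetical toy input: 16 targets, 14 consistent with activation, 2 not.
toy_targets = [(+1, +1)] * 10 + [(-1, -1)] * 4 + [(+1, -1)] * 2
z = activation_z_score(toy_targets)
print(f"activation z-score: {z:.3f}")  # (14 - 2) / sqrt(16) = 3.000
```

Under this assumed definition, a clearly positive score points towards a predicted activated state of the regulator and a clearly negative score towards inhibition; a value near zero means the targets give no consistent direction.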
Further, among the upstream regulators that affect the gene expression of the dataset, we identified several kinases (Table 9, Figure S4) that can target several genes of the dataset. AKT (Protein kinase B, PKB) is one of 3 closely related serine/threonine-protein kinases (AKT1, AKT2, and AKT3), which regulate metabolism, proliferation, cell growth, and angiogenesis. 21 The dataset of the present study showed that AKT is regulated by insulin, EGF (Epidermal Growth Factor), PDGF (Platelet-Derived Growth Factor complex), TNF, and IGFI (Insulin-Like Growth Factor 1), and that it regulates several downstream TFs such as GS3 B (Glutamine synthetase root isozyme B), NOS3 (Nitric Oxide Synthase 3), FOXO1 (Forkhead Box O1), mTOR (Mechanistic Target Of Rapamycin Kinase), and NFKB (Nuclear Factor Kappa B). It affects glucose metabolism through translocation of the SLC2A4/GLUT4 (Solute Carrier Family 2 Member 4) glucose transporter to the cell surface following insulin signaling. 21,22 The present findings demonstrate that the expression of the dysregulated genes is controlled by many factors, such as transcription factors, cellular kinases, growth factors, and cytokines such as IL6, TNF, and LEP, all of which are involved in the biological processes, pathways, and diseases associated with T2DM, as discussed further in the following paragraphs.
The canonical pathway analysis performed in the IPA core analysis (Figure 1) indicated several critical signaling pathways involved in the pathogenesis of T2DM and its comorbidities, such as Retinoid X Receptors (RXRs) and Acute Phase Response Signaling, in the T2DM dataset (Table 3 and supplementary Tables S2-S7). RXRs are nuclear receptors that affect biological functions such as lipid metabolism, molecular transport, and small molecule biochemistry. Activation of RXR is involved in cholesterol efflux in macrophages through its effect on the following DRGs of the apolipoprotein family: APOE, APOC1, APOC2, and APOC4, while inhibiting APOA5. In hepatocytes, activation of APOA4 enhances cholesterol efflux. The apolipoproteins are transporters involved in lipid metabolism, which is altered in diabetes and predisposes to several disorders such as atherosclerosis and cardiovascular diseases. 23 Furthermore, TNF alpha is one of the DRGs involved in the transport of lipids and steroids and in the efflux of cholesterol. TNF uses different signaling pathways, such as NFKBIA and P38MAPK, to exert its effect on lipid transport. 24 These data are supported by the related functions observed in the present study, such as fatty acid metabolism and lipolysis (Figure S2), and are interconnected with hypertriglyceridemia, a top disease related to metabolic disorders that is considered a leading risk factor for multiple disorders such as cardiovascular disorders and progressive neurological disorders (Table 5). The data of the present study revealed another nuclear receptor, the Farnesoid X Receptor (FXR). FXR is a member of the nuclear receptor family and has emerged as a critical player in the control of numerous metabolic pathways, for example as a sensor of bile acids and a regulator of their metabolism (Figure 2). The bile acid receptor plays a vital role in fat and glucose metabolism. 25 The activation of the membrane G-protein receptor 5 (TGR5) by bile acids is associated with metabolic actions such as ameliorating insulin resistance via GLP-1 secretion. 26
Furthermore, the acute phase response pathway is one of the significant pathways observed in the current dataset. The pathway and its molecules are involved in inflammation and include proteins upregulated via NFKB (Nuclear Factor Kappa B) as an upstream regulator, which is connected with 2761 nodes and could explain the wide-ranging effects of the acute phase response signaling pathway. One of the disorders associated with the acute inflammatory response is kidney damage. (Figure 4A and B) 28 all are involved in the damage of the kidney.
Numerous metals are essential for the biological functions of several enzymes, proteins, and transcriptional regulators. They are also vital in many biochemical reactions that support cell function in different tissues. For instance, Zn, Mg, and Mn are cofactors of several enzymes that participate in various biological pathways. Zn is implicated in the biosynthesis and secretion of insulin from the beta-cells of the pancreas. Likewise, Cr augments the activity of insulin receptors on muscle cells, which increases insulin-stimulated glucose uptake. 32 Several studies have described a pathogenic role for some essential metals, which might disturb pancreatic function and lead to the development of diabetes. 32 The dataset of the current study showed that many DRGs are involved in a predicted increase in the amount of metal ions in T2DM, such as CCL2 (C-C Motif Chemokine Ligand 2), F2 (Thrombin), FN1 (Fibronectin 1), GCG (Glucagon), GHRL (Ghrelin And Obestatin Prepropeptide), APOE, APOC3, AGT, TTR (Transthyretin), TNF, PYY (Peptide YY), PTH (Parathyroid Hormone), PPBP (Pro-Platelet Basic Protein), PLG (Plasminogen), ORM1 (Orosomucoid 1), and ICAM1 (Intercellular adhesion molecule 1) (Figure 3). For example, TNF regulates the transport of divalent metals such as zinc in cells via the control of ZIP importers and metallothionein gene expression. 33
Complications observed with T2DM include disorders affecting the heart, blood vessels, nerves, eyes, liver, and kidneys. 34 Hyperglycemia and insulin resistance have been reported as crucial players in the development of microvascular and macrovascular complications, including atherosclerosis. 35 Overproduction of reactive oxygen species (ROS), together with endothelial dysfunction and inflammation, precipitates the development of diabetic vascular disease. 36,37 Close analysis of the dataset (Figure S7b) shows that the metabolite KNG1 (Kininogen 1) directly and positively regulates ROS generation, 38 while FN1 (Fibronectin 1) regulates endothelial cell proliferation, 39 F2 (Coagulation factor II) regulates endothelial cell functions such as apoptosis, 40 and PLG (Plasminogen) negatively regulates inflammation. 41 Over-expression of KNG1, FN1, F2, and PLG could therefore serve as prognostic biomarkers for monitoring the onset of early diabetic vascular complications. Moreover, as a therapeutic approach to counteract vascular compromise, oxidative stress in T2DM patients could be resolved by regulating the expression of APOA1, which inhibits oxidative stress, 42 and RETN (resistin), which regulates the inflammatory response. 43
Amyloidosis is a group of diseases in which misfolded proteins form in different tissues and cause progressive organ damage (Table 4). The clinical presentation depends on the location; the liver, kidney, and heart are commonly affected (Table S9), causing liver cirrhosis, nephrotic syndrome, and heart failure, respectively. 44
Several disorders documented in the present study involve dysfunction of the endothelium, including apoptosis and oxidative stress (Figure S7b). Endothelial dysfunction is considered a crucial component in the pathogenesis of vascular disease and eventually leads to the development of diabetes-related complications such as atherosclerosis, diabetic retinopathy, and nephropathy. 45 Damage to the genitourinary system includes several disorders such as urinary tract infection, abscess, chronic kidney disease, nephrotic syndrome, and urinary bladder and urethral disorders. 46 In disorders related to degranulation of cells (Figure S1), the chemotactic activity of neutrophils from diabetic patients is impaired, with a reduction in phagocytosis and bactericidal activity, which predisposes to infections. 47 Further, we demonstrated increased activation of leukocytes (Table 5) with increased release of inflammatory proteins such as IL6 and TNF, which could play an important role in inflammatory disorders associated with diabetes, such as diabetic nephropathy (Figure 4A and B). 48
The coagulation pathway (Figure S7a) plays a critical role in the pathogenesis of cardiovascular disease in patients with neuropathy. 49 Many diabetes patients die of cardiovascular complications. Figure S7a shows many metabolites that affect coagulation in some fashion: INS regulates vascularization; APOH, F2, and ICAM regulate platelet activation and adhesion; AGT regulates blood vessel contraction; TNF and CCL2 regulate chemokines in the circulation; F9 regulates the onset of hemophilia B; and PON1 inhibits the onset of cardiovascular disease. Therapeutic goals should aim at monitoring and decreasing the over-expression of the metabolites that contribute to coagulation diseases and at enhancing the counteracting effect of PON1 (Paraoxonase 1) to prevent cardiovascular morbidity. 50 Other cellular processes affected in T2DM patients include wound healing, which is regulated by the metabolites TNF and LEP; regulating their over-expression could resolve wound healing complications. 51
Insulin action pathway networks identified dysregulation of cellular processes such as glucose and lipid metabolism, which play a vital role in the pathogenesis of T2DM (Figure S7 b-d). Monitoring the coupled expression of dysregulated proteins such as IL6, INS, LEP, IAPP (Islet Amyloid Polypeptide), AGT, TNF, CCL2, GHRL, KNG1 (Kininogen 1), GCG, RETN, and C3 (Complement C3) could be used as a predictive biomarker of T2DM comorbidities. For instance, AGT (angiotensinogen) plays an essential role in counteracting both the cardiovascular and non-cardiovascular actions of AngII. 52 Recent data have revealed that chronic administration of Ang 11 improves the action of insulin on glucose and lipid metabolism in obese mice. 53 Potential mechanisms of this beneficial effect include activation of insulin signaling, inhibition of the adverse actions of AngII, and augmented transport of insulin to the target tissues. 54,55
We observed an intersection between canonical signaling in T2DM and several DRGs that are implicated in neurological diseases, metabolic disorders, connective tissue disorders, immunodeficiency, and renal injuries associated with cellular functions and canonical pathways (Figure 9a-d). These interactions are mediated through several factors, such as NFκB, Akt, ERK (extracellular signal-regulated kinases), TNF, and IL6, which have been identified for their contributions to various diseases. Such networks indicate the complexity of the disorders associated with T2DM in terms of the target genes and their proteins, which act on different biological, cellular, and molecular functions that interact to cause a particular disorder. This complexity is evident in the current study: in some examples, the upstream regulators affect many molecules, such as growth factors like leptin; cytokines like TNF and IL6; transcription factors such as HIFA, STAT, CEBPA, and NFκB; and several kinases such as ERK, mTOR, AKT, and others.
Gathering information on the DRGs of the dataset and their roles in the various biological processes and associated comorbidities of T2DM could help in understanding the pathogenesis of various disorders, enhance the utility of these genes as biomarkers for prediction, prognosis, complications, and response to treatment, and support the design of new drugs for disease management.
Using the analysis of the integrated data, we were able to answer some questions of interest. For example, TNF is one of the upstream regulators involved in rheumatoid arthritis, and adalimumab, an antibody that binds TNF and is predicted to inhibit it as a target, could be used for the treatment of active rheumatoid arthritis. 56 Another question is which disease states could develop following activation of the acute phase response signaling pathway in T2DM; the answer could include, for example, kidney damage and immunologic and neurological disorders (Figures 4, 9A and C). STAT3, as an upstream regulator, regulates 15 downstream target genes of the dataset and could be involved in inflammatory disorders such as those of connective tissues, 57 vascular disorders such as atherosclerosis and angiogenesis, 58 neurological disorders such as Alzheimer's disease, 59 and others, based on the downstream targets among the DRGs (Table 5, Figure 5). We analyzed the pathways, functions, diseases, and regulators to understand which gene or genes of the dataset could be used to monitor the prognosis, diagnosis, and treatment efficacy of a particular disorder: for example, IL6 and TNF for the diagnosis of hypertension, and CCL2, IL6, LEP, and HPX for the efficacy of hypertension management, as well as for atherosclerosis and hemophilia (Figure 8 and Figure S7a). For example, ANG could be used to monitor the efficacy of Aliskiren and Irbesartan in the treatment of T2DM associated with renal disorders. 60,61
In summary, the incorporation of metabolomics and proteomics data through integrative pathway analysis using different tools would help in understanding the role of various biomarkers in the identification of the biological processes, pathways, and pathophysiology that are associated with the comorbidities of T2DM. The most commonly expressed genes identified in the study are TNF, IL6, LEP, AGT, APOE, F2, SPP1, RETN, and INS, which could be used as potential prognostic biomarkers. The data showed that IL6 is the top regulator of the DRGs, followed by LEP. LXR/RXR and acute phase response signaling are the dominant pathways involved in renal damage, insulin resistance, dyslipidemia, and cardiovascular disorders. We identified the role of upstream regulators such as STAT3, a transcription factor that is involved in connective tissue disorders and atherosclerosis because it targets many proteins involved in such disorders. ANG could be used as an efficacy biomarker in renal disorders. TNF is involved in the regulation of metals such as zinc, which affects beta-cell function and insulin secretion. Such information could therefore help to identify patients at higher risk of a specific complication and to anticipate their response to a particular class of anti-diabetic drugs. Prospective clinical studies should be performed to validate the results of the current work and the utility of these biomarkers.
A limitation of this study is the difference in mean age between the control and diabetic groups: the patients were older than the controls, which could affect the results. Therefore, we performed an AUC analysis of some target biomarkers, as shown in supplementary Table 1 and as previously published. 5,6 The data showed that these markers were differentially expressed in the diabetic cohort, regardless of the age of the individual.
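The AUC analysis mentioned above asks how well a single marker separates diabetic from control subjects. A minimal sketch follows, assuming the AUC is computed as the probability that a randomly chosen patient has a higher marker level than a randomly chosen control, with ties counted as one half. The function name and the toy values are hypothetical and are not data from this study.

```python
def auc(case_values, control_values):
    """Rank-based AUC: P(case > control) + 0.5 * P(case == control)."""
    if not case_values or not control_values:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    # Compare every case with every control and count favourable pairs.
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical marker levels (arbitrary units), for illustration only.
toy_cases = [3.0, 4.0, 5.0]
toy_controls = [1.0, 2.0, 3.0]
print(round(auc(toy_cases, toy_controls), 3))  # 8.5 / 9 -> 0.944
```

An AUC near 0.5 indicates that the marker does not separate the two groups, whereas a value near 1.0 indicates almost complete separation.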