OncoTargets and Therapy, Volume 9

Identification of feature genes for smoking-related lung adenocarcinoma based on gene expression profile data

Authors Liu Y, Ni R, Zhang H, Miao L, Wang J, Jia W, Wang Y

Received 3 June 2016

Accepted for publication 3 September 2016

Published 7 December 2016 Volume 2016:9 Pages 7397–7407


Checked for plagiarism Yes

Review by Single-blind

Peer reviewers approved by Dr Akshita Wason

Peer reviewer comments 2

Editor who approved publication: Dr Samir Farghaly

Ying Liu, Ran Ni, Hui Zhang, Lijun Miao, Jing Wang, Wenqing Jia, Yuanyuan Wang

Respiration Department, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan Province, People’s Republic of China

Abstract: This study aimed to identify genes and pathways associated with smoking-related lung adenocarcinoma. Three lung adenocarcinoma datasets (GSE43458, GSE10072, and GSE50081), each including both smokers and nonsmokers, were downloaded to screen for feature genes differentially expressed between smokers and nonsmokers. Based on the identified feature genes, we constructed a protein–protein interaction (PPI) network and refined the feature gene set using the closeness centrality (CC) algorithm. A support vector machine (SVM) classification model was then constructed from the feature genes with higher CC values. Finally, pathway enrichment analysis of the feature genes was performed. A total of 213 down-regulated and 83 up-regulated differentially expressed genes were identified. In the constructed PPI network, the ten nodes with the highest degrees and CC values included ANK3, EPHA4, and FGFR2. The SVM classifier was constructed with 27 feature genes and could accurately distinguish smokers from nonsmokers. Pathway enrichment analysis showed that the 27 feature genes were significantly enriched in five pathways, including proteoglycans in cancer (EGFR, SDC4, SDC2, etc.) and the Ras signaling pathway (FGFR2, PLA2G1B, EGFR, etc.). The 27 feature genes used to construct the SVM classifier, such as EPHA4, FGFR2, and EGFR, together with the cancer-related Ras signaling and proteoglycans in cancer pathways, may play key roles in the development and progression of smoking-related lung adenocarcinoma.

Keywords: lung adenocarcinoma, feature genes, support vector machine (SVM) classification, pathway
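
As a rough illustration of the closeness centrality (CC) step named in the abstract, the sketch below ranks the nodes of a small undirected network by CC, computed as the number of reachable nodes divided by the sum of shortest-path distances to them. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say which software, CC variant, or cutoff was used, and the node names (`gene_a` to `gene_e`) and edges are hypothetical placeholders rather than interactions from the study's PPI network.

```python
from collections import deque


def closeness_centrality(adjacency):
    """Return {node: closeness} for an undirected, unweighted graph.

    Closeness is computed as (number of reachable nodes) divided by
    (sum of shortest-path distances to those nodes). Isolated nodes get 0.0.
    """
    scores = {}
    for source in adjacency:
        # Breadth-first search yields shortest path lengths measured in hops.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        total = sum(distance.values())
        reachable = len(distance) - 1  # exclude the source itself
        scores[source] = reachable / total if total > 0 else 0.0
    return scores


# Hypothetical toy network: node names and edges are illustrative only
# and are NOT taken from the study's PPI network.
toy_network = {
    "gene_a": {"gene_b", "gene_c", "gene_d"},
    "gene_b": {"gene_a", "gene_c"},
    "gene_c": {"gene_a", "gene_b"},
    "gene_d": {"gene_a", "gene_e"},
    "gene_e": {"gene_d"},
}

scores = closeness_centrality(toy_network)

# Rank nodes from most to least central and keep the top two as
# candidate features (the cutoff of two is an arbitrary assumption).
ranked = sorted(scores, key=scores.get, reverse=True)
top_genes = ranked[:2]
print(top_genes)
```

In a workflow like the one summarized in the abstract, nodes with higher CC values would then serve as candidate features for a downstream classifier such as an SVM. That step is omitted here because the abstract gives no details of the model's configuration.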

Creative Commons License This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
