Prediction of selective estrogen receptor beta agonist using open data and machine learning approach
Authors Niu A, Xie L, Wang H, Zhu B, Wang S
Received 15 April 2016
Accepted for publication 4 July 2016
Published 18 July 2016 Volume 2016:10 Pages 2323—2331
Checked for plagiarism Yes
Review by Single-blind
Peer reviewer comments 2
Editor who approved publication: Prof. Dr. Wei Duan
Ai-qin Niu,1 Liang-jun Xie,2 Hui Wang,1 Bing Zhu,1 Sheng-qi Wang3
1Department of Gynecology, the First People’s Hospital of Shangqiu, Shangqiu, Henan, People’s Republic of China; 2Department of Image Diagnoses, the Third Hospital of Jinan, Jinan, Shandong, People’s Republic of China; 3Department of Mammary Disease, Guangdong Provincial Hospital of Chinese Medicine, the Second Clinical College of Guangzhou University of Chinese Medicine, Guangzhou, People’s Republic of China
Background: Estrogen receptors (ERs) are nuclear transcription factors that are involved in the regulation of many complex physiological processes in humans. ERs have been validated as important drug targets for the treatment of various diseases, including breast cancer, ovarian cancer, osteoporosis, and cardiovascular disease. ERs have two subtypes, ER-α and ER-β. Emerging data suggest that the development of subtype-selective ligands that specifically target ER-β could be a more optimal approach to elicit beneficial estrogen-like activities and reduce side effects.
Methods: Herein, we focused on ER-β and developed its in silico quantitative structure-activity relationship models using machine learning (ML) methods.
Results: The chemical structures and ER-β bioactivity data were extracted from public chemogenomics databases. Four types of popular fingerprint generation methods including MACCS fingerprint, PubChem fingerprint, 2D atom pairs, and Chemistry Development Kit extended fingerprint were used as descriptors. Four ML methods including Naïve Bayesian classifier, k-nearest neighbor, random forest, and support vector machine were used to train the models. The range of classification accuracies was 77.10% to 88.34%, and the range of area under the ROC (receiver operating characteristic) curve values was 0.8151 to 0.9475, evaluated by the 5-fold cross-validation. Comparison analysis suggests that both the random forest and the support vector machine are superior for the classification of selective ER-β agonists. Chemistry Development Kit extended fingerprints and MACCS fingerprint performed better in structural representation between active and inactive agonists.
Conclusion: These results demonstrate that combining the fingerprint and ML approaches leads to robust ER-β agonist prediction models, which are potentially applicable to the identification of selective ER-β agonists.
Keywords: estrogen receptor subtype β, selective estrogen receptor modulators, quantitative structure-activity relationship models, machine learning approach
This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.Download Article [PDF] View Full Text [HTML][Machine readable]