The Construction of Primary Screening Model and Discriminant Model for Chronic Obstructive Pulmonary Disease in Northeast China
Authors Li X, Guo Y, Li W, Wang W, Zhang F, Li S
Received 17 February 2020
Accepted for publication 12 June 2020
Published 31 July 2020 Volume 2020:15 Pages 1849–1861
Checked for plagiarism Yes
Review by Single-blind
Peer reviewer comments 2
Editor who approved publication: Dr Richard Russell
Xiaomeng Li,1 Yuhao Guo,2 Wenyang Li,1 Wei Wang,1 Fang Zhang,1 Shanqun Li3
1Department of Respiratory and Critical Care Medicine, The First Hospital of China Medical University, Shenyang 110000, People’s Republic of China; 2Department of Mathematics and Statistics, Xi’an JiaoTong University, Xi’an 710049, People’s Republic of China; 3Department of Pulmonary and Critical Care Medicine, Zhongshan Hospital, Fudan University, Shanghai 200020, People’s Republic of China
Correspondence: Wei Wang Email firstname.lastname@example.org
Objective: Diagnosing chronic obstructive pulmonary disease (COPD) is challenging, especially in primary care institutions that lack spirometers. To reduce the rate of missed COPD diagnoses in Northeast China, where the prevalence of COPD is comparatively high, this study aimed to establish efficient primary screening and discriminant models for COPD in this region.
Patients and Methods: Subjects from Northeast China were enrolled at The First Hospital of China Medical University from December 2017 to April 2019. All participants underwent pulmonary function tests and completed a questionnaire. With disease status (COPD or no COPD) as the outcome for the screening models and disease severity as the outcome for the discriminant models, multivariate linear regression, logistic regression, linear discriminant analysis, K-nearest neighbor, decision tree, and support vector machine models were constructed using R and Python. Their performance was compared to select the optimal primary screening and discriminant models.
Results: A total of 232 COPD patients (124 GOLD I–II and 108 GOLD III–IV) and 218 normal controls were enrolled. Eight primary screening models were established. The optimal model was Y = −1.2562 − 0.3891X4 (education level) + 1.7996X5 (dyspnea) + 0.5102X6 (cooking fuel grade) + 1.498X7 (smoking index) + 0.8077X9 (family history) − 0.5552X11 (BMI) + 0.538X13 (cough with sputum) + 2.0328X14 (wheezing) + 1.3378X16 (farmer) + 0.8187X17 (mother’s smoking exposure history during pregnancy) − 0.389X18 (kitchen ventilation) + 0.6888X19 (childhood heating). Six discriminant models were established. The optimal model was a decision tree using the following variables: dyspnea (X5), cooking fuel grade (X6), second-hand smoking index (X8), BMI (X11), cough (X12), cough with sputum (X13), wheezing (X14), farmer (X16), kitchen ventilation (X18), and childhood heating (X19). Code was written to implement the discriminant model in software.
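As an illustration only, the reported screening equation can be evaluated with a few lines of Python. The coefficients below are copied from the Results. Everything else is an assumption: the abstract does not say how each variable is coded or which cutoff was used, the dictionary keys are hypothetical names, and Y is treated as a log-odds score only because the Conclusion names stepwise logistic regression as the optimal screening model.

```python
import math

# Intercept and coefficients as reported in the Results.
# Key names are hypothetical labels; the numeric coding of each
# variable (binary, graded, standardized) is NOT given in the abstract.
INTERCEPT = -1.2562
COEFFICIENTS = {
    "X4_education_level": -0.3891,
    "X5_dyspnea": 1.7996,
    "X6_cooking_fuel_grade": 0.5102,
    "X7_smoking_index": 1.498,
    "X9_family_history": 0.8077,
    "X11_bmi": -0.5552,
    "X13_cough_with_sputum": 0.538,
    "X14_wheezing": 2.0328,
    "X16_farmer": 1.3378,
    "X17_mother_smoking_in_pregnancy": 0.8187,
    "X18_kitchen_ventilation": -0.389,
    "X19_childhood_heating": 0.6888,
}


def linear_score(features):
    """Return Y = intercept + sum(coefficient * coded value)."""
    missing = set(COEFFICIENTS) - set(features)
    if missing:
        raise ValueError(f"missing variables: {sorted(missing)}")
    return INTERCEPT + sum(
        coef * features[name] for name, coef in COEFFICIENTS.items()
    )


def logistic_probability(score):
    """Map a log-odds score to a probability (assumes Y is a logit)."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical subject: all variables coded 0 except dyspnea and wheezing.
subject = dict.fromkeys(COEFFICIENTS, 0)
subject["X5_dyspnea"] = 1
subject["X14_wheezing"] = 1

score = linear_score(subject)
probability = logistic_probability(score)
```

A subject would then be flagged for spirometry when the probability exceeds a chosen threshold. That threshold has to come from the full paper or from local validation, since the abstract does not report one.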
Conclusion: Many factors were related to COPD in Northeast China. Stepwise logistic regression and decision tree were the optimal screening and discriminant models for COPD in this region.
Keywords: chronic obstructive pulmonary disease, screening, discriminant, severity, model
This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.