International Journal of General Medicine, Volume 19

Short-Term Prognosis in Acute Asthma Exacerbations: A Comparative Evaluation of Machine Learning Models Using Spirometry and Tracheal Respiratory Sound Analysis

Authors Güçsav MO, Güllü MK, Özgür S, Topaloğlu İ, Unat ÖS, Serçe Unat D, Erbaycu AE

Received 9 February 2026

Accepted for publication 11 May 2026

Published 19 May 2026 Volume 2026:19 602311

DOI https://doi.org/10.2147/IJGM.S602311

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Professor Reynold Panettieri Jr



Mutlu Onur Güçsav,1,2 Mehmet Kemal Güllü,3 Su Özgür,2 İhsan Topaloğlu,4 Ömer Selim Unat,2,5 Damla Serçe Unat,1 Ahmet Emin Erbaycu1

1Department of Pulmonology, Izmir Bakırcay University, Cigli Training and Research Hospital, Izmir, Turkiye; 2Translational Pulmonary Research Center (EgeSAM), Ege University, Izmir, Turkiye; 3Department of Electrical and Electronics Engineering, Izmir Bakırcay University, Izmir, Turkiye; 4Department of Pulmonology, Kafkas University, Kars, Turkiye; 5Department of Pulmonology, University of Health Science, Dr. Suat Seren Chest Diseases and Surgery Training Research Hospital, Izmir, Turkiye

Correspondence: Mutlu Onur Güçsav, Izmir Bakircay University, Cigli Training and Research Hospital, Yeni Road, Ata Sanayi Street, No: 18, Cıglı, Izmir, 35620, Turkiye, Tel +90 5334744642, Fax +90 232 398 37 00, Email [email protected]

Purpose: Acute asthma exacerbations are a common cause of emergency department visits and require rapid risk stratification to guide disposition decisions. Although spirometry is the primary objective method used to assess clinical severity, it may not reliably predict clinical outcomes. In this context, machine learning–based approaches have gained increasing attention for improving prognostic assessment in emergency settings. This study evaluated the prognostic relevance of spirometric parameters, tracheal respiratory sound–derived acoustic features, and machine learning–based classification models for short-term outcomes in adults presenting to the emergency department with acute asthma exacerbations.
Patients and Methods: In this prospective cohort study, adults with acute asthma exacerbations underwent spirometry and tracheal respiratory sound recording before and after emergency department treatment. Short-term prognosis was defined as hospitalization during the index visit or emergency department re-presentation within seven days. Changes in spirometric and acoustic features were analyzed, and machine learning models incorporating demographic, spirometric, acoustic, and combined feature sets were developed and compared.
Results: Baseline airflow limitation at presentation was the strongest determinant of short-term prognosis. Patients with poor outcomes had significantly lower pre-treatment and post-treatment FEV1 and PEF values. Acoustic feature changes showed minimal correlation with ΔFEV1 and ΔPEF; however, selected frequency-domain features differed between prognosis groups, indicating complementary physiological information. Machine learning models incorporating spirometric variables achieved the highest performance, with accuracy up to 84.4% and balanced classification metrics. Sound-based models demonstrated moderate but clinically meaningful performance (accuracy approximately 67–77%), while multimodal models did not outperform spirometry-only models.
Conclusion: In acute asthma exacerbations, baseline spirometric impairment remains the most reliable predictor of short-term outcomes. Tracheal respiratory sound analysis combined with machine learning may provide complementary prognostic information, particularly when spirometry is unavailable or unreliable, supporting its role as a clinical decision-support tool rather than a replacement for conventional assessment.

Keywords: asthma, spirometry, tracheal respiratory sounds, prognosis, deep learning

Introduction

Asthma is a common chronic respiratory disease that affects over 300 million people worldwide. Its prevalence varies geographically, ranging from 7% in countries like France and Germany to 11% in the United States, and up to 15–18% in the United Kingdom.1,2 Despite available treatments, a large proportion of asthma patients remain uncontrolled and continue to experience exacerbations. Studies have reported the prevalence of uncontrolled asthma to range between 15% and 75%, depending on the population studied and the criteria applied.3 Exacerbations are a major contributor to asthma-related emergency department (ED) visits, hospitalizations, and mortality, thus imposing a substantial healthcare and socioeconomic burden.4,5

Emergency department presentations due to acute asthma exacerbations require rapid and effective management. Optimal care involves early recognition, timely intervention, close monitoring, appropriate referrals, and structured transitional care after discharge.6 In this context, accurate assessment of exacerbation severity and prognosis is crucial for guiding admission decisions and treatment strategies. Previous studies have emphasized that decisions regarding hospital admission or safe discharge in the emergency department are often complex and are based on a combination of objective airflow measurements, clinical findings, and historical risk factors rather than on a single physiological parameter alone.7

Current clinical guidelines, such as those from the Global Initiative for Asthma (GINA), recommend a comprehensive approach to assessing exacerbation severity, incorporating clinical symptoms, vital signs, and objective indicators of airflow limitation—most commonly peak expiratory flow (PEF) and forced expiratory volume in one second (FEV1).8 While these measures provide quantifiable information on airway obstruction, their correlations with clinical symptoms and oxygen desaturation have been shown to be weak.9 Moreover, several studies have demonstrated that reliance on isolated spirometric indices during acute exacerbations may be insufficient for predicting short-term outcomes or the need for hospitalization.9,10 In particular, prior emergency department studies have shown that changes in PEF do not consistently parallel changes in FEV1, and that patient classification based on these two parameters may differ substantially, thereby limiting the reliability of spirometric measures alone for guiding acute management decisions.7,11

Physical examination remains a fundamental component of the initial evaluation. Airway inflammation during an exacerbation typically causes mucosal swelling, increased secretions, and airflow limitation, resulting in characteristic respiratory sounds such as wheezing and rhonchi. Traditionally assessed by auscultation, these sounds are interpreted subjectively, which introduces interobserver variability and limits standardization.12 This subjectivity has prompted increasing interest in computerized and quantitative approaches to respiratory sound analysis as a means of providing more objective and reproducible assessment of airway pathology.13

Recent advances in digital medicine have enabled the objective recording and analysis of respiratory sounds. These technologies are increasingly used in the diagnosis and monitoring of respiratory disorders, especially in sleep-related and obstructive airway diseases.14,15 Normal breath sounds—classified as tracheal, bronchial, bronchovesicular, or vesicular—are produced by turbulent airflow through the tracheobronchial tree, creating acoustic patterns that reflect airway mechanics.16 In contrast, adventitious sounds like wheezes and rhonchi reflect pathological changes. Wheezes are continuous musical sounds typically above 400 Hz, while rhonchi are lower-pitched sounds below 200 Hz originating from larger airways.17 Importantly, alterations in respiratory sound characteristics have been shown to reflect underlying airway inflammation and obstruction even in the absence of audible wheezing on conventional auscultation, suggesting that acoustic signals may capture clinically relevant information not readily detected by routine physical examination.18

Machine learning (ML) techniques have shown promise in enhancing respiratory sound analysis. Studies demonstrate that ML-based methods can support diagnosis, assess severity, and monitor disease progression in conditions such as asthma and chronic obstructive pulmonary disease (COPD).13,19,20 Algorithms including artificial neural networks (ANNs), hidden Markov models (HMMs), k-nearest neighbors (k-NN), Gaussian mixture models (GMMs), self-organizing maps (SOMs), and ensemble models like Random Forest, XGBoost, and LightGBM have been increasingly applied to classify respiratory sounds and predict clinical outcomes.13 Beyond diagnostic applications, analysis of continuous adventitious respiratory sounds has been reported to provide complementary information to spirometry, particularly in the assessment of airway reversibility and short-term response to bronchodilator therapy, with acoustic changes observed in some patients despite minimal spirometric improvement.12,21

However, despite these technological advances, there is still no standardized framework for incorporating respiratory sound analysis into routine clinical decision-making for acute asthma care. The lack of objective findings from physical examinations and the limitations of spirometric tools challenge the ability to stratify risk and make timely disposition decisions in the ED.

In this study, we aim to address this gap by developing and evaluating machine learning-based predictive models using acoustic (spectral) and channel-level features derived from tracheal breath sounds. Our goal is to predict short-term prognosis and support hospitalization decisions in patients presenting with asthma exacerbations. We conducted a comparative analysis of several widely used classifiers (support vector machines, k-nearest neighbors, Naive Bayes, artificial neural networks, and ensemble methods) to assess their performance and interpretability. This AI-based, non-invasive approach has the potential to complement existing clinical tools, improve risk stratification, and assist clinicians in making timely and accurate decisions in the acute care setting.

Materials and Methods

Study Design

This prospective cohort study was conducted in the emergency and chest diseases departments of two university hospitals in Turkey between January and December 2024. Ethical approval was obtained from the institutional ethics committee of a tertiary care hospital (Decision No: 1285, dated 08.11.2023). All experimental procedures used on humans in this study adhered to the principles of the Declaration of Helsinki and were approved by the institutional and national ethics committee. Adult patients who presented with acute asthma exacerbations during the study period were evaluated (n = 86). A convenience sampling method was employed. Patients who provided informed consent and successfully completed all assessments were included in the final analysis. Exclusion criteria were as follows: coexisting asthma and chronic obstructive pulmonary disease (COPD) (n = 2); known upper airway pathologies (eg, vocal cord paralysis, nasopharyngeal tumors); neurological disorders affecting spirometry performance (eg, dementia, Parkinson’s disease, neuromuscular disorders); and incomplete or missing data (n = 19).

Dataset

After applying exclusion criteria, 21 participants were removed, resulting in a final dataset of 65 patients. The dataset was reviewed for completeness, and no additional missing values were identified. Variables included demographic characteristics (age and sex), respiratory function parameters, and acoustic features extracted from tracheal breath sounds. The primary outcome was short-term clinical prognosis, defined as a binary classification task: 0 indicating good prognosis, and 1 indicating poor prognosis.
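The binary outcome described above can be written down as a small labelling rule. The sketch below is illustrative only: the `Visit` record and its field names are hypothetical and are not taken from the study database; the 7-day window follows the definition of short-term prognosis given in the abstract.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Visit:
    """Hypothetical record of one index emergency department visit."""
    hospitalized: bool                     # admitted during the index visit
    days_to_revisit: Optional[int] = None  # None if no ED re-presentation was recorded


def prognosis_label(visit: Visit, window_days: int = 7) -> int:
    """Return 1 for poor prognosis and 0 for good prognosis."""
    revisited = (
        visit.days_to_revisit is not None
        and visit.days_to_revisit <= window_days
    )
    return int(visit.hospitalized or revisited)
```

For example, a discharged patient who returns on day 3 is labelled 1, whereas one who returns on day 10 is labelled 0.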

Devices for Data Acquisition

Respiratory function was measured using the MIR Smart One® (portable turbine spirometer) (Italy) (49 × 109 × 21 mm, 60.7 g), approved by the Food and Drug Administration (FDA) and compliant with the 2019 American Thoracic Society (ATS) and the European Respiratory Society (ERS) Spirometry standards.22 It is equipped with Bluetooth connectivity, allowing wireless pairing with smartphones or earphones. Calibration was verified using a 1-liter syringe before each session.

Tracheal breath sounds were recorded using the Stemoscope PRO® (digital stethoscope) (USA) (42 × 14 mm, 56 g), with a frequency range of 20–2000 Hz and active noise cancellation. The device was connected via Bluetooth to a mobile application. Recording quality was verified through pilot testing on five individuals.

Data Acquisition Procedure

All respiratory function and sound recordings were performed in a noise-isolated room in the emergency department, both before any medical treatment and 1 hour after the administration of standard therapy. Standard therapy comprised ipratropium bromide + salbutamol sulfate (0.5 + 2.5 mg/2.5 mL nebulized), budesonide (0.5 mg/mL nebulized), and intravenous methylprednisolone (40 mg). Respiratory function and sound recording were synchronized with a single command. The following steps were followed during the measurement of respiratory function:

Instructions

The procedure was explained, and the spirometry device was introduced to the patient.

Posture and Positioning

Patients were seated in an upright position (90°), with feet flat on the floor and back supported.

Mouthpiece Placement

The disposable mouthpiece was placed securely between the lips to prevent air leakage.

Verification

A healthcare professional confirmed the correct positioning of both the device and the mouthpiece.

Maneuver Execution

Patients were instructed to take a deep inhalation followed by a rapid and forceful exhalation.

Patients who were unable to perform the maneuver correctly on the first attempt were allowed up to two additional trials (maximum three attempts). During the maneuver, tracheal respiratory sounds were simultaneously recorded using the digital stethoscope placed on the upper half of the neck, corresponding to the palpable portion of the trachea. To ensure adequate temporal resolution for acoustic analysis, respiratory sound data were acquired at a sampling frequency of 4 kHz.

To ensure data quality, each recording was limited to a duration of 5–10 seconds, capturing one complete respiratory cycle. Both the spirometry and sound recordings were started and stopped simultaneously via a single command to ensure temporal synchronization and were immediately transmitted via Bluetooth to a secure mobile platform. The quality and completeness of the data were reviewed in real-time by a clinician before proceeding to the next participant. The overall workflow of data acquisition, preprocessing, feature extraction, and machine learning–based classification is illustrated in Figure 1.

Diagram of data acquisition and classification for prognosis prediction using demographic features, spirometry and tracheal sounds.

Figure 1 Workflow of data acquisition, feature extraction, and machine learning–based classification for short-term prognosis prediction.

Feature Extraction and Preprocessing

Data Overview

Demographic information, tracheal breath sounds, and respiratory function data were collected as described above. To preserve their clinical interpretability, no processing was applied to the demographic variables (age and sex). Respiratory function was represented by two spirometry parameters: peak expiratory flow (PEF) and forced expiratory volume in one second (FEV1). Tracheal breath sound recordings were analyzed using spectral analysis and acoustic tube modeling following a series of preprocessing steps.

Filtering

Respiratory sounds were initially recorded using digital stethoscopes at a 4 kHz sampling rate and then resampled to 8 kHz to improve frequency resolution for linear predictive coding analysis. Prior to analysis, all signals were amplitude normalized. Active respiratory segments were identified using a 100 ms moving average energy detector. Segments exceeding 10% of the maximum energy were retained, while silent periods and noise artifacts were excluded. A first-order pre-emphasis filter (coefficient = 0.97) was applied, and an 8th-order Butterworth low-pass filter was used to prevent aliasing. This preprocessing ensured consistency across recordings and improved the reliability of downstream feature extraction.
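A minimal sketch of three of these steps (amplitude normalization, the 100 ms moving-average energy detector with a 10% threshold, and first-order pre-emphasis) is given below. It is not the authors' MATLAB code: the function names are hypothetical, the trailing-window form of the moving average is an assumption, and resampling and the Butterworth low-pass filter are omitted.

```python
import math


def normalize(x):
    """Scale the signal so that its peak absolute amplitude is 1."""
    peak = max((abs(v) for v in x), default=0.0)
    return [v / peak for v in x] if peak > 0 else list(x)


def pre_emphasis(x, coeff=0.97):
    """First-order pre-emphasis: y[n] = x[n] - coeff * x[n-1]."""
    if not x:
        return []
    return [x[0]] + [x[n] - coeff * x[n - 1] for n in range(1, len(x))]


def active_mask(x, fs, win_s=0.100, rel_threshold=0.10):
    """Flag samples whose moving-average energy exceeds a fraction of the maximum.

    Assumption: a trailing window is used, so the mask lags the true
    onset and offset by up to one window length.
    """
    win = max(1, int(round(win_s * fs)))
    squared = [v * v for v in x]
    energy, acc = [], 0.0
    for n, v in enumerate(squared):
        acc += v
        if n >= win:
            acc -= squared[n - win]  # drop the sample that left the window
        energy.append(acc / min(n + 1, win))
    limit = rel_threshold * max(energy)
    return [e > limit for e in energy]


# Synthetic example: silence, a 500 Hz burst, then silence (fs = 4 kHz).
fs = 4000
burst = [0.5 * math.sin(2.0 * math.pi * 500.0 * n / fs) for n in range(800)]
raw = [0.0] * 400 + burst + [0.0] * 1200
signal = normalize(raw)
mask = active_mask(signal, fs)
emphasized = pre_emphasis(signal)
```

Only the samples flagged in `mask` would be passed on to feature extraction in this sketch.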

Segmentation

Inspiration and expiration phases were manually segmented by a multidisciplinary team. Visual and auditory inspection, supported by spectrogram analysis, guided segmentation. All annotations were independently verified by a physician (Figure 2).19

Two panels showing: upper panel—time-domain waveform of a tracheal breath sound; lower panel—corresponding spectrogram showing frequency distribution over time.

Figure 2 Segmentation of tracheal breath sound is demonstrated through waveform and spectrogram views.

Normalization

Given the relatively small sample size, all continuous features, excluding age and sex, were standardized using z-score normalization (mean = 0, SD = 1). This transformation was applied prior to model development and cross-validation to ensure comparability across features.
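As an illustration, z-score standardization of a single feature can be sketched as follows. The use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not state which estimator was used, and the example values are made up.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize a feature to mean 0 and SD 1 (sample SD, an assumption)."""
    m, s = mean(values), stdev(values)
    if s == 0:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - m) / s for v in values]


# Hypothetical PEF values in L/min, for illustration only.
pef = [180.0, 240.0, 300.0]
pef_z = zscore(pef)
```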

Feature Extraction

Spectral Band Energy Analysis

The first method analyzed the frequency content of respiratory sounds. Previous studies have shown that different frequency ranges carry specific information about airway conditions.23 We divided the acoustic spectrum into six frequency bands based on known characteristics of tracheal sounds in asthma patients: 100–300 Hz contains normal tracheal breathing sounds, 300–400 Hz captures baseline tracheal activity, 400–800 Hz identifies moderate-pitch wheeze, 800–1200 Hz detects high-pitch wheeze, 1200–1600 Hz reveals sharp wheeze sounds, and 1600–2000 Hz includes very fine tracheal sounds. Research has shown that asthmatic wheeze in tracheal recordings occurs mainly between 400 and 1600 Hz.24

We calculated the energy distribution in each frequency band using Welch’s method with 50 ms analysis windows and 50% overlap. Wheeze detection was performed using time-frequency analysis, looking for sustained narrow-band peaks that indicate airway obstruction.25 We measured two wheeze parameters: the percentage of time containing wheeze sounds, and the average intensity of these sounds. Breathing pattern characteristics were obtained from the sound envelope using Hilbert transform, which provided the ratio between inhalation and exhalation phases and how sharply these phases transition.26 We also calculated spectral shape features (centroid, bandwidth, skewness, and kurtosis) that describe the overall frequency distribution of respiratory sounds.27
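The band-energy and spectral-centroid computations can be sketched as below. This is a simplified stand-in rather than the authors' implementation: it uses a naive DFT with a Hann window (the window type is an assumption), reports power in arbitrary units, treats the upper edge of each band as exclusive, and assumes the 8 kHz resampled rate. The function names are hypothetical.

```python
import cmath
import math

# Band edges in Hz as listed in the text; upper edges are exclusive here.
BANDS_HZ = [(100, 300), (300, 400), (400, 800),
            (800, 1200), (1200, 1600), (1600, 2000)]


def welch_power(x, fs, win_s=0.050, overlap=0.5):
    """Average Hann-windowed periodograms over overlapping segments (Welch)."""
    n = int(round(win_s * fs))
    hop = max(1, int(round(n * (1.0 - overlap))))
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * i / n) for i in range(n)]
    twiddle = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]  # DFT phases
    n_bins = n // 2 + 1
    power = [0.0] * n_bins
    n_seg = 0
    for start in range(0, len(x) - n + 1, hop):
        seg = [x[start + i] * window[i] for i in range(n)]
        for k in range(n_bins):
            coef = sum(seg[i] * twiddle[(k * i) % n] for i in range(n))
            power[k] += abs(coef) ** 2
        n_seg += 1
    if n_seg == 0:
        raise ValueError("signal is shorter than one analysis window")
    freqs = [k * fs / n for k in range(n_bins)]
    return freqs, [p / n_seg for p in power]


def band_energy_fractions(freqs, power, bands=BANDS_HZ):
    """Share of the summed band energy that falls in each band."""
    energies = [sum(p for f, p in zip(freqs, power) if lo <= f < hi)
                for lo, hi in bands]
    total = sum(energies)
    return [e / total if total > 0 else 0.0 for e in energies]


def spectral_centroid(freqs, power):
    """Power-weighted mean frequency over all bins."""
    return sum(f * p for f, p in zip(freqs, power)) / sum(power)


# Synthetic check: a pure 600 Hz tone should land in the 400-800 Hz band.
fs = 8000
tone = [math.sin(2.0 * math.pi * 600.0 * i / fs) for i in range(2000)]
freqs, power = welch_power(tone, fs)
fractions = band_energy_fractions(freqs, power)
centroid = spectral_centroid(freqs, power)
```

With 50 ms windows the frequency resolution is 20 Hz, so the narrowest band (300–400 Hz) is covered by only five bins.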

Acoustic Tube Modeling

The second method modeled the airways as a tube with varying cross-sectional area.28 We performed linear predictive coding (LPC) analysis using 14 coefficients on pre-processed signals to estimate the airway’s acoustic properties. The LPC coefficients were converted to reflection coefficients, which indicate how sound reflects at different points along the airway due to changes in cross-sectional area. From these reflection coefficients, we estimated how the airway diameter changes from the vocal cords to the mouth.29
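One common route from a signal to reflection coefficients is the autocorrelation method with the Levinson-Durbin recursion, which yields the LPC and reflection coefficients together. The sketch below follows that route as an assumption; the text does not specify which LPC algorithm was used. The sign convention of the area ratio, and therefore the orientation of the tube, is likewise an assumption that differs between references.

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation sums r[0..max_lag] of a finite signal."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Return (lpc, reflection) from autocorrelation r, with lpc[0] == 1."""
    a = [1.0] + [0.0] * order
    err = r[0]  # prediction error energy; must be positive
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
        ks.append(k)
    return a, ks


def relative_areas(ks):
    """Relative cross-sectional areas of a lossless tube (first section = 1).

    Assumed convention: A[i+1] / A[i] = (1 - k[i]) / (1 + k[i]).
    """
    areas = [1.0]
    for k in ks:
        areas.append(areas[-1] * (1.0 - k) / (1.0 + k))
    return areas


# Autocorrelation of a first-order process with coefficient 0.5.
lpc, ks = levinson_durbin([1.0, 0.5, 0.25], order=2)
areas = relative_areas(ks)
```

For recorded sounds, `levinson_durbin(autocorrelation(frame, 14), 14)` would give the 14-coefficient model mentioned above for one analysis frame.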

We extracted resonance frequencies (formants) and their bandwidths by finding the mathematical roots of the LPC equation. Formants represent the natural resonance frequencies of the airway tube and relate directly to airway size.30 Narrow bandwidths indicate cleaner resonances, which typically occur when airflow is less turbulent and airways are more open.31

Statistical Analysis

All statistical analyses were conducted using IBM SPSS Statistics for Windows, Version 27.0 (IBM Corp., Armonk, NY, USA). Machine learning analyses were performed in MATLAB (version 2017a, MathWorks, USA). Categorical variables were summarized as frequencies and percentages. Continuous variables were reported using mean, standard deviation, minimum, maximum, and median values. The Kolmogorov–Smirnov test was used to assess the normality of continuous variables. Group comparisons were conducted using the Student’s t-test or Mann–Whitney U-test, depending on normality assumptions. Chi-square tests were applied for categorical comparisons. A p-value < 0.05 was considered statistically significant.

Model Development and Evaluation

To assess prognostic performance across different clinical use cases, multiple machine learning classifiers were trained using varying combinations of input features. Age and sex were included in all models as baseline covariates, given their routine availability in both in-person clinical evaluations and remote care settings.

To reflect a remote monitoring scenario, such as smartphone-based or home-centered follow-up, models were first developed using acoustic features alone. Separate models relying exclusively on spirometric parameters—peak expiratory flow (PEF) and forced expiratory volume in one second (FEV1)—were constructed to represent in-hospital assessments based on conventional respiratory function testing. In addition, combined models incorporating both acoustic and spirometric features were evaluated to examine whether multimodal integration could enhance prognostic performance.

In the primary analysis, a formal feature selection strategy was not applied, as the objective was to evaluate the combined effect of different clinical variables within predefined feature sets. Instead, Principal Component Analysis (PCA) was used as a dimensionality reduction technique to mitigate the impact of high-dimensional feature spaces.32 No additional feature selection procedures were performed prior to model development. This approach was preferred to preserve the clinical interpretability of predefined feature groups.

The following classification algorithms were evaluated:

  • Support Vector Machines (SVM) with linear, cubic, and Gaussian kernels
  • K-Nearest Neighbors (KNN) using multiple distance metrics
  • Naive Bayes classifiers with Gaussian and kernel-based estimators
  • Artificial Neural Networks (ANN) with one to three hidden layers and different activation functions
  • Ensemble learning methods, including Bagging, AdaBoost, LogitBoost, and RUSBoost

Hyperparameter optimization was carried out using grid search, which enabled systematic evaluation of predefined hyperparameter values and incorporated early stopping for models with suboptimal performance. All performance metrics were computed with the MATLAB Classification Learner Toolbox under 5-fold stratified cross-validation, which provides an internal assessment of model performance and reduces the risk of optimistic bias. Performance was quantified using accuracy, sensitivity, specificity, precision, F1-score, and Matthews Correlation Coefficient (MCC). To assess the reliability of model performance, 95% confidence intervals were calculated for all area under the ROC curve (AUC) values using the Hanley and McNeil method, accounting for the sample size (n = 65) and class distribution. For each algorithm, cross-validated AUC values were used rather than in-sample predictions to obtain more robust performance estimates.
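The threshold-based metrics listed above all derive from the four cells of a binary confusion matrix. A minimal sketch follows; the counts in the example are hypothetical and are not taken from the study's results.

```python
import math


def _ratio(num, den):
    """Division that returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, precision, F1 and MCC."""
    sensitivity = _ratio(tp, tp + fn)  # recall for the poor-prognosis class
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": _ratio(2.0 * precision * sensitivity, precision + sensitivity),
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical confusion matrix for illustration only.
metrics = classification_metrics(tp=18, fp=5, tn=36, fn=6)
```

Unlike accuracy, MCC uses all four cells, which makes it less sensitive to class imbalance.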

Results

Patient Characteristics and Pulmonary Function Parameters

A total of 65 patients were included in the study, with a mean age of 58.2 years. The majority of the cohort was female (80%), while 20% were male. During the emergency department (ED) visit, 33.8% of patients required hospitalization due to insufficient response to treatment, whereas the remaining patients were discharged after ED management. Among discharged patients, two individuals re-presented to the emergency department within seven days.

Pulmonary function parameters measured before and after treatment are summarized in Table 1. Significant improvements were observed in FEV1 (mL), FEV1 (%), and PEF following treatment (all p < 0.001).

Table 1 Pulmonary Function Parameters Before and After Treatment

Patients were stratified into poor prognosis (hospitalization or ED re-presentation within 7 days) and good prognosis groups. Patients with poor prognosis were significantly older than those with good prognosis (64.1 ± 12.5 vs 54.9 ± 13.1 years, p = 0.007). Comparisons of pulmonary function parameters between these groups are presented in Table 2.

Table 2 Comparison of Pulmonary Function Parameters Between Patients with Poor and Good Prognosis

Both pre-treatment and post-treatment pulmonary function parameters, including FEV1 (mL), FEV1 (%), and PEF, were significantly lower in the poor prognosis group (all p < 0.001).

Changes in pulmonary function parameters following treatment did not differ significantly between groups. Median changes in PEF (p = 0.157) and FEV1 (p = 0.667) were comparable between poor and good prognosis patients.

Correlation Between Acoustic Features and Pulmonary Function Changes

Following the analysis of treatment-related changes in acoustic features, correlations between these changes and pulmonary function parameters (ΔPEF and ΔFEV1) were evaluated using Spearman correlation analysis.
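For reference, the Spearman coefficient is the Pearson correlation computed on ranks, with tied values given their average rank. The sketch below shows the coefficient only (no p-value); the function names are hypothetical and the example values are made up.

```python
import math


def average_ranks(values):
    """1-based ranks, with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for t in range(i, j + 1):
            ranks[order[t]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError for a constant input."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman rank correlation between two equal-length sequences."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up changes in one acoustic feature and in FEV1, for illustration.
delta_feature = [0.1, 0.4, 0.2, 0.9]
delta_fev1 = [50.0, 120.0, 80.0, 400.0]
rho = spearman_rho(delta_feature, delta_fev1)
```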

No statistically significant correlations were observed between changes in any acoustic features and changes in pulmonary function parameters. Although some acoustic features, particularly wheezing-related parameters and high-frequency spectral components, showed higher correlation coefficients compared to others, these findings did not reach statistical significance and therefore do not support the presence of a meaningful association between acoustic changes and pulmonary function improvement (Supplementary Tables S1 and S2).

Comparison of Acoustic Features Between Poor- and Good-Prognosis Groups

Baseline and post-treatment changes in acoustic features were compared between patients with poor prognosis (hospitalization or emergency department revisit within 7 days) and those with good prognosis using spectral band energy analysis (Table 3).

Table 3 Comparison of Spectral Band Energy Changes Between Poor- and Good-Prognosis Groups

Patients with poor prognosis demonstrated significantly higher changes in high-frequency acoustic energy (800–1200 Hz) compared with patients with good prognosis (p = 0.014). In contrast, the change in spectral bandwidth was significantly greater in the good-prognosis group than in the poor-prognosis group (p = 0.027).

No statistically significant differences were observed between the two groups with respect to low-frequency, mid-frequency, very high-frequency, or ultra-high-frequency bands, nor in wheezing-related parameters, inspiratory-to-expiratory ratio, phase sharpness, or spectral centroid (all p > 0.05).

Changes in acoustic channel modeling features before and after treatment were compared between patients with poor and good prognosis, and the results are summarized in Table 4.

Table 4 Comparison of Acoustic Channel Modeling Feature Changes Between Poor and Good Prognosis Groups

Machine Learning–Based Classification Performance

Performance of Machine Learning Models Across Feature Sets

Table 5 summarizes the classification performance of five machine learning algorithms across four different feature configurations. Overall, model performance varied substantially depending on the type of physiological information included in the feature set.

Table 5 Performance of Machine Learning Models Across Different Feature Sets

For Model 1 (demographic + acoustic features), classification performance was moderate. The KNN algorithm achieved the highest accuracy (76.6%) and F1 score (76.8%), followed closely by SVM (accuracy: 75.0%, F1 score: 74.4%). In contrast, Naive Bayes and Ensemble-Bag models demonstrated relatively limited performance, with accuracies of 64.1%, suggesting that acoustic channel features alone may be insufficient for robust discrimination.

In Model 2, which incorporated demographic, spectral, and acoustic features, performance remained comparable to Model 1 but did not show consistent improvement across classifiers. KNN again yielded the highest accuracy (76.6%), while SVM performance slightly declined compared with Model 1. The addition of spectral features did not substantially enhance classification metrics, indicating possible redundancy or noise when combined with acoustic features alone.

In contrast, Model 3, which integrated demographic variables with spirometric lung function measures, demonstrated the strongest overall performance across all feature sets. The SVM classifier achieved the highest accuracy, precision, recall, and F1 score (all 84.4%), indicating strong and balanced classification ability. This model also demonstrated a high AUC of 0.78 (95% CI: 0.67–0.89), indicating stable discrimination performance. KNN also performed well (accuracy: 81.2%), while Naive Bayes, ANN, and Ensemble-Bag showed consistent but slightly lower performance. These findings suggest that spirometric variables provide the most discriminative information among the evaluated feature domains.
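The Hanley and McNeil interval used for AUC values of this kind can be sketched as follows. The class counts in the example are an assumption made for illustration, since the split used by the authors is not stated here, so the resulting interval need not coincide with the one reported.

```python
import math


def hanley_mcneil_ci(auc, n_pos, n_neg, z=1.96):
    """Standard error and normal-approximation CI for an AUC (Hanley & McNeil, 1982)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc * auc)
        + (n_neg - 1) * (q2 - auc * auc)
    ) / (n_pos * n_neg)
    se = math.sqrt(variance)
    lower = max(0.0, auc - z * se)  # clip to the valid AUC range
    upper = min(1.0, auc + z * se)
    return se, lower, upper


# Assumed split of 65 patients into 24 positive and 41 negative cases.
se, lower, upper = hanley_mcneil_ci(auc=0.78, n_pos=24, n_neg=41)
```

Because the two class sizes enter the variance asymmetrically, swapping `n_pos` and `n_neg` changes the width of the interval.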

Finally, Model 4, which combined demographic, spirometric, spectral, and acoustic features, did not outperform the spirometry-based Model 3. Accuracies ranged between 67.2% and 75.0%, with KNN achieving the highest accuracy (75.0%) and SVM showing moderate performance (76.6%). The inclusion of all feature domains did not lead to improved classification performance across models.

Additionally, receiver operating characteristic (ROC) curve analyses were performed to further evaluate the discriminative performance of the models. The ROC curves for all four models are presented in Figure 3.

Receiver operating characteristic (ROC) curves for four models using different feature sets (Model 1–4). Each panel displays the performance of SVM, KNN, Naive Bayes, ANN, and ensemble classifiers, with corresponding AUC values.

Figure 3 ROC curves of four models using different feature sets, with corresponding AUC values shown for each classifier.

Overall, the results indicate that spirometric features play a dominant role in model performance, while the incremental contribution of spectral and acoustic features appears limited when spirometry is already included. Among classifiers, SVM and KNN consistently demonstrated robust performance across feature sets, supporting their suitability for this classification task.

Calibration Analysis

Calibration analysis showed that model performance varied across feature sets and algorithms. Spirometry-based models demonstrated a narrower range of Brier scores (0.101–0.244), indicating more consistent calibration. In contrast, acoustic feature-based models showed a wider range of Brier scores (0.078–0.721), with higher prediction error observed in some algorithms. Models combining spectral and acoustic features exhibited intermediate performance (Brier scores: 0.088–0.339). Although most models satisfied the Hosmer–Lemeshow goodness-of-fit criterion (p > 0.05), visual inspection of calibration curves revealed deviations from the ideal diagonal line in several cases (Supplementary Figures 1–4).
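For orientation, the Brier score is the mean squared difference between the predicted probability and the observed binary outcome, so 0 is perfect and an uninformative constant prediction of 0.5 scores 0.25. A minimal sketch with made-up predictions:

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes) or not outcomes:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


# Made-up predicted probabilities of poor prognosis and observed labels.
example_score = brier_score([0.9, 0.2, 0.6, 0.1], [1, 0, 0, 0])
```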

Discussion

In this prospective emergency department cohort of adults presenting with acute asthma exacerbations, we evaluated the prognostic relevance of spirometric parameters, tracheal sound–derived acoustic features, and machine learning–based classification models for short-term outcomes, defined as hospitalization during the index visit or re-presentation within seven days. Overall, our findings indicate that baseline airflow limitation remains the strongest determinant of short-term prognosis, while respiratory sound features capture complementary but physiologically distinct information.

Patients with poor short-term outcomes demonstrated significantly greater airflow limitation at presentation. Both pre-treatment and post-treatment FEV1 and PEF values were consistently lower in this group compared with patients with good prognosis. Although spirometric parameters improved following emergency department treatment at the cohort level, the magnitude of improvement did not differ meaningfully between prognosis groups. This observation reinforces earlier emergency medicine evidence showing that absolute lung function at presentation is more informative for disposition decisions than short-term reversibility alone.7,11 It also aligns with contemporary guideline-based approaches that emphasize baseline severity in acute asthma risk stratification.8

Changes in tracheal sound–derived acoustic features showed minimal correlation with changes in spirometric parameters such as ΔFEV1 and ΔPEF, yet selected frequency-domain features differed significantly between prognosis groups. This supports the concept that spirometry and respiratory acoustics reflect different physiological dimensions of airway obstruction. Tracheal sounds are influenced by airflow turbulence, airway geometry, and vortex formation, and therefore may remain altered even when spirometric indices improve after bronchodilation.33 Prior experimental and clinical studies have similarly demonstrated that high-frequency components of respiratory sounds are sensitive to airflow limitation and airway inflammation independent of spirometric measures.18,34,35

In our cohort, patients with poor prognosis exhibited significantly greater changes in high-frequency acoustic energy (800–1200 Hz), whereas patients with good prognosis showed larger increases in spectral bandwidth. These findings suggest that persistent high-frequency acoustic alterations may reflect heterogeneous or residual airway dysfunction that is not fully captured by global spirometric indices. Similar frequency-dependent associations between respiratory sounds and airway obstruction have been reported in both asthma and other obstructive airway conditions.18,34,36
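The two frequency-domain quantities discussed in this paragraph can be made concrete with a short sketch. The Python example below assumes a mono recording sampled at a hypothetical 8 kHz and uses NumPy's FFT. The definitions it adopts (band energy as a fraction of total spectral power, and bandwidth as the power-weighted standard deviation of frequency around the spectral centroid) are common conventions assumed for illustration; the feature definitions used in the study may differ, and the synthetic "pre" and "post" signals are not study data.

```python
import numpy as np


def band_energy_ratio(signal, fs, f_lo, f_hi):
    """Fraction of total spectral power that falls between f_lo and f_hi (Hz)."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    in_band = (freqs >= f_lo) & (freqs <= f_hi)
    return float(power[in_band].sum() / power.sum())


def spectral_bandwidth(signal, fs):
    """Power-weighted standard deviation of frequency around the spectral centroid (Hz)."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    weights = power / power.sum()
    centroid = (freqs * weights).sum()
    return float(np.sqrt((((freqs - centroid) ** 2) * weights).sum()))


fs = 8000                                  # assumed sampling rate (Hz)
t = np.arange(fs) / fs                     # one second of samples
low_tone = np.sin(2 * np.pi * 300 * t)     # energy below the 800-1200 Hz band
high_tone = np.sin(2 * np.pi * 1000 * t)   # energy inside the 800-1200 Hz band

# Hypothetical pre- and post-treatment signals with different high-band content.
pre = low_tone + 0.8 * high_tone
post = low_tone + 0.3 * high_tone

delta_band = band_energy_ratio(post, fs, 800, 1200) - band_energy_ratio(pre, fs, 800, 1200)
delta_bw = spectral_bandwidth(post, fs) - spectral_bandwidth(pre, fs)
print("Change in 800-1200 Hz energy fraction:", delta_band)
print("Change in spectral bandwidth (Hz):", delta_bw)
```

In practice, recordings would usually be segmented by respiratory phase and averaged over several breaths before such features are compared, which this sketch omits.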

Wheeze-related parameters did not discriminate short-term prognosis in our study. Although wheezing is a hallmark feature of asthma and is widely used for diagnostic and severity classification, its prognostic utility appears limited in acute settings. Wheezes are intermittent, region-specific, and highly dependent on airflow conditions, which may reduce their ability to reflect overall disease burden or short-term outcome. This is consistent with prior reports showing strong performance of wheeze-based features for severity classification, but less consistent associations with clinical outcomes beyond diagnosis.19

The machine learning analyses further emphasize the dominant prognostic role of spirometry. Models incorporating demographic and spirometric variables achieved the highest and most balanced performance. The spirometry-driven model reached an accuracy of 84.4% with SVM and 81.2% with KNN, with concordant F1 scores and AUC values, indicating robust discrimination across outcome groups. These findings are concordant with clinical evidence demonstrating that objective airflow limitation at presentation provides the most reliable basis for acute asthma disposition decisions.7,11

Sound-based models demonstrated moderate but clinically meaningful performance. When demographic and acoustic channel features were used alone, classification accuracy reached approximately 76.6%. However, adding spectral features did not yield consistent improvement, and the fully integrated multimodal model did not outperform the spirometry-only model. Although the multimodal model achieved relatively high AUC values, this did not translate into better overall classification accuracy. This pattern suggests that increasing feature dimensionality without adding independent prognostic signal may introduce redundancy and increase the risk of overfitting, particularly in small-to-moderate sample sizes. Similar concerns regarding model complexity and generalizability have been highlighted in systematic reviews of respiratory sound analysis and artificial intelligence applications.13
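As general background on how a high AUC can coexist with unremarkable accuracy, the short Python sketch below shows one well-known mechanism: AUC depends only on how cases are ranked, whereas accuracy depends on where the decision threshold falls. The toy labels and scores are hypothetical and are not drawn from the study; the example does not imply that this mechanism explains the results reported here.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case outranks a random negative one."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(y_true, scores, threshold=0.5):
    """Share of cases classified correctly when scores >= threshold are labelled positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(int(p == y) for p, y in zip(preds, y_true)) / len(y_true)


# Hypothetical scores that rank every positive above every negative,
# yet all lie below the conventional 0.5 cut-off.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40]

print("AUC:", roc_auc(y_true, scores))
print("Accuracy at threshold 0.5:", accuracy(y_true, scores, 0.5))
print("Accuracy at threshold 0.225:", accuracy(y_true, scores, 0.225))
```

Here the ranking is perfect, so the AUC is maximal, but accuracy at the default threshold is no better than chance because every case is labelled negative. Moving the threshold recovers the accuracy, which is why discrimination and calibration are usefully reported together.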

In the present study, a formal feature selection strategy was not implemented in the primary analysis, as the main objective was to assess the combined contribution of different clinical variables within predefined feature sets. Instead, Principal Component Analysis (PCA) was applied as a dimensionality reduction technique. However, the inclusion of heterogeneous features without systematic feature optimization may have introduced redundancy and noise, which could have adversely affected model stability and generalizability. This may partly explain why expanding the feature space did not consistently result in improved classification performance across models. These observations highlight the importance of incorporating structured feature selection procedures alongside dimensionality reduction approaches in future studies. Particularly in relatively small datasets, unoptimized feature inclusion may increase the risk of overfitting and reduce the robustness of model performance.32,37,38
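To illustrate how dimensionality reduction can be combined with stratified cross-validation without leaking information between folds, the following Python sketch places scaling, PCA, and an SVM inside a single scikit-learn pipeline so that each step is refitted on the training portion of every fold. This is an assumed, generic setup rather than the authors' pipeline: the synthetic dataset, its size, the 95% explained-variance target, and the RBF kernel are all illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a small cohort with many partly redundant features.
X, y = make_classification(
    n_samples=60,
    n_features=40,
    n_informative=5,
    n_redundant=10,
    random_state=0,
)

# Scaling and PCA are fitted inside each training fold, never on held-out data.
pipeline = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95, svd_solver="full"),  # keep components explaining 95% of variance
    SVC(kernel="rbf"),
)

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(pipeline, X, y, cv=cv, scoring="accuracy")

print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean())
```

A feature selection step (for example, a filter or wrapper method) could be added to the same pipeline ahead of PCA, so that it is also refitted within each fold; with datasets of this size, fold-to-fold variation in the scores is itself a useful indicator of model stability.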

When interpreted in the context of existing literature, these findings are not unexpected. Most machine learning studies using respiratory sounds focus on diagnostic classification or severity labeling under controlled conditions, rather than short-term prognosis in heterogeneous emergency department populations. For example, recent studies have reported very high accuracy for asthma diagnosis or wheeze classification using sound-based machine learning approaches.17,20,39 However, diagnosis and severity classification are fundamentally different prediction tasks from short-term outcomes such as hospitalization or early return visits, which are additionally influenced by treatment response, clinician decision-making, comorbidities, and social factors.

Limitations

This study has several limitations. First, the sample size was modest and derived from a single emergency department, which may limit the generalizability of the findings, particularly for machine learning models using high-dimensional acoustic features. Second, the composite outcome of hospitalization or emergency department re-presentation within seven days may be influenced by non-physiological factors such as clinical decision-making and access to follow-up care. Third, only limited clinical variables were included. The models were constructed solely from spirometric and acoustic features, and additional factors such as prior exacerbation history, medication use, and inflammatory markers were not assessed, which may limit model performance and clinical applicability; future studies incorporating these variables may improve predictive performance and clinical utility. Fourth, respiratory sound recordings were obtained from the trachea only, which may underrepresent localized peripheral airway phenomena. Finally, although stratified 5-fold cross-validation was employed, this approach may not adequately address the risk of overfitting in small datasets, which may affect model stability and the robustness of the findings. The results should therefore be interpreted with caution, and validation in larger, independent external cohorts is necessary to confirm them.

Conclusion

Taken together, our results suggest that machine learning–assisted respiratory sound analysis should be viewed as a complementary tool rather than a replacement for spirometry. While spirometry-driven models provide the most reliable prognostic information, sound-based models achieved non-trivial accuracy and may offer practical value in clinical scenarios where spirometry cannot be reliably obtained. In such settings, tracheal sound analysis may serve as an adjunctive physiological signal to support clinical assessment, consistent with the broader trajectory of research on computerized respiratory sound analysis and intelligent auscultation systems.12,13,21

Data Sharing Statement

The datasets generated and analyzed during the current study are available from the corresponding author (MOG) upon reasonable request.

Ethics Approval and Informed Consent

The study protocol was reviewed and approved by the Izmir Bakırcay University Ethics Committee (Approval No: 1285, dated 08.11.2023).

Author Contributions

All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.

Funding

This study has been supported by the Scientific Research Projects Coordination Unit of Izmir Bakırçay University (Grant Number: BBAP.2023.013).

Disclosure

The authors declared no conflicts of interest in this work.

References

1. Masoli M, Fabian D, Holt S, Beasley R. The global burden of asthma: executive summary of the GINA Dissemination Committee report. Allergy. 2004;59(5):469–478. doi:10.1111/j.1398-9995.2004.00526.x

2. Peters SP, Ferguson G, Deniz Y, Reisner C. Uncontrolled asthma: a review of the prevalence, disease burden and options for treatment. Respir Med. 2006;100(7):1139–1151. doi:10.1016/j.rmed.2006.03.031

3. Chipps BE, Zeiger RS, Borish L, et al. Key findings and clinical implications from The Epidemiology and Natural History of Asthma: Outcomes and Treatment Regimens (TENOR) study. J Allergy Clin Immunol. 2012;130(2):332–342. doi:10.1016/j.jaci.2012.04.014

4. Czira A, Turner M, Martin A, et al. A systematic literature review of burden of illness in adults with uncontrolled moderate/severe asthma. Respir Med. 2022;191:106670. doi:10.1016/j.rmed.2021.106670

5. Tupper OD, Ulrik CS. Long-term predictors of severe exacerbations and mortality in adults with asthma. Respir Res. 2021;22(1):1–9. doi:10.1186/s12931-021-01864-z

6. Hasegawa K, Craig SS, Teach SJ, et al. Management of asthma exacerbations in the emergency department. J Allergy Clin Immunol Pract. 2021;9(7):2599–2610. doi:10.1016/j.jaip.2020.12.037

7. Grunfeld AF, Fitzgerald JM. Discharge considerations for adult asthmatic patients treated in emergency departments. Can Respir J. 1996;3:322–327. doi:10.1155/1996/254627

8. Global Initiative for Asthma. Global strategy for asthma management and prevention: 2024 update. Available from: https://ginasthma.org/2024-report/. Accessed November 10, 2025.

9. Weber EJ, Silverman RA, Callaham ML, et al. A prospective multicenter study of factors associated with hospital admission among adults with acute asthma. Am J Med. 2002;113(5):371–378. doi:10.1016/S0002-9343(02)01242-1

10. Schneider J, Matsuda K, House S, Ferguson I, Aubuchon K, Lewis L. Dyspnea scores as predictors of hospital admission in acute asthma exacerbations. Presented at: Society for Academic Emergency Medicine; 2012; Chicago, Illinois. Available from: http://digitalcommons.wustl.edu/em_conf/6

11. Nowak RM, Pensler MI, Sarkar DD, et al. Comparison of peak expiratory flow and FEV1 admission criteria for acute bronchial asthma. Ann Emerg Med. 1982;11(2):64–69. doi:10.1016/S0196-0644(82)80298-9

12. Lozano-Garcia M, Fiz JA, Martinez-Rivera C, Torrents A, Ruiz-Manzano J, Jane R. Novel approach to continuous adventitious respiratory sound analysis for the assessment of bronchodilator response. PLoS One. 2017;12(2):e0171455. doi:10.1371/journal.pone.0171455

13. Palaniappan R, Sundaraj K, Sundaraj S. Artificial intelligence techniques in respiratory sound analysis: a systematic review. Biomed Tech. 2014;59(1):7–18. doi:10.1515/bmt-2013-0074

14. Kim JW, Kim T, Shin J, et al. Prediction of obstructive sleep apnea using respiratory sounds. Clin Exp Otorhinolaryngol. 2019;12(1):72–79. doi:10.21053/ceo.2018.00388

15. Jacome C, Oliveira A, Marques A. Computerized respiratory sounds in stable and exacerbated COPD. Clin Respir J. 2017;11(5):612–620. doi:10.1111/crj.12392

16. Nabi FG, Sundaraj K, Kiang LC, et al. Wheeze sound analysis using computer-based techniques: a systematic review. Biomed Tech. 2019;64(1):1–28. doi:10.1515/bmt-2016-0219

17. Naqvi SZH, Arooj M, Aziz S, Khan MU, Choudhary MA, Ul Hassan MN. Spectral analysis of lung sounds for classification of asthma and pneumonia wheezing. In: 2020 International Conference on Electrical, Communication, and Computer Engineering (ICECCE). IEEE; 2020:1–6. doi:10.1109/ICECCE49384.2020.9179417

18. Shimoda T, Obase Y, Nagasaka Y, et al. Lung sound analysis for localization of airway inflammation in asthma. J Asthma Allergy. 2017;10:99–108. doi:10.2147/JAA.S125938

19. Nabi FG, Sundaraj K, Lam CK, Palaniappan R. Characterization and classification of asthmatic wheeze sounds by severity using spectral integrated features. Comput Biol Med. 2019;104:52–61. doi:10.1016/j.compbiomed.2018.10.035

20. Topaloglu I, Ozduygu G, Atasoy C, et al. Machine learning-driven lung sound analysis: a novel methodology for asthma diagnosis. Adv Respir Med. 2025;93(5):32. doi:10.3390/arm93050032

21. Lozano-Garcia M, Davidson CM, Jane R. Analysis of tracheal and pulmonary continuous adventitious respiratory sounds in asthma. In: 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE; 2019:4930–4933. doi:10.1109/EMBC.2019.8859310

22. Graham BL, Steenbruggen I, Miller MR, et al. Standardization of spirometry: 2019 update. Am J Respir Crit Care Med. 2019;200(8):e70–e88. doi:10.1164/rccm.201908-1590ST

23. Pramono RXA, Bowyer S, Rodriguez-Villegas E. Automatic adventitious respiratory sound analysis: a systematic review. PLoS One. 2017;12(5):e0177926. doi:10.1371/journal.pone.0177926

24. Fernandez-Granero MA, Sanchez-Morillo D, Leon-Jimenez A. Computerised analysis of telemonitored respiratory sounds for predicting acute exacerbations of COPD. Sensors. 2015;15(10):26978–26996. doi:10.3390/s151026978

25. Taplidou SA, Hadjileontiadis LJ. Wheeze detection based on time-frequency analysis of breath sounds. Comput Biol Med. 2007;37(8):1073–1083. doi:10.1016/j.compbiomed.2006.09.007

26. Charleston-Villalobos S, Martinez-Hernandez G, Gonzalez-Camarena R, Chi-Lem G, Carrillo JG, Aljama-Corrales T. Assessment of multichannel lung sounds parameterization for two-class classification in interstitial lung disease patients. Comput Biol Med. 2011;41(7):473–482. doi:10.1016/j.compbiomed.2011.04.009

27. Reichert S, Gass R, Brandt C, Andres E. Analysis of respiratory sounds: state of the art. Clin Med Circ Respirat Pulm Med. 2008;2:45–58. doi:10.4137/CCRPM.S530

28. Moussavi Z. Fundamentals of Respiratory System and Sounds Analysis. Cham (Switzerland): Springer; 2006. doi:10.1007/978-3-031-01617-2

29. Wakita H. Direct estimation of the vocal tract shape by inverse filtering of acoustic speech waveforms. IEEE Trans Audio Electroacoust. 1973;21(5):417–427. doi:10.1109/TAU.1973.1162506

30. Chambres G, Hanna P, Desainte-Catherine M. Automatic detection of patients with respiratory diseases using lung sound analysis. In: 2018 International Conference on Content-Based Multimedia Indexing (CBMI). IEEE; 2018:1–6.

31. Rabiner LR, Schafer RW. Digital Processing of Speech Signals. 1st ed. Englewood Cliffs (NJ): Prentice-Hall; 1978.

32. Jolliffe IT. Principal Component Analysis. 2nd ed. New York: Springer; 2002.

33. Mori M, Ono M, Hisada T, et al. Relationship between forced expiratory flow and tracheal sounds. Respiration. 1988;54(2):78–88. doi:10.1159/000195505

34. Habukawa C, Murakami K, Horii N, Yamada M, Nagasaka Y. A new modality using breath sound analysis to evaluate the control level of asthma. Allergol Int. 2013;62(1):29–35. doi:10.2332/allergolint.12-OA-0428

35. Habukawa C, Murakami K, Endoh M, Yamada M, Horii N, Nagasaka Y. Evaluation of airflow limitation using a new modality of lung sound analysis in asthmatic children. Allergol Int. 2015;64(1):84–89. doi:10.1016/j.alit.2014.08.006

36. Oliveira A, Marques A. Respiratory sounds in healthy people: a systematic review. Respir Med. 2014;108(4):550–570. doi:10.1016/j.rmed.2014.01.004

37. Pudjihartono N, Fadason T, Kempa-Liehr AW, et al. Review of feature selection methods for machine learning-based disease risk prediction. Front Bioinform. 2022;2:927312. doi:10.3389/fbinf.2022.927312

38. Guyon I, Elisseeff A. An introduction to variable and feature selection. J Mach Learn Res. 2003;3:1157–1182. doi:10.1162/153244303322753616

39. Naqvi SZH, Choudhry MA. An automated system for classification of chronic obstructive pulmonary disease and pneumonia patients using lung sound analysis. Sensors. 2020;20(22):6512. doi:10.3390/s20226512

Creative Commons License © 2026 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.