Clinical, Cosmetic and Investigational Dermatology, Volume 15

Differential Diagnosis of Rosacea Using Machine Learning and Dermoscopy

Authors Ge L, Li Y, Wu Y, Fan Z, Song Z

Received 23 May 2022

Accepted for publication 20 July 2022

Published 1 August 2022 Volume 2022:15 Pages 1465–1473

DOI https://doi.org/10.2147/CCID.S373534

Checked for plagiarism Yes

Review by Single anonymous peer review

Peer reviewer comments 2

Editor who approved publication: Dr Jeffrey Weinberg



Lan Ge,1 Yaoying Li,1 Yaguang Wu,1 Ziwei Fan,2 Zhiqiang Song1

1Department of Dermatology, The First Affiliated Hospital of Army Medical University, Chongqing, People’s Republic of China; 2Lianren Digital Health Technology Co., LTD, Shanghai, People’s Republic of China

Correspondence: Zhiqiang Song, Department of Dermatology, The First Affiliated Hospital of Army Medical University, No. 30, Gaotanyanzheng Street, Shapingba District, Chongqing, People’s Republic of China, Email [email protected]

Introduction: Rosacea is a common chronic inflammatory disease of the face whose diagnosis is based mainly on symptoms and physical signs. Because its symptoms and signs overlap with those of other inflammatory skin diseases, young and inexperienced doctors often misdiagnose or miss it in clinical practice. We analyzed skin physiology and dermatoscopy results using a machine learning method to identify the characteristics that differentiate rosacea from other common facial inflammatory skin diseases, with the aim of improving the accuracy of the clinical and differential diagnosis of rosacea.
Methods: A total of 495 patients who were jointly diagnosed by two experienced doctors were included. Basic data, clinical symptoms, physiological skin detection, and dermatoscopy results were collected, and the clinical characteristics of rosacea and other common facial inflammatory diseases were summarized according to the descriptive analysis results. The model was established using a machine learning method and compared with the judgment results of young and inexperienced doctors to verify whether the model can improve the accuracy of clinical diagnosis and differential diagnosis of rosacea.
Results: The proportions of yellow and red halos, vascular polygons, and follicular pustules shown by dermatoscopy, together with the melanin index from physiological skin detection, differed significantly between rosacea and other common facial inflammatory diseases (all P < 0.01). Among the machine learning algorithms tested, the GBM (Gradient Boosting Machine) algorithm performed best, with an error rate of 5.48% in the validation set. In the final man-machine comparison, the accuracy of the GBM model in classifying skin diseases was significantly higher than that of young and inexperienced doctors.
Conclusion: Dermatoscopy combined with machine learning can effectively improve the diagnosis and differential diagnosis accuracy of rosacea and other facial inflammatory skin diseases.

Keywords: rosacea, dermatoscope, machine learning

Background

Rosacea is a chronic relapsing inflammatory disease that is commonly seen on the face of women aged 20–50. It mainly involves facial nerves and vessels as well as the sebaceous gland unit of the hair follicle. The main clinical manifestations of rosacea are intermittent flushing, persistent erythema, papules, pustules, and telangiectasia. Hypertrophy and eye changes have also been reported in a few patients. However, the clinical presentation of acne is similar to that of rosacea, with acne commonly appearing on the face and shoulders, and skin changes including intermittent flushing, papules and pustules.1–3

The diagnosis of rosacea is clinically challenging. The pathogenesis of rosacea is still unclear, and relatively specific biological markers are lacking. Moreover, the clinical manifestations are diverse and can be induced or aggravated by many factors. Improper treatments, such as topical hormones, and coexisting diseases can complicate the diagnosis, so that rosacea overlaps with other inflammatory diseases that cause facial erythema.4 Although histopathology may assist in differential diagnosis, it should not be used as a routine diagnostic method because of facial cosmetic concerns.5,6 Some studies have shown that using dermatoscopy to detect changes in vascular polygons around the hair follicles can effectively help improve the diagnosis of rosacea.7–10

In recent years, the application of machine learning in medicine has developed rapidly. Combined with skin imaging data, machine learning shows good prospects for the screening, diagnosis, and evaluation of skin diseases.11–15 The main idea of the GBM algorithm is to build each new base learner along the gradient descent direction of the loss function of the previously established learners and then combine these base learners, so that the overall loss of the model decreases continuously and the model keeps improving.
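The boosting idea described above can be illustrated with a minimal sketch. It assumes a squared-error loss and one-split regression stumps on a single toy feature; the data, function names, and learning rate are hypothetical and are not the model used in this study. For squared-error loss, the negative gradient is simply the residual, so each new stump is fitted to the residuals left by the learners built so far.

```python
# Minimal gradient boosting sketch (squared-error loss, one-split regression stumps).
# Illustrative only: the toy data and all names are hypothetical, not the study's model.

def fit_stump(xs, residuals):
    """Return the single-threshold stump that best fits the residuals (least squares)."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must leave samples on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.3):
    """Fit an additive model of stumps; return (predict function, loss per round)."""
    base = sum(ys) / len(ys)  # initial constant model
    predictions = [base] * len(ys)
    learners, losses = [], []
    for _ in range(n_rounds):
        # Negative gradient of 1/2 * (y - f)^2 with respect to f is the residual y - f.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        learners.append(stump)
        predictions = [p + learning_rate * stump(x) for x, p in zip(xs, predictions)]
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, predictions)) / len(ys))

    def predict(x):
        return base + learning_rate * sum(stump(x) for stump in learners)

    return predict, losses


xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
predict, losses = gradient_boost(xs, ys)
print(losses[0], losses[-1])  # training loss after the first and the last round
```

In this sketch the training loss never increases from one round to the next, which is the "continuously decreasing loss" behavior described in the text. Production implementations such as the GBM in H2O use deeper trees, other loss functions, and regularization.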

In this study, we used machine learning techniques to analyze the results of skin physiological monitoring and dermatoscopy in order to identify indicators with relative specificity and sensitivity for the clinical diagnosis of rosacea. Our aim was to establish a mathematical model that can accurately identify rosacea and help clinicians diagnose it more quickly and accurately.

Materials and Methods

Research Object

A total of 495 participants, comprising patients with facial diseases and controls with healthy faces, each jointly diagnosed by two experienced doctors, were included in this study. Of these, 450 were used for medical statistics and machine learning modeling: 350 were diagnosed with facial diseases, including 150 with rosacea, 100 with acne, and 100 with facial dermatitis (30 with seborrheic dermatitis, 30 with atopic dermatitis, and 40 with contact dermatitis), and 100 were normal controls. Machine learning modeling was performed with a training set to validation set ratio of 7:3. Another 45 patients (15 with rosacea, 15 with acne, and 15 with facial dermatitis, comprising 5 cases each of contact dermatitis, atopic dermatitis, and seborrheic dermatitis) were included to validate the model.

Skin Physiological Detection and Dermatoscopy

Dermatoscopy

After cleaning, the skin lesions were fully exposed and examined with a CBS-908 dermatoscope (China Boshi). Images were taken first in polarized light mode and then in non-polarized light mode (50× magnification). At the same time, new structural patterns were observed.

Skin Physiological Detection

The MPA580 multi-probe skin tester (CK, Germany) was used. No skincare products were applied to the skin after faces were cleaned. Patients were required to rest indoors for 30 minutes. The indoor environment had no direct sunlight, no windows, and no ventilation. Room temperature was controlled at 25–28 ℃ and relative humidity at 50–60%. Transepidermal water loss, stratum corneum water content, pH, lipid, and erythema values were measured at the central forehead, left cheek, right cheek, and jaw. The mean of the four facial sites was taken as the final value for each indicator, and these values were compared.

Clinical Data Collection

Basic data were collected, and each patient underwent skin physiology and dermatoscopy when they visited the doctor. Skin physiology examination included transepidermal water loss (TEWL), water content, elasticity, pH, melanin, erythema, lactate, and lipid, which were used as continuous variables. Dermatoscopy included the assessment of blood vessels, hair follicles, vellus, and scales, which were used as dichotomous variables (positive/negative).

Statistical Analysis

SPSS 26.0 software was used for statistical analysis. For continuous variables, the Brown-Forsythe and Welch tests were used for the homogeneity test of variance, and the Games-Howell test was adopted to compare differences between groups. For categorical variables, the Chi-square test was used to compare differences between groups. In the machine learning part, we used the H2O machine learning platform (https://www.h2o.ai/). H2O is a fully open-source, distributed in-memory machine learning platform with linear scalability. It supports the most widely used statistical and machine learning algorithms, including GLM (Generalized Linear Model), GBM (Gradient Boosting Machine), XGBoost (eXtreme Gradient Boosting), DeepLearning, StackEnsemble, GLRM (Generalized Low Rank Models), and more. We took 70% of the patients as the training set and 30% as the validation set, and used 5-fold cross-validation to validate the model. The results of skin physiological detection were input as continuous variables, and the results of dermatoscopy were input as categorical variables. Because the limited sample size would make a deep learning model prone to overfitting, we used the AutoML method of the H2O platform and selected four algorithms, GLM, GBM, GLRM, and XGBoost, for modeling.16–18
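The 7:3 split and 5-fold cross-validation described above can be sketched in plain Python as follows. This is an illustration rather than the H2O procedure itself: it assumes that the split is stratified by diagnosis and that the folds are drawn within the training set, neither of which is stated in the text, and the function names and random seed are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Split sample indices into training and validation sets, class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, valid = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        valid.extend(indices[cut:])
    return sorted(train), sorted(valid)


def k_fold_assignments(indices, k=5, seed=0):
    """Assign each index to one of k folds after shuffling."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return {index: position % k for position, index in enumerate(shuffled)}


# Group sizes of the modeling cohort as reported in the text.
labels = (["rosacea"] * 150 + ["acne"] * 100
          + ["dermatitis"] * 100 + ["normal"] * 100)
train_idx, valid_idx = stratified_split(labels)
folds = k_fold_assignments(train_idx)
print(len(train_idx), len(valid_idx))
```

Stratifying keeps the four diagnostic groups in the same proportions in both sets, which matters when group sizes are unequal. In each cross-validation round, one fold is held out for evaluation and the other four are used for fitting.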

Results

Skin Physiological Detection Result

Among the 450 participants used for modeling, 54 were male (12%) and 396 were female (88%), aged between 13 and 65 years (Table 1). Pairwise comparison of the rosacea group with the acne, dermatitis, and normal groups using the Games-Howell test (Table S1) showed that most of the skin physiological indicators could not precisely differentiate patients with rosacea from those with other inflammatory facial diseases (Table 2). However, the mean melanin index was lower in the rosacea group than in the inflammatory disease groups, with significant differences between the rosacea and acne groups and between the rosacea and dermatitis groups (p < 0.05), whereas no significant difference was found between the rosacea and normal groups (p = 0.411). The melanin index may therefore be a potential indicator for differentiating rosacea from other facial inflammatory skin diseases.

Table 1 Patients’ Characteristics

Table 2 Difference Between Rosacea and the Others (Skin Physiological Testing)

Dermatoscopy Result

The results of dermatoscopy were compared between groups using the Chi-square test (Table 3). The rosacea group differed significantly from the other three groups (p < 0.05) in vascular polygons, yellow and red halos around hair follicles, and pustules. The difference was most marked for vascular polygons, whose positive rate reached 100% in the rosacea group, compared with 8%, 4%, and 1% in the acne, dermatitis, and normal groups, respectively.
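As an illustration of the Chi-square comparison, the sketch below computes the Pearson statistic for a single 2×2 table of rosacea versus acne for vascular polygons. The counts are an assumption: they are reconstructed from the reported positive rates (100% and 8%) and group sizes (150 and 100), not taken from Table 3. The sketch uses no continuity correction, and whether the original SPSS analysis did so is not stated.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Assumed counts: rosacea 150 positive / 0 negative; acne 8 positive / 92 negative.
statistic, p_value = chi_square_2x2(150, 0, 8, 92)
print(statistic, p_value)
```

With a difference this large the p-value is far below 0.05. When expected cell counts are small, Fisher's exact test is usually preferred over the Chi-square approximation.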

Table 3 Difference Between Rosacea and the Others (Dermoscopy)

Machine Learning Modeling Result

First, we divided the data into training and validation sets at a ratio of 7:3 and built a dichotomous (two-class) model separating rosacea from the other three groups. After inputting the results of skin physiological detection and dermatoscopy, we found that, regardless of the algorithm used, the AUC of the dichotomous model was above 0.99 (Table 4); the AUC is a performance index for dichotomous models, and the closer it is to 1, the better. The model could therefore differentiate rosacea well from other facial skin diseases. Among the input variables, vascular polygons had the highest weight, which was consistent with our medical statistical results. Since the results of the dichotomous model were satisfactory, we next built a four-class model to identify the four groups: rosacea, acne, facial dermatitis, and normal controls. The Gradient Boosting Machine (GBM) algorithm had the lowest log loss16,20–22 and was therefore the highest-rated model (Table 5). The results for the training and validation sets are presented as confusion matrices. The error rate of this model was 0 in the training set and 5.48% in the validation set (Figure 1).
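The two metrics used to rate the four-class models, log loss and the error rate read from a confusion matrix, can be computed as in the sketch below. The inputs are small hypothetical examples, not the values in Table 5 or Figure 1, and the function names are illustrative.

```python
import math


def multiclass_log_loss(true_labels, predicted_probs, eps=1e-15):
    """Mean negative log of the probability assigned to the true class."""
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        p = min(max(probs[label], eps), 1 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(true_labels)


def error_rate(confusion):
    """Share of off-diagonal counts in a square confusion matrix (rows = true class)."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total


classes = ["rosacea", "acne", "dermatitis", "normal"]

# A model that assigns 0.25 to every class scores ln(4), about 1.386.
uniform = [{c: 0.25 for c in classes}] * 4
print(multiclass_log_loss(classes, uniform))

# Hypothetical 2x2 confusion matrix: 3 of 20 samples are misclassified.
print(error_rate([[9, 1], [2, 8]]))
```

Log loss penalizes confident wrong predictions heavily, so it separates models that have similar error rates but differently calibrated probabilities. The uniform-guess value of ln 4 gives a reference point for judging four-class log loss scores.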

Table 4 Machine Learning Results (2-Class)

Table 5 Machine Learning Results (4-Class)

Figure 1 The confusion matrix of the 4-class GBM model. (A) The training set. (B) The validation set.

As described in the Background, the GBM algorithm builds each new base learner along the gradient descent direction of the loss function of the previously established learners and combines them, so that the overall loss of the model declines continuously and the model keeps improving. Our results showed that machine learning models based on the results of skin physiological detection and dermatoscopy, both dichotomous and four-class, could accurately carry out the differential diagnosis of rosacea. The four-class model in particular could effectively differentiate patients with rosacea, acne, and dermatitis, improving diagnostic accuracy across facial skin diseases and helping clinicians make a diagnosis.

Man-Machine Comparison Test

To confirm the practical value of the machine learning classification model in clinical practice, we collected a further 45 patients (15 with rosacea, 15 with acne, and 15 with dermatitis) for GBM model prediction and invited three resident doctors to diagnose them. The overall accuracy of the GBM model was 84.4% (38/45). By group, the accuracy was 93.3% (14/15) for rosacea, 73.3% (11/15) for acne, and 86.7% (13/15) for dermatitis. Thus, accuracy exceeded 90% in the rosacea group and 80% in the dermatitis group, while it was somewhat lower in the acne group. The overall accuracy of the three doctors was 35.6% (16/45), 37.8% (17/45), and 37.8% (17/45), respectively. The detailed comparison between the three doctors and the machine learning model is shown in Table 6 and Table S2. The accuracy of the machine learning model was thus much higher than that of young and inexperienced doctors. These results also indicate that conventional skin physiological detection and dermatoscopy combined with machine learning could effectively improve the efficiency of clinicians in the differential diagnosis of rosacea, which benefits the diagnosis and treatment of patients.
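The accuracy figures in this comparison follow directly from the reported counts. The short sketch below recomputes them, rounded to one decimal place; the variable and function names are illustrative.

```python
def accuracy_percent(correct, total):
    """Accuracy as a percentage, rounded to one decimal place."""
    return round(100 * correct / total, 1)


# Correct predictions out of 15 cases per group, as reported for the GBM model.
gbm_by_group = {"rosacea": (14, 15), "acne": (11, 15), "dermatitis": (13, 15)}
gbm_correct = sum(correct for correct, _ in gbm_by_group.values())
gbm_total = sum(total for _, total in gbm_by_group.values())

gbm_group_accuracy = {group: accuracy_percent(*counts)
                      for group, counts in gbm_by_group.items()}
gbm_overall = accuracy_percent(gbm_correct, gbm_total)

# Correct diagnoses out of 45 cases for each of the three resident doctors.
doctor_accuracy = [accuracy_percent(correct, 45) for correct in (16, 17, 17)]

print(gbm_group_accuracy, gbm_overall, doctor_accuracy)
```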

Table 6 Human-Machine Comparison Results

Discussion

In recent years, with changes in modern lifestyle, especially the rapid growth of cosmetic skin treatments, inflammatory facial skin diseases, which mainly manifest as facial flushing, papules, and pustules, have become increasingly frequent in clinical practice. The most common inflammatory facial skin diseases include rosacea, seborrheic dermatitis, contact dermatitis, atopic dermatitis, and acne. However, the diagnosis of skin diseases is based mostly on symptoms, and the clinical manifestations of these diseases at different stages are very similar and overlapping, so misdiagnosis and mistreatment are common. In the present study, we used dermatoscopy and skin physiological detection to examine patients with common inflammatory facial diseases, including rosacea, that manifest as erythema, papules, and pustules. We then established a model through comprehensive analysis with a machine learning method to identify the characteristics that accurately differentiate rosacea from other common inflammatory facial skin diseases, aiming to improve the accuracy of the clinical and differential diagnosis of rosacea.

In the present study, we first described and analyzed the characteristics and forms of common facial skin diseases from the perspective of conventional medical statistics. We found that the melanin index in physiological skin detection and the vascular polygons, as well as yellow and red halos around hair follicles and pustules in dermatoscopy, were all valuable indicators in the differential diagnosis of rosacea, which is consistent with a previous study.19 That study included 115 patients, including 25 rosacea patients, all of whom had positive dermoscopic results for vascular polygons.17 However, because some patients with other facial skin diseases are also positive for vascular polygons, a more optimized method building on the vascular polygon findings was needed to improve the efficiency of differential diagnosis. In our study, the positive rate of vascular polygons, as well as light yellow and yellowish red halos around hair follicles, reached 100% in the rosacea group, compared with 8%, 4%, and 1% in the acne, dermatitis, and normal groups, respectively. Therefore, we believe that vascular polygons as well as light yellow and yellowish red halos around hair follicles could be used as indicators in the diagnosis and differential diagnosis of rosacea to distinguish rosacea patients from patients with other facial skin diseases.

In the process of machine learning modeling, we found that log loss scores were relatively poor (logloss > 1) when only the results of physiological skin detection were included in the model, and reliable models could be obtained only after the simultaneous inclusion of dermatoscopy results. Therefore, dermatoscopy is indispensable for the differential diagnosis of rosacea.

In addition, when we examined the factors with high weights in the dichotomous and four-class models, we found that, apart from vascular polygons, which were also identified by conventional statistics, the melanin index, yellow and red halos, and pustules around hair follicles did not rank very high, whereas keratotic plugs, erythema, punctate vessels, transepidermal water loss (TEWL), and water content ranked the highest, which is a very interesting phenomenon. TEWL and water content are two important indicators of skin barrier function. Combining clinical experience with the results of our analysis, we found no significant difference in these two indicators between the rosacea group and the acne group or between the rosacea group and the normal group. In the dermatitis group, however, TEWL was significantly increased and water content was significantly decreased. Therefore, we believe that the skin barrier function of primary rosacea does not differ significantly from that of normal people, and that there is no serious barrier damage as previously thought. The damage to skin barrier function in patients with rosacea differs significantly from that in patients with atopic dermatitis, contact dermatitis, and seborrheic dermatitis. On this basis, it is reasonable to believe that if a patient with rosacea also has abnormal physiological indicators, there may be other causes of barrier damage, such as drugs for internal use (eg, glucocorticoids) or coexisting diseases, such as rosacea associated with seborrheic dermatitis. The case of primary rosacea associated with other diseases is also worth exploring. If we want to optimize the model in the future and make the included indicators more concise and precise, we can start from these high-weight factors, which would facilitate the application of the model in clinical practice to assist doctors in the differential diagnosis of various facial skin diseases and their complications, so that patients can receive the correct treatment.

Finally, a highlight of this study is that we collected test results from 45 cases and carried out a man-machine comparison. Previous research on skin diseases has mainly focused on machine learning for the differential diagnosis of skin cancer and the corresponding man-machine comparison,23 so our study adds man-machine comparison results for the differential diagnosis of inflammatory facial diseases, expanding the application of machine learning technology in skin diseases.

Conclusion

In conclusion, dermatoscopy combined with machine learning showed better sensitivity and specificity for the diagnosis of rosacea and could effectively improve the diagnostic accuracy of inexperienced doctors for rosacea. However, because the current model is complex, it is difficult to promote in clinical practice, so our team plans to further expand the sample size to optimize the model and make it applicable in clinical practice in the future.

Ethical Approval and Consent to Participation

The study was approved by the ethics review committee of The First Affiliated Hospital of Army Medical University. The study complies with the Declaration of Helsinki. Before initiation of the study, written or verbal informed consent was obtained from all participants; verbal informed consent was acceptable to and approved by the ethics review committee.

Acknowledgments

We wish to thank Lianren Digital Health Technology Co., LTD, for suggestions regarding statistical methods and research ideas. Zhiqiang Song is the corresponding author for this study.

Disclosure

All authors declare that they have no conflicts of interest.

References

1. Steinhoff M, Schauber J, Leyden JJ. New insights into rosacea pathophysiology: a review of recent findings. J Am Acad Dermatol. 2013;69(6 Suppl 1):S15–S26. doi:10.1016/j.jaad.2013.04.045

2. Wilkin J, Dahl M, Detmar M, et al. Standard classification of rosacea: report of the National Rosacea Society Expert Committee on the classification and staging of rosacea. J Am Acad Dermatol. 2002;46(4):584–587. doi:10.1067/mjd.2002.120625

3. Wilkin J, Dahl M, Detmar M, et al. Standard grading system for rosacea: report of the National Rosacea Society Expert Committee on the classification and staging of rosacea. J Am Acad Dermatol. 2004;50(6):907–912. doi:10.1016/j.jaad.2004.01.048

4. Hampton PJ, Berth-Jones J, Duarte Williamson CE, et al. British Association of Dermatologists guidelines for the management of people with rosacea 2021. Br J Dermatol. 2021;185(4):725–735. doi:10.1111/bjd.20485

5. Crawford GH, Pelle MT, James WD. Rosacea: I. Etiology, pathogenesis, and subtype classification. J Am Acad Dermatol. 2004;51(3):327–341. doi:10.1016/j.jaad.2004.03.030

6. Haimovic A, Sanchez M, Judson MA, Prystowsky S. Sarcoidosis: a comprehensive review and update for the dermatologist: part I. Cutaneous disease. J Am Acad Dermatol. 2012;66(5):699–e1. doi:10.1016/j.jaad.2012.02.003

7. Lallas A, Kyrgidis A, Tzellos TG, et al. Accuracy of dermoscopic criteria for the diagnosis of psoriasis, dermatitis, lichen planus and pityriasis rosea. Br J Dermatol. 2012;166(6):1198–1205. doi:10.1111/j.1365-2133.2012.10868.x

8. Zalaudek I, Argenziano G, Di Stefani A, et al. Dermoscopy in general dermatology. Dermatology. 2006;212(1):7–18. doi:10.1159/000089015

9. Lallas A, Apalla Z, Lefaki I, et al. Dermoscopy of early stage mycosis fungoides. J Eur Acad Dermatol Venereol. 2013;27(5):617–621. doi:10.1111/j.1468-3083.2012.04499.x

10. Argenziano G, Soyer HP, Chimenti S, et al. Dermoscopy of pigmented skin lesions: results of a consensus meeting via the Internet. J Am Acad Dermatol. 2003;48(5):679–693. doi:10.1067/mjd.2003.281

11. Aggarwal SLP. Data augmentation in dermatology image recognition using machine learning. Skin Res Technol. 2019;25(6):815–820. doi:10.1111/srt.12726

12. Binol H, Plotner A, Sopkovich J, Kaffenberger B, Niazi MKK, Gurcan MN. Ros-NET: a deep convolutional neural network for automatic identification of rosacea lesions. Skin Res Technol. 2020;26(3):413–421. doi:10.1111/srt.12817

13. Soyturk A, Sen F, Uncu AT, Celik I, Uncu AO. De novo assembly and characterization of the first draft genome of quince (Cydonia oblonga Mill). Sci Rep. 2021;11(1):3818. doi:10.1038/s41598-021-83113-3

14. Gerges F, Shih F, Azar D. Automated diagnosis of acne and rosacea using convolution neural networks. 2021 4th International Conference on Artificial Intelligence and Pattern Recognition; Xiamen, China: Association for Computing Machinery; 2021:607–613.

15. Zhao Z, Wu C-M, Zhang S, et al. A novel convolutional neural network for the diagnosis and classification of rosacea: usability study. JMIR Med Inform. 2021;9(3):e23415. doi:10.2196/23415

16. Abdullah AA, Kanaya S. Machine Learning Using H2O R Package: An Application in Bioinformatics. Singapore: Springer Singapore; 2019.

17. Pepe MS. An interpretation for the ROC curve and inference using GLM procedures. Biometrics. 2000;56(2):352–359. doi:10.1111/j.0006-341X.2000.00352.x

18. Torlay L, Perrone-Bertolotti M, Thomas E, Baciu M. Machine learning-XGBoost analysis of language networks to classify patients with epilepsy. Brain Inform. 2017;4(3):159–169. doi:10.1007/s40708-017-0065-7

19. Lallas A, Argenziano G, Apalla Z, et al. Dermoscopic patterns of common facial inflammatory skin diseases. J Eur Acad Dermatol Venereol. 2014;28(5):609–614. doi:10.1111/jdv.12146

20. Atkinson EJ, Therneau TM, Melton LJ 3rd, et al. Assessing fracture risk using gradient boosting machine (GBM) models. J Bone Miner Res. 2012;27(6):1397–1404. doi:10.1002/jbmr.1577

21. Lu J, Lu D, Zhang X, et al. Estimation of elimination half-lives of organic chemicals in humans using gradient boosting machine. Biochim Biophys Acta. 2016;1860(11 Pt B):2664–2671. doi:10.1016/j.bbagen.2016.05.019

22. Miller PJ, McArtor DB, Lubke GH. A gradient boosting machine for hierarchically clustered data. Multivariate Behav Res. 2017;52(1):117. doi:10.1080/00273171.2016.1265433

23. Tschandl P, Codella N, Akay BN, et al. Comparison of the accuracy of human readers versus machine-learning algorithms for pigmented skin lesion classification: an open, web-based, international, diagnostic study. Lancet Oncol. 2019;20(7):938–947. doi:10.1016/S1470-2045(19)30333-X

Creative Commons License © 2022 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms.php and incorporate the Creative Commons Attribution - Non Commercial (unported, v3.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.