Journal of Pain Research, Volume 19
Development of a High-Sensitivity Screening Tool for Neuropathic Pain Integrating PainDETECT and BS-POP Using Machine Learning
Authors Furuya T, Yoshikai N, Suzuki S, Sawada H, Matsumoto K, Saito S, Ozaki R, Tsujisawa H, Nakanishi K
Received 9 January 2026
Accepted for publication 14 April 2026
Published 16 May 2026 Volume 2026:19 594918
DOI https://doi.org/10.2147/JPR.S594918
Checked for plagiarism Yes
Review by Single anonymous peer review
Peer reviewer comments 2
Editor who approved publication: Professor King Hei Stanley Lam
Tomohiro Furuya,1 Noriaki Yoshikai,2 Satoshi Suzuki,1 Hirokatsu Sawada,1 Koji Matsumoto,1 Sosuke Saito,1 Ryo Ozaki,1 Hirohiko Tsujisawa,1 Kazuyoshi Nakanishi1
1Department of Orthopaedic Surgery, Nihon University Itabashi Hospital, Tokyo, Japan; 2Institute of Science and Engineering, College of Science and Technology, Nihon University, Tokyo, Japan
Correspondence: Kazuyoshi Nakanishi, Department of Orthopaedic Surgery, Nihon University Itabashi Hospital, Tokyo, 173-8610, Japan, Tel +81-3-3972-8111, Email [email protected]
Purpose: Accurate identification of neuropathic pain (NeP) remains challenging in routine clinical practice. While PainDETECT is widely used, its sensitivity is limited by its focus on somatic symptoms. This study aimed to develop a high-sensitivity screening tool by integrating PainDETECT with the Brief Scale for Psychiatric Problems in Orthopaedic Patients (BS-POP) using machine learning.
Patients and Methods: Neuropathic pain was diagnosed based on comprehensive clinical evaluation including medical history, neurological examination, and imaging findings when available. We analyzed clinical data from 1083 consecutive patients with pain. The study involved two phases: evaluation of conventional tools via statistical modeling and construction of a random forest–based classification model.
Results: The proposed system achieved an overall accuracy of 75.6%. For NeP, sensitivity was 70.3% and specificity was 86.0%; this sensitivity was considerably higher than that of the conventional PainDETECT cutoff method (17.6%).
Conclusion: Integrating psychosocial factors via BS-POP and utilizing machine learning significantly enhances NeP screening performance. This system may support earlier and more appropriate pain management in clinical practice.
Keywords: neuropathic pain, painDETECT, BS-POP, machine learning, psychosocial factors, orthopedic surgery
Introduction
Neuropathic pain, defined by the International Association for the Study of Pain (IASP) as “pain caused by a lesion or disease of the somatosensory nervous system”,1 presents with a highly heterogeneous clinical profile. Patients often experience abnormal sensations such as numbness, burning, or electric shock-like pain, in addition to psychosocial symptoms including insomnia, depression, anxiety, and loss of vitality. These psychological factors frequently contribute to the persistence and exacerbation of pain, posing a major challenge in clinical management. Globally, approximately 20% of the population suffers from chronic pain2 and a Japanese survey reported that 53% of patients with chronic spinal disorders have neuropathic pain.3
Because neuropathic pain arises from a wide range of etiologies, diagnosis is often difficult. Distinguishing it from nociceptive and psychogenic pain is essential for accurate diagnosis, and a variety of screening tools have been developed for this purpose. Representative instruments include PainDETECT, the Leeds Assessment of Neuropathic Symptoms and Signs (LANSS), and the Douleur Neuropathique 4 (DN4) questionnaire. However, these tools have limited sensitivity and specificity, particularly in early or mild cases.4 Moreover, most existing questionnaires focus on somatic symptoms and do not sufficiently capture psychosocial factors. Therefore, there is a growing need for a novel tool that can evaluate neuropathic pain in a more sensitive and multidimensional manner in clinical practice.
PainDETECT5 was designed to identify characteristic symptoms of neuropathic pain and diagnose patients based on a scoring system, but its sensitivity and specificity remain limited.6 Clinical guidelines, such as those issued by the Neuropathic Pain Special Interest Group (NeuPSIG) and the European Federation of Neurological Societies (EFNS), caution against relying solely on screening questionnaires for diagnosis.7 A key limitation of conventional tools is that they insufficiently account for psychological aspects of pain, despite the fact that neuropathic pain is strongly influenced by insomnia, depression, anxiety, reduced vitality, and related psychosocial factors.
To address this limitation, the Brief Scale for Psychiatric Problems in Orthopaedic Patients (BS-POP) was developed in Japan.8 BS-POP consists of both patient- and physician-administered questionnaires, allowing simultaneous collection of subjective and objective assessments of psychological conditions. This dual evaluation enables quantitative assessment of depression, anxiety, and maladaptive behaviors related to pain. Since its development in the early 2000s by a research group affiliated with the Japanese Orthopaedic Association, BS-POP has been used in orthopedic practice to identify psychosocial barriers to recovery in patients with chronic pain and musculoskeletal disorders.9,10 Although widely recognized in Japan as a screening tool for psychosocial conditions, its combined use with PainDETECT for neuropathic pain screening has not been evaluated.
In this study, we aimed to develop a new high-sensitivity screening tool for neuropathic pain by integrating PainDETECT with BS-POP and applying machine learning algorithms. By incorporating psychosocial factors into the screening process, we sought to overcome the limitations of existing tools and provide more accurate and earlier diagnosis of neuropathic pain. The BS-POP questionnaires for patients and physicians are shown in Table 1 and Table 2, respectively.
Table 1 The BS-POP Questionnaire for Patients
Table 2 The BS-POP Questionnaire for Physicians
Materials and Methods
Diagnostic Reference Standard
Neuropathic pain in this study was diagnosed based on comprehensive clinical judgment during routine medical practice at our institution. The diagnosis was determined through integration of the patient’s medical history, pain characteristics, neuroanatomical plausibility, neurological examination findings (eg, hypoesthesia or allodynia), and imaging findings when available.
Diagnoses were made by orthopaedic surgeons involved in routine clinical care, including general orthopaedic physicians. Although formal application of the Neuropathic Pain Special Interest Group (NeuPSIG) grading system (possible, probable, or definite neuropathic pain) was not systematically performed, the diagnostic process was guided by internationally recognized diagnostic concepts and national clinical guidelines, with consideration of somatosensory system involvement.
Patients who were clinically judged to have a predominant neuropathic pain mechanism were classified into the neuropathic pain group. Patients whose symptoms were considered consistent with nociceptive mechanisms related to tissue injury were classified as having nociceptive pain. Cases in which the underlying pain mechanism could not be clearly determined were categorized as psychogenic or uncertain pain.
Screening questionnaire scores (PainDETECT and BS-POP) were not used as the sole basis for the clinical diagnosis. Instead, diagnoses were determined through comprehensive clinical assessment as described above. When diagnostic changes occurred during follow-up, the final confirmed clinical diagnosis rather than the initial diagnosis was used as the reference standard for analysis.
Study Design
This study consisted of two phases. In Phase 1, we evaluated the diagnostic performance of conventional screening approaches using statistical modeling. In Phase 2, we constructed and validated a machine learning–based classification model to improve screening sensitivity for neuropathic pain. The Phase 1 dataset was included within the larger Phase 2 dataset.
Phase 1: Evaluation of Conventional Tools via Statistical Modeling
We retrospectively analyzed 250 patients who visited the Department of Orthopaedic Surgery at Nihon University Itabashi Hospital between October 2020 and June 2021. All participants completed the PainDETECT and BS-POP questionnaires.
Each case was classified into neuropathic pain, nociceptive pain, or psychogenic/uncertain pain based on clinical diagnosis and national neuropathic pain guidelines. Classification was performed primarily by the attending physician, and cases were reclassified when diagnostic changes occurred during the clinical course.
Diagnostic performance was evaluated using three analytical models.
Model 1: Neuropathic pain was determined based on the established PainDETECT cutoff score of 19 points.
Model 2: A multiple regression model was constructed using PainDETECT questionnaire responses as explanatory variables, and pain type was classified according to the derived regression equation.
Model 3: Covariance structure analysis integrating PainDETECT and BS-POP variables was performed to evaluate latent relationships between neuropathic pain symptoms and psychosocial factors. Pain classification was determined based on the resulting structural equation.
Model fit for the covariance structure analysis was evaluated using goodness-of-fit indices including the Goodness-of-Fit Index (GFI), Comparative Fit Index (CFI), and Root Mean Square Error of Approximation (RMSEA). A model was considered to demonstrate acceptable fit when GFI and CFI were greater than 0.90 and RMSEA was less than 0.05.
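The acceptance rule for model fit can be written as a one-line check. The sketch below only encodes the thresholds stated above; the function name and the index values passed to it are hypothetical and are not results from this study.

```python
def has_acceptable_fit(gfi: float, cfi: float, rmsea: float) -> bool:
    """Return True when GFI and CFI exceed 0.90 and RMSEA is below 0.05."""
    return gfi > 0.90 and cfi > 0.90 and rmsea < 0.05


# Hypothetical index values, for illustration only.
print(has_acceptable_fit(gfi=0.93, cfi=0.95, rmsea=0.04))
print(has_acceptable_fit(gfi=0.93, cfi=0.95, rmsea=0.08))
```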
The following cases were excluded from the analysis: (1) patients without pain complaints, (2) patients with uncertain diagnoses, (3) patients with pain potentially attributable to tumors, and (4) cases with incomplete questionnaire data.
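As an illustration of Model 1, the cutoff rule and the resulting sensitivity and specificity can be sketched as follows. The scores and reference diagnoses are invented for the example, and treating a score equal to the cutoff as positive (`>=`) is an assumption, since the text states only the cutoff value of 19.

```python
PAINDETECT_CUTOFF = 19  # cutoff from the text; using ">=" is an assumption


def classify_by_cutoff(total_scores, cutoff=PAINDETECT_CUTOFF):
    """Flag each patient as screen-positive for neuropathic pain (True/False)."""
    return [score >= cutoff for score in total_scores]


def sensitivity_specificity(predicted, actual):
    """Compute sensitivity and specificity from paired boolean lists."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical PainDETECT totals and reference diagnoses (True = neuropathic pain).
scores = [23, 12, 8, 19, 15, 5, 17, 30]
is_nep = [True, True, False, True, True, False, False, True]

predicted = classify_by_cutoff(scores)
sens, spec = sensitivity_specificity(predicted, is_nep)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy data the two neuropathic cases scoring below 19 are missed, which is the pattern of low sensitivity with high specificity that a strict cutoff tends to produce.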
Phase 2: Machine Learning–Based Classification Model
To further improve diagnostic accuracy, a machine learning–based classification model was developed using data from patients who visited our department between October 2020 and November 2024. A three-class classification framework was adopted consisting of neuropathic pain, nociceptive pain, and psychogenic/uncertain pain.
Data Preprocessing
Responses from PainDETECT and BS-POP questionnaires were converted into numerical variables and used as explanatory features. Demographic variables including age and sex were also incorporated as additional explanatory variables.
The dataset was randomly divided into five subsets. One subset (20% of the data) was reserved as an independent test dataset, while the remaining four subsets (80% of the data) were used for model development.
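A minimal sketch of such a hold-out split with scikit-learn is shown below. The feature matrix and labels are random placeholders, the class proportions are only roughly modeled on the cohort, and both the stratification and the fixed random seed are assumptions; the text does not state whether the split was stratified.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_patients = 1083

# Placeholder data: 20 numeric questionnaire/demographic features per patient.
X = rng.integers(0, 6, size=(n_patients, 20))
y = rng.choice(
    ["neuropathic", "nociceptive", "psychogenic_uncertain"],
    size=n_patients,
    p=[0.34, 0.59, 0.07],
)

# Reserve 20% as an independent test set (stratification is an assumption).
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0
)
print(X_dev.shape[0], X_test.shape[0])
```

With 1083 patients, a 20% hold-out yields 217 test cases and 866 development cases.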
Model Development
Within the training dataset, four-fold cross-validation11 was performed. In each iteration, three subsets were used for model training and one subset was used as a validation dataset. This validation subset was rotated across the four iterations so that each subset served once as validation data. Performance metrics were averaged across the cross-validation folds.
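The rotation described above can be sketched with scikit-learn's cross-validation utilities. The data are synthetic, label 0 is taken to stand for neuropathic pain, and the use of stratified folds, the seed, and the forest size are illustrative assumptions rather than the settings used in this study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import StratifiedKFold, cross_validate

NEP = 0  # assumed label for neuropathic pain in this synthetic example

# Synthetic three-class data standing in for the development set.
X_dev, y_dev = make_classification(
    n_samples=400, n_features=12, n_informative=6, n_classes=3, random_state=0
)

# Sensitivity for neuropathic pain = recall restricted to the NeP label.
nep_sensitivity = make_scorer(recall_score, labels=[NEP], average="macro")

cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)
scores = cross_validate(
    RandomForestClassifier(n_estimators=50, random_state=0),
    X_dev,
    y_dev,
    cv=cv,
    scoring={"accuracy": "accuracy", "nep_sensitivity": nep_sensitivity},
)

# Average each metric over the four validation folds.
mean_accuracy = scores["test_accuracy"].mean()
mean_nep_sensitivity = scores["test_nep_sensitivity"].mean()
print(round(mean_accuracy, 3), round(mean_nep_sensitivity, 3))
```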
A random forest algorithm was used to construct the classification model.12
Hyperparameters of the random forest model were optimized during cross-validation. Parameters considered during tuning included the number of trees, maximum tree depth, and minimum number of samples required for node splitting. A grid-search strategy was used to identify parameter combinations that maximized sensitivity for neuropathic pain detection while maintaining acceptable specificity.
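A grid search of this kind might look like the sketch below. The grid values, synthetic data, and seed are hypothetical, and the selection criterion is simplified: this example ranks candidates by neuropathic pain sensitivity alone, whereas the procedure described above also required acceptable specificity, a constraint whose exact form is not given in the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import make_scorer, recall_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold

NEP = 0  # assumed label for neuropathic pain in this synthetic example

X_dev, y_dev = make_classification(
    n_samples=300, n_features=12, n_informative=6, n_classes=3, random_state=0
)

# Recall restricted to the NeP label, i.e. sensitivity for neuropathic pain.
nep_sensitivity = make_scorer(recall_score, labels=[NEP], average="macro")

# Hypothetical grid over the three hyperparameters named in the text.
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [None, 5],
    "min_samples_split": [2, 10],
}

search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid=param_grid,
    scoring=nep_sensitivity,
    cv=StratifiedKFold(n_splits=4, shuffle=True, random_state=0),
)
search.fit(X_dev, y_dev)
print(search.best_params_, round(search.best_score_, 3))
```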
Model Evaluation
Model performance was evaluated using classification metrics derived from the confusion matrix. The primary evaluation metrics were sensitivity and specificity because the main objective of the screening system was to maximize detection of neuropathic pain. Additional evaluation metrics included overall accuracy and F1 score.
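For a three-class problem, these metrics are obtained per class by treating that class as positive and the other two as negative. The sketch below shows the calculation on a made-up confusion matrix (rows are reference diagnoses, columns are predictions); the counts are not those of Table 4.

```python
def per_class_metrics(cm):
    """Return sensitivity, specificity, and F1 for each class of a square confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # class k cases predicted as another class
        fp = sum(cm[r][k] for r in range(n)) - tp   # other classes predicted as class k
        tn = total - tp - fn - fp
        results.append(
            {
                "sensitivity": tp / (tp + fn),
                "specificity": tn / (tn + fp),
                "f1": 2 * tp / (2 * tp + fp + fn),
            }
        )
    return results


def overall_accuracy(cm):
    """Fraction of all cases on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total


# Hypothetical counts; order: neuropathic, nociceptive, psychogenic/uncertain.
cm = [
    [40, 8, 2],
    [10, 70, 5],
    [3, 4, 8],
]
metrics = per_class_metrics(cm)
accuracy = overall_accuracy(cm)
print(metrics[0], round(accuracy, 3))
```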
Receiver operating characteristic (ROC) curves and area under the curve (AUC) were not evaluated because the present model was designed primarily as a screening tool emphasizing sensitivity and specificity.
Feature Importance Analysis and Software
Feature importance was calculated using the mean decrease in Gini impurity13 provided by the random forest algorithm.
This analysis enabled identification of questionnaire items that most strongly influenced the classification of pain types.
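In scikit-learn, the impurity-based importance of each feature is exposed through the fitted forest's `feature_importances_` attribute. The following sketch ranks features on synthetic data; the feature names (PD1–PD7, BS1–BS8, age, sex) are placeholders chosen for illustration and do not reproduce the actual item set or coding used in this study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Placeholder feature names (hypothetical item labels plus demographics).
feature_names = (
    [f"PD{i}" for i in range(1, 8)]
    + [f"BS{i}" for i in range(1, 9)]
    + ["age", "sex"]
)

X, y = make_classification(
    n_samples=300,
    n_features=len(feature_names),
    n_informative=6,
    n_classes=3,
    random_state=0,
)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Mean decrease in Gini impurity, normalized by scikit-learn to sum to 1.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked[:5]:
    print(f"{name}: {importance:.3f}")
```

Impurity-based importance is computed on the training data and can favor features with many distinct values, so rankings of this kind are best read as exploratory.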
All machine learning analyses were conducted using Python with the scikit-learn machine learning library.
Results
Phase 1: Evaluation of Conventional Tools via Statistical Modeling
A total of 250 cases were analyzed using three models. There were 129 males and 121 females, with a mean age of 57.8 years (range: 2–93 years). Neuropathic pain was observed in 85 cases, nociceptive pain in 121 cases, and psychogenic/uncertain pain in 44 cases.
The diagnostic performance of the three models is summarized (Table 3).
Table 3 Diagnostic Performance of Each Method for Neuropathic Pain
Model 1 (PainDETECT cutoff score): Sensitivity was 17.6% and specificity was 97.6%, demonstrating extremely low sensitivity and limited clinical utility.
Model 2 (multiple regression analysis): The regression equation F1(X), derived from PainDETECT responses (Xi, i = 1–7), achieved a sensitivity of 51.8% and specificity of 91.5%, showing substantial improvement compared with Model 1.
Model 3 (covariance structure analysis): A path diagram integrating PainDETECT and BS-POP is shown in Figure 1. The derived equation F2(X) yielded a sensitivity of 61.2% and specificity of 89.1%. These results indicate that combining PainDETECT with BS-POP improved screening performance compared with PainDETECT alone.
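The reported percentages can be checked against the Phase 1 case counts with simple arithmetic. The sketch below assumes that specificity was computed against all non-neuropathic cases (121 nociceptive plus 44 psychogenic/uncertain, 165 in total); the patient counts it derives are back-calculated from the percentages and are not stated in the source.

```python
N_NEP = 85            # neuropathic pain cases in Phase 1
N_NON_NEP = 121 + 44  # nociceptive + psychogenic/uncertain (assumed negative group)

# Reported (sensitivity %, specificity %) for each model.
reported = {
    "Model 1": (17.6, 97.6),
    "Model 2": (51.8, 91.5),
    "Model 3": (61.2, 89.1),
}


def implied_count(percent, group_size):
    """Nearest whole number of patients corresponding to a reported percentage."""
    return round(percent / 100 * group_size)


implied = {}
for name, (sens, spec) in reported.items():
    tp = implied_count(sens, N_NEP)
    tn = implied_count(spec, N_NON_NEP)
    implied[name] = (tp, tn)
    # Recompute the percentages from the implied counts as a consistency check.
    print(name, tp, tn, round(100 * tp / N_NEP, 1), round(100 * tn / N_NON_NEP, 1))
```

Under this assumption, each reported percentage is reproduced exactly by a whole number of patients, for example 15 of 85 neuropathic cases detected by the cutoff method versus 52 of 85 by Model 3.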
Phase 2: Machine Learning–Based Classification Model
Among the 1083 patients included in Phase 2, 537 were male and 546 were female, with a mean age of 58.0 years (range, 7–95 years). Based on clinical diagnosis, 369 patients had neuropathic pain, 640 had nociceptive pain, and 74 had psychogenic or uncertain pain.
The classification performance was evaluated using the independent test set (n = 217). The confusion matrix and performance metrics are shown in Table 4 and Table 5. The overall accuracy of the system was 75.6%. For neuropathic pain, the sensitivity and specificity were 70.3% and 86.0%, respectively.
Table 4 Confusion Matrix of the Machine Learning Classification Model
Table 5 Performance Metrics Derived from the Confusion Matrix
These results suggest that the machine learning model integrating PainDETECT and BS-POP improved diagnostic accuracy.
Table 4 shows the confusion matrix of the classification model; values represent the number of patients classified into each category by the machine learning model. Table 5 summarizes the performance metrics derived from the confusion matrix, with sensitivity and specificity shown for each pain type.
An example of evaluating the feature importance of explanatory variables is shown (Figure 2). The overall configuration of the diagnostic classification system is shown (Figure 3).
The requirements for this screening system and the implementation plan are shown (Table 6).
Table 6 Requirements for the Screening System and Implementation Plan
Discussion
In this study, we developed a novel screening system for neuropathic pain that demonstrated higher sensitivity than previously reported tools. The primary novelty of this study lies in the integration of somatic symptom assessment (PainDETECT) and psychosocial evaluation (BS-POP) within a machine learning–based diagnostic framework for neuropathic pain screening.
The system achieved an overall accuracy of 75.6%, with sensitivity and specificity for neuropathic pain of 70.3% and 86.0%, respectively. These values represent a substantial improvement in sensitivity over the conventional PainDETECT cutoff method, which detected only 17.6% of neuropathic pain cases in our Phase 1 cohort.
The sensitivity of the conventional PainDETECT cutoff observed in the present study (17.6%) was lower than that reported in several previous validation studies. One possible explanation is the clinical characteristics of the study population. Our cohort consisted primarily of patients presenting to an orthopedic department with heterogeneous pain conditions, including early-stage or mixed pain mechanisms. In such populations, neuropathic features may be less pronounced, which can reduce the sensitivity of symptom-based questionnaires such as PainDETECT. In addition, strict application of the conventional cutoff score may further decrease sensitivity in clinical screening settings.
One likely explanation for the improved sensitivity of the proposed system is the incorporation of psychosocial factors into the screening process. In neuropathic pain, heightened fear and anxiety regarding pain often trigger hypervigilance and avoidance behaviors, leading to reduced activity, disuse syndrome, functional decline, and depression. This vicious cycle, known as the fear-avoidance model, highlights the impact of catastrophic thinking and negative emotions on patient quality of life. By integrating PainDETECT, which qualitatively evaluates pain, with BS-POP, which quantifies psychosocial conditions, we created a diagnostic model that more comprehensively reflects patient status. BS-POP provides both patient- and physician-based assessments, enabling a multidimensional evaluation of depression, anxiety, and stress, which are often overlooked by PainDETECT alone.
Another advantage of the proposed system is its simplicity and feasibility in clinical settings. Patients can complete the questionnaires while waiting for consultation, allowing rapid screening for neuropathic pain.
In addition, future digital implementation may allow patient-reported monitoring of pain states and treatment responses, although the clinical impact of such applications requires further investigation.
The use of random forest models also provided an additional benefit: quantification of variable importance. As illustrated in Figure 2, the algorithm allows clinicians to visualize which questionnaire items most strongly influence diagnostic outcomes. For instance, if “difficulty at work (BS8)” shows high importance, targeted interventions such as workplace adjustments or stress management may be prioritized. In this way, psychosocial feedback can be integrated into treatment planning, potentially improving outcomes and patient satisfaction while raising the quality of pain management.
More broadly, this data-driven ranking of questionnaire items may provide clinicians with additional insights into the psychosocial and functional factors associated with different pain mechanisms.
In recent years, machine learning approaches have increasingly been applied in medical research for diagnostic support and risk prediction. In the present study, we adopted a random forest–based ensemble learning approach because the dataset consisted mainly of questionnaire-based variables and the overall sample size was relatively limited. Ensemble learning methods are known to perform well with structured datasets and moderate sample sizes, making them suitable for the present study design.
Despite the limited dataset, the model achieved relatively high sensitivity for neuropathic pain detection. This finding suggests that integrating psychosocial information with conventional pain questionnaires may enhance the ability to identify neuropathic pain mechanisms in clinical practice.
Limitations
This study has several limitations. Neuropathic pain was analyzed as a single category encompassing conditions such as myelopathy, cauda equina syndrome, and radiculopathy. Disease-specific characteristics were not separately considered, which may have influenced diagnostic accuracy. Further analyses stratified by disease type will be necessary to confirm clinical utility and refine the model.
In addition, the diagnosis of neuropathic pain was based on comprehensive clinical evaluation rather than a standardized grading system such as NeuPSIG, which may have introduced some degree of diagnostic variability.
This was also a retrospective study conducted at a single institution, which may limit the generalizability of the findings.
Future studies using larger multicenter datasets and external validation will be necessary to further confirm the robustness and clinical applicability of the proposed screening system.
Conclusion
By integrating PainDETECT with the Brief Scale for Psychiatric Problems in Orthopaedic Patients (BS-POP) and applying machine learning techniques, we developed a novel screening system that classifies pain into three types (neuropathic, nociceptive, and psychogenic/uncertain) with an overall accuracy of 75.6%.
The integration of psychosocial factors addressed important limitations of conventional screening tools and enabled a more comprehensive assessment of pain mechanisms.
This approach may facilitate earlier identification of neuropathic pain in clinical settings. However, further validation in independent populations is required to establish the generalizability and clinical impact of this screening system before broader clinical application can be considered.
Ethical Approval
This study was approved by the Institutional Review Board of Nihon University Itabashi Hospital (Approval No. RK-250513-2, approved on May 27, 2025). Written informed consent was obtained from participants when required, and an opt-out method was applied for cases where direct consent could not be obtained, in accordance with the institutional ethics guidelines.
Acknowledgments
The authors thank all patients and clinical staff of the Department of Orthopaedic Surgery at Nihon University Itabashi Hospital for their cooperation.
Author Contributions
All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.
Funding
This research received no external funding.
Disclosure
The authors report no conflicts of interest in this work.
References
1. Jensen TS, Baron R, Haanpää M. A new definition of neuropathic pain. Pain. 2011;152(10):2204–10. doi:10.1016/j.pain.2011.06.017
2. Goldberg DS, McGee SJ. Pain as a public health priority. BMC Public Health. 2011;11:770. doi:10.1186/1471-2458-11-770
3. Ogawa S. Development of a screening questionnaire for neuropathic pain in Japanese chronic pain patients. Pain Clinic. 2010;31:1187–1194. Japanese.
4. Mathieson S, Maher CG, Terwee CB, de Folly-Campos T, Lin CW. Neuropathic pain screening questionnaires have limited measurement properties: a systematic review. J Clin Epidemiol. 2015;68(8):957–966. doi:10.1016/j.jclinepi.2015.03.006
5. Freynhagen R, Baron R, Gockel U, Tölle TR. painDETECT: a new screening questionnaire to identify neuropathic components in patients with back pain. Curr Med Res Opin. 2006;22(10):1911–1920. doi:10.1185/030079906X132488
6. Timmerman H, Steegers MAH, Huygen FJPM, et al. Investigating the validity of the DN4 in a consecutive population of patients with chronic pain. PLoS One. 2017;12(11):e0187961. doi:10.1371/journal.pone.0187961
7. Japan Society of Pain Clinicians. Clinical practice guideline for neuropathic pain [Internet]. Available from: https://www.jspc.gr.jp/Contents/public/pdf/shi-guide08_10.pdf.
8. Tominaga R, Makino N, Kato H, et al. Development of the brief scale for psychiatric problems in orthopaedic patients (BS-POP). J Orthop Sci. 2013;18(5):804–810. doi:10.1007/s00776-013-0435-2
9. Higuchi D, Yamada K, Hashimoto H, et al. Validation of the BS-POP for screening psychosocial factors in patients with musculoskeletal pain. J Orthop Sci. 2016;21(5):699–705. doi:10.1016/j.jos.2016.06.012
10. Matsudaira K, Konishi H, Fukushima Y, et al. Psychosocial factors in chronic low back pain: use of BS-POP in Japanese orthopedic settings. Spine Surg Relat Res. 2017;1(3):130–137. doi:10.22603/ssrr.1.2016-0008
11. Scikit-learn developers. Cross-validation: evaluating estimator performance [Internet]. Available from: https://scikit-learn.org/stable/modules/cross_validation.html.
12. Géron A. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow.
13. Strobl C, Boulesteix AL, Kneib T, Augustin T, Zeileis A. Conditional variable importance for random forests. BMC Bioinform. 2008;9:307. doi:10.1186/1471-2105-9-307
© 2026 The Author(s). This work is published and licensed by Dove Medical Press Limited. The full terms of this license are available at https://www.dovepress.com/terms and incorporate the Creative Commons Attribution - Non Commercial (unported, 4.0) License. By accessing the work you hereby accept the Terms. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. For permission for commercial use of this work, please see paragraphs 4.2 and 5 of our Terms.
