Independent assessment of the WHO Skin Neglected Tropical Diseases application for leprosy detection

Deps et al.

Objectives

To independently evaluate the World Health Organization (WHO) Skin Neglected Tropical Diseases (NTDs) application, focusing on the diagnostic performance of its underlying artificial intelligence model for leprosy detection. The primary objective was to determine the proportion of images in which leprosy appeared among the model’s Top-5 diagnostic predictions. The secondary objective was to qualitatively analyze diagnostic error patterns. 

Methods

A data set of 439 anonymized clinical images from confirmed leprosy cases (1996–2024) was analyzed, spanning the full clinical spectrum (indeterminate, tuberculoid, borderline/dimorphous, and lepromatous/Virchowian forms) and including reactional and atypical presentations. After excluding 16 images due to processing errors, 423 images were retained: 367 classical leprosy lesions and 56 reactional or atypical leprosy-related presentations. All images were evaluated using the desktop version of the WHO visual classifier. Top-5 sensitivity (recall) for leprosy was estimated, alongside a qualitative error analysis focusing on intrapatient inconsistencies and challenging lesion types.
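
As an illustration of the metric, Top-5 sensitivity can be sketched as the fraction of confirmed leprosy images for which "leprosy" appears among the five highest-ranked predictions. The function name, the label strings, and the toy rankings below are hypothetical and are not taken from the application or the study data; this is a minimal sketch under those assumptions, not the authors' analysis code.

```python
from typing import Sequence


def top_k_sensitivity(
    predictions: Sequence[Sequence[str]],
    target: str = "leprosy",
    k: int = 5,
) -> float:
    """Fraction of images whose k highest-ranked labels include the target.

    Each element of `predictions` is one image's labels, ordered from most
    to least probable. Every image is assumed to be a confirmed case of the
    target disease, so this fraction is the sensitivity (recall).
    """
    if not predictions:
        raise ValueError("no predictions supplied")
    hits = sum(1 for ranked in predictions if target in ranked[:k])
    return hits / len(predictions)


# Hypothetical toy rankings (label names are assumptions, not the app's vocabulary).
toy_predictions = [
    ["leprosy", "tinea", "psoriasis", "vitiligo", "eczema"],   # hit at rank 1
    ["psoriasis", "eczema", "leprosy", "tinea", "scabies"],    # hit at rank 3
    ["eczema", "tinea", "psoriasis", "vitiligo", "scabies"],   # miss: absent
    ["tinea", "vitiligo", "eczema", "psoriasis", "scabies", "leprosy"],  # miss: rank 6
]

toy_sensitivity = top_k_sensitivity(toy_predictions)
print(f"Top-5 sensitivity: {toy_sensitivity:.2f}")  # prints "Top-5 sensitivity: 0.50"
```

Because only confirmed leprosy images are evaluated, this quantity measures how often the disease is surfaced among the suggestions; it says nothing about specificity.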

Results

The model achieved an overall Top-5 sensitivity (recall) of 84.9%, with higher sensitivity for classical lesions (87.2%) than for reactional or atypical presentations (69.6%). Qualitative review revealed inconsistent predictions for visually similar lesions from the same patient, and misclassifications concentrated among necrotic, inflammatory, and infiltrative lesions. 
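
The overall figure can be cross-checked against the subgroup figures, since the overall sensitivity should equal the image-weighted average of the two subgroup sensitivities. The short sketch below uses only the subgroup sizes from the Methods and the percentages reported above; the variable names are illustrative.

```python
# Subgroup sizes (Methods) and reported Top-5 sensitivities (Results).
n_classical, n_atypical = 367, 56
sens_classical, sens_atypical = 0.872, 0.696

n_total = n_classical + n_atypical  # 423 retained images

# Image-weighted average of the two subgroup sensitivities.
pooled = (n_classical * sens_classical + n_atypical * sens_atypical) / n_total

print(f"Pooled Top-5 sensitivity: {pooled:.1%}")  # prints "Pooled Top-5 sensitivity: 84.9%"
```

The weighted average reproduces the reported 84.9%, so the three figures are mutually consistent.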

Conclusions

The WHO Skin NTDs application demonstrates substantial promise as a clinical decision-support and educational tool, especially for classical leprosy. Performance gaps for reactional and atypical forms highlight the need for algorithmic refinement. Enhancing data set diversity and integrating patient-level context may improve diagnostic robustness.

Article language: English
Article type: Original research