Skin conditions are among the most common ailments around the world. In the United States, up to 37 percent of clinic visits involve a skin concern of some kind.
However, there is also a global shortage of dermatologists, which forces patients to seek help from general practitioners, whose diagnoses of skin conditions tend to be less accurate. This can lead to suboptimal referrals and delayed or incorrect treatment.
To help with the workload, researchers from Google developed an AI system that can accurately diagnose some of the most common skin conditions seen in primary care. In the paper titled "A Deep Learning System for Differential Diagnosis of Skin Disease," the researchers claimed that the AI system's accuracy across 26 skin conditions is on par with that of U.S. board-certified dermatologists.
Working Like A Dermatologist
When clinicians see a patient, they often do not give just one diagnosis. They generate a differential diagnosis, a list of possible conditions that have to be followed up with laboratory tests, imaging, procedures, and consultations.
In a blog post published on Thursday, Sept. 12, the researchers said that their deep learning system (DLS) does the same. It processes inputs that include clinical images of the skin abnormality and metadata (self-reported components of medical history).
The team trained and evaluated the DLS using 17,777 de-identified cases from a teledermatology practice across two states. Cases from 2010 to 2017 were used to train the AI, while cases from 2017 to 2018 were used for evaluation. During training, the DLS leveraged over 50,000 differential diagnoses from over 40 dermatologists.
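To test the DLS's accuracy, the researchers compiled diagnoses from three U.S. board-certified dermatologists. They reported that the AI's ranked list of skin conditions achieved 71 percent top-1 and 93 percent top-3 accuracy, meaning the correct condition was ranked first in 71 percent of cases and appeared among the first three in 93 percent.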
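Top-1 and top-3 accuracy can be illustrated with a short sketch. This is not the researchers' evaluation code. It assumes each case has a single reference diagnosis (the study's reference was built from a panel of dermatologists), and the function name `top_k_accuracy` and the example cases are hypothetical.

```python
from typing import Sequence


def top_k_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    references: Sequence[str],
    k: int,
) -> float:
    """Fraction of cases whose reference diagnosis is in the first k predictions.

    Assumption: one reference label per case, a simplification for illustration.
    """
    if len(ranked_predictions) != len(references):
        raise ValueError("Each case needs exactly one reference diagnosis.")
    if not references:
        raise ValueError("At least one case is required.")
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, references)
        if truth in ranked[:k]
    )
    return hits / len(references)


# Hypothetical ranked differentials for four cases (most likely condition first).
predictions = [
    ["eczema", "psoriasis", "tinea"],
    ["acne", "rosacea", "folliculitis"],
    ["psoriasis", "eczema", "tinea"],
    ["tinea", "eczema", "psoriasis"],
]
references = ["eczema", "rosacea", "tinea", "acne"]

print(top_k_accuracy(predictions, references, k=1))  # 0.25
print(top_k_accuracy(predictions, references, k=3))  # 0.75
```

In this toy example only the first case is correct at rank one, while three of the four cases contain the reference diagnosis somewhere in the first three predictions, which is why top-3 accuracy is always at least as high as top-1.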
Moreover, when the AI was compared against clinicians on a subset of the validation data set, the DLS achieved a top-3 diagnostic accuracy of 90 percent, comparable to dermatologists and "substantially higher" than primary care physicians and nurse practitioners.
To check for potential biases, they tested the AI across Fitzpatrick skin types. The Fitzpatrick skin type is a scale that ranges from Type I (described as pale, always burns and peels, never tans) to Type VI (characterized by dark brown skin color, never burns, and always tans). They found that the model's performance was consistent across skin types.
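A subgroup check of this kind can be sketched by computing the same accuracy metric separately for each skin type and comparing the results. The sketch below is illustrative only and is not the researchers' method: the function name `top_k_accuracy_by_group`, the tuple layout, and the example cases are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence, Tuple

# Each case: (Fitzpatrick type, ranked predictions, reference diagnosis).
Case = Tuple[str, Sequence[str], str]


def top_k_accuracy_by_group(cases: Iterable[Case], k: int) -> Dict[str, float]:
    """Top-k accuracy computed separately for each skin type present in the data."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for skin_type, ranked, truth in cases:
        totals[skin_type] += 1
        if truth in ranked[:k]:
            hits[skin_type] += 1
    return {skin_type: hits[skin_type] / totals[skin_type] for skin_type in totals}


# Hypothetical cases; real evaluations need far more cases per group.
cases = [
    ("II", ["eczema", "psoriasis", "tinea"], "eczema"),
    ("II", ["acne", "rosacea", "folliculitis"], "rosacea"),
    ("V", ["psoriasis", "eczema", "tinea"], "psoriasis"),
    ("V", ["tinea", "eczema", "psoriasis"], "acne"),
]

print(top_k_accuracy_by_group(cases, k=1))  # {'II': 0.5, 'V': 0.5}
print(top_k_accuracy_by_group(cases, k=3))  # {'II': 1.0, 'V': 0.5}
```

Large gaps between groups would suggest the model performs unevenly. With few cases in a group, as in this toy data, such gaps can also arise by chance, so group sizes matter when interpreting the numbers.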
Despite the promising performance of the AI, the researchers said that it is not ready to diagnose just yet. Because of the limited data set, the AI cannot accurately detect skin cancer.
Helping Clinicians
The system is not meant to replace clinicians but to serve as a tool that aids in diagnosing skin conditions.
"For example, such a DLS could help triage cases to guide prioritization for clinical care or could help non-dermatologists initiate dermatologic care more accurately and potentially improve access," the researchers wrote.