In a recent review published in the journal Life, a group of authors examined machine learning (ML) image-processing techniques for skin cancer detection using clinical images, evaluating their effectiveness, available datasets, and challenges.
Study: Automated skin cancer detection using clinical images: a comprehensive review.
Background
In recent decades, the incidence of skin cancer has increased, with melanoma diagnoses rising by 31% between 2012 and 2022 and melanoma accounting for 80% of skin cancer deaths. However, early detection can lead to a survival rate of up to 99%, which drops to 30% if the disease metastasizes.
Pigmented skin lesions (PSLs), ranging from benign moles to malignant melanomas, are central to this concern. Dermoscopy, which uses a magnifying lens, helps diagnose PSLs. Still, non-specialists often rely on standard cameras and send the images to specialists for diagnosis, a method almost as effective as an in-person consultation.
Notably, in one study dermatologists performed better with macroscopic than with dermoscopic images. Recently, computational methods, particularly ML, have shown promise in aiding early-stage skin cancer diagnosis.
Early detection of melanoma is crucial for effective treatment, yet current diagnostic methods may not be consistently accessible or reliable. More research is therefore needed on computer-assisted tools that could bridge this detection gap.
The need for early melanoma detection
Melanoma, a highly lethal form of skin cancer, is often detected at an advanced stage. However, research over the past decade has indicated that early diagnosis can dramatically reduce mortality.
To this end, many researchers have used imaging methods and artificial intelligence to diagnose these malignancies. Both novel and conventional approaches have been proposed to address this diagnostic challenge.
Existing literature and gaps
Although there are many reviews of skin cancer detection through artificial intelligence, a gap exists in the comprehensive analysis of skin cancer diagnosis using clinical images and machine learning.
Few reviews have provided a complete overview of all available clinical datasets. For clarity, the authors compared their review with other recent reviews in terms of the years covered, the type of imaging modality, and the major tasks in the automated skin cancer detection pipeline.
Public datasets of clinical skin images
Numerous public datasets of clinical skin images have been used by various groups over the past decade. Prominent among them are DermQuest and Med-Node. However, because the reviewed papers used different datasets, their reported performance metrics are not directly comparable.
Image preprocessing techniques
Captured images often contain artifacts, which makes segmentation a challenge. Preprocessing corrects irregularities such as lighting inconsistencies or the presence of hair so that later algorithms work correctly. Most of the literature reviewed uses four main types of preprocessing methods.
Lighting correction
Images may contain lighting artifacts. To keep shading from being confused with lesion borders, shading is minimized before segmentation.
Various techniques have been used to remove such irregularities, including data-driven approaches in the “Hue, Saturation, Value” (HSV) color space and thresholding algorithms.
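As a rough illustration of shading attenuation in HSV space, the sketch below fits a quadratic surface to the V (value) channel and divides it out. The function name `attenuate_shading`, the choice to fit the surface to every pixel, and the use of NumPy are assumptions made for this example; it is not the method of any specific reviewed paper.

```python
import numpy as np

def attenuate_shading(v):
    """Flatten slow illumination changes in an HSV value channel.

    v: 2-D float array in [0, 1] (the V channel of an HSV image).
    Assumption: shading is modeled as a quadratic surface fitted
    by least squares to all pixels.
    """
    h, w = v.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x = xs.ravel() / max(w - 1, 1)   # normalized column coordinate
    y = ys.ravel() / max(h - 1, 1)   # normalized row coordinate
    # Design matrix for z = a*x^2 + b*y^2 + c*x*y + d*x + e*y + f
    A = np.column_stack([x**2, y**2, x * y, x, y, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(A, v.ravel(), rcond=None)
    shading = np.clip((A @ coeffs).reshape(h, w), 1e-6, None)
    # Divide out the fitted shading, then restore the average brightness
    corrected = v / shading * shading.mean()
    return np.clip(corrected, 0.0, 1.0)
```

In practice the surface would more plausibly be fitted only to pixels assumed to be healthy skin, so that the lesion itself does not bias the illumination estimate.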
Artifact removal
Artifacts such as noise, skin lines, or hair can affect image quality. Various tools and methods have been developed to mitigate these effects, including DullRazor for hair removal and the Gaussian filter for noise reduction. However, the effectiveness of these methods varies, and they must be used judiciously to avoid undermining the training of machine learning models.
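For the noise-reduction part, a Gaussian filter can be sketched in a few lines. This is a minimal NumPy version under assumed defaults (a kernel radius of about three standard deviations, reflect padding); the function names are hypothetical, and hair removal as done by DullRazor is not covered here.

```python
import numpy as np

def gaussian_kernel_1d(sigma, radius=None):
    """Normalized 1-D Gaussian kernel; radius defaults to about 3*sigma."""
    if radius is None:
        radius = int(3 * sigma + 0.5)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()

def gaussian_smooth(gray, sigma=1.0):
    """Separable Gaussian blur of a 2-D grayscale array.

    Edges are reflect-padded, so the image must be larger than the
    kernel radius in both dimensions.
    """
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    padded = np.pad(gray.astype(float), r, mode="reflect")
    # Filter along rows first, then along columns
    rows = np.apply_along_axis(lambda m: np.convolve(m, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="valid"), 0, rows)
```

A larger `sigma` removes more noise but also blurs lesion borders, which is one reason such filters need to be applied with care.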
Image resizing and cropping
Ensuring uniform image dimensions is important for training convolutional neural network (CNN) models. This is achieved through image cropping, resizing, and rescaling.
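The three steps can be sketched as follows. The target size of 224 pixels, nearest-neighbor resizing, and the function names are assumptions chosen for illustration; the reviewed studies may use other sizes and interpolation schemes.

```python
import numpy as np

def center_crop_square(img):
    """Crop the largest centered square from an H x W (x C) image."""
    h, w = img.shape[:2]
    side = min(h, w)
    top = (h - side) // 2
    left = (w - side) // 2
    return img[top:top + side, left:left + side]

def resize_nearest(img, size):
    """Resize to size x size by nearest-neighbor sampling."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size   # source row for each output row
    cols = np.arange(size) * w // size   # source column for each output column
    return img[rows[:, None], cols]

def prepare_for_cnn(img, size=224):
    """Crop, resize, and rescale a uint8 image to float32 values in [0, 1]."""
    square = center_crop_square(img)
    resized = resize_nearest(square, size)
    return resized.astype(np.float32) / 255.0
```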
Data augmentation
Improving machine learning performance involves generating a variety of training examples. This is done through data augmentation, which is particularly useful for unbalanced datasets. Techniques include cropping, rotation, and noise addition, among others.
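A minimal sketch of augmentation by flipping, rotation, and noise addition is shown below. The function name `augment`, the restriction to 90-degree rotations, and the noise level of 0.02 are assumptions for this example only.

```python
import numpy as np

def augment(img, rng):
    """Return simple variants of one image.

    img: float array in [0, 1], shape H x W or H x W x C.
    rng: a numpy.random.Generator used for the noise.
    """
    variants = [img, np.fliplr(img), np.flipud(img)]
    # Rotations by 90, 180, and 270 degrees
    variants += [np.rot90(img, k) for k in (1, 2, 3)]
    # Additive Gaussian noise, clipped back to the valid range
    noisy = img + rng.normal(0.0, 0.02, size=img.shape)
    variants.append(np.clip(noisy, 0.0, 1.0))
    return variants
```

For an unbalanced dataset, such variants would typically be generated mainly for the under-represented class so that the classes are closer in size during training.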
Other preprocessing methods
Various researchers have explored different approaches, such as contrast enhancement, histogram equalization, and the use of algorithms such as FastCut. These methods aim to improve the overall quality and reliability of the dataset.
Image segmentation in dermatology
Image segmentation divides an image into segments or clusters of pixels, called image objects. This simplifies image analysis and makes it easier to extract lesions. However, skin image segmentation remains challenging, often requiring pre- and post-processing.
Major segmentation methods
Numerous techniques have been explored for image segmentation. Two commonly used methods are Otsu’s method and K-means clustering. Some researchers used the standard Otsu method, while others combined it with other techniques and reported 100% accuracy in determining lesion size.
However, the reliability of such claims remains open to question. Numerous other researchers have used K-means clustering algorithms, with differences in accuracy attributable to the different datasets used.
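Both methods are compact enough to sketch directly. The code below is an illustrative NumPy version of standard Otsu thresholding on an 8-bit grayscale image and of plain K-means on pixel colors; the function names, the iteration count, and the random initialization are assumptions, and none of the combined variants from the reviewed papers is reproduced.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t that maximizes between-class variance.

    gray: uint8 array. Pixels with value <= t form one class.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                  # weight of the lower class
    mu = np.cumsum(prob * np.arange(256))    # cumulative mean
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0                # thresholds with an empty class
    return int(np.argmax(sigma_b))

def kmeans(points, k=2, iters=20, seed=0):
    """Plain K-means on an N x D array; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Distance from every point to every center
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels, centers
```

Assuming the lesion is darker than the surrounding skin, a lesion mask could be taken as `gray <= otsu_threshold(gray)`. For K-means, an RGB image would first be flattened with `img.reshape(-1, 3)`.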
Alternative segmentation methods
Other segmentation techniques explored include the Chan-Vese active contour method, fast independent component analysis, and synthesis and convergence of intermediate decaying omnigradients (SCIDOG).
These methods reported different performance metrics, but the absence of consistent assessment across methods limited performance comparisons.
Post-segmentation processing
After segmentation, post-processing techniques enhance the segmented image. Widely used methods include morphological operations and Gaussian filtering. Specific studies have added further steps, such as artifact removal and hole-filling for optimization.
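Morphological clean-up and hole-filling of a binary lesion mask can be sketched as follows. The 3x3 structuring element, the 4-connected flood fill, and the function names are assumptions for illustration.

```python
from collections import deque
import numpy as np

def dilate(mask):
    """3x3 binary dilation: a pixel is set if it or any neighbor is set."""
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    h, w = mask.shape
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out |= p[dy:dy + h, dx:dx + w]
    return out

def erode(mask):
    """3x3 binary erosion via duality (outside the image counts as set)."""
    return ~dilate(~mask)

def open_mask(mask):
    """Opening (erosion then dilation) removes small isolated specks."""
    return dilate(erode(mask))

def fill_holes(mask):
    """Set every background pixel that is not 4-connected to the border."""
    h, w = mask.shape
    outside = np.zeros_like(mask)
    queue = deque((y, x) for y in range(h) for x in range(w)
                  if (y in (0, h - 1) or x in (0, w - 1)) and not mask[y, x])
    for y, x in queue:
        outside[y, x] = True
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx] and not outside[ny, nx]:
                outside[ny, nx] = True
                queue.append((ny, nx))
    return ~outside
```

The mask must be a boolean array. Filling holes before opening, as in `open_mask(fill_holes(mask))`, avoids eroding away a lesion that contains small gaps.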
Extraction of diagnostic features in dermatology
Importance of features
In ML, feature extraction and selection are important. Many studies of skin lesion diagnosis have used a wide range of features, often rooted in dermatology’s ABCD rule (asymmetry, border irregularity, color, and diameter) and supplemented by texture.
Some researchers harness deep neural networks, especially CNNs, for feature extraction.
ABCD rule and its modern utility
The ABCD rule, established in 1985, provides criteria for early melanoma detection, focusing on asymmetry, border irregularity, color variation, and diameter greater than 6 mm. This rule remains dominant in contemporary research.
Detailed feature extraction techniques
A variety of methods have been used to assess asymmetry in lesions, from calculating major and minor axes to measuring lesion solidity and variation. Similarly, to assess lesion borders, techniques range from calculating compactness, solidity, and convexity to measuring circumferential error and using convex hulls.
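Several of these border descriptors follow directly from the lesion contour and its convex hull. The sketch below uses common textbook definitions, which are assumptions here since the reviewed papers may define them differently: compactness as 4πA/P² (1 for a circle), solidity as lesion area divided by hull area, and convexity as hull perimeter divided by lesion perimeter. The contour is assumed to be a list of (x, y) tuples in order around the lesion.

```python
import math

def polygon_area(pts):
    """Shoelace formula for a closed polygon given as ordered vertices."""
    s = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def polygon_perimeter(pts):
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:] + pts[:1]))

def convex_hull(pts):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def border_features(contour):
    area = polygon_area(contour)
    perimeter = polygon_perimeter(contour)
    hull = convex_hull(contour)
    return {
        "compactness": 4 * math.pi * area / perimeter ** 2,
        "solidity": area / polygon_area(hull),
        "convexity": polygon_perimeter(hull) / perimeter,
    }
```

A regular, convex outline gives solidity and convexity close to 1, while a notched or ragged border lowers both values.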
Classification of skin lesions with ML
Skin lesion classification, an important step in skin cancer detection, relies on computer-aided systems that preprocess and segment images and extract features from them, which are then used to assign lesions to different classes. The process uses the extracted descriptors to provide information about the PSL.
Many ML approaches favor CNNs because of their accuracy in image classification and their feature extraction capabilities. Some researchers adopted pre-trained CNNs, while others used ML methods such as support vector machine (SVM) and k-nearest neighbors (KNN) for their classification work.
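To make the classical route concrete, here is a minimal KNN classifier over hand-crafted feature vectors. The function name and the idea of feeding it two-dimensional descriptor vectors are assumptions for illustration; real pipelines would use many more features and a tested library implementation.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest neighbors.

    train_x: list of feature tuples, train_y: list of labels,
    query: a feature tuple of the same length as the training vectors.
    """
    # Indices of training samples, sorted by Euclidean distance to the query
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]
```

With made-up descriptor pairs such as `[(0.1, 0.9), (0.2, 0.8), (0.8, 0.3), (0.9, 0.2)]` labeled benign, benign, melanoma, melanoma, a query near the last two points would be voted melanoma.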
Classification overview
This section covers the results obtained from the classification task in the reviewed articles. Some articles did not perform classification, so no results are cited from them.
It is important to note that a fair comparison of classification performance across studies requires that the task be performed on identical datasets. However, most of the articles used different image sets, so results are presented only for papers with comparable datasets.
Challenges and observations
Accuracy as a metric should be applied with caution, especially with unbalanced datasets. Some datasets, although yielding excellent results, were small, which calls the reliability of those results into question.
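A small worked example shows why accuracy can mislead on unbalanced data. The numbers below are invented for illustration: 95 benign and 5 melanoma cases, and a classifier that labels everything benign. Balanced accuracy, taken here as the mean of sensitivity and specificity, is one common alternative.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def sensitivity_specificity(y_true, y_pred, positive=1):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity

def balanced_accuracy(y_true, y_pred, positive=1):
    sens, spec = sensitivity_specificity(y_true, y_pred, positive)
    return (sens + spec) / 2.0

# Hypothetical data: 95 benign (0), 5 melanoma (1), all predicted benign
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
print(accuracy(y_true, y_pred))           # 0.95
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

The classifier misses every melanoma yet still scores 95% accuracy, while its balanced accuracy is no better than chance.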
Also, innovative approaches, such as the effect of darkening areas of the skin or the balanced learning process, highlight the different techniques employed in the field.