Breast Ultrasound Image BI-RADS Classification Based on Vision Transformer
Keywords:
Ultrasound Imaging, Breast Cancer, Vision Transformer, Convolutional Neural Network (CNN), Image Classification.

Abstract
Breast cancer is the most common malignancy among women, and medical ultrasound imaging is a common tool for detecting it. Convolutional neural networks (CNNs) have demonstrated great success in medical ultrasound image classification. However, most CNN-based studies categorize breast tumors only into benign and malignant types. Additionally, because CNNs have a limited receptive field, they struggle to capture global information. To address this issue, we explored the feasibility of applying the Vision Transformer (ViT) to the BI-RADS classification of breast ultrasound images through transfer learning. We collected publicly available breast ultrasound datasets, enhanced the image quality using the CLAHE algorithm, and trained the ViT model with a transfer learning strategy. On an independent test set, we compared the classification results of ViT with those of CNNs serving as baseline models. The breast ultrasound images were categorized according to the BI-RADS criteria, and the results were evaluated using accuracy, precision, and F1 score. In the experiments, the transfer-learned ViT model achieved 94.57% accuracy, 94.11% precision, and a 94.29% F1 score on the BI-RADS classification of breast ultrasound images, outperforming the CNN models, including DenseNet201, Xception, MobileNet, and GoogLeNet. The study shows that ViT can be effectively applied to classifying breast ultrasound images according to the BI-RADS system and is a promising alternative to CNNs for breast cancer classification.
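The abstract describes a pipeline of CLAHE enhancement, transfer learning of a pretrained ViT, and metric-based evaluation, but gives no implementation details. The sketch below is only an illustrative outline of such a pipeline, assuming OpenCV for CLAHE, an ImageNet-pretrained ViT-Base/16 from the timm library, and scikit-learn for the metrics; the model variant, hyperparameters, and number of BI-RADS categories are assumptions, not the authors' exact configuration.

```python
import cv2
import timm
import torch
import torch.nn as nn
from sklearn.metrics import accuracy_score, precision_score, f1_score

# --- CLAHE enhancement of a grayscale breast ultrasound image ---
def enhance_with_clahe(image_path, clip_limit=2.0, tile_grid_size=(8, 8)):
    """Apply Contrast Limited Adaptive Histogram Equalization (CLAHE)."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=tile_grid_size)
    enhanced = clahe.apply(img)
    # Replicate to 3 channels so the image matches the ViT's expected RGB input.
    return cv2.cvtColor(enhanced, cv2.COLOR_GRAY2RGB)

# --- Transfer learning: ImageNet-pretrained ViT with a new BI-RADS head ---
NUM_BIRADS_CLASSES = 4  # assumed number of BI-RADS categories; adjust to the dataset
model = timm.create_model("vit_base_patch16_224", pretrained=True,
                          num_classes=NUM_BIRADS_CLASSES)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def train_step(images, labels):
    """One fine-tuning step on a batch of enhanced ultrasound images."""
    model.train()
    optimizer.zero_grad()
    logits = model(images)           # images: (B, 3, 224, 224) float tensor
    loss = criterion(logits, labels)
    loss.backward()
    optimizer.step()
    return loss.item()

def evaluate(y_true, y_pred):
    """Compute the metrics reported in the abstract (macro-averaged here)."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average="macro"),
        "f1": f1_score(y_true, y_pred, average="macro"),
    }
```

Under these assumptions, the same enhancement and training loop can be reused for the CNN baselines (DenseNet201, Xception, MobileNet, GoogLeNet) by swapping the model name passed to timm.create_model, which keeps the comparison with ViT consistent.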