Retinopathy of prematurity (ROP) is a major cause of vision loss and blindness among premature infants. Timely screening, diagnosis, and intervention can effectively prevent ROP from progressing. However, ROP diagnosis faces several challenges worldwide, including high subjectivity, low screening efficiency, regional disparities in screening coverage, and a severe shortage of pediatric ophthalmologists. Applying artificial intelligence (AI), either as an assistive tool or as an automated diagnostic method, can improve the efficiency and objectivity of ROP diagnosis, expand screening coverage, and enable automated screening with quantified diagnostic results. In a global environment that emphasizes the development and application of medical imaging AI, developing more accurate diagnostic networks, exploring more effective AI-assisted diagnosis methods, and enhancing the interpretability of AI-assisted diagnosis can accelerate the improvement of AI policies for ROP and the deployment of AI products, thereby advancing ROP diagnosis and treatment.
In recent years, object detection and segmentation in medical images have been both a research hotspot and a difficult problem in the field of image processing. Instance segmentation provides instance-level labels for different objects belonging to the same class, so it is widely used in medical image processing. This paper summarizes medical image instance segmentation from four aspects. First, the basic principle of instance segmentation is described, instance segmentation models are classified into three categories, the development of instance segmentation algorithms is charted in two-dimensional space, and diagrams of six classic instance segmentation models are given. Second, for the three categories of two-stage instance segmentation, single-stage instance segmentation, and three-dimensional (3D) instance segmentation, the ideas behind each type of model are summarized, their advantages and disadvantages are discussed, and the latest developments are reviewed. Third, the application status of instance segmentation is summarized for six types of medical images: colon tissue images, cervical images, bone images, pathological section images of gastric cancer, computed tomography (CT) images of lung nodules, and breast X-ray images. Fourth, the main challenges in medical image instance segmentation are discussed and future development directions are outlined. By systematically summarizing the principles, models, and characteristics of instance segmentation, as well as its applications in medical image processing, this paper offers useful guidance for further research on instance segmentation.
To address the large number of network parameters and the substantial floating-point operations of deep learning networks applied to cardiac magnetic resonance imaging (MRI) segmentation, this paper proposes a lightweight dilated parallel convolution U-Net (DPU-Net) that reduces both the parameter count and the number of floating-point operations. Additionally, a multi-scale adaptation vector knowledge distillation (MAVKD) training strategy is employed to extract latent knowledge from the teacher network, thereby enhancing the segmentation accuracy of DPU-Net. The proposed network adopts a distinctive scheme of convolutional channel variation to reduce the number of parameters, and combines it with residual blocks and dilated convolutions to alleviate the gradient explosion problem and the spatial information loss that the parameter reduction might cause. The results indicate that the network achieves considerable reductions in both the number of parameters and the number of floating-point operations. On the public automatic cardiac diagnosis challenge (ACDC) dataset, the Dice coefficient reaches 91.26%. These results validate the effectiveness of the proposed lightweight network and knowledge distillation strategy, providing a reliable lightweighting approach for deep learning in medical image segmentation.
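The lightweighting argument in the DPU-Net abstract rests on two general properties of convolution layers: the weight count scales with the product of input and output channels, and dilation widens the area a kernel covers without adding weights. The following is a minimal sketch of that arithmetic only. The function names and channel widths are illustrative assumptions and are not taken from the paper.

```python
def conv2d_params(in_ch: int, out_ch: int, kernel: int, bias: bool = True) -> int:
    """Learnable parameters in a standard 2D convolution layer."""
    return in_ch * out_ch * kernel * kernel + (out_ch if bias else 0)


def effective_kernel(kernel: int, dilation: int) -> int:
    """Spatial extent (in pixels, along one axis) covered by a dilated kernel."""
    return kernel + (kernel - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical channel widths: halving both cuts the weights roughly fourfold.
    print(conv2d_params(64, 64, 3))  # 36928
    print(conv2d_params(32, 32, 3))  # 9248
    # A 3x3 kernel with dilation 2 spans 5 pixels but still has 9 weights per channel pair.
    print(effective_kernel(3, 2))    # 5
```

This is why narrowing channels and adding dilation are commonly paired: the first shrinks the model, the second recovers some of the spatial context that a smaller model would otherwise lose.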
Colorectal cancer (CRC) is a common malignant tumor that seriously threatens human health. Its indistinct boundaries make accurate identification a formidable challenge. With the widespread adoption of convolutional neural networks (CNNs) in image processing, using CNNs for automatic classification and segmentation holds great potential for improving the efficiency of CRC recognition and reducing treatment costs. This paper examines the need for applying CNNs in the clinical diagnosis of CRC and provides a detailed overview of research advances in CNNs and their improved models for CRC classification and segmentation. Furthermore, it summarizes the ideas and common methods for optimizing network performance, and discusses the challenges faced by CNNs and future trends in their application to CRC classification and segmentation, with the aim of promoting their use in clinical diagnosis.
To address issues such as loss of detailed information, blurred target boundaries, and unclear structural hierarchy in medical image fusion, this paper proposes an adaptive feature medical image fusion network based on a full-scale diffusion model. First, a region-level feature map is generated using a kernel-based saliency map to enhance local features and boundary details. Then, a full-scale diffusion feature extraction network is employed for global feature extraction, alongside a multi-scale denoising U-shaped network designed to fully capture cross-layer information. A multi-scale feature integration module is introduced to reinforce the texture details and structural information extracted by the encoder. Finally, an adaptive fusion scheme progressively fuses region-level features, global features, and source images layer by layer, enhancing the preservation of detail information. The proposed model is validated on the publicly available Harvard dataset and an abdominal dataset. Compared with nine other representative image fusion methods, the proposed approach achieved improvements across seven evaluation metrics. The results demonstrate that the proposed method effectively extracts both global and local features of medical images, enhances texture details and target boundary clarity, and generates fused images with high contrast and rich information, providing more reliable support for subsequent clinical diagnosis.
This study proposes an automated neurofibroma detection method for whole-body magnetic resonance imaging (WBMRI) based on radiomics and ensemble learning. A dynamic weighted box fusion mechanism integrating two-dimensional (2D) object detection and three-dimensional (3D) segmentation is developed, in which the fusion weights are dynamically adjusted according to the respective performance of the models in different tasks. The 3D segmentation model leverages spatial structural information to compensate for the limited boundary perception capability of 2D methods. In addition, a radiomics-based false positive reduction strategy is introduced to improve the robustness of the detection system. The proposed method is evaluated on 158 clinical WBMRI cases with a total of 1,380 annotated tumor samples, using five-fold cross-validation. Experimental results show that, compared with the best-performing single model, the proposed approach achieves notable improvements in average precision, sensitivity, and overall performance metrics, while reducing the average number of false positives by 17.68. These findings demonstrate that the proposed method achieves high detection accuracy with enhanced false positive suppression and strong generalization potential.
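The abstract above describes fusing boxes from two models with weights that follow each model's performance, but it does not give the formula. As a hedged illustration of the general idea, the sketch below averages two overlapping boxes using confidence multiplied by a per-model weight. The helper names (`performance_weights`, `fuse_pair`), the normalization of validation metrics into weights, and the sample numbers are all assumptions, not the authors' mechanism.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def performance_weights(metric_a: float, metric_b: float) -> Tuple[float, float]:
    """Hypothetical rule: turn two validation metrics into weights that sum to 1."""
    total = metric_a + metric_b
    return metric_a / total, metric_b / total


def fuse_pair(box_a: Box, score_a: float, w_a: float,
              box_b: Box, score_b: float, w_b: float) -> Tuple[Box, float]:
    """Average two boxes, weighting each coordinate by confidence * model weight."""
    ka, kb = score_a * w_a, score_b * w_b
    total = ka + kb
    fused = tuple((ka * ca + kb * cb) / total for ca, cb in zip(box_a, box_b))
    return fused, total / (w_a + w_b)


if __name__ == "__main__":
    w2d, w3d = performance_weights(0.70, 0.80)   # made-up validation scores
    box_2d, box_3d = (10.0, 10.0, 50.0, 50.0), (14.0, 12.0, 54.0, 52.0)
    if iou(box_2d, box_3d) > 0.5:                # only fuse boxes that overlap enough
        print(fuse_pair(box_2d, 0.9, w2d, box_3d, 0.6, w3d))
```

In a full pipeline the boxes derived from the 3D segmentation would first be projected to the same slice coordinates as the 2D detections; that step is omitted here.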
Colorectal polyps are important early markers of colorectal cancer, and their early detection is crucial for cancer prevention. Although existing polyp segmentation models have achieved some success, they still face challenges such as diverse polyp morphology, blurred boundaries, and insufficient feature extraction. To address these issues, this study proposes a parallel coordinate fusion network (PCFNet), aiming to improve the accuracy and robustness of polyp segmentation. PCFNet integrates parallel convolutional modules and a coordinate attention mechanism, preserving global feature information while precisely capturing detailed features, and thereby effectively segmenting polyps with complex boundaries. Experimental results on Kvasir-SEG and CVC-ClinicDB demonstrate the strong performance of PCFNet across multiple metrics. Specifically, on the Kvasir-SEG dataset, PCFNet achieved an F1-score of 0.8974 and a mean intersection over union (mIoU) of 0.8358; on the CVC-ClinicDB dataset, it attained an F1-score of 0.9398 and an mIoU of 0.8923. Compared with other methods, PCFNet shows significant improvements across all performance metrics, particularly in multi-scale feature fusion and spatial information capture, which demonstrates its novelty. The proposed method provides a more reliable AI-assisted diagnostic tool for early colorectal cancer screening.
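For readers comparing the F1-score and mIoU figures reported above, the sketch below shows how both are commonly computed for binary segmentation masks. It assumes flattened 0/1 masks and averages IoU over images; the abstract does not state its averaging convention, so `mean_iou` is one plausible reading rather than the paper's evaluation code.

```python
from typing import List, Sequence, Tuple


def confusion(pred: Sequence[int], truth: Sequence[int]) -> Tuple[int, int, int]:
    """True positives, false positives, false negatives for flattened binary masks."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    return tp, fp, fn


def f1_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """F1 (identical to the Dice coefficient for binary masks)."""
    tp, fp, fn = confusion(pred, truth)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def iou_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Intersection over union of the foreground class."""
    tp, fp, fn = confusion(pred, truth)
    denom = tp + fp + fn
    return tp / denom if denom else 1.0


def mean_iou(pairs: List[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Assumed convention: average the per-image IoU over a dataset."""
    return sum(iou_score(p, t) for p, t in pairs) / len(pairs)


if __name__ == "__main__":
    pred = [1, 1, 1, 0, 0, 0]
    truth = [1, 1, 0, 1, 0, 0]
    print(f1_score(pred, truth), iou_score(pred, truth))
```

For a single mask the two metrics are tied by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU. Dataset-level averages such as those in the abstract only approximately follow this relation.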
High-resolution (HR) magnetic resonance imaging (MRI) or computed tomography (CT) images provide clearer anatomical details of the human body, which facilitates early diagnosis of diseases. However, owing to limitations of the imaging system, the imaging environment, and human factors, clear HR images are difficult to obtain. In this paper, we propose a novel medical image super-resolution (SR) reconstruction method based on a multi-scale information distillation (MSID) network in the non-subsampled shearlet transform (NSST) domain, namely the NSST-MSID network. We first propose an MSID network, consisting mainly of a series of stacked MSID blocks, to fully exploit image features and effectively restore low-resolution (LR) images to HR images. In addition, most previous methods predict HR images in the spatial domain, producing over-smoothed outputs that lose texture details. We therefore treat the medical image SR task as the prediction of NSST coefficients, which allows the MSID network to preserve richer structural details than prediction in the spatial domain. Finally, experimental results on our constructed medical image datasets demonstrate that the proposed method obtains better peak signal-to-noise ratio (PSNR), structural similarity (SSIM), and root mean square error (RMSE) values and preserves global topological structure and local texture details better than other leading methods, achieving good medical image reconstruction.
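Two of the metrics named above, RMSE and PSNR, are directly related, which the following minimal sketch makes explicit. It assumes images are given as flattened lists of pixel intensities with a peak value of 255; the abstract does not state the intensity range used, so `max_value` is an assumption to adjust.

```python
import math
from typing import Sequence


def rmse(reference: Sequence[float], reconstructed: Sequence[float]) -> float:
    """Root mean square error between two equally sized pixel sequences."""
    squared = sum((r - x) ** 2 for r, x in zip(reference, reconstructed))
    return math.sqrt(squared / len(reference))


def psnr(reference: Sequence[float], reconstructed: Sequence[float],
         max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB: 20 * log10(max_value / RMSE)."""
    err = rmse(reference, reconstructed)
    if err == 0:
        return math.inf  # identical images
    return 20.0 * math.log10(max_value / err)


if __name__ == "__main__":
    hr = [52.0, 55.0, 61.0, 59.0]   # made-up HR pixel values
    sr = [50.0, 56.0, 60.0, 62.0]   # made-up reconstruction
    print(rmse(hr, sr), psnr(hr, sr))
```

Because PSNR is a monotone function of RMSE for a fixed peak value, a method that lowers RMSE necessarily raises PSNR. SSIM measures structural agreement separately and is not covered here.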
To address the challenges faced by current brain midline segmentation techniques, such as insufficient accuracy and poor segmentation continuity, this paper proposes a deep learning network model based on a two-stage framework. In the first stage, the model uses prior knowledge of the feature consistency of adjacent brain midline slices under normal and pathological conditions. Associated midline slices are selected through slice similarity analysis, and a novel feature weighting strategy collaboratively fuses the overall change characteristics and spatial information of these associated slices, thereby enhancing the feature representation of the brain midline in the intracranial region. In the second stage, an optimal path search strategy for the brain midline is applied to the probability map output by the network, which effectively addresses the problem of discontinuous midline segmentation. The proposed method achieved satisfactory results on the CQ500 dataset provided by the Center for Advanced Research in Imaging, Neurosciences and Genomics, New Delhi, India. The Dice similarity coefficient (DSC), Hausdorff distance (HD), average symmetric surface distance (ASSD), and normalized surface Dice (NSD) were 67.38 ± 10.49, 24.22 ± 24.84, 1.33 ± 1.83, and 0.82 ± 0.09, respectively. The experimental results demonstrate that the proposed method makes full use of prior knowledge of medical images to segment the brain midline accurately, providing valuable assistance for subsequent identification of the brain midline by clinicians.
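The second stage described above searches for an optimal path through the network's probability map so that the midline stays continuous. The abstract does not specify the search algorithm, so the sketch below uses a generic dynamic programming formulation as a stand-in: it picks one column per row, maximizes the summed log-probability, and limits the column change between adjacent rows. The function name, the `max_shift` constraint, and the toy map are all assumptions.

```python
import math
from typing import List


def midline_path(prob: List[List[float]], max_shift: int = 1) -> List[int]:
    """Return one column index per row forming the highest-scoring connected path."""
    eps = 1e-12  # avoids log(0) where the map is empty
    cols = len(prob[0])
    score = [math.log(p + eps) for p in prob[0]]
    back: List[List[int]] = []  # back[r-1][c] = best predecessor column for row r
    for row in prob[1:]:
        new_score, ptr = [], []
        for c in range(cols):
            lo, hi = max(0, c - max_shift), min(cols - 1, c + max_shift)
            best = max(range(lo, hi + 1), key=lambda k: score[k])
            new_score.append(score[best] + math.log(row[c] + eps))
            ptr.append(best)
        score = new_score
        back.append(ptr)
    # Trace back from the best final column.
    c = max(range(cols), key=lambda k: score[k])
    path = [c]
    for ptr in reversed(back):
        c = ptr[c]
        path.append(c)
    path.reverse()
    return path


if __name__ == "__main__":
    toy_map = [
        [0.05, 0.90, 0.05],
        [0.05, 0.90, 0.05],
        [0.00, 0.00, 0.00],  # a gap where the network predicted nothing
        [0.05, 0.05, 0.90],
    ]
    print(midline_path(toy_map))
```

Thresholding the map row by row would leave a hole at the empty row, whereas the path formulation always returns one position per row and so bridges the gap.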
Online hashing methods are receiving increasing attention in cross-modal medical image retrieval research. However, existing online methods often lack the ability to maintain semantic correlation between new and existing data. To this end, we propose an online semantic similarity cross-modal hashing (OSCMH) learning framework that incrementally learns compact binary hash codes for streaming medical data. Within this framework, a sparse representation of existing data based on online anchor datasets is designed to avoid semantic forgetting and to adaptively update hash codes, which effectively maintains semantic correlation between existing and arriving data, reduces information loss, and improves training efficiency. In addition, an online discrete optimization method is proposed to solve the binary optimization problem of the hash codes by incrementally updating the hash function and optimizing the hash codes on the streaming medical data. Compared with existing online and offline hashing methods, the proposed algorithm achieved average retrieval accuracy improvements of 12.5% and 14.3% on two datasets, respectively, effectively enhancing retrieval efficiency for medical images.
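Compact binary hash codes make retrieval cheap because two items can be compared by counting differing bits. The sketch below shows that retrieval step only, ranking stored codes by Hamming distance to a query code. It does not implement the OSCMH learning procedure, and the 8-bit codes and item names are invented for illustration.

```python
from typing import Dict, List, Tuple


def hamming(a: int, b: int) -> int:
    """Number of bit positions where two hash codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: Dict[str, int],
                    top_k: int = 3) -> List[Tuple[str, int]]:
    """Return the top_k (item, distance) pairs closest to the query code."""
    scored = [(name, hamming(query, code)) for name, code in database.items()]
    scored.sort(key=lambda item: (item[1], item[0]))  # distance first, then name
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical codes for items from different modalities.
    database = {
        "ct_001": 0b10110010,
        "mri_007": 0b10110110,
        "xray_003": 0b01001101,
    }
    query_code = 0b10110011  # e.g. the code of a query from another modality
    print(rank_by_hamming(query_code, database, top_k=2))
    # [('ct_001', 1), ('mri_007', 2)]
```

In cross-modal hashing, the images and the other modality are mapped into the same code space, so a query from one modality can be ranked against stored items from another in exactly this way.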