Medical image registration plays an important role in medical diagnosis and treatment planning. However, current registration methods based on deep learning still face challenges such as an insufficient ability to extract global information, large numbers of network parameters, and slow inference speed. Therefore, this paper proposed a new model, LCU-Net, which used parallel lightweight convolution to improve global information extraction. The problems of a large parameter count and slow inference were addressed by multi-scale fusion. The experimental results showed that the Dice coefficient of LCU-Net reached 0.823, the Hausdorff distance was 1.258, and the number of network parameters was reduced by about one quarter compared with that before multi-scale fusion. The proposed algorithm shows remarkable advantages in medical image registration tasks: it not only outperforms the existing comparison algorithms but also generalizes well, giving it broad application prospects.
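To make the idea of parallel lightweight convolution concrete, a minimal PyTorch sketch follows; the two-branch design, kernel sizes, and the name ParallelLightConv are illustrative assumptions, not the published LCU-Net block.

```python
import torch
import torch.nn as nn

class ParallelLightConv(nn.Module):
    """Two lightweight (depthwise-separable) branches run in parallel:
    a 3x3 branch for local detail and a 5x5 branch whose larger
    receptive field captures more global context; outputs are concatenated.
    Illustrative sketch only -- not the published LCU-Net block."""
    def __init__(self, c_in, c_out):
        super().__init__()
        half = c_out // 2  # assumes c_out is even
        self.local = nn.Sequential(
            nn.Conv2d(c_in, c_in, 3, padding=1, groups=c_in),  # depthwise 3x3
            nn.Conv2d(c_in, half, 1),                          # pointwise mix
        )
        self.context = nn.Sequential(
            nn.Conv2d(c_in, c_in, 5, padding=2, groups=c_in),  # depthwise 5x5
            nn.Conv2d(c_in, half, 1),
        )

    def forward(self, x):
        return torch.cat([self.local(x), self.context(x)], dim=1)
```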
Non-rigid registration plays an important role in medical image analysis. U-Net is a hot research topic in medical image analysis and is widely used in medical image registration. However, existing registration models based on U-Net and its variants lack sufficient learning ability when dealing with complex deformations and do not fully utilize multi-scale contextual information, resulting in insufficient registration accuracy. To address this issue, a non-rigid registration algorithm for X-ray images based on deformable convolution and a multi-scale feature focusing module was proposed. First, residual deformable convolution replaced the standard convolution of the original U-Net to enhance the network's ability to express geometric deformations of images. Then, strided convolution replaced the pooling operation in downsampling to alleviate the feature loss caused by repeated pooling. In addition, a multi-scale feature focusing module was introduced into the bridging layer of the encoder-decoder structure to improve the network's ability to integrate global contextual information. Theoretical analysis and experimental results both showed that the proposed registration algorithm could focus on multi-scale contextual information, handle medical images with complex deformations, and improve registration accuracy. It is suitable for non-rigid registration of chest X-ray images.
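The two replacements described above can be sketched in PyTorch as below; torchvision.ops.DeformConv2d provides the deformable sampling, while the block layout and hyperparameters are illustrative assumptions rather than the paper's exact design.

```python
import torch.nn as nn
from torchvision.ops import DeformConv2d

class ResidualDeformBlock(nn.Module):
    """Residual block whose sampling grid adapts to local geometry:
    a plain conv predicts (dy, dx) offsets for each kernel tap, and
    DeformConv2d samples the input at those offset locations."""
    def __init__(self, channels, k=3):
        super().__init__()
        self.offset = nn.Conv2d(channels, 2 * k * k, k, padding=k // 2)
        self.deform = DeformConv2d(channels, channels, k, padding=k // 2)
        self.norm = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(x + self.norm(self.deform(x, self.offset(x))))

# Downsampling by strided convolution instead of pooling: resolution is
# halved by learnable weights, so less feature detail is discarded.
down = nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1)
```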
Most current medical image segmentation models are built upon the U-shaped network (U-Net) architecture, which has limitations in capturing both global contextual information and fine-grained details. To address this issue, this paper proposes a novel U-shaped network model, the Multi-View U-Net (MUNet), which integrates self-attention and multi-view attention mechanisms. Specifically, a newly designed multi-view attention module aggregates semantic features from different perspectives, thereby enhancing the representation of fine details in images. Additionally, MUNet leverages a self-attention encoding block to extract global image features, and improves segmentation performance by fusing global and local features. Experimental results demonstrate that the proposed model achieves superior segmentation performance in coronary artery image segmentation tasks, significantly outperforming existing models. By incorporating self-attention and multi-view attention mechanisms, this study provides a novel and efficient modeling approach for medical image segmentation, contributing to the advancement of intelligent medical image analysis.
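As one concrete reading of the self-attention encoding block, the PyTorch sketch below flattens the feature map into a token sequence and applies multi-head self-attention; the multi-view attention module itself is not reproduced here, and all names and sizes are assumptions.

```python
import torch
import torch.nn as nn

class SelfAttentionBlock(nn.Module):
    """Global-context encoder: flatten the feature map to a token sequence
    and let every spatial position attend to every other one."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):                     # x: (B, C, H, W)
        b, c, h, w = x.shape
        t = x.flatten(2).transpose(1, 2)      # (B, H*W, C) token sequence
        a, _ = self.attn(t, t, t, need_weights=False)
        t = self.norm(t + a)                  # residual + layer norm
        return t.transpose(1, 2).reshape(b, c, h, w)
```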
In computer-aided medical diagnosis, obtaining labeled medical image data is expensive, while demand for model interpretability is high. However, most current deep learning models require large amounts of data and lack interpretability. To address these challenges, this paper proposes a novel data augmentation method for medical image segmentation. Its uniqueness and advantage lie in using gradient-weighted class activation mapping (Grad-CAM) to extract data-efficient features, which are then fused with the original image. A new channel-weight feature extractor is then constructed to learn the weights between different channels. This approach achieves non-destructive data augmentation, enhancing the model's performance, data efficiency, and interpretability. Applied to the Hyper-Kvasir dataset, the method improved both the intersection over union (IoU) and the Dice coefficient of U-Net; on the ISIC-Archive dataset, it likewise improved the IoU and Dice of DeepLabV3+. Furthermore, even when the training data is reduced to 70%, the proposed method still achieves 95% of the performance obtained with the full dataset, indicating good data efficiency. Moreover, the data-efficient features carry built-in interpretable information, which enhances the interpretability of the model. The method is highly general and plug-and-play: it applies to various segmentation methods without modification of the network structure, and is therefore easy to integrate into existing medical image segmentation pipelines, enhancing the convenience of future research and applications.
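A minimal Grad-CAM sketch is given below to show how such saliency features can be extracted and blended back into the input; it assumes a classification-style head and a chosen conv layer, and the blending weight is illustrative rather than the paper's exact pipeline.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, layer, image, target):
    """Minimal Grad-CAM: average-pooled gradients weight the feature maps
    of one conv layer, giving a class-discriminative saliency map."""
    feats, grads = [], []
    h1 = layer.register_forward_hook(lambda m, i, o: feats.append(o))
    h2 = layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))
    score = model(image)[0, target]    # assumes a classification-style output
    model.zero_grad()
    score.backward()
    h1.remove(); h2.remove()
    w = grads[0].mean(dim=(2, 3), keepdim=True)            # channel weights
    cam = F.relu((w * feats[0]).sum(dim=1, keepdim=True))  # weighted sum
    cam = F.interpolate(cam, image.shape[2:], mode="bilinear",
                        align_corners=False)
    return cam / (cam.max() + 1e-8)

# Non-destructive augmentation: blend the saliency map into the image.
# Both the 0.5 weight and model.layer4 are illustrative choices.
# augmented = image * (1 + 0.5 * grad_cam(model, model.layer4, image, y))
```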
Colorectal cancer (CRC) is a common malignant tumor that seriously threatens human health, and its indistinct boundaries make accurate identification a formidable challenge. With the widespread adoption of convolutional neural networks (CNNs) in image processing, leveraging CNNs for automatic classification and segmentation holds immense potential for enhancing the efficiency of CRC recognition and reducing treatment costs. This paper examines the necessity of applying CNNs in the clinical diagnosis of CRC. It provides a detailed overview of research advances in CNNs and their improved models for CRC classification and segmentation. Furthermore, it summarizes common approaches to optimizing network performance and discusses the challenges faced by CNNs as well as future trends in their application to CRC classification and segmentation, thereby promoting their utilization in clinical diagnosis.
Intelligent medical image segmentation methods have developed rapidly and found wide application, yet a significant challenge remains: domain shift, that is, segmentation performance degrading due to distribution differences between the source and target domains. This paper proposed an unsupervised end-to-end domain-adaptive medical image segmentation method based on a generative adversarial network (GAN). A network training and adjustment model was designed, comprising a segmentation network and a discriminator network. In the segmentation network, residual modules served as the basic building blocks to increase feature reusability and ease model optimization. Cross-domain features were then learned at the image feature level with the help of the discriminator network and a combination of segmentation loss and adversarial loss. The discriminator, a convolutional neural network trained with source-domain labels, distinguished whether a segmentation result produced by the segmentation network came from the source domain or the target domain; the whole training process was unsupervised with respect to the target domain. The proposed method was evaluated on a public dataset of knee magnetic resonance (MR) images and a clinical dataset from our cooperating hospital. With our method, the mean Dice similarity coefficient (DSC) of the segmentation results increased by 2.52% and 6.10% over the classical feature-level and image-level domain adaptation methods, respectively. The proposed method effectively improves the domain-adaptive ability of the segmentation model, significantly improves the segmentation accuracy of the tibia and femur, and better addresses the domain shift problem in MR image segmentation.
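The sketch below shows one adversarial training step of the kind described, combining a supervised segmentation loss on the source domain with an adversarial loss that pushes target-domain predictions to look source-like; the loss weight and optimizer handling are assumptions.

```python
import torch
import torch.nn as nn

ce = nn.CrossEntropyLoss()
bce = nn.BCEWithLogitsLoss()

def train_step(seg, disc, opt_seg, opt_disc, x_src, y_src, x_tgt, lam=0.01):
    """One adversarial adaptation step. lam weights the adversarial term;
    its value here is illustrative, not taken from the paper."""
    # Segmentation net: supervised on source, adversarial on target
    # (target predictions should be indistinguishable from source ones).
    p_src, p_tgt = seg(x_src), seg(x_tgt)
    d_tgt = disc(torch.softmax(p_tgt, dim=1))
    loss_g = ce(p_src, y_src) + lam * bce(d_tgt, torch.ones_like(d_tgt))
    opt_seg.zero_grad(); loss_g.backward(); opt_seg.step()
    # Discriminator: label source-derived maps 1, target-derived maps 0.
    d_src = disc(torch.softmax(seg(x_src).detach(), dim=1))
    d_tgt = disc(torch.softmax(seg(x_tgt).detach(), dim=1))
    loss_d = bce(d_src, torch.ones_like(d_src)) + \
             bce(d_tgt, torch.zeros_like(d_tgt))
    opt_disc.zero_grad(); loss_d.backward(); opt_disc.step()
    return loss_g.item(), loss_d.item()
```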
Aiming at the problems of missing important features, inconspicuous details, and unclear textures in multimodal medical image fusion, this paper proposes a method for fusing computed tomography (CT) and magnetic resonance imaging (MRI) images using a generative adversarial network (GAN) and a convolutional neural network (CNN) under image enhancement. The generator targeted the high-frequency feature images, and dual discriminators targeted the fused images after the inverse transform. The high-frequency feature images were then fused by the trained GAN model, and the low-frequency feature images were fused by a CNN model pre-trained with transfer learning. Experimental results showed that, compared with current advanced fusion algorithms, the proposed method produced richer texture details and clearer contour edges in subjective evaluation. In the objective evaluation, QAB/F, information entropy (IE), spatial frequency (SF), structural similarity (SSIM), mutual information (MI), and visual information fidelity for fusion (VIFF) were 2.0%, 6.3%, 7.0%, 5.5%, 9.0%, and 3.3% higher than the best comparison results, respectively. The fused images can be effectively applied to medical diagnosis to further improve diagnostic efficiency.
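As a simplified stand-in for the frequency decomposition and band-wise fusion described above, the sketch below splits each image into Gaussian low-pass and residual high-pass bands; the paper's actual transform and trained GAN/CNN fusers are only indicated as placeholders.

```python
import torch
import torch.nn.functional as F

def split_frequencies(img, k=11, sigma=2.0):
    """Toy decomposition: Gaussian low-pass plus high-pass residual.
    Stands in for the paper's transform; k and sigma are illustrative."""
    coords = torch.arange(k, dtype=torch.float32) - k // 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    kernel = (g[:, None] * g[None, :]).expand(img.shape[1], 1, k, k).contiguous()
    low = F.conv2d(img, kernel, padding=k // 2, groups=img.shape[1])
    return low, img - low

# Band-wise fusion with dedicated models (placeholders, not real APIs):
# low_f  = cnn_fuser(low_ct, low_mr)        # transfer-learned CNN branch
# high_f = gan_generator(high_ct, high_mr)  # trained GAN branch
# fused  = low_f + high_f                   # inverse of the split
```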
Online hashing methods are receiving increasing attention in cross-modal medical image retrieval research. However, existing online methods often lack the ability to maintain semantic correlation between new and existing data. To this end, we proposed an online semantic similarity cross-modal hashing (OSCMH) learning framework to incrementally learn compact binary hash codes for streaming medical data. Within it, a sparse representation of existing data over an online anchor set was designed to avoid semantic forgetting and to adaptively update hash codes, which effectively maintained semantic correlation between existing and arriving data, reduced information loss, and improved training efficiency. In addition, an online discrete optimization method was proposed to solve the binary optimization problem by incrementally updating the hash function and optimizing hash codes on the data stream. Compared with existing online and offline hashing methods, the proposed algorithm achieved average retrieval accuracy improvements of 12.5% and 14.3% on two datasets, respectively, effectively enhancing retrieval efficiency in the field of medical images.
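To illustrate the general shape of incremental (online) hashing, the NumPy sketch below refits a linear hash projection from accumulated sufficient statistics as each chunk arrives; this is a generic toy, not the OSCMH sparse-anchor representation or its discrete optimization.

```python
import numpy as np

class OnlineHasher:
    """Toy online hashing: keep X^T X and X^T B so the projection W can be
    refit in closed form as data streams in; codes are sign(XW)."""
    def __init__(self, dim, bits, lam=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.C = lam * np.eye(dim)          # regularized Gram accumulator
        self.D = np.zeros((dim, bits))      # cross-term accumulator
        self.W = 0.01 * rng.standard_normal((dim, bits))

    def update(self, X):                    # X: (n_chunk, dim)
        B = np.sign(X @ self.W)             # binary codes for new chunk
        B[B == 0] = 1
        self.C += X.T @ X
        self.D += X.T @ B
        self.W = np.linalg.solve(self.C, self.D)  # closed-form refit
        return B
```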
To address the loss of detailed information, blurred target boundaries, and unclear structural hierarchy in medical image fusion, this paper proposes an adaptive-feature medical image fusion network based on a full-scale diffusion model. First, a region-level feature map is generated using a kernel-based saliency map to enhance local features and boundary details. Then, a full-scale diffusion feature extraction network performs global feature extraction, alongside a multi-scale denoising U-shaped network designed to fully capture cross-layer information. A multi-scale feature integration module reinforces the texture details and structural information extracted by the encoder. Finally, an adaptive fusion scheme progressively fuses region-level features, global features, and source images layer by layer, improving the preservation of detail. To validate its effectiveness, the proposed model is evaluated on the publicly available Harvard dataset and an abdominal dataset. Compared with nine representative image fusion methods, the proposed approach achieves improvements across seven evaluation metrics. The results demonstrate that the proposed method effectively extracts both global and local features of medical images, enhances texture details and target boundary clarity, and generates fused images with high contrast and rich information, providing more reliable support for subsequent clinical diagnosis.
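The saliency-weighted part of the pipeline can be sketched as follows; the local-contrast kernel and the pixel-wise weighting are one plausible reading of "kernel-based saliency" and adaptive fusion, with all specifics assumed.

```python
import torch
import torch.nn.functional as F

def kernel_saliency(img, k=7):
    """Region-level saliency from local contrast: the deviation of each
    pixel from its neighbourhood mean, normalized per image."""
    mean = F.avg_pool2d(img, k, stride=1, padding=k // 2)
    sal = (img - mean).abs()
    return sal / (sal.amax(dim=(2, 3), keepdim=True) + 1e-8)

def fuse(a, b):
    """Adaptive pixel-wise fusion: the modality that is locally more
    salient contributes more to the fused result."""
    wa, wb = kernel_saliency(a), kernel_saliency(b)
    w = wa / (wa + wb + 1e-8)
    return w * a + (1 - w) * b
```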
To address the large number of parameters and substantial floating-point operations of deep learning networks for cardiac magnetic resonance imaging (MRI) segmentation, this paper proposes a lightweight dilated parallel convolution U-Net (DPU-Net) that reduces both. Additionally, a multi-scale adaptation vector knowledge distillation (MAVKD) training strategy extracts latent knowledge from a teacher network, thereby enhancing the segmentation accuracy of DPU-Net. The proposed network adopts a distinctive convolutional channel-scaling scheme to reduce the parameter count and combines residual blocks with dilated convolutions to alleviate the gradient explosion and spatial information loss that the reduction might cause. The findings indicate that the network achieves considerable reductions in parameters and floating-point operations. On the public automatic cardiac diagnosis challenge (ACDC) dataset, the Dice coefficient reaches 91.26%. These results validate the effectiveness of the proposed lightweight network and knowledge distillation strategy, offering a practical approach to lightweight deep learning for medical image segmentation.
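Two ingredients named above admit compact PyTorch sketches: a dilated parallel convolution block and a standard logit-distillation loss (Hinton-style). The branch rates, temperature, and mixing weight are illustrative, and the MAVKD adaptation vectors are not shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedParallelBlock(nn.Module):
    """Parallel 3x3 convs with different dilation rates, concatenated;
    one illustrative reading of 'dilated parallel convolution'.
    Assumes c_out is divisible by the number of branches."""
    def __init__(self, c_in, c_out, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(c_in, c_out // len(rates), 3, padding=r, dilation=r)
            for r in rates
        )

    def forward(self, x):
        return torch.cat([b(x) for b in self.branches], dim=1)

def distill_loss(student_logits, teacher_logits, target, T=4.0, alpha=0.5):
    """Hard-label loss plus temperature-softened KL to the teacher."""
    hard = F.cross_entropy(student_logits, target)
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T
    return alpha * hard + (1 - alpha) * soft
```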