Fatigue driving is one of the leading causes of traffic accidents, posing a significant threat to drivers and road safety. Most existing methods focus on studying whole-brain multi-channel electroencephalogram (EEG) signals, which involve a large number of channels, complex data processing, and cumbersome wearable devices. To address this issue, this paper proposes a fatigue detection method based on frontal EEG signals and constructs a fatigue driving detection model using an asymptotic hierarchical fusion network. The model employed a hierarchical fusion strategy, integrating an attention mechanism module into the multi-level convolutional module. By utilizing both cross-attention and self-attention mechanisms, it effectively fused the hierarchical semantic features of power spectral density (PSD) and differential entropy (DE), enhancing the learning of feature dependencies and interactions. Experimental validation was conducted on the public SEED-VIG dataset. The proposed model achieved an accuracy of 89.80% using only four frontal EEG channels. Comparative experiments with existing methods demonstrate that the proposed model achieves high accuracy and superior practicality, providing valuable technical support for fatigue driving monitoring and prevention.
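The two features fused by the model can be sketched as follows. Under the common assumption that a band-filtered EEG segment is approximately Gaussian, DE has the closed form 0.5·ln(2πeσ²); the sampling rate, band choice, and periodogram-based PSD estimate below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def band_features(x, fs, band):
    """PSD and differential entropy (DE) of one band-filtered EEG channel.

    Assumes x is already band-pass filtered and roughly Gaussian, so
    DE = 0.5 * ln(2*pi*e*sigma^2) (the closed form commonly used with
    SEED-VIG-style pipelines). The periodogram PSD estimate is an
    illustrative choice, not the paper's implementation.
    """
    n = len(x)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    periodogram = np.abs(np.fft.rfft(x)) ** 2 / (fs * n)
    in_band = (freqs >= band[0]) & (freqs < band[1])
    psd = periodogram[in_band].mean()                  # mean band power density
    de = 0.5 * np.log(2 * np.pi * np.e * x.var())      # differential entropy
    return psd, de

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)                    # stand-in for one frontal channel
psd, de = band_features(x, fs=200, band=(8, 13)) # alpha band
```

In the paper these per-band features from the four frontal channels feed the two fusion branches, where cross-attention relates the PSD and DE streams.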
Fetal electrocardiogram (ECG) signals provide important clinical information for early diagnosis and intervention of fetal abnormalities. In this paper, we propose a new method for fetal ECG signal extraction and analysis. Firstly, an improved fast independent component analysis method and a singular value decomposition algorithm are combined to extract high-quality fetal ECG signals and solve the waveform-missing problem. Secondly, a novel convolutional neural network model is applied to identify the QRS complexes of fetal ECG signals and effectively solve the waveform-overlap problem. Finally, high-quality extraction of fetal ECG signals and intelligent recognition of fetal QRS complexes are achieved. The proposed method was validated with data from the PhysioNet/Computing in Cardiology Challenge 2013 database. The results show that the average sensitivity and positive predictive value of the extraction algorithm are 98.21% and 99.52%, respectively, and those of the QRS complex recognition algorithm are 94.14% and 95.80%, respectively, outperforming previously reported results. In conclusion, the algorithm and model proposed in this paper are of practical significance and may provide a theoretical basis for clinical medical decision making in the future.
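The role of the decomposition stage can be illustrated with a toy example: a strong repeating maternal waveform dominates the low-rank structure of a stack of aligned abdominal segments, so a truncated SVD isolates it and the residual retains the weak fetal component. This is a simplified stand-in for the FastICA + SVD stage, under synthetic signals, not the paper's actual implementation.

```python
import numpy as np

def svd_template_cancel(beats, rank=1):
    """Remove the dominant repeating (maternal) waveform from a stack of
    time-aligned abdominal ECG segments via truncated SVD, leaving the
    low-energy residual containing the fetal component. Simplified
    stand-in for the paper's FastICA + SVD extraction stage.

    beats : (n_beats, beat_len) array of aligned segments.
    """
    u, s, vt = np.linalg.svd(beats, full_matrices=False)
    template = (u[:, :rank] * s[:rank]) @ vt[:rank]   # rank-r maternal part
    return beats - template                           # residual: fetal + noise

rng = np.random.default_rng(1)
t = np.linspace(0, 1, 200)
maternal = np.sin(2 * np.pi * 3 * t)                  # strong repeating waveform
beats = np.stack([
    maternal
    + 0.2 * np.sin(2 * np.pi * 7 * t + 0.8 * i)       # weak "fetal" part, drifting phase
    + 0.05 * rng.standard_normal(200)
    for i in range(20)
])
residual = svd_template_cancel(beats)
```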
To address the problems that the imbalanced class distribution of sleep electroencephalogram (EEG) data and the poor comfort of polysomnographic data acquisition reduce a model's classification ability, this paper proposed a sleep state recognition method for single-channel EEG signals (WKCNN-LSTM) based on a one-dimensional wide-kernel convolutional neural network (WKCNN) and a long short-term memory (LSTM) network. Firstly, wavelet denoising and the synthetic minority over-sampling technique-Tomek link (SMOTE-Tomek) algorithm were used to preprocess the original sleep EEG signals. Secondly, the one-dimensional sleep EEG signals were used as the input of the model, and the WKCNN was used to extract frequency-domain features and suppress high-frequency noise. Then, the LSTM layer was used to learn the time-domain features. Finally, a normalized exponential (softmax) function was applied at the fully connected layer to realize sleep state classification. The experimental results showed that the classification accuracy of the one-dimensional WKCNN-LSTM model was 91.80%, which was better than that of similar studies in recent years, and the model had good generalization ability. This study improved the classification accuracy of single-channel sleep EEG signals, which can be easily utilized in portable sleep monitoring devices.
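The oversampling half of SMOTE-Tomek can be sketched as below: each synthetic minority sample is an interpolation between a real sample and one of its nearest minority neighbours. This is the standard SMOTE formulation (assumed here, since the abstract does not detail it), with the Tomek-link cleaning of the majority class omitted for brevity.

```python
import numpy as np

def smote(minority, n_new, k=5, rng=None):
    """Minimal SMOTE: synthesize n_new minority samples by interpolating
    between each sample and a random one of its k nearest minority
    neighbours. Illustrative sketch of the SMOTE half of SMOTE-Tomek."""
    rng = rng or np.random.default_rng(0)
    d = np.linalg.norm(minority[:, None] - minority[None], axis=2)
    np.fill_diagonal(d, np.inf)                 # a sample is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]           # k nearest neighbours per sample
    out = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        j = nn[i, rng.integers(k)]
        gap = rng.random()                      # position along the segment
        out.append(minority[i] + gap * (minority[j] - minority[i]))
    return np.array(out)

rng = np.random.default_rng(2)
n1_class = rng.standard_normal((10, 4)) + 3     # scarce sleep stage (e.g., N1)
synth = smote(n1_class, n_new=30, rng=rng)
```

Because every synthetic sample lies on a segment between two real ones, the augmented class stays inside the original feature range.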
The count and recognition of white blood cells in blood smear images play an important role in the diagnosis of blood diseases, including leukemia. Traditional manual test results are easily disturbed by many factors, so it is necessary to develop an automatic leukocyte analysis system to provide doctors with auxiliary diagnosis, and blood leukocyte segmentation is the basis of such automatic analysis. In this paper, we improved the U-Net model and proposed a leukocyte image segmentation algorithm based on a dual path network and atrous spatial pyramid pooling. Firstly, the dual path network was introduced into the feature encoder to extract multi-scale leukocyte features, and atrous spatial pyramid pooling was used to enhance the feature extraction ability of the network. Then a feature decoder composed of convolution and deconvolution was used to restore the segmented target to the original image size, realizing pixel-level segmentation of blood leukocytes. Finally, qualitative and quantitative experiments were carried out on three leukocyte datasets to verify the effectiveness of the algorithm. The results showed that, compared with other representative algorithms, the proposed blood leukocyte segmentation algorithm achieved better segmentation results, with a mean intersection-over-union (mIoU) above 0.97. It is hoped that this method will facilitate the automatic auxiliary diagnosis of blood diseases in the future.
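The idea behind atrous spatial pyramid pooling is to apply the same small kernel at several dilation rates in parallel, so one layer sees multiple receptive-field sizes at once. The 1-D toy below shows the structure (parallel dilated branches plus an image-level pooling branch); the real module operates on 2-D leukocyte feature maps with learned kernels, which is an assumption simplified away here.

```python
import numpy as np

def dilated_conv1d(x, w, rate):
    """'Same'-padded 1-D convolution with a 3-tap kernel and dilation `rate`."""
    pad = rate
    xp = np.pad(x, pad)
    return sum(w[k] * xp[pad + (k - 1) * rate: pad + (k - 1) * rate + len(x)]
               for k in range(3))

def aspp_1d(x, rates=(1, 2, 4)):
    """Toy atrous spatial pyramid pooling: one fixed 3-tap kernel applied at
    several dilation rates, plus a global-average branch, stacked as
    parallel feature maps. A 1-D sketch of the module, not the paper's
    2-D implementation."""
    w = np.array([1.0, 2.0, 1.0]) / 4.0                # fixed smoothing kernel
    branches = [dilated_conv1d(x, w, r) for r in rates]
    branches.append(np.full_like(x, x.mean()))         # image-level pooling branch
    return np.stack(branches)                          # (n_branches, len(x))

x = np.sin(np.linspace(0, 3, 32))
feats = aspp_1d(x)
```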
The processing mechanism of the human brain for speech information is a significant source of inspiration for the study of speech enhancement technology. Attention and lateral inhibition are key mechanisms in auditory information processing that can selectively enhance specific information. Building on this, the study introduces a dual-branch U-Net that integrates lateral inhibition and feedback-driven attention mechanisms. Noisy speech input to the first U-Net branch triggered selective feedback of high-confidence time-frequency units. The resulting activation-layer gradients, combined with the lateral inhibition mechanism, were used to compute attention maps. These maps were then concatenated into the second U-Net branch, directing the network's focus and achieving selective enhancement of the speech signal. The enhancement effect was evaluated with five metrics, including the perceptual evaluation of speech quality (PESQ). The method was compared with five other methods: Wiener filtering, SEGAN, PHASEN, Demucs and GRN. The experimental results demonstrated that the proposed method improved speech enhancement capability in various noise scenarios by 18% to 21% over the baseline network across multiple performance metrics. The improvement was particularly notable in low signal-to-noise ratio conditions, where the proposed method exhibited a significant performance advantage over the other methods. The speech enhancement technique based on lateral inhibition and feedback-driven attention mechanisms holds significant potential in auditory speech enhancement, making it suitable for clinical practices related to cochlear implants and hearing aids.
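Lateral inhibition itself is simple to model: each unit is suppressed in proportion to its neighbours' activity, which sharpens peaks and suppresses flat regions. The minimal sketch below applies it to a toy spectrum; the paper uses the mechanism inside attention-map computation on time-frequency units, and the neighbourhood and inhibition strength here are illustrative assumptions.

```python
import numpy as np

def lateral_inhibition(x, alpha=0.5):
    """Each unit is suppressed by the mean of its two neighbours, then
    rectified -- a minimal model of lateral inhibition that sharpens
    spectral peaks (illustrative, not the paper's exact formulation)."""
    neigh = (np.roll(x, 1) + np.roll(x, -1)) / 2   # circular neighbourhood
    return np.clip(x - alpha * neigh, 0, None)

spectrum = np.array([0.1, 0.1, 1.0, 0.1, 0.1, 0.1])
sharp = lateral_inhibition(spectrum)
```

The peak survives almost unchanged while its neighbours, inhibited by the strong centre, are driven to zero, increasing local contrast.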
Colorectal cancer (CRC) is a common malignant tumor that seriously threatens human health, and its indistinct boundaries make accurate identification a formidable challenge. With the widespread adoption of convolutional neural networks (CNNs) in image processing, leveraging CNNs for automatic classification and segmentation holds immense potential for enhancing the efficiency of CRC recognition and reducing treatment costs. This paper discusses the necessity of applying CNNs in the clinical diagnosis of CRC and provides a detailed overview of research advances in CNNs and their improved models for CRC classification and segmentation. Furthermore, it summarizes the ideas and common methods for optimizing network performance and discusses the challenges faced by CNNs as well as future development trends in their application to CRC classification and segmentation, thereby promoting their utilization in clinical diagnosis.
Breast cancer is a malignancy caused by the abnormal proliferation of breast epithelial cells, predominantly affecting female patients, and it is commonly diagnosed using histopathological images. Currently, deep learning techniques have made significant breakthroughs in medical image processing, outperforming traditional detection methods in breast cancer pathology classification tasks. This paper first reviewed the advances in applying deep learning to breast pathology images, focusing on three key areas: multi-scale feature extraction, cellular feature analysis, and classification. Next, it summarized the advantages of multimodal data fusion methods for breast pathology images. Finally, the study discussed the challenges and future prospects of deep learning in breast cancer pathology image diagnosis, providing important guidance for advancing the use of deep learning in breast diagnosis.
Manual segmentation of coronary arteries in computed tomography angiography (CTA) images is inefficient, and existing deep learning segmentation models often exhibit low accuracy on coronary artery images. Inspired by the Transformer architecture, this paper proposes a novel segmentation model, the double parallel encoder u-net with transformers (DUNETR). This network employed a dual-encoder design integrating Transformers and convolutional neural networks (CNNs). The Transformer encoder transformed three-dimensional (3D) coronary artery data into a one-dimensional (1D) sequential problem, effectively capturing global multi-scale feature information. Meanwhile, the CNN encoder extracted local features of the 3D coronary arteries. The complementary features extracted by the two encoders were fused through the noise reduction feature fusion (NRFF) module and passed to the decoder. Experimental results on a public dataset demonstrated that the proposed DUNETR model achieved a Dice similarity coefficient of 81.19% and a recall rate of 80.18%, representing improvements of 0.49% and 0.46%, respectively, over the next best model in comparative experiments. These results surpassed those of other conventional deep learning methods. The integration of Transformers and CNNs as dual encoders enables the extraction of rich feature information, significantly enhancing the effectiveness of 3D coronary artery segmentation. Additionally, this model provides a novel approach for segmenting other vascular structures.
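The step by which a Transformer encoder turns 3D volumetric data into a 1D sequence is patch embedding: the volume is cut into non-overlapping cubes and each cube is flattened into one token. The sketch below shows that reshaping; the patch size and absence of a learned linear projection are illustrative assumptions, not the DUNETR configuration.

```python
import numpy as np

def patchify_3d(volume, p=4):
    """Flatten a 3-D volume into a 1-D sequence of non-overlapping p*p*p
    patches -- the reshaping by which a Transformer encoder treats 3-D
    CTA data as a sequence problem. Patch size is an illustrative choice."""
    d, h, w = volume.shape
    assert d % p == 0 and h % p == 0 and w % p == 0
    v = volume.reshape(d // p, p, h // p, p, w // p, p)
    v = v.transpose(0, 2, 4, 1, 3, 5)       # group the three patch-index axes
    return v.reshape(-1, p ** 3)            # (n_patches, voxels_per_patch)

vol = np.arange(16 ** 3, dtype=float).reshape(16, 16, 16)
tokens = patchify_3d(vol)                   # 4*4*4 = 64 tokens of length 64
```

Each token would then be linearly projected and fed through self-attention, giving every patch a view of the whole vessel tree, while the CNN encoder keeps local detail.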
Objective To establish a machine learning-based risk prediction model for chronic obstructive pulmonary disease (COPD) combined with lung cancer, so as to explore the high-risk factors for lung cancer in COPD patients and to lay the foundation for early detection of lung cancer risk in COPD patients. Methods A total of 154 patients from the Second Hospital of Dalian Medical University from 2010 to 2021 were retrospectively analyzed, including 99 patients in the COPD group and 55 patients in the COPD with lung cancer group. Chest high-resolution computed tomography (HRCT) scans and pulmonary function tests were acquired for each patient. The main analyses were as follows: (1) to test the statistical differences in basic information (such as age, body mass index and smoking index), laboratory test results, pulmonary function parameters and quantitative chest HRCT parameters between the two groups; (2) to analyze the high-risk factors for lung cancer in COPD patients using univariate and binary logistic regression (LR) methods; and (3) to establish machine learning models (such as LR and Gaussian process) for COPD patients with lung cancer. Results Based on the statistical analysis and LR methods, decreased body mass index, increased whole-lung emphysema index, increased whole-lung mean density, increased forced vital capacity percentage and increased prothrombin time percentage activity were risk factors for lung cancer in COPD patients. For the machine learning prediction models, the area under the receiver operating characteristic curve of both LR and Gaussian process reached 0.88 when prothrombin time percentage activity, whole-lung emphysema index, whole-lung mean density, and forced vital capacity were combined with neuron-specific enolase and soluble fragments of cytokeratin 19 as features.
Conclusion The prediction model of COPD with lung cancer patients using a machine learning approach can be used for early detection of lung cancer risk in COPD patients.
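The LR modelling step can be sketched as a plain gradient-descent logistic regression. The data below are synthetic and the four "risk factors" are placeholders for the study's clinical features (emphysema index, prothrombin time activity, etc.), so this is a sketch of the method, not a reproduction of the study.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Plain gradient-descent logistic regression, standing in for the LR
    risk model described above. Synthetic data; illustrative only."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))       # predicted risk probabilities
        w -= lr * Xb.T @ (p - y) / len(y)       # average log-loss gradient
    return w

def predict_proba(X, w):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

rng = np.random.default_rng(3)
X = rng.standard_normal((200, 4))               # 4 synthetic "risk factors"
y = (X[:, 0] - X[:, 1] + 0.1 * rng.standard_normal(200) > 0).astype(float)
w = fit_logistic(X, y)
train_acc = ((predict_proba(X, w) > 0.5) == y).mean()
```

In the study, thresholding the predicted probability (or ranking it, for the ROC curve) yields the reported area under the curve of 0.88.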
Effective classification of multi-task motor imagery electroencephalogram (EEG) signals helps achieve accurate multi-dimensional human-computer interaction, and exploiting the strong inter-subject specificity in the frequency domain can improve classification accuracy and robustness. Therefore, this paper proposed a multi-task EEG signal classification method based on an adaptive time-frequency common spatial pattern (CSP) combined with a convolutional neural network (CNN). The subjects' personalized rhythm characteristics were extracted by adaptive spectrum awareness, the spatial characteristics were calculated by the one-versus-rest CSP, and composite time-domain characteristics were then added to construct spatial-temporal-frequency multi-level fusion features. Finally, the CNN was used to perform high-accuracy and highly robust four-task classification. The proposed algorithm was verified on a self-collected dataset containing 10 subjects (33 ± 3 years old, inexperienced) and on dataset 2a of the 4th Brain-Computer Interface Competition (BCI Competition IV-2a). The average accuracy of the proposed algorithm for the four-task classification reached 93.96% and 84.04%, respectively. Compared with other advanced algorithms, the average classification accuracy of the proposed algorithm was significantly improved, and the inter-subject accuracy range error was significantly reduced on the public dataset. The results show that the proposed algorithm performs well in multi-task classification and can effectively improve classification accuracy and robustness.
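The one-versus-rest CSP step can be sketched from per-class covariance matrices: whiten the composite covariance, then eigendecompose the target-class part, so the leading filters maximize the target class's variance against the pooled remaining classes. The covariances below are synthetic and the adaptive spectrum-awareness stage is omitted, so this is a sketch of the CSP stage only.

```python
import numpy as np

def csp_ovr(covs, target, n_filters=2):
    """One-versus-rest CSP: spatial filters maximizing the target class's
    variance relative to the pooled remaining classes, via whitening of
    the composite covariance. Minimal sketch, not the paper's pipeline.

    covs : per-class spatial covariance matrices (channels x channels).
    """
    Ca = covs[target]
    Cr = sum(c for i, c in enumerate(covs) if i != target) / (len(covs) - 1)
    d, U = np.linalg.eigh(Ca + Cr)
    P = U / np.sqrt(d)                     # whitening: P.T @ (Ca+Cr) @ P = I
    d2, V = np.linalg.eigh(P.T @ Ca @ P)
    W = P @ V                              # filters ordered by target variance
    return W[:, ::-1][:, :n_filters]       # keep the most discriminative ones

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4))
base = A @ A.T + 4 * np.eye(4)                 # shared background covariance
boost = np.zeros((4, 4)); boost[0, 0] = 10.0   # class 0: extra power in channel 0
covs = [base + boost, base.copy(), base.copy(), base.copy()]
W = csp_ovr(covs, target=0)
```

The log-variances of trials projected through such filters (one filter bank per class) would then form the spatial part of the fusion features fed to the CNN.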