Breast tumor segmentation in ultrasound images is fundamental for quantitative analysis and plays a crucial role in the diagnosis and treatment of breast cancer. Existing methods have mainly focused on spatial-domain implementations, with less attention paid to the frequency domain. In this paper, we propose a Multi-frequency and Multi-scale Interactive CNN-Transformer Hybrid Network (MFMSNet). Specifically, we use Octave convolutions instead of conventional convolutions to effectively separate high-frequency and low-frequency components while reducing computational complexity. A Multi-frequency Transformer block (MF-Trans) enables efficient interaction between high-frequency and low-frequency information, thereby capturing long-range dependencies. Additionally, we incorporate a Multi-scale Interactive Fusion module (MSIF) to merge high-frequency feature maps of different sizes, sharpening the emphasis on tumor edges by integrating local contextual information. Experimental results demonstrate the superiority of MFMSNet over seven state-of-the-art methods on two publicly available breast ultrasound datasets and one thyroid ultrasound dataset. MFMSNet was evaluated on the BUSI, BUI, and DDTI datasets, whose test sets comprise 130, 47, and 128 images, respectively. Using five-fold cross-validation, the obtained Dice coefficients are 83.42% (BUSI), 90.79% (BUI), and 79.96% (DDTI). The code is available at https://github.com/wrc990616/MFMSNet.
DOI: http://dx.doi.org/10.1016/j.compbiomed.2024.108616
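For readers unfamiliar with the frequency split this abstract relies on, below is a minimal PyTorch sketch of an Octave convolution in the spirit of the original "Drop an Octave" formulation. The channel split ratio `alpha` and layer sizes are illustrative assumptions, not MFMSNet's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class OctaveConv(nn.Module):
    """Convolution over a high-frequency (full-resolution) and a
    low-frequency (half-resolution) feature branch."""
    def __init__(self, in_ch, out_ch, kernel_size=3, alpha=0.5):
        super().__init__()
        lo_in, lo_out = int(alpha * in_ch), int(alpha * out_ch)
        hi_in, hi_out = in_ch - lo_in, out_ch - lo_out
        pad = kernel_size // 2
        # Four paths: high->high, high->low, low->high, low->low.
        self.hh = nn.Conv2d(hi_in, hi_out, kernel_size, padding=pad)
        self.hl = nn.Conv2d(hi_in, lo_out, kernel_size, padding=pad)
        self.lh = nn.Conv2d(lo_in, hi_out, kernel_size, padding=pad)
        self.ll = nn.Conv2d(lo_in, lo_out, kernel_size, padding=pad)

    def forward(self, x_hi, x_lo):
        # x_hi: (B, hi_in, H, W); x_lo: (B, lo_in, H/2, W/2).
        y_hi = self.hh(x_hi) + F.interpolate(self.lh(x_lo),
                                             scale_factor=2, mode="nearest")
        y_lo = self.ll(x_lo) + self.hl(F.avg_pool2d(x_hi, 2))
        return y_hi, y_lo
```

Because roughly `alpha` of the channels are convolved at half resolution (a quarter of the spatial positions), this layer is cheaper than a plain convolution of the same width, which is the computational saving the abstract mentions.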
Med Image Anal
January 2025
Nuffield Department of Medicine, University of Oxford, Oxford, UK; Department of Engineering Science, University of Oxford, Oxford, UK; Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK; Ludwig Institute for Cancer Research, Nuffield Department of Clinical Medicine, University of Oxford, Oxford, UK; Oxford National Institute for Health Research (NIHR) Biomedical Research Centre, Oxford, UK.
Predicting disease-related molecular traits from histomorphology offers great opportunities for precision medicine. Despite the rich information present in histopathological images, extracting fine-grained molecular features from standard whole slide images (WSIs) is non-trivial. The task is further complicated by the lack of annotations for subtyping and by contextual histomorphological features that may span multiple scales.
Sensors (Basel)
January 2025
Faculty of Applied Sciences, Macao Polytechnic University, Macao SAR 999078, China.
Visible-infrared person re-identification (VI-ReID) is a challenging cross-modality retrieval task that matches a person across camera views in different spectra. Most existing works focus on learning shared feature representations from the final embedding space of advanced networks to alleviate modality differences between visible and infrared images. However, relying exclusively on high-level semantic information from a network's final layers can restrict shared feature representations and overlook the benefits of low-level details.
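As a concrete illustration of the argument above, here is a minimal PyTorch sketch that augments a final embedding with pooled low-level features from an early backbone stage. The ResNet-50 stages and the concatenation-based fusion are assumptions for illustration, not this paper's actual design.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class MultiLevelEmbed(nn.Module):
    """Embeds an image using both early (low-level) and final
    (high-level) backbone features rather than the last layer alone."""
    def __init__(self, dim=512):
        super().__init__()
        r = resnet50(weights=None)
        self.stem = nn.Sequential(r.conv1, r.bn1, r.relu, r.maxpool)
        self.layer1, self.layer2 = r.layer1, r.layer2  # low-level stages
        self.layer3, self.layer4 = r.layer3, r.layer4  # high-level stages
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Project pooled low-level (256-d) and high-level (2048-d)
        # descriptors into one shared embedding space.
        self.fc = nn.Linear(256 + 2048, dim)

    def forward(self, x):
        low = self.layer1(self.stem(x))     # fine textures and edges
        high = self.layer4(self.layer3(self.layer2(low)))
        feat = torch.cat([self.pool(low).flatten(1),
                          self.pool(high).flatten(1)], dim=1)
        return self.fc(feat)                # shared embedding for retrieval
```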
Sensors (Basel)
December 2024
School of Computer Science, Xi'an Polytechnic University, Xi'an 710600, China.
Interacting-hand reconstruction offers significant opportunities in various applications. However, it currently faces challenges such as difficulty distinguishing the features of the two hands, misalignment of hand meshes with input images, and the complexity of modeling the spatial relationships between interacting hands. In this paper, we propose a multilevel feature fusion interactive network for hand reconstruction (HandFI).
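This excerpt does not describe HandFI's internals, so purely as an assumed illustration, the sketch below shows one common way to model relationships between two hands' feature streams: bidirectional cross-attention, where each hand's tokens query the other's.

```python
import torch
import torch.nn as nn

class CrossHandAttention(nn.Module):
    """Lets two hand-feature streams exchange information so that
    occlusion and contact cues flow between them."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        # One attention module shared across both directions for brevity.
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, left, right):
        # left, right: (B, N, dim) token features for each hand.
        l2r, _ = self.attn(query=left, key=right, value=right)
        r2l, _ = self.attn(query=right, key=left, value=left)
        # Residual connection plus normalization on each stream.
        return self.norm(left + l2r), self.norm(right + r2l)
```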
Sensors (Basel)
December 2024
Institute of Computer and Communication Engineering, Department of Electrical Engineering, National Cheng Kung University, Tainan 701, Taiwan.
Precise depth estimation plays a key role in many applications, including 3D scene reconstruction, virtual reality, autonomous driving, and human-computer interaction. With recent advances in deep learning, monocular depth estimation, owing to its simplicity, has surpassed traditional stereo camera systems, opening new possibilities in 3D sensing. In this paper, using a single camera, we propose an end-to-end supervised monocular depth estimation autoencoder, which comprises an encoder that mixes a convolutional neural network with vision transformers and an effective adaptive fusion decoder that produces high-precision depth maps.
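The abstract does not give the encoder's exact layout, so the following is a hedged PyTorch sketch of the general pattern it names, a convolutional stem feeding transformer layers; all channel widths, depths, and the 16x patch stride are assumptions.

```python
import torch
import torch.nn as nn

class HybridDepthEncoder(nn.Module):
    """CNN stem for local detail, transformer layers for the global
    context that monocular depth cues depend on."""
    def __init__(self, dim=256, depth=4, heads=8):
        super().__init__()
        # CNN stem: 16x downsampling to patch-like tokens.
        self.stem = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1),
        )
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, depth)

    def forward(self, x):
        f = self.stem(x)                       # (B, dim, H/16, W/16)
        b, c, h, w = f.shape
        tokens = f.flatten(2).transpose(1, 2)  # (B, h*w, dim) tokens
        # Positional encodings omitted for brevity in this sketch.
        tokens = self.transformer(tokens)
        return tokens.transpose(1, 2).reshape(b, c, h, w)
```

A decoder (not shown) would upsample and fuse these features back to a full-resolution depth map.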
Sensors (Basel)
December 2024
College of Power Engineering, Naval University of Engineering, Wuhan 430033, China.
Arbitrary-oriented ship detection is challenging because remote sensing images combine high resolution, poor imaging clarity, and large size differences between targets. Most existing ship detection methods struggle to meet the requirements of high accuracy and high speed simultaneously. We therefore designed a lightweight and efficient multi-scale feature dilated neck module within the YOLO11 network to achieve high-precision detection of arbitrarily oriented ships in remote sensing images.
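The neck design inside YOLO11 is not detailed in this excerpt; below is a generic sketch of a multi-scale dilated block of the kind the abstract names, with assumed channel counts and dilation rates.

```python
import torch
import torch.nn as nn

class MultiScaleDilatedNeck(nn.Module):
    """Parallel dilated 3x3 branches widen the receptive field, covering
    ships of very different sizes, without extra downsampling; a 1x1
    convolution fuses the branches back to `ch` channels."""
    def __init__(self, ch=256, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(ch, ch, 3, padding=d, dilation=d) for d in dilations
        )
        self.fuse = nn.Conv2d(ch * len(dilations), ch, 1)

    def forward(self, x):
        # Each branch keeps the spatial size (padding == dilation for k=3).
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
```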