Multi-modal image registration aims to spatially align two images from different modalities so that their feature points correspond. Captured by different sensors, images from different modalities often contain many distinct features, which makes it challenging to find accurate correspondences between them. With the success of deep learning, many deep networks have been proposed to align multi-modal images; however, most of them lack interpretability. In this paper, we first model the multi-modal image registration problem as a disentangled convolutional sparse coding (DCSC) model. In this model, the multi-modal features that are responsible for alignment (RA features) are well separated from the features that are not responsible for alignment (nRA features). By allowing only the RA features to participate in deformation field prediction, we eliminate the interference of the nRA features and improve registration accuracy and efficiency. The optimization process of the DCSC model that separates the RA and nRA features is then unfolded into a deep network, namely the Interpretable Multi-modal Image Registration Network (InMIR-Net). To ensure accurate separation of RA and nRA features, we further design an accompanying guidance network (AG-Net) to supervise the extraction of RA features in InMIR-Net. The advantage of InMIR-Net is that it provides a universal framework for tackling both rigid and non-rigid multi-modal image registration tasks. Extensive experimental results verify the effectiveness of our method on both rigid and non-rigid registration across various multi-modal image datasets, including RGB/depth images, RGB/near-infrared (NIR) images, RGB/multi-spectral images, T1/T2-weighted magnetic resonance (MR) images and computed tomography (CT)/MR images. Code is available at https://github.com/lep990816/Interpretable-Multi-modal-Image-Registration.
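To make the sparse-coding backbone of the abstract concrete: convolutional sparse coding represents an image (here a 1D signal, for brevity) as a sum of filters convolved with sparse feature maps, typically optimized by an ISTA-style iteration of gradient steps and soft-thresholding. The sketch below is a generic toy of that iteration, not the paper's DCSC model; the function name, step sizes, and the use of two filters are all illustrative assumptions (the paper additionally splits the learned features into RA and nRA groups, which is omitted here).

```python
import numpy as np

def ista_csc(signal, filters, n_iter=200, lam=0.05, step=0.1):
    """Toy convolutional sparse coding via ISTA (1D, illustrative only).

    Model: signal ~ sum_k conv(maps[k], filters[k]), with an L1 penalty
    lam * sum_k |maps[k]| encouraging sparse feature maps.
    """
    maps = [np.zeros_like(signal) for _ in filters]
    for _ in range(n_iter):
        recon = sum(np.convolve(m, f, mode="same") for m, f in zip(maps, filters))
        resid = signal - recon
        new_maps = []
        for m, f in zip(maps, filters):
            # gradient step: correlate the residual with the filter
            # (correlation = convolution with the flipped filter)
            grad = np.convolve(resid, f[::-1], mode="same")
            z = m + step * grad
            # soft-thresholding: proximal operator of the L1 penalty
            new_maps.append(np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0))
        maps = new_maps
    return maps
```

Unrolling a fixed number of such iterations, with the filters and thresholds made learnable, is the standard recipe for turning this kind of optimization into an interpretable deep network.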

Source: http://dx.doi.org/10.1109/TIP.2023.3240024


Similar Publications

Multi-channel spatio-temporal graph attention contrastive network for brain disease diagnosis.

Neuroimage

January 2025

College of Artificial Intelligence, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China; Key Laboratory of Brain-Machine Intelligence Technology, Ministry of Education, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China.

Dynamic brain networks (DBNs) can capture the intricate connections and temporal evolution among brain regions, making them increasingly crucial in the diagnosis of neurological disorders. However, most existing studies focus on isolated brain network sequences segmented by sliding windows and struggle to uncover the higher-order spatio-temporal topological patterns in DBNs. Meanwhile, exploiting the structural connectivity prior in DBN analysis remains a challenge.
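The sliding-window construction this abstract refers to is commonly implemented as a sequence of correlation matrices over temporal windows of a regions-by-time signal. A minimal sketch, assuming Pearson correlation as the connectivity measure; the function name and the window/stride values are hypothetical, not taken from the paper:

```python
import numpy as np

def sliding_window_networks(ts, win=30, stride=5):
    """Build a toy dynamic brain network sequence.

    ts: (T, R) array of T time points for R brain regions.
    Returns (n_windows, R, R) stack of Pearson correlation matrices,
    one functional network per temporal window.
    """
    T, R = ts.shape
    nets = []
    for start in range(0, T - win + 1, stride):
        seg = ts[start:start + win]                   # one temporal window
        nets.append(np.corrcoef(seg, rowvar=False))   # R x R network
    return np.stack(nets)
```

Each matrix in the returned stack is one snapshot of the network; the higher-order spatio-temporal patterns the paper targets live in how these snapshots relate across windows.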


Atrial fibrillation (AF) is the most prevalent clinical arrhythmia, posing significant mortality and morbidity challenges. Outcomes of current catheter ablation treatment strategies are suboptimal, highlighting the need for innovative approaches. A major obstacle lies in the inability to comprehensively assess both structural and functional remodelling in AF.


Complex experimental protocols often require multi-modal data acquisition with precisely aligned timing, as well as state- and behavior-dependent interventions. Tailored solutions are mostly restricted to individual experimental setups and lack flexibility and interoperability. We present an open-source, Linux-based integrated software solution, called 'Syntalos', for simultaneous acquisition and synchronization of data from an arbitrary number of sources, including multi-channel electrophysiological recordings and different live imaging devices, as well as closed-loop, real-time interventions with different actuators.


GMmorph: dynamic spatial matching registration model for 3D medical image based on gated Mamba.

Phys Med Biol

January 2025

School of Software Engineering, Xi'an Jiaotong University, Innovation Port, Xi'an, Shaanxi, 710049, China.

Deformable registration aims to achieve nonlinear alignment of image space by estimating a dense displacement field. It is commonly used as a preprocessing step in clinical and image analysis applications, such as surgical planning, diagnostic assistance, and surgical navigation. This work aims to overcome two challenges: deep learning-based registration methods often struggle with complex displacements, and they lack effective interaction between global and local feature information.
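The "dense displacement field" mentioned above is simply a per-pixel offset map; applying it to an image is a resampling step. A minimal 2D sketch using linear interpolation; this is a generic illustration of warping with a predicted displacement field, not GMmorph's implementation, and the function name is an assumption:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp(image, disp):
    """Warp a 2D image with a dense displacement field.

    disp has shape (2, H, W): per-pixel (row, col) offsets, as a
    deformable registration network would predict. Each output pixel
    (r, c) samples the input at (r + disp[0], c + disp[1]), with
    linear interpolation and edge clamping.
    """
    H, W = image.shape
    rows, cols = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    coords = np.stack([rows + disp[0], cols + disp[1]])
    return map_coordinates(image, coords, order=1, mode="nearest")
```

A registration network predicts `disp` so that the warped moving image matches the fixed image; a zero field leaves the image unchanged.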


The sharing of multimodal magnetic resonance imaging (MRI) data is of great importance in the field, as it enables a deeper understanding of facial nerve-related pathologies. However, there is a significant lack of multi-modal neuroimaging databases focused specifically on these conditions, which hampers a comprehensive understanding of the neural foundations of facial paralysis. To address this critical gap and propel advancements in the area, we have released the Multimodal Neuroimaging Dataset of Meige Syndrome, Facial Paralysis, and Healthy Controls (MND-MFHC).

