The recognition of surgical triplets plays a critical role in the practical application of surgical videos. It involves the sub-tasks of recognizing instruments, verbs, and targets, while establishing precise associations between them. Existing methods face two significant challenges in triplet recognition: 1) the imbalanced class distribution of surgical triplets may lead to spurious task association learning, and 2) the feature extractors cannot reconcile local and global context modeling. To overcome these challenges, this paper presents a novel multi-teacher knowledge distillation framework for multi-task triplet learning, known as MT4MTL-KD. MT4MTL-KD leverages teacher models trained on less imbalanced sub-tasks to assist multi-task student learning for triplet recognition. Moreover, we adopt different categories of backbones for the teacher and student models, facilitating the integration of local and global context modeling. To further align the semantic knowledge between the triplet task and its sub-tasks, we propose a novel feature attention module (FAM). This module utilizes attention mechanisms to assign multi-task features to specific sub-tasks. We evaluate the performance of MT4MTL-KD on both the 5-fold cross-validation and the CholecTriplet challenge splits of the CholecT45 dataset. The experimental results consistently demonstrate the superiority of our framework over state-of-the-art methods, achieving significant improvements of up to 6.4% on the cross-validation split.
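The idea of sub-task teachers guiding a multi-task student can be illustrated with a generic distillation objective. The sketch below is not the loss used in MT4MTL-KD, which the abstract does not specify. It assumes multi-label sub-tasks (instrument, verb, target), one hypothetical teacher per sub-task, sigmoid outputs, and a hypothetical mixing weight `alpha` between ground-truth supervision and teacher soft targets.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def soft_bce(logits, targets):
    """Mean binary cross-entropy between sigmoid(logits) and targets in [0, 1]."""
    eps = 1e-7  # clip probabilities to avoid log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = min(max(sigmoid(z), eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(logits)


def distillation_loss(student_logits, teacher_logits, labels, alpha=0.5):
    """Sum over sub-tasks of a hard-label term and a teacher soft-label term.

    All three arguments are dicts keyed by sub-task name (assumed layout).
    alpha is an assumed weighting, not a value taken from the paper.
    """
    loss = 0.0
    for task in student_logits:
        hard = soft_bce(student_logits[task], labels[task])
        soft_targets = [sigmoid(z) for z in teacher_logits[task]]
        soft = soft_bce(student_logits[task], soft_targets)
        loss += (1.0 - alpha) * hard + alpha * soft
    return loss


# Toy example: two classes per sub-task, made-up logits and labels.
student = {"instrument": [2.0, -1.0], "verb": [0.5, -0.5], "target": [-2.0, 1.0]}
teacher = {"instrument": [3.0, -2.0], "verb": [1.5, -1.0], "target": [-1.0, 2.0]}
labels = {"instrument": [1.0, 0.0], "verb": [1.0, 0.0], "target": [0.0, 1.0]}
print(distillation_loss(student, teacher, labels, alpha=0.5))
```

Setting `alpha=0.0` recovers plain supervised training on the ground-truth labels, which makes the contribution of the teacher term easy to isolate in an ablation.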
DOI: http://dx.doi.org/10.1109/TMI.2023.3345736
Plant Dis
January 2025
Biotechnology, plant protection, Nongsheng Group C735, Zijin Campus, Zhejiang University, Hangzhou, Zhejiang, China, 310058;
To meet the need for crop leaf disease detection in complex scenarios, this study designs a method suited to the computing power of mobile devices that ensures both detection accuracy and real-time efficiency, offering significant practical application value. Based on a comparison with existing mainstream detection models, this paper proposes a target detection and recognition algorithm, TG_YOLOv5, which applies multi-dimensional data fusion to the YOLOv5 model. The triplet attention mechanism and C3CBAM module are incorporated into the network structure to capture connections between the spatial and channel dimensions of input feature maps, thereby enhancing the model's feature extraction capabilities without significantly increasing the parameter count.
Angew Chem Int Ed Engl
December 2024
Shaanxi Key Laboratory of Macromolecular Science and Technology, Xi'an Key Laboratory of Hybrid Luminescent Materials and Photonic Device, MOE Key Laboratory of Material Physics and Chemistry under Extraordinary Conditions, School of Chemistry and Chemical Engineering, Northwestern Polytechnical University, Xi'an, 710072, P. R. China.
Limited by the two mutually exclusive physicochemical processes of separation and recombination of photogenerated carriers, achieving photoluminescence and photocatalysis simultaneously is extremely challenging but essential for ever-growing complex issues and specialized scenarios. Here we propose a biomimetic isolation-conduction strategy induced by an arene-perfluoroarene (A-P) interaction for enabling photoluminescence and photocatalytic hydrogen evolution reaction (HER) activity in the co-assembly of aromatic monomers and octafluoronaphthalene (OFN). Inspired by the isolation-conduction effect in vertebrate nerve fibers, where nodes of Ranvier periodically interrupt the myelin sheaths on the axons, we use OFN as a molecular isolator embedded in the aromatic monomer array to block the singlet-to-triplet pathway, while the enlarged intermolecular dipoles resulting from the A-P interactions facilitate the conduction of photogenerated carriers in the isolated regions.
Sensors (Basel)
November 2024
Department of Electrical and Electronic Engineering, Auckland University of Technology, Auckland 1010, New Zealand.
Spoofing attacks (or presentation attacks) are easy to mount against facial recognition systems, making online financial systems vulnerable. Thus, given the high demand for spoofing attack detection, it is urgent to develop an anti-spoofing solution with superior generalization ability. Although multi-modality methods, such as combining depth images with RGB images, and feature fusion methods can currently perform well on certain datasets, the cost of obtaining depth information and physiological signals, especially biological signals, is relatively high.
Diagnostics (Basel)
November 2024
College of Computer Science and Engineering, Taibah University, Medina 41477, Saudi Arabia.
Background/objectives: In contrast to traditional biometric modalities, such as facial recognition, fingerprints, iris scans, or even DNA, research on chest X-ray recognition has been spurred by its remarkable recognition rates. Because they capture the intricate anatomical nuances of an individual's skeletal structure, ribcage, lungs, and heart, chest X-rays have emerged as a focal point for identification and verification, especially in the forensic field, even in scenarios where the human body is damaged or disfigured. Discriminative feature embedding is essential for large-scale image verification, especially when applying chest X-ray radiographs to identification and verification.
Biomed Opt Express
December 2024
Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China (UESTC), Huzhou 313001, China.
Complex backgrounds, low-quality finger vein images, and poorly discriminative features have long been bottlenecks for feature extraction and finger vein recognition. To this end, we propose a feature extraction algorithm based on the open-set testing protocol. To eliminate the interference of irrelevant areas, this paper proposes the idea of segmentation-assisted classification, that is, using a rough mask of the finger vein to constrain the feature learning process so that the network focuses on the vein area and assigns it greater weight.