IEEE Trans Pattern Anal Mach Intell
February 2025
Scene text recognition (STR) methods have struggled to attain high accuracy and fast inference speed. Auto-Regressive (AR)-based models implement the recognition in a character-by-character manner, showing superiority in accuracy but with slow inference speed. Alternatively, Parallel Decoding (PD)-based models infer all characters in a single decoding pass, offering faster inference speed but generally worse accuracy.
IEEE Trans Pattern Anal Mach Intell
February 2025
Currently, deep neural networks (DNNs) are widely adopted in different applications. Despite their commercial value, training a well-performing DNN is resource-consuming. Accordingly, a well-trained model is valuable intellectual property for its owner.
Dimethyl sulfide (DMS), a degradation product of dimethylsulfoniopropionate (DMSP), is a significant trace gas influencing global temperature. This study examined the distribution of DMSP lyase activity (DLA) and the degradation of DMSP and dimethyl sulfoxide (DMSO) by bacteria to elucidate the influences of DMSP lyase and bacteria on the distributions of DMS and DMSP in the Yellow Sea and the East China Sea during the summer. We observed that DMS and DMSP concentrations in transect B, located near the Yellow Sea Cold Water Mass, declined with deepening water depth, coinciding with the changing trend of the temperatures.
IEEE J Biomed Health Inform
January 2025
Generative dataset expansion methods can effectively alleviate the scarcity of data in dermoscopic image segmentation but commonly employ a two-stage synthesis strategy that contains additional learnable components and complex design, which results in high computational resource costs. Diffusion models utilizing a self-conditioning strategy have shown strong potential for efficiently reusing priors in the pipeline without relying on excessively complicated conditioning designs. Inspired by this, we propose a dataset expansion method called SCCS-Diff.
IEEE Trans Neural Netw Learn Syst
December 2024
In this work, we propose MEDICO, a multiview deep generative model for molecule generation, structural optimization, and SARS-CoV-2 inhibitor discovery. To the best of our knowledge, MEDICO is the first-of-its-kind graph generative model that can generate molecular graphs similar to the structure of targeted molecules, with a multiview representation learning framework to sufficiently and adaptively learn comprehensive structural semantics from targeted molecular topology and geometry. We show that MEDICO significantly outperforms the state-of-the-art methods in generating valid, novel, and unique molecules under benchmarking comparisons, particularly achieving ~85% improvement compared with the state-of-the-art methods in terms of validity.
IEEE Trans Biomed Eng
January 2025
Knot tying is a critical task in robotic surgery and is important for surgical success and postoperative recovery. Despite well-established protocols and significant progress using medical robots at the macro scale, the need for automatically tying mechanically robust knots on mini-incisions remains largely unmet, particularly with relieved suture deformation, avoided suture slippage, reduced workspace consumption, and enhanced precision and biomechanical compatibility. Here, we propose an innovative dual-arm nanorobotic system featuring a stereo microscope and an additional rotational degree of freedom (DOF) mounted on each arm, enabling automated, precise, and controllable knot tying on mini-incisions.
IEEE Trans Pattern Anal Mach Intell
April 2025
Visual Object Tracking (VOT) aims to estimate the positions of target objects in a video sequence, which is an important vision task with various real-world applications. Depending on whether the initial states of target objects are specified by provided annotations in the first frame or the categories, VOT could be classified as instance tracking (e.g.
Multi-modal models have shown appealing performance in visual recognition tasks, as free-form text-guided training evokes the ability to understand fine-grained visual content. However, current models cannot be trivially applied to scene text recognition (STR) due to the compositional difference between natural and text images. We propose a novel instruction-guided scene text recognition (IGTR) paradigm that formulates STR as an instruction learning problem and understands text images by predicting character attributes, e.
IEEE Trans Med Imaging
January 2025
Unsupervised domain adaptation (UDA) in medical image segmentation aims to improve the generalization of deep models by alleviating domain gaps caused by inconsistency across equipment, imaging protocols, and patient conditions. However, existing UDA works remain insufficiently explored and present significant limitations: (i) they exhibit cumbersome designs that prioritize aligning statistical metrics and distributions, which limits the model's flexibility and generalization while also overlooking the potential knowledge embedded in unlabeled data; (ii) they are mostly applicable to a specific domain and lack the generalization capability to handle the diverse shifts encountered in clinical scenarios. To overcome these limitations, we introduce MedCon, a unified framework that leverages general unsupervised contrastive pre-training to establish domain connections, effectively handling diverse domain shifts without tailored adjustments.
IEEE Trans Neural Netw Learn Syst
December 2024
In this article, we derive the predictor-corrector (PC) methods with third-order convergence precision, together with a class of specific general linear three-step (GLTS) rules. Afterward, a time-varying optimization (TVO) problem, treated as a discrete TVO problem, is formulated and studied. The classical discrete zeroing neural network via Zhang et al.
IEEE Trans Neural Syst Rehabil Eng
December 2024
Amyotrophic lateral sclerosis (ALS) is a chronic, progressive neurodegenerative disease that mainly causes damage to upper and lower motor neurons. This leads to a progressive deterioration in the voluntary mobility of the upper and lower extremities in ALS patients, which underscores the pressing need for an assistance system to facilitate communication and body movement without relying on neuromuscular function. In this paper, we developed a daily assistance system for ALS patients based on a wearable multimodal brain-computer interface (BCI) mouse.
IEEE Trans Vis Comput Graph
December 2024
Graphical overlays, which layer visual elements onto charts, are effective at conveying insights and context in financial narrative visualizations. However, automating graphical overlays is challenging due to complex narrative structures and limited understanding of effective overlays. To address the challenge, we first summarize the commonly used graphical overlays and narrative structures, and the proper correspondence between them in financial narrative visualizations, identified through a survey of 1752 layered charts with corresponding narratives.
IEEE Trans Image Process
November 2024
The misalignment between RGB and thermal images significantly impairs RGB-Thermal semantic segmentation accuracy. Current non-end-to-end methods treat RGB-Thermal registration independently of semantic segmentation, resulting in fusion errors, redundant computations, and poor real-time performance. Semantic segmentation accuracy directly correlates with registration precision: better registration yields more accurate segmentation.
IEEE Trans Med Imaging
November 2024
Pediatric teeth exhibit significant changes in type and spatial distribution across different age groups. This variation makes pediatric teeth segmentation from cone-beam computed tomography (CBCT) more challenging than adult teeth segmentation. Existing methods mainly focus on adult teeth segmentation and cannot adapt to the spatial distribution of pediatric teeth with individual changes (SDPTIC) in different children, resulting in limited accuracy for segmenting pediatric teeth.
IEEE Trans Image Process
November 2024
Existing light field salient object detection (LFSOD) models predominantly rely on convolutional neural networks or local attention to process light field data, consequently encountering difficulties in modeling intra-slice and cross-slice long-range dependencies within focal stacks. In this paper, we ponder the feasibility of relying solely on the pure Transformer architecture to address this dilemma and propose a novel quasi-pure Transformer-based framework for LFSOD, termed TLFNet. TLFNet incorporates innovative Transformer-based fusion modules (PGFormer) along with an edge enhancement module.
At present, research on the biomechanical response of the cupula of human semicircular canals (HSCs) has focused on indirect inference through the nystagmus view, which is limited by the participation of the human nervous system. In this study, 3D printing technology and hydrogel modification methods were used to fabricate a one-dimensional bionic semicircular canal (BSC) model with a ratio of 1:1 to the horizontal HSC. Target tracking technology was used to observe the deformation of the cupula.
An atomic-level understanding of the elementary steps of catalytic reactions is crucial for a more molecularly driven catalyst design. Herein, we present a comprehensive study of temperature-dependent stepwise decomposition of methanol on a single-crystal Co(0001) surface using a series of surface science techniques and density functional theory calculation. Visualization of surface products was realized by scanning tunneling microscopy.
In this work, we report on defect generation in multilayer GaSe through hydrostatic pressure quenching and UV laser irradiation. The Raman line width from the UV 266 nm irradiated sample is much wider than that in pressure-quenched GaSe, corresponding to a wider defect energy distribution range in the former sample than the latter. After quenching from 11.
Immunogenic cell death (ICD) enhances anti-tumor immunity by releasing tumor-associated antigens and activating the anti-tumor immune system response. However, its potential remains understudied in combination therapies. Here, we develop a mathematical model to quantify the role of ICD in optimizing the efficacy of combined radiotherapy (RT) and macrophage-based immunotherapy.
Background: NASH-associated liver cancer (NALC) is a significant contributor to global cancer mortality, closely linked to the increasing prevalence of non-alcoholic fatty liver disease (NAFLD) and non-alcoholic steatohepatitis (NASH). This study comprehensively examines the global burden of NALC from 1990 to 2021.
Methods: This study used data from the Global Burden of Disease (GBD) 2021 database to analyze NALC death and age-standardized death rates (ASDR) globally and regionally from 1990 to 2021.
Background: A number of studies have demonstrated that legumain is involved in pulmonary diseases. Nevertheless, the role of legumain remains unclear in patients with acute exacerbation of chronic obstructive pulmonary disease (AECOPD). This study aims to identify the correlation of serum legumain with AECOPD through a prospective cohort study.
Purpose: This study aims to investigate the risk factors associated with blood-brain barrier (BBB) disruption in patients with myelin oligodendrocyte glycoprotein antibody-associated disease (MOGAD).
Patients And Methods: We collected clinical data from 95 patients diagnosed with MOGAD at the Department of Neurology, the First Affiliated Hospital of Zhengzhou University, from October 2018 to May 2024. Patients were classified into normal or damaged BBB groups based on the cerebrospinal fluid (CSF) albumin/serum albumin quotient (QAlb).
Background: Accurate preoperative non-invasive assessment of HER2 expression in breast cancer is crucial for personalized treatment and prognostic stratification.
Purpose: To evaluate the effectiveness of radiomics models based on multi-parametric magnetic resonance imaging (MRI) in distinguishing HER2 expression status in invasive breast cancer.
Material And Methods: We conducted a retrospective analysis of baseline MRI scans and clinical data from 400 patients with breast cancer between January 2018 and December 2019.
Aims: The aim of this study is to elucidate the genetic landscape of microspherophakia (MSP) and describe the genotype-phenotype correlation of MSP. Additionally, the study seeks to enhance the understanding of the pathogenic mechanisms of MSP through the discovery of novel loci.
Methods: Patients diagnosed with MSP at the Eye and ENT Hospital of Fudan University, Shanghai, were included in the study and all underwent panel-based next-generation sequencing and bioinformatics analysis.