Publications by authors named "Yongdong Zhang"

Fetal circulatory characteristics differ from those after birth; the right ventricle assumes more than 60% of the workload in the systemic circulation before birth. The right ventricle automated myocardial performance index (AM+MPI) is a simple, time-efficient, and stable assessment method used to evaluate overall cardiac function. This study aimed to apply the right ventricle AM+MPI technique to explore its role in normal mid-to-late pregnancy fetal cardiac function and compare the AM+MPI differences between diabetic and normal fetuses.


Weakly-supervised Temporal Action Localization (WTAL) aims to localize action instances with only video-level labels during training, where two primary issues are localization incompleteness and background interference. To relieve these two issues, recent methods adopt an attention mechanism to activate action instances and simultaneously suppress background ones, and have achieved remarkable progress. Nevertheless, we argue that these two issues have not been well resolved yet.


Cutting planes (cuts) play an important role in solving mixed-integer linear programs (MILPs), which formulate many important real-world applications. Cut selection heavily depends on (P1) which cuts to prefer and (P2) how many cuts to select. Although modern MILP solvers tackle (P1)-(P2) by human-designed heuristics, machine learning carries the potential to learn more effective heuristics.


Understanding the historical variations in organic matter (OM) input to lake sediments and the possible mechanisms regulating this phenomenon is important for studying carbon cycling and burial in lake systems; however, this topic remains poorly addressed for macrophyte-dominated lakes. To bridge these gaps, we analyzed bulk OM and molecular geochemical proxies in a dated sediment core from Lake Liangzi, a typical submerged macrophyte-dominated lake in East China, to infer changes in OM input to sediments over the past 169 years due to the intensification of human activities in the catchment. A relatively primitive OM input pattern was observed in ca.


Stratigraphic determination of the Anthropocene, the "Great Acceleration", requires more key globally synchronous stratigraphic markers which reflect the significant human impacts on Earth. Lacustrine sediment magnetic characteristics are of considerable importance in Anthropocene studies because they respond sensitively to environmental changes. There are many shallow lakes in the Songnen Plain (SNP) in northeast China, which are conducive to obtaining Anthropocene sedimentary records.


Generalized Zero-Shot Learning (GZSL) aims at recognizing images from both seen and unseen classes by constructing correspondences between visual images and semantic embedding. However, existing methods suffer from a strong bias problem, where unseen images in the target domain tend to be recognized as seen classes in the source domain. To address this issue, we propose a Prototype-augmented Self-supervised Generative Network by integrating self-supervised learning and prototype learning into a feature generating model for GZSL.


We tackle the problem of establishing dense correspondences between a pair of images in an efficient way. Most existing dense matching methods use 4D convolutions to filter incorrect matches, but 4D convolutions are highly inefficient due to their quadratic complexity. Besides, these methods learn features with fixed convolutions, which cannot make the learnt features robust to different challenging scenarios.


Semi-supervised video object segmentation is the task of segmenting the target in sequential frames given the ground truth mask in the first frame. The modern approaches usually utilize such a mask as pixel-level supervision and typically exploit pixel-to-pixel matching between the reference frame and current frame. However, the matching at pixel level, which overlooks the high-level information beyond local areas, often suffers from confusion caused by similar local appearances.


Federated learning (FL) is a promising framework for privacy-preserving and distributed training with decentralized clients. However, there exists a large divergence between the collected local updates and the expected global update, which is known as the client drift and mainly caused by heterogeneous data distribution among clients, multiple local training steps, and partial client participation training. Most existing works tackle this challenge based on the empirical risk minimization (ERM) rule, while less attention has been paid to the relationship between the global loss landscape and the generalization ability.


The abundance and composition of aliphatic hydrocarbon biomarkers were determined in dated sediment cores from Lakes Qijiapao (QJP) and Huoshaoheipao (HSH) in the Songnen Plain, Northeast China, to investigate historical environmental changes in these lakes and identify likely controlling factors. Based on these results, the recent environmental history of the two lakes can be divided into three periods. Before 1950, low P values (avg.


Establishing effective correspondences between a pair of images is difficult due to real-world challenges such as illumination, viewpoint and scale variations. Modern detector-based methods typically learn fixed detectors from a given dataset, which makes it hard to extract repeatable and reliable keypoints for various images with extreme appearance changes and weakly textured scenes. To deal with this problem, we propose a novel Dynamic Keypoint Detection Network (DKDNet) for robust image matching via a dynamic keypoint feature learning module and a guided heatmap activator.


Limited human activities in catchments make remote alpine lakes valuable sites for studying the evolution of lake environments in response to climate change and atmospheric deposition; however, this issue remains rarely studied owing to the scarcity of monitoring data. In this study, water quality evolution in Lake Jiren, a remote alpine lake on the southeastern margin of the Tibetan Plateau, over the past two centuries was reconstructed through geochemical analyses of aliphatic hydrocarbons, major and trace elements, and organic matter (OM) pyrolysis products in a dated sediment core, and the associated drivers were identified by temporally comparing the geochemical results with document records. All geochemical data demonstrated that the lake water remained relatively pure until 1947, after which the n-alkane and αβ-hopane proxies indicated eutrophication and petroleum contamination.


As a crucial application in privacy protection, scene text removal (STR) has received considerable attention in recent years. However, existing approaches that coarsely erase text from images ignore two important properties: the background texture integrity (BI) and the text erasure exhaustivity (EE). These two properties directly determine the erasure performance, and how to maintain them in a single network is the core problem for the STR task.


Semi-supervised learning (SSL) has attracted increasing attention in medical image segmentation, where the mainstream usually explores perturbation-based consistency as a regularization to leverage unlabelled data. However, unlike directly optimizing segmentation task objectives, consistency regularization is a compromise by incorporating invariance towards perturbations, and inevitably suffers from noise in self-predicted targets. The above issues result in a knowledge gap between supervised guidance and unsupervised regularization.


Point cloud shape correspondence aims at accurately mapping one point cloud to another point cloud with various 3D shapes. Since point clouds are usually sparse, disordered, irregular, and with diverse shapes, it is challenging to learn consistent point cloud representations and achieve the accurate matching of different point cloud shapes. To address the above issues, we propose a Hierarchical Shape-consistent TRansformer for unsupervised point cloud shape correspondence (HSTR), including a multi-receptive-field point representation encoder and a shape-consistent constrained module in a unified architecture.


Noninterventional embolization does not require the use of a catheter, and the treatment of solid tumors in combination with thermal ablation can avoid some of the risks of the surgical procedure. Therefore, we developed efficient tumor microenvironment-gelled nanocomposites with poly [(l-glutamic acid--l-tyrosine)--l-serine--l-cysteine] (PGTSCs) coated-nanoparticles (FeO&Au@PGTSCs), in which the prepared PGTSCs were made pH-responsive to the acidic tumor microenvironment. In noninterventional embolization treatment, FeO&Au@PGTSC not only achieved smart targeted drug delivery but also combined with noninvasive multimodal thermal ablation therapy and multimodal imaging of solid tumors via intravenous injection.


Weakly supervised object localization (WSOL) aims to predict both object locations and categories with only image-level class labels. However, most existing methods rely on class-specific image regions for localization, resulting in incomplete object localization. To alleviate this problem, we propose a novel end-to-end task-aware framework with a transformer encoder-decoder architecture (TAFormer) to learn class-agnostic foreground maps, including a representation encoder, a localization decoder, and a classification decoder.


Rationale And Objectives: This study assessed the role of second-look automated breast ultrasound (ABUS) adjunct to mammography (MAM) versus MAM alone in asymptomatic women and compared it with supplementing handheld ultrasound (HHUS).

Materials And Methods: Women aged 45 to 64 underwent HHUS, ABUS, and MAM among six hospitals in China from 2018 to 2022. We compared the screening performance of three strategies (MAM alone, MAM plus HHUS, and MAM plus ABUS) stratified by age groups and breast density.


Scene text spotting is of great importance to the computer vision community due to its wide variety of applications. Recent methods attempt to introduce linguistic knowledge for challenging recognition rather than pure visual classification. However, how to effectively model the linguistic rules in end-to-end deep networks remains a research challenge.


Visible-infrared person re-identification (VI-ReID) is challenging due to the large modality discrepancy between visible and infrared images. Existing methods mainly focus on learning modality-shared representations by embedding images from different modalities into a common feature space, in which some discriminative modality information is discarded. Different from these methods, in this paper, we propose a novel Modality-Specific Memory Network (MSMNet) to complete the missing modality information and aggregate visible and infrared modality features into a unified feature space for the VI-ReID task.


Weakly supervised object localization (WSOL) aims at localizing objects with only image-level labels, which has better scalability and practicability than fully supervised methods. However, without pixel-level supervision, existing methods tend to generate rough localization maps, which hinders localization performance. To alleviate this problem, we propose an adversarial transformer network (ATNet), which aims to obtain a well-learned localization model with pixel-level pseudo labels.


To reduce the extreme label dependence of supervised product quantization methods, the semi-supervised paradigm usually employs massive unlabeled data to assist in regularizing deep networks, thereby improving model performance. However, the existing method focuses on the overall distribution consistency between unlabeled data and class prototypes, while ignoring subtle individual variances between unlabeled instances. Therefore, the local neighborhood structure is not fully explored, which causes the model to easily overfit the training set.


The exploration of linguistic information promotes the development of the scene text recognition task. Benefiting from its strengths in parallel reasoning and global relationship capture, the transformer-based language model (TLM) has achieved dominant performance recently. Because the TLM is decoupled from the recognition process, we argue that its capability is limited by the low-quality visual prediction it receives as input.


In weakly supervised (WSAL) and unsupervised temporal action localization (UAL), the target is to simultaneously localize temporal boundaries and identify category labels of actions with only video-level category labels (WSAL) or category numbers in a dataset (UAL) during training. Among existing methods, attention based methods have achieved superior performance in both tasks by highlighting action segments with foreground attention weights. However, without the segment-level supervision on the attention weight learning, the quality of the attention weight hinders the performance of these methods.


The mainstream of image and sentence matching studies currently focuses on fine-grained alignment of image regions and sentence words. However, these methods miss a crucial fact: the correspondence between images and sentences does not simply come from alignments between individual regions and words but from alignments between the phrases they form respectively. In this work, we propose a novel Decoupled Cross-modal Phrase-Attention network (DCPA) for image-sentence matching by modeling the relationships between textual phrases and visual phrases.
