Publications by authors named "Siwei Ma"

Engaging in phone conversations or other cognitively challenging tasks while driving detrimentally impacts cognitive functions and has been associated with increased risk of accidents. Existing EEG methods have been shown to differentiate between load and no load, but not between different levels of cognitive load. Furthermore, it has not been investigated whether EEG measurements of load can be used to predict safety outcomes in critical events.

Stereochemically pure saccharides have indispensable roles in fields ranging from medicinal chemistry to materials science and organic synthesis. However, the development of a simple, stereoselective, and efficient glycosylation protocol to access α- and β-C-glycosides (particularly 2-deoxy entities) remains a persistent challenge. Existing studies have primarily focused on C1 modification of carbohydrates and transformation of glycosyl radical precursors.

Video Coding for Machines (VCM) aims to compress visual signals for machine analysis. However, existing methods only consider a few machines, neglecting the majority. Moreover, the machine's perceptual characteristics are not leveraged effectively, resulting in suboptimal compression efficiency.

To cost-effectively transmit high-quality dynamic 3D human images in immersive multimedia applications, efficient data compression is crucial. Unlike existing methods that focus on reducing signal-level reconstruction errors, we propose the first dynamic 3D human compression framework based on human priors. The layered coding architecture significantly enhances the perceptual quality while also supporting a variety of downstream tasks, including visual analysis and content editing.

The accelerated proliferation of visual content and the rapid development of machine vision technologies bring significant challenges in delivering visual data on a gigantic scale, which shall be effectively represented to satisfy both human and machine requirements. In this work, we investigate how hierarchical representations derived from the advanced generative prior facilitate constructing an efficient scalable coding paradigm for human-machine collaborative vision. Our key insight is that by exploiting the StyleGAN prior, we can learn three-layered representations encoding hierarchical semantics, which are elaborately designed into the basic, middle, and enhanced layers, supporting machine intelligence and human visual perception in a progressive fashion.

In this paper, we present the first attempt at determining where the achievable rate-distortion (R-D) performance bound in versatile video coding (VVC) intra coding is when considering the mutual dependency in the rate-distortion optimization (RDO) process. In particular, the abundant search space of encoding parameters in VVC intra coding is practically explored with a beam search-based joint rate-distortion optimization (BSJRDO) scheme. As such, the partitioning, prediction and transform decisions are jointly optimized across different coding units (CUs) with a customized search subset instead of the full space.

Learned image compression methods have achieved satisfactory results in recent years. However, existing methods are typically designed for the RGB format and are not suitable for the YUV420 format because of the differences between the two formats. In this paper, we propose an information-guided compression framework using a cross-component attention mechanism, which can achieve efficient image compression in the YUV420 format.

In-loop filters have attracted increasing attention due to their remarkable noise-reduction capability in the hybrid video coding framework. However, the existing in-loop filters in Versatile Video Coding (VVC) mainly exploit local image similarity. Although some non-local in-loop filters can make up for this shortcoming, the unsupervised parameter estimation widely used by non-local filters limits their performance.

This study describes and demonstrates key steps in a carbon-negative process for manufacturing cement from widely abundant seawater-derived magnesium (Mg) feedstocks. In contrast to conventional Portland cement, which starts with carbon-containing limestone as the source material, the proposed process uses membrane-free electrolyzers to facilitate the conversion of carbon-free magnesium ions (Mg²⁺) in seawater into magnesium hydroxide [Mg(OH)₂] precursors for the production of Mg-based cement. After a low-temperature carbonation curing step converts Mg(OH)₂ into magnesium carbonates through reaction with carbon dioxide (CO₂), the resulting Mg-based binders can exhibit compressive strength comparable to that achieved by Portland cement after curing for only 2 days.

As a highly ill-posed issue, single-image super-resolution (SISR) has been widely investigated in recent years. The main task of SISR is to recover the information loss caused by the degradation procedure. According to the Nyquist sampling theory, the degradation leads to the aliasing effect and makes it hard to restore the correct textures from low-resolution (LR) images.
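
The aliasing effect mentioned here can be illustrated with a one-dimensional toy signal. The sketch below is only an illustration, not the method of the paper: it assumes the degradation is plain 2x decimation with no blur or anti-aliasing filter, and the helper `dominant_frequency` and the chosen test frequency are hypothetical. After 2x decimation, the Nyquist limit is 0.25 cycles per high-resolution (HR) sample, so a component at 0.375 folds back to 0.5 - 0.375 = 0.125.

```python
import math
from typing import List


def dominant_frequency(signal: List[float]) -> float:
    """Return the frequency (cycles per sample) of the strongest DFT bin."""
    n = len(signal)
    best_k, best_mag = 0, -1.0
    for k in range(1, n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(signal))
        im = sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(signal))
        mag = math.hypot(re, im)
        if mag > best_mag:
            best_k, best_mag = k, mag
    return best_k / n


# Assumed toy "texture": a cosine at 0.375 cycles per HR sample.
HR_FREQ = 0.375
hr = [math.cos(2 * math.pi * HR_FREQ * t) for t in range(64)]

# Assumed degradation: keep every second sample, with no low-pass filter.
lr = hr[::2]

hr_peak = dominant_frequency(hr)
# One LR sample spans two HR samples, so divide by 2 to compare in HR units.
lr_peak_in_hr_units = dominant_frequency(lr) / 2

print(hr_peak, lr_peak_in_hr_units)
```

In this toy case the low-resolution signal contains a clean cosine at the wrong frequency, and nothing in the LR samples alone distinguishes it from a genuine 0.125 component, which is one way to see why restoring the correct texture is ill-posed.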

Objective: This study aims to evaluate the effect of in-vehicle audio warning at flashing-light-controlled grade crossings based on driving simulation and eye-tracking systems.

Background: Collisions at flashing-light-controlled grade crossings have severe consequences. In-vehicle audio warning has the potential to regulate driver behavior.

Traditional neural network compression (NNC) methods decrease the model size and floating-point operations (FLOPs) in the manner of screening out unimportant weight parameters; however, the intrinsic sparsity characteristics have not been fully exploited. In this article, from the perspective of signal processing and analysis for network parameters, we propose to use a compressive sensing (CS)-based method, namely NNCS, for performance improvements. Our proposed NNCS is inspired by the discovery that sparsity levels of weight parameters in the transform domain are greater than those in the original domain.
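
The observation about transform-domain sparsity can be made concrete with a small measurement. The following is a sketch under loud assumptions, not the NNCS method: the abstract does not name the transform, so a DCT-II is assumed here, and the "weights" are a synthetic vector built to be smooth rather than parameters from a trained network. The helpers `dct2` and `count_significant` are hypothetical names.

```python
import math
from typing import List


def dct2(x: List[float]) -> List[float]:
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi * (n + 0.5) * k / N)."""
    n_len = len(x)
    return [
        sum(v * math.cos(math.pi * (n + 0.5) * k / n_len) for n, v in enumerate(x))
        for k in range(n_len)
    ]


def count_significant(values: List[float], threshold: float = 1e-6) -> int:
    """Count entries whose magnitude exceeds the threshold."""
    return sum(1 for v in values if abs(v) > threshold)


# Synthetic stand-in for a weight vector (assumption): a sum of two DCT
# basis functions, so it is dense in the original domain by construction.
N = 16
weights = [
    math.cos(math.pi * (n + 0.5) * 2 / N) + 0.5 * math.cos(math.pi * (n + 0.5) * 5 / N)
    for n in range(N)
]

original_count = count_significant(weights)
transform_count = count_significant(dct2(weights))

print(original_count, transform_count)
```

Because the synthetic vector is exactly two basis functions, only two transform coefficients are significant, while far more entries are significant in the original domain. Real network weights would be only approximately sparse, and how sparse depends on the transform and the layer.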

Train-vehicle collisions at STOP-sign-controlled grade crossings are a major concern in China and across the world. Researchers have demonstrated that cost-effective approaches to improving grade crossing safety include the redesign of signs and pavement markings and the application of in-vehicle audio warnings. However, the impacts of improved sign design and audio warnings on drivers' visual performance have barely been discussed.

In the latest video coding standard, namely Versatile Video Coding (VVC), more directional intra modes and reference lines have been utilized to improve prediction efficiency. However, complex content still cannot be predicted well with only the adjacent reference samples. Although nonlocal prediction has been proposed to further improve the prediction efficiency in existing algorithms, explicit signalling or matching error potentially limits the coding efficiency.

Existing compression methods typically focus on the removal of signal-level redundancies, while the potential and versatility of decomposing visual data into compact conceptual components still lack further study. To this end, we propose a novel conceptual compression framework that encodes visual data into compact structure and texture representations, then decodes in a deep synthesis fashion, aiming to achieve better visual reconstruction quality, flexible content manipulation, and potential support for various vision tasks. In particular, we propose to compress images by a dual-layered model consisting of two complementary visual features: 1) structure layer represented by structural maps and 2) texture layer characterized by low-dimensional deep representations.

Geometric partitioning has attracted increasing attention for its remarkable motion field description capability in the hybrid video coding framework. However, the existing geometric partitioning (GEO) scheme in Versatile Video Coding (VVC) causes a non-negligible burden for signaling the side information. Consequently, the coding efficiency is limited.

Cross-component linear model (CCLM) prediction has been repeatedly proven to be effective in reducing the inter-channel redundancies in video compression. Essentially speaking, the linear model is identically trained by employing accessible luma and chroma reference samples at both encoder and decoder, elevating the level of operational complexity due to the least square regression or max-min based model parameter derivation. In this paper, we investigate the capability of the linear model in the context of sub-sampled based cross-component correlation mining, as a means of significantly releasing the operation burden and facilitating the hardware and software design for both encoder and decoder.
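
The max-min style of parameter derivation named above can be sketched in a few lines. This is a simplified floating-point illustration, not the normative VVC integer procedure and not the scheme proposed in the paper: the function names, the `step` sub-sampling rule, and the sample values are all hypothetical. The model predicts chroma as `alpha * luma + beta`, with the line drawn through the reference pairs that have the smallest and largest luma values.

```python
from typing import List, Tuple


def derive_linear_model(luma_ref: List[float], chroma_ref: List[float]) -> Tuple[float, float]:
    """Max-min derivation of (alpha, beta) from paired reference samples."""
    if len(luma_ref) != len(chroma_ref) or not luma_ref:
        raise ValueError("reference lists must be non-empty and equal in length")
    i_min = min(range(len(luma_ref)), key=lambda i: luma_ref[i])
    i_max = max(range(len(luma_ref)), key=lambda i: luma_ref[i])
    span = luma_ref[i_max] - luma_ref[i_min]
    if span == 0:
        # Flat luma: fall back to a constant predictor (mean chroma).
        return 0.0, sum(chroma_ref) / len(chroma_ref)
    alpha = (chroma_ref[i_max] - chroma_ref[i_min]) / span
    beta = chroma_ref[i_min] - alpha * luma_ref[i_min]
    return alpha, beta


def predict_chroma(luma_block: List[List[float]], alpha: float, beta: float) -> List[List[float]]:
    """Apply the linear model to every reconstructed luma sample in a block."""
    return [[alpha * y + beta for y in row] for row in luma_block]


# Hypothetical reference samples that follow chroma = 0.5 * luma + 10 exactly.
luma_ref = [60.0, 80.0, 100.0, 120.0, 140.0, 160.0, 180.0, 200.0]
chroma_ref = [40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0, 110.0]

full_model = derive_linear_model(luma_ref, chroma_ref)

# Assumed sub-sampling rule: use every second reference pair only.
step = 2
sub_model = derive_linear_model(luma_ref[::step], chroma_ref[::step])

prediction = predict_chroma([[100.0, 120.0], [140.0, 160.0]], *sub_model)
print(full_model, sub_model, prediction)
```

With perfectly linear data both derivations agree while the sub-sampled one scans half as many reference pairs. On real content the two can differ, and that trade-off between complexity and accuracy is the kind of question the abstract raises.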

Recent deep network-based compressive sensing (CS) methods have achieved great success. However, most of them regard different sampling matrices as different independent tasks and need to train a specific model for each target sampling matrix. Such practices give rise to inefficiency in computing and suffer from poor generalization ability.

A new general synthesis of pharmaceutically important azolo[1,5-a]pyrimidines starting from widely available 3(5)-aminoazoles, aldehydes, and triethylamine is developed. The key is to enable the vinylation reaction that allows the in situ generation of elusive acyclic enamines and the subsequent annulation reaction to occur. This direct and practical strategy is capable of constructing a range of 5,6-unsubstituted pyrazolo[1,5-a]pyrimidines and [1,2,4]triazolo[1,5-a]pyrimidines.

Human pose transfer (HPT) is an emerging research topic with huge potential in fashion design, media production, online advertising and virtual reality. For these applications, the visual realism of fine-grained appearance details is crucial for production quality and user engagement. However, existing HPT methods often suffer from three fundamental issues: detail deficiency, content ambiguity and style inconsistency, which severely degrade the visual quality and realism of generated images.

The forthcoming Versatile Video Coding (VVC) standard adopts the trellis-coded quantization, which leverages the delicate trellis graph to map the quantization candidates within one block into the optimal path. Despite the high compression efficiency, the complex trellis search with soft-decision quantization may hinder the applications due to high complexity and low throughput capacity. To reduce the complexity, in this paper, we propose a low complexity trellis-coded quantization scheme in a scientifically sound way with theoretical modeling of the rate and distortion.

In the U.S., about 27 million metric tons of municipal solid waste are used as fuel in Waste-to-Energy (WTE) power plants, generating about seven million tons of mixed bottom ash and fly ash (combined ash) annually, which are disposed of in landfills after metal separation.

Background: Vocabulary skills in infants with cleft lip and/or palate (CL/P) are related to various factors. They remain underexplored among Mandarin-speaking infants with CL/P. This study identified receptive and expressive vocabulary skills among Mandarin-speaking infants with unrepaired CL/P prior to cleft palate surgery and their associated factors.

Neural coding is one of the central questions in systems neuroscience for understanding how the brain processes stimuli from the environment. It is also a cornerstone for designing brain-machine interface algorithms, where decoding incoming stimuli is in high demand for better performance of physical devices. Traditionally, researchers have focused on functional magnetic resonance imaging (fMRI) data as the neural signals of interest for decoding visual scenes. However, our visual perception operates on a fast, millisecond time scale through events termed neural spikes.
