Publications by authors named "He Xiaohai"

Plagioclase is a principal component of the Earth's crust, whose compositional and structural analysis is vital for understanding the crust's construction and evolution. Accurate identification of extinction angle features plays an important role in determining the sodium-calcium content in plagioclase. Manual evaluation of these extinction angle features is tedious and dependent on human expertise.


Despite significant advancements in CNN-based object detection technology, adverse weather conditions can disrupt imaging sensors' ability to capture clear images, thereby adversely impacting detection accuracy. Mainstream algorithms for adverse weather object detection enhance detection performance through image restoration methods. Nevertheless, the majority of these approaches are designed for a specific degradation scenario, making it difficult to adapt to diverse weather conditions.

Article Synopsis
  • Computer vision technology is increasingly applied in various fields, particularly for target detection and feature extraction to analyze motion data.
  • In biology, there are challenges with data analysis for entities like bacteria and tumors, which opens avenues for further research.
  • This paper presents a new optical MRI imaging method that utilizes computer vision to analyze the morphological features of kidney tumors, offering a non-invasive approach for clinical diagnosis and treatment.
Article Synopsis
  • Early detection of cognitive impairment in older adults can help reduce age-related disabilities, with gait parameters serving as key indicators of cognitive decline.
  • The study developed a new machine vision-based recognition network called Deep Optimized GaitPart (DO-GaitPart) to analyze walking patterns in a dataset of 158 older adults, incorporating innovative features to enhance performance.
  • The results demonstrated DO-GaitPart's effectiveness, achieving a notable accuracy in identifying cognitive states from gait analysis, suggesting its potential as a valuable tool for cognitive assessment.

Lossy image coding techniques usually result in various undesirable compression artifacts. Recently, deep convolutional neural networks have seen encouraging advances in compression artifact reduction. However, most of them focus on the restoration of the luma channel without considering the chroma components.


Chinese medical machine reading comprehension question-answering (cMed-MRCQA) is a critical component of the intelligent question-answering task, focusing on the Chinese medical domain. Its purpose is to enable machines to analyze and understand the given text and question and then extract the accurate answer. To enhance cMed-MRCQA performance, it is essential to deeply comprehend and analyze the context, deduce concealed information from the textual content and, subsequently, precisely determine the answer's span.


Structural MRI (sMRI) and PET imaging play an important role in the diagnosis of Alzheimer's disease (AD), showing morphological changes and glucose metabolism changes in the brain, respectively. The manifestations in the brain images of some patients with cognitive impairment are relatively inconspicuous; for example, accurate diagnosis through sMRI remains difficult in clinical practice. With the emergence of deep learning, the convolutional neural network (CNN) has become a valuable method in AD-aided diagnosis, but some CNN methods cannot effectively learn the features of brain images, so the diagnosis of AD still presents some challenges.


In recent years, deep learning models have been applied to neuroimaging data for early diagnosis of Alzheimer's disease (AD). Structural magnetic resonance imaging (sMRI) and positron emission tomography (PET) images provide structural and functional information about the brain, respectively. Combining these features leads to better performance than using a single modality alone in building predictive models for AD diagnosis.


Digital cores can characterize the true internal structure of rocks at the pore scale. This method has become one of the most effective ways to quantitatively analyze the pore structure and other properties of digital cores in rock physics and petroleum science. Deep learning can precisely extract features from training images for a rapid reconstruction of digital cores.


Alzheimer's disease (AD) is a neurodegenerative disorder, the most common cause of dementia, so the accurate diagnosis of AD and its prodromal stage mild cognitive impairment (MCI) is significant. Recent studies have demonstrated that multiple neuroimaging and biological measures contain complementary information for diagnosis. Many existing multi-modal models based on deep learning simply concatenate each modality's features despite substantial differences in representation spaces.


Chinese medical knowledge-based question answering (cMed-KBQA) is a vital component of the intelligence question-answering assignment. Its purpose is to enable the model to comprehend questions and then deduce the proper answer from the knowledge base. Previous methods solely considered how questions and knowledge base paths were represented, disregarding their significance.


Background And Objective: The Chinese medical question answer matching (cMedQAM) task is an essential branch of the medical question answering system. Its goal is to accurately choose the correct response from a pool of candidate answers. The relatively effective methods use deep neural networks and attention mechanisms to obtain rich question-and-answer representations.


Visual-based social group detection aims to cluster pedestrians in crowd scenes according to social interactions and spatio-temporal position relations by using surveillance video data. It is a basic technique for crowd behaviour analysis and group-based activity understanding. According to the theory of proxemics study, the interpersonal relationship between individuals determines the scope of their self-space, while the spatial distance can reflect the closeness degree of their interpersonal relationship.


Modeling the three-dimensional (3D) structure from a given 2D image is of great importance for analyzing and studying the physical properties of porous media. Because this is an intractable inverse problem, many methods have been developed to address it over the past decades. Among these, deep learning (DL)-based methods show great advantages in terms of accuracy, diversity, and efficiency.


Medical visual question answering (Med-VQA) aims to accurately answer clinical questions about medical images. Despite its enormous potential for application in the medical domain, the current technology is still in its infancy. Compared with the general visual question answering task, the Med-VQA task involves more demanding challenges.


Text-to-image synthesis is a fundamental and challenging task in computer vision, which aims to synthesize realistic images from given descriptions. Recently, text-to-image synthesis methods have achieved great improvements in the quality of synthesized images. However, very few works have explored its application to face synthesis, which has great potential in face-related applications and the public safety domain.


Medical visual question answering (Med-VQA) aims to leverage a pre-trained artificial intelligence model to answer clinical questions raised by doctors or patients regarding radiology images. However, owing to the high professional requirements in the medical field and the difficulty of annotating medical data, Med-VQA lacks sufficient large-scale, well-annotated radiology images for training. Researchers have mainly focused on improving the ability of the model's visual feature extractor to address this problem.


The new generation video coding standard Versatile Video Coding (VVC) has adopted many novel technologies to improve compression performance, and consequently, remarkable results have been achieved. In practical applications, less data, in terms of bitrate, would reduce the burden of the sensors and improve their performance. Hence, to further enhance the intra compression performance of VVC, we propose a fusion-based intra prediction algorithm in this paper.


Although Versatile Video Coding (VVC) achieves superior coding performance to High-Efficiency Video Coding (HEVC), it takes a long time to encode video sequences because of the high computational complexity of its tools. Among these tools, Multiple Transform Selection (MTS) requires the best of several transforms to be selected using the Rate-Distortion Optimization (RDO) process, which increases the time spent on video encoding, meaning that VVC is not suited to real-time sensor application networks. In this paper, a low-complexity multiple transform selection, combined with the multi-type tree partition algorithm, is proposed to address the above issue.


Alzheimer's disease (AD) is a degenerative brain disorder and one of the main causes of death in elderly people, so early diagnosis of AD is vital for prompt access to medication and medical care. Fluorodeoxyglucose positron emission tomography (FDG-PET) has proved effective in helping to understand neurological changes by measuring glucose uptake.


Background And Objective: Multi-modal medical images, such as magnetic resonance imaging (MRI) and positron emission tomography (PET), have been widely used for the diagnosis of brain disorders such as Alzheimer's disease (AD) because they provide complementary information. PET scans can detect cellular changes in organs and tissues earlier than MRI. Unlike MRI, PET data is difficult to acquire due to cost, radiation, or other limitations.


Frail older adults have an increased risk of adverse health outcomes and premature death. They also exhibit altered gait characteristics in comparison with healthy individuals. In this study, we created a Fried's frailty phenotype (FFP) labelled casual walking video set of older adults based on the West China Health and Aging Trend study.


The amount of multimedia data, such as images and videos, has been increasing rapidly with the development of various imaging devices and the Internet, bringing more stress and challenges to information storage and transmission. The redundancy in images can be reduced to decrease data size via lossy compression, such as the most widely used standard Joint Photographic Experts Group (JPEG). However, the decompressed images generally suffer from various artifacts (e.


Background: Alzheimer's disease (AD) is the most common form of aggressive and irreversible dementia, affecting people's ability to carry out daily activities. At present, neuroimaging technology plays an important role in the evaluation and early diagnosis of AD. With the widespread application of artificial intelligence in the medical field, deep learning has shown great potential in computer-aided AD diagnosis based on MRI.


Most methods that model 3D porous media from 2D images are based on binary images. In this paper, we propose a method for reconstructing 3D greyscale isotropic porous media images from a single image. Our proposed method incorporates a fast-sampling procedure to control the continuity and variability between adjoining reconstruction layers, a new similarity calculation method to obtain the most similar patterns from a pattern dictionary, and a central area simulation procedure to solve the block effect problem.
