Recently, the multimodal large language model (MLLM) represented by GPT-4V has been a new rising research hotspot, which uses powerful large language models (LLMs) as a brain to perform multimodal tasks. The surprising emergent capabilities of the MLLM, such as writing stories based on images and optical character recognition-free math reasoning, are rare in traditional multimodal methods, suggesting a potential path to artificial general intelligence. To this end, both academia and industry have endeavored to develop MLLMs that can compete with or even outperform GPT-4V, pushing the limit of research at a surprising speed.
IEEE Trans Pattern Anal Mach Intell
June 2024
Factorization machines (FMs) are widely used in recommender systems due to their adaptability and ability to learn from sparse data. However, for the ubiquitous non-interactive features in sparse data, existing FMs can only estimate the parameters corresponding to these features via the inner product of their embeddings. Undeniably, they cannot learn the direct interactions of these features, which limits the model's expressive power.
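As a rough illustration of the inner-product estimation mentioned above, the following is a minimal sketch of a standard second-order FM score, not the method proposed in the paper. The function name `fm_predict` and all toy values are assumptions chosen for illustration.

```python
from itertools import combinations


def dot(u, v):
    """Inner product of two equal-length embeddings."""
    return sum(a * b for a, b in zip(u, v))


def fm_predict(x, w0, w, V):
    """Standard second-order FM score for one feature vector x.

    x  : feature values (mostly zeros for sparse data)
    w0 : global bias
    w  : one linear weight per feature
    V  : one embedding per feature
    """
    linear = sum(wi * xi for wi, xi in zip(w, x))
    # The weight of each feature pair is never stored directly;
    # it is estimated as the inner product of the two embeddings.
    pairwise = sum(
        dot(V[i], V[j]) * x[i] * x[j]
        for i, j in combinations(range(len(x)), 2)
    )
    return w0 + linear + pairwise


# Toy values (hypothetical): three features, two-dimensional embeddings.
w0 = 0.5
w = [0.1, -0.2, 0.3]
V = [[1.0, 0.0], [0.5, 0.5], [2.0, 1.0]]
x = [1.0, 0.0, 1.0]  # features 0 and 2 are active
score = fm_predict(x, w0, w, V)
```

In this sketch the only pairwise contribution comes from features 0 and 2, and its weight is `dot(V[0], V[2])`. A pair of features that never co-occurs in training still receives a weight this way, which is the indirect estimation the abstract refers to.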
Recently, learning-based multi-exposure fusion (MEF) methods have made significant improvements. However, these methods mainly focus on static scenes and are prone to generate ghosting artifacts when tackling a more common scenario, i.e.
The efficiency of communication across workers is a significant factor that affects the performance of federated learning. Though a periodic communication strategy is applied to reduce communication rounds in training, the communication cost is still high when the training data distributions are not independently and identically distributed (non-IID), which is common in federated learning. Recently, some works have introduced variance reduction to eliminate the effect caused by non-IID data among workers.
Recent studies have shown that amphiregulin (AREG), a member of the epidermal growth factor (EGF) family, is expressed in many cancers and is an independent prognostic indicator for patients with pancreatic cancer, but whether AREG is regulated at the epigenetic level to promote the development of pancreatic cancer (PC) has not been elucidated. Our results support the notion that AREG is overexpressed in pancreatic cancer tissues and cell lines. Functionally, the deletion of AREG impedes PC cell proliferation, migration, and invasion.
Designing molecules with desirable physiochemical properties and functionalities is a long-standing challenge in chemistry, material science, and drug discovery. Recently, machine learning-based generative models have emerged as promising approaches for molecule design. However, further refinement of methodology is highly desired as most existing methods lack unified modeling of 2D topology and 3D geometry information and fail to effectively learn the structure-property relationship for molecule design.
IEEE Trans Neural Netw Learn Syst
October 2024
Text classification is one of the fundamental tasks in natural language processing, which requires an agent to determine the most appropriate category for input sentences. Recently, deep neural networks have achieved impressive performance in this area, especially pretrained language models (PLMs). Usually, these methods concentrate on input sentences and corresponding semantic embedding generation.
Excessive proliferation, invasion, metastasis, and immune resistance make pancreatic cancer (PC) one of the most lethal malignant tumors. Recently, DDX60 was found to be involved in the development of various tumors and in immunotherapy. Therefore, we aimed to investigate whether DDX60 is a new factor involved in PC immunotherapy.
Saturation information in hazy images is conducive to effective haze removal. However, existing saturation-based dehazing methods just focus on the saturation value of each pixel itself, while the higher-level distribution characteristic between pixels regarding saturation remains to be harnessed. In this paper, we observe that the pixels, which share the same surface reflectance coefficient in the local patches of haze-free images, exhibit a linear relationship between their saturation component and the reciprocal of their brightness component in the corresponding hazy images normalized by atmospheric light. Furthermore, the intercept of the line described by this linear relationship on the saturation axis is exactly the saturation value of these pixels in the haze-free images.
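The linear relationship described above can be checked numerically with a small sketch. It rests on assumptions not stated in the abstract: the standard atmospheric scattering model, a constant transmission `t` within the patch, HSV-style definitions (brightness as the maximum channel, saturation as one minus minimum over maximum), and atmospheric light normalized to 1. All names and values are hypothetical.

```python
def saturation(rgb):
    """HSV-style saturation: 1 - min/max (assumed definition)."""
    return 1.0 - min(rgb) / max(rgb)


def add_haze(rgb, t):
    """Scattering model with atmospheric light normalized to 1."""
    return [c * t + (1.0 - t) for c in rgb]


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Pixels in one patch: same reflectance, different shading (toy values).
reflectance = [0.8, 0.5, 0.2]
shadings = [0.4, 0.6, 0.8, 1.0]
t = 0.6  # transmission, assumed constant over the patch

clear_pixels = [[r * s for r in reflectance] for s in shadings]
hazy_pixels = [add_haze(p, t) for p in clear_pixels]

inv_brightness = [1.0 / max(p) for p in hazy_pixels]
hazy_saturation = [saturation(p) for p in hazy_pixels]

slope, intercept = fit_line(inv_brightness, hazy_saturation)
clear_saturation = saturation(reflectance)  # shading does not change it
```

Under these assumptions the hazy saturation equals `S_J - S_J * (1 - t) / V_I`, where `S_J` is the haze-free saturation and `V_I` the hazy brightness, so the fitted intercept should match `clear_saturation` and the slope should match `-clear_saturation * (1 - t)`.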
IEEE Trans Neural Netw Learn Syst
October 2024
Automatically solving math word problems (MWPs) is a challenging task for artificial intelligence (AI) and machine learning (ML) research, which aims to answer the problem with a mathematical expression. Many existing solutions simply model the MWP as a sequence of words, which is far from precise solving. To this end, we turn to how humans solve MWPs.
IEEE Trans Pattern Anal Mach Intell
October 2023
Recent studies have shown that recommender systems are vulnerable, and it is easy for attackers to inject well-designed malicious profiles into the system, resulting in biased recommendations. Because we can neither take away the right to inject such data nor deny the legitimacy of its existence, studying recommendation robustness is imperative. Despite impressive emerging work, threat assessment of the bi-level poisoning problem and the imperceptibility of poisoning users remain key challenges to be solved.
J Comput Sci Technol
November 2022
Generating molecules with desired properties is an important task in chemistry and pharmacy. An efficient method may have a positive impact on finding drugs to treat diseases like COVID-19. Data mining and artificial intelligence may be good ways to find an efficient method.
As one of the important psychological stress reactions, micro-expressions (MEs) are spontaneous and subtle facial movements, which usually occur in high-stakes situations and can reveal genuine human feelings and cognition. ME recognition (MER) has essential applications in many fields such as lie detection, criminal investigation, and psychological healing. However, due to the challenges of learning discriminative ME features from fleeting, subtle facial reactions as well as the shortage of available ME data, this research topic is still far from well-studied.
Online education brings more possibilities for personalized learning, in which identifying the cognitive state of learners is conducive to providing better learning services. Cognitive diagnosis is an effective way to assess the cognitive state of students through response data from answering problems (e.g.
IEEE Trans Neural Netw Learn Syst
November 2023
Entity summarization is a novel and efficient way to understand real-world facts and solve the increasing information overload problem in large-scale knowledge graphs (KG). Existing studies mainly rely on ranking independent entity descriptions as a list under a certain scoring standard such as importance. However, they often ignore the relatedness and even semantic overlap between individual descriptions.
IEEE Trans Neural Netw Learn Syst
June 2022
Traffic anomalies, such as traffic accidents and unexpected crowd gathering, may endanger public safety if not handled in a timely manner. Detecting traffic anomalies at an early stage can benefit citizens' quality of life and city planning. However, traffic anomaly detection faces two main challenges.
IEEE Trans Image Process
October 2021
To train accurate deep object detectors under the extreme foreground-background imbalance, heuristic sampling methods are always necessary, which either re-sample a subset of all training samples (hard sampling methods, e.g. biased sampling, OHEM), or use all training samples but re-weight them discriminatively (soft sampling methods, e.
IEEE Trans Neural Netw Learn Syst
February 2023
Sentence semantic matching requires an agent to determine the semantic relation between two sentences, which is widely used in various natural language tasks, such as natural language inference (NLI) and paraphrase identification (PI). Much recent progress has been made in this area, especially attention-based methods and pretrained language model-based methods. However, most of these methods focus on all the important parts in sentences in a static way and only emphasize how important the words are to the query, inhibiting the ability of the attention mechanism.
Destabilizing and reprogramming regulatory T (Treg) cells has become a potential strategy to treat tumors. Mounting evidence indicates that the transcription factor Helios is required for the stable differentiation of the Treg lineage. Hence, we investigated whether Helios suppression could be a potential treatment option for pancreatic cancer patients.
Absolute phase unwrapping in the phase-shifting profilometry (PSP) is significant for dynamic 3-D measurements over a large depth range. Among traditional phase unwrapping methods, spatial phase unwrapping can only retrieve a relative phase map, and temporal phase unwrapping requires auxiliary projection sequences. We propose a shading-based absolute phase unwrapping (SAPU) framework for in situ 3-D measurements without additional projection patterns.
IEEE Trans Neural Netw Learn Syst
September 2022
Most of the recent image segmentation methods have tried to achieve the utmost segmentation results using large-scale pixel-level annotated data sets. However, obtaining these pixel-level annotated training data is usually tedious and expensive. In this work, we address the task of semisupervised semantic segmentation, which reduces the need for large numbers of pixel-level annotated images.
In a typical digital fringe projection (DFP) system, the shadows in the fringe images cause errors in the phase map. We propose a novel discriminative repair approach to remove the shadow-induced error in the phase map. The proposed approach first classifies the shadow area in the phase map obtained by the DFP into two categories: valid shadow area and invalid shadow area.
Pixel-by-pixel phase unwrapping (PPU) has been employed to rapidly achieve three-dimensional (3-D) shape measurement without additional projection patterns. However, the maximum measurement depth range that traditional PPU can handle is within 2π in the phase domain; thus PPU fails to measure a dynamic object surface when the object moves over a large depth range. In this paper, we propose a novel adaptive pixel-by-pixel phase unwrapping (APPU), which extends PPU to an unlimited depth range.