Publications by authors named "Kwan-Liu Ma"

Data slice finding is an emerging technique for validating machine learning (ML) models by identifying and analyzing subgroups in a dataset that exhibit poor performance, often characterized by distinct feature sets or descriptive metadata. However, in the context of validating vision models involving unstructured image data, this approach faces significant challenges, including the laborious and costly requirement for additional metadata and the complex task of interpreting the root causes of underperformance. To address these challenges, we introduce AttributionScanner, an innovative human-in-the-loop Visual Analytics (VA) system, designed for metadata-free data slice finding.

View Article and Find Full Text PDF

The advances in multi-modal foundation models (FMs) (e.g., CLIP and LLaVA) have facilitated the auto-labeling of large-scale datasets, enhancing model performance in challenging downstream tasks such as open-vocabulary object detection and segmentation.

View Article and Find Full Text PDF

Egocentric networks, often visualized as node-link diagrams, portray the complex relationship (link) dynamics between an entity (node) and others. However, common analytics tasks are multifaceted, encompassing interactions among four key aspects: strength, function, structure, and content. Current node-link visualization designs may fall short, focusing narrowly on certain aspects and neglecting the holistic, dynamic nature of egocentric networks.

View Article and Find Full Text PDF

Sensemaking on a large collection of documents (corpus) is a challenging task often found in fields such as market research, legal studies, intelligence analysis, political science, or computational linguistics. Previous works approach this problem from topic- and entity-based perspectives, but the capability of the underlying NLP model limits their effectiveness. Recent advances in prompting with LLMs present opportunities to enhance such approaches with higher accuracy and customizability.

View Article and Find Full Text PDF
Article Synopsis
  • * Our study identifies five key challenges in prompt evaluation and proposes a feature-oriented workflow, focusing on summary characteristics like complexity and naturalness rather than traditional metrics like ROUGE.
  • * We introduce Awesum, a visual analytics system that helps users refine summarization prompts and found that it simplifies systematic evaluation for non-technical users; the workflow could also apply to other natural language generation tasks.
View Article and Find Full Text PDF

Implicit neural representations (INRs) have emerged as a powerful tool for compressing large-scale volume data. This opens up new possibilities for in situ visualization. However, the efficient application of INRs to distributed data remains an underexplored area.

View Article and Find Full Text PDF

Multivariate networks are commonly found in realworld data-driven applications. Uncovering and understanding the relations of interest in multivariate networks is not a trivial task. This paper presents a visual analytics workflow for studying multivariate networks to extract associations between different structural and semantic characteristics of the networks (e.

View Article and Find Full Text PDF

Cooperation among teams or individuals of healthcare professionals (HCPs) is one of the crucial factors towards patients' survival outcome. However, it is challenging to uncover and understand such factors in the complex Multiteam System (MTS) communication networks representing daily HCP cooperation. In this paper, we present a study on MTS communication networks constructed with real-world cancer patients' Electronic Health Record (EHR) access logs.

View Article and Find Full Text PDF

Recent advancements in pre-trained language-image models have ushered in a new era of visual comprehension. Leveraging the power of these models, this article tackles two issues within the realm of visual analytics: (1) the efficient exploration of large-scale image datasets and identification of data biases within them; (2) the evaluation of image captions and steering of their generation process. On the one hand, by visually examining the captions generated from language-image models for an image dataset, we gain deeper insights into the visual contents, unearthing data biases that may be entrenched within the dataset.

View Article and Find Full Text PDF

A common way to evaluate the reliability of dimensionality reduction (DR) embeddings is to quantify how well labeled classes form compact, mutually separated clusters in the embeddings. This approach is based on the assumption that the classes stay as clear clusters in the original high-dimensional space. However, in reality, this assumption can be violated; a single class can be fragmented into multiple separated clusters, and multiple classes can be merged into a single cluster.

View Article and Find Full Text PDF

Volume data is commonly found in many scientific disciplines, like medicine, physics, and biology. Experts rely on robust scientific visualization techniques to extract valuable insights from the data. Recent years have shown path tracing to be the preferred approach for volumetric rendering, given its high levels of realism.

View Article and Find Full Text PDF

When telling a data story, an author has an intention they seek to convey to an audience. This intention can be of many forms such as to persuade, to educate, to inform, or even to entertain. In addition to expressing their intention, the story plot must balance being consumable and enjoyable while preserving scientific integrity.

View Article and Find Full Text PDF

Implicit neural networks have demonstrated immense potential in compressing volume data for visualization. However, despite their advantages, the high costs of training and inference have thus far limited their application to offline data processing and non-interactive rendering. In this article, we present a novel solution that leverages modern GPU tensor cores, a well-implemented CUDA machine learning framework, an optimized global-illumination-capable volume rendering algorithm, and a suitable acceleration data structure to enable real-time direct ray tracing of volumetric neural representations.

View Article and Find Full Text PDF

Vision transformer (ViT) expands the success of transformer models from sequential data to images. The model decomposes an image into many smaller patches and arranges them into a sequence. Multi-head self-attentions are then applied to the sequence to learn the attention between patches.

View Article and Find Full Text PDF

Unsurprisingly, we have observed tremendous interests and efforts in the application of machine learning (ML) to many data visualization problems, which are having success and leading to new capabilities. However, there is a space in visualization research that is either completely or partly agnostic to ML that should not be lost in this current VIS+ML movement. The research that this space can offer is imperative to the growth of our field and it is important that we remind ourselves to invest in this research as well as show what it could bear.

View Article and Find Full Text PDF

Real-world machine learning applications need to be thoroughly evaluated to meet critical product requirements for model release, to ensure fairness for different groups or individuals, and to achieve a consistent performance in various scenarios. For example, in autonomous driving, an object classification model should achieve high detection rates under different conditions of weather, distance, etc. Similarly, in the financial setting, credit-scoring models must not discriminate against minority groups.

View Article and Find Full Text PDF

Spatial statistical analysis of multivariate volumetric data can be challenging due to scale, complexity, and occlusion. Advances in topological segmentation, feature extraction, and statistical summarization have helped overcome the challenges. This work introduces a new spatial statistical decomposition method based on level sets, connected components, and a novel variation of the restricted centroidal Voronoi tessellation that is better suited for spatial statistical decomposition and parallel efficiency.

View Article and Find Full Text PDF

Volume data is found in many important scientific and engineering applications. Rendering this data for visualization at high quality and interactive rates for demanding applications such as virtual reality is still not easily achievable even using professional-grade hardware. We introduce FoVolNet-a method to significantly increase the performance of volume data visualization.

View Article and Find Full Text PDF

Identifying unique characteristics in a network through comparison with another network is an essential network analysis task. For example, with networks of protein interactions obtained from normal and cancer tissues, we can discover unique types of interactions in cancer tissues. This analysis task could be greatly assisted by contrastive learning, which is an emerging analysis approach to discover salient patterns in one dataset relative to another.

View Article and Find Full Text PDF

Environmental sensors provide crucial data for understanding our surroundings. For example, air quality maps based on sensor readings help users make decisions to mitigate the effects of pollution on their health. Standard maps show readings from individual sensors or colored contours indicating estimated pollution levels.

View Article and Find Full Text PDF

Many real-world applications involve analyzing time-dependent phenomena, which are intrinsically functional, consisting of curves varying over a continuum (e.g., time).

View Article and Find Full Text PDF

The rapid development of Convolutional Neural Networks (CNNs) in recent years has triggered significant breakthroughs in many machine learning (ML) applications. The ability to understand and compare various CNN models available is thus essential. The conventional approach with visualizing each model's quantitative features, such as classification accuracy and computational complexity, is not sufficient for a deeper understanding and comparison of the behaviors of different models.

View Article and Find Full Text PDF

Depending on the node ordering, an adjacency matrix can highlight distinct characteristics of a graph. Deriving a "proper" node ordering is thus a critical step in visualizing a graph as an adjacency matrix. Users often try multiple matrix reorderings using different methods until they find one that meets the analysis goal.

View Article and Find Full Text PDF

Diffusion tensor imaging (DTI) has been used to study the effects of neurodegenerative diseases on neural pathways, which may lead to more reliable and early diagnosis of these diseases as well as a better understanding of how they affect the brain. We introduce a predictive visual analytics system for studying patient groups based on their labeled DTI fiber tract data and corresponding statistics. The system's machine-learning-augmented interface guides the user through an organized and holistic analysis space, including the statistical feature space, the physical space, and the space of patients over different groups.

View Article and Find Full Text PDF

Optimizing the performance of large-scale parallel codes is critical for efficient utilization of computing resources. Code developers often explore various execution parameters, such as hardware configurations, system software choices, and application parameters, and are interested in detecting and understanding bottlenecks in different executions. They often collect hierarchical performance profiles represented as call graphs, which combine performance metrics with their execution contexts.

View Article and Find Full Text PDF

A PHP Error was encountered

Severity: Warning

Message: fopen(/var/lib/php/sessions/ci_sessionfa2in64d9k3vditmu0qv5hg9phv9abg6): Failed to open stream: No space left on device

Filename: drivers/Session_files_driver.php

Line Number: 177

Backtrace:

File: /var/www/html/index.php
Line: 316
Function: require_once

A PHP Error was encountered

Severity: Warning

Message: session_start(): Failed to read session data: user (path: /var/lib/php/sessions)

Filename: Session/Session.php

Line Number: 137

Backtrace:

File: /var/www/html/index.php
Line: 316
Function: require_once