Publications by authors named "Michael Correll"

Dashboards remain ubiquitous tools for analyzing data and disseminating the findings. Understanding the range of dashboard designs, from simple to complex, can support development of authoring tools that enable end-users to meet their analysis and communication goals. Yet, there has been little work that provides a quantifiable, systematic, and descriptive overview of dashboard design patterns.

Dashboards are no longer mere static displays of metrics; through functionality such as interaction and storytelling, they have evolved to support analytic and communicative goals like monitoring and reporting. Existing dashboard design guidelines, however, are often unable to account for this expanded scope as they largely focus on best practices for visual design. In contrast, we frame dashboard design as facilitating an analytical conversation: a cooperative, interactive experience where a user may interact with, reason about, or freely query the underlying data.

Idealized probability distributions, such as normal or other curves, lie at the root of confirmatory statistical tests. But how well do people understand these idealized curves? In practical terms, does the human visual system allow us to match sample data distributions with hypothesized population distributions from which those samples might have been drawn? And how do different visualization techniques impact this capability? This article shares the results of a crowdsourced experiment that tested the ability of respondents to fit normal curves to four different data distribution visualizations: bar histograms, dotplot histograms, strip plots, and boxplots. We find that the crowd can estimate the center (mean) of a distribution with some success and little bias.

Working with data in table form is usually considered a preparatory and tedious step in the sensemaking pipeline: a way of getting the data ready for more sophisticated visualization and analytical tools. But for many people, spreadsheets (the quintessential table tool) remain a critical part of their information ecosystem, allowing them to interact with their data in ways that are hidden or abstracted in more complex tools. This is particularly true for data workers [61], people who work with data as part of their job but do not identify as professional analysts or data scientists.

For decades, statisticians and methodologists have insisted researchers utilize graphical analysis much more heavily. Despite cogent and passionate recommendations, there has been no graphical revolution. Instead, researchers rely heavily on misleading graphics that violate visual processing heuristics.

In this paper, we examine the robustness of scagnostics through a series of theoretical and empirical studies. First, we investigate the sensitivity of scagnostics by employing perturbing operations on more than 60M synthetic and real-world scatterplots. We found that two scagnostic measures, Outlying and Clumpy, are overly sensitive to data binning.

Understanding and accounting for uncertainty is critical to effectively reasoning about visualized data. However, evaluating the impact of an uncertainty visualization is complex due to the difficulties that people have interpreting uncertainty and the challenge of defining correct behavior with uncertainty information. Currently, evaluators of uncertainty visualization must rely on general purpose visualization evaluation frameworks which can be ill-equipped to provide guidance with the unique difficulties of assessing judgments under uncertainty.

Famous examples such as Anscombe's Quartet highlight that one of the core benefits of visualizations is allowing people to discover visual patterns that might otherwise be hidden by summary statistics. This visual inspection is particularly important in exploratory data analysis, where analysts can use visualizations such as histograms and dot plots to identify data quality issues. Yet, these visualizations are driven by parameters such as histogram bin size or mark opacity that have a great deal of impact on the final visual appearance of the chart, but are rarely optimized to make important features visible.
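The point about Anscombe's Quartet can be checked directly. The following is a minimal sketch, not code from the paper: it uses the four published Anscombe datasets and computes the usual summary statistics for each. The helper names `pearson_r` and `summarize` are illustrative choices made here.

```python
from statistics import mean, variance

# Anscombe's Quartet: four (x, y) datasets with near-identical summary statistics.
_X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (_X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (_X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (_X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def summarize(xs, ys):
    """Summary statistics that are (nearly) identical across the quartet."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": variance(xs),  # sample variance
        "var_y": variance(ys),
        "r": pearson_r(xs, ys),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in ANSCOMBE.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        print(name, {key: round(value, 2) for key, value in stats.items()})
```

All four datasets report roughly the same means, variances, and correlation, yet a scatterplot of each looks entirely different (a linear trend, a curve, a line with one outlier, and a vertical cluster with one leverage point). Only a visual inspection reveals those differences.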

Dashboards are one of the most common use cases for data visualization, and their design and contexts of use are considerably different from exploratory visualization tools. In this paper, we look at the broad scope of how dashboards are used in practice through an analysis of dashboard examples and documentation about their use. We systematically review the literature surrounding dashboard use, construct a design space for dashboards, and identify major dashboard types.

Thematic maps are commonly used for visualizing the density of events in spatial data. However, these maps can mislead by giving visual prominence to known base rates (such as population densities) or to artifacts of sample size and normalization (such as outliers arising from smaller, and thus more variable, samples). In this work, we adapt Bayesian surprise to generate maps that counter these biases.
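The sample-size artifact mentioned above can be illustrated with a short calculation. This sketch does not implement the paper's Bayesian surprise method. It only assumes a simple binomial model of event counts, which is an assumption made here for illustration, and shows why observed rates from small regions vary more. The function name `rate_standard_error` and the example populations are hypothetical.

```python
import math


def rate_standard_error(base_rate, population):
    """Standard error of an observed rate under a binomial model (an assumption)."""
    return math.sqrt(base_rate * (1 - base_rate) / population)


# Two hypothetical regions sharing the same true event rate of 1%.
TRUE_RATE = 0.01
small_region = rate_standard_error(TRUE_RATE, population=500)
large_region = rate_standard_error(TRUE_RATE, population=500_000)

if __name__ == "__main__":
    print(f"small region standard error: {small_region:.5f}")
    print(f"large region standard error: {large_region:.5f}")
    print(f"ratio: {small_region / large_region:.1f}")
```

Because the standard error shrinks with the square root of the population, the smaller region's observed rate is far noisier. On a map of raw rates, such regions are more likely to appear as extreme values even when the underlying rate is the same everywhere.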

When making an inference or comparison with uncertain, noisy, or incomplete data, measurement error and confidence intervals can be as important for judgment as the actual mean values of different groups. These often misunderstood statistical quantities are frequently represented by bar charts with error bars. This paper investigates drawbacks with this standard encoding, and considers a set of alternatives designed to more effectively communicate the implications of mean and error data to a general audience, drawing from lessons learned from the use of visual statistics in the information visualization community.

Motivation: The advent of next-generation sequencing (NGS) has created unprecedented opportunities to examine viral populations within individual hosts, among infected individuals and over time. Comparing sequence variability across viral genomes allows for the construction of complex population structures, the analysis of which can yield powerful biological insights. However, the simultaneous display of sequence variation, coverage depth and quality scores across thousands of bases presents a unique visualization challenge that has not been fully met by current NGS analysis tools.

Many visualization tasks require the viewer to make judgments about aggregate properties of data. Recent work has shown that viewers can perform such tasks effectively, for example to efficiently compare the maximums or means over ranges of data. However, this work also shows that such effectiveness depends on the designs of the displays.

Since the 1960s, simian hemorrhagic fever virus (SHFV; Nidovirales, Arteriviridae) has caused highly fatal outbreaks of viral hemorrhagic fever in captive Asian macaque colonies. However, the source(s) of these outbreaks and the natural reservoir(s) of this virus remain obscure. Here we report the identification of two novel, highly divergent simian arteriviruses related to SHFV, Mikumi yellow baboon virus 1 (MYBV-1) and Southwest baboon virus 1 (SWBV-1), in wild and captive baboons, respectively, and demonstrate the recent transmission of SWBV-1 among captive baboons.

Key biological properties such as high genetic diversity and high evolutionary rate enhance the potential of certain RNA viruses to adapt and emerge. Identifying viruses with these properties in their natural hosts could dramatically improve disease forecasting and surveillance. Recently, we discovered two novel members of the viral family Arteriviridae: simian hemorrhagic fever virus (SHFV)-krc1 and SHFV-krc2, infecting a single wild red colobus (Procolobus rufomitratus tephrosceles) in Kibale National Park, Uganda.

The visual system can make highly efficient aggregate judgements about a set of objects, with speed roughly independent of the number of objects considered. While there is a rich literature on these mechanisms and their ramifications for visual summarization tasks, this prior work rarely considers more complex tasks requiring multiple judgements over long periods of time, and has not considered certain critical aggregation types, such as the localization of the mean value of a set of points. In this paper, we explore these questions using a common visualization task as a case study: relative mean value judgements within multi-class scatterplots.

CD8+ T cell responses rapidly select viral variants during acute human immunodeficiency virus (HIV)/simian immunodeficiency virus (SIV) infection. We used pyrosequencing to examine variation within three SIV-derived epitopes (Gag₃₈₆₋₃₉₄GW9, Nef₁₀₃₋₁₁₁RM9, and Rev₅₉₋₆₈SP10) targeted by immunodominant CD8+ T cell responses in acutely infected Mauritian cynomolgus macaques. In animals recognizing all three epitopes, variation within Rev₅₉₋₆₈SP10 was associated with delayed accumulation of variants in Gag₃₈₆₋₃₉₄GW9 but had no effect on variation within Nef₁₀₃₋₁₁₁RM9.
