Publications by authors named "Craig Greenberg"

The sex chromosomes contain complex, important genes impacting medical phenotypes, but differ from the autosomes in their ploidy and large repetitive regions. To enable technology developers, along with research and clinical laboratories, to evaluate variant detection on male sex chromosomes X and Y, we create a small variant benchmark set with 111,725 variants for the Genome in a Bottle HG002 reference material. We develop an active evaluation approach to demonstrate that the benchmark set reliably identifies errors in challenging genomic regions and across short- and long-read callsets.

Most data in cold-atom experiments comes from images, the analysis of which is limited by our preconceptions of the patterns that could be present in the data. We focus on the well-defined case of detecting dark solitons, which appear as local density depletions in a Bose-Einstein condensate (BEC), using a methodology that is extensible to the general task of pattern recognition in images of cold atoms. Studying soliton dynamics over a wide range of parameters requires the analysis of large datasets, making the existing human-inspection-based methodology a significant bottleneck.

The National Institute of Standards and Technology has been conducting Speaker Recognition Evaluations (SREs) for over 20 years. This article provides an overview of the practice of evaluating speaker recognition technology as it has evolved during this time. Focus is given to the current state of speaker recognition evaluation.

Ecology has reached the point where data science competitions, in which multiple groups solve the same problem using the same data by different methods, will be productive for advancing quantitative methods for tasks such as species identification from remote sensing images. We ran a competition to help improve three tasks that are central to converting images into information on individual trees: (1) crown segmentation, for identifying the location and size of individual trees; (2) alignment, to match ground-truthed trees with remote sensing; and (3) species classification of individual trees. Six teams (composed of 16 individual participants) submitted predictions for one or more tasks.

The data dependency that arises from repeated use of the same subjects affects the standard error (SE) of the detection cost function (DCF) in speaker recognition evaluation. The DCF is defined as a weighted sum of the probabilities of type I and type II errors at a given threshold. A two-layer data structure is constructed: target scores are grouped into target sets based on the dependency, and likewise for non-target scores.
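
As a rough illustration of the DCF definition above, the following Python sketch computes a weighted sum of the two error probabilities at one threshold from scores held in a two-layer structure (a list of score sets). The function names, the weights `w_miss` and `w_fa`, the accept-if-score-is-at-least-threshold rule, and the sample scores are all assumptions made for illustration; they are not taken from the article, and the sketch does not attempt the standard error computation.

```python
from typing import List, Sequence


def flatten(score_sets: Sequence[Sequence[float]]) -> List[float]:
    """Collapse the two-layer structure (sets of scores) into one flat list."""
    return [score for score_set in score_sets for score in score_set]


def detection_cost(
    target_sets: Sequence[Sequence[float]],
    nontarget_sets: Sequence[Sequence[float]],
    threshold: float,
    w_miss: float = 1.0,  # hypothetical weight on the miss probability
    w_fa: float = 1.0,    # hypothetical weight on the false-alarm probability
) -> float:
    """Weighted sum of the miss and false-alarm probabilities at a threshold.

    Assumes a trial is accepted when its score is >= threshold.
    """
    targets = flatten(target_sets)
    nontargets = flatten(nontarget_sets)
    if not targets or not nontargets:
        raise ValueError("both target and non-target scores are required")

    # Miss: a target trial scored below the threshold.
    p_miss = sum(s < threshold for s in targets) / len(targets)
    # False alarm: a non-target trial scored at or above the threshold.
    p_fa = sum(s >= threshold for s in nontargets) / len(nontargets)
    return w_miss * p_miss + w_fa * p_fa


# Made-up scores: each inner list is one set of mutually dependent scores.
example_target_sets = [[2.0, 1.5], [0.2, 1.0]]
example_nontarget_sets = [[-1.0, 0.6], [-0.5, -2.0]]
example_dcf = detection_cost(example_target_sets, example_nontarget_sets, threshold=0.5)
```

Keeping the scores grouped by set, rather than flattening them at load time, preserves the dependency information that any analysis of the SE would need, even though the DCF value itself depends only on the pooled scores.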
