Publications by authors named "Carl Nettelblad"

Pooling and imputation are computational methods that can be combined to achieve cost-effective and accurate high-density genotyping of both common and rare variants, as demonstrated in a MAGIC wheat population. The plant breeding industry has shown growing interest in using genotype data at relevant markers to select new competitive varieties. Such selection usually benefits from large amounts of marker data, so it is crucial to have data collection methods that are both cost-effective and reliable.


The idea of using ultrashort X-ray pulses to obtain images of single proteins frozen in time has fascinated and inspired many. It was one of the arguments for building X-ray free-electron lasers. According to theory, the extremely intense pulses provide sufficient signal to dispense with using crystals as an amplifier, and the ultrashort pulse duration permits capturing the diffraction data before the sample inevitably explodes.


Motivation: Genotype imputation has the potential to increase the amount of information that can be gained from the often limited biological material available in ancient samples. As many widely used tools have been developed with modern data in mind, their design does not necessarily reflect the requirements of studies of ancient DNA. Here, we investigate whether an imputation method based on the full probabilistic Li and Stephens model of haplotype frequencies might be beneficial for the particular challenges posed by ancient data.
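As an illustration of the underlying model (not the specific tool evaluated in the study), a minimal haploid Li and Stephens forward pass might look as follows, assuming a 0/1-coded reference panel, constant switch and mismatch probabilities, and -1 marking missing sites in the target:

```python
import numpy as np

def li_stephens_forward(haplotypes, observed, rho=0.01, theta=0.01):
    """Forward pass of a simplified haploid Li and Stephens HMM.

    haplotypes: (K, M) reference panel of 0/1 alleles.
    observed:   (M,) target alleles; -1 marks a missing site.
    rho:        per-site switch (recombination) probability.
    theta:      mismatch (mutation / genotyping-error) probability.
    Returns the (K, M) matrix of per-site scaled forward probabilities.
    """
    K, M = haplotypes.shape
    fwd = np.empty((K, M))

    def emit(m):
        if observed[m] < 0:                        # missing site: uninformative emission
            return np.ones(K)
        match = haplotypes[:, m] == observed[m]
        return np.where(match, 1.0 - theta, theta)

    fwd[:, 0] = emit(0) / K
    fwd[:, 0] /= fwd[:, 0].sum()
    for m in range(1, M):
        stay = (1.0 - rho) * fwd[:, m - 1]
        switch = rho * fwd[:, m - 1].sum() / K     # jump uniformly to any template
        fwd[:, m] = emit(m) * (stay + switch)
        fwd[:, m] /= fwd[:, m].sum()               # rescale to avoid underflow
    return fwd
```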


Background: Despite continuing technological advances, the cost of large-scale genotyping of a high number of samples can be prohibitive. The purpose of this study is to design a cost-saving strategy for SNP genotyping. We suggest making use of pooling, a group testing technique, to reduce the number of SNP arrays needed.
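As a toy illustration of group testing (not the pooling design proposed in the study), the sketch below pools a rare-variant indicator over a row/column grid and decodes only what the pooled outcomes force, leaving ambiguous entries to be resolved by imputation or follow-up genotyping; the layout and function name are assumptions for illustration:

```python
import numpy as np

def pool_and_decode(carrier, n_rows, n_cols):
    """Row/column pooling of one rare SNP and a simple decode.

    carrier: (n_rows * n_cols,) 0/1 array, 1 if the sample carries the minor allele.
    Returns per-sample status: 1 = resolved carrier, 0 = resolved non-carrier,
    -1 = ambiguous (both pools positive, needs imputation or retesting).
    """
    grid = carrier.reshape(n_rows, n_cols)
    row_pos = grid.any(axis=1)                     # pooled row assay outcome
    col_pos = grid.any(axis=0)                     # pooled column assay outcome
    status = np.where(np.outer(row_pos, col_pos), -1, 0)
    # a single positive row and a single positive column pin down the carrier
    if row_pos.sum() == 1 and col_pos.sum() == 1:
        status[np.argmax(row_pos), np.argmax(col_pos)] = 1
    return status.ravel()
```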


With the ability to sequence ancient DNA to high coverage often limited by sample quality or cost, imputation of missing genotypes offers a way to increase both the power of inference and the cost-effectiveness of analyses of ancient data. However, the high degree of uncertainty often associated with ancient DNA poses several methodological challenges, and the performance of imputation methods in this context has not been fully explored. To gain further insights, we performed a systematic evaluation of imputation of ancient data using Beagle v4.


Dimensionality reduction is a data transformation technique widely used in various fields of genomics research. Applied to genotype data, dimensionality reduction is known to capture genetic similarity between individuals and is used for visualization of genetic variation, identification of population structure, and ancestry mapping. Among frequently used methods are principal component analysis, a linear transform that often misses finer-scale structure, and neighbor-graph-based methods, which focus on local relationships rather than large-scale patterns.
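A minimal sketch of principal component analysis applied to a genotype matrix is given below, assuming 0/1/2 allele counts and the common allele-frequency scaling; it is illustrative and not the method developed in the article:

```python
import numpy as np

def genotype_pca(genotypes, n_components=2):
    """PCA scores for a genotype matrix (n_individuals x n_snps, allele counts 0/1/2).

    Columns are mean-centered and scaled by the expected binomial standard
    deviation at each SNP, a common convention for genotype data.
    """
    freq = genotypes.mean(axis=0) / 2.0
    scale = np.sqrt(2.0 * freq * (1.0 - freq))
    X = (genotypes - 2.0 * freq) / np.where(scale > 0, scale, 1.0)
    # principal component scores via SVD of the standardized matrix
    U, S, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :n_components] * S[:n_components]
```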


One hidden yet important issue in developing neural network potentials (NNPs) is the choice of training algorithm. In this article, we compare the performance of two popular training algorithms, the adaptive moment estimation algorithm (Adam) and the extended Kalman filter algorithm (EKF), using the Behler-Parrinello neural network and two publicly accessible datasets of liquid water [Morawietz et al., Proc.
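For reference, the Adam update itself is compact. The sketch below applies a single step to a flat parameter vector; it is illustrative only and not the training code used in the comparison:

```python
import numpy as np

def adam_step(params, grads, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on a flat parameter vector.

    state: dict holding the first/second moment estimates 'm', 'v' and step counter 't'.
    """
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grads
    state["v"] = beta2 * state["v"] + (1 - beta2) * grads ** 2
    m_hat = state["m"] / (1 - beta1 ** state["t"])   # bias-corrected first moment
    v_hat = state["v"] / (1 - beta2 ** state["t"])   # bias-corrected second moment
    return params - lr * m_hat / (np.sqrt(v_hat) + eps)

# usage sketch:
# state = {"m": np.zeros_like(params), "v": np.zeros_like(params), "t": 0}
# params = adam_step(params, grads, state)
```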


Single Particle Imaging (SPI) with intense coherent X-ray pulses from X-ray free-electron lasers (XFELs) has the potential to produce molecular structures without the need for crystallization or freezing. Here we present a dataset of 285,944 diffraction patterns from aerosolized Coliphage PR772 virus particles injected into the femtosecond X-ray pulses of the Linac Coherent Light Source (LCLS). Additional exposures with background information are also deposited.


Modern flash X-ray diffraction imaging (FXI) acquires diffraction signals from single biomolecules at the high repetition rates of X-ray free-electron lasers (XFELs), easily yielding millions of 2D diffraction patterns from a single experiment. Due to the stochastic nature of FXI experiments and the massive volumes of data, retrieving 3D electron densities from raw 2D diffraction patterns is a challenging and time-consuming task. We propose a semi-automatic data analysis pipeline for FXI experiments, which includes four steps: hit-finding and preliminary filtering, pattern classification, 3D Fourier reconstruction, and post-analysis.
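As an example of the first step, a simple hit finder can flag frames by counting lit pixels; the half-photon cutoff and threshold parameter below are assumptions for illustration, not the pipeline's actual criteria:

```python
import numpy as np

def find_hits(frames, adu_per_photon, lit_threshold):
    """Flag frames whose lit-pixel count exceeds a threshold (a simple hit finder).

    frames:          (n_frames, n_pixels) detector readout in ADU.
    adu_per_photon:  conversion used to decide whether a pixel saw a photon.
    lit_threshold:   minimum number of lit pixels for a frame to count as a hit.
    """
    lit_pixels = (frames > 0.5 * adu_per_photon).sum(axis=1)
    return lit_pixels >= lit_threshold
```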


Haploid, high-quality reference genomes are an important resource in genomic research projects, and sequencing reads are typically aligned against a single such reference. A consequence is that DNA fragments carrying the reference allele are more likely to map successfully, or to receive higher quality scores. This reference bias can affect downstream population-genomic analyses when heterozygous sites are falsely considered homozygous for the reference allele.


The possibility of imaging single proteins constitutes an exciting challenge for x-ray lasers. Despite encouraging results on large particles, imaging small particles has proven difficult for two reasons: insufficient pulse intensity from currently available x-ray lasers and, as we demonstrate here, contamination of the aerosolized molecules by nonvolatile contaminants in the solution. The amount of contamination on the sample depends on the initial droplet size during aerosolization.


In imaging modalities that record diffraction data, such as the imaging of viruses at X-ray free-electron laser facilities, the original image can be reconstructed if the phases are known. When phases are unknown, oversampling and a constraint on the support region of the original object can be used to solve a non-convex optimization problem with iterative alternating-projection methods. Such schemes are ill-suited to finding the optimal solution for sparse data, since the recorded pattern does not correspond exactly to the original wave function.
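For context, a minimal error-reduction sketch of such an alternating-projection scheme is shown below, assuming measured Fourier magnitudes on a regular grid, a known support mask, and a positivity constraint; it illustrates the baseline family of methods discussed, not the approach proposed in the article:

```python
import numpy as np

def error_reduction(magnitudes, support, n_iter=200, seed=0):
    """Alternating-projection phase retrieval (error reduction with positivity).

    magnitudes: measured Fourier amplitudes (2D array, same grid as the object).
    support:    boolean mask of the allowed object region.
    """
    rng = np.random.default_rng(seed)
    phases = rng.uniform(0.0, 2.0 * np.pi, magnitudes.shape)   # random initial phases
    g = np.fft.ifft2(magnitudes * np.exp(1j * phases)).real
    for _ in range(n_iter):
        # Fourier-domain projection: keep current phases, impose measured magnitudes
        G = np.fft.fft2(g)
        G = magnitudes * np.exp(1j * np.angle(G))
        g = np.fft.ifft2(G).real
        # real-space projection: zero outside the support and enforce non-negativity
        g = np.where(support & (g > 0), g, 0.0)
    return g
```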


Diffraction before destruction using X-ray free-electron lasers (XFELs) has the potential to determine radiation-damage-free structures without the need for crystallization. This article presents the three-dimensional reconstruction of the Melbournevirus from single-particle X-ray diffraction patterns collected at the Linac Coherent Light Source (LCLS), as well as reconstructions from simulated data exploring the consequences of different kinds of experimental noise. The reconstruction from experimental data suffers from a strong artifact in the center of the particle.


Modern technology for producing extremely bright and coherent x-ray laser pulses makes it possible to acquire a large number of diffraction patterns from individual biological nanoparticles, including proteins, viruses, and DNA. In practice, these two-dimensional diffraction patterns can be phased and reconstructed down to a resolution of a few angstroms. In principle, a sufficiently large collection of diffraction patterns will contain the required information for a full three-dimensional reconstruction of the biomolecule.


Background: The advent of next-generation sequencing (NGS) has made whole-genome sequencing of cohorts of individuals a reality. Primary datasets of raw or aligned reads of this sort can become very large. For scientific questions where curated variant calls are not sufficient, the sheer size of the datasets makes analysis prohibitively expensive.


We use extremely bright and ultrashort pulses from an x-ray free-electron laser (XFEL) to measure correlations in x rays scattered from individual bioparticles. This allows us to go beyond the traditional crystallography and single-particle imaging approaches for structure investigations. We employ angular correlations to recover the three-dimensional (3D) structure of nanoscale viruses from x-ray diffraction data measured at the Linac Coherent Light Source.
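As a rough illustration, an angular autocorrelation over each resolution ring can be computed as below, assuming the diffraction pattern has already been interpolated onto a polar (q, φ) grid; this is a generic estimator, not necessarily the one used in the study:

```python
import numpy as np

def angular_correlation(polar_intensity):
    """Angular autocorrelation C(q, Δφ) for one pattern on an (n_q, n_phi) polar grid."""
    I = polar_intensity - polar_intensity.mean(axis=1, keepdims=True)
    F = np.fft.fft(I, axis=1)
    # Wiener-Khinchin: circular correlation over the angular coordinate
    corr = np.fft.ifft(F * np.conj(F), axis=1).real / I.shape[1]
    return corr
```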


Noise and column-wise artifacts in the CSPAD-140K detector and in a module of the CSPAD-2.3M large camera, respectively, are reported for the L730 and L867 experiments performed at the CXI Instrument at the Linac Coherent Light Source (LCLS) in a low-flux, low signal-to-noise-ratio regime. Possible remedies are discussed and an additional step in the preprocessing of the data is introduced, consisting of a median subtraction along the columns of the detector modules.
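Such a column-wise median subtraction is simple to express; below is a minimal sketch for a single 2D detector module with an optional mask of usable pixels (the function name and mask handling are illustrative assumptions):

```python
import numpy as np

def subtract_column_median(module, mask=None):
    """Subtract the per-column median from one detector module (2D array).

    module: (n_rows, n_cols) readout of a single module.
    mask:   optional boolean array of usable pixels; masked pixels are
            excluded from the median estimate but still corrected.
    """
    data = np.where(mask, module, np.nan) if mask is not None else module
    col_median = np.nanmedian(data, axis=0)        # one median per detector column
    return module - col_median
```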


Single-particle diffraction from X-ray free-electron lasers offers the potential for molecular structure determination without the need for crystallization. In an effort to further develop the technique, we present a dataset of coherent soft X-ray diffraction images of the Coliphage PR772 virus, collected at the Atomic Molecular Optics (AMO) beamline with pnCCD detectors in the LAMP instrument at the Linac Coherent Light Source. The diameter of PR772 ranges from 65 to 70 nm, considerably smaller than the ~600 nm diameter of the previously reported Mimivirus.


This study explores the capabilities of the Coherent X-ray Imaging Instrument at the Linac Coherent Light Source to image small biological samples. The weak signal from small samples puts a significant demand on the experiment. Aerosolized particles of ∼40 nm in diameter were injected into the submicrometre X-ray focus at a reduced pressure.


Background: This paper describes a combined heuristic and hidden Markov model (HMM) method to accurately impute missing genotypes in livestock datasets. Genomic selection in breeding programs requires high-density genotyping of many individuals, making algorithms that economically generate this information crucial. There are two common classes of imputation methods: heuristic methods and probabilistic methods, the latter largely based on hidden Markov models.


Single particle diffractive imaging data from Rice Dwarf Virus (RDV) were recorded using the Coherent X-ray Imaging (CXI) instrument at the Linac Coherent Light Source (LCLS). RDV was chosen as it is a well-characterized model system, useful for proof-of-principle experiments, system optimization and algorithm development. RDV, an icosahedral virus of about 70 nm in diameter, was aerosolized and injected into the approximately 0.


Advances in X-ray detectors and increases in the brightness of X-ray sources combined with more efficient sample delivery techniques have brought about tremendous increases in the speed of data collection in diffraction experiments. Using X-ray free-electron lasers such as the Linac Coherent Light Source (LCLS), more than 100 diffraction patterns can be collected in a second. These high data rates are invaluable for flash X-ray imaging (FXI), where aerosolized samples are exposed to the X-ray beam and the resulting diffraction patterns are used to reconstruct a three-dimensional image of the sample.


In quantitative trait locus (QTL) mapping, the significance of putative QTL is often determined using permutation testing. The computational demands of calculating the significance level are immense, as a very large number of permutations can be needed. We have previously introduced the PruneDIRECT algorithm for multiple-QTL scans with epistatic interactions.
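As a generic illustration of the permutation approach (not PruneDIRECT itself), a genome-wide significance threshold can be estimated as below, assuming a simple squared-correlation statistic per marker; function and parameter names are for illustration only:

```python
import numpy as np

def permutation_threshold(genotypes, phenotype, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide significance threshold via phenotype permutation.

    genotypes: (n_individuals, n_markers) array of allele counts.
    phenotype: (n_individuals,) array of trait values.
    Test statistic per marker: squared correlation with the (permuted) phenotype.
    """
    rng = np.random.default_rng(seed)
    mu, sd = genotypes.mean(0), genotypes.std(0)
    G = (genotypes - mu) / np.where(sd > 0, sd, 1.0)
    max_stats = np.empty(n_perm)
    for i in range(n_perm):
        y = rng.permutation(phenotype)
        y = (y - y.mean()) / y.std()
        r = G.T @ y / len(y)           # correlation of each marker with the permuted trait
        max_stats[i] = np.max(r ** 2)  # genome-wide maximum under the null
    return np.quantile(max_stats, 1 - alpha)
```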


We present a proof-of-concept three-dimensional reconstruction of the giant mimivirus particle from experimentally measured diffraction patterns from an x-ray free-electron laser. Three-dimensional imaging requires the assembly of many two-dimensional patterns into an internally consistent Fourier volume. Since each particle is randomly oriented when exposed to the x-ray pulse, relative orientations have to be retrieved from the diffraction data alone.
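To illustrate only the assembly step, the sketch below inserts 2D patterns into a 3D Fourier volume under strong simplifying assumptions (orientations already known, flat Ewald sphere, nearest-neighbor gridding); retrieving the orientations from the data alone, as done in the article, is the difficult part and is not shown:

```python
import numpy as np

def insert_slices(patterns, rotations, n_vox=65):
    """Accumulate 2D patterns into a 3D Fourier volume given known orientations.

    patterns:  list of (n, n) intensity arrays (flat-Ewald central-slice approximation).
    rotations: list of (3, 3) rotation matrices, one per pattern.
    """
    volume = np.zeros((n_vox, n_vox, n_vox))
    weight = np.zeros_like(volume)
    n = patterns[0].shape[0]
    # in-plane reciprocal coordinates of each detector pixel, in voxel units
    k = np.arange(n) - n // 2
    kx, ky = np.meshgrid(k, k, indexing="ij")
    coords = np.stack([kx.ravel(), ky.ravel(), np.zeros(n * n)])   # central slice, kz = 0
    for pat, R in zip(patterns, rotations):
        rot = R @ coords + n_vox // 2              # rotate the slice into the 3D grid
        idx = np.rint(rot).astype(int)
        ok = np.all((idx >= 0) & (idx < n_vox), axis=0)
        np.add.at(volume, tuple(idx[:, ok]), pat.ravel()[ok])
        np.add.at(weight, tuple(idx[:, ok]), 1.0)
    return np.where(weight > 0, volume / weight, 0.0)
```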
