Accuracy assessment of diploid consensus sequences.

IEEE/ACM Trans Comput Biol Bioinform

Department of Computer Science, University of Southern California, Los Angeles 90089, USA.

Published: April 2007

If the origin of each fragment were known in a genome sequencing project, reconstructing diploid consensus sequences would be straightforward. In practice, however, the origins are unknown. Although methods have been proposed to reconstruct haplotypes from genome sequencing projects, an accuracy assessment is required to evaluate the confidence of the estimated diploid consensus sequences. In this paper, we define a confidence score for diploid consensus sequences. The score requires calculating the likelihood of an assembly, and for this we propose an algorithm that runs in time linear in the number of polymorphic sites. The likelihood calculation and confidence score are then used to improve haplotype estimation in two ways. First, low-scoring phases are disconnected. Second, instead of assuming the nominal frequency of 1/2, the haplotype frequency is estimated to reflect the actual contribution of each haplotype. We evaluated our method on simulated data with a polymorphism rate of 1.2 percent, based on Ciona intestinalis. The results indicate high accuracy: the true positive rate of haplotype estimation was greater than 97 percent.
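
The abstract does not spell out the likelihood model, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes a simple two-haplotype mixture in which each fragment comes from haplotype A with probability `freq_a` and each observed allele is misread independently with probability `error_rate`. The function names, the biallelic 0/1 encoding, the error rate of 0.01, and the grid search are all assumptions made for this example. The cost is linear in the total number of allele observations, which is a weaker statement than the linear-in-polymorphic-sites bound claimed for the paper's method.

```python
import math


def fragment_log_likelihood(fragment, hap_a, hap_b, freq_a=0.5, error_rate=0.01):
    """Log-likelihood of one fragment under a two-haplotype mixture (assumed model).

    fragment maps a polymorphic-site index to the observed allele (0 or 1).
    hap_a and hap_b are lists of alleles, one entry per polymorphic site.
    """
    log_a = math.log(freq_a)        # fragment drawn from haplotype A
    log_b = math.log(1.0 - freq_a)  # fragment drawn from haplotype B
    for site, allele in fragment.items():
        log_a += math.log(1.0 - error_rate if allele == hap_a[site] else error_rate)
        log_b += math.log(1.0 - error_rate if allele == hap_b[site] else error_rate)
    # Combine the two mixture components with a numerically stable log-sum-exp.
    top = max(log_a, log_b)
    return top + math.log(math.exp(log_a - top) + math.exp(log_b - top))


def assembly_log_likelihood(fragments, hap_a, hap_b, freq_a=0.5, error_rate=0.01):
    """Sum of fragment log-likelihoods, treating fragments as independent."""
    return sum(
        fragment_log_likelihood(f, hap_a, hap_b, freq_a, error_rate)
        for f in fragments
    )


def estimate_frequency(fragments, hap_a, hap_b, error_rate=0.01, steps=99):
    """Grid-search the haplotype-A frequency that maximizes the likelihood.

    A stand-in for replacing the nominal frequency 1/2; the paper's actual
    estimator is not described in the abstract.
    """
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(
        grid,
        key=lambda p: assembly_log_likelihood(fragments, hap_a, hap_b, p, error_rate),
    )


# Toy data: three polymorphic sites, three fragments from A and one from B.
hap_a, hap_b = [0, 0, 1], [1, 1, 0]
fragments = [{0: 0, 1: 0}, {1: 0, 2: 1}, {0: 0, 2: 1}, {0: 1, 1: 1}]

correct = assembly_log_likelihood(fragments, hap_a, hap_b)
wrong = assembly_log_likelihood(fragments, [0, 1, 1], [1, 0, 0])  # site 1 mis-phased
freq = estimate_frequency(fragments, hap_a, hap_b)
```

On this toy input the correctly phased pair scores a higher likelihood than the mis-phased one, and the estimated frequency moves away from 1/2 toward the share of fragments that match haplotype A. A comparison of this kind is one plausible basis for scoring phases, though the abstract does not define the confidence score itself.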

Source
http://dx.doi.org/10.1109/TCBB.2007.1007

Similar Publications

Some unique asexual species persist over time and contradict the consensus that sex is a prerequisite for long-term evolutionary survival. How they escape the dead-end fate remains enigmatic. Here, we generated a haplotype-resolved genome assembly on the basis of a single individual and collected genomic data from worldwide populations of the parthenogenetic diploid oribatid mite to identify signatures of persistence without sex.

Plasmopara viticola, the causal agent of grapevine downy mildew, is a biotrophic oomycete engaged in a tight coevolutionary relationship with its host. Rapid adaptation of the pathogen is favored by annual sexual reproduction that generates genotypic diversity. With the aim of studying the recombination landscape across the P.

Placental Mosaicism in Multiple Gestation: Complete Hydatidiform Mole with Coexisting Twin Fetus.

Acta Med Philipp

June 2024

Division of Trophoblastic Diseases, Department of Obstetrics and Gynecology, Philippine General Hospital, University of the Philippines Manila.

Hydatidiform mole coexistent with a live fetus (CMCF) is a rare entity occurring in 1:20,000 to 1:100,000 pregnancies. Three mechanisms of this type are possible: (1) a singleton pregnancy consisting of partial mole with a triploid fetus, (2) a twin gestation consisting of an androgenic complete hydatidiform mole with a biparental diploid fetus, and (3) a twin gestation consisting of a biparental diploid fetus with a normal placenta and a partial hydatidiform mole (PHM) with a triploid fetus. The abnormal triploid fetus in a partial mole tends to die in the first trimester while the fetus coexisting with a complete or partial mole in the dizygotic twin pregnancy has a chance to survive.

Evaluation of somatic copy number variation detection by NGS technologies and bioinformatics tools on a hyper-diploid cancer genome.

Genome Biol

June 2024

Computational Genomics and Bioinformatics Branch, Center for Biomedical Informatics and Information Technology (CBIIT), National Cancer Institute, Rockville, MD, USA.

Background: Copy number variation (CNV) is a key genetic characteristic for cancer diagnostics and can be used as a biomarker for the selection of therapeutic treatments. Using data sets established in our previous study, we benchmark the performance of cancer CNV calling by six most recent and commonly used software tools on their detection accuracy, sensitivity, and reproducibility. In comparison to other orthogonal methods, such as microarray and Bionano, we also explore the consistency of CNV calling across different technologies on a challenging genome.

Accurately reconstructing the reticulate histories of polyploids remains a central challenge for understanding plant evolution. Although phylogenetic networks can provide insights into relationships among polyploid lineages, inferring networks may be hindered by the complexities of homology determination in polyploid taxa. We use simulations to show that phasing alleles from allopolyploid individuals can improve phylogenetic network inference under the multispecies coalescent by obtaining the true network with fewer loci compared with haplotype consensus sequences or sequences with heterozygous bases represented as ambiguity codes.

