Background: Third-generation sequencing technology, with its longer read lengths, is a major advance over next-generation sequencing and has greatly promoted biological research. However, third-generation sequencing data have high error rates, which inevitably affect downstream analysis. Although sequencing error rates have improved in recent years, large amounts of data were produced at high error rates, and discarding them would be a huge waste. Error correction for third-generation sequencing data is therefore especially important. Existing error correction methods perform poorly at heterozygous sites, which are ubiquitous in diploid and polyploid organisms. Error correction algorithms that handle heterozygous loci are therefore lacking, especially at low coverage.
Results: In this article, we propose an error correction method named QIHC. QIHC is a hybrid correction method, which requires both next-generation and third-generation sequencing data. QIHC greatly enhances the sensitivity of distinguishing heterozygous sites from sequencing errors, which leads to high error correction accuracy. To achieve this, QIHC establishes a set of probabilistic models based on a Bayesian classifier to estimate the heterozygosity of a site, and makes a judgment by calculating the posterior probabilities. The proposed method consists of three modules, which respectively generate a pseudo reference sequence, obtain the read alignments, and estimate the heterozygosity of the sites and correct the reads harboring them. The last module is the core module of QIHC and is designed to handle the calculations for the multiple cases that arise at a heterozygous site. The other two modules enable the reads to be mapped to the pseudo reference sequence, which to some extent overcomes the inefficiency of the multiple mappings adopted by existing error correction methods.
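To make the idea of posterior-based site classification concrete, the following is a minimal sketch and not QIHC's actual model, whose details the abstract does not give. It assumes a simple binomial likelihood for the number of aligned reads that show an alternate base at a site, three candidate genotypes, and placeholder values for the error rate and priors. The names `site_posteriors` and `call_site` and all default parameters are hypothetical.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of k successes in n independent trials with success rate p."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def site_posteriors(alt_count, depth, error_rate=0.01, priors=None):
    """Posterior probability of each genotype at one site.

    alt_count:  reads showing the alternate base (assumed input).
    depth:      total reads covering the site.
    error_rate: assumed per-base error rate of the aligned reads.
    priors:     assumed prior genotype probabilities (placeholder values).
    """
    if priors is None:
        priors = {"hom_ref": 0.9985, "het": 0.001, "hom_alt": 0.0005}

    # Assumed chance that a single read shows the alternate base
    # under each genotype (a simplification, not taken from the paper).
    alt_prob = {
        "hom_ref": error_rate,        # alternate base appears only by error
        "het": 0.5,                   # either allele is sampled equally often
        "hom_alt": 1.0 - error_rate,  # reference base appears only by error
    }

    # Bayes' rule: posterior is proportional to prior times likelihood.
    joint = {
        g: priors[g] * binom_pmf(alt_count, depth, alt_prob[g]) for g in priors
    }
    total = sum(joint.values())
    return {g: v / total for g, v in joint.items()}


def call_site(alt_count, depth, **kwargs):
    """Return the genotype with the highest posterior probability."""
    post = site_posteriors(alt_count, depth, **kwargs)
    return max(post, key=post.get)
```

With these placeholder values, `call_site(5, 10)` returns `"het"`, whereas `call_site(1, 10)` returns `"hom_ref"`: a single discordant read out of ten is better explained as a sequencing error than as a second allele. A real classifier would also need to account for base qualities, mapping uncertainty and polyploid genotypes.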
Conclusions: To verify the performance of our method, we compared QIHC with Canu and Jabba in several aspects. Because QIHC is a hybrid correction method, we first conducted a group of experiments under different coverages of the next-generation sequencing data. QIHC is far ahead of Jabba in accuracy. We then varied the coverage of the third-generation sequencing data and again compared the performance of Canu, Jabba and QIHC. QIHC outperforms the other two methods in the accuracy of both correcting sequencing errors and identifying heterozygous sites, especially at low coverage. We also compared Canu and QIHC at different error rates of the third-generation sequencing data. QIHC still performs better. Therefore, QIHC is superior to the existing error correction methods when heterozygous sites exist.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7677778 | PMC |
| http://dx.doi.org/10.1186/s12864-020-07008-9 | DOI Listing |
Brief Bioinform
November 2024
Biotherapeutics Molecule Discovery, Boehringer Ingelheim Pharmaceutical Inc., 900 Ridgebury Road, Ridgefield, CT 06877, United States.
Antibody generation requires the use of one or more time-consuming methods, namely animal immunization, and in vitro display technologies. However, the recent availability of large amounts of antibody sequence and structural data in the public domain along with the advent of generative deep learning algorithms raises the possibility of computationally generating novel antibody sequences with desirable developability attributes. Here, we describe a deep learning model for computationally generating libraries of highly human antibody variable regions whose intrinsic physicochemical properties resemble those of the variable regions of the marketed antibody-based biotherapeutics (medicine-likeness).
Emerg Microbes Infect
January 2025
Key Laboratory of Jiangxi Province for Transfusion Medicine, Department of Blood Transfusion, The First Affiliated Hospital, Jiangxi Medical College, Nanchang University, Nanchang, Jiangxi 330006, China.
The tRNA-derived small RNAs (tsRNAs) are a new class of non-coding RNAs that are stable in body fluids and can be used as potential biomarkers for disease diagnosis. However, the exact value of tsRNAs in the diagnosis of tuberculosis (TB) is still unclear. The objective of the present study was to evaluate the performance of a serum tsRNA biosignature in distinguishing between active TB, healthy controls, latent TB infection, and other respiratory diseases.
Circ Genom Precis Med
January 2025
Centre for Heart Lung Innovation, University of British Columbia, Vancouver. (K.H., M.A., L.R., Y.L., A.S., H.H., L.R.B., Z.W.L.).
Background: Protein-truncating mutations in the titin gene are associated with increased risk of atrial fibrillation. However, little is known about the underlying pathophysiology.
Methods: We identified a heterozygous titin truncating variant (TTNtv) in a patient with unexplained early onset atrial fibrillation and normal ventricular function.
J Exp Biol
January 2025
Department of Biology, Colorado State University, Fort Collins, CO, USA.
Decapod crustaceans regulate molting through steroid molting hormones, ecdysteroids, synthesized by the molting gland (Y-organ, YO). Molt-inhibiting hormone (MIH), a neuropeptide synthesized and secreted by the eyestalk ganglia, negatively regulates YO ecdysteroidogenesis. MIH signaling is mediated by cyclic nucleotide second messengers.
New Microbes New Infect
February 2025
Department of Veterinary Public Health and Epidemiology, Lala Lajpat Rai University of Veterinary and Animal Sciences, Hisar, Haryana, 125004, India.
Background: Taenia asiatica is a zoonotic tapeworm, commonly known as Asian Taenia. It is an emerging sister species of Taenia saginata with pigs as intermediate hosts. The present study aimed at genetic characterization and population structure analysis of metacestodes in slaughtered pigs in Haryana, north India.
Methods: In total, the vital organs of 253 slaughtered pigs were screened for the presence of metacestodes.