Accurate determination of node and arc multiplicities in de Bruijn graphs using conditional random fields.

BMC Bioinformatics

Department of Information Technology, Ghent University-imec, IDLab, Ghent, B-9052, Belgium.

Published: September 2020

Background: De Bruijn graphs are key data structures for the analysis of next-generation sequencing data. They efficiently represent the overlap between reads and hence also the underlying genome sequence. However, sequencing errors and repeated subsequences make it difficult to identify the true underlying sequence. A key step in this process is the inference of the multiplicities of nodes and arcs in the graph. These multiplicities correspond to the number of times each k-mer (resp. (k+1)-mer) implied by a node (resp. arc) is present in the genomic sequence. Determining multiplicities thus reveals the repeat structure and the presence of sequencing errors. Multiplicities of nodes/arcs in the de Bruijn graph are reflected in their coverage; however, coverage variability and coverage biases render their determination ambiguous. Current methods to determine node/arc multiplicities base their decisions solely on the information in nodes and arcs individually, under-utilising the information present in the sequencing data.
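To make the definition of multiplicity concrete, the following minimal Python sketch counts how often each k-mer (node) and (k+1)-mer (arc) occurs in a known sequence. It is an illustration only: the toy genome, the function name `multiplicities`, and the single-strand, uncompacted view of the graph are assumptions made here and are not taken from the paper.

```python
from collections import Counter


def multiplicities(genome: str, k: int):
    """Count true node and arc multiplicities for a known sequence.

    Assumption: single strand only (no reverse complements) and one
    node per k-mer, which is a simplification for illustration.
    """
    # Each k-mer corresponds to a node; its count is the node multiplicity.
    node_mult = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    # Each (k+1)-mer corresponds to an arc between two overlapping k-mers.
    arc_mult = Counter(genome[i:i + k + 1] for i in range(len(genome) - k))
    return node_mult, arc_mult


genome = "ACGTACGA"  # hypothetical toy sequence
node_mult, arc_mult = multiplicities(genome, k=3)

print(node_mult["ACG"])  # repeated 3-mer
print(node_mult["AAA"])  # absent from the sequence, so multiplicity 0
```

In this toy sequence the 3-mer `ACG` occurs twice, so its node has multiplicity 2, while a k-mer that only appears in reads because of a sequencing error would have multiplicity 0. In practice the genome is unknown, and the multiplicities have to be inferred from coverage.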

Results: To improve the accuracy with which node and arc multiplicities in a de Bruijn graph are inferred, we developed a conditional random field (CRF) model to efficiently combine the coverage information within each node/arc individually with the information of surrounding nodes and arcs. Multiplicities are thus collectively assigned in a more consistent manner.
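The benefit of combining a node's own coverage with that of its neighbours can be sketched on a toy example. The code below is not the model from the paper: the Poisson coverage model, the parameter values `LAM` and `ERR`, the hard constraint that a node's multiplicity equals the sum of its incoming arc multiplicities, and the brute-force enumeration are all assumptions chosen to keep the illustration small.

```python
import math
from itertools import product

LAM = 20.0     # assumed mean coverage per genomic copy (hypothetical value)
ERR = 1.0      # assumed mean coverage of erroneous, multiplicity-0 k-mers
MAX_MULT = 3   # largest multiplicity considered in this toy example


def log_poisson(x, mean):
    """Log of the Poisson probability of observing x given the mean."""
    return x * math.log(mean) - mean - math.lgamma(x + 1)


def log_evidence(cov, m):
    """Log-likelihood of an observed coverage given multiplicity m."""
    return log_poisson(cov, ERR if m == 0 else m * LAM)


def independent(cov):
    """Most likely multiplicity using only the element's own coverage."""
    return max(range(MAX_MULT + 1), key=lambda m: log_evidence(cov, m))


def joint(node_cov, arc_covs):
    """Best joint assignment for one node and its incoming arcs.

    Assumed consistency factor: node multiplicity == sum of arc multiplicities.
    """
    best, best_score = None, -math.inf
    for arcs in product(range(MAX_MULT + 1), repeat=len(arc_covs)):
        node = sum(arcs)
        if node > MAX_MULT:
            continue
        score = log_evidence(node_cov, node)
        score += sum(log_evidence(c, a) for c, a in zip(arc_covs, arcs))
        if score > best_score:
            best, best_score = (node, arcs), score
    return best


node_alone = independent(27)          # node coverage is ambiguous on its own
node_joint = joint(27, [22, 19])      # two incoming arcs with clear coverage
print(node_alone, node_joint)
```

Under these assumed parameters, a node coverage of 27 taken alone is best explained by multiplicity 1, whereas the two incoming arcs each clearly have multiplicity 1 and the joint assignment gives the node multiplicity 2. This is the kind of inconsistency that collective assignment is meant to resolve.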

Conclusions: We demonstrate that the CRF model yields significant improvements in accuracy and a more robust expectation-maximisation parameter estimation. True k-mers can be distinguished from erroneous k-mers with a higher F score than existing methods. A C++11 implementation is available at https://github.com/biointec/detox under the GNU AGPL v3.0 license.
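For reference, the sketch below shows how an F score for separating true from erroneous k-mers could be computed. It assumes the balanced F1 variant (harmonic mean of precision and recall), which the abstract does not specify; the k-mer sets and the function name `f1_score` are hypothetical.

```python
def f1_score(true_kmers: set, predicted_true: set) -> float:
    """Balanced F score (F1) for a set of k-mers predicted to be true."""
    tp = len(true_kmers & predicted_true)  # correctly retained true k-mers
    if tp == 0:
        return 0.0
    precision = tp / len(predicted_true)
    recall = tp / len(true_kmers)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: one true k-mer missed, one erroneous k-mer retained.
true_kmers = {"ACG", "CGT", "GTA"}
predicted_true = {"ACG", "CGT", "TTT"}
score = f1_score(true_kmers, predicted_true)
print(score)
```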

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7491180
DOI: http://dx.doi.org/10.1186/s12859-020-03740-x
