PEcnv: accurate and efficient detection of copy number variations of various lengths.

Brief Bioinform

Department of Computer Science and Technology, School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China.

Published: September 2022

Copy number variation (CNV) is a class of key biomarkers in many complex traits and diseases. Detecting CNVs from sequencing data is a substantial bioinformatics problem and a standard requirement in clinical practice. Although many CNV detection approaches have been proposed, the core statistical model at their foundation is weakened by two critical computational issues: (i) identifying the optimal sliding-window setting and (ii) correcting for bias and noise. We designed a statistical process model to overcome these limitations by calculating regional read depths via an exponentially weighted moving average strategy. One-run detection of CNVs of various lengths is then achieved by a dynamic sliding window whose size is self-adapted according to the weighted averages. We also designed a novel bias/noise reduction model, accompanied by the moving average, which can handle complicated patterns and extend training data. This model, called PEcnv, accurately detects CNVs ranging from kb-scale to chromosome-arm level. Model performance was validated on simulated and real samples. Comparative analysis showed that PEcnv outperforms current popular approaches. Notably, PEcnv provided considerable advantages in detecting small CNVs (1 kb-1 Mb) in panel sequencing data. Thus, PEcnv fills the gap left by existing methods, which focus on large CNVs. PEcnv may have broad applications in clinical testing, where panel sequencing is the dominant strategy.

Availability and implementation: Source code is freely available at https://github.com/Sherwin-xjtu/PEcnv.
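
To make the moving-average idea concrete, the following is a minimal sketch of an exponentially weighted moving average (EWMA) applied to per-bin read depths. It is not the PEcnv implementation: the function names `ewma` and `flag_bins`, the smoothing factor `alpha`, the fixed `baseline` depth and the `tolerance` threshold are all assumptions chosen for illustration, and the dynamic window sizing and bias/noise reduction described in the abstract are not modeled here.

```python
from typing import List


def ewma(depths: List[float], alpha: float = 0.3) -> List[float]:
    """Return the exponentially weighted moving average of per-bin read depths.

    alpha is a hypothetical smoothing factor: values near 1 track the raw
    depths closely, values near 0 smooth more heavily.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    for depth in depths:
        if not smoothed:
            # Seed the average with the first observation.
            smoothed.append(float(depth))
        else:
            smoothed.append(alpha * depth + (1.0 - alpha) * smoothed[-1])
    return smoothed


def flag_bins(depths: List[float], baseline: float,
              alpha: float = 0.3, tolerance: float = 0.25) -> List[str]:
    """Label each bin 'gain', 'loss' or 'neutral' from the smoothed depth.

    baseline (expected depth for a copy-neutral region) and tolerance
    (allowed relative deviation) are illustrative assumptions.
    """
    labels: List[str] = []
    for value in ewma(depths, alpha):
        ratio = value / baseline
        if ratio > 1.0 + tolerance:
            labels.append("gain")
        elif ratio < 1.0 - tolerance:
            labels.append("loss")
        else:
            labels.append("neutral")
    return labels


if __name__ == "__main__":
    # Toy depths: a copy-neutral region, a doubled region, then neutral again.
    toy_depths = [100.0] * 3 + [200.0] * 5 + [100.0] * 3
    print(flag_bins(toy_depths, baseline=100.0, alpha=0.5))
```

In this toy run the smoothed signal lags the true breakpoints by about one bin on the way out of the gain, which illustrates why a fixed smoothing setting trades boundary precision against noise suppression.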

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9487654
DOI: http://dx.doi.org/10.1093/bib/bbac375
