PEcnv: accurate and efficient detection of copy number variations of various lengths.

Xuwen Wang, Ying Xu, Ruoyu Liu, Xin Lai, Yuqian Liu, Shenjie Wang, Xuanping Zhang, Jiayin Wang

Brief Bioinform

Department of Computer Science and Technology, School of Electronics and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China.

Published: September 2022

Copy number variation (CNV) is a class of key biomarkers in many complex traits and diseases. Detecting CNVs from sequencing data is a substantial bioinformatics problem and a standard requirement in clinical practice. Although many CNV detection approaches have been proposed, the core statistical model at their foundation is weakened by two critical computational issues: (i) identifying the optimal setting for the sliding window and (ii) correcting for bias and noise. We designed a statistical process model to overcome these limitations by calculating regional read depths via an exponentially weighted moving average strategy. A one-run detection of CNVs of various lengths is then achieved by a dynamic sliding window, whose size is self-adapted according to the weighted averages. We also designed a novel bias/noise reduction model, accompanied by the moving average, which can handle complicated patterns and extend training data. This model, called PEcnv, accurately detects CNVs ranging from kb-scale to chromosome-arm level. The model performance was validated with simulated samples and real samples. Comparative analysis showed that PEcnv outperforms current popular approaches. Notably, PEcnv provided considerable advantages in detecting small CNVs (1 kb to 1 Mb) in panel sequencing data. Thus, PEcnv fills the gap left by existing methods focusing on large CNVs. PEcnv may have broad applications in clinical testing where panel sequencing is the dominant strategy. Availability and implementation: Source code is freely available at https://github.com/Sherwin-xjtu/PEcnv.
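
The exponentially weighted moving average of read depths mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the PEcnv implementation: the function names (ewma, flag_bins), the smoothing factor alpha, the median baseline and the tolerance tol are all assumptions made for illustration, and the dynamic window sizing and bias/noise model described in the abstract are not covered.

```python
from statistics import median
from typing import List, Optional


def ewma(depths: List[float], alpha: float = 0.3) -> List[float]:
    """Exponentially weighted moving average of per-bin read depths.

    alpha is a hypothetical smoothing factor in (0, 1]; larger values
    weight the current bin more heavily.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    for d in depths:
        # Seed the average with the first observation.
        prev = smoothed[-1] if smoothed else d
        smoothed.append(alpha * d + (1.0 - alpha) * prev)
    return smoothed


def flag_bins(
    depths: List[float],
    alpha: float = 0.3,
    baseline: Optional[float] = None,
    tol: float = 0.25,
) -> List[str]:
    """Label each bin 'gain', 'loss' or 'neutral'.

    A bin is flagged when its smoothed depth deviates from the baseline
    by more than the relative tolerance tol (an illustrative threshold,
    not a value taken from the paper).
    """
    if baseline is None:
        # Assumption: the median depth approximates the copy-neutral level.
        baseline = median(depths)
    labels: List[str] = []
    for s in ewma(depths, alpha):
        ratio = s / baseline
        if ratio > 1.0 + tol:
            labels.append("gain")
        elif ratio < 1.0 - tol:
            labels.append("loss")
        else:
            labels.append("neutral")
    return labels


if __name__ == "__main__":
    # Toy depths: a five-bin region at 1.5x coverage inside a flat background.
    toy = [100.0] * 5 + [150.0] * 5 + [100.0] * 5
    print(flag_bins(toy, alpha=0.5))
```

Because the average carries weight from earlier bins, the smoothed signal lags the true breakpoints by a bin or two; a fixed alpha therefore trades noise suppression against boundary resolution, which is presumably part of what a self-adapting window is meant to address.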

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9487654	PMC
http://dx.doi.org/10.1093/bib/bbac375	DOI Listing
