Simulation of African and non-African low and high coverage whole genome sequence data to assess variant calling approaches.

Shatha Alosaimi, Noëlle van Biljon, Denis Awany, Prisca K Thami, Joel Defo, Jacquiline W Mugo, Christian D Bope, Gaston K Mazandu, Nicola J Mulder, Emile R Chimusa

Brief Bioinform

Faculty of Health Sciences, Division of Human Genetics, Department of Pathology, University of Cape Town, Cape Town, South Africa.

Published: July 2021

Current variant calling (VC) approaches have been designed to leverage populations with long-range haplotypes and were benchmarked using populations of European descent, whereas most genetic diversity is found in non-European populations, such as those of Africa. When applied to these genetically diverse populations, VC tools may produce false positive and false negative results, which can lead to misleading conclusions in the prioritization of mutations and in the clinical relevance and actionability of genes. The most prominent question is which tool or pipeline achieves high sensitivity and precision when analysing African data with either low or high sequence coverage, given the high genetic diversity and heterogeneity of these data. Here, a total of 100 synthetic Whole Genome Sequencing (WGS) samples, mimicking the genetic profiles of African and European subjects at high and low coverage levels, were generated to assess the performance of nine different VC tools on these contrasting datasets. The performance of these tools was assessed in terms of false positive and false negative call rates by comparing the simulated golden variants to the variants identified by each VC tool. Combining our results on sensitivity and positive predictive value (PPV), VarDict [PPV = 0.999 and Matthews correlation coefficient (MCC) = 0.832] and BCFtools (PPV = 0.999 and MCC = 0.813) perform best on African population data at both high and low coverage. Overall, current VC tools produce higher false positive and false negative rates when analysing African data than when analysing European data. This highlights the need to develop VC approaches with high sensitivity and precision, tailored for populations characterized by high genetic variation and low linkage disequilibrium.
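As an illustration of the metrics named in the abstract, the following minimal Python sketch compares a set of called variants against a simulated "golden" truth set and derives sensitivity, PPV and MCC. It is not the authors' pipeline: the function names, the representation of a variant as a `(chrom, pos, ref, alt)` tuple, the toy data and the `n_sites` parameter (the number of candidate sites used to count true negatives) are all assumptions made for this example, since the abstract does not state how true negatives were counted.

```python
import math


def confusion_counts(truth, called, n_sites):
    """Count TP, FP, FN, TN by comparing called variants with the truth set.

    truth, called: sets of (chrom, pos, ref, alt) tuples (assumed encoding).
    n_sites: assumed total number of evaluated candidate sites, needed only
             to derive the true-negative count used by MCC.
    """
    tp = len(truth & called)   # variants present in both sets
    fp = len(called - truth)   # called but not simulated
    fn = len(truth - called)   # simulated but missed
    tn = n_sites - tp - fp - fn
    if tn < 0:
        raise ValueError("n_sites is smaller than the number of observed variants")
    return tp, fp, fn, tn


def sensitivity(tp, fn):
    """Fraction of true variants that were recovered."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def ppv(tp, fp):
    """Positive predictive value (precision) of the call set."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data, invented for illustration only.
truth = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 75, "G", "A"),
    ("chr2", 900, "T", "C"),
}
called = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 75, "G", "A"),
    ("chr3", 40, "A", "T"),   # false positive
}

tp, fp, fn, tn = confusion_counts(truth, called, n_sites=10)
print(f"sensitivity={sensitivity(tp, fn):.3f}",
      f"PPV={ppv(tp, fp):.3f}",
      f"MCC={mcc(tp, fp, fn, tn):.3f}")
```

Because MCC depends on the true-negative count, its value changes with the choice of `n_sites`, whereas sensitivity and PPV do not. This is one reason the three metrics are usually reported together.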

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8294538
DOI: http://dx.doi.org/10.1093/bib/bbaa366
