Seasonal influenza viruses continuously evolve via antigenic drift. This leads to recurring epidemics, globally significant mortality rates, and the need for annually updated vaccines. Co-occurring mutations in hemagglutinin (HA) and neuraminidase (NA) are suggested to have synergistic interactions where mutations can increase the chances of immune escape and viral fitness.
Stud Health Technol Inform
September 2024
Antimicrobial resistance (AMR) poses a significant global health threat, resulting in 4.96 million deaths in 2019, with projections reaching 10 million by 2050. This resistance, primarily due to the overuse of antibiotics, complicates the treatment of infections caused by various microorganisms, including the gram-negative bacterium Escherichia coli.
Front Bioeng Biotechnol
July 2024
DNA sequences of nearly any desired composition, length, and function can be synthesized to alter the biology of an organism for purposes ranging from the bioproduction of therapeutic compounds to invasive pest control. Yet despite offering many great benefits, engineered DNA poses a risk due to its possible misuse or abuse by malicious actors, or its unintentional introduction into the environment. Monitoring the presence of engineered DNA in biological or environmental systems is therefore crucial for routine and timely detection of emerging biological threats, and for improving public acceptance of genetic technologies.
Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder (NDD) influenced by genetic, epigenetic, and environmental factors. Recent advancements in genomic analysis have shed light on numerous genes associated with ASD, highlighting the significant role of both common and rare genetic mutations, as well as copy number variations (CNVs), single nucleotide polymorphisms (SNPs) and unique de novo variants. These genetic variations disrupt neurodevelopmental pathways, contributing to the disorder's complexity.
Genomic information is increasingly used to inform medical treatments and manage future disease risks. However, any personal and societal gains must be carefully balanced against the risk to individuals contributing their genomic data. Expanding our understanding of actionable genomic insights requires researchers to access large global datasets to capture the complexity of genomic contribution to diseases.
Stud Health Technol Inform
January 2024
Coronary artery disease (CAD) has the highest disease burden worldwide. To manage this burden, predictive models are required to screen patients for preventative treatment. A range of variables have been explored for their capacity to predict disease, including phenotypic (age, sex, BMI and smoking status), medical imaging (carotid artery thickness) and genotypic.
Stud Health Technol Inform
January 2024
Healthcare data is a scarce resource and access is often cumbersome. While medical software development would benefit from real datasets, the privacy of the patients is held at a higher priority. Realistic synthetic healthcare data can fill this gap by providing a dataset for quality control while at the same time preserving the patient's anonymity and privacy.
Genetic data is limited and generating new datasets is often an expensive, time-consuming process, involving countless moving parts to genotype and phenotype individuals. While sharing data is beneficial for quality control and software development, privacy and security are of utmost importance. Generating synthetic data is a practical solution to mitigate the cost, time and sensitivities that hamper developers and researchers in producing and validating novel biotechnological solutions to data intensive problems.
With the advancement of genomic engineering and genetic modification techniques, the uptake of computational tools to design guide RNA has increased drastically. Searching for genomic targets to design guides with maximum on-target activity (efficiency) and minimum off-target activity (specificity) is now an essential part of genome editing experiments. Today, a variety of tools exist that allow the search of genomic targets and let users customize their search parameters to better suit their experiments.
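As an illustration of the target search step described above, here is a minimal sketch that scans the forward strand of a sequence for candidate guide targets. It assumes the SpCas9 convention of a 20-nt protospacer followed by an NGG PAM; the abstract names no specific nuclease, and the function name `find_guides` and its parameters are hypothetical. Real tools also scan the reverse strand and score efficiency and specificity, which this sketch does not attempt.

```python
import re

def find_guides(seq, guide_len=20):
    """Return (position, protospacer, PAM) for every forward-strand site.

    Assumption: SpCas9-style targets, i.e. `guide_len` bases followed
    immediately by an NGG PAM. Reverse-strand sites are not reported.
    """
    seq = seq.upper()
    # A lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

if __name__ == "__main__":
    example = "ACGT" * 5 + "AGG" + "C"
    for pos, protospacer, pam in find_guides(example):
        print(pos, protospacer, pam)
```

Exposing `guide_len` as a parameter mirrors the customizable search parameters the abstract mentions, since different nucleases use different protospacer lengths.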
Alzheimer's disease (AD) is a complex genetic disease, and variants identified through genome-wide association studies (GWAS) explain only part of its heritability. Epistasis has been proposed as a major contributor to this 'missing heritability'; however, many current methods are limited to modelling only additive effects. We use VariantSpark, a machine learning approach to GWAS, and BitEpi, a tool for epistasis detection, to identify AD-associated variants and interactions across two independent cohorts, ADNI and UK Biobank.
Genome editing through the development of CRISPR (Clustered Regularly Interspaced Short Palindromic Repeat)-Cas technology has revolutionized many fields in biology. Beyond Cas9 nucleases, Cas12a (formerly Cpf1) has emerged as a promising alternative to Cas9 for editing AT-rich genomes. Despite this promise, guide RNA efficiency prediction through computational tools still lacks accuracy.
Comput Struct Biotechnol J
September 2023
Random forests (RFs) are a widely used modelling tool capable of feature selection via a variable importance measure (VIM); however, a threshold is needed to control for false positives. In the absence of a good understanding of the characteristics of VIMs, many current approaches attempt to select features associated with the response by training multiple RFs to generate statistical power via a permutation null, by employing recursive feature elimination, or through a combination of both. However, for high-dimensional datasets these approaches become computationally infeasible.
Background: Newborn screening (NBS) is an effective public health intervention that reduces death and disability from treatable genetic diseases, but many conditions are not screened due to a lack of a suitable assay. Whole genome and whole exome sequencing can potentially expand NBS but there remain many technical challenges preventing their use in population NBS. We investigated if targeted gene sequencing (TGS) is a feasible methodology for expanding NBS.
There are inherent complexities and tensions in achieving a responsible balance between safeguarding patients' privacy and sharing genomic data for advancing health and medical science. A growing body of literature suggests establishing patient genomic data ownership, enabled by blockchain technology, as one approach for managing these priorities. We conducted an online survey, applying a mixed methods approach to collect quantitative (using scale questions) and qualitative data (using open-ended questions).
Objective: European and Australian guidelines for cystic fibrosis (CF) reproductive carrier screening recommend testing a small number of high frequency CF causing variants, rather than comprehensive CFTR sequencing. The study objective was to determine variant detection rates of commercially available targeted reproductive carrier screening tests in Australia.
Methods: Next-generation DNA sequencing of the CFTR gene was performed on 2552 individuals from a whole population sample to identify CF causing variants.
Comput Struct Biotechnol J
June 2022
New SARS-CoV-2 variants emerge as part of the virus' adaptation to the human host. Health organizations are monitoring newly emerging variants with suspected impact on disease or vaccination efficacy as Variants Being Monitored (VBM), such as Delta and Omicron. Genetic changes (SNVs) compared to the Wuhan variant characterize VBMs, with current emphasis on the spike protein and lineage markers.
Viral integration is a complex biological process, and it is useful to have a reference integration dataset with known properties to compare experimental data against, or for comparing with the results from computational tools that detect integration. To generate these data, we developed a pipeline for simulating integrations of a viral or vector genome into a host genome. Our method reproduces more complex characteristics of vector and viral integration, including integration of sub-genomic fragments, structural variation of the integrated genomes, and deletions from the host genome at the integration site.
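To make the idea of a simulated integration with known properties concrete, the following is a toy sketch, not the authors' pipeline. It inserts a random sub-genomic fragment of a viral sequence into a host sequence and optionally deletes a few host bases at the integration site, returning the ground truth alongside the result. The function name, the `max_deletion` parameter, and the uniform random choices are all assumptions made for illustration; structural variation of the integrated genome is not modelled.

```python
import random

def simulate_integration(host, virus, rng, max_deletion=5):
    """Insert a random fragment of `virus` into `host` (toy model).

    Assumptions: uniform choice of fragment and site, and a host deletion
    of 0..max_deletion bases starting at the integration site.
    """
    # Choose a non-empty sub-genomic fragment virus[start:end].
    start = rng.randrange(len(virus))
    end = rng.randrange(start + 1, len(virus) + 1)
    fragment = virus[start:end]

    # Choose the integration site and how many host bases are lost there.
    site = rng.randrange(len(host) + 1)
    deleted = rng.randint(0, min(max_deletion, len(host) - site))

    integrated = host[:site] + fragment + host[site + deleted:]
    truth = {"site": site, "deleted": deleted,
             "virus_start": start, "virus_end": end}
    return integrated, truth

if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so a run can be repeated
    seq, truth = simulate_integration("ACGT" * 10, "TTTTGGGGCCCC", rng)
    print(truth)
    print(seq)
```

Returning the `truth` record with each simulated sequence is what allows the output of an integration-detection tool to be checked against known positions.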
Detecting viral and vector integration events is a key step when investigating interactions between viral and host genomes. This is relevant in several fields, including virology, cancer research and gene therapy. For example, investigating integrations of wild-type viruses such as human papillomavirus and hepatitis B virus has proven to be crucial for understanding the role of these integrations in cancer.
Complex genetic diseases may be modulated by a large number of epistatic interactions affecting a polygenic phenotype. Identifying these interactions is difficult due to computational complexity, especially in the case of higher-order interactions where more than two genomic variants are involved. In this paper, we present BitEpi, a fast and accurate method to test all possible combinations of up to four bi-allelic variants (i.
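The computational complexity mentioned above can be made concrete with a short calculation. This sketch only counts the search space of an exhaustive test; it is not BitEpi's algorithm. It assumes the standard encoding in which each bi-allelic variant has three genotypes, so a combination of k variants has 3^k genotype cells. The function names are hypothetical.

```python
from math import comb

def count_combinations(n_variants, max_order=4):
    """Number of variant combinations to test at each interaction order."""
    return {k: comb(n_variants, k) for k in range(1, max_order + 1)}

def genotype_cells(order):
    """Genotype cells per combination, assuming 3 genotypes per variant."""
    return 3 ** order

if __name__ == "__main__":
    for n in (100, 1000, 10000):
        counts = count_combinations(n)
        print(n, counts, "total:", sum(counts.values()))
    print("cells for a 4-variant combination:", genotype_cells(4))
```

The number of 4-variant combinations grows roughly as n^4 / 24, which is why exhaustive higher-order testing quickly becomes expensive as the number of variants increases.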
External DNA sequences can be inserted into an organism's genome either through natural processes such as gene transfer, or through targeted genome engineering strategies. Being able to robustly identify such foreign DNA is a crucial capability for health and biosecurity applications, such as anti-microbial resistance (AMR) detection or monitoring gene drives. This capability does not exist for poorly characterised host genomes or when only limited information about the integrated sequence is available.
The increased development of functionally diverse and highly specialized genome editors has created the need for comparative analytics tools that are able to profile the mutational outcomes, particularly rare and complex outcomes, to assess the editor's applicability to different domains. To address this need, we have developed Generalizable On-target activity ANAlyzer (GOANA), a high-throughput web-based software for determining editing efficiency and cataloguing rare outcomes from next-generation sequencing data. GOANA calculates mutation frequency and outcomes relative to a supplied control sample.
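One simple way to read "mutation frequency relative to a control sample" is sketched below. This is an assumption for illustration and not GOANA's documented method: a read counts as mutated if it differs from the reference amplicon at all, and the control frequency is subtracted as background. The function names are hypothetical, and real pipelines align reads rather than compare whole strings.

```python
def mutation_frequency(reads, reference):
    """Fraction of reads that differ from the reference sequence."""
    if not reads:
        return 0.0
    mutated = sum(1 for read in reads if read != reference)
    return mutated / len(reads)

def control_adjusted_frequency(treated, control, reference):
    """Treated mutation frequency minus control background, floored at 0.

    Assumption: simple background subtraction stands in for whatever
    control comparison a real analysis tool performs.
    """
    background = mutation_frequency(control, reference)
    return max(0.0, mutation_frequency(treated, reference) - background)

if __name__ == "__main__":
    reference = "ACGT"
    treated = ["ACGT", "ACGA", "ACGA", "ACGT"]
    control = ["ACGT", "ACGT", "ACGT", "ACGA"]
    print(control_adjusted_frequency(treated, control, reference))
```

Subtracting the control frequency is meant to separate edits caused by the genome editor from sequencing errors and pre-existing variation that appear in both samples.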