Motivation: Developing competency in the broad area of bioinformatics is challenging globally, owing to the breadth of the field and the diversity of its audiences for education and training. Course design can be facilitated by the use of a competency framework: a set of competency requirements that define the knowledge, skills and attitudes needed by individuals in (or aspiring to be in) a particular profession or role. These competency requirements can help to define curricula, as they can inform both the content and the level to which competency needs to be developed.
JMIR Bioinform Biotechnol
October 2022
Background: The mammalian immune system is able to generate antibodies against a huge variety of antigens, including bacteria, viruses, and toxins. The ultradeep DNA sequencing of rearranged immunoglobulin genes has considerable potential in furthering our understanding of the immune response, but it is limited by the lack of a high-throughput, sequence-based method for predicting the antigen(s) that a given immunoglobulin recognizes.
Objective: As a step toward the prediction of antibody-antigen binding from sequence data alone, we aimed to compare a range of machine learning approaches that were applied to a collated data set of antibody-antigen pairs in order to predict antibody-antigen binding from sequence data.
Background: Pairwise alignment of short DNA sequences with affine-gap scoring is a common processing step performed in a range of bioinformatics analyses. Dynamic programming (i.e.
IEEE/ACM Trans Comput Biol Bioinform
April 2021
A variant caller is used to identify variations in an individual genome (compared to the reference genome) in a genome processing pipeline. For the sake of accuracy, modern variant callers perform many local re-assemblies on small regions of the genome using a graph-based algorithm. However, such graph-based data structures are inefficiently stored in the linear memory of modern computers, which in turn reduces computing efficiency.
Appl Microbiol Biotechnol
January 2019
Quorum sensing (QS) is a form of cell-to-cell communication that bacteria use to regulate collective behaviors. Quorum sensing controls virulence factor production in many bacterial species, and it is regarded as an attractive target for combating bacterial pathogenicity, especially in antibiotic-resistant bacteria. Chlorogenic acid (CA), abundant in fruits, vegetables, and Chinese herbs, possesses multiple activities.
The International Society of Computational Biology and Bioinformatics (ISCB) brings together scientists from a wide range of disciplines, including biology, medicine, computer science, mathematics and statistics. Practitioners in these fields are constantly dealing with information in visual form: from microscope images and photographs of gels to scatter plots, network graphs and phylogenetic trees, structural formulae and protein models to flow diagrams, visual aids for problem-solving are omnipresent. The offered a way to show the beauty of science in art form.
Bioinformatics is recognized as part of the essential knowledge base of numerous career paths in biomedical research and healthcare. However, there is little agreement in the field over what that knowledge entails or how best to provide it. These disagreements are compounded by the wide range of populations in need of bioinformatics training, with divergent prior backgrounds and intended application areas.
Motivation: The Variant Call Format (VCF) is widely used to store data about genetic variation. Variant calling workflows detect potential variants in large numbers of short sequence reads generated by DNA sequencing and report them in VCF format. To evaluate the accuracy of variant callers, it is critical to correctly compare their output against a reference VCF file containing a gold standard set of variants.
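As a rough illustration of what such a comparison involves, the following Python sketch counts true positives, false positives and false negatives between a call set and a gold standard set. It is not the method described in this article: the names `Variant`, `parse_vcf_lines` and `compare_calls` are hypothetical, and the sketch assumes that both files describe each variant with an identical position, REF and ALT. Equivalent variants written in different representations would be miscounted by this naive exact match.

```python
from typing import Iterable, NamedTuple


class Variant(NamedTuple):
    """Hypothetical key for one ALT allele at one site."""
    chrom: str
    pos: int  # 1-based position, as in the VCF POS column
    ref: str
    alt: str


def parse_vcf_lines(lines: Iterable[str]) -> set:
    """Collect one Variant per ALT allele from tab-separated VCF data lines."""
    variants = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta-information lines and the header
        chrom, pos, _id, ref, alts = line.split("\t")[:5]
        for alt in alts.split(","):  # multi-allelic sites list ALTs with commas
            variants.add(Variant(chrom, int(pos), ref.upper(), alt.upper()))
    return variants


def compare_calls(called: set, truth: set) -> dict:
    """Naive exact-match comparison (assumes identical variant representation)."""
    tp = len(called & truth)  # called and present in the gold standard
    fp = len(called - truth)  # called but absent from the gold standard
    fn = len(truth - called)  # in the gold standard but not called
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall}
```

In practice, `parse_vcf_lines` could be fed an open file handle for each uncompressed VCF file, since iterating over a text file yields its lines.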
This message is a response from the ISCB to the recent New England Journal of Medicine (NEJM) editorial on data sharing.
BMC Bioinformatics
May 2015
Background: A pharmacophore model consists of a group of chemical features arranged in three-dimensional space that can be used to represent the biological activities of the described molecules. Clustering of molecular interactions of ligands on the basis of their pharmacophore similarity provides an approach for investigating how diverse ligands can bind to a specific receptor site or different receptor sites with similar or dissimilar binding affinities. However, efficient clustering of pharmacophore models in three-dimensional space is currently a challenge.
Summary: Rapid technological advances have led to an explosion of biomedical data in recent years. The pace of change has inspired new collaborative approaches for sharing materials and resources to help train life scientists both in the use of cutting-edge bioinformatics tools and databases and in how to analyse and interpret large datasets. A prototype platform for sharing such training resources was recently created by the Bioinformatics Training Network (BTN).
Antigen selection of B cells within the germinal center reaction generally leads to the accumulation of replacement mutations in the complementarity-determining regions (CDRs) of immunoglobulin genes. Studies of mutations in IgE-associated VDJ gene sequences have cast doubt on the role of antigen selection in the evolution of the human IgE response, and it may be that selection for high affinity antibodies is a feature of some but not all allergic diseases. The severity of IgE-mediated anaphylaxis is such that it could result from higher affinity IgE antibodies.
Current single-locus-based analyses and candidate disease gene prediction methodologies used in genome-wide association studies (GWAS) do not capitalize on the wealth of the underlying genetic data, nor on the functional data available from molecular biology. Here, we analyzed GWAS data from the Wellcome Trust Case Control Consortium (WTCCC) on coronary artery disease (CAD). Gentrepid uses a multiple-locus-based approach, drawing on protein pathway- or domain-based data to make predictions.
Background: Candidate disease gene prediction is a rapidly developing area of bioinformatics research with the potential to deliver great benefits to human health. As experimental studies detecting associations between genetic intervals and disease proliferate, better bioinformatic techniques that can expand and exploit the data are required.
Description: Gentrepid is a web resource which predicts and prioritizes candidate disease genes for both Mendelian and complex diseases.
The existence of many highly similar genes in the lymphocyte receptor gene loci makes them difficult to investigate, and the determination of phased "haplotypes" has been particularly problematic. However, V(D)J gene rearrangements provide an opportunity to infer the association of Ig genes along the chromosomes. The chromosomal distribution of H chain genes in an Ig genotype can be inferred through analysis of VDJ rearrangements in individuals who are heterozygous at points within the IGH locus.
Background: Genome-wide association studies (GWAS) aim to identify causal variants and genes for complex disease by independently testing a large number of SNP markers for disease association. Although genes have been implicated in these studies, few utilise the multiple-hit model of complex disease to identify causal candidates. A major benefit of multi-locus comparison is that it compensates for some shortcomings of current statistical analyses that test the frequency of each SNP in isolation for the phenotype population versus control.
We have analysed the transcribed immunoglobulin kappa (IGK) repertoire of peripheral blood B cells from four individuals from two genetically distinct populations, Papua New Guinean and Australian, using high-throughput DNA sequencing. The depth of sequencing data for each individual averaged 5,548 high-quality IGK reads, and permitted genotyping of the inferred IGKV and IGKJ germline gene segments for each individual. All individuals were homozygous at each IGKJ locus and had highly similar inferred IGKV genotypes.
Complete and accurate knowledge of the genes and allelic variants of the human immunoglobulin gene loci is critical for studies of B cell repertoire development and somatic point mutation, but evidence from studies of VDJ rearrangements suggests that our knowledge of the available immunoglobulin gene repertoire is far from complete. The reported repertoire has changed little over the last 15 years. This is, in part, a consequence of the inefficiencies involved in searching for new members of large, multigenic gene families by cloning and sequencing.