As many structures of protein-DNA complexes have been determined in recent years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One reason is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another is that DNA exhibits less diverse sequence patterns than proteins do. Predicting protein-binding DNA nucleotides is therefore much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. The first SVM model uses DNA sequence data only, and the second uses both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the first model predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. On an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3%, and an MCC of 0.324. The second model achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. On an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5%, and an MCC of 0.329. In both cross-validation and independent testing, the second model, which used both DNA and protein sequence data, performed better than the first model, which used DNA sequence data alone.
To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone.
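The three scores reported above (accuracy, F-measure, and MCC) can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch using the standard textbook definitions, which may differ in detail from the ones the authors applied; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, F-measure, MCC) from binary confusion-matrix counts.

    tp/fn: binding nucleotides predicted as binding / non-binding.
    tn/fp: non-binding nucleotides predicted as non-binding / binding.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total

    # Precision and recall, guarding against empty denominators.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # F-measure is the harmonic mean of precision and recall.
    pr_sum = precision + recall
    f_measure = 2 * precision * recall / pr_sum if pr_sum else 0.0

    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return accuracy, f_measure, mcc


# Hypothetical counts for illustration only (not data from the study).
acc, f1, mcc = binary_metrics(tp=40, fp=10, tn=35, fn=15)
print(f"accuracy={acc:.3f}  F-measure={f1:.3f}  MCC={mcc:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix and stays near 0 for a predictor that guesses at random, which is one reason it is commonly reported alongside accuracy for binding-site prediction.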
Source (DOI): http://dx.doi.org/10.1016/j.cmpb.2014.07.009
J Virol
January 2025
SA MRC Antibody Immunity Research Unit, School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Parktown, Johannesburg, South Africa.
The Antibody Mediated Prevention (AMP) trials showed that passively infused VRC01, a broadly neutralizing antibody (bNAb) targeting the CD4 binding site (CD4bs) on the HIV-1 envelope protein (Env), protected against neutralization-sensitive viruses. We identified six individuals from the VRC01 treatment arm with multi-lineage breakthrough HIV-1 infections from HVTN703, where one variant was sensitive to VRC01 (IC < 25 ug/mL) but another was resistant. By comparing Env sequences of resistant and sensitive clones from each participant, we identified sites predicted to affect VRC01 neutralization and assessed the effect of their reversion in the VRC01-resistant clone on neutralization sensitivity.
Appl Environ Microbiol
January 2025
Centre for Microbiology and Environmental Systems Science, Division of Microbial Ecology, University of Vienna, Vienna, Austria.
Polyunsaturated fatty acids (PUFAs) play a crucial role in aiding bacteria to adapt to extreme and stressful environments. While there is a well-established understanding of their production, accrual, and transfer within marine ecosystems, knowledge about terrestrial environments remains limited. Investigation of the intestinal microbiome of earthworms has illuminated the presence of PUFAs presumably of microbial origin, which contrasts with the surrounding soil.
Microbiol Spectr
January 2025
State Key Laboratory for Diagnosis and Treatment of Infectious Diseases, National Clinical Research Center for Infectious Diseases, National Medical Center for Infectious Diseases, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China.
Considering that the human microbiota plays a critical role in health and disease, an accurate and high-resolution taxonomic classification is thus essential for meaningful microbiome analysis. In this study, we developed an automatic system, named MultiTax pipeline, for generating taxonomy from full-length 16S rRNA sequences using the Genome Taxonomy Database and other existing reference databases. We first constructed the MultiTax-human database, a high-resolution resource specifically designed for human microbiome research and clinical applications.
Scand J Med Sci Sports
January 2025
Department of Surgical Sciences, Dentistry, Gynaecology and Paediatrics, University of Verona, Verona, Italy.
Previous studies in sports science suggested that regular exercise has a positive impact on human health. However, the effects of endurance sports and their underlying mechanisms are still not completely understood. One of the main debates regards the modulation of immune dynamics in high-intensity exercise.
Mol Genet Genomic Med
January 2025
Diagnostics and Therapeutics of Intractable Diseases, Intractable Disease Research Center, Graduate School of Medicine, Juntendo University, Tokyo, Japan.
Background: Sengers syndrome is an autosomal recessive mitochondrial DNA depletion syndrome characterized by hypertrophic cardiomyopathy, congenital cataracts, skeletal myopathy, exercise intolerance, and lactic acidosis. Dysfunction of acylglycerol kinase (AGK) is responsible for the disease, and several AGK gene variants have been reported.
Methods: We employed a comprehensive genomic analysis approach, including whole-genome sequencing and RNA sequencing, combined with various bioinformatics tools.