Motivation: Promoters are short DNA consensus sequences located proximal to the transcription start sites of genes, where they allow transcription initiation of particular genes. However, the precise prediction of promoters remains a challenging task because individual promoters often differ from the consensus at one or more positions.
Results: In this study, we present a new multi-layer computational approach, called MULTiPly, for recognizing promoters and their specific types. MULTiPly draws on the sequences themselves, combining local information (k-tuple nucleotide composition and dinucleotide-based auto covariance) with global information about the entire sample set (bi-profile Bayes and k-nearest neighbour feature encodings). The F-score feature selection method was applied to identify the best-performing single feature type, and the other feature types were subsequently added to further improve the prediction performance of MULTiPly. Experiments on the benchmark dataset and comparisons with five state-of-the-art tools show that MULTiPly achieves better prediction performance in 5-fold cross-validation and jackknife tests. Moreover, the superiority of MULTiPly was also validated on a newly constructed independent test dataset. MULTiPly is expected to be a useful tool that will facilitate the discovery of both general and specific types of promoters in the post-genomic era.
Availability And Implementation: The MULTiPly webserver and curated datasets are freely available at http://flagshipnt.erc.monash.edu/MULTiPly/.
Supplementary Information: Supplementary data are available at Bioinformatics online.
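As an illustration of two ingredients named in the Results, the following minimal Python sketch computes a k-tuple nucleotide composition vector and an F-score for a single feature. It is not the MULTiPly implementation: the function names `kmer_composition` and `f_score` are hypothetical, and the F-score formula is assumed to be the commonly used two-class definition (squared deviations of the class means from the overall mean, divided by the sum of the within-class sample variances), which the abstract does not spell out.

```python
from itertools import product


def kmer_composition(seq, k=2):
    """Normalized frequencies of all 4**k k-tuples, in lexicographic ACGT order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def f_score(pos, neg):
    """Assumed two-class F-score of one feature.

    pos and neg hold the feature's values in the positive and negative
    samples; each list needs at least two values.
    """
    def mean(xs):
        return sum(xs) / len(xs)

    def sample_var(xs, m):
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    m_all = mean(pos + neg)
    m_pos, m_neg = mean(pos), mean(neg)
    numerator = (m_pos - m_all) ** 2 + (m_neg - m_all) ** 2
    denominator = sample_var(pos, m_pos) + sample_var(neg, m_neg)
    if denominator == 0:
        # No within-class spread: the feature either separates the
        # classes perfectly or is constant across all samples.
        return float("inf") if numerator > 0 else 0.0
    return numerator / denominator
```

Under these assumptions, each sequence would be encoded with `kmer_composition`, and every column of the resulting feature matrix would be scored with `f_score` so that features can be ranked from most to least discriminative.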
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6736106 | PMC |
| http://dx.doi.org/10.1093/bioinformatics/btz016 | DOI Listing |
Hum Genomics
January 2025
Population Health Program, QIMR Berghofer Medical Research Institute, Herston, QLD, 4006, Australia.
Background: TP53 variant classification benefits from the availability of large-scale functional data for missense variants generated using cDNA-based assays. However, absence of comprehensive splicing assay data for TP53 confounds the classification of the subset of predicted missense and synonymous variants that are also predicted to alter splicing. Our study aimed to generate and apply splicing assay data for a prioritised group of 59 TP53 predicted missense or synonymous variants that are also predicted to affect splicing by either SpliceAI or MaxEntScan.
Cancer Cell Int
January 2025
Institute for Genome Engineered Animal Models of Human Diseases, National Center of Genetically Engineered Animal Models for International Research, Dalian Medical University, 9 West Section Lvshun South Road, Dalian, 116044, China.
Clear cell renal cell carcinoma (ccRCC) is a globally severe cancer with an unfavorable prognosis. PANoptosis, a form of cell death regulated by PANoptosomes, plays a role in numerous cancer types. However, the specific roles of genes associated with PANoptosis in the development and advancement of ccRCC remain unclear.
BMC Oral Health
January 2025
Nanfang Hospital, Southern Medical University, Guangzhou, 510515, China.
Background: Oral infectious diseases, such as dental caries, periodontitis and periapical periodontitis, are often complicated by causative bacterial biofilm formation and significantly impact human oral health and quality of life. Bacteriophage (phage) therapy has emerged as a potential alternative with successful applications in antimicrobial trials. While therapeutic use of phages has been considered an effective treatment for some infectious diseases, research focusing on oral infectious diseases remains scarce and has received little attention.
BMC Genomics
January 2025
Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium.
The influx of whole genome sequencing (WGS) data in the public health and clinical diagnostic sectors has created a need for data analysis methods and bioinformatics expertise, which can be a bottleneck for many laboratories. At Sciensano, the Belgian national public health institute, an intuitive and user-friendly bioinformatics tool portal was implemented using Galaxy, an open-source platform for data analysis and workflow creation. The Galaxy @Sciensano instance is available to both internal and external scientists and offers a wide range of tools provided by the community, complemented by over 50 custom tools and pipelines developed in-house.
BMC Plant Biol
January 2025
Triticeae Research Institute, Sichuan Agricultural University, Chengdu, Sichuan, 611130, China.
Background: The St-genome-sharing taxa are a highly complex group of species within the Triticeae that share the St nuclear genome and a monophyletic origin in maternal lineages. The group contains more than half of the polyploid species, which are distributed across a wide range of ecological habitats. While a high level of genetic heterogeneity in plastome DNA, due to a reticulate evolutionary event, has been linked to the richness of the St-genome-sharing taxa, the relationship between the dynamics of diversification and molecular evolution is poorly understood.
Results: Here, integrating 106 previously sequenced and 12 newly sequenced plastomes representing almost all previously recognized genomic types and genera of the Triticeae, this study applies phylogenetic reconstruction methods in combination with lineage diversification analyses, estimates of sequence evolution, and gene expression to investigate the dynamics of diversification in the tribe.