AI Article Synopsis

  • This study evaluates different methods for predicting promoter regions in the human genome using data from the ENCODE project.
  • With predictions allowed up to 1,000 nucleotides from the reference transcription start sites, sensitivity ranged from 32% to 56% and positive predictive value from 79% to 93%.
  • The best-performing predictors combined promoter prediction with gene prediction, which narrowed the search space and reduced false positives; the authors also propose a framework for future promoter prediction assessments.

Article Abstract

Background: This study analyzes the predictions of a number of promoter predictors on the ENCODE regions of the human genome as part of the ENCODE Genome Annotation Assessment Project (EGASP). The systems analyzed operate on various principles, and we assessed the effectiveness of the different conceptual strategies by correlating their promoter predictions with the manually annotated 5' gene ends.

Results: The predictions were assessed relative to the manual HAVANA annotation of the 5' gene ends. These 5' gene ends were used as the estimated reference transcription start sites. With the maximum allowed distance for predictions of 1,000 nucleotides from the reference transcription start sites, the sensitivity of predictors was in the range 32% to 56%, while the positive predictive value was in the range 79% to 93%. The average distance mismatch of predictions from the reference transcription start sites was in the range 259 to 305 nucleotides. At the same time, using transcription start site estimates from DBTSS and H-Invitational databases as promoter predictions, we obtained a sensitivity of 58%, a positive predictive value of 92%, and an average distance from the annotated transcription start sites of 117 nucleotides. In this experiment, the best performing promoter predictors were those that combined promoter prediction with gene prediction. The main reason for this is the reduced promoter search space that resulted in smaller numbers of false positive predictions.
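The three measures reported above (sensitivity, positive predictive value, and average distance mismatch) can be illustrated with a minimal Python sketch. The counting rules here are assumptions, since the abstract does not spell them out: a prediction counts as a true positive if it lies within max_dist nucleotides of any reference transcription start site, a reference site counts as detected if at least one prediction lies within max_dist of it, and positions are plain integer coordinates on a single sequence with strand ignored. The function names evaluate and nearest_distance are hypothetical and are not taken from the EGASP tooling.

```python
from bisect import bisect_left
from statistics import mean


def nearest_distance(position, sorted_sites):
    """Distance from position to the closest entry of a non-empty sorted list."""
    i = bisect_left(sorted_sites, position)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_sites[max(i - 1, 0):i + 1]
    return min(abs(position - s) for s in candidates)


def evaluate(predictions, reference_tss, max_dist=1000):
    """Score promoter predictions against reference transcription start sites.

    Assumed rules (not confirmed by the source): single sequence, strand
    ignored, a match means a distance of at most max_dist nucleotides.
    """
    preds = sorted(predictions)
    refs = sorted(reference_tss)
    if not preds or not refs:
        raise ValueError("need at least one prediction and one reference site")

    # Distances of the predictions that fall close enough to a reference site.
    hit_distances = [
        d for d in (nearest_distance(p, refs) for p in preds) if d <= max_dist
    ]
    # Reference sites that have at least one prediction close enough.
    detected = sum(1 for r in refs if nearest_distance(r, preds) <= max_dist)

    return {
        "sensitivity": detected / len(refs),
        "ppv": len(hit_distances) / len(preds),
        "mean_distance": mean(hit_distances) if hit_distances else None,
    }


if __name__ == "__main__":
    # Toy coordinates for illustration only; these are not EGASP data.
    print(evaluate([1200, 4900, 20000, 5300], [1000, 5000, 9000]))
```

Under these assumptions, a predictor that restricts its search space emits fewer predictions far from any reference site, which raises the positive predictive value without necessarily changing sensitivity.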

Conclusion: The main finding, now supported by comprehensive data, is that the accuracy of human promoter predictors for high-throughput annotation purposes can be significantly improved if promoter prediction is combined with gene prediction. Based on the lessons learned in this experiment, we propose a framework for the preparation of the next similar promoter prediction assessment.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1810552
DOI: http://dx.doi.org/10.1186/gb-2006-7-s1-s3

Publication Analysis

Top Keywords

transcription start: 20
start sites: 16
promoter predictions: 12
promoter predictors: 12
reference transcription: 12
promoter prediction: 12
promoter: 10
encode regions: 8
gene ends: 8
positive predictive: 8

Similar Publications

Upon infection, human papillomavirus (HPV) manipulates host cell gene expression to create an environment that is supportive of a productive and persistent infection. The virus-induced changes to the host cell's transcriptome are thought to contribute to carcinogenesis. Here, we show by RNA-sequencing that oncogenic HPV18 episome replication in primary human foreskin keratinocytes (HFKs) drives host transcriptional changes that are consistent between multiple HFK donors.

Alopecia areata (AA) is a chronic, autoimmune skin disease characterized by non-scarring hair loss. Baricitinib, a Janus kinase inhibitor (JAKi), prevents hair loss and promotes hair regrowth by inhibiting the inflammatory Janus kinase-signal transducer and activator of transcription (JAK-STAT) signaling pathway involved in cytotoxic T cell responses targeting hair follicles. The introduction of JAKi has transformed treatment against severe AA.

The formation of condensed heterochromatin is critical for establishing cell-specific transcriptional programs. To reveal structural transitions underlying heterochromatin formation in maturing mouse rod photoreceptors, we apply cryo-EM tomography, AI-assisted deep denoising, and molecular modeling. We find that chromatin isolated from immature retina cells contains many closely apposed nucleosomes with extremely short or absent nucleosome linkers, which are inconsistent with the typical two-start zigzag chromatin folding.

Background: Sclerostin (SOST) is traditionally regarded as an osteocyte-derived secreted glycoprotein that regulates bone mineralization. Recent studies reported that SOST is also released from non-skeletal sources, especially during inflammation. However, the cellular source and regulatory mechanisms governing SOST generation in inflammation remain unclear.

Patterns of Isoform Variation for N Gene Subgenomic mRNAs in Betacoronavirus Transcriptomes.

Viruses

December 2024

Department of Biology, Center for Computational and Integrative Biology, Rutgers University, Camden, NJ 08102, USA.

The nucleocapsid (N) protein is the most expressed protein in later stages of SARS-CoV-2 infection with several important functions. It is translated from a subgenomic mRNA (sgmRNA) formed by template switching during transcription. A recently described translation initiation site (TIS) with a CTG codon in the leader sequence (TIS-L) is out of frame with most structural and accessory genes including the N gene and may act as a translation suppressor.
