Publications by authors named "Alla D Fedorova"

Ribosome profiling (Ribo-Seq) captures a "snapshot" of ribosome positions across the entire transcriptome of a cell at sub-codon resolution, providing insights into gene expression and enabling the discovery of novel translated regions. RiboGalaxy (https://ribogalaxy.genomicsdatascience.

The synthesis of most proteins begins at AUG codons, yet a small number of non-AUG-initiated proteoforms are also known. Here we analyse a large number of publicly available Ribo-seq datasets to identify novel, previously uncharacterised non-AUG proteoforms using the Trips-Viz implementation of a novel algorithm for detecting translated ORFs. In parallel, we analyse a genomic alignment of 120 mammals to identify evidence of protein-coding evolution in sequences encoding potential extensions.

Defective interfering (DI) particles arise during virus propagation, are conditional on parental virus for replication and packaging, and interfere with viral expansion. There is much interest in developing DIs as anti-viral agents. Here we characterize DI particles that arose following serial passaging of SARS-CoV-2 at high multiplicity of infection.

The research article describing the discovery of ribosomal frameshifting in the bacterial CopA gene also reported the occurrence of frameshifting in the expression of the human ortholog ATP7B based on assays using dual luciferase reporters. An examination of the publicly available ribosome profiling data and the phylogenetic analysis of the proposed frameshifting site cast doubt on the validity of this claim and prompted us to reexamine the evidence. We observed apparent frameshifting efficiencies similar to those reported by the original authors when using the same type of vector, which synthesizes both luciferases as a single polyprotein.

Recent proteogenomic studies revealed extensive translation outside of annotated protein-coding regions, such as non-coding RNAs and untranslated regions of mRNAs. This non-canonical translation is largely due to start codon plurality within the same RNA. This plurality is often due to the failure of some scanning ribosomes to recognize potential start codons, leading to initiation downstream, a process termed leaky scanning.

Somatic mutations in regulatory sites of human stem cells affect cell identity or cause malignant transformation. By mining the human genome for co-occurrence of mutations and transcription factor binding sites, we show that C/EBP binding sites are strongly enriched with [C > T]G mutations in cancer and adult stem cells, which is of special interest because C/EBPs regulate cell fate and differentiation. An in vitro protein-DNA binding assay and structural modeling of the CEBPB-DNA complex show that the G·T mismatch in the core CG dinucleotide strongly enhances the affinity of the binding site.

Objectives: Mammalian genomics studies, especially those focusing on transcriptional regulation, require information on the genomic locations of regulatory regions, particularly transcription factor (TF) binding sites. There is a large body of published ChIP-Seq data on in vivo binding of transcription factors in different cell types and conditions. However, handling thousands of separate data sets is often impractical, and it is desirable to have a single global map of genomic regions potentially bound by a particular TF in any of the studied cell types and conditions.

Background: Adaptive immune responses to newly encountered pathogens depend on the mobilization of antigen-specific clonotypes from a vastly diverse pool of naive T cells. Using recent advances in immune repertoire sequencing technologies, models of the immune receptor rearrangement process, and a database of annotated T cell receptor (TCR) sequences with known specificities, we explored the baseline frequencies of T cells specific for defined human leukocyte antigen (HLA) class I-restricted epitopes in healthy individuals.

Methods: We used a database of TCR sequences with known antigen specificities and a probabilistic TCR rearrangement model to estimate the baseline frequencies of epitope-specific T cells, that is, of TCRs specific to distinct antigens.

We present a major update of the HOCOMOCO collection, which consists of patterns describing DNA binding specificities for human and mouse transcription factors. In this release, we profited from a nearly doubled volume of published in vivo experiments on transcription factor (TF) binding to expand the repertoire of binding models, replace low-quality models previously based on in vitro data only, and cover more than a hundred TFs with previously unknown binding specificities. This was achieved by systematic motif discovery from more than five thousand ChIP-Seq experiments uniformly processed within the BioUML framework with several ChIP-Seq peak calling tools and aggregated in the GTRD database.
