Background: Microbial genomes are composed largely of protein-coding sequences, yet some genomes contain many pseudogenes caused by frameshifts or internal stop codons. These pseudogenes are believed to result from gene degradation during evolution but could also be technical artifacts of genome sequencing or assembly.
Results: Using a combination of observational and experimental data, we show that many putative pseudogenes are attributable to errors that are incorporated into genomes during assembly.
The observed diversity of protein-coding sequences continues to increase far more rapidly than knowledge of their functions, making classification algorithms essential for assigning functions to proteins using only their sequences. Most pipelines for annotating proteins rely on searches for homologous sequences in databases of previously annotated proteins using BLAST or HMMER. Here, we develop a new approach for classifying proteins into a taxonomy of functions and demonstrate its utility for genome annotation.
Viruses represent important test cases for data federation due to their genome size and the rapid increase in sequence data in publicly available databases. However, consequences of previously decentralized (unfederated) data include a lack of consensus on feature annotations and a lack of comparisons between them. Unifying or displaying alternative annotations should be a priority both for communities with robust entry representation and for nascent communities with burgeoning data sources.
The direct visualization of neurotransmitters is a continuing problem in neuroscience; however, functional fluorescent sensors for organic analytes are still rare. Herein, we describe a fluorescent sensor for glutamate and zinc ions. The sensor acts as a fluorescent logic gate, giving a turn-off response to glutamate or zinc ion alone.