Recent advances in high-throughput sequencing have exponentially increased the amount of genomic data available for animals (Metazoa) in recent decades, with high-quality chromosome-level genomes being published almost daily. Nevertheless, generating a new genome is not an easy task because of the high cost of genome sequencing, the complexity of assembly, and the lack of standardized protocols for genome annotation. The lack of consensus in the annotation and publication of genome files hinders research: researchers lose time reformatting the files for their purposes, and the quality of the genetic repertoire used in an evolutionary study can be reduced. Thus, transcriptomes obtained using the same pipeline remain a valuable proxy for the genetic content of species, being easier to obtain, cheaper, and more comparable than genomes. In a previous study, we presented the Metazoan Assemblies from Transcriptomic Ensembles database (MATEdb), a repository of high-quality transcriptomic and genomic data for the two most diverse animal phyla, Arthropoda and Mollusca. Here, we present the newest version of MATEdb (MATEdb2), which overcomes some of the previous limitations of our database: (i) we include data from all animal phyla for which public data are available, and (ii) we provide gene annotations extracted from the original GFF genome files using the same pipeline. In total, we provide proteomes inferred from high-quality transcriptomic or genomic data for almost 1,000 animal species, including the longest isoforms, all isoforms, and functional annotation based on sequence homology and protein language models, as well as the embedding representations of the sequences. We believe this new version of MATEdb will accelerate research on animal phylogenomics while saving thousands of hours of computational work in a plea for open, greener, and collaborative science.
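As an illustration of what a "longest isoform" proteome means in practice, the following is a minimal Python sketch that keeps one protein per gene from a FASTA file. It is not the MATEdb2 pipeline. The helper names (`parse_fasta`, `longest_isoforms`, `gene_from_header`) and the `<gene>.<isoform>` header convention are assumptions made only for this example; real headers depend on the assembler or annotation source.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gene_from_header(header):
    # Hypothetical convention: the first token is "<gene>.<isoform>".
    # Adapt this function to the header format of your own data.
    return header.split()[0].rsplit(".", 1)[0]


def longest_isoforms(records, gene_of=gene_from_header):
    """Return {gene: (header, sequence)} keeping the longest sequence per gene.

    Ties are resolved in favour of the isoform seen first.
    """
    best = {}
    for header, seq in records:
        gene = gene_of(header)
        if gene not in best or len(seq) > len(best[gene][1]):
            best[gene] = (header, seq)
    return best


# Toy input: two isoforms of geneA and one of geneB.
EXAMPLE = """>geneA.1
MKT
>geneA.2
MKTAYIAK
>geneB.1
MSL
"""

result = longest_isoforms(parse_fasta(EXAMPLE))
for gene, (header, seq) in sorted(result.items()):
    print(gene, header, len(seq))
```

Passing the gene-identifier rule in as a function (`gene_of`) keeps the selection logic independent of any one header format, which matters when sequences come from heterogeneous sources.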
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11534026 | PMC |
http://dx.doi.org/10.1093/gbe/evae235 | DOI Listing |
Front Biosci (Schol Ed)
December 2024
Laboratory of Intracellular Membranes Dynamics, Institute of Cytology of the Russian Academy of Sciences, 194064 Saint Petersburg, Russia.
Background: Real-time reverse transcription quantitative polymerase chain reaction (RT-qPCR) is a powerful tool for analysing target gene expression in biological samples. To achieve reliable results by RT-qPCR, the most stable reference genes must be selected for proper data normalisation, particularly when comparing cells of different types. We aimed to choose the least variable candidate reference genes among eight housekeeping genes tested within a set of human cancer cell lines (HeLa, MCF-7, SK-UT-1B, A549, A431, SK-BR-3), as well as four lines of normal, non-malignant mesenchymal stromal cells (MSCs) of different origins.
Front Biosci (Schol Ed)
December 2024
Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA.
Background: Alternative cleavage and polyadenylation (APA) is a crucial post-transcriptional gene regulation mechanism that regulates gene expression in eukaryotes by increasing the diversity and complexity of both the transcriptome and proteome. Despite the development of more than a dozen experimental methods over the last decade to identify and quantify APA events, widespread adoption of these methods has been limited by technical, financial, and time constraints. Consequently, APA remains poorly understood in most eukaryotes.
Front Biosci (Landmark Ed)
November 2024
Department of Hematology, Taizhou Hospital of Zhejiang Province Affiliated to Wenzhou Medical University, 317000 Taizhou, Zhejiang, China.
In this comprehensive review, we delve into the transformative role of artificial intelligence (AI) in refining the application of multi-omics and spatial multi-omics within the realm of diffuse large B-cell lymphoma (DLBCL) research. We scrutinized the current landscape of multi-omics and spatial multi-omics technologies, accentuating their combined potential with AI to provide unparalleled insights into the molecular intricacies and spatial heterogeneity inherent to DLBCL. Despite current progress, we acknowledge the hurdles that impede the full utilization of these technologies, such as the integration and sophisticated analysis of complex datasets, the necessity for standardized protocols, the reproducibility of findings, and the interpretation of their biological significance.
JACS Au
December 2024
Natural Products Research Institute, College of Pharmacy, Seoul National University, Seoul 08826, Republic of Korea.
Four new macrolides, spirosnuolides A-D (-, respectively), were discovered from the termite nest-derived sp. INHA29. Spirosnuolides A-D are 18-membered macrolides sharing an embedded [6,6]-spiroketal functionality inside the macrocycle and are conjugated with structurally uncommon side chains featuring cyclopentenone, 1,4-benzoquinone, hydroxyfuroic acid, or butenolide moieties.
JAMIA Open
February 2025
Medical Oncology, IRCCS Sacro Cuore Don Calabria Hospital, 37024 Negrar di Valpolicella, Verona, Italy.
Objectives: In recent years, the rise of big data and artificial intelligence has led to an increasing expansion of databases and web services in biomedical research. cBioPortal is one of the most widely used platforms for accessing cancer genomic and clinical data. The primary objective of this study was to develop a tool that simplifies programmatic interaction with cBioPortal's web service.