1,484 results match your criteria: "Cambridge CB10 1SD UK ; Wellcome Trust Sanger Institute[Affiliation]"
Genetics
January 2025
EMBL-EBI - Non-Vertebrate Genomics Team, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK.
The rapid increase in the number of reference-quality genome assemblies presents significant new opportunities for genomic research. However, the absence of standardized naming conventions for genome assemblies and annotations across datasets creates substantial challenges. Inconsistent naming hinders the identification of correct assemblies, complicates the integration of bioinformatics pipelines, and makes it difficult to link assemblies across multiple resources.
Gigascience
January 2025
Department of Animal Science, Iowa State University, Ames, IA, 50011, US.
The scientific community has long benefited from the opportunities provided by data reuse. Recognizing the need to identify the challenges and bottlenecks to reuse in the agricultural research community and propose solutions for them, the data reuse working group was started within the AgBioData consortium framework. Here, we identify the limitations of data standards, metadata deficiencies, data interoperability, data ownership, data availability, user skill level, resource availability, and equity issues, with a specific focus on agricultural genomics research.
bioRxiv
September 2024
Riboscope Ltd, 23 King Street, Cambridge, CB1 1AH, UK.
RNA secondary (2D) structure visualisation is an essential tool for understanding RNA function. R2DT is a software package designed to visualise RNA 2D structures in consistent, recognisable, and reproducible layouts. The latest release, R2DT 2.
New Phytol
January 2025
Université de Lorraine, INRAE, LAE, 54000, Nancy, France.
Specialized metabolites are molecules involved in plants' interaction with their environment. Elucidating their biosynthetic pathways is a challenging but rewarding task, leading to societal applications and ecological insights. Furanocoumarins emerged multiple times in Angiosperms, raising the question of how different enzymes evolved into catalyzing identical reactions.
Gigascience
January 2025
School of Life, Health & Chemical Sciences, The Open University, Milton Keynes, Buckinghamshire, MK7 6AA, UK.
Background: Bioinformatics is fundamental to biomedical sciences, but its mastery presents a steep learning curve for bench biologists and clinicians. Learning to code while analyzing data is difficult. The curve may be flattened by separating these two aspects and providing intermediate steps for budding bioinformaticians.
J Proteome Res
January 2025
European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, U.K.
The PRIDE database is the largest public data repository of mass spectrometry-based proteomics data and currently stores more than 40,000 data sets covering a wide range of organisms, experimental techniques, and biological conditions. During the past few years, PRIDE has seen a significant increase in the amount of submitted data-independent acquisition (DIA) proteomics data sets. This provides an excellent opportunity for large-scale data reanalysis and reuse.
bioRxiv
December 2024
European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
Nat Commun
December 2024
GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China.
Cell type deconvolution methods can impute cell proportions from bulk transcriptomics data, revealing changes in disease progression or organ development. But benchmarking studies often use simulated bulk data from the same source as the reference, which limits their application scenarios. This study examines batch effects in deconvolution and introduces SCCAF-D, a computational workflow that ensures a Pearson Correlation Coefficient above 0.
Bioinform Adv
November 2024
School of Computer Science and Engineering, UNSW Sydney, Sydney, NSW 2052, Australia.
Motivation: Developing competency in the broad area of bioinformatics is challenging globally, owing to the breadth of the field and the diversity of its audiences for education and training. Course design can be facilitated by the use of a competency framework, a set of competency requirements that define the knowledge, skills and attitudes needed by individuals in (or aspiring to be in) a particular profession or role. These competency requirements can help to define curricula as they can inform both the content and level to which competency needs to be developed.
Nucleic Acids Res
January 2025
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
Ensembl (www.ensembl.org) is an open platform integrating publicly available genomics data across the tree of life with a focus on eukaryotic species related to human health, agriculture and biodiversity.
NAR Genom Bioinform
December 2024
Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, SE-17121 Solna, Sweden.
The Quest for Orthologs (QfO) orthology benchmark service (https://orthology.benchmarkservice.org) hosts a wide range of standardized benchmarks for orthology inference evaluation.
Gigascience
January 2024
Department of Biomedical Sciences, University of Padova, Padova 35131, Italy.
Supervised machine learning (ML) is used extensively in biology and deserves closer scrutiny. The Data Optimization Model Evaluation (DOME) recommendations aim to enhance the validation and reproducibility of ML research by establishing standards for key aspects such as data handling and processing, optimization, evaluation, and model interpretability. The recommendations help to ensure that key details are reported transparently by providing a structured set of questions.
Bioinformatics
December 2024
European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, United Kingdom.
Motivation: Genome-wide association studies (GWAS) have been remarkably successful in identifying associations between genetic variants and imaging-derived phenotypes. To date, the main focus of these analyses has been on established, clinically-used imaging features. We sought to investigate if deep learning approaches can detect more nuanced patterns of image variability.
Nucleic Acids Res
January 2025
Protecting Crops and the Environment, Rothamsted Research, Harpenden AL5 2JQ, UK.
Nucleic Acids Res
January 2025
Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK.
CATH (https://www.cathdb.info) is a structural classification database that assigns domains to the structures in the Protein Data Bank (PDB) and AlphaFold Protein Structure Database (AFDB) and adds layers of biological information, including homology and functional annotation.
Nucleic Acids Res
January 2025
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK.
InterPro (https://www.ebi.ac.
Nucleic Acids Res
January 2025
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
GENCODE produces comprehensive reference gene annotation for human and mouse. Entering its twentieth year, the project remains highly active as new technologies and methodologies allow us to catalog the genome at ever-increasing granularity. In particular, long-read transcriptome sequencing enables us to identify large numbers of missing transcripts and to substantially improve existing models, and our long non-coding RNA catalogs have undergone a dramatic expansion and reconfiguration as a result.
Nucleic Acids Res
January 2025
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
The Complex Portal (www.ebi.ac.
bioRxiv
October 2024
Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Catalonia, Spain.
Nucleic Acids Res
January 2025
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK.
The Pfam protein families database is a comprehensive collection of protein domains and families used for genome annotation and protein structure and function analysis (https://www.ebi.ac.
Nucleic Acids Res
January 2025
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA.
The members of the International Nucleotide Sequence Database Collaboration (INSDC; https://insdc.org) have built systems to collect, archive and disseminate sequence data for more than four decades. The three collaborating organizations, the National Library of Medicine, National Center for Biotechnology Information (NLM-NCBI) in the United States, Research Organization of Information and Systems, National Institute of Genetics (ROIS-NIG) in Japan; and the European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI) formalized their relationship through the adoption of an arrangement which documents their commitment to free and open access to genomic sequences.
Nucleic Acids Res
January 2025
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
The NHGRI-EBI GWAS Catalog serves as a vital resource for the genetic research community, providing access to the most comprehensive database of human GWAS results. Currently, it contains close to 7 000 publications for >15 000 traits, from which more than 625 000 lead associations have been curated. Additionally, 85 000 full genome-wide summary statistics datasets (containing association data for all variants in the analysis) are available for downstream analyses such as meta-analysis, fine-mapping, Mendelian randomisation or development of polygenic risk scores.
Nucleic Acids Res
January 2025
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
The Rfam database, a widely used repository of non-coding RNA families, has undergone significant updates in release 15.0. This paper introduces major improvements, including the expansion of Rfamseq to 26 106 genomes, a 76% increase, incorporating the latest UniProt reference proteomes and additional viral genomes.
J Proteome Res
December 2024
Institute for Systems Biology, Seattle, Washington 98109, United States.
Nucleic Acids Res
January 2025
European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.