Cambridge CB10 1SD UK ; Wellcome Trust ... Publications | LitMetric

1,484 results match your criteria: "Cambridge CB10 1SD UK ; Wellcome Trust Sanger Institute[Affiliation]"

Guidelines for Gene and Genome Assembly Nomenclature.

Genetics

January 2025

EMBL-EBI - Non-Vertebrate Genomics Team, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK.

Ethalinda K S Cannon David C Molik Adam J Wright Huiting Zhang Loren Honaas

The rapid increase in the number of reference-quality genome assemblies presents significant new opportunities for genomic research. However, the absence of standardized naming conventions for genome assemblies and annotations across datasets creates substantial challenges. Inconsistent naming hinders the identification of correct assemblies, complicates the integration of bioinformatics pipelines, and makes it difficult to link assemblies across multiple resources.

Data reuse in agricultural genomics research: challenges and recommendations.

Gigascience

January 2025

Department of Animal Science, Iowa State University, Ames, IA, 50011, US.

Alenka Hafner Victoria DeLeo Cecilia H Deng Christine G Elsik Damarius S Fleming

The scientific community has long benefited from the opportunities provided by data reuse. Recognizing the need to identify the challenges and bottlenecks to reuse in the agricultural research community and propose solutions for them, the data reuse working group was started within the AgBioData consortium framework. Here, we identify the limitations of data standards, metadata deficiencies, data interoperability, data ownership, data availability, user skill level, resource availability, and equity issues, with a specific focus on agricultural genomics research.

R2DT: A COMPREHENSIVE PLATFORM FOR VISUALISING RNA SECONDARY STRUCTURE.

bioRxiv

September 2024

Riboscope Ltd, 23 King Street, Cambridge, CB1 1AH, UK.

Holly McCann Caeden D Meade Loren Dean Williams Anton S Petrov Philip Z Johnson

RNA secondary (2D) structure visualisation is an essential tool for understanding RNA function. R2DT is a software package designed to visualise RNA 2D structures in consistent, recognisable, and reproducible layouts. The latest release, R2DT 2.

Lineage-specific patterns in the Moraceae family allow identification of convergent P450 enzymes involved in furanocoumarin biosynthesis.

New Phytol

January 2025

Université de Lorraine, INRAE, LAE, 54000, Nancy, France.

Alexandre Bouillé Romain Larbat Rashmi Kumari Alexandre Olry Clément Charles

Specialized metabolites are molecules involved in plants' interaction with their environment. Elucidating their biosynthetic pathways is a challenging but rewarding task, leading to societal applications and ecological insights. Furanocoumarins emerged multiple times in Angiosperms, raising the question of how different enzymes evolved into catalyzing identical reactions.

Galaxy as a gateway to bioinformatics: Multi-Interface Galaxy Hands-on Training Suite (MIGHTS) for scRNA-seq.

Gigascience

January 2025

School of Life, Health & Chemical Sciences, The Open University, Milton Keynes, Buckinghamshire, MK7 6AA, UK.

Camila L Goclowski Julia Jakiela Tyler Collins Saskia Hiltemann Morgan Howells

Background: Bioinformatics is fundamental to biomedical sciences, but its mastery presents a steep learning curve for bench biologists and clinicians. Learning to code while analyzing data is difficult. The curve may be flattened by separating these two aspects and providing intermediate steps for budding bioinformaticians.

Integrated View of Baseline Protein Expression in Human Tissues Using Public Data Independent Acquisition Data Sets.

J Proteome Res

January 2025

European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, U.K.

Ananth Prakash Andrew Collins Liora Vilmovsky Silvie Fexova Andrew R Jones

The PRIDE database is the largest public data repository of mass spectrometry-based proteomics data and currently stores more than 40,000 data sets covering a wide range of organisms, experimental techniques, and biological conditions. During the past few years, PRIDE has seen a significant increase in the amount of submitted data-independent acquisition (DIA) proteomics data sets. This provides an excellent opportunity for large-scale data reanalysis and reuse.

A general strategy for generating expert-guided, simplified views of ontologies.

bioRxiv

December 2024

European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Anita R Caron Aleix Puig-Barbe Ellen M Quardokus James P Balhoff Jasmine Belfiore

Article Synopsis

- Well-structured ontologies and ontology-aware tools help make data and analyses FAIR (Findable, Accessible, Interoperable, Reusable), supporting effective lexical searches and biologically meaningful annotation grouping.
- Researchers face challenges in adopting ontologies, primarily due to their complexity and the tendency to create simplified hierarchies that may misuse relationship types, leading to ineffective organization.
- A suite of validation tools is introduced to help users align their hierarchies with established ontology structures, providing graphical reports and tailored views for various atlases like the HuBMAP Human Reference Atlas and the Human Developmental Cell Atlas.

Alleviating batch effects in cell type deconvolution with SCCAF-D.

Nat Commun

December 2024

GMU-GIBH Joint School of Life Sciences, The Guangdong-Hong Kong-Macao Joint Laboratory for Cell Fate Regulation and Diseases, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou, China.

Shuo Feng Liangfeng Huang Anna Vathrakokoili Pournara Ziliang Huang Xinlu Yang

Cell type deconvolution methods can impute cell proportions from bulk transcriptomics data, revealing changes in disease progression or organ development. But benchmarking studies often use simulated bulk data from the same source as the reference, which limits its application scenarios. This study examines batch effects in deconvolution and introduces SCCAF-D, a computational workflow that ensures a Pearson Correlation Coefficient above 0.

The ISCB competency framework v. 3: a revised and extended standard for bioinformatics education and training.

Bioinform Adv

November 2024

School of Computer Science and Engineering, UNSW Sydney, Sydney, NSW 2052, Australia.

Cath Brooksbank Michelle D Brazas Nicola Mulder Russell Schwartz Verena Ras

Motivation: Developing competency in the broad area of bioinformatics is challenging globally, owing to the breadth of the field and the diversity of its audiences for education and training. Course design can be facilitated by the use of a competency framework: a set of competency requirements that define the knowledge, skills and attitudes needed by individuals in (or aspiring to be in) a particular profession or role. These competency requirements can help to define curricula as they can inform both the content and level to which competency needs to be developed.

Ensembl 2025.

Nucleic Acids Res

January 2025

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Sarah C Dyer Olanrewaju Austine-Orimoloye Andrey G Azov Matthieu Barba If Barnes

Ensembl (www.ensembl.org) is an open platform integrating publicly available genomics data across the tree of life with a focus on eukaryotic species related to human health, agriculture and biodiversity.

New developments for the Quest for Orthologs benchmark service.

NAR Genom Bioinform

December 2024

Department of Biochemistry and Biophysics, Stockholm University, Science for Life Laboratory, Box 1031, SE-17121 Solna, Sweden.

Adrian Altenhoff Yannis Nevers Vinh Tran Dushyanth Jyothi Maria Martin

The Quest for Orthologs (QfO) orthology benchmark service (https://orthology.benchmarkservice.org) hosts a wide range of standardized benchmarks for orthology inference evaluation.

DOME Registry: implementing community-wide recommendations for reporting supervised machine learning in biology.

Gigascience

January 2024

Department of Biomedical Sciences, University of Padova, Padova 35131, Italy.

Omar Abdelghani Attafi Damiano Clementel Konstantinos Kyritsis Emidio Capriotti Gavin Farrell

Supervised machine learning (ML) is used extensively in biology and deserves closer scrutiny. The Data Optimization Model Evaluation (DOME) recommendations aim to enhance the validation and reproducibility of ML research by establishing standards for key aspects such as data handling and processing, optimization, evaluation, and model interpretability. The recommendations help to ensure that key details are reported transparently by providing a structured set of questions.

Autoencoder-based phenotyping of ophthalmic images highlights genetic loci influencing retinal morphology and provides informative biomarkers.

Bioinformatics

December 2024

European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, United Kingdom.

Panagiotis I Sergouniotis Adam Diakite Kumar Gaurav Ewan Birney

Motivation: Genome-wide association studies (GWAS) have been remarkably successful in identifying associations between genetic variants and imaging-derived phenotypes. To date, the main focus of these analyses has been on established, clinically-used imaging features. We sought to investigate if deep learning approaches can detect more nuanced patterns of image variability.

PHI-base - the multi-species pathogen-host interaction database in 2025.

Nucleic Acids Res

January 2025

Protecting Crops and the Environment, Rothamsted Research, Harpenden AL5 2JQ, UK.

Martin Urban Alayne Cuzick James Seager Nagashree Nonavinakere Jahobanta Sahoo

Article Synopsis

- The Pathogen-Host Interactions Database (PHI-base) has been curating genes related to various pathogens since 2005, focusing on their roles in pathogenicity and interactions with different hosts, including humans and plants.
- The latest update, version 4.17, shows significant growth with a 19% increase in genes and a 23% increase in interactions since the previous version.
- The upcoming version 5.0 introduces a new curation workflow, unifies existing data, and enhances data-sharing capabilities, making it a more comprehensive resource for researchers, available at specific online portals.

CATH v4.4: major expansion of CATH by experimental and predicted structural data.

Nucleic Acids Res

January 2025

Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK.

Vaishali P Waman Nicola Bordin Andy Lau Shaun Kandathil Jude Wells

CATH (https://www.cathdb.info) is a structural classification database that assigns domains to the structures in the Protein Data Bank (PDB) and AlphaFold Protein Structure Database (AFDB) and adds layers of biological information, including homology and functional annotation.

InterPro: the protein sequence classification resource in 2025.

Nucleic Acids Res

January 2025

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK.

Matthias Blum Antonina Andreeva Laise Cavalcanti Florentino Sara Rocio Chuguransky Tiago Grego

InterPro (https://www.ebi.ac.

GENCODE 2025: reference gene annotation for human and mouse.

Nucleic Acids Res

January 2025

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Jonathan M Mudge Sílvia Carbonell-Sala Mark Diekhans Jose Gonzalez Martinez Toby Hunt

GENCODE produces comprehensive reference gene annotation for human and mouse. Entering its twentieth year, the project remains highly active as new technologies and methodologies allow us to catalog the genome at ever-increasing granularity. In particular, long-read transcriptome sequencing enables us to identify large numbers of missing transcripts and to substantially improve existing models, and our long non-coding RNA catalogs have undergone a dramatic expansion and reconfiguration as a result.

Complex portal 2025: predicted human complexes and enhanced visualisation tools for the comparison of orthologous and paralogous complexes.

Nucleic Acids Res

January 2025

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Sucharitha Balu Susie Huget Juan Jose Medina Reyes Eliot Ragueneau Kalpana Panneerselvam

The Complex Portal (www.ebi.ac.

GENCODE: massively expanding the lncRNA catalog through capture long-read RNA sequencing.

bioRxiv

October 2024

Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Catalonia, Spain.

Gazaldeep Kaur Tamara Perteghella Sílvia Carbonell-Sala Jose Gonzalez-Martinez Toby Hunt

Article Synopsis

- Accurate gene annotations are essential for interpreting how genomes function, and the GENCODE consortium has spent twenty years creating reference annotations for human and mouse genomes, serving as a vital resource for researchers globally.
- Previous annotations of long non-coding RNAs (lncRNAs) were incomplete and poorly organized, hindering research, prompting GENCODE to launch a comprehensive effort that resulted in adding nearly 18,000 novel human genes and over 22,000 novel mouse genes, significantly increasing the catalog of transcripts.
- The new annotations not only show evolutionary patterns and link to genetic variants associated with traits but also improve understanding of previously unclear genomic functions, greatly advancing research into both human and mouse genetic diseases.

The Pfam protein families database: embracing AI/ML.

Nucleic Acids Res

January 2025

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK.

Typhaine Paysan-Lafosse Antonina Andreeva Matthias Blum Sara Rocio Chuguransky Tiago Grego

The Pfam protein families database is a comprehensive collection of protein domains and families used for genome annotation and protein structure and function analysis (https://www.ebi.ac.

The international nucleotide sequence database collaboration (INSDC): enhancing global participation.

Nucleic Acids Res

January 2025

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA.

Ilene Karsch-Mizrachi Masanori Arita Tony Burdett Guy Cochrane Yasukazu Nakamura

The members of the International Nucleotide Sequence Database Collaboration (INSDC; https://insdc.org) have built systems to collect, archive and disseminate sequence data for more than four decades. The three collaborating organizations, the National Library of Medicine, National Center for Biotechnology Information (NLM-NCBI) in the United States; the Research Organization of Information and Systems, National Institute of Genetics (ROIS-NIG) in Japan; and the European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), formalized their relationship through the adoption of an arrangement which documents their commitment to free and open access to genomic sequences.

The NHGRI-EBI GWAS Catalog: standards for reusability, sustainability and diversity.

Nucleic Acids Res

January 2025

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Maria Cerezo Elliot Sollis Yue Ji Elizabeth Lewis Ala Abid

The NHGRI-EBI GWAS Catalog serves as a vital resource for the genetic research community, providing access to the most comprehensive database of human GWAS results. Currently, it contains close to 7 000 publications for >15 000 traits, from which more than 625 000 lead associations have been curated. Additionally, 85 000 full genome-wide summary statistics datasets, containing association data for all variants in the analysis, are available for downstream analyses such as meta-analysis, fine-mapping, Mendelian randomisation or development of polygenic risk scores.

Rfam 15: RNA families database in 2025.

Nucleic Acids Res

January 2025

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.

Nancy Ontiveros-Palacios Emma Cooke Eric P Nawrocki Sandra Triebel Manja Marz

The Rfam database, a widely used repository of non-coding RNA families, has undergone significant updates in release 15.0. This paper introduces major improvements, including the expansion of Rfamseq to 26 106 genomes, a 76% increase, incorporating the latest UniProt reference proteomes and additional viral genomes.

The 2024 Report on the Human Proteome from the HUPO Human Proteome Project.

J Proteome Res

December 2024

Institute for Systems Biology, Seattle, Washington 98109, United States.

Gilbert S Omenn Sandra Orchard Lydie Lane Cecilia Lindskog Charles Pineau

Article Synopsis

- The Human Proteome Project (HPP) aims to identify every protein-coding gene’s isoform and integrate proteomics into studies of human health and disease.
- Major updates include the retirement of neXtProt as the knowledge base, with UniProtKB now serving as the reference proteome, and GENCODE providing the target protein list.
- Recent data shows that 93% of protein-coding genes have been expressed, leaving 1,273 non-expressed proteins, along with the introduction of a new scoring system for functional annotation of proteins.

The PRIDE database at 20 years: 2025 update.

Nucleic Acids Res

January 2025

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.

Yasset Perez-Riverol Chakradhar Bandla Deepti J Kundu Selvakumar Kamatchinathan Jingwen Bai

Article Synopsis

- The PRIDE database is a premier repository for mass spectrometry-based proteomics data and plays a key role in the ProteomeXchange consortium, facilitating research sharing.
- Over the past three years, PRIDE has made significant advancements, including a new file transfer protocol and an automatic dataset validation process, resulting in approximately 534 datasets submitted monthly.
- Recent innovations include the introduction of a PRIDE chatbot for user support and enhanced efforts to integrate high-quality data with resources like UniProt and Ensembl.
