Publications by Melanie Courtot

Publications by authors named "Melanie Courtot"

Development of the Ontario Hereditary Cancer Research Network, a unified registry as a resource for individuals with inherited cancer syndromes: an observational registry creation protocol.

Kirsten M Farncombe, Lauren K Hughes, Elif Tuzlali, Mohammad Reza Akbari, Irene L Andrulis, Melanie Courtot

BMJ Open

November 2024

Article Synopsis

- In Canada, hereditary cancer care is inconsistent: patients often see multiple doctors and must advocate for their own treatment plans, which underscores the need for a national registry of carriers of cancer-susceptibility genes.
- The Ontario Hereditary Cancer Research Network (OHCRN) is being established as a centralized Ontario registry that will collect clinical, genomic, and self-reported data from participants; it is expected to launch in mid-2025.
- Ethical approval for the registry has been secured, and the data will be shared in anonymized form on the OHCRN website and at conferences, with the aim of informing health policy and supporting research in the field.

The Canadian VirusSeq Data Portal and Duotang: open resources for SARS-CoV-2 viral sequences and genomic epidemiology.

Erin E Gill, Baofeng Jia, Carmen Lia Murall, Raphaël Poujol, Muhammad Zohaib Anwar, Mélanie Courtot

Microb Genom

October 2024

Article Synopsis

- The COVID-19 pandemic spurred global efforts to sequence SARS-CoV-2 genomes to monitor its evolution and guide public health decisions, resulting in millions of genome sequences being shared worldwide.
- The Canadian COVID-19 Genomics Network (CanCOGeN - VirusSeq) launched the Canadian VirusSeq Data Portal to provide open access to genomic sequences and standardized contextual data while adhering to FAIR standards.
- The portal emphasizes data quality, privacy compliance, and security, and is used alongside tools like Viral AI and the CoVaRR-Net to facilitate ongoing research and analysis of SARS-CoV-2 variants in Canada.

The Canadian VirusSeq Data Portal & Duotang: open resources for SARS-CoV-2 viral sequences and genomic epidemiology.

Erin E Gill, Baofeng Jia, Carmen Lia Murall, Raphaël Poujol, Muhammad Zohaib Anwar, Mélanie Courtot

ArXiv

May 2024

The COVID-19 pandemic led to a large global effort to sequence SARS-CoV-2 genomes from patient samples to track viral evolution and inform public health response. Millions of SARS-CoV-2 genome sequences have been deposited in global public repositories. The Canadian COVID-19 Genomics Network (CanCOGeN - VirusSeq), a consortium tasked with coordinating expanded sequencing of SARS-CoV-2 genomes across Canada early in the pandemic, created the Canadian VirusSeq Data Portal, with associated data pipelines and procedures, to support these efforts.

Toward a common standard for data and specimen provenance in life sciences.

Rudolf Wittner, Petr Holub, Cecilia Mascia, Francesca Frexia, Heimo Müller, Mélanie Courtot

Learn Health Syst

January 2024

Article Synopsis

- The importance of openly sharing and reusing specimens and data in life sciences research is highlighted, as it directly affects the quality of findings and knowledge.
- Accurate documentation of pre-analytical conditions, analytical procedures, and data processing is crucial to validate research results, but current information on sample and data provenance is often inadequate.
- The publication discusses a standardization effort aimed at creating reliable machine-actionable documentation for data lineage and specimens, inviting experts from biotechnology and biomedical fields to contribute to this initiative.

Features of a FAIR vocabulary.

Fuqi Xu, Nick Juty, Carole Goble, Simon Jupp, Helen Parkinson, Mélanie Courtot

J Biomed Semantics

June 2023

Background: The Findable, Accessible, Interoperable and Reusable (FAIR) Principles explicitly require the use of FAIR vocabularies, but what precisely constitutes a FAIR vocabulary remains unclear. Being able to define FAIR vocabularies, identify their features, and provide approaches for assessing vocabularies against those features can guide vocabulary development.

Results: We differentiate data, data resources and vocabularies used for FAIR, examine the application of the FAIR Principles to vocabularies, align their requirements with the Open Biomedical Ontologies principles, and propose FAIR Vocabulary Features.

FAIR in action - a flexible framework to guide FAIRification.

Danielle Welter, Nick Juty, Philippe Rocca-Serra, Fuqi Xu, David Henderson, Melanie Courtot

Sci Data

May 2023

The COVID-19 pandemic has highlighted the need for FAIR (Findable, Accessible, Interoperable, and Reusable) data more than any other scientific challenge to date. We developed a flexible, multi-level, domain-agnostic FAIRification framework, providing practical guidance to improve the FAIRness for both existing and future clinical and molecular datasets. We validated the framework in collaboration with several major public-private partnership projects, demonstrating and delivering improvements across all aspects of FAIR and across a variety of datasets and their contexts.

Accelerating cancer omics and precision oncology in health care and research: a Lancet Oncology Commission.

Raffaella Casolino, Amber L Johns, Melanie Courtot, Rita T Lawlor, Francesco De Lorenzo

Lancet Oncol

February 2023

Recommendations for the formatting of Variant Call Format (VCF) files to make plant genotyping data FAIR.

Sebastian Beier, Anne Fiebig, Cyril Pommier, Isuru Liyanage, Matthias Lange, Mélanie Courtot

F1000Res

July 2022

In this opinion article, we discuss the formatting of files from (plant) genotyping studies, in particular the formatting of metadata in Variant Call Format (VCF) files. The flexibility of the VCF format specification facilitates its use as a generic interchange format across domains but can lead to inconsistency between files in the presentation of metadata. To enable fully autonomous machine actionable data flow, generic elements need to be further specified.

The GA4GH Phenopacket schema defines a computable representation of clinical data.

Julius O B Jacobsen, Michael Baudis, Gareth S Baynam, Jacques S Beckmann, Sergi Beltran, Mélanie Courtot

Nat Biotechnol

June 2022

ELIXIR biovalidator for semantic validation of life science metadata.

Isuru Liyanage, Tony Burdett, Bert Droesbeke, Karoly Erdos, Rolando Fernandez, Mélanie Courtot

Bioinformatics

May 2022

Summary: To advance biomedical research, increasingly large amounts of complex data need to be discovered and integrated. This requires syntactic and semantic validation to ensure shared understanding of relevant entities. This article describes the ELIXIR biovalidator, which extends the syntactic validation of the widely used AJV library with ontology-based validation of JSON documents.

GA4GH: International policies and standards for data sharing across genomic research and healthcare.

Heidi L Rehm, Angela J H Page, Lindsay Smith, Jeremy B Adams, Gil Alterovitz, Mélanie Courtot

Cell Genom

November 2021

The Global Alliance for Genomics and Health (GA4GH) aims to accelerate biomedical advances by enabling the responsible sharing of clinical and genomic data through both harmonized data aggregation and federated approaches. The decreasing cost of genomic sequencing (along with other genome-wide molecular assays) and increasing evidence of its clinical utility will soon drive the generation of sequence data from tens of millions of humans, with increasing levels of diversity. In this perspective, we present the GA4GH strategies for addressing the major challenges of this data revolution.

GA4GH Passport standard for digital identity and access permissions.

Craig Voisin, Mikael Linden, Stephanie O M Dyke, Sarion R Bowers, Pinar Alper, Mélanie Courtot

Cell Genom

November 2021

The Global Alliance for Genomics and Health (GA4GH) supports international standards that enable a federated data sharing model for the research community while respecting data security, ethical and regulatory frameworks, and data authorization and access processes for sensitive data. The GA4GH Passport standard (Passport) defines a machine-readable digital identity that conveys roles and data access permissions (called "visas") for individual users. Visas are issued by data stewards, including data access committees (DACs) working with public databases, the entities responsible for the quality, integrity, and access arrangements for the datasets in the management of human biomedical data.

The Data Use Ontology to streamline responsible access to human biomedical datasets.

Jonathan Lawson, Moran N Cabili, Giselle Kerry, Tiffany Boughtwood, Adrian Thorogood, Mélanie Courtot

Cell Genom

November 2021

Human biomedical datasets that are critical for research and clinical studies to benefit human health also often contain sensitive or potentially identifying information of individual participants. Thus, care must be taken when they are processed and made available to comply with ethical and regulatory frameworks and informed consent data conditions. To enable and streamline data access for these biomedical datasets, the Global Alliance for Genomics and Health (GA4GH) Data Use and Researcher Identities (DURI) work stream developed and approved the Data Use Ontology (DUO) standard.

Empirical validation of an automated approach to data use oversight.

Moran N Cabili, Jonathan Lawson, Andrea Saltzman, Greg Rushton, Pearl O'Rourke, Mélanie Courtot

Cell Genom

November 2021

The current paradigm for data use oversight of biomedical datasets is onerous, extending the timescale and resources needed to obtain access for secondary analyses, thus hindering scientific discovery. For a researcher to utilize a controlled-access dataset, a data access committee must review her research plans to determine whether they are consistent with the data use limitations (DULs) specified by the informed consent form. The newly created GA4GH data use ontology (DUO) holds the potential to streamline this process by making data use oversight computable.

BioSamples database: FAIRer samples metadata to accelerate research data management.

Mélanie Courtot, Dipayan Gupta, Isuru Liyanage, Fuqi Xu, Tony Burdett

Nucleic Acids Res

January 2022

The BioSamples database at EMBL-EBI is the central institutional repository for sample metadata storage and connection to EMBL-EBI archives and other resources. The technical improvements to our infrastructure described in our last update have enabled us to scale and accommodate an increasing number of communities, resulting in a higher number of submissions and more heterogeneous data. The BioSamples database now has a valuable set of features and processes to improve data quality in BioSamples, and in particular enriching metadata content and following FAIR principles.

OBO Foundry in 2021: operationalizing open data principles to evaluate ontologies.

Rebecca Jackson, Nicolas Matentzoglu, James A Overton, Randi Vita, James P Balhoff, Melanie Courtot

Database (Oxford)

October 2021

Biological ontologies are used to organize, curate and interpret the vast quantities of data arising from biological experiments. While this works well when using a single ontology, integrating multiple ontologies can be problematic, as they are developed independently, which can lead to incompatibilities. The Open Biological and Biomedical Ontologies (OBO) Foundry was created to address this by facilitating the development, harmonization, application and sharing of ontologies, guided by a set of overarching principles.

The European Nucleotide Archive in 2020.

Peter W Harrison, Alisha Ahamed, Raheela Aslam, Blaise T F Alako, Josephine Burgin, Mélanie Courtot

Nucleic Acids Res

January 2021

The European Nucleotide Archive (ENA; https://www.ebi.ac.

BioSamples database: an updated sample metadata hub.

Mélanie Courtot, Luca Cherubin, Adam Faulconbridge, Daniel Vaughan, Matthew Green

Nucleic Acids Res

January 2019

The BioSamples database at EMBL-EBI provides a central hub for sample metadata storage and linkage to other EMBL-EBI resources. BioSamples has recently undergone major changes, both in terms of data content and supporting infrastructure. The data content has more than doubled from around 2 million samples in 2014 to just over 5 million samples in 2018.

Using OWL reasoning to support the generation of novel gene sets for enrichment analysis.

David J Osumi-Sutherland, Enrico Ponta, Melanie Courtot, Helen Parkinson, Laura Badi

J Biomed Semantics

February 2018

Background: The Gene Ontology (GO) consists of over 40,000 terms for biological processes, cell components and gene product activities linked into a graph structure by over 90,000 relationships. It has been used to annotate the functions and cellular locations of several million gene products. The graph structure is used by a variety of tools to group annotated genes into sets whose products share function or location.

Prospects for Fungal Bioremediation of Acidic Radioactive Waste Sites: Characterization and Genome Sequence of MD1149.

Rok Tkavc, Vera Y Matrosova, Olga E Grichenko, Cene Gostinčar, Robert P Volpe, Melanie Courtot

Front Microbiol

January 2018

Highly concentrated radionuclide waste produced during the Cold War era is stored at US Department of Energy (DOE) production sites. This radioactive waste was often highly acidic and mixed with heavy metals, and has been leaking into the environment since the 1950s. Because of the danger and expense of cleanup of such radioactive sites by physicochemical processes, bioremediation methods are being developed for cleanup of contaminated ground and groundwater.

Identifiers for the 21st century: How to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data.

Julie A McMurry, Nick Juty, Niklas Blomberg, Tony Burdett, Tom Conlin, Mélanie Courtot

PLoS Biol

June 2017

In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration.

Dead simple OWL design patterns.

David Osumi-Sutherland, Melanie Courtot, James P Balhoff, Christopher Mungall

J Biomed Semantics

June 2017

Background: Bio-ontologies typically require multiple axes of classification to support the needs of their users. Development of such ontologies can only be made scalable and sustainable by the use of inference to automate classification via consistent patterns of axiomatization. Many bio-ontologies originating in OBO or OWL follow this approach.

CARD 2017: expansion and model-centric curation of the comprehensive antibiotic resistance database.

Baofeng Jia, Amogelang R Raphenya, Brian Alcock, Nicholas Waglechner, Peiyao Guo, Mélanie Courtot

Nucleic Acids Res

January 2017

The Comprehensive Antibiotic Resistance Database (CARD; http://arpcard.mcmaster.ca) is a manually curated resource containing high quality reference data on the molecular basis of antimicrobial resistance (AMR), with an emphasis on the genes, proteins and mutations involved in AMR.

Ontobee: A linked ontology data server to support ontology term dereferencing, linkage, query and integration.

Edison Ong, Zuoshuang Xiang, Bin Zhao, Yue Liu, Yu Lin, Mélanie Courtot

Nucleic Acids Res

January 2017

Linked Data (LD) aims to achieve interconnected data by representing entities using Uniform Resource Identifiers (URIs), and sharing information using the Resource Description Framework (RDF) and HTTP. Ontologies, which logically represent entities and relations in specific domains, are the basis of LD. Ontobee (http://www.

The Ontology for Biomedical Investigations.

Anita Bandrowski, Ryan Brinkman, Mathias Brochhausen, Matthew H Brush, Bill Bug, Mélanie Courtot

PLoS One

April 2017

The Ontology for Biomedical Investigations (OBI) is an ontology that provides terms with precisely defined meanings to describe all aspects of how investigations in the biological and medical domains are conducted. OBI re-uses ontologies that provide a representation of biomedical knowledge from the Open Biological and Biomedical Ontologies (OBO) project and adds the ability to describe how this knowledge was derived. We here describe the state of OBI and several applications that are using it, such as adding semantic expressivity to existing databases, building data entry forms, and enabling interoperability between knowledge resources.
