Bioinformatics workflows are increasingly used to share analyses, serving as a cornerstone for reproducibility in bioinformatics. In particular, Nextflow is a widely used workflow system that supports the creation of large workflows while offering substantial flexibility. A growing number of Nextflow workflows are shared on repositories such as GitHub.
Motivation: The reproducibility crisis has highlighted the importance of improving the way bioinformatics data analyses are implemented, executed, and shared. To address this, various tools such as content versioning systems, workflow management systems, and software environment management systems have been developed. While these tools are becoming more widely used, there is still much work to be done to increase their adoption.
Data analysis pipelines are now established as an effective means for specifying and executing bioinformatics data analysis and experiments. While scripting languages, particularly Python, R and notebooks, are popular and sufficient for developing small-scale pipelines that are often intended for a single user, it is now widely recognized that they are by no means enough to support the development of large-scale, shareable, maintainable and reusable pipelines capable of handling large volumes of data and running on high performance computing clusters. This review outlines the key requirements for building large-scale data pipelines and provides a mapping of existing solutions that fulfill them.
Objectives: To visualize the evolution of all registered COVID-19 vaccine trials.
Study Design And Setting: As part of the living mapping of the COVID-NMA initiative, we identify all COVID-19 vaccine trials every 2 weeks and automatically extract data from the EU Clinical Trials Register, ClinicalTrials.gov, IRCT, and the World Health Organization International Clinical Trials Registry Platform.
Objectives: Researchers worldwide are actively engaging in research activities to search for preventive and therapeutic interventions against coronavirus disease 2019 (COVID-19). Our aim was to describe the planning of randomized controlled trials (RCTs) in terms of timing relative to the course of the COVID-19 epidemic and the research questions evaluated.
Study Design And Setting: We performed a living mapping of RCTs registered in the WHO International Clinical Trials Registry Platform.
Phylogeny.fr, created in 2008, was designed to facilitate the execution of phylogenetic workflows and is still widely used today. Since its development, however, user needs have evolved, new tools and workflows have been published, and the number of submitted jobs has increased dramatically, promoting new practices and motivating its refactoring.
Analysing models of biological networks typically relies on workflows in which different software tools with sensitive parameters are chained together, often with additional manual steps. The accessibility and reproducibility of such workflows are challenging, as publications often overlook analysis details, and because some of these tools may be difficult to install and/or have a steep learning curve. The CoLoMoTo Interactive Notebook provides a unified environment to edit, execute, share, and reproduce analyses of qualitative models of biological networks.
Background: Scientific workflow management systems are increasingly used to specify and manage bioinformatics experiments. Their programming model appeals to bioinformaticians, who can use them to easily specify complex data processing pipelines. Such a model is underpinned by a graph structure, in which nodes represent bioinformatics tasks and links represent the dataflow.
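The graph model described above can be sketched minimally in Python. This is an illustrative example only, not the API of any particular workflow system; all names (Task, execute, trim, align) are hypothetical:

```python
# Minimal sketch of a workflow as a directed graph:
# nodes are bioinformatics tasks, edges carry the dataflow.
# All names here are illustrative, not from any real system.

class Task:
    def __init__(self, name, run):
        self.name = name
        self.run = run          # callable: upstream outputs -> output
        self.upstream = []      # tasks whose outputs feed this one

def execute(task, cache=None):
    """Run a task after its upstream dependencies (post-order walk),
    memoizing results so shared dependencies run once."""
    cache = {} if cache is None else cache
    if task.name not in cache:
        inputs = [execute(t, cache) for t in task.upstream]
        cache[task.name] = task.run(*inputs)
    return cache[task.name]

# Example two-step pipeline: trim reads, then align them.
trim = Task("trim", lambda: "trimmed-reads")
align = Task("align", lambda reads: f"aligned({reads})")
align.upstream.append(trim)

print(execute(align))  # aligned(trimmed-reads)
```

Real systems add scheduling, caching to disk, and cluster execution on top of essentially this dependency-graph traversal.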
Motivation: High-throughput technologies provide fundamental information on thousands of genes. Many research laboratories use one or more of these technologies daily and end up with lists of genes. Assessing the originality of the results obtained requires being aware of the number of publications available concerning individual or multiple genes and accessing information about these publications.
Unlabelled: Biologists are frequently faced with the problem of integrating information from multiple heterogeneous sources with their own experimental data. Given the large number of public sources, it is difficult to choose which sources to integrate without assistance. When doing this manually, biologists differ in their preferences concerning the sources to be queried as well as the strategies used.
J Bioinform Comput Biol, October 2006
Fueled by novel technologies capable of producing massive amounts of data for a single experiment, scientists are faced with an explosion of information which must be rapidly analyzed and combined with other data to form hypotheses and create knowledge. Today, numerous biological questions can be answered without entering a wet lab. Scientific protocols designed to answer these questions can be run entirely on a computer.
Pac Symp Biocomput, January 2007
As the number, richness, and diversity of biological sources grow, scientists are increasingly confronted with the problem of selecting appropriate sources and tools. To address this problem, we have designed BioGuide, a user-centric framework that helps scientists choose sources and tools according to their preferences and strategies, by specifying queries through a user-friendly visual interface. In this paper, we provide a complete RDF representation of BioGuide and introduce XPR (eXtensible Path language for RDF), an extension of FSL that is expressive enough to model all BioGuide queries.
The present review attempts to cover the most recent initiatives directed towards representing, storing, displaying, and processing protein-related data suited to "comparative proteomics" studies. Data interpretation is brought into focus: efforts invested in analysing and interpreting experimental data increasingly express the need for adding meaning.
Motivation: Biologists are now faced with the problem of integrating information from multiple heterogeneous public sources with their own experimental data contained in individual sources. The selection of the sources to be considered is thus critically important.
Results: Our aim is to support biologists with a module, based on an algorithm, that presents a selection of sources relevant to their query and matched to their own preferences.