Publications by Artem Lysenko | LitMetric

Publications by authors named "Artem Lysenko"

Enhanced analysis of tabular data through Multi-representation DeepInsight.

Alok Sharma, Yosvany López, Shangru Jia, Artem Lysenko, Keith A Boroevich

Sci Rep

June 2024

Tabular data analysis is a critical task in various domains, enabling us to uncover valuable insights from structured datasets. While traditional machine learning methods can be used for feature engineering and dimensionality reduction, they often struggle to capture the intricate relationships and dependencies within real-world datasets. In this paper, we present Multi-representation DeepInsight (MRep-DeepInsight), a novel extension of the DeepInsight method designed to enhance the analysis of tabular data.

Advances in AI and machine learning for predictive medicine.

Alok Sharma, Artem Lysenko, Shangru Jia, Keith A Boroevich, Tatsuhiko Tsunoda

J Hum Genet

October 2024

The field of omics, driven by advances in high-throughput sequencing, faces a data explosion. This abundance of data offers unprecedented opportunities for predictive modeling in precision medicine, but also presents formidable challenges in data analysis and interpretation. Traditional machine learning (ML) techniques have been partly successful in generating predictive models for omics analysis but exhibit limitations in handling potential relationships within the data for more accurate prediction.

scDeepInsight: a supervised cell-type identification method for scRNA-seq data with deep learning.

Shangru Jia, Artem Lysenko, Keith A Boroevich, Alok Sharma, Tatsuhiko Tsunoda

Brief Bioinform

September 2023

Annotation of cell-types is a critical step in the analysis of single-cell RNA sequencing (scRNA-seq) data that allows the study of heterogeneity across multiple cell populations. Currently, this is most commonly done using unsupervised clustering algorithms, which project single-cell expression data into a lower dimensional space and then cluster cells based on their distances from each other. However, as these methods do not use reference datasets, they can only achieve a rough classification of cell-types, and it is difficult to improve the recognition accuracy further.

DeepInsight-3D architecture for anti-cancer drug response prediction with deep-learning on multi-omics.

Alok Sharma, Artem Lysenko, Keith A Boroevich, Tatsuhiko Tsunoda

Sci Rep

February 2023

Modern oncology offers a wide range of treatments, so choosing the best option for a particular patient is very important for an optimal outcome. Multi-omics profiling in combination with AI-based predictive models has great potential for streamlining these treatment decisions. However, these encouraging developments continue to be hampered by the very high dimensionality of the datasets in combination with insufficiently large numbers of annotated samples.

Immune subtypes and neoantigen-related immune evasion in advanced colorectal cancer.

Toshitaka Sugawara, Fuyuki Miya, Toshiaki Ishikawa, Artem Lysenko, Jo Nishino

iScience

February 2022

Article Synopsis

* A specific cancer subtype with immune evasion is linked to poor survival rates due to a lack of highly expressed neoantigens and high chromosomal instability, which contribute to immune resistance.
* The study suggests that analyzing the tumor microenvironment and neoantigen makeup could serve as valuable prognostic tools for treatment decisions in advanced colorectal cancer.

DeepFeature: feature selection in nonimage data using convolutional neural network.

Alok Sharma, Artem Lysenko, Keith A Boroevich, Edwin Vans, Tatsuhiko Tsunoda

Brief Bioinform

November 2021

Artificial intelligence methods offer exciting new capabilities for the discovery of biological mechanisms from raw data because they are able to detect vastly more complex patterns of association that cannot be captured by classical statistical tests. Among these methods, deep neural networks are currently among the most advanced approaches and, in particular, convolutional neural networks (CNNs) have been shown to perform excellently for a variety of difficult tasks. Despite that, the application of this type of network to high-dimensional omics data and, most importantly, the meaningful interpretation of the results returned from such models in a biomedical context remain an open problem.

Cerebrospinal fluid proteome shows disrupted neuronal development in multiple sclerosis.

Ellen F Mosleth, Christian Alexander Vedeler, Kristian Hovde Liland, Anette McLeod, Gerd Haga Bringeland, Artem Lysenko

Sci Rep

February 2021

Despite intensive research, the aetiology of multiple sclerosis (MS) remains unknown. Cerebrospinal fluid proteomics has the potential to reveal mechanisms of MS pathogenesis, but analyses must account for disease heterogeneity. We previously reported explorative multivariate analysis by hierarchical clustering of proteomics data of MS patients and controls, which resulted in two groups of individuals.

PHI-Nets: A Network Resource for Ascomycete Fungal Pathogens to Annotate and Identify Putative Virulence Interacting Proteins and siRNA Targets.

Elzbieta I Janowska-Sejda, Artem Lysenko, Martin Urban, Chris Rawlings, Sophia Tsoka

Front Microbiol

December 2019

Article Synopsis

* Researchers created interactomes for 15 Ascomycete fungal species to analyze functional gene patterns linked to their disease-causing abilities.
* A second analysis explored interactions of small silencing plant RNAs with their targets, suggesting potential virulence genes, and all 15 network datasets are available for public access at www.phi-base.org.

HseSUMO: Sumoylation site prediction using half-sphere exposures of amino acids residues.

Alok Sharma, Artem Lysenko, Yosvany López, Abdollah Dehzangi, Ronesh Sharma

BMC Genomics

April 2019

Background: Post-translational modifications are viewed as an important mechanism for controlling protein function and are believed to be involved in multiple important diseases. However, their profiling using laboratory-based techniques remains challenging. Therefore, the development of accurate computational methods to predict post-translational modifications is particularly important for making progress in this area of research.

An integrative machine learning approach for prediction of toxicity-related drug safety.

Artem Lysenko, Alok Sharma, Keith A Boroevich, Tatsuhiko Tsunoda

Life Sci Alliance

December 2018

Recent trends in drug development have been marked by diminishing returns caused by the escalating costs and falling rates of new drug approval. Unacceptable drug toxicity is a substantial cause of drug failure during clinical trials and the leading cause of drug withdrawals after release to the market. Computational methods capable of predicting these failures can reduce the waste of resources and time devoted to the investigation of compounds that ultimately fail.

Navigating the disease landscape: knowledge representations for contextualizing molecular signatures.

Mansoor Saqi, Artem Lysenko, Yi-Ke Guo, Tatsuhiko Tsunoda, Charles Auffray

Brief Bioinform

March 2019

Large amounts of data emerging from experiments in molecular medicine are leading to the identification of molecular signatures associated with disease subtypes. The contextualization of these patterns is important for obtaining mechanistic insight into the aberrant processes associated with a disease, and this typically involves the integration of multiple heterogeneous types of data. In this review, we discuss knowledge representations that can be useful to explore the biological context of molecular signatures, in particular three main approaches, namely, pathway mapping approaches, molecular network centric approaches and approaches that represent biological statements as knowledge graphs.

Arete - candidate gene prioritization using biological network topology with additional evidence types.

Artem Lysenko, Keith Anthony Boroevich, Tatsuhiko Tsunoda

BioData Min

July 2017

Background: Refinement of candidate gene lists to select the most promising candidates for further experimental verification remains an essential step between high-throughput exploratory analysis and the discovery of specific causal genes. Given the qualitative and semantic complexity of biological data, successfully addressing this challenge requires development of flexible and interoperable solutions for making the best possible use of the largest possible fraction of all available data.

Results: We have developed an easily accessible framework that links two established network-based gene prioritization approaches with a supporting isolation forest-based integrative ranking method.

Developing integrated crop knowledge networks to advance candidate gene discovery.

Keywan Hassani-Pak, Martin Castellote, Maria Esch, Matthew Hindle, Artem Lysenko

Appl Transl Genom

December 2016

The chances of raising crop productivity to enhance global food security would be greatly improved if we had a complete understanding of all the biological mechanisms that underpin traits such as crop yield, disease resistance or nutrient and water use efficiency. With more crop genomes emerging all the time, we are nearer to having the basic information, at the gene level, to begin assembling crop gene catalogues and using data from other plant species to understand how the genes function and how their interactions govern crop development and physiology. Unfortunately, the task of creating such a complete knowledge base of gene functions, interaction networks and trait biology is technically challenging because the relevant data are dispersed in myriad databases in a variety of data formats with variable quality and coverage.

Recon2Neo4j: applying graph database technologies for managing comprehensive genome-scale networks.

Irina Balaur, Alexander Mazein, Mansoor Saqi, Artem Lysenko, Christopher J Rawlings

Bioinformatics

April 2017

Summary: The goal of this work is to offer a computational framework for exploring data from the Recon2 human metabolic reconstruction model. Advanced user access features have been developed using the Neo4j graph database technology and this paper describes key features such as efficient management of the network data, examples of the network querying for addressing particular tasks, and how query results are converted back to the Systems Biology Markup Language (SBML) standard format. The Neo4j-based metabolic framework facilitates exploration of highly connected and comprehensive human metabolic data and identification of metabolic subnetworks of interest.

EpiGeNet: A Graph Database of Interdependencies Between Genetic and Epigenetic Events in Colorectal Cancer.

Irina Balaur, Mansoor Saqi, Ana Barat, Artem Lysenko, Alexander Mazein

J Comput Biol

October 2017

The development of colorectal cancer (CRC), the third most common cancer type, has been associated with deregulations of cellular mechanisms stimulated by both genetic and epigenetic events. StatEpigen is a manually curated and annotated database containing information on interdependencies between genetic and epigenetic signals, and is currently specialized for CRC research. Although StatEpigen provides a well-developed graphical user interface for information retrieval, advanced queries involving associations between multiple concepts can benefit from a more detailed graph representation of the integrated data.

Representing and querying disease networks using graph databases.

Artem Lysenko, Irina A Roznovăţ, Mansoor Saqi, Alexander Mazein, Christopher J Rawlings

BioData Min

July 2016

Background: Systems biology experiments generate large volumes of data of multiple modalities and this information presents a challenge for integration due to a mix of complexity together with rich semantics. Here, we describe how graph databases provide a powerful framework for storage, querying and envisioning of biological data.

Results: We show how graph databases are well suited for the representation of biological information, which is typically highly connected, semi-structured and unpredictable.

Transcriptome and metabolite profiling of the infection cycle of Zymoseptoria tritici on wheat reveals a biphasic interaction with plant immunity involving differential pathogen chromosomal contributions and a variation on the hemibiotrophic lifestyle definition.

Jason J Rudd, Kostya Kanyuka, Keywan Hassani-Pak, Mark Derbyshire, Ambrose Andongabo, Artem Lysenko

Plant Physiol

March 2015

The hemibiotrophic fungus Zymoseptoria tritici causes Septoria tritici blotch disease of wheat (Triticum aestivum). Pathogen reproduction on wheat occurs without cell penetration, suggesting that dynamic and intimate intercellular communication occurs between fungus and plant throughout the disease cycle. We used deep RNA sequencing and metabolomics to investigate the physiology of plant and pathogen throughout an asexual reproductive cycle of Z.

A novel approach to identify genes that determine grain protein deviation in cereals.

Ellen F Mosleth, Yongfang Wan, Artem Lysenko, Gemma A Chope, Simon P Penson

Plant Biotechnol J

June 2015

Grain yield and protein content were determined for six wheat cultivars grown over 3 years at multiple sites and at multiple nitrogen (N) fertilizer inputs. Although grain protein content was negatively correlated with yield, some grain samples had higher protein contents than expected based on their yields, a trait referred to as grain protein deviation (GPD). We used novel statistical approaches to identify gene transcripts significantly related to GPD across environments.

Discovering study-specific gene regulatory networks.

Valeria Bo, Tanya Curtis, Artem Lysenko, Mansoor Saqi, Stephen Swift

PLoS One

March 2016

Microarrays are commonly used in biology because of their ability to simultaneously measure thousands of genes under different conditions. Because the resulting datasets typically contain a large number of variables but far fewer samples, scalable network analysis techniques are often employed. In particular, consensus approaches have recently been used that combine multiple microarray studies in order to find networks that are more robust.

Genetical and comparative genomics of Brassica under altered Ca supply identifies Arabidopsis Ca-transporter orthologs.

Neil S Graham, John P Hammond, Artem Lysenko, Sean Mayes, Seosamh O Lochlainn

Plant Cell

July 2014

Although Ca transport in plants is highly complex, the overexpression of vacuolar Ca²⁺ transporters in crops is a promising new technology to improve dietary Ca supplies through biofortification. Here, we sought to identify novel targets for increasing plant Ca accumulation using genetical and comparative genomics. Expression quantitative trait loci (eQTLs) mapping to 1895 cis- and 8015 trans-loci were identified in shoots of an inbred mapping population of Brassica rapa (IMB211 × R500); 23 cis- and 948 trans-eQTLs responded specifically to altered Ca supply.

Interactive exploration of integrated biological datasets using context-sensitive workflows.

Fabian Horn, Martin Rittweger, Jan Taubert, Artem Lysenko, Christopher Rawlings

Front Genet

March 2014

Network inference uses experimental high-throughput data to reconstruct molecular interaction networks in which new relationships between the network entities can be predicted. Despite the increasing amount of experimental data, the parameters of each modeling technique cannot be optimized based on the experimental data alone, but need to be qualitatively assessed to determine whether the components of the resulting network describe the experimental setting. Candidate list prioritization and validation build upon data integration and data visualization.

Network-based data integration for selecting candidate virulence associated proteins in the cereal infecting fungus Fusarium graminearum.

Artem Lysenko, Martin Urban, Laura Bennett, Sophia Tsoka, Elzbieta Janowska-Sejda

PLoS One

February 2014

The identification of virulence genes in plant pathogenic fungi is important for understanding the infection process and host range, and for developing control strategies. The analysis of already verified virulence genes in phytopathogenic fungi in the context of integrated functional networks can give clues about the underlying mechanisms and pathways directly or indirectly linked to fungal pathogenicity and can suggest new candidates for further experimental investigation, using a 'guilt by association' approach. Here we study 133 genes in the globally important Ascomycete fungus Fusarium graminearum that have been experimentally tested for their involvement in virulence.

AIGO: towards a unified framework for the analysis and the inter-comparison of GO functional annotations.

Michael Defoin-Platel, Matthew M Hindle, Artem Lysenko, Stephen J Powers, Dimah Z Habash

BMC Bioinformatics

November 2011

Background: In response to the rapid growth of available genome sequences, efforts have been made to develop automatic inference methods to functionally characterize them. Pipelines that infer functional annotation are now routinely used to produce new annotations at a genome scale and for a broad variety of species. These pipelines differ widely in their inference algorithms, confidence thresholds and data sources for reasoning.

Assessing the functional coherence of modules found in multiple-evidence networks from Arabidopsis.

Artem Lysenko, Michael Defoin-Platel, Keywan Hassani-Pak, Jan Taubert, Charlie Hodgman

BMC Bioinformatics

May 2011

Background: Combining multiple evidence-types from different information sources has the potential to reveal new relationships in biological systems. The integrated information can be represented as a relationship network, and clustering the network can suggest possible functional modules. The value of such modules for gaining insight into the underlying biological processes depends on their functional coherence.

Data integration for plant genomics--exemplars from the integration of Arabidopsis thaliana databases.

Artem Lysenko, Matthew Morritt Hindle, Jan Taubert, Mansoor Saqi, Christopher John Rawlings

Brief Bioinform

November 2009

The development of a systems-based approach to problems in plant sciences requires integration of existing information resources. However, the available information is currently often incomplete and dispersed across many sources, and the syntactic and semantic heterogeneity of the data is a challenge for integration. In this article, we discuss strategies for data integration and we use a graph-based integration method (Ondex) to illustrate some of these challenges with reference to two example problems concerning integration of (i) metabolic pathway and (ii) protein interaction data for Arabidopsis thaliana.
