Publications by authors named "Volkan Atalay"

Idiopathic pulmonary fibrosis is a fatal disease characterized by the TGF-β-dependent activation of lung fibroblasts, leading to excessive deposition of collagen proteins and progressive replacement of healthy lung with scar tissue. We and others have shown that TGF-β-mediated activation of the Mechanistic Target of Rapamycin Complex 1 (mTORC1) and downstream upregulation of Activating Transcription Factor 4 (ATF4) promote metabolic reprogramming in lung fibroblasts characterized by upregulation of the de novo synthesis of glycine, the most abundant amino acid found in collagen protein. Whether mTOR and ATF4 regulate other metabolic pathways in lung fibroblasts has not been explored.

Arginine is a conditionally essential amino acid with known roles in protein production, nitric oxide synthesis, biosynthesis of proline and polyamines, and regulation of intracellular signaling pathways. Arginine biosynthesis and catabolism have been linked to TGF-β-induced activation of fibroblasts in the context of pulmonary fibrosis; however, a thorough study on the metabolic and signaling roles of arginine in the process of fibroblast activation has not been conducted. Here, we used metabolic dropouts and labeling strategies to determine how activated fibroblasts utilize arginine.

Idiopathic pulmonary fibrosis is a fatal disease characterized by the TGF-β-dependent activation of lung fibroblasts, leading to excessive deposition of collagen proteins and progressive replacement of healthy lung with scar tissue. We and others have shown that fibroblast activation is supported by metabolic reprogramming, including the upregulation of the synthesis of glycine, the most abundant amino acid found in collagen protein. How fibroblast metabolic reprogramming is regulated downstream of TGF-β is incompletely understood.

Motivation: Utilizing AI-driven approaches for drug-target interaction (DTI) prediction requires large volumes of training data, which are not available for the majority of target proteins. In this study, we investigate the use of deep transfer learning for the prediction of interactions between drug candidate compounds and understudied target proteins with scarce training data. The idea here is to first train a deep neural network classifier with a generalized source training dataset of large size and then to reuse this pre-trained neural network as an initial configuration for re-training/fine-tuning purposes with a small-sized specialized target training dataset.
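
The pre-train-then-fine-tune idea described in this abstract can be illustrated with a minimal sketch. It is not the study's implementation: a three-weight logistic classifier stands in for the deep neural network, both datasets are synthetic, and every name here (`train`, `source_data`, `target_data`, and so on) is hypothetical. The only point it shows is that the weights learned on the larger source task become the starting configuration for training on the small target task.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def log_loss(w, data):
    """Mean logistic loss of weights `w` on (features, label) pairs."""
    total = 0.0
    for x, y in data:
        p = sigmoid(score(w, x))
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)

def train(data, init_weights, epochs, lr):
    """Full-batch gradient descent, starting from `init_weights`."""
    w = list(init_weights)  # copy: the caller's weights stay untouched
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = y - sigmoid(score(w, x))
            for i, xi in enumerate(x):
                grad[i] += err * xi
        w = [wi + lr * gi / len(data) for wi, gi in zip(w, grad)]
    return w

def accuracy(w, data):
    hits = sum((score(w, x) > 0) == (y == 1) for x, y in data)
    return hits / len(data)

# Larger "source" task (synthetic): label is 1 when a + b > 0.
# Each feature vector is (bias, a, b).
grid = (-2.0, -1.0, 1.0, 2.0)
source_data = [((1.0, a, b), 1 if a + b > 0 else 0)
               for a in grid for b in grid if a + b != 0]

# Small "target" task (synthetic): related, but the boundary is shifted.
target_data = [((1.0, 2.0, 1.0), 1), ((1.0, 1.0, 1.0), 1),
               ((1.0, 1.0, 0.0), 0), ((1.0, -1.0, 1.0), 0)]

# Step 1: pre-train from scratch on the source task.
pretrained = train(source_data, [0.0, 0.0, 0.0], epochs=200, lr=0.1)

# Step 2: fine-tune on the target task, initialised from the pre-trained weights.
fine_tuned = train(target_data, pretrained, epochs=20, lr=0.1)

print("target loss before fine-tuning:", log_loss(pretrained, target_data))
print("target loss after fine-tuning: ", log_loss(fine_tuned, target_data))
```

In a deep network the same pattern usually means copying the pre-trained layer weights and then updating all or only some of the layers on the target data; which layers to update is a design choice this toy example does not capture.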

As the number of protein sequences increases in biological databases, computational methods are required to provide accurate functional annotation with high coverage. Although several machine learning methods have been proposed for this purpose, there are still two main issues: (i) construction of reliable positive and negative training and validation datasets, and (ii) fair evaluation of their performances based on predefined experimental settings. To address these issues, we have developed ProFAB: Open Protein Functional Annotation Benchmark, which is a platform providing an infrastructure for a fair comparison of protein function prediction methods.

Summary: Accurate prediction of the subcellular locations (SLs) of proteins is a critical topic in protein science. In this study, we present SLPred, an ensemble-based multi-view and multi-label protein subcellular localization prediction tool. For a query protein sequence, SLPred provides predictions for nine main SLs using independent machine-learning models trained for each location.

Purpose: Computational approaches have been used at different stages of drug development with the purpose of decreasing the time and cost of conventional experimental procedures. Lately, techniques mainly developed and applied in the field of artificial intelligence (AI) have been transferred to different application domains such as biomedicine.

Methods: In this study, we conducted an investigative analysis via data-driven evaluation of potential hepatocellular carcinoma (HCC) therapeutics in the context of AI-assisted drug discovery/repurposing.

Systemic analysis of available large-scale biological/biomedical data is critical for studying biological mechanisms, and developing novel and effective treatment approaches against diseases. However, different layers of the available data are produced using different technologies and scattered across individual computational resources without any explicit connections to each other, which hinders extensive and integrative multi-omics-based analysis. We aimed to address this issue by developing a new data integration/representation methodology and its application by constructing a biological data resource.

The identification of physical interactions between drug candidate compounds and target biomolecules is an important process in drug discovery. Since conventional screening procedures are expensive and time consuming, computational approaches are employed to provide aid by automatically predicting novel drug-target interactions (DTIs). In this study, we propose a large-scale DTI prediction system, DEEPScreen, for early stage drug discovery, using deep convolutional neural networks.

Summary: iBioProVis is an interactive tool for visual analysis of the compound bioactivity space in the context of target proteins, drugs and drug candidate compounds. The iBioProVis tool takes target protein identifiers and, optionally, compound SMILES as input, and uses the state-of-the-art non-linear dimensionality reduction method t-Distributed Stochastic Neighbor Embedding (t-SNE) to plot the distribution of compounds embedded in a 2D map, based on the similarity of structural properties of compounds and in the context of compounds' cognate targets. Similar compounds, which are embedded to proximate points on the 2D map, may bind the same or similar target proteins.

Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function.

Results: Here, we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of volume of data analyzed and the types of analysis performed. In a novel and major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes.

Automated protein function prediction is critical for the annotation of uncharacterized protein sequences, where accurate prediction methods are still required. Recently, deep learning based methods have outperformed conventional algorithms in computer vision and natural language processing due to the prevention of overfitting and efficient training. Here, we propose DEEPred, a hierarchical stack of multi-task feed-forward deep neural networks, as a solution to Gene Ontology (GO) based protein function prediction.

Background: The automated prediction of the enzymatic functions of uncharacterized proteins is a crucial topic in bioinformatics. Although several methods and tools have been proposed to classify enzymes, most of these studies are limited to specific functional classes and levels of the Enzyme Commission (EC) number hierarchy. Besides, most of the previous methods incorporated only a single input feature type, which limits the applicability to the wide functional space.

The identification of interactions between drugs/compounds and their targets is crucial for the development of new drugs. In vitro screening experiments (i.e.

Visualizing large-scale data produced by high-throughput experiments as a biological graph leads to better understanding and analysis. This study describes a customized force-directed layout algorithm, EClerize, for biological graphs that represent pathways in which the nodes are associated with Enzyme Commission (EC) attributes. The nodes with the same EC class numbers are treated as members of the same cluster.

Transcriptome experiments are performed to assess protein abundance through mRNA expression analysis. Expression levels of genes vary depending on the experimental conditions and the cell response. Transcriptome data must be diverse and yet comparable in reference to stably expressed genes, even if they are generated from different experiments on the same biological context from various laboratories.

Determination of cell signalling behaviour is crucial for understanding the physiological response to a specific stimulus or drug treatment. Current approaches for large-scale data analysis do not effectively incorporate critical topological information provided by the signalling network. We herein describe a novel model- and data-driven hybrid approach, or signal transduction score flow algorithm, which allows quantitative visualization of cyclic cell signalling pathways that lead to ultimate cell responses such as survival, migration or death.

Functional protein annotation is an important matter for in vivo and in silico biology. Several computational methods have been proposed that make use of a wide range of features such as motifs, domains, homology, structure and physicochemical properties. There is no single method that performs best in all functional classification problems because information obtained using any of these features depends on the function to be assigned to the protein.

Due to the increase in gene expression data sets in recent years, various data mining techniques have been proposed for mining gene expression profiles. However, most of these methods target single gene expression data sets and cannot handle all the available gene expression data in public databases in a reasonable amount of time and space. In this paper, we propose a novel framework, bi-k-bi clustering, for finding association rules of gene pairs that can easily operate on large-scale and multiple heterogeneous data sets.

Automated classification of proteins is indispensable for further in vivo investigation of the excessive number of unknown sequences generated by large-scale molecular biology techniques. This study describes a discriminative system based on feature space mapping, called subsequence profile map (SPMap), for functional classification of protein sequences. SPMap takes into account the information coming from the subsequences of a protein.

Motivation: We designed a general computational kernel for classification problems that require specific motif extraction and search from sequences. Instead of searching for explicit motifs, our approach finds the distribution of implicit motifs and uses it as a feature for classification. The implicit motif distribution approach may be used as a modus operandi for bioinformatics problems that require specific motif extraction and search, which is otherwise computationally prohibitive.
