Protein subcellular localization is an important topic in proteomics because it is related to a protein's overall function, helps in understanding metabolic pathways, and supports drug design and discovery. In this paper, a basic approximation technique from natural language processing, the linear interpolation smoothing model, is applied to predicting protein subcellular localizations. The proposed approach extracts features from syntactical information in protein sequences to build probabilistic profiles using dependency models; these profiles are combined by linear interpolation to determine how likely a sequence is to belong to a particular subcellular location. The technique builds a statistical model based on maximum likelihood. It deals effectively with the high dimensionality that hinders traditional classifiers such as Support Vector Machines or k-Nearest Neighbours, without sacrificing performance. The approach has been evaluated by predicting subcellular localizations of Gram-positive and Gram-negative bacterial proteins.
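
The idea of scoring a sequence against per-location probabilistic profiles can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes plain unigram and bigram residue counts stand in for the dependency models, and the class name `InterpolatedModel`, the interpolation weights, and the toy training sequences are all hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


class InterpolatedModel:
    """Unigram/bigram profile for one location (hypothetical helper class)."""

    def __init__(self, sequences, lambdas=(0.1, 0.3, 0.6)):
        # Assumed weights for the uniform, unigram, and bigram estimates.
        # They must sum to 1 so the interpolated value is a probability.
        assert abs(sum(lambdas) - 1.0) < 1e-9
        self.lambdas = lambdas
        self.unigrams = Counter()
        self.bigrams = Counter()
        self.contexts = Counter()
        for seq in sequences:
            self.unigrams.update(seq)
            for prev, cur in zip(seq, seq[1:]):
                self.bigrams[prev, cur] += 1
                self.contexts[prev] += 1
        self.total = sum(self.unigrams.values())

    def prob(self, prev, cur):
        """Interpolated estimate of P(cur | prev)."""
        l0, l1, l2 = self.lambdas
        uniform = 1.0 / len(AMINO_ACIDS)
        # Maximum-likelihood estimates; zero when the event was never seen.
        uni = self.unigrams[cur] / self.total if self.total else 0.0
        seen = self.contexts[prev]
        bi = self.bigrams[prev, cur] / seen if seen else 0.0
        return l0 * uniform + l1 * uni + l2 * bi

    def log_likelihood(self, seq):
        """Sum of log interpolated probabilities over adjacent residue pairs."""
        return sum(math.log(self.prob(p, c)) for p, c in zip(seq, seq[1:]))


def predict(models, seq):
    """Return the location whose model gives seq the highest log-likelihood."""
    return max(models, key=lambda loc: models[loc].log_likelihood(seq))


# Toy, made-up training data purely for illustration.
training = {
    "cytoplasm": ["MKKLLAAGKK", "MKKAALLGKK"],
    "membrane": ["MLLVVIIFFW", "MVVLLIIFWW"],
}
models = {loc: InterpolatedModel(seqs) for loc, seqs in training.items()}
prediction = predict(models, "MKKLAGKK")
print(prediction)
```

The uniform term keeps every probability strictly positive, so residue pairs that never occur in training do not drive the log-likelihood to negative infinity. In practice the weights would be tuned on held-out data rather than fixed by hand.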

Source
http://dx.doi.org/10.1016/j.jtbi.2015.08.020
