Complex Intell Systems
January 2022
Graph-based algorithms are known to be effective approaches to semi-supervised learning. However, there has been relatively little work on extending these algorithms to the multi-label classification case. We derive an extension of the Manifold Regularization algorithm to multi-label classification, which is significantly simpler than the general Vector Manifold Regularization approach.
Accurate identification of strand residues aids prediction and analysis of numerous structural and functional aspects of proteins. We propose a sequence-based predictor, BETArPRED, which improves prediction of strand residues and β-strand segments. BETArPRED uses a novel design that accepts strand residues predicted by SSpro and predicts the remaining positions utilizing a logistic regression classifier with nine custom-designed features.
Background: Current protocols yield crystals for <30% of known proteins, indicating that automatically identifying crystallizable proteins may improve high-throughput structural genomics efforts. We introduce CRYSTALP2, a kernel-based method that predicts the propensity of a given protein sequence to produce diffraction-quality crystals. This method utilizes the composition and collocation of amino acids, isoelectric point, and hydrophobicity, as estimated from the primary sequence, to generate predictions.
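The composition and collocation inputs mentioned above can be illustrated with a minimal sketch. The abstract does not give CRYSTALP2's exact feature definitions, so the function names, the `gap` parameter, and the normalization below are assumptions chosen for illustration: composition is taken as the fraction of each of the 20 standard amino acids, and collocation as the frequency of ordered residue pairs separated by a fixed number of positions.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the fraction of each standard amino acid in seq (hypothetical helper)."""
    counts = Counter(seq)
    n = max(len(seq), 1)  # avoid division by zero for an empty sequence
    return {aa: counts.get(aa, 0) / n for aa in AMINO_ACIDS}


def collocation(seq, gap=0):
    """Return the frequency of ordered residue pairs with `gap` residues between them.

    gap=0 counts adjacent pairs (dipeptides). This definition is an
    assumption for illustration, not the published CRYSTALP2 feature set.
    """
    step = gap + 1
    n_pairs = max(len(seq) - step, 0)
    pairs = Counter(seq[i] + seq[i + step] for i in range(n_pairs))
    total = max(n_pairs, 1)
    return {a + b: pairs.get(a + b, 0) / total
            for a, b in product(AMINO_ACIDS, repeat=2)}
```

For a sequence such as `"ACAC"`, `composition` assigns 0.5 each to `A` and `C`, and `collocation` with `gap=0` sees the three adjacent pairs `AC`, `CA`, `AC`. A full predictor would combine vectors like these with the isoelectric point and hydrophobicity estimates before passing them to the classifier.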
The exact mechanisms of prion misfolding and factors that predispose an individual to prion diseases are largely unknown. Our approach to identifying candidate factors in-silico relies on contrasting the C-terminal domain of PrP(C) sequences from two groups of vertebrate species: those that have been found to suffer from prion diseases, and those that have not. We propose that any significant differences between the two groups are candidate factors that may predispose individuals to develop prion disease, which should be further analyzed by wet-lab investigations.
Biochem Biophys Res Commun
September 2006
Structural class characterizes the overall folding type of a protein or its domain. A number of computational methods have been proposed to predict structural class based on primary sequences; however, the accuracy of these methods is strongly affected by sequence homology. This paper proposes an ensemble classification method and a compact feature-based sequence representation.
Protein structural class describes the overall folding type of a protein or its domain. A number of methods have been developed to predict protein structural class based on the primary sequence. The homology of the predicted sequences with respect to the training sequences is a key factor in prediction performance.
IEEE Trans Syst Man Cybern B Cybern
February 2006
Business intelligence and bioinformatics applications increasingly require the mining of datasets consisting of millions of data points, or crafting real-time enterprise-level decision support systems for large corporations and drug companies. In all cases, there needs to be an underlying data mining system, and this mining system must be highly scalable. To this end, we describe a new rule learner called DataSqueezer.
Helicobacter pylori and Campylobacter jejuni have been shown to modify their flagellins with pseudaminic acid (Pse), via O-linkage, while C. jejuni also possesses a general protein glycosylation pathway (Pgl) responsible for the N-linked modification of at least 30 proteins with a heptasaccharide containing 2,4-diacetamido-2,4,6-trideoxy-alpha-D-glucopyranose, a derivative of bacillosamine. To further define the Pse and bacillosamine biosynthetic pathways, we have undertaken functional characterization of UDP-alpha-D-GlcNAc modifying dehydratase/aminotransferase pairs, in particular the H.
Campylobacter jejuni and Campylobacter coli are the main causes of bacterial diarrhea worldwide, and Helicobacter pylori is known to cause duodenal ulcers. In all of these pathogenic organisms, the flagellin proteins are heavily glycosylated with a 2-keto-3-deoxy acid, pseudaminic acid (5,7-diacetamido-3,5,7,9-tetradeoxy-L-glycero-L-manno-nonulosonic acid). The presence of pseudaminic acid is required for the proper development of the flagella and is thereby necessary for motility in, and invasion of, the host.
Recombinant lysine:N(6)-hydroxylase, rIucD, catalyzes the hydroxylation of L-lysine to its N(6)-hydroxy derivative, with NADPH and FAD serving as cofactors in the reaction. The five cysteine residues present in rIucD can be replaced, individually or in combination, with alanine without effecting a major change in the thermal stability, the affinity for L-lysine and FAD, or the k(cat) for mono-oxygenase activity of the protein. However, when the susceptibility to modification by either 5,5'-dithiobis(2-nitrobenzoic acid) (DTNB) or 2,6-dichlorophenol indophenol (DPIP) serves as the criterion for monitoring conformational change(s) in rIucD and its muteins, Cys146-->Ala and Cys166-->Ala substitutions are found to induce an enhancement in the reactivity of one of the protein's remaining cysteine residues with concomitant diminution of mono-oxygenase function.