Observations of comorbidity between heart diseases, including cardiac dysfunction (CD), and cognitive impairment, such as Alzheimer's disease and dementia (AD/D), are increasing. This comorbidity might be due to a pleiotropic effect of genetic variants shared between CD and AD/D. Here, we validated comorbidity of CD and AD/D based on diagnostic records from millions of patients in Korea and the University of California, San Francisco Medical Center (odds ratio 11.
The homeodomain is found in hundreds of transcription factors that play roles in fate determination via cell-autonomous regulation of gene expression. However, some homeodomain-containing proteins (HPs) are thought to be secreted and penetrate neighboring cells to affect the recipient cell fate. To determine whether this is a general characteristic of HPs, we carried out a large-scale validation for intercellular transfer of HPs.
Background: Exome sequencing has emerged as a primary method to identify detailed sequence variants associated with complex diseases, including Crohn's disease, in the protein-coding regions of the human genome. However, constructing an interpretable model for exome sequencing data is challenging because of the huge diversity of genomic variation. In addition, it is known that utilizing biologically relevant information in a rigorous manner is essential for effectively extracting disease-associated information.
BMC Bioinformatics
February 2016
Background: Elucidating the cooperative mechanism of interconnected residues is an important component toward understanding the biological function of a protein. Coevolution analysis has been developed to model the coevolutionary information reflecting structural and functional constraints. Recently, several methods have been developed based on a probabilistic graphical model called the Markov random field (MRF), which have led to significant improvements for coevolution analysis; however, thus far, the performance of these models has mainly been assessed by focusing on the aspect of protein structure.
Since the cooperative mechanism between interconnected residues plays a critical role in protein functions, the detection of coevolving residues is important for studying various biological functions of proteins. In this work, we developed a new correlated mutation analysis method that shows substantially better prediction accuracy than all other methods. More importantly, the prediction accuracy of our new method is insensitive to the characteristics of the multiple sequence alignments (MSAs) from which the correlated mutation scores are calculated.
Pac Symp Biocomput
December 2013
The evolution of intrinsically disordered proteins has been studied primarily by focusing on evolutionary changes at an individual position such as substitution and conservation, but the evolutionary association between disordered residues has not been comprehensively investigated. Here, we analyze the distribution of residue-residue coevolution for disordered proteins. We reveal that the degree of coevolved residues significantly decreases in disordered regions regardless of the sequence propensity, and the degree distribution of coevolved and conserved residues exclusively differs in each functional category.
BMC Bioinformatics
April 2010
Background: Although conservation and correlated mutation (CM) both carry important information reflecting different kinds of context in a multiple sequence alignment, most alignment methods use sequence profiles that represent only conservation. There is not yet a general way to represent correlated mutation and incorporate it into sequence alignment.
Methods: We develop a novel method, CM profile, which represents correlated mutation as a spectral feature derived by linear predictive coding, so that correlated mutations among different positions are represented by a fixed number of values.
BMC Bioinformatics
December 2007
Background: For successful protein structure prediction by comparative modeling, in addition to identifying a good template protein with known structure, obtaining an accurate sequence alignment between a query protein and a template protein is critical. It is known that alignment accuracy can vary significantly depending on the choice of alignment parameters such as the gap opening penalty and gap extension penalty. Because the accuracy of a sequence alignment is typically measured by comparing it with its corresponding structure alignment, there is no good way of evaluating alignment accuracy without knowing the structure of the query protein, which is obviously not available at the time of structure prediction.
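To illustrate the two alignment parameters named above, the following minimal sketch shows how an affine gap model might charge a gap using a gap opening penalty and a gap extension penalty. The function names and the default penalty values are illustrative assumptions, not parameters taken from the paper, and tools differ on whether the opening penalty already covers the first gapped position (it does here).

```python
def affine_gap_cost(length, gap_open=10.0, gap_extend=0.5):
    """Cost of one contiguous gap under an affine model (hypothetical defaults).

    The first gapped position costs gap_open; each further position
    costs gap_extend.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0.0
    return gap_open + gap_extend * (length - 1)


def total_gap_cost(aligned_seq, gap_open=10.0, gap_extend=0.5, gap_char="-"):
    """Sum the affine cost of every run of gap characters in one aligned row."""
    total, run = 0.0, 0
    for ch in aligned_seq:
        if ch == gap_char:
            run += 1
        else:
            total += affine_gap_cost(run, gap_open, gap_extend)
            run = 0
    # Account for a gap run that reaches the end of the row.
    total += affine_gap_cost(run, gap_open, gap_extend)
    return total
```

Raising `gap_open` relative to `gap_extend` favors a few long gaps over many short ones, which is one reason the resulting alignment can change noticeably with the parameter choice.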
Motivation: Currently, the most accurate fold-recognition method is to perform profile-profile alignments and estimate the statistical significance of those alignments by calculating a Z-score or E-value. Although this scheme is reliable in recognizing relatively close homologs related at the family level, it has difficulty finding remote homologs that are related at the superfamily or fold level.
Results: In this paper, we present an alternative method to estimate the significance of the alignments.
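For context, the conventional Z-score scheme mentioned in the Motivation can be sketched as follows. This is not the alternative method the paper proposes. It is a minimal illustration that assumes a toy identity score in place of a real profile-profile alignment score and a shuffled-query background; the function names, the number of shuffles, and the seed are all hypothetical.

```python
import random
import statistics


def identity_score(a, b):
    """Toy score: number of matching positions in two equal-length strings."""
    return sum(1 for x, y in zip(a, b) if x == y)


def z_score(query, template, score_fn=identity_score, n_shuffles=200, seed=0):
    """Z-score of the observed score against scores of shuffled queries."""
    rng = random.Random(seed)
    observed = score_fn(query, template)
    background = []
    for _ in range(n_shuffles):
        letters = list(query)
        rng.shuffle(letters)  # destroy residue order, keep composition
        background.append(score_fn("".join(letters), template))
    mean = statistics.mean(background)
    sd = statistics.pstdev(background)
    if sd == 0:
        return 0.0  # degenerate background; no meaningful Z-score
    return (observed - mean) / sd
```

A large positive value means the observed score stands well above what sequences of the same composition achieve by chance; in practice `score_fn` would be replaced by an actual alignment score.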