The availability of electronic health records creates fertile ground for developing computational models of various medical conditions. We present a new approach for detecting and analyzing patients with unexpected responses to treatment, building on machine learning and statistical methodology. Given a specific patient, we compute a statistical score for the deviation of the patient's response from responses observed in other patients having similar characteristics and medication regimens.
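As an illustration only, a deviation score of this kind could be sketched as a z-score of the patient's response against the responses of the most similar patients. This is not the authors' method: the abstract does not specify the similarity measure or the statistic, so the Euclidean nearest-neighbour selection, the function name `deviation_score`, and the parameter `k` are all assumptions.

```python
import math
import statistics


def deviation_score(patient_features, patient_response, cohort, k=5):
    """Z-score of a response relative to the k most similar cohort patients.

    cohort is a list of (features, response) pairs. Similarity is plain
    Euclidean distance on the feature vectors (an assumption for this sketch).
    """
    if k < 2 or len(cohort) < k:
        raise ValueError("need at least k >= 2 comparable patients")

    # Rank cohort members by distance to the patient of interest.
    ranked = sorted(cohort, key=lambda rec: math.dist(patient_features, rec[0]))
    neighbour_responses = [response for _, response in ranked[:k]]

    mean = statistics.fmean(neighbour_responses)
    sd = statistics.stdev(neighbour_responses)
    if sd == 0:
        # All neighbours responded identically; any difference is unbounded.
        if patient_response == mean:
            return 0.0
        return math.copysign(math.inf, patient_response - mean)
    return (patient_response - mean) / sd
```

A large absolute score flags a response that is unusual among comparable patients; in practice the features would need scaling and the threshold would need statistical justification.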
AMIA Jt Summits Transl Sci Proc
August 2015
The availability of electronic health records creates fertile ground for developing computational models for various medical conditions. Using machine learning, we can detect patients with unexpected responses to treatment and provide statistical testing and visualization tools to help further analysis. The new system was developed to help researchers uncover new features associated with reduced response to treatment, and to aid physicians in identifying patients that are not responding to treatment as expected and hence deserve more attention.
Issues of publication bias, lack of replicability, and false discovery have long plagued the genetics community. Proper utilization of public and shared data resources presents an opportunity to ameliorate these problems. We present an approach to public database management that we term Quality Preserving Database (QPD).
Stud Health Technol Inform
December 2011
The new generation of health information standards, where the syntax and semantics of the content are explicitly formalized, allows for interoperability in healthcare scenarios and analysis in clinical research settings. Studies involving clinical and genomic data accumulate knowledge as relationships between genotypic and phenotypic information as well as associations within the genomic and clinical worlds. Some involve analysis results targeted at a specific disease; others are of a predictive nature specific to a patient and may be used by decision support applications.
IEEE/ACM Trans Comput Biol Bioinform
January 2012
The common scenario in computational biology in which a community of researchers conducts multiple statistical tests on one shared database gives rise to the multiple hypothesis testing problem. Conventional procedures for solving this problem control the probability of false discovery by sacrificing some of the power of the tests. We suggest a scheme for controlling false discovery without any power loss by adding new samples for each use of the database and charging the user for the expense.
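To make the power trade-off concrete, the sketch below uses the Bonferroni correction as one example of a conventional procedure. It is a stand-in chosen for illustration, not the scheme proposed in the paper, and the function names are hypothetical. Each test is held to a threshold of alpha divided by the number of tests, so a p-value that would be significant on its own can fail once the database has been used many times.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def rejected(p_values, alpha=0.05):
    """Return, for each p-value, whether its null hypothesis is rejected."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]
```

For example, with three tests at alpha = 0.05 the per-test threshold is about 0.0167, so a p-value of 0.04 is no longer significant even though it would pass as a single test.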
Background: Analysis of the viral genome for drug resistance mutations is state-of-the-art for guiding treatment selection for human immunodeficiency virus type 1 (HIV-1)-infected patients. These mutations alter the structure of viral target proteins and reduce or in the worst case completely inhibit the effect of antiretroviral compounds while maintaining the ability for effective replication. Modern anti-HIV-1 regimens comprise multiple drugs in order to prevent or at least delay the development of resistance mutations.
Motivation: Optimizing HIV therapies is crucial since the virus rapidly develops mutations to evade drug pressure. Recent studies have shown that genotypic information might not be sufficient for the design of therapies and that other clinical and demographic factors may play a role in therapy failure. This study is designed to assess the improvement in prediction achieved when such information is taken into account.
Proteins fold into a well-defined structure as a result of the collapse of the polypeptide chain, while transient protein-complex formation is mainly a result of the binding of two folded individual monomers. Therefore, a protein-protein interface does not resemble the core of monomeric proteins, but has a more polar nature. Here, we address the question of whether the physico-chemical characteristics of intraprotein versus interprotein bonds differ, or whether interfaces are different from folded monomers only in the preference for certain types of interactions.
The development of bioinformatic tools by individual labs results in an abundance of parallel programs for the same task. For example, identification of binding site regions between interacting proteins is done using ProMate, WHISCY, PPI-Pred, PINUP and others. All servers first identify unique properties of binding sites and then incorporate them into a predictor.
The formation of specific protein interactions plays a crucial role in most, if not all, biological processes, including signal transduction, cell regulation, the immune response and others. Recent advances in our understanding of the molecular architecture of protein-protein binding sites, which facilitates such diversity in binding affinity and specificity, are enabling us to address key questions. What is the amino acid composition of binding sites? What are interface hotspots? How are binding sites organized? What are the differences between tight and weak interacting complexes? How does water contribute to binding? Can the knowledge gained be translated into protein design? And does a universal code for binding exist, or is it the architecture and chemistry of the interface that enable diverse but specific binding solutions?
Is the whole protein surface available for interaction with other proteins, or are specific sites pre-assigned according to their biophysical and structural character? And if so, is it possible to predict the location of the binding site from the surface properties? These questions are answered quantitatively by probing the surfaces of proteins using spheres of radius 10 Å on a database (DB) of 57 unique, non-homologous proteins involved in heteromeric, transient protein-protein interactions for which the structures of both the unbound and bound states were determined. In structural terms, we found the binding site to have a preference for beta-sheets and for relatively long non-structured chains, but not for alpha-helices. Chemically, aromatic side-chains show a clear preference for binding sites.
Docking algorithms produce many possible structures of a protein-protein complex. In most cases some of them resemble the correct structure within an r.m.