Motivation: Convolutional neural networks (CNNs) have outperformed conventional methods in modeling the sequence specificity of DNA-protein binding. Yet inappropriate CNN architectures can yield poorer performance than simpler models. Thus, an in-depth understanding of how to match CNN architecture to a given task is needed to fully harness the power of CNNs for computational biology applications.
Results: We present a systematic exploration of CNN architectures for predicting DNA sequence binding using a large compendium of transcription factor datasets. We identify the best-performing architectures by varying CNN width, depth and pooling designs. We find that adding convolutional kernels to a network is important for motif-based tasks. We show the benefits of CNNs in learning rich higher-order sequence features, such as secondary motifs and local sequence context, by comparing network performance on multiple modeling tasks ranging in difficulty. We also demonstrate how careful construction of sequence benchmark datasets, using approaches that control potentially confounding effects like positional or motif strength bias, is critical in making fair comparisons between competing methods. We explore how to establish the sufficiency of training data for these learning tasks, and we have created a flexible cloud-based framework that permits the rapid exploration of alternative neural network architectures for problems in computational biology.
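To make the terms width, kernels and pooling concrete, the following is a minimal NumPy sketch of a single convolutional layer scanning a one-hot encoded DNA sequence, followed by global max pooling. It is an illustration only, not the authors' models or framework: the function names, the example motif `TGACA` and the example sequences are all hypothetical, and the kernel is set by hand rather than learned.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) array; bases outside ACGT stay all-zero."""
    x = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            x[i, j] = 1.0
    return x

def conv_scan(x, kernels):
    """Slide K kernels of width W along the sequence, then apply ReLU.

    x: (L, 4) one-hot sequence; kernels: (K, W, 4).
    Returns activations of shape (L - W + 1, K).
    K is the layer "width" in the sense of number of convolutional kernels.
    """
    num_kernels, width, _ = kernels.shape
    length = x.shape[0]
    windows = np.stack([x[i:i + width] for i in range(length - width + 1)])
    act = np.einsum("pwc,kwc->pk", windows, kernels)
    return np.maximum(act, 0.0)

def global_max_pool(act):
    """Keep the strongest activation of each kernel across all positions."""
    return act.max(axis=0)

# Hypothetical hand-set kernel that responds maximally to the motif TGACA.
# In a trained CNN these weights would be learned from data.
motif = "TGACA"
kernels = one_hot(motif)[None, :, :]          # shape (1, 5, 4)

with_motif = global_max_pool(conv_scan(one_hot("AAATGACATT"), kernels))
without_motif = global_max_pool(conv_scan(one_hot("AAAAAAAAAA"), kernels))
```

Under these assumptions, a sequence containing the motif produces a higher pooled score than one without it. Adding kernels corresponds to increasing the first dimension of `kernels`, and stacking further `conv_scan`-style layers on the activations would correspond to increasing depth.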
Availability and Implementation: All the models analyzed are available at http://cnn.csail.mit.edu
Contact: gifford@mit.edu
Supplementary Information: Supplementary data are available at Bioinformatics online.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4908339 | PMC |
| http://dx.doi.org/10.1093/bioinformatics/btw255 | DOI Listing |