Introduction: Scoliosis is a pathological deformation of the spine structure, predominantly classified as "idiopathic" because its etiology is unknown. However, it has been suggested that scoliosis may have a polygenic background. It is therefore crucial to identify genetic backgrounds related to Adolescent Idiopathic Scoliosis (AIS) before scoliosis onset.
Methods: The present study was designed to parse, decompose and predict AIS-related variants in the ClinVar database. Possible AIS-related variant records downloaded from ClinVar were parsed for various labels, decomposed into Dinucleotide Compositional Representation (DCR) and other traits, screened for high-risk genes by statistical analysis, and then used to train deep learning models that predict high-risk AIS genotypes.
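The abstract does not define DCR in detail. As a minimal sketch, assuming DCR amounts to the relative frequencies of the 16 overlapping dinucleotides in a sequence (an assumption; the study's actual encoding may differ), a sequence could be decomposed as follows. The helper name `dinucleotide_composition` is hypothetical.

```python
from itertools import product

# The 16 possible dinucleotides in a fixed order (AA, AC, ..., TT).
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_composition(sequence):
    """Return the relative frequency of each overlapping dinucleotide.

    Assumption: windows containing characters other than A, C, G, T
    (for example N) are skipped rather than counted.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        # No valid dinucleotide found: return an all-zero vector.
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


# Toy input, not a real gene sequence.
vector = dinucleotide_composition("ACGTAC")
```

The fixed ordering of `DINUCLEOTIDES` keeps every sequence mapped to a vector of the same length and layout, which is what a downstream clustering step or classifier would need.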
Results: The framework comprises data parsing, scoliosis genotyping, genome encoding, machine learning (ML)/deep learning (DL) and scoliosis genotype prediction. A total of 58,000 scoliosis-related records were automatically parsed and statistically analyzed for high-risk genes and genotypes, such as , and . All variant genes were decomposed for DCR and other traits. Unsupervised ML indicated marked inter-group separation and intra-group clustering of the DCR of , or for the five types of variants (Pathogenic, Pathogeniclikely, Benign, Benignlikely and Uncertain). An FBN1 DCR-based Convolutional Neural Network (CNN) trained on Pathogenic and Benign/Benignlikely variants performed accurately on validation data and predicted 179 high-risk scoliosis variants. The trained predictor was interpretable: variant types and variant locations showed a similar distribution within 2D structure units in the predicted 3D structure of .
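The split between training labels and variants left for prediction can be sketched as below. This is only an illustration under stated assumptions: the label strings are those written in the abstract, the 0/1 class encoding is hypothetical, and treating every other label as "to be predicted" is an assumption rather than the study's documented procedure.

```python
# Labels as written in the abstract; the 0/1 encoding is an assumption.
TRAINING_CLASS = {"Pathogenic": 1, "Benign": 0, "Benignlikely": 0}


def split_records(records):
    """Split (variant_id, label) pairs into training and prediction sets.

    Records whose label has a training class go to `train` with that class;
    all others are kept aside as candidates for prediction (assumption).
    """
    train, to_predict = [], []
    for variant_id, label in records:
        if label in TRAINING_CLASS:
            train.append((variant_id, TRAINING_CLASS[label]))
        else:
            to_predict.append(variant_id)
    return train, to_predict


# Toy records with made-up identifiers, not real ClinVar entries.
records = [("v1", "Pathogenic"), ("v2", "Benignlikely"),
           ("v3", "Uncertain"), ("v4", "Pathogeniclikely")]
train, to_predict = split_records(records)
```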
Discussion: In summary, scoliosis risk can be predicted by deep learning on DCR features decomposed from genomic sequences. The DCR-based classifier identified additional scoliosis risk variants in the ClinVar database. DCR-based models are promising for genotype-to-phenotype prediction in other disease types.
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11534654 | PMC |
| http://dx.doi.org/10.3389/fgene.2024.1492226 | DOI Listing |