Collagen is one of the most important structural proteins in biology, and its structural hierarchy plays a crucial role in many mechanically important biomaterials. Here, we demonstrate how transformer models can be used to predict, directly from the primary amino acid sequence, the thermal stability of collagen triple helices, measured via the melting temperature Tm. We report two distinct transformer architectures and compare their performance. First, we train a small transformer model from scratch, using our collagen data set, which contains only 633 sequence-to-Tm pairings. Second, we take a large pretrained transformer model, ProtBERT, and fine-tune it on the same sequence-to-Tm pairings, using a deep convolutional network to translate the natural language processing BERT embeddings into the required features. The small transformer model and the fine-tuned ProtBERT model achieve similar R² values on the test data (R² = 0.84 vs 0.79, respectively), but ProtBERT is a much larger pretrained model that may not always be applicable to other biological or biomaterials questions. Specifically, we show that the small transformer model requires only 0.026% of the parameters of the much larger model yet reaches almost the same accuracy on the test set. We also evaluate both models on a validation set of 71 newly published sequences for which Tm has been measured and find reasonable agreement, with ProtBERT outperforming the small transformer model. The results presented here are, to the best of our knowledge, the first demonstration of the use of transformer models for relatively small data sets and for the prediction of specific biophysical properties of interest. We anticipate that this work will serve as a starting point for applying transformer models to other biophysical problems.
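As a rough illustration of the two quantities compared above, the following Python sketch computes a coefficient of determination (R²) between measured and predicted melting temperatures, and expresses one model's parameter count as a percentage of another's. The function names and all numeric values are hypothetical placeholders chosen for the example; they are not data, parameter counts, or code from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equally sized sequences with at least 2 values")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the measurements around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are identical")
    return 1.0 - ss_res / ss_tot


def parameter_percentage(small_params, large_params):
    """Size of the smaller model as a percentage of the larger one."""
    return 100.0 * small_params / large_params


# Hypothetical melting temperatures in degrees Celsius (illustration only).
measured_tm = [30.0, 35.0, 40.0, 45.0]
predicted_tm = [31.0, 34.0, 41.0, 44.0]

if __name__ == "__main__":
    print(f"R^2 = {r_squared(measured_tm, predicted_tm):.3f}")
    # Hypothetical parameter counts, chosen only to show the arithmetic.
    print(f"{parameter_percentage(26, 100_000):.3f}% of the larger model")
```

An R² close to 1 means the predictions explain most of the variance in the measured Tm values, which is the sense in which the two models are compared on the test set.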
DOI: http://dx.doi.org/10.1021/acsbiomaterials.2c00737