Publications by authors named "Colwell L"

The Pfam protein families database is a comprehensive collection of protein domains and families used for genome annotation and protein structure and function analysis (https://www.ebi.ac.

Motivation: In this article, we propose a method for finding similarities between Pfam families based on the pre-trained neural network ProtENN2. We use ProtENN2's per-residue embeddings to produce new high-dimensional per-family embeddings, develop an approach for calculating inter-family similarity scores from these embeddings, and evaluate its predictions using structure comparison.

Results: We apply our method to Pfam annotation by refining clan membership for Pfam families, suggesting both new members of existing clans and potential new clans for future Pfam releases.
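
As a rough illustration of the general idea only, per-residue embeddings can be pooled into one vector per family, and families can then be compared with a similarity score. The sketch below assumes mean pooling and cosine similarity, neither of which is confirmed by the abstract. The function names and the random toy embeddings are hypothetical and are not produced by ProtENN2.

```python
import numpy as np

def family_embedding(per_residue_embeddings):
    """Pool a list of (length, dim) arrays, one per sequence, into one (dim,) vector."""
    # Average residues within each sequence first, so long sequences do not dominate.
    per_sequence = [seq.mean(axis=0) for seq in per_residue_embeddings]
    return np.mean(per_sequence, axis=0)

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy data (assumption): random vectors stand in for real per-residue embeddings.
rng = np.random.default_rng(0)
dim = 16
shared_centre = rng.normal(size=dim)   # families A and B share this centre
other_centre = rng.normal(size=dim)    # family C is unrelated

family_a = [shared_centre + 0.1 * rng.normal(size=(n, dim)) for n in (30, 45)]
family_b = [shared_centre + 0.1 * rng.normal(size=(n, dim)) for n in (25, 60)]
family_c = [other_centre + 0.1 * rng.normal(size=(n, dim)) for n in (40, 35)]

emb_a = family_embedding(family_a)
emb_b = family_embedding(family_b)
emb_c = family_embedding(family_c)

sim_ab = cosine_similarity(emb_a, emb_b)
sim_ac = cosine_similarity(emb_a, emb_c)
print(f"A vs B: {sim_ab:.3f}   A vs C: {sim_ac:.3f}")
```

In this toy setup, families built around the same centre score higher than unrelated ones. Whether a real score separates clan members from non-members would have to be checked against structure comparison, as the abstract describes.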

Several peptide dual agonists of the human glucagon receptor (GCGR) and the glucagon-like peptide-1 receptor (GLP-1R) are in development for the treatment of type 2 diabetes, obesity and their associated complications. Candidates must have high potency at both receptors, but it is unclear whether the limited experimental data available can be used to train models that accurately predict the activity at both receptors of new peptide variants. Here we use peptide sequence data labelled with in vitro potency at human GCGR and GLP-1R to train several models, including a deep multi-task neural-network model using multiple loss optimization.
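
To make the multi-task idea concrete, the sketch below trains one shared layer with two output heads, one per receptor, by minimizing a weighted sum of the two per-task losses. It is a minimal sketch under stated assumptions: the linear architecture, the equal task weights, the mean-squared-error losses, and the random toy data are all hypothetical and are not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 64, 8, 4                          # samples, input features, shared units

# Toy data (assumption): random features stand in for encoded peptide sequences.
X = rng.normal(size=(n, d))
y_gcgr = X @ (0.5 * rng.normal(size=d))     # toy potency labels, task 1
y_glp1r = X @ (0.5 * rng.normal(size=d))    # toy potency labels, task 2

W = 0.1 * rng.normal(size=(d, h))           # shared layer used by both tasks
v_gcgr = 0.1 * rng.normal(size=h)           # task-specific output head 1
v_glp1r = 0.1 * rng.normal(size=h)          # task-specific output head 2
task_weights = (0.5, 0.5)                   # hypothetical loss weights

def combined_loss(W, v1, v2):
    """Weighted sum of the per-task mean squared errors."""
    hidden = X @ W
    loss1 = np.mean((hidden @ v1 - y_gcgr) ** 2)
    loss2 = np.mean((hidden @ v2 - y_glp1r) ** 2)
    return task_weights[0] * loss1 + task_weights[1] * loss2

losses = [combined_loss(W, v_gcgr, v_glp1r)]
learning_rate = 0.02
for _ in range(1000):
    hidden = X @ W
    # Gradient of each weighted task loss with respect to that task's predictions.
    r1 = task_weights[0] * 2.0 * (hidden @ v_gcgr - y_gcgr) / n
    r2 = task_weights[1] * 2.0 * (hidden @ v_glp1r - y_glp1r) / n
    grad_v1 = hidden.T @ r1
    grad_v2 = hidden.T @ r2
    # The shared layer receives gradient contributions from both tasks.
    grad_W = X.T @ (np.outer(r1, v_gcgr) + np.outer(r2, v_glp1r))
    W -= learning_rate * grad_W
    v_gcgr -= learning_rate * grad_v1
    v_glp1r -= learning_rate * grad_v2
    losses.append(combined_loss(W, v_gcgr, v_glp1r))

print(f"combined loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The point of sharing `W` is that data from one receptor can inform the representation used for the other, which is one reason multi-task training is attractive when labelled data are limited.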

AlphaFold2 has revolutionized structural biology by accurately predicting single structures of proteins. However, a protein's biological function often depends on multiple conformational substates, and disease-causing point mutations often cause population changes within these substates.

This article draws attention to increasing genericism in nurse education in the United Kingdom, under which mental health nursing students receive less specialist mental health education, and argues against this direction. In 2018, the Nursing and Midwifery Council produced the 'Future Nurse' standards, which directed changes to pre-registration nurse education. This led to dissatisfaction among many mental health nurses, specifically regarding reduced mental health content for students studying mental health nursing.

Paracentral acute middle maculopathy (PAMM) can result from decreased perfusion to retinal and choroidal circulations meeting in the inner nuclear layer (INL) and outer plexiform layer (OPL). Although ischemia in these layers is rare, certain procedures, such as cardiopulmonary bypass, coronary angiography, and endovascular coil embolization, place patients at risk. We report the case of a 70-year-old woman with a sudden-onset left paracentral scotoma following internal carotid endovascular repair.

Article Synopsis
  • This study evaluates the performance of ChatGPT-4 in the Ophthalmology Knowledge Assessment Program (OKAP) exam compared to ChatGPT-3.5.
  • Both models answered 180 practice questions, with ChatGPT-4 scoring significantly better (81% vs. 57%).
  • The results indicate that ChatGPT-4 has potential for use in ophthalmology education and clinical support, prompting further research on AI integration in medical training.

Predicting the function of a protein from its amino acid sequence is a long-standing challenge in bioinformatics. Traditional approaches use sequence alignment to compare a query sequence either to thousands of models of protein families or to large databases of individual protein sequences. Here we introduce ProteInfer, which instead employs deep convolutional neural networks to predict a variety of protein functions, namely Enzyme Commission (EC) numbers and Gene Ontology (GO) terms, directly from an unaligned amino acid sequence.

Here we detail the LandscapeFold secondary structure prediction algorithm and how it is used. The algorithm was previously described and tested in (Kimchi O et al., Biophys J 117(3):520-532, 2019), though it was not named there.

Over the last 25 years, biology has entered the genomic era and is becoming a science of 'big data'. Most interpretations of genomic analyses rely on accurate functional annotations of the proteins encoded by more than 500 000 genomes sequenced to date. By different estimates, only half the predicted sequenced proteins carry an accurate functional annotation, and this percentage varies drastically between different organismal lineages.

Collagen fibrils are the major constituents of the extracellular matrix, which provides structural support to vertebrate connective tissues. It is widely assumed that the superstructure of collagen fibrils is encoded in the primary sequences of the molecular building blocks. However, the interplay between large-scale architecture and small-scale molecular interactions makes the ab initio prediction of collagen structure challenging.

Understanding the relationship between amino acid sequence and protein function is a long-standing challenge with far-reaching scientific and translational implications. State-of-the-art alignment-based techniques cannot predict function for one-third of microbial protein sequences, hampering our ability to exploit data from diverse organisms. Here, we train deep learning models to accurately predict functional annotations for unaligned amino acid sequences across rigorous benchmark assessments built from the 17,929 families of the protein families database Pfam.

Machine-learning-guided protein design is rapidly emerging as a strategy to find high-fitness multi-mutant variants. In this issue of Cell Systems, Wittman et al. analyze the impact of design decisions for machine-learning-assisted directed evolution (MLDE) on its ability to navigate a fitness landscape and reliably find global optima.

Background: Numerous studies have confirmed the association of ambient temperature and air pollution with a higher risk of morbidities, yet few have addressed their effect on the ocular system. The purpose of this study was to assess the association between temperature, air pollution, and emergency room visits for conjunctivitis.

Methods: In this case-crossover study, the records of all emergency room visits to Soroka University Medical Center (SUMC) from 2009 to 2014 were reviewed for patients with conjunctivitis.

Modern experimental technologies can assay large numbers of biological sequences, but engineered protein libraries rarely exceed the sequence diversity of natural protein families. Machine learning (ML) models trained directly on experimental data without biophysical modeling provide one route to accessing the full potential diversity of engineered proteins. Here we apply deep learning to design highly diverse adeno-associated virus 2 (AAV2) capsid protein variants that remain viable for packaging of a DNA payload.

A central challenge in expanding the genetic code of cells to incorporate noncanonical amino acids into proteins is the scalable discovery of aminoacyl-tRNA synthetase (aaRS)-tRNA pairs that are orthogonal in their aminoacylation specificity. Here we computationally identify candidate orthogonal tRNAs from millions of sequences and develop a rapid, scalable approach, named tRNA Extension (tREX), to determine the in vivo aminoacylation status of tRNAs. Using tREX, we test 243 candidate tRNAs in Escherichia coli and identify 71 orthogonal tRNAs, covering 16 isoacceptor classes, and 23 functional orthogonal tRNA-cognate aaRS pairs.

Collagen fibrils are central to the molecular organization of the extracellular matrix (ECM) and to defining the cellular microenvironment. Glycation of collagen fibrils is known to affect cell adhesion and migration in the context of cancer, and in model studies glycation of collagen molecules has been shown to affect the binding of other ECM components to collagen. Here we use TEM to show that ribose-5-phosphate (R5P) glycation of collagen fibrils, potentially important in the microenvironment of actively dividing cells such as cancer cells, disrupts the longitudinal ordering of the molecules in collagen fibrils and, using KFM and FLiM, that R5P-glycated collagen fibrils have a more negative surface charge than unglycated fibrils.

Collagen consists of three peptides twisted together through a periodic array of hydrogen bonds. Here we use this as inspiration to find design rules for programmed specific interactions for self-assembling synthetic collagenlike triple helices, starting from disordered configurations. The assembly generically nucleates defects in the triple helix, the characteristics of which can be manipulated by spatially varying the enthalpy of helix formation.
