AI Article Synopsis

  • Codon optimization is crucial in mRNA vaccine design because it improves the stability and expression of proteins in specific tissues; since several synonymous codons can encode the same amino acid, the number of candidate mRNA sequences is enormous.
  • Researchers developed a BERT-based model called CodonBERT, which uses a cross-attention mechanism for codon optimization and captures long-term dependencies between codons and amino acids.
  • CodonBERT is freely available on GitHub and, according to the authors, can serve as a customized training framework for specific optimization targets.

Article Abstract

Motivation: Because mRNA vaccines are delivered by varying methods, codon optimization plays a critical role in vaccine design, improving the stability and expression of proteins in specific tissues. Given the many-to-one relationship between synonymous codons and amino acids, the number of mRNA sequences encoding the same amino acid sequence can be enormous. Finding stable and highly expressed mRNA sequences in this vast sequence space using in silico methods can generally be viewed as a path-search problem or a machine translation problem. However, current deep learning-based methods inspired by machine translation have limitations; recurrent neural networks, for example, are weak at capturing the long-term dependencies of codon preferences.
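To make the size of that sequence space concrete, the following is a minimal sketch, not taken from the article, that counts how many codon sequences encode a given protein under the standard genetic code. The function name `count_synonymous_sequences` and the example peptide are illustrative assumptions; stop codons are ignored.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the 61 sense codons only, stop codons excluded).
DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_synonymous_sequences(protein: str) -> int:
    """Return how many distinct codon sequences encode `protein`.

    Each position can use any synonymous codon independently, so the
    total is the product of the per-residue degeneracies.
    """
    return prod(DEGENERACY[aa] for aa in protein)


# Hypothetical three-residue peptide Met-Lys-Leu: 1 * 2 * 6 combinations.
print(count_synonymous_sequences("MKL"))
```

Because the count is a product over residues, it grows exponentially with protein length, which is why exhaustive enumeration is impractical for real vaccine antigens.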

Results: We develop a BERT-based architecture that uses the cross-attention mechanism for codon optimization. In CodonBERT, the codon sequence is randomly masked, with each codon serving as a key and a value, while the amino acid sequence is used as the query. CodonBERT was trained on high-expression transcripts from the Human Protein Atlas mixed with different proportions of codon sequences with a high codon adaptation index. The results show that CodonBERT can effectively capture the long-term dependencies between codons and amino acids, suggesting that it can be used as a customized training framework for specific optimization targets.
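The query/key/value arrangement described above can be illustrated with a small, dependency-free sketch of single-head scaled dot-product cross-attention. This is not the CodonBERT implementation: the random embeddings, the embedding size, the masking rate, the `[MASK]` token and all function names are assumptions made for illustration, and a real model would use learned embeddings and separate learned projections for queries, keys and values.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head attention: softmax(Q K^T / sqrt(d)) V, one row per query."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors, column by column.
        out = [sum(wj * v[c] for wj, v in zip(w, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


def mask_codons(codons, rate, rng, mask_token="[MASK]"):
    """Replace each codon with the mask token with probability `rate`."""
    return [mask_token if rng.random() < rate else c for c in codons]


def embed(tokens, dim, rng):
    """Toy embedding: one random vector per distinct token (assumption)."""
    table = {}
    for t in tokens:
        if t not in table:
            table[t] = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return [table[t] for t in tokens]


rng = random.Random(0)
amino_acids = ["M", "K", "L"]      # query side
codons = ["ATG", "AAA", "CTG"]     # key/value side, one codon per residue
masked = mask_codons(codons, rate=0.3, rng=rng)

q = embed(amino_acids, dim=4, rng=rng)
kv = embed(masked, dim=4, rng=rng)  # same vectors reused as keys and values
out, attn = cross_attention(q, kv, kv)

print(masked)
for row in attn:
    print([round(w, 3) for w in row])
```

Each row of `attn` shows how strongly one amino acid position attends to every codon position, including masked ones; in a trained model these weights are what would let a residue draw on codon context far away in the sequence.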

Availability and Implementation: CodonBERT is freely available at https://github.com/FPPGroup/CodonBERT.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11226863
DOI: http://dx.doi.org/10.1093/bioinformatics/btae330

Publication Analysis

Top Keywords

codon optimization: 12
bert-based architecture: 8
codon: 8
cross-attention mechanism: 8
codons amino: 8
amino acids: 8
mrna sequences: 8
amino acid: 8
acid sequence: 8
machine translation: 8

Similar Publications

p-Coumaric acid (p-CA), an invaluable phytochemical, has novel bioactivities, including antiproliferative, anxiolytic, and neuroprotective effects, and is the main precursor of various flavonoids, such as caffeic acid, naringenin, and resveratrol. Herein, we report the engineering of Escherichia coli for de novo production of p-CA via the PAL-C4H pathway. As the base strain, we used the E.

Recombinant Expression of a New Antimicrobial Peptide Composed of hBD-3 and hBD-4 in Escherichia coli and Investigation of Its Activity Against Multidrug-Resistant Bacteria.

Probiotics Antimicrob Proteins

January 2025

State Key Laboratory of Pathogen and Biosecurity, Academy of Military Medical Sciences, No. 20 Dongda Street, Beijing, 100071, Fengtai District, China.

Human β-defensin (HBD) has been recognized as a promising antimicrobial agent due to its broad-spectrum antimicrobial activity against various pathogens. In our previous work, we engineered a chimeric human β-defensin, designated H4, by fusing human β-defensin 3 and human β-defensin 4, resulting in enhanced antimicrobial activity and salt stability. However, the high cost of chemical synthesis due to the relatively large number of amino acids in H4 has limited its applications.

Bluetongue (BT) is a vector-borne viral disease of multiple domestic and wild ruminants across the globe. The VP7 protein of bluetongue virus (BTV) is the major immune-dominant structural protein that is conserved across the BTV serotypes and therefore, targeted for the development of immuno-diagnostics for BT. In this study, full-length recombinant VP7 protein (rVP7) of BTV-1 was expressed in Trochoplusia ni derived insect cells (Tn5) using codon-optimized synthetic gene construct through baculovirus expression system.

Lipases, enzymes that perform the hydrolysis of triglycerides into fatty acids and glycerol, present a potential paradigm shift in the realms of food and detergent industries. Their enhanced efficiency, energy conservation and environmentally friendly attributes make them promising substitutes for chemical catalysts. Motivated by this prospect, this present study was targeted on the heterologous expression of a lipase gene, employing E.

Goose astrovirus (GoAstV) has emerged as a significant pathogen affecting the goose industry in China, with GoAstV-2 becoming the dominant genotype since 2017. This study explores the genetic and structural factors underlying the prevalence of GoAstV-2, focusing on codon usage bias, spike protein variability, and structural stability. Phylogenetic and effective population size analyses revealed that GoAstV-2 experienced rapid expansion between 2017 and 2018, followed by population stabilization.
