Motivation: Because mRNA vaccines are delivered by varying methods, codon optimization plays a critical role in vaccine design, improving the stability and expression of proteins in specific tissues. Given the many-to-one relationship between synonymous codons and amino acids, the number of mRNA sequences encoding the same amino acid sequence can be enormous. Finding stable and highly expressed mRNA sequences in this vast sequence space using in silico methods can generally be viewed as a path-search problem or a machine translation problem. However, current deep learning-based methods inspired by machine translation may have limitations; for example, recurrent neural networks have a weak ability to capture the long-term dependencies of codon preferences.
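To give a sense of how large this sequence space is, the following minimal sketch counts the coding sequences that encode a given protein under the standard genetic code. The function name `count_synonymous_sequences` and the example peptide are illustrative assumptions, not part of CodonBERT.

```python
from math import prod

# Number of synonymous codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the 61 sense codons, stop codons excluded).
SYNONYMOUS_CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_synonymous_sequences(protein: str) -> int:
    """Count the distinct codon sequences that encode `protein`.

    Hypothetical helper for illustration: each position contributes an
    independent choice among its synonymous codons, so the total is the
    product of the per-residue counts.
    """
    return prod(SYNONYMOUS_CODON_COUNTS[aa] for aa in protein)


# Illustrative usage with a made-up 10-residue peptide.
example_peptide = "MLSRAGVKTP"
n_sequences = count_synonymous_sequences(example_peptide)
```

Because the count is a product over positions, it grows exponentially with protein length, which is why exhaustive enumeration is impractical for full-length antigens.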
Results: We developed CodonBERT, a BERT-based architecture that uses the cross-attention mechanism for codon optimization. In CodonBERT, the codon sequence is randomly masked, with each codon serving as a key and a value, while the amino acid sequence is used as the query. CodonBERT was trained on high-expression transcripts from the Human Protein Atlas mixed with different proportions of codon sequences with a high codon adaptation index. The results showed that CodonBERT can effectively capture the long-term dependencies between codons and amino acids, suggesting that it can be used as a customized training framework for specific optimization targets.
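The query/key/value roles described above can be sketched with a toy single-head scaled dot-product cross-attention. This is not the authors' implementation: the `embed` function is a hypothetical stand-in for learned embeddings, the learned projection matrices and multi-head structure of a real BERT layer are omitted, and the masking probability of 0.15 is an assumption borrowed from common BERT practice.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention.

    Each query vector attends over all key vectors; the output for a query
    is the attention-weighted sum of the value vectors.
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


def mask_codons(codons, mask_prob, rng, mask_token="[MASK]"):
    """Randomly replace codons with a mask token (mask_prob is an assumption)."""
    return [mask_token if rng.random() < mask_prob else c for c in codons]


def embed(token, dim=8):
    """Hypothetical embedding: a fixed pseudo-random vector per token string."""
    token_rng = random.Random(token)
    return [token_rng.uniform(-1.0, 1.0) for _ in range(dim)]


# Toy example: amino acids supply the queries, masked codons the keys and values.
rng = random.Random(0)
amino_acids = ["M", "K", "L"]
codons = ["ATG", "AAG", "CTG"]
masked = mask_codons(codons, mask_prob=0.15, rng=rng)

queries = [embed("aa:" + aa) for aa in amino_acids]
keys = [embed("codon:" + c) for c in masked]
values = keys  # each (possibly masked) codon serves as both key and value

outputs, weights = cross_attention(queries, keys, values)
```

Each row of `weights` is a probability distribution over codon positions for one amino acid, so any position in the sequence can influence any other in a single step, which is the property that distinguishes attention from a recurrent model.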
Availability and Implementation: CodonBERT is freely available at https://github.com/FPPGroup/CodonBERT.
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11226863 | PMC |
http://dx.doi.org/10.1093/bioinformatics/btae330 | DOI Listing |
Bioprocess Biosyst Eng
January 2025
Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Korea.
p-Coumaric acid (p-CA), an invaluable phytochemical, has novel bioactivities, including antiproliferative, anxiolytic, and neuroprotective effects, and is the main precursor of various flavonoids, such as caffeic acid, naringenin, and resveratrol. Herein, we report the engineering of Escherichia coli for de novo production of p-CA via the PAL-C4H pathway. As the base strain, we used the E.
Probiotics Antimicrob Proteins
January 2025
State Key Laboratory of Pathogen and Biosecurity, Academy of Military Medical Sciences, No. 20 Dongda Street, Beijing, 100071, Fengtai District, China.
Human β-defensin (HBD) has been recognized as a promising antimicrobial agent due to its broad-spectrum antimicrobial activity against various pathogens. In our previous work, we engineered a chimeric human β-defensin, designated H4, by fusing human β-defensin 3 and human β-defensin 4, resulting in enhanced antimicrobial activity and salt stability. However, the high cost of chemical synthesis due to the relatively large number of amino acids in H4 has limited its applications.
J Immunol Methods
January 2025
ICAR-Indian Veterinary Research Institute, Bangalore, Karnataka 560024, India.
Bluetongue (BT) is a vector-borne viral disease of multiple domestic and wild ruminants across the globe. The VP7 protein of bluetongue virus (BTV) is the major immune-dominant structural protein that is conserved across the BTV serotypes and therefore, targeted for the development of immuno-diagnostics for BT. In this study, full-length recombinant VP7 protein (rVP7) of BTV-1 was expressed in Trochoplusia ni derived insect cells (Tn5) using codon-optimized synthetic gene construct through baculovirus expression system.
Int J Biol Macromol
January 2025
College of Technology and Engineering, MPUAT, Udaipur, Rajasthan-313001, India.
Lipases, enzymes that perform the hydrolysis of triglycerides into fatty acids and glycerol, present a potential paradigm shift in the realms of food and detergent industries. Their enhanced efficiency, energy conservation and environmentally friendly attributes make them promising substitutes for chemical catalysts. Motivated by this prospect, this present study was targeted on the heterologous expression of a lipase gene, employing E.
Avian Pathol
January 2025
College of Animal Science and Technology/Veterinary Medicine, Anhui Agricultural University, Hefei, PR China.
Goose astrovirus (GoAstV) has emerged as a significant pathogen affecting the goose industry in China, with GoAstV-2 becoming the dominant genotype since 2017. This study explores the genetic and structural factors underlying the prevalence of GoAstV-2, focusing on codon usage bias, spike protein variability, and structural stability. Phylogenetic and effective population size analyses revealed that GoAstV-2 experienced rapid expansion between 2017 and 2018, followed by population stabilization.