The N-terminal coding sequence (NCS) influences gene expression by affecting the translation initiation rate. The NCS optimization problem is to find an NCS that maximizes gene expression, a problem of practical importance in genetic engineering. However, current methods for NCS optimization, such as rational design and statistics-guided approaches, are labor-intensive and yield only relatively small improvements. This paper introduces a deep learning/synthetic biology co-designed few-shot training workflow for NCS optimization. Our method uses $k$-nearest encoding followed by word2vec to encode the NCS, extracts features using attention mechanisms, and constructs a time-series network to predict gene expression intensity; a direct search algorithm then identifies the optimal NCS with limited training data. We took green fluorescent protein (GFP) expressed by as a reporting protein of NCSs, and used the fluorescence enhancement factor as the metric of NCS optimization. Within just six iterative experiments, our model generated an NCS (MLD) that increased average GFP expression by 5.41-fold, outperforming state-of-the-art NCS designs. Extending our findings beyond GFP, we showed that the engineered NCS (MLD) can effectively boost the production of N-acetylneuraminic acid by enhancing the expression of the crucial rate-limiting gene, demonstrating its practical utility. We have open-sourced our NCS expression database and experimental procedures for public use.
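
The abstract names two pieces that are easy to illustrate in isolation: the fluorescence enhancement factor used as the metric, and a direct search over candidate sequences. The Python sketch below is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the single-base substitution neighborhood, and `toy_predictor` (a made-up stand-in for the trained expression model) are all hypothetical.

```python
BASES = "ACGT"


def enhancement_factor(fluor_variant, fluor_control):
    """Fold change in fluorescence of an NCS variant relative to a control."""
    if fluor_control <= 0:
        raise ValueError("control fluorescence must be positive")
    return fluor_variant / fluor_control


def toy_predictor(seq):
    """Hypothetical surrogate score: fraction of A/T bases.

    This only stands in for a trained model so the search can run;
    it carries no biological claim.
    """
    return sum(1 for base in seq if base in "AT") / len(seq)


def direct_search(start, score, max_rounds=10):
    """Greedy coordinate-wise search over single-base substitutions.

    Each round tries every substitution at every position and keeps
    any candidate that strictly improves the score. The search stops
    when a full round yields no improvement.
    """
    best, best_score = start, score(start)
    for _ in range(max_rounds):
        improved = False
        for i in range(len(best)):
            for base in BASES:
                if base == best[i]:
                    continue
                candidate = best[:i] + base + best[i + 1:]
                candidate_score = score(candidate)
                if candidate_score > best_score:
                    best, best_score = candidate, candidate_score
                    improved = True
        if not improved:
            break
    return best, best_score


if __name__ == "__main__":
    sequence, predicted = direct_search("GGCC", toy_predictor)
    print(sequence, predicted)
```

In a workflow like the one described, the surrogate score would come from the trained predictor, and the top candidates from each search round would be measured experimentally and fed back as new training data.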


Source
http://dx.doi.org/10.1021/acssynbio.4c00371

Publication Analysis

Top Keywords

gene expression: 16
ncs optimization: 16
ncs: 12
n-terminal coding: 8
ncs mld: 8
expression: 7
gene: 5
integrating deep: 4
deep learning: 4
learning synthetic: 4

Similar Publications

Pathway analysis plays a critical role in bioinformatics, enabling researchers to identify biological pathways associated with various conditions by analyzing gene expression data. However, the rise of large, multi-center datasets has highlighted limitations in traditional methods like Over-Representation Analysis (ORA) and Functional Class Scoring (FCS), which struggle with low signal-to-noise ratios (SNR) and large sample sizes. To tackle these challenges, we use a deep learning-based classification method, Gene PointNet, and a novel $P$-value computation approach leveraging the confusion matrix to address pathway analysis tasks.


Purpose: After failing primary and secondary hormonal therapy, castration-resistant and neuroendocrine prostate cancer metastatic to the bone is invariably lethal, although treatment with docetaxel and carboplatin can modestly improve survival. Therefore, agents targeting biologically relevant pathways in PCa and potentially synergizing with docetaxel and carboplatin in inhibiting bone metastasis growth are urgently needed.

Experimental Design: Phosphorylated (activated) AXL expression in human prostate cancer bone metastases was assessed by immunohistochemical staining.


Spatial transcriptomics enhances our understanding of cellular organization by mapping gene expression data to precise tissue locations. Here, we present a protocol for using the weighted ensemble method for spatial transcriptomics (WEST), which uses ensemble techniques to boost the robustness and accuracy of existing algorithms. We describe steps for preprocessing data, obtaining embeddings from individual algorithms, and integrating all embeddings into an ensemble similarity matrix.

Article Synopsis
  • Primary ciliary dyskinesia (PCD) is a rare genetic disorder linked to chronic respiratory issues, infertility, and problems with body asymmetry, primarily caused by mutations in the CCDC39 and CCDC40 genes.
  • Researchers used advanced techniques to investigate how these genetic variants impact cellular functions beyond just causing cilia to stop moving.
  • They discovered that the absence of CCDC39/CCDC40 causes the loss of over 90 ciliary structural proteins, leading to cilia dysfunction and other cellular issues, and suggesting that gene therapy could offer a new treatment strategy for PCD.

Primary mitochondrial disorders are most often caused by deleterious mutations in the mitochondrial DNA (mtDNA). Here, we used a mitochondrial DddA-derived cytosine base editor (DdCBE) to introduce a compensatory edit in a mouse model that carries the pathological mutation in the mitochondrial transfer RNA (tRNA) alanine (mt-tRNA) gene. Because the original m.

