Genetic selection from libraries expressing proteins with randomized amino acid segments is a powerful approach to identify proteins with novel biological activities. Here, we assessed the utility of deep DNA sequencing to characterize the composition, diversity, size and stability of such randomized libraries. We used 454 pyrosequencing to sequence a retroviral library expressing small proteins with randomized transmembrane domains. Despite the potential for unintended random mutagenesis during its construction, the overall hydrophobic composition and diversity of the proteins encoded by the sequenced library conformed well to its design. In addition, our sequencing results allowed us to calculate a more accurate estimate of the number of different proteins encoded by the library and suggested that the traditional methods for estimating the size of randomized libraries may overestimate their true size. Our results further demonstrated that no significant genetic bottlenecks exist in the methods used to express complex retrovirus libraries in mammalian cells and recover library sequences from these cells. These findings suggest that deep sequencing can be used to determine the quality and content of other libraries with randomized segments and to follow individual sequences during selection.
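The abstract does not say how the number of different proteins was estimated from the sequencing data. As an illustration only, the sketch below applies the bias-corrected Chao1 richness estimator, a common way to extrapolate the number of distinct sequences in a population from a finite sample of reads. Using Chao1 here is an assumption and not the authors' method; the function name `chao1_library_size` and the toy reads are hypothetical.

```python
from collections import Counter


def chao1_library_size(reads):
    """Estimate the number of distinct sequences in a library from sampled reads.

    Uses the bias-corrected Chao1 estimator (an assumption for illustration,
    not the method used in the paper):
        S_est = S_obs + f1 * (f1 - 1) / (2 * (f2 + 1))
    where f1 and f2 are the numbers of sequences seen exactly once and twice.
    """
    counts = Counter(reads)  # sequence -> number of reads observed
    observed = len(counts)   # distinct sequences actually seen
    f1 = sum(1 for c in counts.values() if c == 1)  # singletons
    f2 = sum(1 for c in counts.values() if c == 2)  # doubletons
    # The bias-corrected form stays defined even when there are no doubletons.
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


if __name__ == "__main__":
    # Toy reads standing in for sequenced randomized segments (hypothetical data).
    toy_reads = ["AAA", "AAA", "CCC", "GGG", "GGG", "TTT"]
    print(chao1_library_size(toy_reads))
```

Many singletons relative to doubletons indicate that the sample has not saturated the library, which pushes the estimate above the observed count. In practice, sequencing errors inflate the singleton count, so reads would need error filtering before an estimate like this is meaningful.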
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3038463 | PMC |
http://dx.doi.org/10.1093/protein/gzq112 | DOI Listing |