Author:
Price W Nicholson,Handelman Samuel K,Everett John K,Tong Saichiu N,Bracic Ana,Luff Jon D,Naumov Victor,Acton Thomas,Manor Philip,Xiao Rong,Rost Burkhard,Montelione Gaetano T,Hunt John F
Abstract
The biochemical and physical factors controlling protein expression level and solubility in vivo remain incompletely characterized. To gain insight into the primary sequence features influencing these outcomes, we performed statistical analyses of results from the high-throughput protein-production pipeline of the Northeast Structural Genomics Consortium. Proteins expressed in E. coli and consistently purified were scored independently for expression and solubility levels. These parameters nonetheless show a very strong positive correlation. We used logistic regressions to determine whether they are systematically influenced by fractional amino acid composition or several bulk sequence parameters including hydrophobicity, sidechain entropy, electrostatic charge, and predicted backbone disorder. Decreasing hydrophobicity correlates with higher expression and solubility levels, but this correlation apparently derives solely from the beneficial effect of three charged amino acids, at least for bacterial proteins. In fact, the three most hydrophobic residues showed very different correlations with solubility level. Leu showed the strongest negative correlation among amino acids, while Ile showed a slightly positive correlation in most data segments. Several other amino acids also had unexpected effects. Notably, Arg correlated with decreased expression and, most surprisingly, solubility of bacterial proteins, an effect only partially attributable to rare codons. However, rare codons did significantly reduce expression despite use of a codon-enhanced strain. Additional analyses suggest that positively but not negatively charged amino acids may reduce translation efficiency in E. coli irrespective of codon usage. While some observed effects may reflect indirect evolutionary correlations, others may reflect basic physicochemical phenomena. 
We used these results to construct and validate predictors of expression and solubility levels and overall protein usability, and we propose new strategies to be explored for engineering improved protein expression and solubility.
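As a rough illustration of the kind of analysis the abstract describes (a logistic regression of an outcome on fractional amino acid composition), a minimal sketch follows. It is not the authors' pipeline. The function names, the choice of Glu fraction as the single predictor, the plain gradient-descent fit, and the four toy sequences with their labels are all assumptions made up for this example and carry no biological meaning.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def fractional_composition(seq):
    """Return the fraction of each standard amino acid in seq."""
    counts = Counter(ch for ch in seq.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by gradient descent on the log-loss.

    Hypothetical helper for illustration; a real analysis would use a
    statistics package and report significance, which this does not.
    """
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += p - y          # gradient w.r.t. intercept
            g1 += (p - y) * x    # gradient w.r.t. slope
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


# Invented toy data: (sequence, binary outcome). NOT from the study.
toy = [
    ("MEEEEKLEEA", 1),
    ("MEELEEKEEG", 1),
    ("MLLLAGVLLA", 0),
    ("MGLLVVALLS", 0),
]
xs = [fractional_composition(seq)["E"] for seq, _ in toy]
ys = [label for _, label in toy]
b0, b1 = fit_logistic(xs, ys)
print("intercept:", b0, "slope:", b1)
```

The sign of the fitted slope indicates whether a higher residue fraction goes with a higher probability of the positive outcome in the supplied data. With invented labels like these it only demonstrates the mechanics, not any finding of the paper.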
Publisher
Springer Science and Business Media LLC
Cited by
26 articles.