CN101160525A - Genetic code increase of non-natural active amino acid - Google Patents


Info

Publication number
CN101160525A
Authority
CN
China
Prior art keywords
amino acid
trna
protein
amino acids
unnatural amino
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2004800211558A
Other languages
Chinese (zh)
Inventor
A. Deiters
A. T. Cropp
J. W. Chin
C. J. Anderson
P. G. Schultz
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Scripps Research Institute
Original Assignee
Scripps Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Scripps Research Institute
Priority to CN201210057706.2A (CN102618605B)
Publication of CN101160525A
Legal status: Pending

Landscapes

  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Preparation Of Compounds By Using Micro-Organisms (AREA)

Abstract

The present invention provides compositions and methods for producing translational components that expand the number of genetically encoded amino acids in eukaryotic cells. The components include orthogonal tRNAs, orthogonal aminoacyl-tRNA synthetases, pairs of tRNAs/synthetases, and unnatural amino acids. Also provided are proteins and methods for producing proteins with unnatural amino acids in eukaryotic cells.

Description

Unnatural Reactive Amino Acid Genetic Code Additions

Cross-Reference to Related Applications

This application is a regular utility patent application based on USSN 60/479,931, entitled "Expanding the Eukaryotic Genetic Code," filed June 18, 2003 by Chin et al.; USSN 60/493,014, entitled "Expanding the Eukaryotic Genetic Code," filed August 5, 2003 by Chin et al.; and USSN 60/496,548, entitled "Expanding the Eukaryotic Genetic Code," filed August 19, 2003 by Chin et al. Priority to and benefit of each of these prior applications is hereby claimed.

Statement Regarding Rights to Inventions Made Under Federally Sponsored Research and Development

This invention was made with government support under National Institutes of Health grant number GM62159 and Department of Energy grant DE-FG0300ER45812. The government has certain rights in this invention.

Field of the Invention

The invention is in the field of translation biochemistry in eukaryotic cells. The invention relates to methods for producing, and compositions of, orthogonal tRNAs, orthogonal synthetases, and pairs thereof in eukaryotic cells. The invention also relates to compositions of unnatural amino acids, to proteins, and to methods of producing proteins that include unnatural amino acids in eukaryotic cells.

Background of the Invention

The genetic code of every known organism, from bacteria to humans, encodes the same twenty common amino acids. Different combinations of these same twenty natural amino acids form proteins that carry out virtually all the complex processes of life, from photosynthesis to signal transduction and the immune response. To study and modify protein structure and function, scientists have attempted to manipulate both the genetic code and the amino acid sequence of proteins. However, it has been difficult to remove the constraint imposed by the genetic code that limits proteins to twenty genetically encoded standard building blocks, with the rare exceptions of selenocysteine (see, e.g., A. Bock et al., (1991), Molecular Microbiology 5:515-20) and pyrrolysine (see, e.g., G. Srinivasan, et al., (2002), Science 296:1459-62).

Some progress has been made in removing these constraints, although this progress has been limited and the ability to rationally control protein structure and function is still in its infancy. For example, chemists have developed methods and strategies to synthesize and manipulate the structures of small molecules (see, e.g., E. J. Corey and X.-M. Cheng, The Logic of Chemical Synthesis (Wiley-Interscience, New York, 1995)). Total synthesis (see, e.g., B. Merrifield, (1986), Science 232:341-7) and semi-synthetic methods (see, e.g., D. Y. Jackson et al., (1994), Science 266:243-7; and P. E. Dawson and S. B. Kent, (2000), Annual Review of Biochemistry 69:923-60) have made it possible to synthesize peptides and small proteins, but these methods are difficult to apply to proteins larger than 10 kilodaltons (kDa). Mutagenesis methods, though powerful, are restricted to a limited number of structural changes. In a number of cases, it has been possible to competitively incorporate close structural analogues of common amino acids throughout a protein. See, e.g., R. Furter, (1998), Protein Science 7:419-26; K. Kirshenbaum, et al., (2002), ChemBioChem 3:235-7; and V. Doring et al., (2001), Science 292:501-4.

In an attempt to expand the ability to manipulate protein structure and function, in vitro methods using chemically acylated orthogonal tRNAs were developed that allow unnatural amino acids to be selectively incorporated in vitro in response to a nonsense codon (see, e.g., J. A. Ellman, et al., (1992), Science 255:197-200). Amino acids with novel structures and physical properties were selectively incorporated into proteins to study protein folding and stability, as well as biomolecular recognition and catalysis. See, e.g., D. Mendel, et al., (1995), Annual Review of Biophysics and Biomolecular Structure 24:435-462; and V. W. Cornish, et al. (March 31, 1995), Angew. Chem. Int. Ed. Engl., 34:621-633. However, the stoichiometric nature of this process severely limits the amount of protein that can be produced.

Unnatural amino acids have also been microinjected into cells. For example, unnatural amino acids were introduced into the nicotinic acetylcholine receptor in Xenopus oocytes (e.g., M. W. Nowak, et al. (1998), In vivo incorporation of unnatural amino acids into ion channels in a Xenopus oocyte expression system, Methods in Enzymology 293:504-529) by microinjection of a chemically misacylated Tetrahymena thermophila tRNA (e.g., M. E. Saks, et al. (1996), An engineered Tetrahymena tRNAGln for in vivo incorporation of unnatural amino acids into proteins by nonsense suppression, J. Biol. Chem. 271:23169-23175) and the corresponding mRNA. This has allowed detailed biophysical studies of the receptor in oocytes through the introduction of amino acids whose side chains have unique physical or chemical properties. See, e.g., D. A. Dougherty (2000), Unnatural amino acids as probes of protein structure and function, Curr. Opin. Chem. Biol. 4:645-652.

Unfortunately, this method is limited to proteins in cells that can be microinjected, and protein yields are very low because the relevant tRNA is chemically acylated in vitro and cannot be re-acylated in the cell.

To overcome these limitations, new components were added to the protein biosynthetic machinery of the prokaryote Escherichia coli (E. coli) (e.g., L. Wang, et al., (2001), Science 292:498-500), which allowed unnatural amino acids to be genetically encoded in vivo. Using this method, a number of new amino acids with novel chemical, physical, or biological properties, including photoaffinity labels and photoisomerizable amino acids, keto amino acids, and glycosylated amino acids, have been incorporated efficiently and with high fidelity into proteins in E. coli in response to the amber codon, TAG. See, e.g., J. W. Chin et al., (2002), Journal of the American Chemical Society 124:9026-9027; J. W. Chin and P. G. Schultz, (2002), ChemBioChem 11:1135-1137; J. W. Chin, et al., (2002), PNAS United States of America 99:11020-11024; and L. Wang and P. G. Schultz, (2002), Chem. Comm., 1-10.

However, the translational machinery of prokaryotes and eukaryotes is not highly conserved; thus, components of the biosynthetic machinery added to E. coli often cannot be used to site-specifically incorporate unnatural amino acids into proteins in eukaryotic cells. For example, the Methanococcus jannaschii tyrosyl-tRNA synthetase/tRNA pair that was used in E. coli is not orthogonal in eukaryotic cells. In addition, transcription of tRNA in eukaryotes, but not in prokaryotes, is carried out by RNA polymerase III, which places restrictions on the primary sequence of the tRNA structural genes that can be transcribed in eukaryotic cells. Moreover, in contrast to prokaryotic cells, tRNAs in eukaryotic cells must be exported from the nucleus, where they are transcribed, to the cytoplasm in order to function in translation. Finally, the eukaryotic 80S ribosome is distinct from the prokaryotic 70S ribosome. Thus, there is a need to develop improved components of the biosynthetic machinery to expand the eukaryotic genetic code. The invention fulfills these and other needs, as will be apparent upon review of the following disclosure.

Summary of the Invention

The invention provides eukaryotic cells with translation components, e.g., pairs of orthogonal aminoacyl-tRNA synthetases (O-RSs) and orthogonal tRNAs (O-tRNAs), and individual components thereof, that are used by the eukaryotic protein biosynthetic machinery to incorporate an unnatural amino acid into a growing polypeptide chain in a eukaryotic cell.

Compositions of the invention include a eukaryotic cell (e.g., a yeast cell (such as a Saccharomyces cerevisiae cell), a mammalian cell, a plant cell, an algal cell, a fungal cell, an insect cell, etc.) that comprises an orthogonal aminoacyl-tRNA synthetase (O-RS) (e.g., derived from a non-eukaryotic organism, such as Escherichia coli, Bacillus stearothermophilus, etc.), where the O-RS preferentially aminoacylates an orthogonal tRNA (O-tRNA) with at least one unnatural amino acid in the eukaryotic cell. Optionally, two or more O-tRNAs can be aminoacylated in a given eukaryotic cell. In one aspect, an O-RS aminoacylates an O-tRNA with the unnatural amino acid, e.g., at least 40%, at least 45%, at least 50%, at least 60%, at least 75%, at least 80%, or even 90% or more, as efficiently as does an O-RS having an amino acid sequence such as that set forth in SEQ ID NO.: 86 or 45. In one embodiment, an O-RS of the invention aminoacylates the O-tRNA with the unnatural amino acid, e.g., at least 10-fold, at least 20-fold, at least 30-fold, etc., more efficiently than the O-RS aminoacylates the O-tRNA with a natural amino acid.

In one embodiment, the O-RS or a portion thereof is encoded by a polynucleotide sequence as set forth in any one of SEQ ID NO.: 3-35 (e.g., 3-19, 20-35, or any other subset of sequences 3-35), or a complementary polynucleotide sequence thereof. In another embodiment, the O-RS comprises an amino acid sequence as set forth in any one of SEQ ID NO.: 36-63 (e.g., 36-47, 48-63, or any other subset of 36-63) and/or 86, or a conservative variation thereof. In yet another embodiment, the O-RS comprises an amino acid sequence that is, e.g., at least 90%, at least 95%, at least 98%, at least 99%, or at least 99.5% or more identical to that of a naturally occurring tyrosyl aminoacyl-tRNA synthetase (TyrRS) and comprises two or more amino acids from groups A-E. Group A includes valine, isoleucine, leucine, glycine, serine, alanine, or threonine at a position corresponding to Tyr37 of E. coli TyrRS. Group B includes aspartate at a position corresponding to Asn126 of E. coli TyrRS. Group C includes threonine, serine, arginine, asparagine, or glycine at a position corresponding to Asp182 of E. coli TyrRS. Group D includes methionine, alanine, valine, or tyrosine at a position corresponding to Phe183 of E. coli TyrRS. Group E includes serine, methionine, valine, cysteine, threonine, or alanine at a position corresponding to Leu186 of E. coli TyrRS.

Any subset of combinations of these groups is a feature of the invention. For example, in one embodiment, the O-RS has two or more amino acids selected from: valine, isoleucine, leucine, or threonine at a position corresponding to Tyr37 of E. coli TyrRS; threonine, serine, arginine, or glycine at a position corresponding to Asp182 of E. coli TyrRS; methionine or tyrosine at a position corresponding to Phe183 of E. coli TyrRS; and serine or alanine at a position corresponding to Leu186 of E. coli TyrRS. In another embodiment, the O-RS includes two or more amino acids selected from: glycine, serine, or alanine at a position corresponding to Tyr37 of E. coli TyrRS; aspartate at a position corresponding to Asn126 of E. coli TyrRS; asparagine at a position corresponding to Asp182 of E. coli TyrRS; alanine or valine at a position corresponding to Phe183 of E. coli TyrRS; and/or methionine, valine, cysteine, or threonine at a position corresponding to Leu186 of E. coli TyrRS.
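
The group (family) definitions above can be read as a simple lookup rule. The following Python sketch encodes the five groups A-E as listed in the text and checks whether a candidate synthetase carries listed residues from two or more groups. The function names and the example variant are hypothetical and serve only as illustration; they are not sequences from the patent.

```python
# Residues listed in the text for each group, keyed by the position that
# corresponds to E. coli TyrRS: (position, wild-type residue, allowed residues).
GROUPS = {
    "A": (37, "Y", set("VILGSAT")),
    "B": (126, "N", set("D")),
    "C": (182, "D", set("TSRNG")),
    "D": (183, "F", set("MAVY")),
    "E": (186, "L", set("SMVCTA")),
}


def groups_satisfied(residues):
    """Return the sorted group letters whose position carries a listed residue.

    `residues` maps an E. coli TyrRS-equivalent position to a one-letter code.
    """
    hits = []
    for letter, (position, _wild_type, allowed) in GROUPS.items():
        if residues.get(position) in allowed:
            hits.append(letter)
    return sorted(hits)


def has_two_or_more_groups(residues):
    """True if residues from at least two of groups A-E are present."""
    return len(groups_satisfied(residues)) >= 2


# Hypothetical variant, for illustration only.
example_variant = {37: "V", 126: "N", 182: "S", 183: "M", 186: "A"}
wild_type = {37: "Y", 126: "N", 182: "D", 183: "F", 186: "L"}
```

With these inputs, `groups_satisfied(example_variant)` reports groups A, C, D, and E, while the wild-type residues satisfy none of the groups.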

In another embodiment, the O-RS has one or more improved or enhanced enzymatic properties for the unnatural amino acid as compared to a natural amino acid. For example, the improved or enhanced properties for the unnatural amino acid as compared to a natural amino acid include any of, e.g., a higher Km, a lower Km, a higher kcat, a lower kcat, a lower kcat/Km, a higher kcat/Km, etc.
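
One common way to compare how an enzyme handles two substrates is the ratio of their catalytic efficiencies, kcat/Km. The Python sketch below shows that arithmetic; the kinetic values are invented for illustration and are not measurements from the patent.

```python
def catalytic_efficiency(k_cat, k_m):
    """Return kcat/Km for one substrate (units follow the inputs)."""
    if k_m <= 0:
        raise ValueError("Km must be positive")
    return k_cat / k_m


def specificity_ratio(unnatural, natural):
    """Ratio of (kcat/Km) for the unnatural amino acid to that for the natural one.

    Each argument is a (k_cat, k_m) tuple measured under the same conditions.
    """
    return catalytic_efficiency(*unnatural) / catalytic_efficiency(*natural)


# Invented example values: (kcat in 1/s, Km in mM).
unnatural_kinetics = (2.0, 0.5)   # kcat/Km = 4.0
natural_kinetics = (1.0, 4.0)     # kcat/Km = 0.25
ratio = specificity_ratio(unnatural_kinetics, natural_kinetics)
```

A ratio above 1 means the enzyme, under these assumed values, processes the unnatural amino acid more efficiently than the natural one.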

The eukaryotic cell also optionally includes the unnatural amino acid(s). The eukaryotic cell optionally includes an orthogonal tRNA (O-tRNA) (e.g., derived from a non-eukaryotic organism, such as Escherichia coli, Bacillus stearothermophilus, and/or the like), where the O-tRNA recognizes a selector codon and is preferentially aminoacylated with the unnatural amino acid by the O-RS. In one aspect, the O-tRNA mediates the incorporation of the unnatural amino acid into a protein with, e.g., at least 45%, at least 50%, at least 60%, at least 75%, at least 80%, at least 90%, at least 95%, or 99% of the efficiency of a tRNA that comprises, or is processed in a cell from, the polynucleotide sequence set forth in SEQ ID NO.: 65. In another aspect, the O-tRNA comprises the sequence of SEQ ID NO.: 65, and the O-RS comprises a polypeptide sequence selected from an amino acid sequence set forth in any one of SEQ ID NO.: 36-63 (e.g., 36-47, 48-63, or any other subset of 36-63) and/or 86, and/or a conservative variation thereof.

In another embodiment, the eukaryotic cell comprises a nucleic acid that comprises a polynucleotide encoding a polypeptide of interest, where the polynucleotide comprises a selector codon that is recognized by the O-tRNA. In one aspect, the yield of the polypeptide of interest comprising the unnatural amino acid is, e.g., at least 2.5%, at least 5%, at least 10%, at least 25%, at least 30%, at least 40%, or 50% or more, of the yield of the naturally occurring polypeptide of interest obtained from a cell in which the polynucleotide lacks the selector codon. In another aspect, the cell produces the polypeptide of interest in the absence of the unnatural amino acid with a yield that is, e.g., less than 35%, less than 30%, less than 20%, less than 15%, less than 10%, less than 5%, less than 2.5%, etc., of the yield of the polypeptide in the presence of the unnatural amino acid.
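
The two percentages described above are simple ratios of measured yields. The Python sketch below shows how they could be computed; the yield figures and variable names are hypothetical and are not data from the patent.

```python
def percent_of(value, reference):
    """Express `value` as a percentage of `reference`."""
    if reference <= 0:
        raise ValueError("reference yield must be positive")
    return 100.0 * value / reference


# Hypothetical yields in mg of purified protein per litre of culture.
yield_without_selector_codon = 80.0   # natural polypeptide, no selector codon
yield_with_unnatural_aa = 12.0        # selector codon, unnatural amino acid added
yield_without_unnatural_aa = 0.6      # selector codon, unnatural amino acid omitted

# Yield of the modified polypeptide relative to the natural polypeptide.
relative_yield = percent_of(yield_with_unnatural_aa, yield_without_selector_codon)

# Background production when the unnatural amino acid is left out.
background = percent_of(yield_without_unnatural_aa, yield_with_unnatural_aa)
```

Under these assumed numbers the relative yield is 15% and the background is 5%, both of which fall inside ranges named in the text.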

The invention also provides a eukaryotic cell comprising an orthogonal aminoacyl-tRNA synthetase (O-RS), an orthogonal tRNA (O-tRNA), an unnatural amino acid, and a nucleic acid that comprises a polynucleotide encoding a polypeptide of interest. The polynucleotide comprises a selector codon that is recognized by the O-tRNA. In addition, the O-RS preferentially aminoacylates the orthogonal tRNA (O-tRNA) with the unnatural amino acid in the eukaryotic cell, and the cell produces the polypeptide of interest in the absence of the unnatural amino acid with a yield that is, e.g., less than 30%, less than 20%, less than 15%, less than 10%, less than 5%, less than 2.5%, etc., of the yield of the polypeptide in the presence of the unnatural amino acid.
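
To make the role of the selector codon concrete, the Python sketch below translates a coding sequence with the standard genetic code and, when suppression is switched on, reads the amber codon TAG as an unnatural amino acid instead of a stop. It is a toy model under stated assumptions: the symbol "X" for the unnatural amino acid, the function name, and the example sequence are all invented for illustration, and real suppression competes with termination rather than being all-or-nothing.

```python
BASES = "TCAG"
# Standard genetic code, ordered with T, C, A, G at each codon position.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(coding_dna, amber_suppression=False, unnatural_symbol="X"):
    """Translate in frame from the first base until a stop codon is read.

    With `amber_suppression` set, TAG is decoded as `unnatural_symbol`,
    standing in for an O-tRNA charged by its O-RS.
    """
    protein = []
    for start in range(0, len(coding_dna) - 2, 3):
        codon = coding_dna[start:start + 3].upper()
        if codon == "TAG" and amber_suppression:
            protein.append(unnatural_symbol)
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":  # TAA, TGA, or unsuppressed TAG ends translation
            break
        protein.append(residue)
    return "".join(protein)


# Invented example: Met-Ala-(amber)-Lys-(ochre stop).
example_gene = "ATGGCTTAGAAATAA"
truncated = translate(example_gene)
full_length = translate(example_gene, amber_suppression=True)
```

Without suppression the example yields the truncated product `MA`; with suppression it yields `MAXK`, which mirrors why little full-length polypeptide is expected when the unnatural amino acid is absent.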

Compositions that include a eukaryotic cell comprising an orthogonal tRNA (O-tRNA) are also a feature of the invention. Typically, the O-tRNA mediates the incorporation, in vivo, of an unnatural amino acid into a protein that is encoded by a polynucleotide comprising a selector codon recognized by the O-tRNA. In one embodiment, the O-tRNA mediates the incorporation of the unnatural amino acid into the protein with, e.g., at least 45%, at least 50%, at least 60%, at least 75%, at least 80%, at least 90%, at least 95%, or even 99% or more of the efficiency of a tRNA that comprises, or is processed in a cell from, the polynucleotide sequence set forth in SEQ ID NO.: 65. In another embodiment, the O-tRNA comprises or is processed from the polynucleotide sequence set forth in SEQ ID NO.: 65, or a conservative variation thereof. In yet another embodiment, the O-tRNA comprises a recyclable O-tRNA.

In one aspect of the invention, the O-tRNA is post-transcriptionally modified. The invention also provides a nucleic acid that encodes an O-tRNA in a eukaryotic cell, or a complementary polynucleotide thereof. In one embodiment, the nucleic acid comprises an A box and a B box.

The invention also features methods of producing translational components, e.g., O-RSs or O-tRNA/O-RS pairs (and translational components produced by these methods). For example, the invention provides methods of producing an orthogonal aminoacyl-tRNA synthetase (O-RS) that preferentially aminoacylates an orthogonal tRNA with an unnatural amino acid in a eukaryotic cell. The method includes, e.g., (a) subjecting to positive selection, in the presence of an unnatural amino acid, a population of eukaryotic cells of a first species, where the eukaryotic cells each comprise: i) a member of a library of aminoacyl-tRNA synthetases (RSs), ii) an orthogonal tRNA (O-tRNA), iii) a polynucleotide that encodes a positive selection marker, and iv) a polynucleotide that encodes a negative selection marker; cells that survive the positive selection comprise an active RS that aminoacylates the orthogonal tRNA (O-tRNA) in the presence of the unnatural amino acid. (b) The cells that survive the positive selection are then subjected to negative selection in the absence of the unnatural amino acid, to eliminate active RSs that aminoacylate the O-tRNA with a natural amino acid. This provides an O-RS that preferentially aminoacylates the O-tRNA with the unnatural amino acid.
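
The two-step selection described above acts as a double sieve on the synthetase library. The Python sketch below models each library member by two Boolean properties and applies the positive and negative steps in order. The class, the field names, and the four-member library are hypothetical simplifications; in particular, real synthetases have graded rather than all-or-nothing activity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthetaseVariant:
    name: str
    charges_unnatural: bool  # aminoacylates the O-tRNA with the unnatural amino acid
    charges_natural: bool    # aminoacylates the O-tRNA with a natural amino acid


def positive_selection(library):
    """Unnatural amino acid present: any active variant lets the cell survive.

    Natural amino acids are always available in the cell, so variants that
    charge the O-tRNA with either kind of amino acid pass this step.
    """
    return [v for v in library if v.charges_unnatural or v.charges_natural]


def negative_selection(survivors):
    """Unnatural amino acid absent: variants using natural amino acids are removed."""
    return [v for v in survivors if not v.charges_natural]


# Hypothetical four-member library covering every combination.
library = [
    SynthetaseVariant("inactive", False, False),
    SynthetaseVariant("natural_only", False, True),
    SynthetaseVariant("unnatural_only", True, False),
    SynthetaseVariant("promiscuous", True, True),
]

after_positive = positive_selection(library)
selected = negative_selection(after_positive)
```

In this toy library only `unnatural_only` passes both steps, which corresponds to an O-RS that preferentially uses the unnatural amino acid.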

In certain embodiments, the polynucleotide that encodes the positive selection marker is operably linked to a response element, and the cells further comprise a polynucleotide that a) encodes a transcriptional modulator protein (e.g., a eukaryotic transcriptional modulator protein, etc.) that modulates transcription from the response element, and b) comprises at least one selector codon. Incorporation of the unnatural amino acid into the transcriptional modulator protein, by the O-tRNA aminoacylated with the unnatural amino acid, results in transcription of the positive selection marker. In one embodiment, the transcriptional modulator protein is a transcriptional activator protein (e.g., GAL4, etc.), and the selector codon is an amber stop codon, e.g., where the amber stop codon is located in or substantially near the portion of the polynucleotide that encodes the DNA binding domain of the transcriptional activator protein.

The positive selection marker can be any of a variety of molecules. In one embodiment, the positive selection marker comprises a nutritional supplement for growth, and the selection is performed on a medium that lacks the nutritional supplement. In another embodiment, the polynucleotide that encodes the positive selection marker is, e.g., a ura3, leu2, lys2, or lacZ gene, or a his3 gene (e.g., where the his3 gene encodes an imidazole glycerol phosphate dehydratase, detected by providing 3-aminotriazole (3-AT)), and/or the like. In yet another embodiment, the polynucleotide that encodes the positive selection marker comprises a selector codon.

As with the positive selection marker, the negative selection marker can also be any of a variety of molecules. In certain embodiments, the polynucleotide that encodes the negative selection marker is operably linked to a response element from which transcription is mediated by the transcriptional modulator protein. Incorporation of a natural amino acid into the transcriptional modulator protein, by the O-tRNA aminoacylated with a natural amino acid, results in transcription of the negative selection marker. In one embodiment, the polynucleotide that encodes the negative selection marker is, e.g., a ura3 gene, and the negative selection is carried out on a medium that comprises 5-fluoroorotic acid (5-FOA). In another embodiment, the medium used for negative selection comprises a selecting or screening agent that is converted to a detectable substance by the negative selection marker. In one aspect of the invention, the detectable substance is a toxic substance. In one embodiment, the polynucleotide that encodes the negative selection marker comprises a selector codon.
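
The marker logic in the preceding paragraphs can be summarized as a small truth table. The Python sketch below assumes a single ura3 reporter whose transcription requires a full-length activator, which in turn requires suppression of the amber codon; the medium labels and function names are hypothetical, and the model ignores leaky expression and partial suppression.

```python
def marker_transcribed(amber_suppressed):
    """A full-length activator, and hence ura3 transcription, requires suppression."""
    return amber_suppressed


def cell_grows(medium, amber_suppressed):
    """Predict growth for a ura3 reporter strain on one of two selective media."""
    ura3_on = marker_transcribed(amber_suppressed)
    if medium == "minus_uracil":
        # Positive selection: the cell must make its own uracil.
        return ura3_on
    if medium == "plus_5FOA":
        # Negative selection: the ura3 product converts 5-FOA to a toxic compound.
        return not ura3_on
    raise ValueError(f"unknown medium: {medium}")


# Enumerate all four combinations of medium and suppression state.
truth_table = {
    (medium, suppressed): cell_grows(medium, suppressed)
    for medium in ("minus_uracil", "plus_5FOA")
    for suppressed in (True, False)
}
```

Using one reporter for both steps reflects the embodiment in which the same marker serves positive and negative selection, depending only on the medium.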

In certain embodiments, the positive selection marker and/or the negative selection marker comprises a polypeptide that fluoresces or catalyzes a luminescent reaction in the presence of a suitable reactant. In one aspect of the invention, the positive selection marker and/or the negative selection marker is detected by fluorescence-activated cell sorting (FACS) or by luminescence. In certain embodiments, the positive selection marker and/or the negative selection marker comprises an affinity-based screening marker or a transcriptional modulator protein. In one embodiment, the same polynucleotide encodes both the positive selection marker and the negative selection marker.

In one embodiment, the polynucleotide that encodes the positive selection marker and/or the negative selection marker of the invention can comprise at least two selector codons, and each or both of these polynucleotides can comprise at least two different selector codons or at least two of the same selector codon.

Additional levels of selection/screening stringency can also be used in the methods of the invention. In one embodiment, the methods can comprise, e.g., providing a varying amount of an inactive synthetase in step (a), step (b), or both steps (a) and (b), where the varying amount of the inactive synthetase provides an additional level of selection or screening stringency. In one embodiment, step (a), step (b), or both steps (a) and (b) of the method for producing an O-RS include varying the selection or screening stringency, e.g., the stringency of the positive and/or negative selection marker. The method optionally includes subjecting the O-RS that preferentially aminoacylates the O-tRNA with the unnatural amino acid to additional selection rounds, e.g., additional positive selection rounds, additional negative selection rounds, or a combination of additional positive and negative selection rounds.

In one embodiment, the selecting/screening comprises one or more positive or negative selections/screens chosen from, e.g., a change in amino acid permeability, a change in translation efficiency, a change in translational fidelity, etc. The one or more changes are based on a mutation in one or more polynucleotides that encode a component of the orthogonal tRNA-tRNA synthetase pair used to produce the protein.

Typically, the library of RSs (e.g., a library of mutant RSs) comprises RSs derived from at least one aminoacyl-tRNA synthetase (RS), e.g., from a non-eukaryotic organism. In one embodiment, the library of RSs is derived from an inactive RS, e.g., where the inactive RS is generated by mutating an active RS. In another embodiment, the inactive RS comprises an amino acid binding pocket, and one or more of the amino acids that make up the binding pocket are substituted with one or more different amino acids, e.g., the substituted amino acids are replaced with alanines.

In certain embodiments, the method of producing an O-RS further includes performing random mutagenesis, site-specific mutagenesis, recombination, chimeric construction, or any combination thereof, on a nucleic acid that encodes an RS, thereby producing the library of mutant RSs. In certain embodiments, the method further includes, e.g., (c) isolating a nucleic acid that encodes the O-RS; (d) generating from the nucleic acid a set of polynucleotides that encode mutated O-RSs (e.g., by random mutagenesis, site-specific mutagenesis, chimeric construction, recombination, or any combination thereof); and (e) repeating steps (a) and/or (b) until a mutated O-RS is obtained that preferentially aminoacylates the O-tRNA with the unnatural amino acid. In one aspect of the invention, steps (c)-(e) are performed at least two times.
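
The repetition of steps (c)-(e) is an iterative diversify-and-select loop. The Python sketch below shows the control flow only: a binding pocket is represented as a short string of residues, diversification generates all single-site substitutions, and selection is a caller-supplied scoring function. Everything here is hypothetical, including the start pocket, the alphabet, and the scoring function, which compares against a target that would not be known in a real experiment.

```python
def single_site_mutants(pocket, alphabet):
    """All variants of `pocket` that differ from it at exactly one position."""
    mutants = []
    for i, original in enumerate(pocket):
        for residue in alphabet:
            if residue != original:
                mutants.append(pocket[:i] + residue + pocket[i + 1:])
    return mutants


def evolve(start, alphabet, score, passes, max_rounds=5):
    """Repeat diversification and selection until a variant passes.

    Returns the best variant found and the number of rounds performed.
    """
    best = start
    for round_number in range(1, max_rounds + 1):
        library = single_site_mutants(best, alphabet)  # stand-in for step (d)
        best = max(library + [best], key=score)        # stand-in for steps (a)/(b)
        if passes(best):
            return best, round_number
    return best, max_rounds


# Toy scoring function: count positions matching an invented target pocket.
TARGET = "VDSM"


def toy_score(pocket):
    return sum(1 for a, b in zip(pocket, TARGET) if a == b)


result, rounds_used = evolve(
    start="YNDF",
    alphabet="ADFMNSVY",
    score=toy_score,
    passes=lambda pocket: toy_score(pocket) == len(TARGET),
)
```

Because each round changes one position, the toy run needs four rounds to reach the invented target, which illustrates why more than one pass through the loop can be required.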

Methods of producing O-tRNA/O-RS pairs are also a feature of the invention. In one embodiment, the O-RS is obtained as described above, and the O-tRNA is obtained by subjecting to negative selection a population of eukaryotic cells of a first species, where the eukaryotic cells comprise a member of a library of tRNAs, to eliminate cells that comprise a member of the library of tRNAs that is aminoacylated by an aminoacyl-tRNA synthetase (RS) endogenous to the eukaryotic cells. This provides a pool of tRNAs that are orthogonal to the eukaryotic cell of the first species. In one aspect of the invention, the library of tRNAs comprises tRNAs derived from at least one tRNA, e.g., from a non-eukaryotic organism. In another aspect of the invention, the library of aminoacyl-tRNA synthetases (RSs) comprises RSs derived from at least one aminoacyl-tRNA synthetase (RS), e.g., from a non-eukaryotic organism. In yet another aspect of the invention, the library of tRNAs comprises tRNAs derived from at least one tRNA from a first non-eukaryotic organism. The library of aminoacyl-tRNA synthetases (RSs) optionally comprises RSs derived from at least one aminoacyl-tRNA synthetase (RS) from a second non-eukaryotic organism. In one embodiment, the first and second non-eukaryotic organisms are the same. Alternatively, the first and second non-eukaryotic organisms can be different. Specific O-tRNA/O-RS pairs produced by the methods of the invention are also a feature of the invention.

Another feature of the invention is a method of producing translational components in one species and introducing the selected/screened translational components into a second species. For example, the method of producing an O-tRNA/O-RS pair in a first species (e.g., a eukaryotic species, such as yeast and the like) further includes introducing a nucleic acid that encodes the O-tRNA and a nucleic acid that encodes the O-RS into a eukaryotic cell of a second species (e.g., a mammal, an insect, a fungus, an alga, a plant, and the like). The second species can use the introduced translational components to incorporate an unnatural amino acid into a growing polypeptide chain in vivo, e.g., during translation.

In another example, a method of producing an orthogonal aminoacyl-tRNA synthetase (O-RS) that preferentially aminoacylates an orthogonal tRNA with an unnatural amino acid in a eukaryotic cell includes: (a) subjecting to positive selection, in the presence of an unnatural amino acid, a population of eukaryotic cells of a first species (e.g., a eukaryotic species, such as yeast or the like). The eukaryotic cells of the first species each comprise: i) a member of a library of aminoacyl-tRNA synthetases (RSs), ii) an orthogonal tRNA (O-tRNA), iii) a polynucleotide that encodes a positive selection marker, and iv) a polynucleotide that encodes a negative selection marker. The cells that survive the positive selection comprise an active RS that aminoacylates the orthogonal tRNA (O-tRNA) in the presence of the unnatural amino acid. The cells that survive the positive selection are then subjected to negative selection in the absence of the unnatural amino acid to eliminate active RSs that aminoacylate the O-tRNA with a natural amino acid, thereby providing an O-RS that preferentially aminoacylates the O-tRNA with the unnatural amino acid. A nucleic acid that encodes the O-tRNA and a nucleic acid that encodes the O-RS are introduced into a eukaryotic cell of a second species (e.g., a mammal, an insect, a fungus, an alga, a plant, and/or the like). These components, when translated in the second species, can be used to incorporate unnatural amino acids into a protein or polypeptide of interest in the second species. In one embodiment, the O-tRNA and/or the O-RS are introduced into a eukaryotic cell of the second species.
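The logic of the positive selection followed by negative selection can be illustrated with a small sketch. This is only a toy model under stated assumptions: the `SynthetaseVariant` class, its two boolean flags, and the example library are hypothetical and do not come from the specification, and real selections depend on marker expression levels rather than yes/no properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthetaseVariant:
    """Hypothetical RS library member; the flags are illustrative, not measured data."""
    name: str
    charges_unnatural: bool  # aminoacylates the O-tRNA with the unnatural amino acid
    charges_natural: bool    # aminoacylates the O-tRNA with a natural amino acid


def positive_selection(library):
    """Unnatural amino acid present: any variant that charges the O-tRNA survives."""
    return [v for v in library if v.charges_unnatural or v.charges_natural]


def negative_selection(survivors):
    """Unnatural amino acid absent: variants that still charge the O-tRNA are removed."""
    return [v for v in survivors if not v.charges_natural]


def two_step_selection(library):
    """Positive selection first, then negative selection on the survivors."""
    return negative_selection(positive_selection(library))


# Assumed example library covering the four possible behaviours.
library = [
    SynthetaseVariant("inactive", False, False),
    SynthetaseVariant("promiscuous", True, True),
    SynthetaseVariant("natural_only", False, True),
    SynthetaseVariant("selective", True, False),
]

selected = two_step_selection(library)
print([v.name for v in selected])  # prints ['selective']
```

In this model only the variant that charges the O-tRNA with the unnatural amino acid, and not with a natural one, passes both rounds, which corresponds to the O-RS described above.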

In certain embodiments, the O-tRNA is obtained by subjecting a population of eukaryotic cells of the first species to negative selection, where the eukaryotic cells comprise a member of a library of tRNAs, to eliminate cells that comprise a member of the library of tRNAs that is aminoacylated by an aminoacyl-tRNA synthetase (RS) endogenous to the eukaryotic cells. This provides a pool of tRNAs that are orthogonal to the eukaryotic cells of the first species and the second species.

In one aspect, the invention includes a composition comprising a protein, where the protein comprises at least one unnatural amino acid and at least one post-translational modification, and where the at least one post-translational modification comprises attachment of a molecule comprising a second reactive group, by a [3+2] cycloaddition, to the at least one unnatural amino acid comprising a first reactive group.

Thus, proteins (or polypeptides of interest) with at least one unnatural amino acid are also a feature of the invention. In certain embodiments of the invention, a protein with at least one unnatural amino acid includes at least one post-translational modification. In one embodiment, the at least one post-translational modification comprises attachment of a molecule comprising a second reactive group (e.g., a dye, a polymer such as a derivative of polyethylene glycol, a photocrosslinker, a cytotoxic compound, an affinity label, a derivative of biotin, a resin, a second protein or polypeptide, a metal chelator, a cofactor, a fatty acid, a carbohydrate, a polynucleotide (e.g., DNA, RNA, etc.), and the like), by a [3+2] cycloaddition, to the at least one unnatural amino acid comprising a first reactive group. For example, the first reactive group is an alkynyl moiety (e.g., in the unnatural amino acid p-propargyloxyphenylalanine; this group is sometimes also referred to as an acetylene moiety) and the second reactive group is an azido moiety. In another example, the first reactive group is the azido moiety (e.g., in the unnatural amino acid p-azido-L-phenylalanine) and the second reactive group is the alkynyl moiety. In certain embodiments, a protein of the invention includes at least one unnatural amino acid (e.g., a keto unnatural amino acid) comprising at least one post-translational modification, where the at least one post-translational modification comprises a saccharide moiety. In certain embodiments, the post-translational modification is made in vivo in a eukaryotic cell.

In certain embodiments, the protein includes at least one post-translational modification that is made in vivo by a eukaryotic cell, where the post-translational modification is not made by a prokaryotic cell. Examples of post-translational modifications include, but are not limited to, acetylation, acylation, lipid modification, palmitoylation, palmitate addition, phosphorylation, glycolipid-linkage modification, and the like. In one embodiment, the post-translational modification comprises attachment of an oligosaccharide to an asparagine by a GlcNAc-asparagine linkage (e.g., where the oligosaccharide comprises (GlcNAc-Man)2-Man-GlcNAc-GlcNAc, and the like). In another embodiment, the post-translational modification comprises attachment of an oligosaccharide (e.g., Gal-GalNAc, Gal-GlcNAc, etc.) to a serine or threonine by a GalNAc-serine, a GalNAc-threonine, a GlcNAc-serine, or a GlcNAc-threonine linkage. In certain embodiments, a protein or polypeptide of the invention can comprise a secretion or localization sequence, an epitope tag, a FLAG tag, a polyhistidine tag, a GST fusion, and/or the like.

Typically, the proteins are, e.g., at least 60%, at least 70%, at least 75%, at least 80%, at least 90%, at least 95%, or even at least 99% or more identical to any available protein (e.g., a therapeutic protein, a diagnostic protein, an industrial enzyme, or a portion thereof, and/or the like), and they comprise one or more unnatural amino acids. In one embodiment, a composition of the invention includes a protein or polypeptide of interest and an excipient (e.g., a buffer, a pharmaceutically acceptable excipient, etc.).
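The percent-identity thresholds above can be made concrete with a short calculation. The sketch below assumes the two sequences have already been aligned to equal length with `-` as the gap character, and it counts gapped columns in the denominator; other conventions exist, and the specification does not prescribe one here. The function name and the example sequences are illustrative only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of compared alignment columns holding the same residue.

    Assumes pre-aligned sequences of equal length, with '-' marking gaps.
    Columns where both sequences have a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Made-up ten-residue sequences differing at two positions.
reference = "MKTAYIAKQR"
variant = "MKTAYLAKQK"
print(percent_identity(reference, variant))  # prints 80.0
```

Under this convention the variant would satisfy the "at least 80%" threshold but not the "at least 90%" threshold.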

A protein or polypeptide of interest can contain at least one, at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine, or ten or more unnatural amino acids. The unnatural amino acids can be the same or different; e.g., there can be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more different sites in the protein that comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more different unnatural amino acids. In certain embodiments, at least one, but fewer than all, of a particular amino acid present in a naturally occurring version of the protein is substituted with an unnatural amino acid.

Examples of a protein (or polypeptide of interest) include, but are not limited to, e.g., a cytokine, a growth factor, a growth factor receptor, an interferon, an interleukin, an inflammatory molecule, an oncogene product, a peptide hormone, a signal transduction molecule, a steroid hormone receptor, erythropoietin (EPO), insulin, human growth hormone, alpha-1 antitrypsin, angiostatin, antihemolytic factor, an antibody, an apolipoprotein, an apoprotein, atrial natriuretic factor, atrial natriuretic polypeptide, atrial peptide, a C-X-C chemokine, T39765, NAP-2, ENA-78, Gro-a, Gro-b, Gro-c, IP-10, GCP-2, NAP-4, SDF-1, PF4, MIG, calcitonin, c-kit ligand, a cytokine, a CC chemokine, monocyte chemoattractant protein-1, monocyte chemoattractant protein-2, monocyte chemoattractant protein-3, monocyte inflammatory protein-1α, monocyte inflammatory protein-1β, RANTES, I309, R83915, R91733, HCC1, T58847, D31065, T64262, CD40, CD40 ligand, C-kit ligand, collagen, colony stimulating factor (CSF), complement factor 5a, a complement inhibitor, complement receptor 1, a cytokine, DHFR, epithelial neutrophil activating peptide-78, GROα/MGSA, GROβ, GROγ, MIP-1α, MIP-1δ, MCP-1, epidermal growth factor (EGF), epithelial neutrophil activating peptide, erythropoietin (EPO), exfoliating toxin, Factor IX, Factor VII, Factor VIII, Factor X, fibroblast growth factor (FGF), fibrinogen, fibronectin, G-CSF, GM-CSF, glucocerebrosidase, gonadotropin, a growth factor, a growth factor receptor, Hedgehog protein, hemoglobin, hepatocyte growth factor (HGF), hirudin, human serum albumin, ICAM-1, ICAM-1 receptor, LFA-1, LFA-1 receptor, insulin, insulin-like growth factor (IGF), IGF-I, IGF-II, an interferon, IFN-α, IFN-β, IFN-γ, an interleukin, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, keratinocyte growth factor (KGF), lactoferrin, leukemia inhibitory factor, luciferase, neurturin, neutrophil inhibitory factor (NIF), oncostatin M, osteogenic protein, an oncogene product, parathyroid hormone, PD-ECSF, PDGF, a peptide hormone, human growth hormone, pleiotropin, protein A, protein G, pyrogenic exotoxin A, B, or C, relaxin, renin, SCF, soluble complement receptor I, soluble ICAM-1, a soluble interleukin receptor, a soluble TNF receptor, somatomedin, somatostatin, somatotropin, streptokinase, a superantigen, staphylococcal enterotoxin, SEA, SEB, SEC1, SEC2, SEC3, SED, SEE, a steroid hormone receptor, superoxide dismutase (SOD), toxic shock syndrome toxin, thymosin α1, tissue plasminogen activator, tumor growth factor (TGF), TGF-α, TGF-β, tumor necrosis factor, tumor necrosis factor α, tumor necrosis factor β, tumor necrosis factor receptor (TNFR), VLA-4 protein, VCAM-1 protein, vascular endothelial growth factor (VEGF), urokinase, Mos, Ras, Raf, Met, p53, Tat, Fos, Myc, Jun, Myb, Rel, an estrogen receptor, a progesterone receptor, a testosterone receptor, an aldosterone receptor, an LDL receptor, SCF/c-Kit, CD40L/CD40, VLA-4/VCAM-1, ICAM-1/LFA-1, hyalurin/CD44, corticosterone, a protein present in GenBank or other available databases, and the like, and/or a portion thereof. In one embodiment, the polypeptide of interest includes a transcriptional modulator protein (e.g., a transcriptional activator protein (such as GAL4), or a transcriptional repressor protein, etc.) or a portion thereof.

Compositions of a GAL4 protein, or a portion thereof, in a eukaryotic cell are also a feature of the invention. Typically, the GAL4 protein or portion thereof comprises at least one unnatural amino acid.

A eukaryotic cell of the invention provides the ability to synthesize proteins that comprise unnatural amino acids in large useful quantities. For example, proteins comprising an unnatural amino acid can be produced at a concentration of, e.g., at least 10 micrograms/liter, at least 50 micrograms/liter, at least 75 micrograms/liter, at least 100 micrograms/liter, at least 200 micrograms/liter, at least 250 micrograms/liter, or at least 500 micrograms/liter or more of protein in a cell extract, a buffer, a pharmaceutically acceptable excipient, and/or the like. In certain embodiments, a composition of the invention includes, e.g., at least 10 micrograms, at least 50 micrograms, at least 75 micrograms, at least 100 micrograms, at least 200 micrograms, at least 250 micrograms, or at least 500 micrograms or more of protein that comprises an unnatural amino acid.

In certain embodiments, the protein or polypeptide of interest (or portion thereof) is encoded by a nucleic acid. Typically, the nucleic acid comprises at least one selector codon, at least two selector codons, at least three selector codons, at least four selector codons, at least five selector codons, at least six selector codons, at least seven selector codons, at least eight selector codons, at least nine selector codons, or even ten or more selector codons.

The invention also provides methods for producing, in a eukaryotic cell, at least one protein comprising at least one unnatural amino acid (as well as proteins produced by such methods). The methods include, e.g., growing, in an appropriate medium, a eukaryotic cell that comprises a nucleic acid that comprises at least one selector codon and encodes the protein. The eukaryotic cell also comprises an orthogonal tRNA (O-tRNA) that functions in the cell and recognizes the selector codon, and an orthogonal aminoacyl-tRNA synthetase (O-RS) that preferentially aminoacylates the O-tRNA with the unnatural amino acid, and the medium comprises the unnatural amino acid. In one embodiment, the O-RS aminoacylates the O-tRNA with the unnatural amino acid with, e.g., at least 45%, at least 50%, at least 60%, at least 75%, at least 80%, at least 90%, at least 95%, or even 99% or more of the efficiency of an O-RS having an amino acid sequence as set forth in SEQ ID NO.: 86 or 45. In another embodiment, the O-tRNA comprises, is processed from, or is encoded by SEQ ID NO.: 64 or 65, or a complementary polynucleotide sequence thereof. In yet another embodiment, the O-RS comprises an amino acid sequence as set forth in any one of SEQ ID NO.: 36-63 (e.g., 36-47, 48-63, or any other subset of 36-63) and/or 86.

In one embodiment, the method further includes incorporating the unnatural amino acid into the protein, where the unnatural amino acid comprises a first reactive group; and contacting the protein with a molecule (e.g., a dye, a polymer such as a derivative of polyethylene glycol, a photocrosslinker, a cytotoxic compound, an affinity label, a derivative of biotin, a resin, a second protein or polypeptide, a metal chelator, a cofactor, a fatty acid, a carbohydrate, a polynucleotide (e.g., DNA, RNA, etc.), and the like) that comprises a second reactive group. The first reactive group reacts with the second reactive group to attach the molecule to the unnatural amino acid through a [3+2] cycloaddition. In one embodiment, the first reactive group is an alkynyl or azido moiety and the second reactive group is an azido or alkynyl moiety. For example, the first reactive group is the alkynyl moiety (e.g., in the unnatural amino acid p-propargyloxyphenylalanine) and the second reactive group is the azido moiety. In another example, the first reactive group is the azido moiety (e.g., in the unnatural amino acid p-azido-L-phenylalanine) and the second reactive group is the alkynyl moiety.

In certain embodiments, the encoded protein comprises a therapeutic protein, a diagnostic protein, an industrial enzyme, or a portion thereof. In one embodiment, the protein produced by the method is further modified through the unnatural amino acid. For example, the unnatural amino acid is modified through, e.g., a nucleophilic-electrophilic reaction, a [3+2] cycloaddition, etc. In another embodiment, the protein produced by the method is modified in vivo by at least one post-translational modification (e.g., N-glycosylation, O-glycosylation, acetylation, acylation, lipid modification, palmitoylation, palmitate addition, phosphorylation, glycolipid-linkage modification, and the like).

Methods of producing a screening or selecting transcriptional modulator protein are also provided (as are screening or selecting transcriptional modulator proteins produced by such methods). The methods include, e.g., selecting a first polynucleotide sequence, where the polynucleotide sequence encodes a nucleic acid binding domain; and mutating the first polynucleotide sequence to include at least one selector codon. This provides a screening or selecting polynucleotide sequence. The methods also include, e.g., selecting a second polynucleotide sequence, where the second polynucleotide sequence encodes a transcriptional activation domain; providing a construct that comprises the screening or selecting polynucleotide sequence operably linked to the second polynucleotide sequence; and introducing the construct, an unnatural amino acid, an orthogonal tRNA synthetase (O-RS), and an orthogonal tRNA (O-tRNA) into a cell. With these components, the O-RS preferentially aminoacylates the O-tRNA with the unnatural amino acid, and the O-tRNA recognizes the selector codon and incorporates the unnatural amino acid into the nucleic acid binding domain in response to the selector codon in the screening or selecting polynucleotide sequence. This provides the screening or selecting transcriptional modulator protein.

In certain embodiments, the compositions and methods of the invention include eukaryotic cells. A eukaryotic cell of the invention includes any of, e.g., a mammalian cell, a yeast cell, a fungal cell, a plant cell, an insect cell, etc. The translational components of the invention can be derived from a variety of organisms, e.g., non-eukaryotic organisms, such as a prokaryotic organism (e.g., E. coli, Bacillus stearothermophilus, or the like) or an archaebacterium, or, e.g., a eukaryotic organism.

A selector codon of the invention expands the genetic codon framework of the eukaryotic protein biosynthetic machinery. Any of a variety of selector codons can be used in the invention, including stop codons (e.g., an amber codon, an ochre codon, or an opal stop codon), nonsense codons, rare codons, four (or more) base codons, and/or the like.
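As an illustration of the stop-codon class of selector codons, the following sketch scans the reading frame of a coding sequence and reports amber (TAG), ochre (TAA), and opal (TGA) codons, written here in their DNA coding-strand form. The function name, the return format, and the example sequence are assumptions made for illustration; four-base and rare codons are not handled.

```python
# DNA coding-strand spellings of the three stop codons (UAG, UAA, UGA in mRNA).
STOP_SELECTOR_CODONS = {"TAG": "amber", "TAA": "ochre", "TGA": "opal"}


def find_selector_codons(coding_dna: str, frame: int = 0):
    """Return (codon_number, codon, name) for each in-frame stop codon.

    codon_number is 1-based and counted from the start of the chosen frame.
    """
    seq = coding_dna.upper()
    hits = []
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_SELECTOR_CODONS:
            hits.append(((i - frame) // 3 + 1, codon, STOP_SELECTOR_CODONS[codon]))
    return hits


# Made-up sequence: ATG GCT TAG GGT TGA
example = "ATGGCTTAGGGTTGA"
print(find_selector_codons(example))
# prints [(3, 'TAG', 'amber'), (5, 'TGA', 'opal')]
```

In a construct designed for unnatural amino acid incorporation, an internal amber codon such as codon 3 above would be the site read by the O-tRNA, while the final codon would typically serve as the ordinary translation stop.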

Examples of unnatural amino acids that can be used in the compositions and methods described herein include (but are not limited to): p-acetyl-L-phenylalanine, p-iodo-L-phenylalanine, O-methyl-L-tyrosine, p-propargyloxyphenylalanine, p-propargyl-phenylalanine, L-3-(2-naphthyl)alanine, 3-methyl-phenylalanine, O-4-allyl-L-tyrosine, 4-propyl-L-tyrosine, tri-O-acetyl-GlcNAcβ-serine, L-Dopa, fluorinated phenylalanine, isopropyl-L-phenylalanine, p-azido-L-phenylalanine, p-acyl-L-phenylalanine, p-benzoyl-L-phenylalanine, L-phosphoserine, phosphonoserine, phosphonotyrosine, p-bromophenylalanine, p-amino-L-phenylalanine; an unnatural analogue of a tyrosine amino acid; an unnatural analogue of a glutamine amino acid; an unnatural analogue of a phenylalanine amino acid; an unnatural analogue of a serine amino acid; an unnatural analogue of a threonine amino acid; an alkyl, aryl, acyl, azido, cyano, halo, hydrazine, hydrazide, hydroxyl, alkenyl, alkynyl, ether, thiol, sulfonyl, seleno, ester, thioacid, boronic acid, boronate, phospho, phosphono, phosphine, heterocyclic, enone, imine, aldehyde, hydroxylamine, keto, or amino substituted amino acid, or any combination thereof; an amino acid with a photoactivatable cross-linker; a spin-labeled amino acid; a fluorescent amino acid; a metal-binding amino acid; a metal-containing amino acid; a radioactive amino acid; a photocaged and/or photoisomerizable amino acid; a biotin- or biotin-analogue-containing amino acid; a keto-containing amino acid; an amino acid comprising polyethylene glycol or polyether; a heavy-atom-substituted amino acid; a chemically cleavable or photocleavable amino acid; an amino acid with an elongated side chain; an amino acid containing a toxic group; a sugar-substituted amino acid; a carbon-linked sugar-containing amino acid; a redox-active amino acid; an α-hydroxy-containing acid; an amino thio acid; an α,α-disubstituted amino acid; a β-amino acid; a cyclic amino acid other than proline or histidine; an aromatic amino acid other than phenylalanine, tyrosine, or tryptophan; and/or the like.

The invention also provides polypeptides (O-RSs) and polynucleotides, e.g., O-tRNAs, polynucleotides that encode O-RSs or portions thereof (e.g., the active site of the synthetase), oligonucleotides used to construct aminoacyl-tRNA synthetase mutants, polynucleotides that encode a protein or polypeptide of interest and comprise one or more selector codons, etc. For example, polypeptides of the invention include a polypeptide that comprises an amino acid sequence as set forth in any one of SEQ ID NO.: 36-63 (e.g., 36-47, 48-63, or any other subset of 36-63) and/or 86; a polypeptide that comprises an amino acid sequence encoded by a polynucleotide sequence as set forth in any one of SEQ ID NO.: 3-35 (e.g., 3-19, 20-35, or any other subset of sequences 3-35); and a polypeptide that is specifically immunoreactive with an antibody specific for a polypeptide that comprises an amino acid sequence as set forth in any one of SEQ ID NO.: 36-63 (e.g., 36-47, 48-63, or any other subset of 36-63) and/or 86, or for a polypeptide that comprises an amino acid sequence encoded by a polynucleotide sequence as set forth in any one of SEQ ID NO.: 3-35 (e.g., 3-19, 20-35, or any other subset of sequences 3-35).

Polypeptides of the invention also include a polypeptide that comprises an amino acid sequence at least 90% identical to that of a naturally occurring tyrosyl aminoacyl-tRNA synthetase (TyrRS) (e.g., SEQ ID NO.: 2) and that comprises two or more amino acids of groups A-E (noted above). Similarly, polypeptides of the invention also optionally include a polypeptide that comprises at least 20 contiguous amino acids of any one of SEQ ID NO.: 36-63 (e.g., 36-47, 48-63, or any other subset of 36-63) and/or 86, and two or more amino acid substitutions as indicated in groups A-E. An amino acid sequence comprising a conservative variation of any of the above polypeptides is also included as a polypeptide of the invention.

In one embodiment, a composition includes a polypeptide of the invention and an excipient (e.g., a buffer, water, a pharmaceutically acceptable excipient, etc.). The invention also provides an antibody or antiserum specifically immunoreactive with a polypeptide of the invention.

Polynucleotides are also provided in the invention. Polynucleotides of the invention include those that encode proteins or polypeptides of interest of the invention with one or more selector codons. In addition, polynucleotides of the invention include, e.g., a polynucleotide comprising a nucleotide sequence as set forth in any one of SEQ ID NO.: 3-35 (e.g., 3-19, 20-35, or any other subset of sequences 3-35) or 64-85; a polynucleotide that is complementary to, or that encodes, such a polynucleotide sequence; and/or a polynucleotide encoding a polypeptide that comprises an amino acid sequence as set forth in any one of SEQ ID NO.: 36-63 (e.g., 36-47, 48-63, or any other subset of 36-63) and/or 86, or a conservative variation thereof. A polynucleotide of the invention also includes a polynucleotide that encodes a polypeptide of the invention. Similarly, a nucleic acid that hybridizes to a polynucleotide indicated above under highly stringent conditions over substantially the entire length of the nucleic acid is a polynucleotide of the invention.

A polynucleotide of the invention also includes a polynucleotide that encodes a polypeptide that comprises an amino acid sequence at least 90% identical to that of a naturally occurring tyrosyl aminoacyl-tRNA synthetase (TyrRS) (e.g., SEQ ID NO.: 2) and that comprises two or more mutations as indicated in groups A-E (noted above). A polynucleotide that is at least 70% (or at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% or more) identical to a polynucleotide indicated above, and/or a polynucleotide comprising a conservative variation of any of the polynucleotides indicated above, is also included among the polynucleotides of the invention.

In certain embodiments, a vector (e.g., a plasmid, a cosmid, a phage, a virus, etc.) comprises a polynucleotide of the invention. In one embodiment, the vector is an expression vector. In another embodiment, the expression vector includes a promoter operably linked to one or more of the polynucleotides of the invention. In another embodiment, a cell comprises a vector that includes a polynucleotide of the invention.

In another aspect, the invention provides compositions of compounds and methods of producing such compounds. For example, the compounds include, e.g., an unnatural amino acid (such as p-(propargyloxy)-phenylalanine (e.g., 1 in Figure 11)), azido dyes (such as those shown in chemical structure 4 and chemical structure 6), and an alkynyl polyethylene glycol (e.g., as shown in chemical structure 7), where n is an integer between, e.g., 50 and 10,000, 75 and 5,000, 100 and 2,000, 100 and 1,000, etc. In embodiments of the invention, the alkynyl polyethylene glycol has a molecular weight of, e.g., about 5,000 to about 100,000 Da, about 20,000 to about 50,000 Da, about 20,000 to about 10,000 Da (e.g., 20,000 Da).
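The relationship between the repeat count n and the molecular weight of a polyethylene glycol chain can be checked with simple arithmetic. The sketch below assumes the usual -CH2CH2O- repeat unit (about 44.05 Da) and ignores the alkynyl and other end groups, so it gives only an approximate backbone mass; the function names are illustrative and not taken from the specification.

```python
# Average mass of one -CH2CH2O- repeat unit (C2H4O), in daltons.
ETHYLENE_OXIDE_UNIT_DA = 44.053


def peg_backbone_mass(n: int) -> float:
    """Approximate mass of n repeat units, end groups ignored."""
    return n * ETHYLENE_OXIDE_UNIT_DA


def repeat_units_for_mass(target_da: float) -> int:
    """Nearest whole number of repeat units for a target backbone mass."""
    return round(target_da / ETHYLENE_OXIDE_UNIT_DA)


print(repeat_units_for_mass(20_000))    # prints 454
print(repeat_units_for_mass(100_000))   # prints 2270
print(round(peg_backbone_mass(1_000)))  # prints 44053
```

On this estimate, the 20,000 Da example corresponds to roughly 454 repeat units, which falls within the n ranges of 50 to 10,000, 75 to 5,000, 100 to 2,000, and 100 to 1,000 listed above.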

Figure A20048002115500231

Various compositions comprising these compounds, e.g., with proteins and cells, are also provided. In one aspect, the composition includes the p-(propargyloxy)-phenylalanine unnatural amino acid and further includes an orthogonal tRNA. The unnatural amino acid can be bonded (e.g., covalently) to the orthogonal tRNA, e.g., covalently bonded to the orthogonal tRNA through an amino-acyl bond, covalently bonded to a 3'OH or a 2'OH of a terminal ribose sugar of the orthogonal tRNA, etc.

In one aspect of the invention, a protein comprising an azido dye (e.g., of chemical structure 4 or chemical structure 6) further includes at least one unnatural amino acid (e.g., an alkynyl amino acid), where the azido dye is attached to the unnatural amino acid through a [3+2] cycloaddition.

In one embodiment, a protein comprises the alkynyl polyethylene glycol of chemical structure 7. In another embodiment, the composition further includes at least one unnatural amino acid (e.g., an azido amino acid), where the alkynyl polyethylene glycol is attached to the unnatural amino acid through a [3+2] cycloaddition.

Methods for synthesizing various compounds are included in the invention. For example, a method for synthesizing a p-(propargyloxy)phenylalanine compound is provided. The method comprises, e.g., (a) suspending N-tert-butoxycarbonyl-tyrosine and K2CO3 in anhydrous DMF; and (b) adding propargyl bromide to the reaction mixture of (a) and alkylating the hydroxyl and carboxyl groups, resulting in a protected intermediate compound having the structure:

Figure A20048002115500241

and (c) mixing the protected intermediate compound with anhydrous HCl in MeOH and deprotecting the amine moiety, thereby synthesizing the p-(propargyloxy)phenylalanine compound. In one embodiment, the method further comprises (d) dissolving the p-(propargyloxy)phenylalanine HCl in aqueous NaOH and MeOH and stirring at room temperature; (e) adjusting the pH to 7; and (f) precipitating the p-(propargyloxy)phenylalanine compound.

Methods for synthesizing an azido dye are also provided. For example, a method comprises: (a) providing a dye compound comprising a sulfonyl halide moiety; and (b) warming the dye compound to room temperature in the presence of 3-azidopropylamine and triethylamine and coupling the amine moiety of the 3-azidopropylamine to the halide position of the dye compound, thereby synthesizing the azido dye. In one embodiment, the dye compound comprises dansyl chloride, and the azido dye comprises the composition of chemical structure 4. In one aspect, the method further comprises purifying the azido dye from the reaction mixture.

In another example, a method for synthesizing an azido dye comprises (a) providing an amine-containing dye compound; and (b) combining the amine-containing dye compound with a carbodiimide and 4-(3-azidopropylcarbamoyl)-butyric acid in a suitable solvent, and coupling a carbonyl group of the acid to the amine moiety of the dye compound, thereby synthesizing the azido dye. In one embodiment, the carbodiimide comprises 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride (EDCI). In one aspect, the amine-containing dye comprises fluoresceinamine, the suitable solvent comprises pyridine, and the azido dye comprises the composition of chemical structure 6. In one embodiment, the method further comprises (c) precipitating the azido dye; (d) washing the precipitate with HCl; (e) dissolving the washed precipitate in EtOAc; and (f) precipitating the azido dye in hexanes.

Methods for synthesizing a propargyl amide polyethylene glycol are also provided. For example, the method comprises reacting propargylamine with polyethylene glycol (PEG)-hydroxysuccinimide ester in an organic solvent (e.g., CH2Cl2) at room temperature, resulting in the propargyl amide polyethylene glycol of chemical structure 7. In one embodiment, the method further comprises precipitating the propargyl amide polyethylene glycol using ethyl acetate. In one aspect, the method further includes recrystallizing the propargyl amide polyethylene glycol in methanol and drying the product under vacuum.

Kits are also a feature of the invention. For example, a kit for producing a protein that comprises at least one unnatural amino acid in a cell is provided, where the kit includes a container holding a polynucleotide sequence encoding an O-tRNA, or the O-tRNA itself, and a polynucleotide sequence encoding an O-RS, or the O-RS itself. In one embodiment, the kit further includes at least one unnatural amino acid. In another embodiment, the kit further comprises instructional materials for producing the protein.

Definitions

Before the invention is described in detail, it is to be understood that the invention is not limited to particular devices or biological systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. As used in this specification and the appended claims, the singular forms "a," "an" and "the" include plural referents unless the content clearly dictates otherwise. Thus, for example, reference to "a cell" includes a combination of two or more cells; reference to "bacteria" includes mixtures of bacteria, and the like.

Unless otherwise defined herein or in the remainder of the specification below, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the invention belongs.

Homologous: Proteins and/or protein sequences are "homologous" when they are derived, naturally or artificially, from a common ancestral protein or protein sequence. Similarly, nucleic acids and/or nucleic acid sequences are homologous when they are derived, naturally or artificially, from a common ancestral nucleic acid or nucleic acid sequence. For example, any naturally occurring nucleic acid can be modified by any available mutagenesis method to include one or more selector codons. When expressed, this mutagenized nucleic acid encodes a polypeptide comprising one or more unnatural amino acids. The mutation process can, of course, additionally alter one or more standard codons, thereby changing one or more standard amino acids in the resulting mutant protein as well.

Homology is generally inferred from sequence similarity between two or more nucleic acids or proteins (or sequences thereof). The precise percentage of similarity between sequences that is useful in establishing homology varies with the nucleic acid and protein at issue, but as little as 25% sequence similarity is routinely used to establish homology. Higher levels of sequence similarity, e.g., 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95% or 99% or more, can also be used to establish homology. Generally available methods for determining sequence similarity percentages (e.g., BLASTP and BLASTN using default parameters) are described herein.
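To make the percentage thresholds above concrete, the following Python sketch computes a simple percent identity for two sequences that have already been aligned. It is a minimal illustration under stated assumptions and is not BLASTP or BLASTN: it performs no alignment or scoring of its own, it treats "-" as the gap character, and it counts a gap opposite a residue as a mismatch. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Both inputs must be pre-aligned (equal length, '-' for gaps).
    Columns where both sequences have a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # shared gap column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    identity = percent_identity("MKV-LA", "MKVTLG")
    print(f"{identity:.1f}% identical; meets 25% threshold: {identity >= 25.0}")
```

A production comparison would normally rely on an alignment tool such as BLAST, as the text indicates; this sketch only shows how a percentage is derived once an alignment exists.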

Orthogonal: As used herein, the term "orthogonal" refers to a molecule (e.g., an orthogonal tRNA (O-tRNA) and/or an orthogonal aminoacyl-tRNA synthetase (O-RS)) that functions with the endogenous components of a cell with reduced efficiency as compared to a corresponding molecule that is endogenous to the cell or translation system, or that fails to function with the endogenous components of the cell. In the context of tRNAs and aminoacyl-tRNA synthetases, orthogonal refers to an inability or reduced efficiency, e.g., less than 20% efficient, less than 10% efficient, less than 5% efficient, or less than 1% efficient, of an orthogonal tRNA to function with an endogenous tRNA synthetase as compared to an endogenous tRNA functioning with the endogenous tRNA synthetase, or of an orthogonal aminoacyl-tRNA synthetase to function with an endogenous tRNA as compared to an endogenous tRNA synthetase functioning with the endogenous tRNA. The orthogonal molecule lacks a functional endogenous complementary molecule in the cell. For example, an orthogonal tRNA in a cell is aminoacylated by any endogenous RS of the cell with reduced or even zero efficiency, when compared to aminoacylation of an endogenous tRNA by the endogenous RS. In another example, an orthogonal RS aminoacylates any endogenous tRNA in a cell of interest with reduced or even zero efficiency, as compared to aminoacylation of the endogenous tRNA by an endogenous RS. A second orthogonal molecule can be introduced into the cell that functions with the first orthogonal molecule. For example, an orthogonal tRNA/RS pair includes introduced complementary components that function together in the cell with an efficiency (e.g., 50%, 60%, 70%, 75%, 80%, 90%, 95%, or 99% or more) relative to that of a corresponding endogenous tRNA/RS pair.

Complementary: The term "complementary" refers to components of an orthogonal pair, an O-tRNA and an O-RS, that can function together, e.g., where the O-RS aminoacylates the O-tRNA.

Preferentially aminoacylates: The term "preferentially aminoacylates" refers to an efficiency, e.g., 70%, 75%, 85%, 90%, 95%, or 99% or more, at which an O-RS aminoacylates an O-tRNA with an unnatural amino acid, as compared to the O-RS aminoacylating a naturally occurring tRNA or a starting material used to generate the O-tRNA. The unnatural amino acid is incorporated into a growing polypeptide chain with high fidelity, e.g., at greater than 75% efficiency for a given selector codon, at greater than about 80% efficiency for a given selector codon, at greater than about 90% efficiency for a given selector codon, at greater than about 95% efficiency for a given selector codon, or at greater than about 99% or more efficiency for a given selector codon.

Selector codon: The term "selector codon" refers to a codon that is recognized by the O-tRNA in the translation process and is not recognized by an endogenous tRNA. The O-tRNA anticodon loop recognizes the selector codon on the mRNA and incorporates its amino acid, e.g., an unnatural amino acid, at this site in the polypeptide. Selector codons can include, e.g., nonsense codons, such as stop codons, e.g., amber, ochre, and opal codons; four or more base codons; rare codons; codons derived from natural or unnatural base pairs; and/or the like.
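As a minimal sketch of how an amber selector codon might be placed in, and located within, a coding sequence, consider the Python example below. It is an illustration under stated assumptions rather than a method from the specification: it handles only three-base selector codons on the DNA coding strand (TAG for amber), uses 1-based residue numbering, and treats the final codon as the natural terminator. The function names and the toy sequence are hypothetical.

```python
def split_codons(cds: str) -> list[str]:
    """Split an in-frame coding sequence into three-base codons."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def introduce_selector_codon(cds: str, residue_number: int, selector: str = "TAG") -> str:
    """Replace the codon at a 1-based residue position with a selector codon."""
    codons = split_codons(cds.upper())
    if not 1 <= residue_number <= len(codons):
        raise ValueError("residue number out of range")
    codons[residue_number - 1] = selector
    return "".join(codons)


def find_selector_codons(cds: str, selector: str = "TAG") -> list[int]:
    """Return 1-based positions of in-frame selector codons.

    The last codon is excluded because it is assumed to be the natural stop.
    """
    codons = split_codons(cds.upper())
    return [i for i, codon in enumerate(codons[:-1], start=1) if codon == selector]


if __name__ == "__main__":
    toy_cds = "ATGACTCGTTAA"  # hypothetical four-codon sequence: Met-Thr-Arg-stop
    mutant = introduce_selector_codon(toy_cds, 2)
    print(mutant, find_selector_codons(mutant))
```

Four-base codons and codons built from unnatural base pairs, which the definition also covers, would need a different representation and are outside this sketch.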

Suppressor tRNA: A suppressor tRNA is a tRNA that alters the reading of a messenger RNA (mRNA) in a given translation system, e.g., by providing a mechanism for incorporating an amino acid into a polypeptide chain in response to a selector codon. For example, a suppressor tRNA can read through, e.g., a stop codon, a four base codon, a rare codon, and/or the like.

Recyclable tRNA: The term "recyclable tRNA" refers to a tRNA that is aminoacylated and can be repeatedly reaminoacylated with an amino acid (e.g., an unnatural amino acid) for the incorporation of that amino acid into one or more polypeptide chains during translation.

Translation system: The term "translation system" refers to the collective set of components that incorporate a naturally occurring amino acid into a growing polypeptide chain (protein). Components of a translation system can include, e.g., ribosomes, tRNAs, synthetases, mRNA, amino acids, and the like. The components of the invention (e.g., O-RSs, O-tRNAs, unnatural amino acids, etc.) can be added to an in vitro or in vivo translation system, e.g., a eukaryotic cell such as a yeast cell, a mammalian cell, a plant cell, an algal cell, a fungal cell, an insect cell, and/or the like.

Unnatural amino acid: As used herein, the term "unnatural amino acid" refers to any amino acid, modified amino acid, and/or amino acid analogue that is not one of the 20 common naturally occurring amino acids, selenocysteine, or pyrrolysine.

Derived from: As used herein, the term "derived from" refers to a component that is isolated from, or made using information from, a specified molecule or organism.

Inactive RS: As used herein, the term "inactive RS" refers to a synthetase that has been mutated so that it can no longer aminoacylate its natural cognate tRNA with an amino acid.

Positive selection or screening marker: As used herein, the term "positive selection or screening marker" refers to a marker that, when present, e.g., expressed, activated, or the like, allows cells with the positive selection marker to be identified from among cells without it.

Negative selection or screening marker: As used herein, the term "negative selection or screening marker" refers to a marker that, when present, e.g., expressed, activated, or the like, allows identification of a cell that does not possess the desired property (e.g., as compared to a cell that does possess the desired property).

Reporter: As used herein, the term "reporter" refers to a component that can be used to select target components of a system of interest. For example, a reporter can include a fluorescent screening marker (e.g., green fluorescent protein), a luminescent marker (e.g., a firefly luciferase protein), an affinity-based screening marker, or a selectable marker gene such as his3, ura3, leu2, lys2, lacZ, β-gal/lacZ (β-galactosidase), Adh (alcohol dehydrogenase), or the like.

Eukaryote: As used herein, the term "eukaryote" refers to organisms belonging to the phylogenetic domain Eucarya, such as animals (e.g., mammals, insects, reptiles, birds, etc.), ciliates, plants (e.g., monocots, dicots, algae, etc.), fungi, yeasts, flagellates, microsporidia, protists, etc.

Non-eukaryote: As used herein, the term "non-eukaryote" refers to non-eukaryotic organisms. For example, a non-eukaryotic organism can belong to the Eubacteria (e.g., Escherichia coli, Thermus thermophilus, Bacillus stearothermophilus, etc.) phylogenetic domain, or to the Archaea (e.g., Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Halobacterium such as Haloferax volcanii and Halobacterium species NRC-1, Archaeoglobus fulgidus, Pyrococcus furiosus, Pyrococcus horikoshii, Aeropyrum pernix, etc.) phylogenetic domain.

Antibody: The term "antibody," as used herein, includes, but is not limited to, a polypeptide substantially encoded by one or more immunoglobulin genes, or fragments thereof, that specifically binds and recognizes an analyte (antigen). Examples include polyclonal, monoclonal, chimeric, and single chain antibodies, and the like. Fragments of immunoglobulins, including Fab fragments and fragments produced by an expression library, including phage display, are also included in the term "antibody" as used herein. For antibody structure and terminology, see, e.g., Paul, Fundamental Immunology, 4th Ed., 1999, Raven Press, New York.

Conservative variant: The term "conservative variant" refers to a translation component, e.g., a conservative variant O-tRNA or a conservative variant O-RS, that functions like the component on which it is based, e.g., an O-tRNA or O-RS, but has variations in its sequence. For example, an O-RS will aminoacylate a complementary O-tRNA or a conservative variant O-tRNA with an unnatural amino acid, although the O-tRNA and the conservative variant O-tRNA do not have the same sequence. The conservative variant can have, e.g., one variation, two variations, three variations, four variations, or five or more variations in sequence, as long as the conservative variant is complementary to the corresponding O-tRNA or O-RS.

Selection or screening agent: As used herein, the term "selection or screening agent" refers to an agent that, when present, allows certain components to be selected or screened from a population. For example, a selection or screening agent includes, but is not limited to, a nutrient, an antibiotic, a wavelength of light, an antibody, an expressed polynucleotide (e.g., a transcriptional modulator protein), or the like. The selection agent can be varied, e.g., by concentration, intensity, etc.

Detectable substance: As used herein, the term "detectable substance" refers to an agent that, when activated, altered, expressed, or the like, allows certain components to be selected or screened from a population. For example, the detectable substance can be a chemical agent, e.g., 5-fluoroorotic acid (5-FOA), which under certain conditions, e.g., expression of a URA3 reporter, becomes detectable, e.g., as a toxic product that kills cells expressing the URA3 reporter.

Brief description of the drawings

Figure 1, Panels A, B and C schematically illustrate a general positive and negative selection scheme for expanding the genetic code of a eukaryotic cell, e.g., S. cerevisiae. Panel A schematically illustrates the activated transcription of reporter genes, which is driven by amber suppression of TAG codons in GAL4. The striped box indicates the DNA binding domain, and the shaded boxes indicate the major and cryptic activation domains. Panel B illustrates examples of reporter genes, e.g., HIS3, LacZ, and URA3 in MaV203. Panel C schematically illustrates plasmids that can be used in the selection scheme, e.g., pEcTyrRS/tRNACUA and pGADGAL4xxTAG.

Figure 2 illustrates the EcTyrRS- and tRNACUA-dependent phenotypes of first generation GAL4 reporters on selective media. DB-AD is a fusion between the GAL4 DNA binding domain and activation domain. DB-TAG-AD has a TAG codon in place of a tyrosine codon in the synthetic linker between DB and AD. A5 is an inactive version of EcTyrRS in which 5 residues in the active site have been mutated to alanine.

Figure 3, Panels A and B illustrate the EcTyrRS- and tRNACUA-dependent phenotypes of second generation GAL4 reporters on selective media. The striped box indicates the DNA binding domain, and the shaded boxes indicate the major and cryptic activation domains. Panel A illustrates constructs each with a single amino acid mutation in GAL4. Panel B illustrates constructs each with two amino acid mutations in GAL4.

Figure 4, Panels A, B and C illustrate pGADGAL4 (T44TAG, R110TAG) with and without EcTyrRS, and various reporters in MaV203. Panel A shows the results in the presence of X-gal, -Ura, or -Leu, -Trp. Panel B shows the results in the presence of varying concentrations of 3-AT. Panel C shows the results in the presence of varying percentages of 5-FOA.

Figure 5, Panels A and B illustrate ONPG hydrolysis with various GAL4 mutants, e.g., where residues T44 (A) and R110 (B) are permissive sites. Panel A illustrates ONPG hydrolysis measured with various types of mutations at the T44 site. Panel B illustrates ONPG hydrolysis measured with various types of mutations at the R110 site. 'GAL4' is MaV203 transformed with pCL1 and was off scale at ~600 ONPG hydrolysis units. 'None' is MaV203 transformed with plasmids encoding the GAL4 DB and the GAL4 AD separately.

Figure 6 shows the selection of active EcTyrRS clones. MaV203 containing a 1:10 mixture of pEcTyrRS-tRNACUA : pA5-tRNACUA was plated at a 10^3 dilution on (-Leu, -Trp) plates (left) or (-Leu, -Trp, -His + 50 mM 3-AT) plates (right) and processed using an XGAL overlay.

Figure 7, Panels A and B. Panel A illustrates a stereoview of the active site of B. stearothermophilus tyrosyl-tRNA synthetase with bound tyrosine. The mutated residues are shown and correspond to the E. coli tyrosyl-tRNA synthetase residues Tyr37 (B. stearothermophilus TyrRS residue Tyr34), Asn126 (Asn123), Asp182 (Asp176), Phe183 (Phe177), and Leu186 (Leu180). Panel B illustrates the structural formulae of examples of unnatural amino acids (from left to right): p-acetyl-L-phenylalanine (1), p-benzoyl-L-phenylalanine (2), p-azido-L-phenylalanine (3), O-methyl-L-tyrosine (4), and p-iodo-L-tyrosine (5).

Figure 8, Panels A, B, C and D. Panel A illustrates vectors and reporter constructs that can be used in selection/screening for orthogonal tRNAs, orthogonal aminoacyl synthetases, or orthogonal tRNA/RS pairs in eukaryotic cells. Panel B illustrates the phenotypes of yeast harboring GAL4-responsive HIS3, URA3 and lacZ reporters on selective media, in response to active (TyrRS) or inactive (A5RS) aminoacyl-tRNA synthetases. Panel C illustrates an example of a selection scheme used to select mutant synthetases that encode additional amino acids in a eukaryotic cell, e.g., S. cerevisiae, where UAA is an unnatural amino acid. Panel D illustrates the phenotypes of yeast isolated from a selection with p-acetyl-L-phenylalanine.

Figure 9 illustrates protein expression of human superoxide dismutase (hSOD) (33TAG)HIS in S. cerevisiae genetically encoding the unnatural amino acids illustrated in Figure 7, Panel B.

Figure 10, Panels A-H illustrate tandem mass spectra of the tryptic peptide VY*GSIK (SEQ ID NO: 87) containing the unnatural amino acids (denoted Y*) illustrated in Figure 7, Panel B. Panel A illustrates the tandem mass spectrum of the tryptic peptide with the unnatural amino acid p-acetyl-L-phenylalanine (1). Panel B illustrates the tandem mass spectrum of the tryptic peptide with the unnatural amino acid p-benzoyl-L-phenylalanine (2). Panel C illustrates the tandem mass spectrum of the tryptic peptide with the unnatural amino acid p-azido-L-phenylalanine (3). Panel D illustrates the tandem mass spectrum of the tryptic peptide with the unnatural amino acid O-methyl-L-tyrosine (4). Panel E illustrates the tandem mass spectrum of the tryptic peptide with the unnatural amino acid p-iodo-L-tyrosine (5). Panel F illustrates the tandem mass spectrum of the tryptic peptide with tryptophan (W) at position Y*. Panel G illustrates the tandem mass spectrum of the tryptic peptide with tyrosine (Y) at position Y*. Panel H illustrates the tandem mass spectrum of the tryptic peptide with leucine (L) at position Y*.

Figure 11 illustrates examples of two unnatural amino acids: (1) p-propargyloxyphenylalanine and (2) p-azidophenylalanine.

Figure 12, Panels A, B and C illustrate SOD expression in the presence or absence of unnatural amino acids 1 and 2 shown in Figure 11. Panel A illustrates a Gelcode Blue stain experiment. Panel B illustrates a western blot experiment with an anti-SOD antibody. Panel C illustrates a western blot experiment with an anti-6xHis antibody.

Figure 13, Panels A, B and C illustrate protein labeling by [3+2] cycloaddition. Panel A illustrates the synthesized dye labels 3-6. Panel B illustrates the reaction between SOD and the dye. Panel C illustrates in-gel fluorescence scanning and Gelcode Blue staining.

Figure 14 illustrates the growth of eukaryotic cells, e.g., S. cerevisiae cells, transformed with synthetase mutants in the absence or presence of 1 or 2 shown in Figure 11, on SD media lacking uracil.

Figure 15, Panels A and B illustrate tandem mass spectra of the tryptic peptide VY*GSIK (SEQ ID NO: 87) containing the azide (Az) (Panel A) or alkyne (Al) (Panel B) unnatural amino acid at position Y*, shown with their expected fragment ion masses. Arrows indicate the observed b (blue) and y (red) ion series for each peptide.

Figure 16 schematically illustrates the in vivo incorporation of an unnatural amino acid, e.g., p-propargyloxyphenylalanine, into a growing polypeptide chain, and the bioconjugation of small organic molecules to that unnatural amino acid through a [3+2] cycloaddition reaction.

Figure 17, Panels A, B and C illustrate PEGylation of a protein comprising an unnatural amino acid using a [3+2] cycloaddition. Panel A illustrates the reaction of a propargyl amide PEG with a protein comprising an azido amino acid (e.g., N3-SOD) in the presence of Cu(I) and phosphate buffer (PB). Panel B illustrates analysis of the PEGylated protein by gel. Panel C illustrates the synthesis of the propargyl amide PEG.

Detailed description of the invention

The ability to genetically modify the structures of proteins directly in eukaryotic cells, beyond the chemical constraints imposed by the genetic code, would provide a powerful molecular tool to both probe and manipulate cellular processes. The invention provides translational components that expand the number of genetically encoded amino acids in eukaryotic cells. These include tRNAs (e.g., orthogonal tRNAs (O-tRNAs)), aminoacyl-tRNA synthetases (e.g., orthogonal synthetases (O-RSs)), O-tRNA/O-RS pairs, and unnatural amino acids.

Typically, O-tRNAs of the invention are expressed and processed efficiently and function in translation in a eukaryotic cell, but are not significantly aminoacylated by the host's aminoacyl-tRNA synthetases. In response to a selector codon, an O-tRNA of the invention delivers an unnatural amino acid, i.e., an amino acid that is not one of the common twenty amino acids, to a growing polypeptide chain during mRNA translation.

An O-RS of the invention preferentially aminoacylates an O-tRNA of the invention with an unnatural amino acid in a eukaryotic cell, but does not aminoacylate any of the cytoplasmic host's tRNAs. Moreover, the specificity of an aminoacyl-tRNA synthetase of the invention is such that it accepts the unnatural amino acid while rejecting any endogenous amino acid. Polypeptides that include the amino acid sequences of example O-RSs, or portions thereof, are also a feature of the invention. In addition, polynucleotides that encode translational components, O-tRNAs, O-RSs, and portions thereof are features of the invention.

The invention also provides methods of producing the desired translational components, e.g., O-RSs and/or orthogonal pairs (an orthogonal tRNA and an orthogonal aminoacyl-tRNA synthetase), that use an unnatural amino acid in a eukaryotic cell (and translational components produced by such methods). For example, a tyrosyl-tRNA synthetase/tRNACUA pair from E. coli is an O-tRNA/O-RS pair of the invention. In addition, the invention describes methods of selecting/screening translational components in one eukaryotic cell; once selected/screened, those components can be used in a different eukaryotic cell (one that was not used for the selection/screening). For example, the selection/screening methods that produce translational components for eukaryotic cells can be carried out in yeast, e.g., Saccharomyces cerevisiae, and the selected components can then be used in another eukaryotic cell, e.g., another yeast cell, a mammalian cell, an insect cell, a plant cell, a fungal cell, etc.

The invention further provides methods for producing a protein in a eukaryotic cell, where the protein comprises an unnatural amino acid. The protein is produced using the translational components of the invention. The invention also provides proteins that include unnatural amino acids (and proteins produced by the methods of the invention). A protein or polypeptide of interest can also include a post-translational modification, e.g., one that is added through a [3+2] cycloaddition or a nucleophilic-electrophilic reaction, one that is not made by a prokaryotic cell, etc. In certain embodiments, the invention also includes methods of producing a transcriptional modulator protein with an unnatural amino acid (and proteins produced by such methods). Compositions that include proteins comprising an unnatural amino acid are also a feature of the invention.

Kits for producing a protein or polypeptide with an unnatural amino acid are also a feature of the invention.

Orthogonal aminoacyl-tRNA synthetase (O-RS)

To specifically incorporate an unnatural amino acid into a protein or polypeptide of interest in a eukaryotic cell, the substrate specificity of the synthetase is altered so that only the desired unnatural amino acid, and none of the twenty common amino acids, is charged onto the tRNA. If the orthogonal synthetase is promiscuous, it will produce mutant proteins with a mixture of natural and unnatural amino acids at the target position. The invention provides compositions of, and methods of producing, orthogonal aminoacyl-tRNA synthetases that have modified substrate specificity for a specific unnatural amino acid.

A eukaryotic cell that includes an orthogonal aminoacyl-tRNA synthetase (O-RS) is a feature of the invention. The O-RS preferentially aminoacylates an orthogonal tRNA (O-tRNA) with an unnatural amino acid in the eukaryotic cell. In certain embodiments, the O-RS utilizes more than one unnatural amino acid, e.g., two or more, three or more, etc. Thus, an O-RS of the invention can have the capability to preferentially aminoacylate an O-tRNA with different unnatural amino acids. This allows an additional level of control by selecting which unnatural amino acid or combination of unnatural amino acids is put into the cell and/or by selecting the different amounts of unnatural amino acids that are put into the cell for incorporation.

An O-RS of the invention optionally has one or more improved or enhanced enzymatic properties for the unnatural amino acid as compared to a natural amino acid. These properties include, e.g., a higher Km, a lower Km, a higher kcat, a lower kcat, a lower kcat/Km, a higher kcat/Km, etc., for the unnatural amino acid as compared to a naturally occurring amino acid, e.g., one of the 20 known common amino acids.
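The kinetic comparison described above can be illustrated with a short calculation. The following is a minimal sketch, assuming Michaelis-Menten parameters are available for both substrates; the class name, function name, and all numerical values are hypothetical and are not data from this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kinetics:
    """Michaelis-Menten parameters of a synthetase for one amino acid substrate."""
    k_cat: float  # turnover number, in 1/s
    k_m: float    # Michaelis constant, in M

    @property
    def efficiency(self) -> float:
        """Catalytic efficiency kcat/Km, in 1/(M*s)."""
        return self.k_cat / self.k_m


def specificity_ratio(unnatural: Kinetics, natural: Kinetics) -> float:
    """Ratio of catalytic efficiencies; a value above 1 favors the unnatural amino acid."""
    return unnatural.efficiency / natural.efficiency


# Hypothetical values chosen only to illustrate the arithmetic.
unnatural = Kinetics(k_cat=0.5, k_m=1e-4)
natural = Kinetics(k_cat=0.05, k_m=1e-3)
ratio = specificity_ratio(unnatural, natural)
print(f"kcat/Km ratio (unnatural : natural) = {ratio:.1f}")
```

A ratio well above 1 corresponds to the "higher kcat/Km" case listed above; the other listed properties can be compared in the same way, one parameter at a time.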

Optionally, the O-RS can be provided to the eukaryotic cell by a polypeptide that includes an O-RS and/or by a polynucleotide that encodes an O-RS or a portion thereof. For example, an O-RS, or a portion thereof, is encoded by a polynucleotide sequence as set forth in any one of SEQ ID NO.: 3-35 (e.g., 3-19, 20-35, or any other subset of sequences 3-35), or a complementary polynucleotide sequence thereof. In another example, an O-RS comprises an amino acid sequence as set forth in any one of SEQ ID NO.: 36-63 (e.g., 36-47, 48-63, or any other subset of 36-63) and/or 86, or a conservative variation thereof. See, e.g., Tables 5, 6, and 8 and Example 6 herein for sequences of exemplary O-RS molecules.

An O-RS can also comprise an amino acid sequence that is, e.g., at least 90%, at least 95%, at least 98%, at least 99%, or even at least 99.5% identical to that of a naturally occurring tyrosyl aminoacyl-tRNA synthetase (TyrRS) (e.g., as set forth in SEQ ID NO.: 2) and that comprises two or more amino acids from groups A-E. Group A includes valine, isoleucine, leucine, glycine, serine, alanine, or threonine at a position corresponding to Tyr37 of E. coli TyrRS; group B includes aspartate at a position corresponding to Asn126 of E. coli TyrRS; group C includes threonine, serine, arginine, asparagine, or glycine at a position corresponding to Asp182 of E. coli TyrRS; group D includes methionine, alanine, valine, or tyrosine at a position corresponding to Phe183 of E. coli TyrRS; and group E includes serine, methionine, valine, cysteine, threonine, or alanine at a position corresponding to Leu186 of E. coli TyrRS. Any subset of combinations of these groups is a feature of the invention. For example, in one embodiment, the O-RS has two or more amino acids selected from valine, isoleucine, leucine, or threonine at a position corresponding to Tyr37 of E. coli TyrRS; threonine, serine, arginine, or glycine at a position corresponding to Asp182 of E. coli TyrRS; methionine or tyrosine at a position corresponding to Phe183 of E. coli TyrRS; and serine or alanine at a position corresponding to Leu186 of E. coli TyrRS. In another embodiment, the O-RS includes two or more amino acids selected from glycine, serine, or alanine at a position corresponding to Tyr37 of E. coli TyrRS; aspartate at a position corresponding to Asn126 of E. coli TyrRS; asparagine at a position corresponding to Asp182 of E. coli TyrRS; alanine or valine at a position corresponding to Phe183 of E. coli TyrRS; and/or methionine, valine, cysteine, or threonine at a position corresponding to Leu186 of E. coli TyrRS. See also, e.g., Tables 4, 6, and 8 herein.
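The group A-E definition above amounts to a lookup on five positions. The following is a minimal sketch of that check, assuming the residues of a candidate synthetase have already been mapped onto E. coli TyrRS numbering; the function names and the example variant are hypothetical, and the sketch does not test the separate sequence-identity requirement.

```python
from typing import Dict, List

# Substitutions listed for each group, keyed by group name:
# (E. coli TyrRS position, wild-type residue, allowed one-letter codes).
GROUPS = {
    "A": (37, "Y", set("VILGSAT")),
    "B": (126, "N", set("D")),
    "C": (182, "D", set("TSRNG")),
    "D": (183, "F", set("MAVY")),
    "E": (186, "L", set("SMVCTA")),
}


def matching_groups(residues: Dict[int, str]) -> List[str]:
    """Return the groups whose position carries one of the listed substitutions.

    `residues` maps an E. coli TyrRS position to a one-letter amino acid code.
    """
    return [
        name
        for name, (position, _wild_type, allowed) in GROUPS.items()
        if residues.get(position) in allowed
    ]


def has_two_or_more_groups(residues: Dict[int, str]) -> bool:
    """True if the candidate carries substitutions from at least two groups."""
    return len(matching_groups(residues)) >= 2


wild_type = {37: "Y", 126: "N", 182: "D", 183: "F", 186: "L"}
# Hypothetical variant, used only to exercise the check.
variant = {37: "V", 126: "N", 182: "S", 183: "M", 186: "A"}

print(matching_groups(wild_type))
print(matching_groups(variant), has_two_or_more_groups(variant))
```

The narrower embodiments described above could be checked the same way by substituting smaller residue sets for each position.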

Besides the O-RS, a eukaryotic cell of the invention can include additional components, e.g., an unnatural amino acid. The eukaryotic cell can also include an orthogonal tRNA (O-tRNA) (e.g., derived from a non-eukaryotic organism, such as Escherichia coli, Bacillus stearothermophilus, and/or the like), where the O-tRNA recognizes a selector codon and is preferentially aminoacylated with the unnatural amino acid by the O-RS. A nucleic acid that comprises a polynucleotide encoding a polypeptide of interest, where the polynucleotide comprises a selector codon that is recognized by the O-tRNA, or a combination of one or more of these, can also be present in the cell.

In one aspect, the O-tRNA mediates the incorporation of the unnatural amino acid into a protein with, e.g., at least 45%, at least 50%, at least 60%, at least 75%, at least 80%, at least 90%, at least 95%, or 99% of the efficiency of a tRNA that comprises, or is processed from, the polynucleotide sequence set forth in SEQ ID NO.: 65. In another aspect, the O-tRNA comprises SEQ ID NO.: 65, and the O-RS comprises a polypeptide sequence set forth in any one of SEQ ID NO.: 36-63 (e.g., 36-47, 48-63, or any other subset of 36-63) and/or 86, and/or a conservative variation thereof. See also, e.g., Table 5 and Example 6 herein for sequences of exemplary O-RS and O-tRNA molecules.

In one example, a eukaryotic cell comprises an orthogonal aminoacyl-tRNA synthetase (O-RS), an orthogonal tRNA (O-tRNA), an unnatural amino acid, and a nucleic acid that comprises a polynucleotide encoding a polypeptide of interest, where the polynucleotide comprises a selector codon that is recognized by the O-tRNA. The O-RS preferentially aminoacylates the orthogonal tRNA (O-tRNA) with the unnatural amino acid in the eukaryotic cell, and in the absence of the unnatural amino acid the cell produces the polypeptide of interest with a yield that is, e.g., less than 30%, less than 20%, less than 15%, less than 10%, less than 5%, less than 2.5%, etc., of the yield of the polypeptide in the presence of the unnatural amino acid.
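The yield criterion above is a simple ratio of two measurements. The following is a minimal sketch, assuming both yields are measured in the same units; the function names and the example yields are hypothetical.

```python
from typing import Optional

# Thresholds listed in the text, expressed as fractions of the yield obtained
# in the presence of the unnatural amino acid.
THRESHOLDS = (0.30, 0.20, 0.15, 0.10, 0.05, 0.025)


def background_fraction(yield_without: float, yield_with: float) -> float:
    """Yield without the unnatural amino acid as a fraction of the yield with it."""
    if yield_with <= 0:
        raise ValueError("yield in the presence of the unnatural amino acid must be positive")
    return yield_without / yield_with


def tightest_threshold_met(fraction: float) -> Optional[float]:
    """Smallest listed threshold that the fraction falls strictly below, or None."""
    met = [t for t in THRESHOLDS if fraction < t]
    return min(met) if met else None


# Hypothetical yields (e.g., mg of polypeptide per litre of culture).
fraction = background_fraction(yield_without=0.4, yield_with=10.0)
print(fraction, tightest_threshold_met(fraction))
```

A low fraction indicates that little full-length polypeptide is made without the unnatural amino acid, which is consistent with the selector codon being suppressed mainly by the unnatural amino acid.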

Methods of producing an O-RS, which are a feature of the invention, optionally include generating a pool of mutant synthetases from the framework of a wild-type synthetase, and then selecting mutated RSs based on their specificity for an unnatural amino acid relative to the twenty common amino acids. To isolate such a synthetase, the selection methods are: (i) sensitive, because the activity of the desired synthetases from the initial rounds can be low and the population small; (ii) "tunable", because it is desirable to vary the selection stringency at different selection rounds; and (iii) general, so that the methods can be used for different unnatural amino acids.

Methods of producing an orthogonal aminoacyl-tRNA synthetase (O-RS) that preferentially aminoacylates an orthogonal tRNA with an unnatural amino acid in a eukaryotic cell typically include applying a combination of a positive selection followed by a negative selection. In the positive selection, suppression of a selector codon introduced at a nonessential position of a positive marker allows the eukaryotic cells to survive under positive selection pressure. In the presence of the unnatural amino acid, survivors thus encode active synthetases that charge the orthogonal suppressor tRNA with the unnatural amino acid. In the negative selection, suppression of a selector codon introduced at a nonessential position of a negative marker removes synthetases with natural amino acid specificities. Survivors of the positive and negative selections encode synthetases that aminoacylate (charge) the orthogonal suppressor tRNA with the unnatural amino acid only (or at least preferentially).

For example, the method includes: (a) subjecting to positive selection, in the presence of an unnatural amino acid, a population of eukaryotic cells of a first species, where the eukaryotic cells each comprise: i) a member of a library of aminoacyl-tRNA synthetases (RSs), ii) an orthogonal tRNA (O-tRNA), iii) a polynucleotide that encodes a positive selection marker, and iv) a polynucleotide that encodes a negative selection marker; where cells that survive the positive selection comprise an active RS that aminoacylates the orthogonal tRNA (O-tRNA) in the presence of the unnatural amino acid; and (b) subjecting the cells that survive the positive selection to negative selection in the absence of the unnatural amino acid, to eliminate active RSs that aminoacylate the O-tRNA with a natural amino acid, thereby providing an O-RS that preferentially aminoacylates the O-tRNA with the unnatural amino acid.
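The logic of steps (a) and (b) can be sketched as two successive filters over a library. The following is an idealized illustration, assuming each synthetase either does or does not charge the O-tRNA with a given class of amino acid; real selections are graded rather than binary, and all names and library members here are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Synthetase:
    name: str
    charges_unnatural: bool  # aminoacylates the O-tRNA with the unnatural amino acid
    charges_natural: bool    # aminoacylates the O-tRNA with a natural amino acid


def suppresses(rs: Synthetase, unnatural_present: bool) -> bool:
    """The selector codon is suppressed if the O-tRNA is charged with any available amino acid."""
    return rs.charges_natural or (unnatural_present and rs.charges_unnatural)


def positive_selection(library: List[Synthetase]) -> List[Synthetase]:
    """Step (a): unnatural amino acid present; cells survive only if the positive marker is made."""
    return [rs for rs in library if suppresses(rs, unnatural_present=True)]


def negative_selection(survivors: List[Synthetase]) -> List[Synthetase]:
    """Step (b): unnatural amino acid absent; cells that make the negative marker are eliminated."""
    return [rs for rs in survivors if not suppresses(rs, unnatural_present=False)]


library = [
    Synthetase("inactive", charges_unnatural=False, charges_natural=False),
    Synthetase("natural_only", charges_unnatural=False, charges_natural=True),
    Synthetase("promiscuous", charges_unnatural=True, charges_natural=True),
    Synthetase("desired", charges_unnatural=True, charges_natural=False),
]

after_positive = positive_selection(library)
after_negative = negative_selection(after_positive)
print([rs.name for rs in after_positive])
print([rs.name for rs in after_negative])
```

In this idealized model the positive step removes inactive synthetases, and the negative step removes every synthetase that still charges the O-tRNA when only natural amino acids are available.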

The positive selection marker can be any of a variety of molecules. In one embodiment, the positive selection marker is a product that provides a nutritional supplement for growth, and the selection is performed on a medium that lacks the nutritional supplement. Examples of polynucleotides that encode positive selection markers include, but are not limited to, e.g., a reporter gene based on complementing an amino acid auxotrophy of a cell, a his3 gene (e.g., where the his3 gene encodes an imidazole glycerol phosphate dehydratase, detected by providing 3-aminotriazole (3-AT)), a ura3 gene, a leu2 gene, a lys2 gene, a lacZ gene, an adh gene, etc. See, e.g., G.M. Kishore and D.M. Shah, (1988), Amino acid biosynthesis inhibitors as herbicides, Annual Review of Biochemistry 57:627-663. In one embodiment, lacZ production is detected by hydrolysis of ortho-nitrophenyl-β-D-galactopyranoside (ONPG). See, e.g., I.G. Serebriiskii and E.A. Golemis, (2000), Uses of lacZ to study gene function: evaluation of beta-galactosidase assays employed in the yeast two-hybrid system, Analytical Biochemistry 285:1-15. Additional positive selection markers include, e.g., luciferase, green fluorescent protein (GFP), YFP, EGFP, RFP, the product of an antibiotic resistance gene (e.g., chloramphenicol acetyltransferase (CAT)), a transcriptional modulator protein (e.g., GAL4), etc. Optionally, a polynucleotide that encodes a positive selection marker comprises a selector codon.

A polynucleotide that encodes the positive selection marker can be operably linked to a response element. An additional polynucleotide that encodes a transcriptional modulator protein that modulates transcription from the response element, and that comprises at least one selector codon, can also be present. Incorporation of the unnatural amino acid into the transcriptional modulator protein by the O-tRNA aminoacylated with the unnatural amino acid results in transcription of the polynucleotide (e.g., a reporter gene) encoding the positive selection marker. See, e.g., Figure 1A. Optionally, the selector codon is located in, or substantially near, a portion of the polynucleotide that encodes a DNA binding domain of the transcriptional modulator protein.

A polynucleotide that encodes the negative selection marker can also be operably linked to a response element from which transcription is mediated by the transcriptional modulator protein. See, e.g., A.J. DeMaggio et al., (2000), The yeast split-hybrid system, Methods Enzymol. 328:128-137; H.M. Shih et al., (1996), A positive genetic selection for disrupting protein-protein interactions: identification of CREB mutations that prevent association with the coactivator CBP, Proc. Natl. Acad. Sci. U.S.A. 93:13896-13901; M. Vidal et al., (1996), Genetic characterization of a mammalian protein-protein interaction domain by using a yeast reverse two-hybrid system [comment], Proc. Natl. Acad. Sci. U.S.A. 93:10321-10326; and M. Vidal et al., (1996), Reverse two-hybrid and one-hybrid systems to detect dissociation of protein-protein and DNA-protein interactions [comment], Proc. Natl. Acad. Sci. U.S.A. 93:10315-10320. Incorporation of a natural amino acid into the transcriptional modulator protein by the O-tRNA aminoacylated with a natural amino acid results in transcription of the negative selection marker. Optionally, the negative selection marker comprises a selector codon. In one embodiment, the positive selection marker and/or negative selection marker of the invention can comprise at least two selector codons, which each or both can comprise at least two different selector codons or at least two of the same selector codon.

A transcriptional modulator protein is a molecule that binds (directly or indirectly) to a nucleic acid sequence (e.g., a response element) and modulates transcription of a sequence that is operably linked to the response element. A transcriptional modulator protein can be a transcriptional activator protein (e.g., GAL4, nuclear hormone receptors, AP1, CREB, LEF/tcf family members, SMADs, VP16, SP1, etc.), a transcriptional repressor protein (e.g., nuclear hormone receptors, the Groucho/tle family, the Engrailed family, etc.), or a protein that can have both activities depending on the environment (e.g., LEF/tcf, homeobox proteins, etc.). A response element is typically a nucleic acid sequence that is recognized by the transcriptional modulator protein or by an additional agent that acts in concert with the transcriptional modulator protein.

One example of a transcriptional modulator protein is the transcriptional activator protein GAL4 (see, e.g., Figure 1A). See, e.g., A. Laughon et al., (1984), Identification of two proteins encoded by the Saccharomyces cerevisiae GAL4 gene, Molecular & Cellular Biology 4:268-275; A. Laughon and R.F. Gesteland, (1984), Primary structure of the Saccharomyces cerevisiae GAL4 gene, Molecular & Cellular Biology 4:260-267; L. Keegan et al., (1986), Separation of DNA binding from the transcription-activating function of a eukaryotic regulatory protein, Science 231:699-704; and M. Ptashne, (1988), How eukaryotic transcriptional activators work, Nature 335:683-689. The N-terminal 147 amino acids of this 881 amino acid protein form a DNA binding domain (DBD) that binds DNA sequences specifically. See, e.g., M. Carey et al., (1989), An amino-terminal fragment of GAL4 binds DNA as a dimer, J. Mol. Biol. 209:423-432; and E. Giniger et al., (1985), Specific DNA binding of GAL4, a positive regulatory protein of yeast, Cell 40:767-774. The DBD is linked, by an intervening protein sequence, to a C-terminal 113 amino acid activation domain (AD) that can activate transcription when bound to DNA. See, e.g., J. Ma and M. Ptashne, (1987), Deletion analysis of GAL4 defines two transcriptional activating segments, Cell 48:847-853; and J. Ma and M. Ptashne, (1987), The carboxy-terminal 30 amino acids of GAL4 are recognized by GAL80, Cell 50:137-142. By placing amber codons in, e.g., the N-terminal DBD of a single polypeptide that contains both the N-terminal DBD of GAL4 and its C-terminal AD, amber suppression by the O-tRNA/O-RS pair can be linked to transcriptional activation by GAL4 (Figure 1, Panel A). GAL4-activated reporter genes can be used to perform both positive and negative selections with the gene (Figure 1, Panel B).

The medium used for negative selection can comprise a selecting or screening agent that is converted to a detectable substance by the negative selection marker. In one aspect of the invention, the detectable substance is a toxic substance. A polynucleotide that encodes a negative selection marker can be, e.g., a ura3 gene. For example, the URA3 reporter can be placed under the control of a promoter that contains GAL4 DNA binding sites. When the negative selection marker is produced, e.g., by translation of a polynucleotide encoding GAL4 with selector codons, GAL4 activates transcription of URA3. The negative selection is accomplished on a medium that comprises 5-fluoroorotic acid (5-FOA), which the gene product of the ura3 gene converts into a detectable substance (e.g., a toxic substance that kills the cell). See, e.g., J.D. Boeke et al., (1984), A positive selection for mutants lacking orotidine-5'-phosphate decarboxylase activity in yeast: 5-fluoroorotic acid resistance, Molecular & General Genetics 197:345-346; M. Vidal et al., (1996), Genetic characterization of a mammalian protein-protein interaction domain by using a yeast reverse two-hybrid system [comment], Proc. Natl. Acad. Sci. U.S.A. 93:10321-10326; and M. Vidal et al., (1996), Reverse two-hybrid and one-hybrid systems to detect dissociation of protein-protein and DNA-protein interactions [comment], Proc. Natl. Acad. Sci. U.S.A. 93:10315-10320. See also Figure 8C.

As with the positive selection marker, the negative selection marker can be any of a variety of molecules. In one embodiment, the positive selection marker and/or the negative selection marker is a polypeptide that fluoresces or catalyzes a luminescent reaction in the presence of a suitable reactant. For example, negative selection markers include, but are not limited to, e.g., luciferase, green fluorescent protein (GFP), YFP, EGFP, RFP, the product of an antibiotic resistance gene (e.g., chloramphenicol acetyltransferase (CAT)), the product of a lacZ gene, a transcriptional modulator protein, etc. In one aspect of the invention, the positive selection marker and/or the negative selection marker is detected by fluorescence-activated cell sorting (FACS) or by luminescence. In another example, the positive selection marker and/or negative selection marker comprises an affinity-based screening marker. The same polynucleotide can encode both the positive selection marker and the negative selection marker.

Additional levels of selection/screening stringency can also be used in the methods of the invention. The selection or screening stringency can be varied in one or both steps of the method of producing an O-RS. This can include, e.g., varying the amount of response elements in a polynucleotide that encodes the positive and/or negative selection marker, adding a varying amount of an inactive synthetase to one or both of the steps, varying the amount of the selection/screening agent that is used, etc. Additional rounds of positive and/or negative selection can also be performed.

Selecting or screening can also comprise one or more positive or negative selections or screens that include, e.g., a change in amino acid permeability, a change in translation efficiency, a change in translational fidelity, etc. Typically, the one or more changes are based on a mutation in one or more polynucleotides that comprise or encode components of the orthogonal tRNA/tRNA synthetase pair used to produce the protein.

Model enrichment studies can be used to rapidly select an active synthetase from an excess of inactive synthetases. Positive and/or negative model selection studies can be done. For example, eukaryotic cells that comprise potentially active aminoacyl-tRNA synthetases are mixed with a varying fold excess of inactive aminoacyl-tRNA synthetases. A ratio comparison is made between cells grown in a nonselective medium and assayed by, e.g., an X-GAL overlay, and cells grown and able to survive in a selective medium (e.g., in the absence of histidine and/or uracil) and assayed by, e.g., an X-GAL assay. For a negative model selection, potentially active aminoacyl-tRNA synthetases are mixed with a varying fold excess of inactive aminoacyl-tRNA synthetases, and selection is performed with a negative selection substance, e.g., 5-FOA.
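The ratio comparison in a model enrichment study reduces to comparing the fraction of active clones before and after selection. The following is a minimal sketch, assuming colonies have been scored as active or inactive (for example, by an X-GAL readout); the function names and colony counts are hypothetical.

```python
def active_fraction(active_colonies: int, total_colonies: int) -> float:
    """Fraction of scored colonies that carry the active synthetase."""
    if total_colonies <= 0:
        raise ValueError("total_colonies must be positive")
    return active_colonies / total_colonies


def enrichment_factor(
    active_before: int, total_before: int, active_after: int, total_after: int
) -> float:
    """Fold change in the active fraction from nonselective to selective medium."""
    before = active_fraction(active_before, total_before)
    after = active_fraction(active_after, total_after)
    if before == 0:
        raise ValueError("no active colonies were scored before selection")
    return after / before


# Hypothetical counts: active cells mixed with a 1000-fold excess of inactive cells,
# then scored on nonselective and selective media.
factor = enrichment_factor(
    active_before=10, total_before=10_010, active_after=90, total_after=100
)
print(f"enrichment: {factor:.1f}-fold")
```

Repeating the calculation across several starting ratios shows how the enrichment depends on the fold excess of inactive synthetase used in the mixture.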

Typically, the library of RSs (e.g., a library of mutant RSs) comprises RSs derived from at least one aminoacyl-tRNA synthetase (RS), e.g., from a non-eukaryotic organism. In one embodiment, the library of RSs is derived from an inactive RS, e.g., where the inactive RS is generated by mutating an active RS at the active site of the synthetase, at the editing mechanism site of the synthetase, at different sites by combining different domains of synthetases, or the like. For example, residues in the active site of the RS are mutated to, e.g., alanine residues. The polynucleotide that encodes the alanine-mutated RS is used as a template to mutagenize the alanine residues to all 20 amino acids. The library of mutant RSs is selected/screened to produce the O-RS. In another embodiment, the inactive RS comprises an amino acid binding pocket, and one or more amino acids that make up the binding pocket are substituted with one or more different amino acids. In one example, the substituted amino acids are substituted with alanines. Optionally, the polynucleotide that encodes the alanine-mutated RS is used as a template to mutagenize the alanine residues to all 20 amino acids, and the library is screened/selected.
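Mutagenizing each alanine position to all 20 amino acids gives a library whose protein-level diversity grows as 20 to the power of the number of positions. The following is a minimal sketch of that count, assuming full randomization at every position; the choice of five positions mirrors the five TyrRS positions named earlier in this section and is an assumption for illustration, as are the function names.

```python
from itertools import product
from typing import Iterator

# One-letter codes of the 20 common amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(n_positions: int) -> int:
    """Number of distinct protein variants when n positions are fully randomized."""
    return len(AMINO_ACIDS) ** n_positions


def enumerate_variants(n_positions: int) -> Iterator[str]:
    """Yield every residue combination as a string, one letter per randomized position."""
    for combination in product(AMINO_ACIDS, repeat=n_positions):
        yield "".join(combination)


print(library_size(5))
print(sum(1 for _ in enumerate_variants(2)))
```

The count is the theoretical protein diversity only; the number of transformants needed to cover it depends on the codon scheme and the sampling depth, which this sketch does not model.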

Methods for producing an O-RS can also include producing the library of RSs by using various mutagenesis techniques known in the art. For example, the mutant RSs can be generated by site-specific mutations, random point mutations, homologous recombination, DNA shuffling or other recursive mutagenesis methods, chimeric construction, or any combination thereof. For example, a library of mutant RSs can be produced from two or more other, e.g., smaller, less diverse "sub-libraries." Once the synthetases have been subjected to the positive and negative selection/screening strategy, they can be subjected to further mutagenesis. For example, a nucleic acid that encodes the O-RS can be isolated; a set of polynucleotides that encode mutated O-RSs can be generated from the nucleic acid (e.g., by random mutagenesis, site-specific mutagenesis, recombination, or any combination thereof); and these individual steps, or a combination of them, can be repeated until a mutated O-RS is obtained that preferentially aminoacylates the O-tRNA with the unnatural amino acid. In one aspect of the invention, the steps are performed at least twice.

Additional details for producing O-RS can be found in WO 2002/086075, entitled "Methods and compositions for the production of orthogonal tRNA-aminoacyl tRNA synthetase pairs." See also Hamano-Takaku et al. (2000), A mutant Escherichia coli Tyrosyl-tRNA Synthetase Utilizes the Unnatural Amino Acid Azatyrosine More Efficiently than Tyrosine, Journal of Biological Chemistry 275(51):40324-40328; Kiga et al. (2002), An engineered Escherichia coli tyrosyl-tRNA synthetase for site-specific incorporation of an unnatural amino acid into proteins in eukaryotic translation and its application in a wheat germ cell-free system, PNAS 99(15):9715-9723; and Francklyn et al. (2002), Aminoacyl-tRNA synthetases: Versatile players in the changing theater of translation, RNA 8:1363-1372.

Orthogonal tRNAs

The invention provides eukaryotic cells that include an orthogonal tRNA (O-tRNA). The orthogonal tRNA mediates incorporation of an unnatural amino acid, in vivo, into a protein that is encoded by a polynucleotide containing a selector codon recognized by the O-tRNA. In certain embodiments, an O-tRNA of the invention mediates the incorporation of an unnatural amino acid into a protein with an efficiency that is, e.g., at least 40%, at least 45%, at least 50%, at least 60%, at least 75%, at least 80%, or even 90% or more of the efficiency of a tRNA that comprises, or is processed in a cell from, the polynucleotide sequence set forth in SEQ ID NO.: 65. See Table 5 herein.

An example of an O-tRNA of the invention is SEQ ID NO.: 65 (see Example 6 and Table 5 herein). SEQ ID NO.: 65 is a pre-splicing/processing transcript that is optionally processed in the cell, e.g., by the cell's endogenous splicing and processing machinery, and modified to form an active O-tRNA. Typically, a population of such pre-splicing transcripts forms a population of active tRNAs in the cell (the active tRNAs can be in one or more active forms). The invention also includes conservative variations of the O-tRNA and of its cellular processing products. For example, conservative variations of the O-tRNA include those molecules that function like the O-tRNA of SEQ ID NO.: 65 and maintain the tRNA L-shaped structure, e.g., in processed form, but that do not have the same sequence (and are other than wild-type tRNA molecules). Typically, an O-tRNA of the invention is a recyclable O-tRNA, because the O-tRNA can be reaminoacylated in vivo to again mediate the incorporation of the unnatural amino acid into a protein encoded by a polynucleotide, in response to a selector codon.

The transcription of tRNA in eukaryotes, but not in prokaryotes, is carried out by RNA polymerase III, which places restrictions on the primary sequence of the tRNA structural genes that can be transcribed in eukaryotic cells. In addition, in eukaryotic cells, tRNAs need to be exported from the nucleus, where they are transcribed, to the cytoplasm in order to function in translation. Nucleic acids that encode an O-tRNA of the invention, or a complementary polynucleotide thereof, are also a feature of the invention. In one aspect of the invention, a nucleic acid that encodes an O-tRNA of the invention includes internal promoter sequences, e.g., an A box (e.g., TRGCNNAGY) and a B box (e.g., GGTTCGANTCC, SEQ ID NO: 88). The O-tRNA of the invention can also be post-transcriptionally modified. For example, post-transcriptional modification of tRNA genes in eukaryotes includes removal of the 5'- and 3'-flanking sequences by RNase P and a 3'-endonuclease, respectively. The addition of a 3'-CCA sequence is also a post-transcriptional modification of tRNA genes in eukaryotes.
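The A box and B box consensus sequences quoted above use IUPAC ambiguity codes (R = A or G, Y = C or T, N = any base). A minimal sketch of how a candidate tRNA gene could be scanned for these motifs follows; the helper names and the test sequence are hypothetical, and the sketch checks only for exact consensus matches, not for correct spacing or promoter strength.

```python
import re

# IUPAC codes needed for the two consensus sequences quoted in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

A_BOX = "TRGCNNAGY"
B_BOX = "GGTTCGANTCC"


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a regular-expression pattern."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_motif(sequence, consensus):
    """Return 0-based start positions of every (possibly overlapping) consensus match."""
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical DNA fragment containing one A box and one B box.
    gene = "AAATGGCTTAGCAAAGGTTCGAATCCAAA"
    print(find_motif(gene, A_BOX), find_motif(gene, B_BOX))
```

A sequence that lacks either motif would return an empty list for it, flagging a gene that may need flanking promoter elements or sequence changes before it can be transcribed by RNA polymerase III.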

In one embodiment, an O-tRNA is obtained by subjecting to negative selection a population of eukaryotic cells of a first species, where the eukaryotic cells each comprise a member of a library of tRNAs. The negative selection eliminates cells that comprise a member of the tRNA library that is aminoacylated by an aminoacyl-tRNA synthetase (RS) endogenous to the eukaryotic cells. This provides a pool of tRNAs that are orthogonal to the eukaryotic cell of the first species.

Alternatively, or in combination with the other methods described above for incorporating an unnatural amino acid into a polypeptide, a trans-translation system can be used. This system involves a molecule called tmRNA that is present in Escherichia coli. This RNA molecule is structurally related to an alanyl tRNA and is aminoacylated by the alanyl synthetase. The difference between tmRNA and tRNA is that the anticodon loop is replaced with a special large sequence. This sequence allows the ribosome to resume translation on sequences that have stalled, using an open reading frame encoded within the tmRNA as a template. In the invention, an orthogonal tmRNA can be generated that is preferentially aminoacylated by an orthogonal synthetase and loaded with an unnatural amino acid. When a gene is transcribed using this system, the ribosome stalls at a specific site; the unnatural amino acid is introduced at that site, and translation then resumes using the sequence encoded within the orthogonal tmRNA.

Additional methods for producing recombinant orthogonal tRNAs can be found, e.g., in International Patent Application WO 2002/086075, entitled "Methods and compositions for the production of orthogonal tRNA-aminoacyl tRNA synthetase pairs." See also Forster et al. (2003), Programming peptidomimetic syntheses by translating genetic codes designed de novo, PNAS 100(11):6353-6357; and Feng et al. (2003), Expanding tRNA recognition of a tRNA synthetase by a single amino acid change, PNAS 100(10):5676-5681.

Orthogonal tRNA and Orthogonal Aminoacyl-tRNA Synthetase Pairs

An orthogonal pair is composed of an O-tRNA, e.g., a suppressor tRNA, a frameshift tRNA, or the like, and an O-RS. The O-tRNA is not acylated by endogenous synthetases and is capable of mediating incorporation of an unnatural amino acid, in vivo, into a protein that is encoded by a polynucleotide containing a selector codon recognized by the O-tRNA. The O-RS recognizes the O-tRNA and preferentially aminoacylates the O-tRNA with an unnatural amino acid in a eukaryotic cell. The invention also includes methods for producing orthogonal pairs, the orthogonal pairs produced by such methods, and compositions of orthogonal pairs for use in eukaryotic cells. The development of multiple orthogonal tRNA/synthetase pairs can allow the simultaneous incorporation of multiple unnatural amino acids, using different codons, in a eukaryotic cell.

An orthogonal O-tRNA/O-RS pair in a eukaryotic cell can be produced by importing a pair, e.g., a nonsense suppressor pair, from a different organism with inefficient cross-species aminoacylation. The O-tRNA and O-RS are efficiently expressed and processed in the eukaryotic cell, and the O-tRNA is efficiently exported from the nucleus to the cytoplasm. For example, one such pair is the tyrosyl-tRNA synthetase/tRNACUA pair from E. coli (see, e.g., H. M. Goodman et al. (1968), Nature 217:1019-24; and D. G. Barker et al. (1982), FEBS Letters 150:419-23). E. coli tyrosyl-tRNA synthetase efficiently aminoacylates its cognate E. coli tRNACUA when both are expressed in the cytoplasm of S. cerevisiae, but does not aminoacylate S. cerevisiae tRNAs. See, e.g., H. Edwards and P. Schimmel (1990), Molecular & Cellular Biology 10:1633-41; and H. Edwards et al. (1991), PNAS United States of America 88:1153-6. In addition, E. coli tyrosyl tRNACUA is a poor substrate for S. cerevisiae aminoacyl-tRNA synthetases (see, e.g., V. Trezeguet et al. (1991), Molecular & Cellular Biology 11:2744-51), but functions efficiently in protein translation in S. cerevisiae. See, e.g., H. Edwards and P. Schimmel (1990), Molecular & Cellular Biology 10:1633-41; H. Edwards et al. (1991), PNAS United States of America 88:1153-6; and V. Trezeguet et al. (1991), Molecular & Cellular Biology 11:2744-51. Moreover, E. coli TyrRS does not have an editing mechanism to proofread an unnatural amino acid ligated to the tRNA.

The O-tRNA and O-RS can be naturally occurring or can be derived by mutation of a naturally occurring tRNA and/or RS from a variety of organisms, which generates libraries of tRNAs and/or libraries of RSs. See the section entitled "Source and Host Organisms" herein. In various embodiments, the O-tRNA and O-RS are derived from at least one organism. In another embodiment, the O-tRNA is derived from a naturally occurring, or mutated naturally occurring, tRNA from a first organism, and the O-RS is derived from a naturally occurring, or mutated naturally occurring, RS from a second organism. In one embodiment, the first and second non-eukaryotic organisms are the same. Alternatively, the first and second non-eukaryotic organisms can be different.

See the sections herein entitled "Orthogonal Aminoacyl-tRNA Synthetases" and "O-tRNA" for methods of producing O-RSs and O-tRNAs. See also International Patent Application WO 2002/086075, entitled "Methods and compositions for the production of orthogonal tRNA-aminoacyl tRNA synthetase pairs."

Fidelity, Efficiency, and Yield

Fidelity refers to the accuracy with which a desired molecule, e.g., an unnatural amino acid or an amino acid, is incorporated into a growing polypeptide at a desired position. The translational components of the invention incorporate unnatural amino acids into proteins with high fidelity in response to a selector codon. For example, using the components of the invention, the efficiency of incorporation of a desired unnatural amino acid at a desired position in a growing polypeptide chain (e.g., in response to a selector codon) is, e.g., greater than 75%, greater than 85%, greater than 95%, or even greater than 99% or more, as compared with the unwanted incorporation of a specific natural amino acid at that position in the growing polypeptide chain.

Efficiency can also refer to the degree to which the O-RS aminoacylates the O-tRNA with the unnatural amino acid, as compared with a relevant control. O-RSs of the invention can be defined by their efficiency. In certain embodiments of the invention, one O-RS is compared with another O-RS. For example, an O-RS of the invention aminoacylates an O-tRNA with an unnatural amino acid with an efficiency that is, e.g., at least 40%, at least 50%, at least 60%, at least 75%, at least 80%, at least 90%, at least 95%, or even 99% or more of the efficiency with which an O-RS having the amino acid sequence set forth in SEQ ID NO.: 86 or 45 (or another specific RS in Table 5) aminoacylates an O-tRNA. In another embodiment, an O-RS of the invention aminoacylates the O-tRNA with the unnatural amino acid at least 10-fold, at least 20-fold, at least 30-fold, etc., more efficiently than the O-RS aminoacylates the O-tRNA with a natural amino acid.

Using the translational components of the invention, the yield of the polypeptide of interest that contains the unnatural amino acid is, e.g., at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, 50% or more of the yield of the naturally occurring polypeptide of interest obtained from a cell in which the polynucleotide lacks the selector codon. In another aspect, the yield with which the cell produces the polypeptide of interest in the absence of the unnatural amino acid is, e.g., less than 30%, less than 20%, less than 15%, less than 10%, less than 5%, less than 2.5%, etc., of the yield of the polypeptide in the presence of the unnatural amino acid.
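The two yield comparisons above are simple percentages of a reference yield. The sketch below illustrates the arithmetic; the function name `percent_of` and all yield values are hypothetical and are not measurements from the source.

```python
def percent_of(value, reference):
    """Express `value` as a percentage of `reference` (both in the same units)."""
    if reference <= 0:
        raise ValueError("reference yield must be positive")
    return 100.0 * value / reference


if __name__ == "__main__":
    # Hypothetical yields in mg/L.
    wild_type = 10.0       # polypeptide from a gene without a selector codon
    with_unnatural = 2.0   # selector-codon gene, unnatural amino acid present
    without_unnatural = 0.1  # selector-codon gene, unnatural amino acid absent

    # Yield relative to the wild-type polypeptide.
    print(percent_of(with_unnatural, wild_type))
    # Background yield in the absence of the unnatural amino acid.
    print(percent_of(without_unnatural, with_unnatural))
```

A high first percentage together with a low second percentage indicates both efficient suppression and low background incorporation of natural amino acids.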

Source and Host Organisms

The orthogonal translational components of the invention are typically derived from non-eukaryotic organisms for use in eukaryotic cells or translation systems. For example, the orthogonal O-tRNA can be derived from a non-eukaryotic organism, e.g., a eubacterium, such as Escherichia coli, Thermus thermophilus, Bacillus stearothermophilus, or the like, or an archaebacterium, such as Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Halobacterium such as Haloferax volcanii and Halobacterium species NRC-1, Archaeoglobus fulgidus, Pyrococcus furiosus, Pyrococcus horikoshii, Aeropyrum pernix, or the like. The orthogonal O-RS can likewise be derived from a non-eukaryotic organism, e.g., a eubacterium, such as Escherichia coli, Thermus thermophilus, Bacillus stearothermophilus, or the like, or an archaebacterium, such as Methanococcus jannaschii, Methanobacterium thermoautotrophicum, Halobacterium such as Haloferax volcanii and Halobacterium species NRC-1, Archaeoglobus fulgidus, Pyrococcus furiosus, Pyrococcus horikoshii, Aeropyrum pernix, or the like. Alternatively, eukaryotic sources can also be used, e.g., plants, algae, protists, fungi, yeasts, animals (e.g., mammals, insects, arthropods, etc.), or the like, e.g., where the components are orthogonal to a cell or translation system of interest, or where they are modified (e.g., mutated) to be orthogonal to the cell or translation system.

The individual components of an O-tRNA/O-RS pair can be derived from the same organism or from different organisms. In one embodiment, the O-tRNA/O-RS pair is from the same organism. For example, the O-tRNA/O-RS pair can be derived from the tyrosyl-tRNA synthetase/tRNACUA pair from E. coli. Alternatively, the O-tRNA and the O-RS of the O-tRNA/O-RS pair are optionally from different organisms.

The orthogonal O-tRNA, O-RS, or O-tRNA/O-RS pair can be selected or screened and/or used in a eukaryotic cell to produce a polypeptide with an unnatural amino acid. A eukaryotic cell can be from any of a variety of sources, e.g., a plant (e.g., a higher plant, such as a monocot or a dicot), an alga, a protist, a fungus, a yeast (e.g., Saccharomyces cerevisiae), an animal (e.g., a mammal, an insect, an arthropod, etc.), or the like. Compositions of eukaryotic cells with translational components of the invention are also a feature of the invention.

The invention also provides for efficient screening in one species for optional use in that species and/or in a second species (optionally, without additional selection/screening). For example, the components of the O-tRNA/O-RS pair are selected or screened in one species, e.g., an easily manipulated species (such as a yeast cell), and introduced into a second eukaryotic species, e.g., a plant (e.g., a higher plant, such as a monocot or a dicot), an alga, a protist, a fungus, a yeast, an animal (e.g., a mammal, an insect, an arthropod, etc.), or the like, for use in the in vivo incorporation of an unnatural amino acid in the second species.

For example, Saccharomyces cerevisiae (S. cerevisiae) can be chosen as the first eukaryotic species because it is unicellular, has a rapid generation time, and has relatively well-characterized genetics. See, e.g., D. Burke et al. (2000), Methods in Yeast Genetics, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. Moreover, because the translational machinery of eukaryotes is highly conserved (see, e.g., (1996) Translational Control, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY; Y. Kwok and J. T. Wong (1980), Evolutionary relationship between Halobacterium cutirubrum and eukaryotes determined by use of aminoacyl-tRNA synthetases as phylogenetic probes, Canadian Journal of Biochemistry 58:213-218; and (2001) The Ribosome, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY), aaRS genes for the incorporation of unnatural amino acids that are discovered in S. cerevisiae can be introduced into higher eukaryotic organisms and used, in partnership with cognate tRNAs, to incorporate unnatural amino acids (see, e.g., K. Sakamoto et al. (2002), Site-specific incorporation of an unnatural amino acid into proteins in mammalian cells, Nucleic Acids Res. 30:4692-4699; and C. Kohrer et al. (2001), Import of amber and ochre suppressor tRNAs into mammalian cells: a general approach to site-specific insertion of amino acid analogues into proteins, Proc. Natl. Acad. Sci. U.S.A. 98:14310-14315).

In one example, the method of producing an O-tRNA/O-RS pair in a first species as described herein further includes introducing a nucleic acid that encodes the O-tRNA and a nucleic acid that encodes the O-RS into a eukaryotic cell of a second species (e.g., a mammal, an insect, a fungus, an alga, a plant, or the like). In another example, a method of producing an orthogonal aminoacyl-tRNA synthetase (O-RS) that preferentially aminoacylates an orthogonal tRNA with an unnatural amino acid in a eukaryotic cell includes: (a) subjecting a population of eukaryotic cells of a first species (e.g., yeast) to positive selection in the presence of the unnatural amino acid, where each eukaryotic cell comprises: i) a member of a library of aminoacyl-tRNA synthetases (RSs), ii) an orthogonal tRNA (O-tRNA), iii) a polynucleotide that encodes a positive selection marker, and iv) a polynucleotide that encodes a negative selection marker, and where cells that survive the positive selection comprise an active RS that aminoacylates the O-tRNA in the presence of the unnatural amino acid; (b) subjecting the cells that survive the positive selection to negative selection in the absence of the unnatural amino acid, to eliminate active RSs that aminoacylate the O-tRNA with a natural amino acid, which provides an O-RS that preferentially aminoacylates the O-tRNA with the unnatural amino acid; and (c) introducing a nucleic acid that encodes the O-tRNA and a nucleic acid that encodes the O-RS (or the components of the O-tRNA and/or O-RS) into a eukaryotic cell of a second species (e.g., a mammal, an insect, a fungus, an alga, a plant, and/or the like). Typically, the O-tRNA is obtained by subjecting to negative selection a population of eukaryotic cells of the first species, where the eukaryotic cells each comprise a member of a library of tRNAs. The negative selection eliminates cells that comprise a member of the tRNA library that is aminoacylated by an aminoacyl-tRNA synthetase (RS) endogenous to the eukaryotic cells, which provides a pool of tRNAs that are orthogonal to the eukaryotic cells of the first species and the second species.
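The logic of the positive-then-negative selection described above can be sketched as a two-stage filter over a synthetase library. This is a deliberately idealized model: each variant is reduced to two hypothetical boolean properties, and all class and function names are illustrative assumptions rather than anything defined in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SynthetaseVariant:
    name: str
    charges_unnatural: bool  # aminoacylates the O-tRNA with the unnatural amino acid
    charges_natural: bool    # aminoacylates the O-tRNA with a natural amino acid


def survives_positive(variant):
    """Positive selection, unnatural amino acid present: any charging activity rescues the cell."""
    return variant.charges_unnatural or variant.charges_natural


def survives_negative(variant):
    """Negative selection, unnatural amino acid absent: charging with a natural amino acid kills the cell."""
    return not variant.charges_natural


def select_orthogonal(library):
    """Apply positive selection, then negative selection, and return the surviving variants."""
    after_positive = [v for v in library if survives_positive(v)]
    return [v for v in after_positive if survives_negative(v)]


if __name__ == "__main__":
    library = [
        SynthetaseVariant("inactive", False, False),
        SynthetaseVariant("promiscuous", True, True),
        SynthetaseVariant("natural_only", False, True),
        SynthetaseVariant("desired", True, False),
    ]
    print([v.name for v in select_orthogonal(library)])
```

Only variants that are active with the unnatural amino acid and inactive with natural amino acids pass both stages, which is the preferential aminoacylation the method is designed to isolate.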

Selector Codons

Selector codons of the invention expand the genetic codon framework of the protein biosynthetic machinery. For example, a selector codon includes, e.g., a unique three-base codon, a nonsense codon such as a stop codon, e.g., an amber codon (UAG) or an opal codon (UGA), an unnatural codon, a codon of at least four bases, a rare codon, or the like. A number of selector codons can be introduced into a desired gene, e.g., one or more, two or more, more than three, etc. A gene can include multiple copies of a given selector codon, or can include multiple different selector codons, or any combination thereof.

In one embodiment, the methods involve the use of a selector codon that is a stop codon for the incorporation of unnatural amino acids in vivo in a eukaryotic cell. For example, an O-tRNA is produced that recognizes the stop codon, e.g., UAG, and is aminoacylated by an O-RS with a desired unnatural amino acid. This O-tRNA is not recognized by the naturally occurring aminoacyl-tRNA synthetases of the host. Conventional site-directed mutagenesis can be used to introduce the stop codon, e.g., TAG, at the site of interest in a polypeptide of interest. See, e.g., Sayers, J. R., et al. (1988), 5',3' Exonuclease in phosphorothioate-based oligonucleotide-directed mutagenesis, Nucleic Acids Res. 791-802. When the O-RS, the O-tRNA, and the nucleic acid that encodes the polypeptide of interest are combined in vivo, the unnatural amino acid is incorporated in response to the UAG codon to give a polypeptide containing the unnatural amino acid at the specified position.
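At the sequence level, introducing an amber codon at a site of interest amounts to replacing one in-frame codon of the coding sequence with TAG. The sketch below shows only that in silico step; the function name and the example sequence are hypothetical, and the laboratory mutagenesis protocol itself is not modeled.

```python
def introduce_amber(cds, codon_number):
    """Return the coding sequence with the codon at `codon_number` (1-based) replaced by TAG."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(cds) // 3
    if not 1 <= codon_number <= n_codons:
        raise ValueError("codon_number is outside the coding sequence")
    start = 3 * (codon_number - 1)  # 0-based index of the first base of the codon
    return cds[:start] + "TAG" + cds[start + 3:]


if __name__ == "__main__":
    # Hypothetical four-codon gene: ATG GCT AAA TAA; replace codon 2 (GCT).
    print(introduce_amber("ATGGCTAAATAA", 2))
```

In the transcribed mRNA the TAG codon appears as UAG, the selector codon read by the amber suppressor O-tRNA.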

The incorporation of unnatural amino acids in vivo can be done without significant perturbation of the eukaryotic host cell. For example, because the suppression efficiency for the UAG codon depends upon the competition between the O-tRNA, e.g., the amber suppressor tRNA, and a eukaryotic release factor (e.g., eRF), which binds to a stop codon and initiates release of the growing peptide from the ribosome, the suppression efficiency can be modulated by, e.g., increasing the expression level of the O-tRNA, e.g., the suppressor tRNA.
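The competition between the suppressor tRNA and the release factor can be illustrated with a toy partition model, in which the fraction of ribosomes that read through the UAG codon is the tRNA's share of the total decoding rate. This model and its parameters (`k_trna`, `k_rf`, and the abundance values) are assumptions made for illustration only; the source states the qualitative dependence but gives no quantitative model.

```python
def suppression_fraction(trna_level, release_factor_level, k_trna=1.0, k_rf=1.0):
    """Toy estimate of the fraction of UAG encounters decoded by the suppressor tRNA.

    trna_level and release_factor_level are relative abundances; k_trna and k_rf
    are hypothetical relative rate constants for the two competing events.
    """
    if trna_level < 0 or release_factor_level < 0:
        raise ValueError("levels must be non-negative")
    trna_rate = k_trna * trna_level
    total_rate = trna_rate + k_rf * release_factor_level
    if total_rate == 0:
        raise ValueError("at least one competitor must be present")
    return trna_rate / total_rate


if __name__ == "__main__":
    # Raising suppressor tRNA expression at a fixed release factor level.
    for level in (0.5, 1.0, 2.0, 4.0):
        print(level, suppression_fraction(level, 1.0))
```

Under this simplified picture, raising the O-tRNA expression level increases read-through monotonically, with diminishing returns as the fraction approaches 1.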

Selector codons also comprise extended codons, e.g., codons of four or more bases, such as four-, five-, six-, or more-base codons. Examples of four-base codons include, e.g., AGGA, CUAG, UAGA, CCCU, and the like. Examples of five-base codons include, e.g., AGGAC, CCCCU, CCCUC, CUAGA, CUACU, UAGGC, and the like. A feature of the invention includes using extended codons based on frameshift suppression. Codons of four or more bases can insert, e.g., one or multiple unnatural amino acids into the same protein. For example, in the presence of mutated O-tRNAs, e.g., special frameshift suppressor tRNAs, with anticodon loops, e.g., anticodon loops of at least 8-10 nucleotides, the codon of four or more bases is read as a single amino acid. In other embodiments, the anticodon loops can decode, e.g., at least a four-base codon, at least a five-base codon, or at least a six-base codon or more. Because there are 256 possible four-base codons, multiple unnatural amino acids can be encoded in the same cell using codons of four or more bases. See Anderson et al. (2002), Exploring the Limits of Codon and Anticodon Size, Chemistry and Biology 9:237-244; and Magliery (2001), Expanding the Genetic Code: Selection of Efficient Suppressors of Four-base Codons and Identification of "Shifty" Four-base Codons with a Library Approach in Escherichia coli, J. Mol. Biol. 307:755-769.

For example, four-base codons have been used to incorporate unnatural amino acids into proteins using in vitro biosynthetic methods. See, e.g., Ma et al. (1993), Biochemistry 32:7939; and Hohsaka et al. (1999), J. Am. Chem. Soc. 121:34. CGGG and AGGU were used to simultaneously incorporate 2-naphthylalanine and an NBD derivative of lysine into streptavidin in vitro with two chemically acylated frameshift suppressor tRNAs. See, e.g., Hohsaka et al. (1999), J. Am. Chem. Soc. 121:12194. In an in vivo study, Moore et al. examined the ability of tRNALeu derivatives with NCUA anticodons to suppress UAGN codons (N can be U, A, G, or C), and found that the quadruplet UAGA can be decoded by a tRNALeu with a UCUA anticodon with an efficiency of 13 to 26%, with little decoding in the 0 or -1 frame. See Moore et al. (2000), J. Mol. Biol. 298:195. In one embodiment, extended codons based on rare codons or nonsense codons can be used in the invention, which can reduce missense readthrough and frameshift suppression at other unwanted sites.
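The count of possible four-base codons and the codon/anticodon pairings mentioned above (for example, UAGA read by a UCUA anticodon) can be checked with a short sketch. It assumes strict Watson-Crick pairing and ignores wobble and modified bases; the function names are illustrative.

```python
from itertools import product

RNA_BASES = "ACGU"
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def all_codons(length):
    """Enumerate every RNA codon of the given length over the four natural bases."""
    return ["".join(bases) for bases in product(RNA_BASES, repeat=length)]


def watson_crick_anticodon(codon):
    """Return the anticodon (written 5' to 3') that pairs with `codon` (written 5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(codon.upper()))


if __name__ == "__main__":
    print(len(all_codons(3)), len(all_codons(4)))
    for codon in ("UAG", "UAGA", "AGGA"):
        print(codon, watson_crick_anticodon(codon))
```

Because the anticodon is the reverse complement of the codon, a quadruplet codon requires a four-base anticodon, which is accommodated by the enlarged anticodon loop of a frameshift suppressor tRNA.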

For a given system, a selector codon can also include one of the natural three-base codons, where the endogenous system does not use (or rarely uses) that natural codon. For example, this includes a system that lacks a tRNA that recognizes the natural three-base codon, and/or a system in which the three-base codon is a rare codon.

Selector codons optionally include unnatural base pairs. These unnatural base pairs further expand the existing genetic alphabet. One extra base pair increases the number of triplet codons from 64 to 125. Properties of a third base pair include stable and selective base pairing, efficient enzymatic incorporation into DNA with high fidelity by a polymerase, and efficient continued primer extension after synthesis of the nascent unnatural base pair. Descriptions of unnatural base pairs that can be adapted for the methods and compositions include, e.g., Hirao, et al., (2002) An unnatural base pair for incorporating amino acid analogs into proteins, Nature Biotechnology, 20:177-182. Other relevant publications are listed below.
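The arithmetic behind the codon counts can be sketched as follows. The number of triplets is the alphabet size cubed, so 64 corresponds to four coding letters and 125 to five. Reading the figure of 125 as a five-letter alphabet (for instance, one added letter from a self-pairing base) is an assumption made here for illustration; the text does not state the alphabet size. An extra pair of two distinct bases would give six letters.

```python
def triplet_count(alphabet_size):
    """Number of distinct three-base codons over an alphabet of the given size."""
    return alphabet_size ** 3


print(triplet_count(4))  # 64  (natural alphabet: A, C, G, U)
print(triplet_count(5))  # 125 (assumed: one additional coding letter)
print(triplet_count(6))  # 216 (assumed: two additional coding letters)
```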

For in vivo use, the unnatural nucleoside is membrane permeable and is phosphorylated to form the corresponding triphosphate. In addition, the increased genetic information is stable and is not destroyed by cellular enzymes. Previous efforts by Benner and others took advantage of hydrogen bonding patterns that differ from those in canonical Watson-Crick pairs, the most noteworthy example of which is the iso-C:iso-G pair. See, e.g., Switzer et al., (1989) J. Am. Chem. Soc., 111:8322; Piccirilli et al., (1990) Nature, 343:33; and Kool, (2000) Curr. Opin. Chem. Biol., 4:602. In general, these bases mispair to some degree with natural bases and cannot be enzymatically replicated. Kool and co-workers demonstrated that hydrophobic packing interactions between bases can replace hydrogen bonding to drive the formation of base pairs. See Kool, (2000) Curr. Opin. Chem. Biol., 4:602; and Guckian and Kool, (1998) Angew. Chem. Int. Ed. Engl., 36, 2825. In an effort to develop an unnatural base pair satisfying all the above requirements, Schultz, Romesberg, and co-workers systematically synthesized and studied a series of unnatural hydrophobic bases. The PICS:PICS self-pair was found to be more stable than natural base pairs and can be efficiently incorporated into DNA by the Klenow fragment (KF) of Escherichia coli DNA polymerase I. See, e.g., McMinn et al., (1999) J. Am. Chem. Soc., 121:11586; and Ogawa et al., (2000) J. Am. Chem. Soc., 122:3274. A 3MN:3MN self-pair can be synthesized by KF with efficiency and selectivity sufficient for biological function. See, e.g., Ogawa et al., (2000) J. Am. Chem. Soc., 122:8803. However, both bases act as chain terminators for further replication. A mutant DNA polymerase was recently found that can be used to replicate the PICS self-pair. In addition, a 7AI self-pair can be replicated. See, e.g., Tae et al., (2001) J. Am. Chem. Soc., 123:7439. A novel metallobase pair, Dipic:Py, has also been developed, which forms a stable pair upon binding Cu(II). See Meggers et al., (2000) J. Am. Chem. Soc., 122:10714. Because extended codons and unnatural codons are intrinsically orthogonal to natural codons, the methods of the invention can take advantage of this property to generate orthogonal tRNAs for them.

A translational bypassing system can also be used to incorporate an unnatural amino acid into a desired polypeptide. In a translational bypassing system, a large sequence is inserted into a gene but is not translated into protein. The sequence contains a structure that serves as a cue to induce the ribosome to hop over the sequence and resume translation downstream of the insertion.

Unnatural Amino Acids

As used herein, an unnatural amino acid refers to any amino acid, modified amino acid, or amino acid analogue other than selenocysteine and/or pyrrolysine and the following twenty genetically encoded α-amino acids: alanine, arginine, asparagine, aspartic acid, cysteine, glutamine, glutamic acid, glycine, histidine, isoleucine, leucine, lysine, methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, and valine. The generic structure of an α-amino acid is illustrated by Formula I:

An unnatural amino acid is typically any structure having Formula I in which the R group is any substituent other than one used in the twenty natural amino acids. See, e.g., Biochemistry by L. Stryer, 3rd ed., 1988, Freeman and Company, New York, for the structures of the twenty natural amino acids. Note that the unnatural amino acids of the invention can be naturally occurring compounds other than the twenty α-amino acids above.
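The definition above is exclusionary: an amino acid counts as unnatural when it is not one of the twenty listed α-amino acids and is not selenocysteine or pyrrolysine. A minimal sketch of that rule as a name lookup follows. The function `is_unnatural` is hypothetical, it assumes the input already names an amino acid, modified amino acid, or analogue, and it matches on names rather than structures.

```python
# The twenty genetically encoded alpha-amino acids listed in the definition.
CANONICAL = frozenset({
    "alanine", "arginine", "asparagine", "aspartic acid", "cysteine",
    "glutamine", "glutamic acid", "glycine", "histidine", "isoleucine",
    "leucine", "lysine", "methionine", "phenylalanine", "proline",
    "serine", "threonine", "tryptophan", "tyrosine", "valine",
})

# Also excluded from the definition of "unnatural" in the text.
EXCLUDED_BY_DEFINITION = frozenset({"selenocysteine", "pyrrolysine"})


def is_unnatural(name):
    """Apply the exclusion rule to an amino acid name (case-insensitive)."""
    key = name.strip().lower()
    return key not in CANONICAL and key not in EXCLUDED_BY_DEFINITION


print(is_unnatural("Tyrosine"))                  # False
print(is_unnatural("selenocysteine"))            # False
print(is_unnatural("p-acetyl-L-phenylalanine"))  # True
print(is_unnatural("L-dopa"))                    # True
```

L-dopa illustrates the closing note of the paragraph: a compound can occur in nature and still be unnatural in the sense used here.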

Because the unnatural amino acids of the invention typically differ from the natural amino acids in the side chain only, the unnatural amino acids form amide bonds with other amino acids, e.g., natural or unnatural, in the same manner in which they are formed in naturally occurring proteins. However, the unnatural amino acids have side chain groups that distinguish them from the natural amino acids. For example, R in Formula I optionally comprises an alkyl-, aryl-, acyl-, keto-, azido-, hydroxyl-, hydrazine, cyano-, halo-, hydrazide, alkenyl, alkynyl, ether, thiol, seleno-, sulfonyl-, borate, boronate, phospho, phosphono, phosphine, heterocyclic, enone, imine, aldehyde, ester, thioacid, hydroxylamine, or amine group, and the like, or any combination thereof. Other unnatural amino acids of interest include, but are not limited to, amino acids comprising a photoactivatable cross-linker, spin-labeled amino acids, fluorescent amino acids, metal-binding amino acids, metal-containing amino acids, radioactive amino acids, amino acids with novel functional groups, amino acids that covalently or noncovalently interact with other molecules, photocaged and/or photoisomerizable amino acids, amino acids comprising biotin or a biotin analogue, keto-containing amino acids, amino acids comprising polyethylene glycol or polyether, heavy-atom-substituted amino acids, chemically cleavable or photocleavable amino acids, amino acids with an elongated side chain as compared to natural amino acids (e.g., polyethers or long-chain hydrocarbons, e.g., greater than about 5 or greater than about 10 carbons), carbon-linked sugar-containing amino acids, redox-active amino acids, amino thioacid-containing amino acids, and amino acids containing one or more toxic moieties. In some embodiments, the unnatural amino acid has a photoactivatable cross-linker that is used, e.g., to link a protein to a solid support. In one embodiment, the unnatural amino acid has a saccharide moiety attached to the amino acid side chain (e.g., a glycosylated amino acid) and/or another carbohydrate modification.

In addition to unnatural amino acids that contain novel side chains, unnatural amino acids also optionally comprise modified backbone structures, e.g., as illustrated by the structures of Formulas II and III:

wherein Z typically comprises OH, NH2, SH, NH-R', or S-R'; X and Y, which can be the same or different, typically comprise S or O; and R and R', which are optionally the same or different, are typically selected from the same list of constituents given above for the R group of the unnatural amino acids having Formula I, as well as hydrogen. For example, unnatural amino acids of the invention optionally comprise substitutions in the amino or carboxyl group, as illustrated by Formulas II and III. Unnatural amino acids of this type include, but are not limited to, α-hydroxy acids, α-thioacids, and α-aminothiocarboxylates, e.g., with side chains corresponding to the common twenty natural amino acids or with unnatural side chains. In addition, substitutions at the α-carbon optionally include L, D, or α,α-disubstituted amino acids such as D-glutamate, D-alanine, D-methyl-O-tyrosine, aminobutyric acid, and the like. Other structural alternatives include cyclic amino acids, such as proline analogues as well as 3-, 4-, 6-, 7-, 8-, and 9-membered ring proline analogues, and β and γ amino acids such as substituted β-alanine and γ-aminobutyric acid.

For example, many unnatural amino acids are based on natural amino acids, such as tyrosine, glutamine, phenylalanine, and the like. Tyrosine analogues include para-substituted tyrosines, ortho-substituted tyrosines, and meta-substituted tyrosines, wherein the substituted tyrosine comprises, e.g., a keto group (e.g., an acetyl group), a benzoyl group, an amino group, a hydrazine, a hydroxylamine, a thiol group, a carboxy group, an isopropyl group, a methyl group, a C6-C20 straight-chain or branched hydrocarbon, a saturated or unsaturated hydrocarbon, an O-methyl group, a polyether group, a nitro group, an alkynyl group, or the like. In addition, multiply substituted aryl rings are also contemplated. Glutamine analogues of the invention include, but are not limited to, α-hydroxy derivatives, γ-substituted derivatives, cyclic derivatives, and amide-substituted glutamine derivatives. Examples of phenylalanine analogues include, but are not limited to, para-substituted phenylalanines, ortho-substituted phenylalanines, and meta-substituted phenylalanines, wherein the substituent comprises, e.g., a hydroxy group, a methoxy group, a methyl group, an allyl group, an aldehyde, an azido, an iodo, a bromo, a keto group (e.g., an acetyl group), a benzoyl, an alkynyl group, or the like. Specific examples of unnatural amino acids include, but are not limited to, p-acetyl-L-phenylalanine, p-propargyloxyphenylalanine, O-methyl-L-tyrosine, L-3-(2-naphthyl)alanine, 3-methyl-phenylalanine, O-4-allyl-L-tyrosine, 4-propyl-L-tyrosine, tri-O-acetyl-GlcNAcβ-serine, L-dopa, fluorinated phenylalanine, isopropyl-L-phenylalanine, p-azido-L-phenylalanine, p-acyl-L-phenylalanine, p-benzoyl-L-phenylalanine, L-phosphoserine, phosphonoserine, phosphonotyrosine, p-iodo-phenylalanine, p-bromophenylalanine, p-amino-L-phenylalanine, and the like. Examples of structures of unnatural amino acids are shown in Figure 7, Panel B and Figure 11. Additional structures of a variety of unnatural amino acids are provided in, e.g., Figures 16, 17, 18, 19, 26, and 29 of WO 2002/085923 entitled "In vivo incorporation of unnatural amino acids." See also Figure 1, structures 2-5, of Kiick et al., (2002) Incorporation of azides into recombinant proteins for chemoselective modification by the Staudinger ligation, PNAS 99:19-24, for additional methionine analogues.

In one embodiment, compositions that include an unnatural amino acid (such as p-(propargyloxy)-phenylalanine) are provided. Various compositions comprising p-(propargyloxy)-phenylalanine and, e.g., proteins and/or cells are also provided. In one aspect, a composition that includes the p-(propargyloxy)-phenylalanine unnatural amino acid further includes an orthogonal tRNA. The unnatural amino acid can be bonded (e.g., covalently) to the orthogonal tRNA, e.g., covalently bonded to the orthogonal tRNA through an amino-acyl bond, or covalently bonded to a 3'OH or a 2'OH of a terminal ribose sugar of the orthogonal tRNA.

The chemical moieties of unnatural amino acids that can be incorporated into proteins offer a variety of advantages and ways to manipulate the protein. For example, the unique reactivity of a keto functional group allows selective modification of proteins with any of a number of hydrazine- or hydroxylamine-containing reagents in vitro and in vivo. A heavy-atom unnatural amino acid, for example, can be useful for phasing X-ray structure data. The site-specific introduction of heavy atoms using unnatural amino acids also provides selectivity and flexibility in choosing positions for heavy atoms. Photoreactive unnatural amino acids (e.g., amino acids with benzophenone and aryl azide (e.g., phenyl azide) side chains), for example, allow efficient in vivo and in vitro photocrosslinking of proteins. Examples of photoreactive unnatural amino acids include, but are not limited to, p-azido-phenylalanine and p-benzoyl-phenylalanine. The protein with the photoreactive unnatural amino acid can then be crosslinked at will by excitation of the photoreactive group, which provides temporal (and/or spatial) control. In one example, the methyl group of an unnatural amino acid can be substituted with an isotopically labeled group, e.g., an isotopically labeled methyl group, as a probe of local structure and dynamics, e.g., with the use of nuclear magnetic resonance and vibrational spectroscopy. Alkynyl or azido functional groups, for example, allow the selective modification of proteins with molecules through a [3+2] cycloaddition reaction.

Chemical Synthesis of Unnatural Amino Acids

Many of the unnatural amino acids provided above are commercially available, e.g., from Sigma (USA) or Aldrich (Milwaukee, WI, USA). Those that are not commercially available are optionally synthesized as provided herein or in various publications, or using standard methods known to those of skill in the art. For organic synthesis techniques, see, e.g., Organic Chemistry by Fessendon and Fessendon (1982, Second Edition, Willard Grant Press, Boston Mass.); Advanced Organic Chemistry by March (Third Edition, 1985, Wiley and Sons, New York); and Advanced Organic Chemistry by Carey and Sundberg (Third Edition, Parts A and B, 1990, Plenum Press, New York). Additional publications describing the synthesis of unnatural amino acids include, e.g., WO 2002/085923 entitled "In vivo incorporation of Unnatural Amino Acids"; Matsoukas et al., (1995) J. Med. Chem., 38, 4660-4669; King, F.E. and Kidd, D.A.A. (1949) A New Synthesis of Glutamine and of γ-Dipeptides of Glutamic Acid from Phthylated Intermediates, J. Chem. Soc., 3315-3319; Friedman, O.M. and Chatterrji, R. (1959) Synthesis of Derivatives of Glutamine as Model Substrates for Anti-Tumor Agents, J. Am. Chem. Soc. 81, 3750-3752; Craig, J.C. et al. (1988) Absolute Configuration of the Enantiomers of 7-Chloro-4[[4-(diethylamino)-1-methylbutyl]amino]quinoline (Chloroquine), J. Org. Chem. 53, 1167-1170; Azoulay, M., Vilmont, M. and Frappier, F. (1991) Glutamine analogues as Potential Antimalarials, Eur. J. Med. Chem. 26, 201-5; Koskinen, A.M.P. and Rapoport, H. (1989) Synthesis of 4-Substituted Prolines as Conformationally Constrained Amino Acid Analogues, J. Org. Chem. 54, 1859-1866; Christie, B.D. and Rapoport, H. (1985) Synthesis of Optically Pure Pipecolates from L-Asparagine. Application to the Total Synthesis of (+)-Apovincamine through Amino Acid Decarbonylation and Iminium Ion Cyclization, J. Org. Chem. 1989:1859-1866; Barton et al., (1987) Synthesis of Novel α-Amino-Acids and Derivatives Using Radical Chemistry: Synthesis of L- and D-α-Amino-Adipic Acids, L-α-aminopimelic Acid and Appropriate Unsaturated Derivatives, Tetrahedron Lett. 43:4297-4308; and Subasinghe et al., (1992) Quisqualic acid analogues: synthesis of beta-heterocyclic 2-aminopropanoic acid derivatives and their activity at a novel quisqualate-sensitized site, J. Med. Chem. 35:4602-7. See also the patent application entitled "Protein Arrays," attorney docket number P1001US00, filed on December 22, 2002.

In one aspect of the invention, a method for synthesizing a p-(propargyloxy)phenylalanine compound is provided. The method comprises, e.g., (a) suspending N-tert-butoxycarbonyl-tyrosine and K2CO3 in anhydrous DMF; (b) adding propargyl bromide to the reaction mixture of (a) and alkylating the hydroxyl and the carboxyl groups, resulting in a protected intermediate compound having the structure:

Figure A20048002115500501

and (c) mixing the protected intermediate compound with anhydrous HCl in MeOH and deprotecting the amine moiety, thereby synthesizing the p-(propargyloxy)phenylalanine compound. In one embodiment, the method further comprises (d) dissolving the p-(propargyloxy)phenylalanine HCl in aqueous NaOH and MeOH and stirring at room temperature; (e) adjusting the pH to 7; and (f) precipitating the p-(propargyloxy)phenylalanine compound. See, e.g., the synthesis of propargyloxyphenylalanine in Example 4 herein.
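When planning a synthesis of this kind, it can help to convert between moles and mass of the starting amino acid and the free product. The sketch below computes molar masses from elemental compositions. The compositions used (C9H11NO3 for tyrosine and C12H13NO3 for free p-(propargyloxy)phenylalanine) are not given in the text; they are assumptions derived here from the named structures, and the helper names are hypothetical.

```python
# Standard atomic masses in g/mol, rounded to three decimals.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Assumed elemental compositions (not stated in the text).
TYROSINE = {"C": 9, "H": 11, "N": 1, "O": 3}
PROPARGYLOXY_PHE = {"C": 12, "H": 13, "N": 1, "O": 3}


def molar_mass(composition):
    """Sum atomic masses weighted by the atom counts in the composition."""
    return sum(ATOMIC_MASS[element] * count
               for element, count in composition.items())


def grams_for(millimoles, composition):
    """Mass in grams corresponding to the given amount in millimoles."""
    return millimoles / 1000.0 * molar_mass(composition)


print(round(molar_mass(TYROSINE), 2))          # 181.19
print(round(molar_mass(PROPARGYLOXY_PHE), 2))  # 219.24
print(round(grams_for(10, PROPARGYLOXY_PHE), 3))
```

The values apply to the free amino acids only; the Boc-protected intermediate and the HCl salt named in the steps would need their own compositions.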

Cellular Uptake of Unnatural Amino Acids

Unnatural amino acid uptake by a eukaryotic cell is one issue that is typically considered when designing and selecting unnatural amino acids, e.g., for incorporation into a protein. For example, the high charge density of α-amino acids suggests that these compounds are unlikely to be cell permeable. Natural amino acids are taken up into the eukaryotic cell via a collection of protein-based transport systems. A rapid screen can be done to assess which unnatural amino acids, if any, are taken up by cells. See, e.g., the toxicity assays in the application entitled "Protein Arrays," attorney docket number P1001US00, filed on December 22, 2002; and Liu, D.R. and Schultz, P.G. (1999) Progress toward the evolution of an organism with an expanded genetic code, PNAS United States 96:4780-4785. Although uptake is easily analyzed with various assays, an alternative to designing unnatural amino acids that are amenable to cellular uptake pathways is to provide biosynthetic pathways that create the amino acids in vivo.

Biosynthesis of Unnatural Amino Acids

Many biosynthetic pathways already exist in cells for the production of amino acids and other compounds. While a biosynthetic method for a particular unnatural amino acid may not exist in nature, e.g., in a eukaryotic cell, the invention provides such methods. For example, biosynthetic pathways for unnatural amino acids are optionally generated in a host cell by adding new enzymes or by modifying existing host cell pathways. Additional new enzymes are optionally naturally occurring enzymes or artificially evolved enzymes. For example, the biosynthesis of p-aminophenylalanine (as presented in an example in WO 2002/085923 entitled "In vivo incorporation of unnatural amino acids") relies on the addition of a combination of known enzymes from other organisms. The genes for these enzymes can be introduced into a eukaryotic cell by transforming the cell with a plasmid comprising the genes. When expressed in the cell, the genes provide an enzymatic pathway to synthesize the desired compound. Examples of the types of enzymes that are optionally added are provided in the examples below. Additional enzyme sequences are found, e.g., in Genbank. Artificially evolved enzymes are also optionally added to a cell in the same manner. In this manner, the cellular machinery and resources of a cell are manipulated to produce unnatural amino acids.

A variety of methods are available for producing novel enzymes for use in biosynthetic pathways or for the evolution of existing pathways. For example, recursive recombination, e.g., as developed by Maxygen, Inc. (available on the World Wide Web at www.maxygen.com), is optionally used to develop novel enzymes and pathways. See, e.g., Stemmer (1994), Rapid evolution of a protein in vitro by DNA shuffling, Nature 370(4):389-391; and Stemmer, (1994), DNA shuffling by random fragmentation and reassembly: In vitro recombination for molecular evolution, Proc. Natl. Acad. Sci. USA., 91:10747-10751. Similarly, DesignPath™, developed by Genencor (available on the World Wide Web at genencor.com), is optionally used for metabolic pathway engineering, e.g., to engineer a pathway to create O-methyl-L-tyrosine in a cell. This technology reconstructs existing pathways in host organisms using a combination of new genes, e.g., genes identified through functional genomics, together with molecular evolution and design. Diversa Corporation (available on the World Wide Web at diversa.com) also provides technology for rapidly screening libraries of genes and gene pathways, e.g., to create new pathways.

Typically, the unnatural amino acid produced with an engineered biosynthetic pathway of the invention is produced in a concentration sufficient for efficient protein biosynthesis, e.g., a natural cellular amount, but not to such a degree as to affect the concentration of the other amino acids or exhaust cellular resources. Typical concentrations produced in vivo in this manner are about 10 mM to about 0.05 mM. Once a cell is transformed with a plasmid comprising the genes used to produce the enzymes desired for a specific pathway and an unnatural amino acid is generated, in vivo selections are optionally used to further optimize the production of the unnatural amino acid for both ribosomal protein synthesis and cell growth.

Polypeptides with Unnatural Amino Acids

Proteins or polypeptides of interest with at least one unnatural amino acid are a feature of the invention. The invention also includes polypeptides or proteins with at least one unnatural amino acid produced using the compositions and methods of the invention. An excipient (e.g., a pharmaceutically acceptable excipient) can also be present with the protein.

When a protein or polypeptide of interest with at least one unnatural amino acid is produced in a eukaryotic cell, the protein or polypeptide will typically include eukaryotic post-translational modifications. In certain embodiments, a protein includes at least one unnatural amino acid and at least one post-translational modification that is made in vivo by a eukaryotic cell, where the post-translational modification is not made by a prokaryotic cell. For example, the post-translational modification includes, e.g., acetylation, acylation, lipid modification, palmitoylation, palmitate addition, phosphorylation, glycolipid-linkage modification, glycosylation, and the like. In one aspect, the post-translational modification includes attachment of an oligosaccharide (e.g., (GlcNAc-Man)2-Man-GlcNAc-GlcNAc) to an asparagine by a GlcNAc-asparagine linkage. See also Table 7, which lists some examples of N-linked oligosaccharides of eukaryotic proteins (additional residues can also be present, which are not shown). In another aspect, the post-translational modification includes attachment of an oligosaccharide (e.g., Gal-GalNAc, Gal-GlcNAc, etc.) to a serine or threonine by a GalNAc-serine or GalNAc-threonine linkage, or by a GlcNAc-serine or GlcNAc-threonine linkage.

Table 7: Examples of oligosaccharides attached through a GlcNAc linkage

Figure A20048002115500521

In yet another aspect, the post-translational modification includes proteolytic processing of precursors (e.g., calcitonin precursor, calcitonin gene-related peptide precursor, preproparathyroid hormone, preproinsulin, proinsulin, prepro-opiomelanocortin, pro-opiomelanocortin, and the like), assembly into a multisubunit protein or macromolecular assembly, or translocation to another site in the cell (e.g., to organelles such as the endoplasmic reticulum, the Golgi apparatus, the nucleus, lysosomes, peroxisomes, mitochondria, chloroplasts, vacuoles, etc., or through the secretory pathway). In certain embodiments, the protein comprises a secretion or localization sequence, an epitope tag, a FLAG tag, a polyhistidine tag, a GST fusion, or the like.

One advantage of an unnatural amino acid is that it presents additional chemical moieties that can be used to add additional molecules. These modifications can be made in vivo in a eukaryotic cell, or in vitro. Thus, in certain embodiments, the post-translational modification is through the unnatural amino acid. For example, the post-translational modification can be through a nucleophilic-electrophilic reaction. Most reactions currently used for the selective modification of proteins involve covalent bond formation between nucleophilic and electrophilic reaction partners, e.g., the reaction of α-haloketones with histidine or cysteine side chains. Selectivity in these cases is determined by the number and accessibility of the nucleophilic residues in the protein. In proteins of the invention, other more selective reactions can be used, such as the reaction of an unnatural keto-amino acid with hydrazides or aminooxy compounds, in vitro and in vivo. See, e.g., Cornish, et al., (1996) J. Am. Chem. Soc., 118:8150-8151; Mahal, et al., (1997) Science, 276:1125-1128; Wang, et al., (2001) Science 292:498-500; Chin, et al., (2002) J. Am. Chem. Soc. 124:9026-9027; Chin, et al., (2002) Proc. Natl. Acad. Sci., 99:11020-11024; Wang, et al., (2003) Proc. Natl. Acad. Sci., 100:56-61; Zhang, et al., (2003) Biochemistry, 42:6735-6746; and Chin, et al., (2003) Science, in press. This allows the selective labeling of virtually any protein with a host of reagents, including fluorophores, crosslinking agents, saccharide derivatives, and cytotoxic molecules. See also patent application USSN 10/686,944 entitled "Glycoprotein synthesis," filed October 15, 2003. Post-translational modifications, e.g., through an azido amino acid, can also be made through the Staudinger ligation (e.g., with triarylphosphine reagents). See, e.g., Kiick et al., (2002) Incorporation of azides into recombinant proteins for chemoselective modification by the Staudinger ligation, PNAS 99:19-24.

The invention provides another highly efficient method for the selective modification of proteins, which involves the genetic incorporation of unnatural amino acids, e.g., unnatural amino acids containing an azide or alkynyl moiety (see, e.g., 2 and 1 of Figure 11), into proteins in response to a selector codon. These amino acid side chains can then be modified by, e.g., a Huisgen [3+2] cycloaddition reaction (see, e.g., Padwa, A. in Comprehensive Organic Synthesis, Vol. 4, (1991) Trost, B. M. ed., Pergamon, Oxford, pp. 1069-1109; and Huisgen, R. in 1,3-Dipolar Cycloaddition Chemistry, (1984) Padwa, A. ed., Wiley, New York, pp. 1-176) with, e.g., alkynyl or azide derivatives, respectively. See, e.g., Figure 16. Because this method involves a cycloaddition rather than a nucleophilic substitution, proteins can be modified with extremely high selectivity. The reaction can be carried out at room temperature in aqueous conditions with excellent regioselectivity (1,4 > 1,5) by the addition of catalytic amounts of Cu(I) salts to the reaction mixture. See, e.g., Tornoe, et al., (2002) J. Org. Chem. 67:3057-3064; and Rostovtsev, et al., (2002) Angew. Chem. Int. Ed. Engl. 41:2596-2599. Another method that can be used is ligand exchange on a bisarsenic compound with a tetracysteine motif; see, e.g., Griffin, et al., (1998) Science 281:269-272.

A molecule that can be added to a protein of the invention through a [3+2] cycloaddition includes virtually any molecule with an azido or alkynyl derivative. See, e.g., Examples 3 and 5 herein. Such molecules include, but are not limited to, dyes, fluorophores, crosslinking agents, saccharide derivatives, polymers (e.g., derivatives of polyethylene glycol), photocrosslinkers, cytotoxic compounds, affinity labels, derivatives of biotin, resins, beads, a second protein or polypeptide (or more), polynucleotides (e.g., DNA, RNA, etc.), metal chelators, cofactors, fatty acids, carbohydrates, and the like. See, e.g., Figure 13A and Examples 3 and 5 herein. These molecules can be added to an unnatural amino acid with an alkynyl group, e.g., p-propargyloxyphenylalanine, or with an azido group, e.g., p-azido-phenylalanine, respectively. See, e.g., Figure 13B and Figure 17A.

In another aspect, the invention provides compositions comprising such molecules and methods of producing these molecules, e.g., azido dyes (such as shown in chemical structure 4 and chemical structure 6) and an alkynyl polyethylene glycol (e.g., as shown in chemical structure 7), where n is an integer between, e.g., 50 and 10,000, 75 and 5,000, 100 and 2,000, 100 and 1,000, etc. In embodiments of the invention, the alkynyl polyethylene glycol has a molecular weight of, e.g., about 5,000 to about 100,000 Da, about 20,000 to about 50,000 Da, about 20,000 to about 10,000 Da (e.g., 20,000 Da), etc.

Figure A20048002115500541

Figure A20048002115500551
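As a rough consistency check on the ranges of n and molecular weight given for the alkynyl polyethylene glycol, the mass of a linear PEG chain can be estimated from its number of repeat units. The sketch below is only an approximation: it assumes a repeat-unit mass of 44.05 Da for -CH2CH2O- and ignores the end groups (including the alkynyl terminus). The function names are illustrative and do not come from the disclosure.

```python
# Approximate mass of one ethylene oxide repeat unit (-CH2CH2O-), in daltons.
# Assumption: end groups (including the alkynyl terminus) are ignored.
ETHYLENE_OXIDE_DA = 44.05


def peg_mass_da(n: int) -> float:
    """Estimate the mass (Da) of a PEG chain with n repeat units."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n * ETHYLENE_OXIDE_DA


def repeat_units_for_mass(mass_da: float) -> int:
    """Estimate the number of repeat units in a PEG of the given mass (Da)."""
    if mass_da <= 0:
        raise ValueError("mass_da must be positive")
    return round(mass_da / ETHYLENE_OXIDE_DA)


if __name__ == "__main__":
    # Lower and upper bounds of n quoted in the text.
    for n in (50, 100, 1_000, 2_000, 10_000):
        print(f"n = {n:>6}: about {peg_mass_da(n):,.0f} Da")
    # Repeat units corresponding to the 20,000 Da example.
    print("20,000 Da is roughly n =", repeat_units_for_mass(20_000))
```

Under these assumptions, the 20,000 Da example corresponds to roughly 450 repeat units, which falls inside each of the quoted ranges for n.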

Various compositions comprising these compounds, e.g., with proteins and cells, are also provided. In one aspect of the invention, a protein comprising the azido dye (e.g., of chemical structure 4 or chemical structure 6) further includes at least one unnatural amino acid (e.g., an alkynyl amino acid), where the azido dye is attached to the unnatural amino acid through a [3+2] cycloaddition.

In one embodiment, a protein comprises the alkynyl polyethylene glycol of chemical structure 7. In another embodiment, the composition further includes at least one unnatural amino acid (e.g., an azido amino acid), where the alkynyl polyethylene glycol is attached to the unnatural amino acid through a [3+2] cycloaddition.

Methods for synthesizing azido dyes are also provided. For example, one such method comprises: (a) providing a dye compound comprising a sulfonyl halide moiety; and (b) warming the dye compound to room temperature in the presence of 3-azidopropylamine and triethylamine, and coupling the amine moiety of the 3-azidopropylamine to the halide position of the dye compound, thereby synthesizing the azido dye. In one embodiment, the dye compound comprises dansyl chloride, and the azido dye comprises the composition of chemical structure 4. In one aspect, the method further comprises purifying the azido dye from the reaction mixture. See, e.g., Example 5 herein.

In another example, a method for synthesizing an azido dye comprises (a) providing an amine-containing dye compound; and (b) combining the amine-containing dye compound with a carbodiimide and 4-(3-azidopropylcarbamoyl)-butyric acid in a suitable solvent, and coupling the carbonyl group of the acid to the amine moiety of the dye compound, thereby synthesizing the azido dye. In one embodiment, the carbodiimide comprises 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride (EDCI). In one aspect, the amine-containing dye comprises fluoresceinamine, and the suitable solvent comprises pyridine. For example, the amine-containing dye optionally comprises fluoresceinamine and the azido dye optionally comprises the composition of chemical structure 6. In one embodiment, the method further comprises (c) precipitating the azido dye; (d) washing the precipitate with HCl; (e) dissolving the washed precipitate in EtOAc; and (f) precipitating the azido dye in hexanes. See, e.g., Example 5 herein.

Methods for synthesizing a propargyl amide polyethylene glycol are also provided. For example, the method comprises reacting propargylamine with polyethylene glycol (PEG)-hydroxysuccinimide ester in an organic solvent (e.g., CH2Cl2) at room temperature, resulting in the propargyl amide polyethylene glycol of chemical structure 7. In one embodiment, the method further comprises precipitating the propargyl amide polyethylene glycol using ethyl acetate. In one aspect, the method further includes recrystallizing the propargyl amide polyethylene glycol in methanol, and drying the product under vacuum. See, e.g., Example 5 herein.

The eukaryotic cells of the invention provide the ability to synthesize proteins that comprise unnatural amino acids in large, useful quantities. In one aspect, the composition optionally includes, e.g., at least 10 micrograms, at least 50 micrograms, at least 75 micrograms, at least 100 micrograms, at least 200 micrograms, at least 250 micrograms, at least 500 micrograms, at least 1 milligram, at least 10 milligrams, or more of the protein that comprises an unnatural amino acid, or an amount that can be achieved with in vivo protein production methods (details on recombinant protein production and purification are provided herein). In another aspect, the protein is optionally present in the composition at a concentration of, e.g., at least 10 micrograms of protein per liter, at least 50 micrograms of protein per liter, at least 75 micrograms of protein per liter, at least 100 micrograms of protein per liter, at least 200 micrograms of protein per liter, at least 250 micrograms of protein per liter, at least 500 micrograms of protein per liter, at least 1 milligram of protein per liter, or at least 10 milligrams of protein per liter or more, in, e.g., a cell lysate, a buffer, a pharmaceutical buffer, or another liquid suspension (e.g., in a volume of anywhere from about 1 nanoliter to about 100 liters). The production of large quantities (e.g., greater than those typically possible with other methods, such as in vitro translation) of a protein in a eukaryotic cell that includes at least one unnatural amino acid is a feature of the invention.
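The concentration and volume ranges above determine the total protein mass through a simple product. The following sketch makes that arithmetic explicit for the two extremes quoted in the text; the function name and the choice of micrograms as the working unit are assumptions made for illustration.

```python
def protein_mass_ug(conc_ug_per_l: float, volume_l: float) -> float:
    """Return protein mass in micrograms from concentration (ug/L) and volume (L)."""
    if conc_ug_per_l < 0 or volume_l < 0:
        raise ValueError("concentration and volume must be non-negative")
    return conc_ug_per_l * volume_l


if __name__ == "__main__":
    # Lowest quoted concentration in the smallest quoted volume:
    # 10 ug/L in about 1 nanoliter (1e-9 L).
    print(protein_mass_ug(10, 1e-9), "ug")
    # Highest quoted concentration in the largest quoted volume:
    # 10 mg/L (10,000 ug/L) in about 100 L.
    print(protein_mass_ug(10_000, 100), "ug")
```

The two cases span about fourteen orders of magnitude, from roughly 10 femtograms to 1 gram of protein.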

The incorporation of an unnatural amino acid can be done to, e.g., tailor changes in protein structure and/or function, e.g., to change size, acidity, nucleophilicity, hydrogen bonding, hydrophobicity, accessibility of protease target sites, targeting to a moiety (e.g., for a protein array), etc. Proteins that include an unnatural amino acid can have enhanced or even entirely new catalytic or physical properties. For example, the following properties are optionally modified by inclusion of an unnatural amino acid in a protein: toxicity, biodistribution, structural properties, spectroscopic properties, chemical and/or photochemical properties, catalytic ability, half-life (e.g., serum half-life), ability to react with other molecules, e.g., covalently or noncovalently, and the like. Compositions including proteins that comprise at least one unnatural amino acid are useful for, e.g., novel therapeutics, diagnostics, catalytic enzymes, industrial enzymes, binding proteins (e.g., antibodies), and, e.g., the study of protein structure and function. See, e.g., Dougherty, (2000) Unnatural Amino Acids as Probes of Protein Structure and Function, Current Opinion in Chemical Biology, 4:645-652.

In one aspect of the invention, a composition includes at least one protein with at least one, e.g., at least two, at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine, or at least ten or more unnatural amino acids. The unnatural amino acids can be the same or different, e.g., there can be 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more different sites in the protein that comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more different unnatural amino acids. In another aspect, a composition includes a protein in which at least one, but fewer than all, of a particular amino acid present in the protein is substituted with the unnatural amino acid. For a given protein with more than one unnatural amino acid, the unnatural amino acids can be identical or different (e.g., the protein can include two or more different types of unnatural amino acids, or can include two of the same unnatural amino acid). For a given protein with more than two unnatural amino acids, the unnatural amino acids can be the same, different, or a combination of multiple unnatural amino acids of the same kind with at least one different unnatural amino acid.

Essentially any protein (or portion thereof) that includes an unnatural amino acid (and any corresponding coding nucleic acid, e.g., one that includes one or more selector codons) can be produced using the compositions and methods herein. No attempt is made here to identify the hundreds of thousands of known proteins, any of which can be modified to include one or more unnatural amino acids, e.g., by tailoring any available mutation method to include one or more appropriate selector codons in a relevant translation system. Common sequence repositories for known proteins include GenBank, EMBL, DDBJ, and the NCBI. Other repositories can easily be identified by searching the Internet.

Typically, the proteins are, e.g., at least 60%, at least 70%, at least 75%, at least 80%, at least 90%, at least 95%, or at least 99% or more identical to any available protein (e.g., a therapeutic protein, a diagnostic protein, an industrial enzyme, or a portion thereof, and the like), and they comprise one or more unnatural amino acids. Examples of therapeutic, diagnostic, and other proteins that can be modified to comprise one or more unnatural amino acids include, but are not limited to, e.g., alpha-1 antitrypsin, angiostatin, antihemolytic factor, antibodies (further details on antibodies are found below), apolipoprotein, apoprotein, atrial natriuretic factor, atrial natriuretic polypeptide, atrial peptides, C-X-C chemokines (e.g., T39765, NAP-2, ENA-78, Gro-a, Gro-b, Gro-c, IP-10, GCP-2, NAP-4, SDF-1, PF4, MIG), calcitonin, CC chemokines (e.g., monocyte chemoattractant protein-1, monocyte chemoattractant protein-2, monocyte chemoattractant protein-3, monocyte inflammatory protein-1α, monocyte inflammatory protein-1β, RANTES, I309, R83915, R91733, HCC1, T58847, D31065, T64262), CD40 ligand, C-kit ligand, collagen, colony stimulating factor (CSF), complement factor 5a, complement inhibitor, complement receptor 1, cytokines (e.g., epithelial neutrophil activating peptide-78, GROα/MGSA, GROβ, GROγ, MIP-1α, MIP-1δ, MCP-1), epidermal growth factor (EGF), erythropoietin ("EPO", representing a preferred target for modification by the incorporation of one or more unnatural amino acids), exfoliating toxins A and B, Factor IX, Factor VII, Factor VIII, Factor X, fibroblast growth factor (FGF), fibrinogen, fibronectin, G-CSF, GM-CSF, glucocerebrosidase, gonadotropin, growth factors, Hedgehog proteins (e.g., Sonic, Indian, Desert), hemoglobin, hepatocyte growth factor (HGF), hirudin, human serum albumin, insulin, insulin-like growth factor (IGF), interferons (e.g., IFN-α, IFN-β, IFN-γ), interleukins (e.g., IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, etc.), keratinocyte growth factor (KGF), lactoferrin, leukemia inhibitory factor, luciferase, neurturin, neutrophil inhibitory factor (NIF), oncostatin M, osteogenic protein, parathyroid hormone, PD-ECSF, PDGF, peptide hormones (e.g., human growth hormone), pleiotropin, protein A, protein G, pyrogenic exotoxins A, B, and C, relaxin, renin, SCF, soluble complement receptor I, soluble I-CAM 1, soluble interleukin receptors (IL-1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 12, 13, 14, 15), soluble TNF receptor, somatomedin, somatostatin, somatotropin, streptokinase, superantigens, i.e., staphylococcal enterotoxins (SEA, SEB, SEC1, SEC2, SEC3, SED, SEE), superoxide dismutase (SOD), toxic shock syndrome toxin (TSST-1), thymosin alpha 1, tissue plasminogen activator, tumor necrosis factor beta (TNFβ), tumor necrosis factor receptor (TNFR), tumor necrosis factor-alpha (TNFα), vascular endothelial growth factor (VEGF), urokinase, and many others.
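The identity thresholds above (at least 60% through at least 99%) presuppose some way of scoring two sequences against each other. The text does not specify a scoring convention, so the sketch below adopts one common choice as an assumption: identical, non-gap columns divided by the total number of columns in a pre-computed pairwise alignment. The sequences are hypothetical, and "X" is used here only as a placeholder for an unnatural amino acid.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment.

    Both inputs must already be aligned (equal length, '-' for gaps).
    Gap columns count as mismatches under the convention assumed here.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    reference = "MKTAYIAKQR"  # hypothetical parent protein fragment
    variant = "MKTAYIAKQX"    # same fragment with one unnatural residue (X)
    print(percent_identity(reference, variant))
```

Other conventions (for example, dividing by the length of the shorter sequence, or excluding gap columns) give different numbers, so a stated threshold is only meaningful together with the convention used to compute it.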

One class of proteins that can be made using the compositions and methods for in vivo incorporation of unnatural amino acids described herein includes transcriptional modulators or portions thereof. Example transcriptional modulators include genes and transcriptional modulator proteins that modulate cell growth, differentiation, regulation, or the like. Transcriptional modulators are found in prokaryotes, viruses, and eukaryotes, including fungi, plants, yeasts, insects, and animals, including mammals, providing a wide range of therapeutic targets. It will be appreciated that expression and transcriptional activators regulate transcription by many mechanisms, e.g., by binding to receptors, stimulating a signal transduction cascade, regulating expression of transcription factors, binding to promoters and enhancers, binding to proteins that bind to promoters and enhancers, unwinding DNA, splicing pre-mRNA, polyadenylating RNA, and degrading RNA. For example, compositions of GAL4 protein or a portion thereof in a eukaryotic cell are also a feature of the invention. Typically, the GAL4 protein or portion thereof comprises at least one unnatural amino acid. See also the section herein entitled "Orthogonal aminoacyl-tRNA synthetases."

One class of proteins of the invention (e.g., proteins with one or more unnatural amino acids) includes expression activators such as cytokines, inflammatory molecules, growth factors, their receptors, and oncogene products, e.g., interleukins (e.g., IL-1, IL-2, IL-8, etc.), interferons, FGF, IGF-I, IGF-II, PDGF, TNF, TGF-α, TGF-β, EGF, KGF, SCF/c-Kit, CD40L/CD40, VLA-4/VCAM-1, ICAM-1/LFA-1, and hyaluronan/CD44; signal transduction molecules and corresponding oncogene products, e.g., Mos, Ras, Raf, and Met; and transcriptional activators and suppressors, e.g., p53, Tat, Fos, Myc, Jun, Myb, Rel, and steroid hormone receptors such as those for estrogen, progesterone, testosterone, aldosterone, the LDL receptor ligand, and corticosterone.

Enzymes (e.g., industrial enzymes) or portions thereof with at least one unnatural amino acid are also provided by the invention. Examples of enzymes include, but are not limited to, e.g., amidases, amino acid racemases, acylases, dehalogenases, dioxygenases, diarylpropane peroxidases, epimerases, epoxide hydrolases, esterases, isomerases, kinases, glucose isomerases, glycosidases, glycosyl transferases, haloperoxidases, monooxygenases (e.g., p450s), lipases, lignin peroxidases, nitrile hydratases, nitrilases, proteases, phosphatases, subtilisins, transaminases, and nucleases.

Many of these proteins are commercially available (see, e.g., the Sigma BioSciences 2002 catalogue and price list), and the corresponding protein sequences and genes and, typically, many variants thereof, are well known (see, e.g., GenBank). Any of them can be modified by the insertion of one or more unnatural amino acids according to the invention, e.g., to alter the protein with respect to one or more therapeutic, diagnostic, or enzymatic properties of interest. Examples of therapeutically relevant properties include serum half-life, shelf half-life, stability, immunogenicity, therapeutic activity, detectability (e.g., by the inclusion of reporter groups (e.g., labels or label binding sites) in the unnatural amino acids), reduction of LD50 or other side effects, ability to enter the body through the gastric tract (e.g., oral availability), or the like. Examples of diagnostic properties include shelf half-life, stability, diagnostic activity, detectability, or the like. Examples of relevant enzymatic properties include shelf half-life, stability, enzymatic activity, production capability, or the like.

A variety of other proteins can also be modified to include one or more unnatural amino acids of the invention. For example, the invention can include substituting one or more natural amino acids in one or more vaccine proteins with an unnatural amino acid, e.g., in proteins from infectious fungi, e.g., Aspergillus and Candida species; bacteria, particularly E. coli, which serves as a model for pathogenic bacteria, as well as medically important bacteria such as Staphylococci (e.g., aureus) or Streptococci (e.g., pneumoniae); protozoa such as sporozoa (e.g., Plasmodia), rhizopods (e.g., Entamoeba), and flagellates (Trypanosoma, Leishmania, Trichomonas, Giardia, etc.); viruses such as (+) RNA viruses (examples include poxviruses, e.g., vaccinia; picornaviruses, e.g., polio; togaviruses, e.g., rubella; flaviviruses, e.g., HCV; and coronaviruses), (-) RNA viruses (e.g., rhabdoviruses, e.g., VSV; paramyxoviruses, e.g., RSV; orthomyxoviruses, e.g., influenza; bunyaviruses; and arenaviruses), dsDNA viruses (reoviruses, for example), RNA to DNA viruses, i.e., retroviruses, e.g., HIV and HTLV, and certain DNA to RNA viruses such as hepatitis B.

Agriculturally related proteins such as insect resistance proteins (e.g., the Cry proteins), starch and lipid production enzymes, plant and insect toxins, toxin-resistance proteins, mycotoxin detoxification proteins, plant growth enzymes (e.g., ribulose 1,5-bisphosphate carboxylase/oxygenase, "RUBISCO"), lipoxygenase (LOX), and phosphoenolpyruvate (PEP) carboxylase are also suitable targets for unnatural amino acid modification.

The invention also provides methods for producing, in a eukaryotic cell, at least one protein comprising at least one unnatural amino acid (and proteins produced by such methods). For example, a method includes growing, in an appropriate medium, a eukaryotic cell that comprises a nucleic acid that comprises at least one selector codon and encodes the protein. The eukaryotic cell also comprises an orthogonal tRNA (O-tRNA) that functions in the cell and recognizes the selector codon, and an orthogonal aminoacyl-tRNA synthetase (O-RS) that preferentially aminoacylates the O-tRNA with the unnatural amino acid; and the medium comprises the unnatural amino acid.
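The logic of the production method can be illustrated with a toy translation model. This is a deliberately simplified sketch, not a description of the actual cellular machinery: it assumes TAG is the selector codon, treats suppression as all-or-nothing (in a real cell, suppression competes with release factors), uses a codon table containing only the codons in the example, and marks the unnatural amino acid with the placeholder letter "X". The reading frame and all names are hypothetical.

```python
SELECTOR_CODON = "TAG"        # assumed selector codon (amber)
STOP_CODONS = {"TAA", "TGA"}  # ordinary stop codons in this toy model

# Minimal codon table: only the codons used in the example below,
# not the full genetic code.
CODON_TABLE = {"ATG": "M", "GCT": "A", "TGG": "W", "AAA": "K", "GGT": "G"}


def translate(orf: str, orthogonal_pair_present: bool, uaa_in_medium: bool) -> str:
    """Translate an ORF, decoding the selector codon as 'X' when possible.

    The selector codon is read through only if both the O-tRNA/O-RS pair
    and the unnatural amino acid are available; otherwise it terminates
    translation like an ordinary stop codon.
    """
    protein = []
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3]
        if codon == SELECTOR_CODON:
            if orthogonal_pair_present and uaa_in_medium:
                protein.append("X")  # unnatural amino acid incorporated
                continue
            break  # no suppression: truncated product
        if codon in STOP_CODONS:
            break
        protein.append(CODON_TABLE[codon])
    return "".join(protein)


if __name__ == "__main__":
    orf = "ATGGCTTAGAAAGGTTAA"  # hypothetical ORF with one selector codon
    print(translate(orf, orthogonal_pair_present=True, uaa_in_medium=True))
    print(translate(orf, orthogonal_pair_present=False, uaa_in_medium=True))
```

In this model, full-length product appears only when the orthogonal pair and the unnatural amino acid are both supplied, which mirrors the three requirements listed in the method.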

In one embodiment, the method further includes incorporating the unnatural amino acid into the protein, where the unnatural amino acid comprises a first reactive group; and then contacting the protein with a molecule (e.g., a dye, a polymer such as a derivative of polyethylene glycol, a photocrosslinker, a cytotoxic compound, an affinity label, a derivative of biotin, a resin, a second protein or polypeptide, a metal chelator, a cofactor, a fatty acid, a carbohydrate, a polynucleotide (e.g., DNA, RNA, etc.), and the like) that comprises a second reactive group. The first reactive group reacts with the second reactive group to attach the molecule to the unnatural amino acid through a [3+2] cycloaddition. In one embodiment, the first reactive group is an alkynyl or azido moiety and the second reactive group is an azido or alkynyl moiety. For example, the first reactive group is the alkynyl moiety (e.g., in the unnatural amino acid p-propargyloxyphenylalanine) and the second reactive group is the azido moiety. In another example, the first reactive group is the azido moiety (e.g., in the unnatural amino acid p-azido-L-phenylalanine) and the second reactive group is the alkynyl moiety.

In one embodiment, the O-RS aminoacylates the O-tRNA with the unnatural amino acid at least 50% as efficiently as does an O-RS having an amino acid sequence as set forth in, e.g., SEQ ID NO.: 86 or 45. In another embodiment, the O-tRNA comprises, is processed from, or is encoded by SEQ ID NO.: 65 or 64, or a complementary polynucleotide sequence thereof. In yet another embodiment, the O-RS comprises an amino acid sequence set forth in any one of SEQ ID NO.: 36-63 (e.g., 36-47, 48-63, or any other subset of 36-63) and/or 86.

The encoded protein can comprise, e.g., a therapeutic protein, a diagnostic protein, an industrial enzyme, or a portion thereof. Optionally, the protein produced by the method is further modified through the unnatural amino acid. For example, the protein produced by the method is optionally modified by at least one post-translational modification in vivo.

Methods of producing a screening or selecting transcriptional modulator protein are also provided (as are screening or selecting transcriptional modulator proteins produced by such methods). For example, a method includes: selecting a first polynucleotide sequence, where the polynucleotide sequence encodes a nucleic acid binding domain; and mutating the first polynucleotide sequence to include at least one selector codon. This provides a screening or selecting polynucleotide sequence. The method also includes: selecting a second polynucleotide sequence, where the second polynucleotide sequence encodes a transcriptional activation domain; providing a construct that comprises the screening or selecting polynucleotide sequence operably linked to the second polynucleotide sequence; and introducing the construct, an unnatural amino acid, an orthogonal tRNA synthetase (O-RS), and an orthogonal tRNA (O-tRNA) into a cell. With these components, the O-RS preferentially aminoacylates the O-tRNA with the unnatural amino acid, and the O-tRNA recognizes the selector codon and incorporates the unnatural amino acid into the nucleic acid binding domain in response to the selector codon in the screening or selecting polynucleotide sequence, thereby providing the screening or selecting transcriptional modulator protein.

In certain embodiments, the protein or polypeptide of interest (or portion thereof) in the methods and/or compositions of the invention is encoded by a nucleic acid. Typically, the nucleic acid comprises at least one selector codon, at least two selector codons, at least three selector codons, at least four selector codons, at least five selector codons, at least six selector codons, at least seven selector codons, at least eight selector codons, at least nine selector codons, or ten or more selector codons.

Genes coding for proteins or polypeptides of interest can be mutagenized using methods well known to those of skill in the art and described herein under "Mutagenesis and Other Molecular Biology Techniques" to include, e.g., one or more selector codons for the incorporation of an unnatural amino acid. For example, a nucleic acid for a protein of interest is mutagenized to include one or more selector codons, providing for the insertion of one or more unnatural amino acids. The invention includes any such variant, e.g., mutant, versions of any protein, e.g., including at least one unnatural amino acid. Similarly, the invention also includes the corresponding nucleic acids, i.e., any nucleic acid with one or more selector codons that encodes one or more unnatural amino acids.

In one exemplary embodiment, the invention provides compositions (and compositions produced by the methods of the invention) that include a Thr44, Arg110TAG mutant of GAL4, where the GAL4 protein includes at least one unnatural amino acid. In another embodiment, the invention provides compositions that include a Trp33TAG mutant of human superoxide dismutase (hSOD), where the hSOD protein includes at least one unnatural amino acid.

Purification of recombinant proteins containing unnatural amino acids

Proteins of the invention, e.g., proteins comprising unnatural amino acids, antibodies to proteins comprising unnatural amino acids, etc., can be purified, either partially or substantially to homogeneity, according to standard procedures known to and used by those skilled in the art. Accordingly, polypeptides of the invention can be recovered and purified by any of a number of methods well known in the art, including, e.g., ammonium sulfate or ethanol precipitation, acid or base extraction, column chromatography, affinity column chromatography, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, hydroxyapatite chromatography, lectin chromatography, gel electrophoresis, and the like. Protein refolding steps can be used, as desired, in making correctly folded mature proteins. High performance liquid chromatography (HPLC), affinity chromatography, or other suitable methods can be employed in final purification steps where high purity is desired. In one embodiment, antibodies made against unnatural amino acids (or proteins comprising unnatural amino acids) are used as purification reagents, e.g., for affinity-based purification of proteins comprising one or more unnatural amino acids. Once purified, partially or to homogeneity as desired, the polypeptides are optionally used, e.g., as assay components, therapeutic reagents, or as immunogens for antibody production.

In addition to other references noted herein, a variety of purification/protein folding methods are well known in the art, including, e.g., those set forth in R. Scopes, Protein Purification, Springer-Verlag, N.Y. (1982); Deutscher, Methods in Enzymology Vol. 182: Guide to Protein Purification, Academic Press, Inc., N.Y. (1990); Sandana (1997) Bioseparation of Proteins, Academic Press, Inc.; Bollag et al. (1996) Protein Methods, 2nd Edition, Wiley-Liss, NY; Walker (1996) The Protein Protocols Handbook, Humana Press, NJ; Harris and Angal (1990) Protein Purification Applications: A Practical Approach, IRL Press at Oxford, Oxford, England; Harris and Angal, Protein Purification Methods: A Practical Approach, IRL Press at Oxford, Oxford, England; Scopes (1993) Protein Purification: Principles and Practice, 3rd Edition, Springer Verlag, NY; Janson and Ryden (1998) Protein Purification: Principles, High Resolution Methods and Applications, Second Edition, Wiley-VCH, NY; and Walker (1998) Protein Protocols on CD-ROM, Humana Press, NJ; and the references cited therein.

One advantage of producing a protein or polypeptide of interest with an unnatural amino acid in a eukaryotic cell is that the protein or polypeptide will typically be folded in its native conformation. However, in certain embodiments of the invention, those skilled in the art will recognize that, after synthesis, expression and/or purification, proteins can possess a conformation different from the desired conformation of the relevant polypeptide. In one aspect of the invention, the expressed protein is optionally denatured and then renatured. This is accomplished, e.g., by adding a chaperonin to the protein or polypeptide of interest, and/or by solubilizing the protein in a chaotropic agent such as guanidine hydrochloride.

In general, it is occasionally desirable to denature and reduce expressed polypeptides and then to cause the polypeptides to refold into the preferred conformation. For example, guanidine, urea, DTT, DTE, and/or a chaperonin can be added to a translation product of interest. Methods of reducing, denaturing and renaturing proteins are well known to those skilled in the art (see the references above, and Debinski, et al. (1993) J. Biol. Chem., 268:14065-14070; Kreitman and Pastan (1993) Bioconjug. Chem., 4:581-585; and Buchner, et al. (1992) Anal. Biochem., 205:263-270). Debinski, et al., for example, describe the denaturation and reduction of inclusion body proteins in guanidine-DTE. The proteins can be refolded in a redox buffer containing, e.g., oxidized glutathione and L-arginine. Refolding reagents can be flowed or otherwise moved into contact with the one or more polypeptides or other expression products, or vice versa.

Antibodies

In one aspect, the invention provides antibodies to molecules of the invention, e.g., synthetases, tRNAs, and proteins comprising unnatural amino acids. Antibodies to molecules of the invention are useful as purification reagents, e.g., for purifying the molecules of the invention. In addition, the antibodies can be used as indicator reagents to indicate the presence of a synthetase, a tRNA, or a protein comprising an unnatural amino acid, e.g., to track the presence or location (e.g., in vivo or in situ) of the molecule.

An antibody of the invention can be a protein comprising one or more polypeptides substantially or partially encoded by immunoglobulin genes or fragments of immunoglobulin genes. The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, epsilon and mu constant region genes, as well as myriad immunoglobulin variable region genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes IgG, IgM, IgA, IgD and IgE, respectively. A typical immunoglobulin (e.g., antibody) structural unit comprises a tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair having one "light" chain (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus of each chain defines a variable region of about 100 to 110 or more amino acids primarily responsible for antigen recognition. The terms variable light chain (VL) and variable heavy chain (VH) refer to these light and heavy chains, respectively.
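As a small worked example of the tetramer description above, the sketch below maps each heavy-chain class to its immunoglobulin class and adds up the approximate chain masses given in the text (about 25 kD per light chain and about 50-70 kD per heavy chain). The names are illustrative only, "mu" denotes the heavy chain that defines IgM, and the result is a rough range for a single four-chain unit, not a measured value.

```python
# Heavy-chain class -> immunoglobulin class, as listed in the text.
HEAVY_CHAIN_TO_IG_CLASS = {
    "gamma": "IgG",
    "mu": "IgM",
    "alpha": "IgA",
    "delta": "IgD",
    "epsilon": "IgE",
}

# Approximate chain masses in kD as (low, high) ranges, taken from the text.
LIGHT_CHAIN_KD = (25.0, 25.0)
HEAVY_CHAIN_KD = (50.0, 70.0)


def tetramer_mass_range_kd(light=LIGHT_CHAIN_KD, heavy=HEAVY_CHAIN_KD):
    """Approximate mass range of one tetramer: two light plus two heavy chains."""
    low = 2 * light[0] + 2 * heavy[0]
    high = 2 * light[1] + 2 * heavy[1]
    return low, high


print(HEAVY_CHAIN_TO_IG_CLASS["gamma"])  # IgG
print(tetramer_mass_range_kd())          # (150.0, 190.0)
```

With the figures quoted in the text, one tetramer therefore comes to roughly 150-190 kD.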

Antibodies exist as intact immunoglobulins or as a number of well-characterized fragments produced by digestion with various peptidases. Thus, for example, pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab')2, a dimer of Fab, which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab')2 can be reduced under mild conditions to break the disulfide linkage in the hinge region, thereby converting the F(ab')2 dimer into a Fab' monomer. The Fab' monomer is essentially a Fab with part of the hinge region (see Fundamental Immunology, Fourth Edition, W.E. Paul, ed., Raven Press, N.Y. (1999), for a more detailed description of other antibody fragments). While various antibody fragments are defined in terms of the digestion of an intact antibody, those skilled in the art will appreciate that such Fab' fragments, etc., can also be synthesized de novo, either chemically or by using recombinant DNA methodology. Thus, the term antibody, as used herein, also optionally includes antibody fragments produced either by the modification of whole antibodies or by de novo synthesis using recombinant DNA methodologies. Antibodies include single chain antibodies, including single chain Fv (sFv or scFv) antibodies, in which a variable heavy chain and a variable light chain are joined together (directly or through a peptide linker) to form a continuous polypeptide. Antibodies of the invention can be, e.g., polyclonal, monoclonal, chimeric, humanized, single chain, Fab fragments, fragments produced by a Fab expression library, or the like.

In general, antibodies of the invention are valuable both as general reagents and as therapeutic reagents in a variety of molecular biological or pharmaceutical processes. Methods of producing polyclonal and monoclonal antibodies are available and can be applied to making the antibodies of the invention. A number of basic texts describe standard antibody production processes, including, e.g., Borrebaeck (ed.) (1995) Antibody Engineering, Second Edition, Freeman and Company, NY (Borrebaeck); McCafferty et al. (1996) Antibody Engineering, A Practical Approach, IRL at Oxford Press, Oxford, England (McCafferty); Paul (1995) Antibody Engineering Protocols, Humana Press, Totowa, NJ (Paul); Paul (ed.) (1999) Fundamental Immunology, Fifth Edition, Raven Press, N.Y.; Coligan (1991) Current Protocols in Immunology, Wiley/Greene, NY; Harlow and Lane (1989) Antibodies: A Laboratory Manual, Cold Spring Harbor Press, NY; Stites et al. (eds.) Basic and Clinical Immunology (Fourth Edition), Lange Medical Publications, Los Altos, CA, and references cited therein; Goding (1986) Monoclonal Antibodies: Principles and Practice (Second Edition), Academic Press, New York, NY; and Kohler and Milstein (1975) Nature 256:495-497.

A variety of recombinant techniques for antibody preparation that do not rely on, e.g., injection of an antigen into an animal have been developed and can be used in the context of the present invention. For example, it is possible to generate and select libraries of recombinant antibodies in phage or similar vectors. See, e.g., Winter et al. (1994) Making Antibodies by Phage Display Technology, Annu. Rev. Immunol. 12:433-455, and the references cited therein for a review. See also, Griffiths and Duncan (1998) Strategies for selection of antibodies by phage display, Curr Opin Biotechnol 9:102-108; Hoogenboom et al. (1998) Antibody phage display technology and its applications, Immunotechnology 4:1-20; Gram et al. (1992) In vitro selection and affinity maturation of antibodies from a naive combinatorial immunoglobulin library, PNAS 89:3576-3580; Huse et al. (1989) Science 246:1275-1281; and Ward et al. (1989) Nature 341:544-546.

In one embodiment, antibody libraries can include repertoires of V genes (e.g., harvested from populations of lymphocytes or assembled in vitro) which are cloned for display of the associated heavy and light chain variable domains on the surface of filamentous bacteriophage. Phage are selected by binding to an antigen. Soluble antibodies are expressed from phage-infected bacteria, and the antibody can be improved, e.g., via mutagenesis. See, e.g., Balint and Larrick (1993) Antibody Engineering by Parsimonious Mutagenesis, Gene 137:109-118; Stemmer et al. (1993) Selection of an Active Single Chain Fv Antibody From a Protein Linker Library Prepared by Enzymatic Inverse PCR, Biotechniques 14(2):256-65; Crameri et al. (1996) Construction and evolution of antibody-phage libraries by DNA shuffling, Nature Medicine 2:100-103; and Crameri and Stemmer (1995) Combinatorial multiple cassette mutagenesis creates all the permutations of mutant and wildtype cassettes, BioTechniques 18:194-195.

Kits for cloning and expression of recombinant antibody phage systems are also known and available, e.g., the "recombinant phage antibody system, mouse ScFv module" from Amersham-Pharmacia Biotechnology (Uppsala, Sweden). Bacteriophage antibody libraries have also been used to produce high affinity human antibodies by chain shuffling (see, e.g., Marks et al. (1992) By-Passing Immunization: Building High Affinity Human Antibodies by Chain Shuffling, Biotechniques 10:779-782). It will also be recognized that antibodies can be prepared by any of a number of commercial services (e.g., Bethyl Laboratories (Montgomery, TX), Anawa (Switzerland), Eurogentec (Belgium, and in the US in Philadelphia, PA, etc.), and many others).

In certain embodiments, it is useful to "humanize" antibodies of the invention, e.g., where the antibodies are to be administered therapeutically. The use of humanized antibodies tends to reduce the incidence of unwanted immune responses against the therapeutic antibodies (e.g., when the patient is a human). The antibody references above describe humanization strategies. In addition to humanized antibodies, human antibodies are also a feature of the invention. Human antibodies consist of characteristically human immunoglobulin sequences. Human antibodies can be produced using a wide variety of methods (see, e.g., Larrick et al., U.S. Pat. No. 5,001,065, for a review). A general approach for producing human antibodies by trioma technology is described by Ostberg et al. (1983) Hybridoma 2:361-367; Ostberg, U.S. Pat. No. 4,634,664; and Engelman et al., U.S. Pat. No. 4,634,666. A variety of methods of using antibodies in the purification and detection of proteins are known and can be applied to detecting and purifying proteins comprising unnatural amino acids as noted herein. In general, antibodies are useful reagents for ELISA, western blotting, immunochemistry, affinity chromatography methods, surface plasmon resonance (SPR), and many other methods. The references noted above provide details on how to perform ELISA assays, western blots, SPR, and the like.

In one aspect of the invention, antibodies of the invention themselves include unnatural amino acids, providing antibodies with properties of interest (e.g., improved half-life, stability, toxicity, or the like). See also the section herein entitled "Polypeptides with unnatural amino acids". Antibodies account for nearly 50% of all compounds currently in clinical trials (Wittrup (1999) Phage on display, Tibtech 17:423-424), and antibodies are used ubiquitously as diagnostic reagents. Accordingly, the ability to modify antibodies with unnatural amino acids provides an important tool for modifying these valuable reagents.

For example, there are many applications of Mabs in the field of diagnostics. Assays that use Mabs range from simple spot tests to more involved methods such as tumor imaging with the radiolabeled NR-LU-10 Mab from DuPont Merck Co. (Rusch et al. (1993) NR-LU-10 monoclonal antibody scanning. A helpful new adjunct to computed tomography in evaluating non-small-cell lung cancer, J Thorac Cardiovasc Surg 106:200-4). As noted, Mabs are central reagents for ELISA, western blotting, immunochemistry, affinity chromatography methods, and the like. Any such diagnostic antibody can be modified to include one or more unnatural amino acids, altering, e.g., the specificity or avidity of the Ab for a target, or altering one or more detectable properties, e.g., by including a detectable label (e.g., spectrographic, fluorescent, luminescent, etc.) in the unnatural amino acid.

One class of valuable antibody reagents is therapeutic Abs. For example, antibodies can be tumor-specific Mabs that arrest tumor growth by targeting tumor cells for destruction by antibody-dependent cell-mediated cytotoxicity (ADCC) or complement-mediated lysis (CML) (these general types of Abs are sometimes referred to as "magic bullets"). One example is Rituxan, an anti-CD20 Mab for the treatment of non-Hodgkin's lymphoma (Scott (1998) Rituximab: a new therapeutic monoclonal antibody for non-Hodgkin's lymphoma, Cancer Pract 6:195-7). A second example relates to antibodies that interfere with a critical component of tumor growth. Herceptin is an anti-HER-2 monoclonal antibody for the treatment of metastatic breast cancer and provides an example of an antibody with this mechanism of action (Baselga et al. (1998) Recombinant humanized anti-HER2 antibody (Herceptin) enhances the antitumor activity of paclitaxel and doxorubicin against HER2/neu overexpressing human breast cancer xenografts [published erratum appears in Cancer Res (1999) 59(8):2020], Cancer Res 58:2825-31). A third example relates to antibodies for the delivery of cytotoxic compounds (toxins, radionuclides, etc.) directly to a tumor or other site of interest. For example, one such Mab is CYT-356, a 90Y-linked antibody that targets radiation directly to prostate tumor cells (Deb et al. (1996) Treatment of hormone-refractory prostate cancer with 90Y-CYT-356 monoclonal antibody, Clin Cancer Res 2:1289-97). A fourth application is antibody-directed enzyme prodrug therapy, in which an enzyme co-localized to a tumor activates a systemically administered prodrug in the tumor vicinity. For example, an anti-Ep-CAM1 antibody linked to carboxypeptidase A has been developed for the treatment of colorectal cancer (Wolfe et al. (1999) Antibody-directed enzyme prodrug therapy with the T268G mutant of human carboxypeptidase A1: in vitro and in vivo studies with prodrugs of methotrexate and the thymidylate synthase inhibitors GW1031 and GW1843, Bioconjug Chem 10:38-48). Other Abs (e.g., antagonists) are designed to specifically inhibit normal cellular functions for therapeutic benefit. An example is Orthoclone OKT3, an anti-CD3 Mab offered by Johnson and Johnson for reducing acute organ transplant rejection (Strate et al. (1990) Orthoclone OKT3 as first-line therapy in acute renal allograft rejection, Transplant Proc 22:219-20). Another class of antibody products is agonists. These Mabs are designed to specifically enhance normal cellular functions for therapeutic benefit. For example, Mab-based agonists of acetylcholine receptors for psychiatric therapy are under development (Xie et al. (1997) Direct demonstration of MuSK involvement in acetylcholine receptor clustering through identification of agonist ScFv, Nat. Biotechnol. 15:768-71). Any of these antibodies can be modified to include one or more unnatural amino acids to enhance one or more therapeutic properties (specificity, avidity, serum half-life, etc.).

Another class of antibody products provides novel functions. The main antibodies in this group are catalytic antibodies, such as Ig sequences that have been engineered to mimic the catalytic abilities of enzymes (Wentworth and Janda (1998) Catalytic antibodies, Curr Opin Chem Biol 2:138-44). For example, an interesting application involves using the catalytic antibody mAb-15A10 to hydrolyze cocaine in vivo for addiction therapy (Mets et al. (1998) A catalytic antibody against cocaine prevents cocaine's reinforcing and toxic effects in rats, Proc Natl Acad Sci U S A 95:10176-81). Catalytic antibodies can also be modified to include one or more unnatural amino acids to improve one or more properties of interest.

Defining polypeptides by immunoreactivity

Because the polypeptides of the invention provide a variety of new polypeptide sequences (e.g., comprising unnatural amino acids in the case of proteins synthesized in the translation systems herein, or, e.g., in the case of the novel synthetases herein, novel sequences of standard amino acids), the polypeptides also provide new structural features that can be recognized, e.g., in immunological assays. The generation of antibodies or antisera that specifically bind the polypeptides of the invention, as well as the polypeptides that are bound by such antibodies or antisera, are features of the invention.

For example, the invention includes synthetase proteins that specifically bind to, or that are specifically immunoreactive with, an antibody or antiserum generated against an immunogen comprising an amino acid sequence selected from one or more of SEQ ID NO: 36-63 (e.g., 36-47, 48-63, or any other subset of 36-63) and/or 86. To eliminate cross-reactivity with other homologues, the antibody or antiserum is subtracted with available control synthetase homologues, such as the wild-type E. coli tyrosyl synthetase (TyrRS) (e.g., SEQ ID NO: 2).

In one typical format, the immunoassay uses a polyclonal antiserum that was raised against one or more polypeptides comprising one or more of the sequences corresponding to one or more of SEQ ID NO: 36-63 (e.g., 36-47, 48-63, or any other subset of 36-63) and/or 86, or a substantial subsequence thereof (i.e., at least about 30% of the full-length sequence provided). This set of potential polypeptide immunogens derived from SEQ ID NO: 36-63 and 86 is collectively referred to below as "the immunogenic polypeptides". The resulting antisera are optionally selected to have low cross-reactivity against the control synthetase homologues, and any such cross-reactivity is removed, e.g., by immunoabsorption with one or more of the control synthetase homologues, prior to use of the polyclonal antiserum in the immunoassay.

In order to produce antisera for use in an immunoassay, one or more of the immunogenic polypeptides is produced and purified as described herein. For example, recombinant protein can be produced in a recombinant cell. An inbred strain of mice (used in this assay because results are more reproducible due to the virtual genetic identity of the mice) is immunized with the immunogenic protein(s) in combination with a standard adjuvant, such as Freund's adjuvant, and a standard mouse immunization protocol (see, e.g., Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, New York, for a standard description of antibody generation, immunoassay formats and conditions that can be used to determine specific immunoreactivity). Additional references and discussion of antibodies are also found herein and can be applied to making antibodies for defining/detecting polypeptides by immunoreactivity. Alternatively, one or more synthetic or recombinant polypeptides derived from the sequences disclosed herein are conjugated to a carrier protein and used as an immunogen.

In the immunoassay, polyclonal antisera are collected and titered against the immunogenic polypeptide, e.g., in a solid-phase immunoassay with one or more of the immunogenic proteins immobilized on a solid support. Polyclonal antisera with a titer of 10^6 or greater are selected, pooled, and subtracted with the control synthetase polypeptides to produce subtracted, pooled, titered polyclonal antisera.

The subtracted, pooled, titered polyclonal antisera are tested for cross-reactivity against the control homologues in a comparative immunoassay. In this comparative assay, discriminatory binding conditions are determined for the subtracted, pooled, titered polyclonal antisera such that binding of the titered polyclonal antisera to the immunogenic synthetase gives a signal-to-noise ratio at least about 5-10 fold higher than binding to the control synthetase homologues. That is, the stringency of the binding/washing reaction is adjusted by adding non-specific competitors such as albumin or non-fat dry milk, and/or by adjusting salt conditions, temperature, and/or the like. These binding/washing conditions are used in subsequent assays to determine whether a test polypeptide (a polypeptide being compared to the immunogenic polypeptides and/or the control polypeptides) is specifically bound by the pooled, subtracted polyclonal antisera. In particular, a test polypeptide that shows, under the discriminatory binding conditions, a signal-to-noise ratio at least 2-5 fold higher than that of the control synthetase homologues, and at least about 1/2 the signal-to-noise ratio of the immunogenic polypeptide(s), shares substantial structural similarity with the immunogenic polypeptide as compared to known synthetases, and is therefore a polypeptide of the invention.
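The two signal-to-noise criteria in the preceding paragraph can be written as a simple decision rule. The following is a minimal sketch, not part of the disclosure: the function name, parameter names, and default thresholds (the lower ends of the stated "2-5 fold" and "about 1/2" ranges) are illustrative assumptions.

```python
def is_specifically_bound(test_sn, control_sn, immunogen_sn,
                          min_fold_over_control=2.0,
                          min_fraction_of_immunogen=0.5):
    """Apply both signal-to-noise (S/N) criteria to a test polypeptide.

    All three S/N values are assumed to be measured under the same
    discriminatory binding conditions. Thresholds are assumptions taken
    from the lower bounds quoted in the text.
    """
    if min(test_sn, control_sn, immunogen_sn) <= 0:
        raise ValueError("signal-to-noise ratios must be positive")
    # Criterion 1: test S/N is at least 2-5 fold that of the control homologue.
    beats_control = test_sn >= min_fold_over_control * control_sn
    # Criterion 2: test S/N is at least about half that of the immunogenic polypeptide.
    near_immunogen = test_sn >= min_fraction_of_immunogen * immunogen_sn
    return beats_control and near_immunogen
```

A stricter reading can be obtained by passing `min_fold_over_control=5.0`.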

In another example, immunoassays in the competitive binding format are used for detection of a test polypeptide. For example, as noted above, cross-reacting antibodies are removed from the pooled antisera mixture by immunoadsorption with the control polypeptides. The immunogenic polypeptide(s) are then immobilized on a solid support, which is exposed to the subtracted, pooled antisera. Test proteins are added to the assay to compete for binding to the pooled, subtracted antisera. The ability of the test protein(s) to compete for binding to the pooled, subtracted antisera, relative to the immobilized protein(s), is compared to the ability of the immunogenic polypeptide(s) added to the assay to compete for binding (the immunogenic polypeptides compete effectively with the immobilized immunogenic polypeptides for binding to the pooled antisera). The percent cross-reactivity for the test proteins is calculated using standard calculations.

In a parallel assay, the ability of the control proteins to compete for binding to the pooled, subtracted antisera is optionally determined as compared to the ability of the immunogenic polypeptide(s) to compete for binding to the antisera. Again, the percent cross-reactivity for the control polypeptides is calculated using standard calculations. Where the percent cross-reactivity is at least 5-10 fold higher for the test polypeptides than for the control polypeptides, or where the binding of the test polypeptides is approximately in the range of the binding of the immunogenic polypeptides, the test polypeptides are said to specifically bind the pooled, subtracted antisera.

In general, the immunoadsorbed and pooled antisera can be used in a competitive binding immunoassay as described herein to compare any test polypeptide to the immunogenic and/or control polypeptide(s). To make this comparison, the immunogenic, test, and control polypeptides are each assayed over a wide range of concentrations, and the amount of each polypeptide required to inhibit 50% of the binding of the subtracted antisera to, e.g., an immobilized control, test, or immunogenic protein is determined using standard techniques. If the amount of the test polypeptide required for binding in the competitive assay is less than twice the amount of the immunogenic polypeptide that is required, then the test polypeptide is said to specifically bind to an antibody generated against the immunogenic protein, provided the amount is at least about 5-10 fold as high as for the control polypeptide.
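The competitive-assay comparison can be illustrated numerically. The text refers only to "standard calculations"; the sketch below assumes one common convention, in which percent cross-reactivity is the ratio of the 50%-inhibition amount of the immunogenic polypeptide to that of the competitor, times 100. The function names and the reading of the control criterion (test cross-reactivity at least 5 fold that of the control) are assumptions, not statements of the disclosed method.

```python
def percent_cross_reactivity(immunogen_ic50, competitor_ic50):
    """Percent cross-reactivity under the assumed convention:
    100 * (amount of immunogen for 50% inhibition) / (amount of competitor)."""
    if immunogen_ic50 <= 0 or competitor_ic50 <= 0:
        raise ValueError("50%-inhibition amounts must be positive")
    return 100.0 * immunogen_ic50 / competitor_ic50


def passes_competitive_assay(test_ic50, immunogen_ic50, control_ic50,
                             min_fold_over_control=5.0):
    """Combine the two criteria described in the text (assumed interpretation)."""
    # The test polypeptide must need less than twice the immunogen amount.
    needs_less_than_twice = test_ic50 < 2.0 * immunogen_ic50
    # Its cross-reactivity must exceed the control's by the chosen fold.
    test_cr = percent_cross_reactivity(immunogen_ic50, test_ic50)
    control_cr = percent_cross_reactivity(immunogen_ic50, control_ic50)
    return needs_less_than_twice and test_cr >= min_fold_over_control * control_cr
```

All amounts must be in the same units; the values themselves would come from the concentration series described above.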

As an additional determination of specificity, the pooled antisera are optionally fully immunoadsorbed with the immunogenic polypeptide(s) (rather than the control polypeptides) until little or no binding of the resulting immunogenic-polypeptide-subtracted pooled antisera to the immunogenic polypeptide(s) used in the immunoadsorption is detectable. The fully immunoadsorbed antisera are then tested for reactivity with the test polypeptide. If little or no reactivity is observed (i.e., no more than 2 fold the signal-to-noise ratio observed for binding of the fully immunoadsorbed antisera to the immunogenic polypeptide), then the test polypeptide is specifically bound by the antisera elicited by the immunogenic protein.

Pharmaceutical compositions

The polypeptides or proteins of the invention (e.g., synthetases, proteins comprising one or more unnatural amino acids, etc.) are optionally employed for therapeutic uses, e.g., in combination with a suitable pharmaceutical carrier. Such compositions comprise, e.g., a therapeutically effective amount of the compound and a pharmaceutically acceptable carrier or excipient. Such a carrier or excipient includes, but is not limited to, saline, buffered saline, dextrose, water, glycerol, ethanol, and/or combinations thereof. The formulation is made to suit the mode of administration. In general, methods of administering proteins are well known in the art and can be applied to administration of the polypeptides of the invention.

Therapeutic compositions comprising one or more polypeptides of the invention are optionally tested in one or more appropriate in vitro and/or in vivo animal models of disease to confirm efficacy and tissue metabolism and to estimate dosages, according to methods well known in the art. In particular, dosages can be determined initially by activity, stability, or other suitable measures comparing the unnatural amino acid form with its natural amino acid homologue (e.g., comparison of an EPO modified to include one or more unnatural amino acids with a natural amino acid EPO), i.e., in a relevant assay.

Administration is by any of the routes normally used for introducing a molecule into ultimate contact with blood or tissue cells. The unnatural amino acid polypeptides of the invention are administered in any suitable manner, optionally with one or more pharmaceutically acceptable carriers. Suitable methods of administering such polypeptides to a patient in the context of the present invention are available, and, although more than one route can be used to administer a particular composition, a particular route can often provide a more immediate and more effective action or reaction than another route.

Pharmaceutically acceptable carriers are determined in part by the particular composition being administered, as well as by the particular method used to administer the composition. Accordingly, there is a wide variety of suitable formulations of the pharmaceutical compositions of the present invention.

Polypeptide compositions can be administered by a number of routes including, but not limited to, oral, intravenous, intraperitoneal, intramuscular, transdermal, subcutaneous, topical, sublingual, or rectal means. Unnatural amino acid polypeptide compositions can also be administered via liposomes. Such administration routes and appropriate formulations are generally known to those of skill in the art.

The unnatural amino acid polypeptide, alone or in combination with other suitable components, can also be made into aerosol formulations (i.e., they can be "nebulized") to be administered via inhalation. Aerosol formulations can be placed into pressurized acceptable propellants, such as dichlorodifluoromethane, propane, nitrogen, and the like.

Formulations suitable for parenteral administration, such as, for example, by intraarticular (in the joints), intravenous, intramuscular, intradermal, intraperitoneal, and subcutaneous routes, include aqueous and non-aqueous isotonic sterile injection solutions, which can contain antioxidants, buffers, bacteriostats, and solutes that render the formulation isotonic with the blood of the intended recipient, and aqueous and non-aqueous sterile suspensions that can include suspending agents, solubilizers, thickening agents, stabilizers, and preservatives. The formulations of packaged nucleic acid can be presented in unit-dose or multi-dose sealed containers, such as ampoules and vials.

Parenteral administration and intravenous administration are preferred methods of administration. In particular, the routes of administration already in use for natural amino acid homologue therapeutics (e.g., those typically used for EPO, GCSF, GMCSF, IFNs, interleukins, antibodies, and/or any other pharmaceutically delivered protein), along with formulations in current use, provide preferred routes of administration and formulations for the proteins that include unnatural amino acids of the invention (e.g., pegylated variants of current therapeutic proteins, etc.).

The dose administered to a patient, in the context of the present invention, is sufficient to effect a beneficial therapeutic response in the patient over time or, e.g., to inhibit infection by a pathogen, or to provide other appropriate activity, depending on the application. The dose is determined by the efficacy of the particular composition/formulation and the activity, stability, or serum half-life of the unnatural amino acid polypeptide employed, by the condition of the patient, and by the body weight or surface area of the patient to be treated. The size of the dose is also determined by the existence, nature, and extent of any adverse side effects that accompany the administration of a particular composition/formulation, or the like, in a particular patient.

In determining the effective amount of the composition/formulation to be administered in the treatment or prophylaxis of disease (e.g., cancers, inherited diseases, diabetes, AIDS, or the like), the physician evaluates circulating plasma levels, formulation toxicities, progression of the disease, and/or, where relevant, the production of antibodies against the unnatural amino acid polypeptide.

The dose administered, e.g., to a 70 kilogram patient, is typically in the range equivalent to dosages of currently used therapeutic proteins, adjusted for the altered activity or serum half-life of the relevant composition. The compositions/formulations of this invention can supplement treatment of a condition by any known conventional therapy, including antibody administration, vaccine administration, and administration of cytotoxic agents, natural amino acid polypeptides, nucleic acids, nucleic acid analogues, biological response modifiers, and the like.

For administration, formulations of the present invention are administered at a rate determined by the LD-50 of the relevant formulation and/or by observation of any side effects of the unnatural amino acids at various concentrations, e.g., as applied to the mass and overall health of the patient. Administration can be accomplished via single or divided doses.

If a patient undergoing infusion of a formulation develops fever, chills, or muscle aches, he or she receives the appropriate dose of aspirin, ibuprofen, acetaminophen, or another pain/fever-controlling drug. Patients who experience reactions to the infusion, such as fever, muscle aches, and chills, are premedicated 30 minutes before subsequent infusions with aspirin, acetaminophen, or, e.g., diphenhydramine. Pethidine (meperidine) is used for more severe chills and muscle aches that do not respond quickly to antipyretics and antihistamines. Treatment is slowed or discontinued depending on the severity of the reaction.

Nucleic acid and polypeptide sequences and variants

As described above and below, the invention provides nucleic acid polynucleotide sequences and polypeptide amino acid sequences, e.g., O-tRNAs and O-RSs, and, e.g., compositions and methods comprising said sequences. Examples of said sequences, e.g., O-tRNAs and O-RSs, are disclosed herein (see Table 5, e.g., SEQ ID NO.: 3-65 and 86, other than SEQ ID NO.: 1 and 2). However, one of skill in the art will appreciate that the invention is not limited to the sequences disclosed herein, e.g., in the Examples and Table 5. One of skill will appreciate that the invention also provides many related and even unrelated sequences with the functions described herein, e.g., encoding an O-tRNA or an O-RS.

The invention also provides polypeptides (O-RSs) and polynucleotides, e.g., O-tRNAs, polynucleotides that encode O-RSs or portions thereof (e.g., the active site of the synthetase), oligonucleotides used to construct aminoacyl-tRNA synthetase mutants, etc. For example, a polypeptide of the invention includes: a polypeptide that comprises an amino acid sequence as set forth in any one of SEQ ID NO.: 36-63 (e.g., 36-47, 48-63, or any other subset of 36-63) and/or 86; a polypeptide that comprises an amino acid sequence encoded by a polynucleotide sequence as set forth in any one of SEQ ID NO.: 3-35 (e.g., 3-19, 20-35, or any other subset of 3-35); and a polypeptide that is specifically immunoreactive with an antibody specific for a polypeptide that comprises an amino acid sequence as set forth in any one of SEQ ID NO.: 36-63 and/or 86, or for a polypeptide that comprises an amino acid sequence encoded by a polynucleotide sequence as set forth in any one of SEQ ID NO.: 3-35 (e.g., 3-19, 20-35, or any other subset of 3-35).

Polypeptides of the invention also include a polypeptide that comprises an amino acid sequence at least 90% identical to that of a naturally occurring tyrosyl aminoacyl-tRNA synthetase (TyrRS) (e.g., SEQ ID NO.: 2) and that comprises two or more amino acids of Groups A-E. Group A includes valine, isoleucine, leucine, glycine, serine, alanine, or threonine at a position corresponding to Tyr37 of E. coli TyrRS. Group B includes aspartic acid at a position corresponding to Asn126 of E. coli TyrRS. Group C includes threonine, serine, arginine, asparagine, or glycine at a position corresponding to Asp182 of E. coli TyrRS. Group D includes methionine, alanine, valine, or tyrosine at a position corresponding to Phe183 of E. coli TyrRS. Group E includes serine, methionine, valine, cysteine, threonine, or alanine at a position corresponding to Leu186 of E. coli TyrRS. Any subset of combinations of these groups is also a feature of the invention. For example, in one embodiment, the O-RS has two or more amino acids selected from: valine, isoleucine, leucine, or threonine at a position corresponding to Tyr37 of E. coli TyrRS; threonine, serine, arginine, or glycine at a position corresponding to Asp182; methionine or tyrosine at a position corresponding to Phe183; and serine or alanine at a position corresponding to Leu186. In another embodiment, the O-RS includes two or more amino acids selected from: glycine, serine, or alanine at a position corresponding to Tyr37 of E. coli TyrRS; aspartic acid at a position corresponding to Asn126; asparagine at a position corresponding to Asp182; alanine or valine at a position corresponding to Phe183; and/or methionine, valine, cysteine, or threonine at a position corresponding to Leu186.
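The Group A-E definitions lend themselves to a lookup table. The following minimal sketch checks which groups a candidate synthetase satisfies, given the residues found at the positions corresponding to E. coli TyrRS positions 37, 126, 182, 183, and 186. It assumes the alignment to E. coli TyrRS has already been done, and it does not test the separate 90% identity requirement; all names are illustrative assumptions.

```python
# Allowed residues (one-letter codes) per group, keyed by the
# corresponding E. coli TyrRS position, as listed in the text.
GROUPS = {
    "A": (37, set("VILGSAT")),   # replaces Tyr37
    "B": (126, set("D")),        # replaces Asn126
    "C": (182, set("TSRNG")),    # replaces Asp182
    "D": (183, set("MAVY")),     # replaces Phe183
    "E": (186, set("SMVCTA")),   # replaces Leu186
}


def groups_satisfied(residues_by_position):
    """Return the sorted names of the groups whose substitution is present.

    residues_by_position maps an E. coli TyrRS-equivalent position to the
    one-letter residue observed there in the candidate sequence.
    """
    return sorted(
        name
        for name, (position, allowed) in GROUPS.items()
        if residues_by_position.get(position) in allowed
    )


def has_two_or_more_group_substitutions(residues_by_position):
    """True when at least two of Groups A-E are satisfied."""
    return len(groups_satisfied(residues_by_position)) >= 2
```

For instance, a candidate carrying valine at position 37 and alanine at position 186 satisfies Groups A and E, whereas the wild-type residues (Y, N, D, F, L) satisfy none.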

Similarly, polypeptides of the invention also include a polypeptide that comprises at least 20 contiguous amino acids of SEQ ID NO.: 36-63 (e.g., 36-47, 48-63, or any other subset of 36-63) and/or 86, and two or more amino acid substitutions as indicated above in Groups A-E. See also Tables 4, 6, and/or 8 herein. An amino acid sequence comprising a conservative variation of any of the above polypeptides is also included as a polypeptide of the invention.

In one embodiment, a composition includes a polypeptide of the invention and an excipient (e.g., a buffer, water, a pharmaceutically acceptable excipient, etc.). The invention also provides an antibody or antiserum that is specifically immunoreactive with a polypeptide of the invention.

The invention also provides polynucleotides. Polynucleotides of the invention include those that encode a protein or polypeptide of interest of the invention, those that include one or more selector codons, or both. For example, polynucleotides of the invention include: a polynucleotide comprising a nucleotide sequence as set forth in any one of SEQ ID NO.: 3-35 (e.g., 3-19, 20-35, or any other subset of 3-35) or 64-85; a polynucleotide that is complementary to, or that encodes, such a polynucleotide sequence; and/or a polynucleotide encoding a polypeptide that comprises an amino acid sequence as set forth in any one of SEQ ID NO.: 36-63 and/or 86, or a conservative variation thereof. A polynucleotide of the invention also includes a polynucleotide that encodes a polypeptide of the invention. Similarly, a nucleic acid that hybridizes to a polynucleotide indicated above under highly stringent conditions over substantially the entire length of the nucleic acid is a polynucleotide of the invention.

A polynucleotide of the invention also includes a polynucleotide that encodes a polypeptide comprising an amino acid sequence that is at least 90% identical to that of a naturally occurring tyrosyl aminoacyl-tRNA synthetase (TyrRS) (e.g., SEQ ID NO.: 2) and comprising two or more mutations as indicated above in Groups A-E. A polynucleotide that is at least 70% (or at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 98%, or at least 99% or more) identical to a polynucleotide indicated above, and/or to a polynucleotide comprising a conservative variation of any of the polynucleotides indicated above, is also included among the polynucleotides of the invention. See also Table 4, Table 6, and/or Table 8 herein.

In certain embodiments, a vector (e.g., a plasmid, a cosmid, a phage, a virus, etc.) comprises a polynucleotide of the invention. In one embodiment, the vector is an expression vector. In another embodiment, the expression vector includes a promoter operably linked to one or more of the polynucleotides of the invention. In another embodiment, a cell comprises a vector that includes a polynucleotide of the invention.

One of skill will also appreciate that many variants of the disclosed sequences are included in the invention. For example, conservative variations of the disclosed sequences that yield a functionally identical sequence are included in the invention. Variants of the nucleic acid polynucleotide sequences that hybridize to at least one disclosed sequence are considered to be included in the invention. Unique subsequences of the sequences disclosed herein, as determined by, e.g., standard sequence comparison techniques, are also included in the invention.

Conservative variations

Owing to the degeneracy of the genetic code, "silent substitutions" (i.e., substitutions in a nucleic acid sequence that do not result in an alteration of the encoded polypeptide) are an implied feature of every nucleic acid sequence that encodes an amino acid. Similarly, "conservative amino acid substitutions," in which one or a few amino acids in an amino acid sequence are replaced with different amino acids having highly similar properties, are also readily identified as being highly similar to a disclosed construct. Such conservative variations of each disclosed sequence are a feature of the present invention.

"Conservative variations" of a particular nucleic acid sequence refers to those nucleic acids that encode identical or essentially identical amino acid sequences or, where the nucleic acid does not encode an amino acid sequence, to essentially identical sequences. One of skill will recognize that individual substitutions, deletions, or additions that alter, add, or delete a single amino acid or a small percentage of amino acids (typically less than 5%, more typically less than 4%, 2%, or 1%) in an encoded sequence are "conservatively modified variations" where the alterations result in the deletion of an amino acid, the addition of an amino acid, or the substitution of an amino acid with a chemically similar amino acid. Thus, "conservative variations" of a listed polypeptide sequence of the present invention include substitutions of a small percentage, typically less than 5%, more typically less than 2% or 1%, of the amino acids of the polypeptide sequence with a conservatively selected amino acid of the same conservative substitution group. Finally, the addition of sequences that do not alter the encoded activity of a nucleic acid molecule, such as the addition of a non-functional sequence, is a conservative variation of the basic nucleic acid.

Conservative substitution tables providing functionally similar amino acids are well known in the art. The following sets forth example groups of natural amino acids that are "conservative substitutions" for one another.

Conservative substitution groups

Group 1: Alanine (A), Serine (S), Threonine (T)
Group 2: Aspartic acid (D), Glutamic acid (E)
Group 3: Asparagine (N), Glutamine (Q)
Group 4: Arginine (R), Lysine (K)
Group 5: Isoleucine (I), Leucine (L), Methionine (M), Valine (V)
Group 6: Phenylalanine (F), Tyrosine (Y), Tryptophan (W)
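The six substitution groups and the "less than 5%" guideline can be combined into a simple screen. The sketch below is illustrative only: it assumes two pre-aligned sequences of equal length (so it covers substitutions, not insertions or deletions), and the function and variable names are assumptions. Residues absent from the six groups (for example G, P, C, H) are treated as having no conservative replacement.

```python
# The six conservative substitution groups, as one-letter codes.
CONSERVATIVE_GROUPS = ["AST", "DE", "NQ", "RK", "ILMV", "FYW"]
GROUP_OF = {aa: index
            for index, group in enumerate(CONSERVATIVE_GROUPS)
            for aa in group}


def is_conservative_variant(reference, variant, max_fraction=0.05):
    """True when every difference is a within-group substitution and the
    fraction of substituted positions is below max_fraction."""
    if not reference or len(reference) != len(variant):
        raise ValueError("sequences must be non-empty and pre-aligned to equal length")
    changes = [(r, v) for r, v in zip(reference, variant) if r != v]
    for ref_aa, var_aa in changes:
        # A change is conservative only if both residues share a group.
        if ref_aa not in GROUP_OF or GROUP_OF[ref_aa] != GROUP_OF.get(var_aa):
            return False
    return len(changes) / len(reference) < max_fraction
```

Lowering `max_fraction` to 0.02 or 0.01 applies the narrower ranges mentioned in the text.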

Nucleic acid hybridization

Comparative hybridization can be used to identify nucleic acids of the invention, including conservative variations of nucleic acids of the invention, and this comparative hybridization method is a preferred method of distinguishing nucleic acids of the invention. In addition, target nucleic acids that hybridize to the nucleic acids represented by SEQ ID NO: 3-35 (e.g., 3-19, 20-35, or any other subset of 3-35) and 64-85 under high, ultra-high, and ultra-ultra-high stringency conditions are a feature of the invention. Examples of such nucleic acids include those with one or a few silent or conservative nucleic acid substitutions as compared to a given nucleic acid sequence.

A test nucleic acid is said to specifically hybridize to a probe nucleic acid when it hybridizes to the probe at least 1/2 as well as the perfectly matched complementary target does, i.e., with a signal-to-noise ratio at least 1/2 as high as that of hybridization of the probe to the target, under conditions in which the perfectly matched probe binds to the perfectly matched complementary target with a signal-to-noise ratio at least about 5-10 fold higher than that observed for hybridization to any of the unmatched target nucleic acids.
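The specific-hybridization criterion has two parts: the conditions must discriminate the perfect match from unmatched targets, and the test nucleic acid must reach at least half the perfect-match signal. A minimal sketch follows, with assumed function and parameter names and a default discrimination of 5 fold (the lower end of the stated 5-10 fold range).

```python
def specifically_hybridizes(test_sn, perfect_match_sn, mismatch_sn,
                            min_discrimination=5.0):
    """Apply the two-part criterion to signal-to-noise (S/N) ratios that
    were all measured under the same hybridization and wash conditions."""
    if min(test_sn, perfect_match_sn, mismatch_sn) <= 0:
        raise ValueError("signal-to-noise ratios must be positive")
    # Part 1: the conditions separate perfect match from unmatched targets.
    conditions_discriminate = perfect_match_sn >= min_discrimination * mismatch_sn
    # Part 2: the test nucleic acid gives at least half the perfect-match S/N.
    return conditions_discriminate and test_sn >= 0.5 * perfect_match_sn
```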

Nucleic acids "hybridize" when they associate, typically in solution. Nucleic acids hybridize due to a variety of well-characterized physico-chemical forces, such as hydrogen bonding, solvent exclusion, base stacking, and the like. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993) Laboratory Techniques in Biochemistry and Molecular Biology--Hybridization with Nucleic Acid Probes, Part I, Chapter 2, "Overview of principles of hybridization and the strategy of nucleic acid probe assays" (Elsevier, New York), as well as in Ausubel, supra. Hames and Higgins (1995) Gene Probes 1, IRL Press at Oxford University Press, Oxford, England (Hames and Higgins 1) and Hames and Higgins (1995) Gene Probes 2, IRL Press at Oxford University Press, Oxford, England (Hames and Higgins 2) provide details on the synthesis, labeling, detection, and quantification of DNA and RNA, including oligonucleotides.

An example of stringent hybridization conditions for hybridization of complementary nucleic acids having more than 100 complementary residues on a filter in a Southern or Northern blot is 50% formalin with 1 mg of heparin at 42°C, with the hybridization carried out overnight. An example of stringent wash conditions is a 0.2x SSC wash at 65°C for 15 minutes (see Sambrook, supra, for a description of SSC buffer). Often the high stringency wash is preceded by a low stringency wash to remove background probe signal. An example of a low stringency wash is 2x SSC at 40°C for 15 minutes. In general, a signal-to-noise ratio 5 fold (or more) higher than that observed for an unrelated probe in the particular hybridization assay indicates detection of a specific hybridization.

"Stringent hybridization wash conditions" in the context of nucleic acid hybridization experiments such as Southern and Northern hybridizations are sequence dependent and differ under different environmental parameters. An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993), supra, and in Hames and Higgins 1 and 2. Stringent hybridization and wash conditions can easily be determined empirically for any test nucleic acid. For example, in determining highly stringent hybridization and wash conditions, the stringency of the hybridization and wash conditions is gradually increased (e.g., by increasing temperature, decreasing salt concentration, increasing detergent concentration, and/or increasing the concentration of organic solvents such as formalin in the hybridization or wash) until a selected set of criteria is met. For example, the stringency of the hybridization and wash conditions is gradually increased until a probe binds to a perfectly matched complementary target with a signal-to-noise ratio at least 5 fold higher than that observed for hybridization of the probe to an unmatched target.

"Very stringent" conditions are selected to be equal to the thermal melting point (Tm) for a particular probe. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the test sequence hybridizes to a perfectly matched probe. For the purposes of the present invention, "highly stringent" hybridization and wash conditions are generally selected to be about 5°C lower than the Tm for the specific sequence at a defined ionic strength and pH.

"Ultra-high stringency" hybridization and wash conditions are those in which the stringency of the hybridization and wash conditions is increased until the signal-to-noise ratio for binding of the probe to the perfectly matched complementary target nucleic acid is at least 10-fold higher than that observed for hybridization to any unmatched target nucleic acid. A target nucleic acid that hybridizes to a probe under such conditions with a signal-to-noise ratio at least 1/2 that of the perfectly matched complementary target nucleic acid is said to bind to the probe under ultra-high stringency conditions.

Similarly, even higher levels of stringency can be determined by gradually increasing the stringency of the hybridization and/or wash conditions of the relevant hybridization assay. For example, the stringency is increased until the signal-to-noise ratio for binding of the probe to the perfectly matched complementary target nucleic acid is at least 10-fold, 20-fold, 50-fold, 100-fold, or 500-fold or more higher than that observed for hybridization to any unmatched target nucleic acid. A target nucleic acid that hybridizes to a probe under such conditions with a signal-to-noise ratio at least 1/2 that of the perfectly matched complementary target nucleic acid is said to bind to the probe under ultra-ultra-high stringency conditions.
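The signal-to-noise criteria above can be expressed as a small decision rule. The following Python sketch is illustrative only: the function names and the tier labels are assumptions introduced here, and the inputs are taken to be signal-to-noise values already measured in a hybridization assay.

```python
def stringency_tier(perfect_match_sn: float, mismatch_sn: float) -> str:
    """Label the conditions by how much better the probe binds its perfect match.

    Thresholds follow the text: at least 5-fold for highly stringent
    conditions, at least 10-fold for ultra-high stringency.
    """
    if mismatch_sn <= 0:
        raise ValueError("mismatch signal-to-noise must be positive")
    ratio = perfect_match_sn / mismatch_sn
    if ratio >= 10:
        return "ultra-high"
    if ratio >= 5:
        return "high"
    return "below high"


def binds_under_conditions(target_sn: float, perfect_match_sn: float) -> bool:
    """A target counts as bound if its signal-to-noise ratio is at least
    1/2 that of the perfectly matched complementary target."""
    return target_sn >= 0.5 * perfect_match_sn


if __name__ == "__main__":
    print(stringency_tier(50.0, 5.0))
    print(binds_under_conditions(6.0, 10.0))
```

The 20-fold to 500-fold levels mentioned in the text could be added as further tiers in the same way.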

Nucleic acids that do not hybridize to each other under stringent conditions are still substantially identical if the polypeptides they encode are substantially identical. This occurs, for example, when a copy of a nucleic acid is created using the maximum codon degeneracy permitted by the genetic code.
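As an illustration of codon degeneracy, the Python sketch below translates two short DNA sequences that differ at most positions yet encode the same peptide. The two sequences and the helper names are hypothetical examples chosen for this illustration; the codon table is the standard genetic code written in the usual TCAG order.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence whose length is a multiple of 3."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def identical_positions(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences agree."""
    return sum(1 for x, y in zip(a, b) if x == y)


if __name__ == "__main__":
    seq_a = "ATGCTGTCCCGT"  # hypothetical sequence: Met-Leu-Ser-Arg
    seq_b = "ATGTTAAGTAGA"  # same peptide, synonymous codons
    print(translate(seq_a), translate(seq_b))
    print(identical_positions(seq_a, seq_b), "of", len(seq_a), "positions identical")
```

Here the two nucleic acids share fewer than half of their nucleotides, so they would not be expected to hybridize under stringent conditions, yet the encoded polypeptides are identical.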

Unique subsequences

In one aspect, the invention provides a nucleic acid that comprises a unique subsequence in a nucleic acid selected from the sequences of the O-tRNAs and O-RSs disclosed herein. The unique subsequence is unique as compared to a nucleic acid corresponding to any known O-tRNA or O-RS nucleic acid sequence. Alignment can be performed using, for example, BLAST set to default parameters. Any unique subsequence is useful, for example, as a probe to identify the nucleic acids of the invention.

Similarly, the invention includes a polypeptide that comprises a unique subsequence in a polypeptide selected from the sequences of the O-RSs disclosed herein. Here, the unique subsequence is unique as compared to a polypeptide corresponding to any known polypeptide sequence.

The invention also provides target nucleic acids that hybridize under stringent conditions to a unique coding oligonucleotide that encodes a unique subsequence in a polypeptide selected from the sequences of the O-RSs, wherein the unique subsequence is unique as compared to a polypeptide corresponding to any of the control polypeptides (e.g., parental sequences from which synthetases of the invention were derived, e.g., by mutation). Unique sequences are determined as noted above.

Sequence comparison, identity, and homology

The terms "identical" or percent "identity," in the context of two or more nucleic acid or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using one of the sequence comparison algorithms described below (or other algorithms available to persons of skill) or by visual inspection.

The phrase "substantially identical," in the context of two nucleic acids or polypeptides (e.g., DNAs encoding an O-tRNA or O-RS, or the amino acid sequence of an O-RS), refers to two or more sequences or subsequences that have at least about 60%, preferably 80%, most preferably 90-95% nucleotide or amino acid residue identity, when compared and aligned for maximum correspondence, as measured using a sequence comparison algorithm or by visual inspection. Such "substantially identical" sequences are typically considered to be "homologous," without reference to actual ancestry. Preferably, the "substantial identity" exists over a region of the sequences that is at least about 50 residues in length, more preferably over a region of at least about 100 residues, and most preferably the sequences are substantially identical over at least about 150 residues, or over the full length of the two sequences to be compared.

For sequence comparison and homology determination, typically one sequence acts as a reference sequence to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.

Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman and Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson and Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by visual inspection (see generally Ausubel et al., infra).
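For orientation, a local alignment score of the Smith and Waterman type can be computed with a short dynamic program. The sketch below is a simplified illustration, not the implementation found in the software packages named above: it uses a linear gap penalty, and the scoring values (match +2, mismatch -1, gap -2) are arbitrary assumptions chosen for the example.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b."""
    prev = [0] * (len(b) + 1)  # scores for the previous row of the matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Only two rows of the scoring matrix are kept because the sketch reports the score alone; recovering the aligned residues would require storing the full matrix and tracing back from the best cell.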

One example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul et al., J. Mol. Biol. 215:403-410 (1990). Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff and Henikoff (1989) Proc. Natl. Acad. Sci. USA 89:10915).
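The seed-and-extend idea described above can be sketched as follows. This Python example is a simplified illustration rather than the BLAST program itself: it seeds only on exact word matches (no neighborhood words or threshold T), extends without gaps and only to the right, and the value used for X is an assumption. M=5 and N=-4 are the BLASTN defaults quoted in the text.

```python
def find_seeds(query: str, subject: str, word_len: int = 11) -> list[tuple[int, int]]:
    """Return (query_pos, subject_pos) pairs where a word of length W matches exactly."""
    index: dict[str, list[int]] = {}
    for j in range(len(subject) - word_len + 1):
        index.setdefault(subject[j:j + word_len], []).append(j)
    return [(i, j)
            for i in range(len(query) - word_len + 1)
            for j in index.get(query[i:i + word_len], [])]


def extend_right(query: str, subject: str, q_start: int, s_start: int,
                 word_len: int, reward: int = 5, penalty: int = -4,
                 x_drop: int = 20) -> tuple[int, int]:
    """Extend a seed to the right; return (best score, length at best score)."""
    score = word_len * reward  # an exact seed scores M per residue
    best_score, best_len = score, word_len
    i = word_len
    # Stop condition 3: the end of either sequence is reached.
    while q_start + i < len(query) and s_start + i < len(subject):
        score += reward if query[q_start + i] == subject[s_start + i] else penalty
        if score > best_score:
            best_score, best_len = score, i + 1
        elif best_score - score >= x_drop or score <= 0:
            # Stop conditions 1 and 2: score fell by X from its maximum, or hit zero.
            break
        i += 1
    return best_score, best_len


if __name__ == "__main__":
    q, s = "GATTACAGATTACA", "GATTACAGATCCCC"
    print(find_seeds(q, s, word_len=7))
    print(extend_right(q, s, 0, 0, word_len=7))
```

A full implementation would also extend each seed to the left and then evaluate the statistical significance of the resulting HSP.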

In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5787 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.

Mutagenesis and other molecular biology techniques

General texts that describe molecular biological techniques include Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152, Academic Press, Inc., San Diego, CA ("Berger"); Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd Ed.), Vols. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989 ("Sambrook"); and Current Protocols in Molecular Biology, F.M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc. (supplemented through 1999) ("Ausubel"). These texts describe mutagenesis, the use of vectors, promoters, and many other relevant topics related to, e.g., the generation of genes that include selector codons for production of proteins that include unnatural amino acids, orthogonal tRNAs, orthogonal synthetases, and pairs thereof.

Various types of mutagenesis are used in the invention, e.g., to produce libraries of tRNAs, to produce libraries of synthetases, and to insert selector codons that encode unnatural amino acids into a protein or polypeptide of interest. They include, but are not limited to, site-directed mutagenesis, random point mutagenesis, homologous recombination, DNA shuffling or other recursive mutagenesis methods, chimeric construction, mutagenesis using uracil-containing templates, oligonucleotide-directed mutagenesis, phosphorothioate-modified DNA mutagenesis, mutagenesis using gapped duplex DNA, and the like, or any combination thereof. Additional suitable methods include point mismatch repair, mutagenesis using repair-deficient host strains, restriction-selection and restriction-purification, deletion mutagenesis, mutagenesis by total gene synthesis, double-strand break repair, and the like. Mutagenesis, e.g., involving chimeric constructs, is also included in the present invention. In one embodiment, mutagenesis can be guided by known information about the naturally occurring molecule or an altered or mutated naturally occurring molecule, e.g., sequence, sequence comparisons, physical properties, crystal structure, and the like.

The above text and the examples found herein describe these procedures. Additional information is found in the following publications and the references cited within: Ling et al., Approaches to DNA mutagenesis: an overview, Anal. Biochem. 254(2):157-178 (1997); Dale et al., Oligonucleotide-directed random mutagenesis using the phosphorothioate method, Methods Mol. Biol. 57:369-374 (1996); Smith, In vitro mutagenesis, Ann. Rev. Genet. 19:423-462 (1985); Botstein and Shortle, Strategies and applications of in vitro mutagenesis, Science 229:1193-1201 (1985); Carter, Site-directed mutagenesis, Biochem. J. 237:1-7 (1986); Kunkel, The efficiency of oligonucleotide directed mutagenesis, in Nucleic Acids & Molecular Biology (Eckstein, F. and Lilley, D.M.J., eds., Springer Verlag, Berlin) (1987); Kunkel, Rapid and efficient site-specific mutagenesis without phenotypic selection, Proc. Natl. Acad. Sci. USA 82:488-492 (1985); Kunkel et al., Rapid and efficient site-specific mutagenesis without phenotypic selection, Methods in Enzymol. 154, 367-382 (1987); Bass et al., Mutant Trp repressors with new DNA-binding specificities, Science 242:240-245 (1988); Methods in Enzymol. 100:468-500 (1983); Methods in Enzymol. 154:329-350 (1987); Zoller and Smith, Oligonucleotide-directed mutagenesis using M13-derived vectors: an efficient and general procedure for the production of point mutations in any DNA fragment, Nucleic Acids Res. 10:6487-6500 (1982); Zoller and Smith, Oligonucleotide-directed mutagenesis of DNA fragments cloned into M13 vectors, Methods in Enzymol. 100:468-500 (1983); Zoller and Smith, Oligonucleotide-directed mutagenesis: a simple method using two oligonucleotide primers and a single-stranded DNA template, Methods in Enzymol. 154:329-350 (1987); Taylor et al., The use of phosphorothioate-modified DNA in restriction enzyme reactions to prepare nicked DNA, Nucl. Acids Res. 13:8749-8764 (1985); Taylor et al., The rapid generation of oligonucleotide-directed mutations at high frequency using phosphorothioate-modified DNA, Nucl. Acids Res. 13:8765-8787 (1985); Nakamaye and Eckstein, Inhibition of restriction endonuclease Nci I cleavage by phosphorothioate groups and its application to oligonucleotide-directed mutagenesis, Nucl. Acids Res. 14:9679-9698 (1986); Sayers et al., Y-T Exonucleases in phosphorothioate-based oligonucleotide-directed mutagenesis, Nucl. Acids Res. 16:791-802 (1988); Sayers et al., Strand specific cleavage of phosphorothioate-containing DNA by reaction with restriction endonucleases in the presence of ethidium bromide, Nucl. Acids Res. 16:803-814 (1988); Kramer et al., The gapped duplex DNA approach to oligonucleotide-directed mutation construction, Nucl. Acids Res. 12:9441-9456 (1984); Kramer and Fritz, Oligonucleotide-directed construction of mutations via gapped duplex DNA, Methods in Enzymol. 154:350-367 (1987); Kramer et al., Improved enzymatic in vitro reactions in the gapped duplex DNA approach to oligonucleotide-directed construction of mutations, Nucl. Acids Res. 16:7207 (1988); Fritz et al., Oligonucleotide-directed construction of mutations: a gapped duplex DNA procedure without enzymatic reactions in vitro, Nucl. Acids Res. 16:6987-6999 (1988); Kramer et al., Point Mismatch Repair, Cell 38:879-887 (1984); Carter et al., Improved oligonucleotide site-directed mutagenesis using M13 vectors, Nucl. Acids Res. 13:4431-4443 (1985); Carter, Improved oligonucleotide-directed mutagenesis using M13 vectors, Methods in Enzymol. 154:382-403 (1987); Eghtedarzadeh and Henikoff, Use of oligonucleotides to generate large deletions, Nucl. Acids Res. 14:5115 (1986); Wells et al., Importance of hydrogen-bond formation in stabilizing the transition state of subtilisin, Phil. Trans. R. Soc. Lond. A 317:415-423 (1986); Nambiar et al., Total synthesis and cloning of a gene coding for the ribonuclease S protein, Science 223:1299-1301 (1984); Sakamar and Khorana, Total synthesis and expression of a gene for the α-subunit of bovine rod outer segment guanine nucleotide-binding protein (transducin), Nucl. Acids Res. 14:6361-6372 (1988); Wells et al., Cassette mutagenesis: an efficient method for generation of multiple mutations at defined sites, Gene 34:315-323 (1985); Grundström et al., Oligonucleotide-directed mutagenesis by microscale 'shot-gun' gene synthesis, Nucl. Acids Res. 13:3305-3316 (1985); Mandecki, Oligonucleotide-directed double-strand break repair in plasmids of Escherichia coli: a method for site-specific mutagenesis, Proc. Natl. Acad. Sci. USA 83:7177-7181 (1986); Arnold, Protein engineering for unusual environments, Current Opinion in Biotechnology 4:450-455 (1993); Sieber et al., Nature Biotechnology 19:456-460 (2001); W.P.C. Stemmer, Nature 370, 389-91 (1994); and I.A. Lorimer, I. Pastan, Nucleic Acids Res. 23, 3067-8 (1995). Additional details on many of the above methods can be found in Methods in Enzymology Volume 154, which also describes useful measures for troubleshooting problems with various mutagenesis methods.

The invention also relates to eukaryotic host cells and organisms for the in vivo incorporation of an unnatural amino acid via orthogonal tRNA/RS pairs. Host cells are genetically engineered (e.g., transformed, transduced or transfected) with the polynucleotides of the invention or constructs that include a polynucleotide of the invention, e.g., a vector of the invention, which can be, for example, a cloning vector or an expression vector. The vector can be, for example, in the form of a plasmid, a bacterium, a virus, a naked polynucleotide, or a conjugated polynucleotide. The vectors are introduced into cells and/or microorganisms by standard methods including electroporation (From et al., Proc. Natl. Acad. Sci. USA 82, 5824 (1985)), infection by viral vectors, and high velocity ballistic penetration by small particles with the nucleic acid either within the matrix of small beads or particles, or on the surface (Klein et al., Nature 327, 70-73 (1987)).

The engineered host cells can be cultured in conventional nutrient media modified as appropriate for such activities as, for example, screening steps, activating promoters or selecting transformants. These cells can optionally be cultured into transgenic organisms. Other useful references, e.g., for cell isolation and culture (e.g., for subsequent nucleic acid isolation), include Freshney (1994) Culture of Animal Cells, a Manual of Basic Technique, third edition, Wiley-Liss, New York, and the references cited therein; Payne et al. (1992) Plant Cell and Tissue Culture in Liquid Systems, John Wiley & Sons, Inc., New York, NY; Gamborg and Phillips (eds.) (1995) Plant Cell, Tissue and Organ Culture; Fundamental Methods Springer Lab Manual, Springer-Verlag (Berlin Heidelberg New York); and Atlas and Parks (eds.) The Handbook of Microbiological Media (1993) CRC Press, Boca Raton, FL.

Several well-known methods of introducing target nucleic acids into cells are available, any of which can be used in the invention. These include: fusion of the recipient cells with bacterial protoplasts containing the DNA, electroporation, projectile bombardment, and infection with viral vectors (discussed further below), etc. Bacterial cells can be used to amplify the number of plasmids containing DNA constructs of this invention. The bacteria are grown to log phase, and the plasmids within the bacteria can be isolated by a variety of methods known in the art (see, for instance, Sambrook). In addition, many kits are commercially available for the purification of plasmids from bacteria (see, e.g., EasyPrep™ and FlexiPrep™, both from Pharmacia Biotech; StrataClean™ from Stratagene; and QIAprep™ from Qiagen). The isolated and purified plasmids are then further manipulated to produce other plasmids, used to transfect cells, or incorporated into related vectors to infect organisms. Typical vectors contain transcription and translation terminators, transcription and translation initiation sequences, and promoters useful for regulation of the expression of the particular target nucleic acid. The vectors optionally comprise generic expression cassettes containing at least one independent terminator sequence, sequences permitting replication of the cassette in eukaryotes, or prokaryotes, or both (e.g., shuttle vectors), and selection markers for both prokaryotic and eukaryotic systems. Vectors are suitable for replication and integration in prokaryotes, eukaryotes, or preferably both. See Giliman and Smith, Gene 8:81 (1979); Roberts et al., Nature 328:731 (1987); Schneider, B., et al., Protein Expr. Purif. 6435:10 (1995); Ausubel, Sambrook, Berger (all supra). A catalogue of bacteria and bacteriophages useful for cloning is provided, e.g., by the ATCC, e.g., The ATCC Catalogue of Bacteria and Bacteriophage (1992), Gherna et al. (eds.), published by the ATCC. Additional basic procedures for sequencing, cloning and other aspects of molecular biology, and the underlying theoretical considerations, are also found in Watson et al. (1992) Recombinant DNA, Second Edition, Scientific American Books, NY. In addition, essentially any nucleic acid (and virtually any labeled nucleic acid, whether standard or non-standard) can be custom or standard ordered from any of a variety of commercial sources, such as the Midland Certified Reagent Company (Midland, TX; mcrc.com), The Great American Gene Company (Ramona, CA; available on the World Wide Web at genco.com), ExpressGen Inc. (Chicago, IL; available on the World Wide Web at expressgen.com), Operon Technologies Inc. (Alameda, CA), and many others.

Kits

Kits are also a feature of the invention. For example, a kit for producing a protein that comprises at least one unnatural amino acid in a cell is provided, where the kit includes a container containing a polynucleotide sequence encoding an O-tRNA, and/or an O-tRNA, and/or a polynucleotide sequence encoding an O-RS, and/or an O-RS. In one embodiment, the kit further includes at least one unnatural amino acid. In another embodiment, the kit further comprises instructional materials for producing the protein.

Examples

The following examples are offered to illustrate, but not to limit, the claimed invention. Those skilled in the art will recognize that a variety of non-critical parameters can be altered without departing from the scope of the claimed invention.

Example 1: Methods and compositions for producing aminoacyl-tRNA synthetases that incorporate unnatural amino acids in eukaryotic cells

Expanding the eukaryotic genetic code to include unnatural amino acids with novel physical, chemical or biological properties provides powerful tools for analyzing and controlling protein function in these cells. Toward this goal, an approach is described for isolating, in Saccharomyces cerevisiae (S. cerevisiae), aminoacyl-tRNA synthetases that incorporate unnatural amino acids into proteins with high fidelity in response to an amber codon. The method is based on activation of the GAL4-responsive reporter genes HIS3, URA3 or lacZ by suppression of amber codons placed between the DNA binding domain and the transcriptional activation domain of GAL4. The optimization of a GAL4 reporter for positive selection of active E. coli tyrosyl-tRNA synthetase (EcTyrRS) variants is described. A negative selection of inactive EcTyrRS variants with the URA3 reporter has also been developed, using a small molecule (5-fluoroorotic acid (5-FOA)) added to the growth medium as a 'toxic allele'. Importantly, both positive and negative selections can be performed on a single cell and with a range of stringencies. This can facilitate the isolation of a range of aminoacyl-tRNA synthetase (aaRS) activities from large libraries of mutant synthetases. Model selections demonstrate the power of the method for isolating desired aaRS phenotypes.

Recently, the addition of unnatural amino acids to the genetic code of Escherichia coli (E. coli) has provided powerful new approaches for analyzing and manipulating protein structure and function both in vitro and in vivo. Amino acids bearing photoaffinity labels, heavy atoms, keto and olefinic groups, and chromophores have been incorporated into proteins in E. coli with an efficiency and fidelity rivaling those of the common twenty amino acids. See, e.g., Chin, et al., (2002), Addition of a Photocrosslinker to the Genetic Code of Escherichia coli, Proc. Natl. Acad. Sci. U.S.A. 99:11020-11024; Chin and Schultz, (2002), In vivo Photocrosslinking with Unnatural Amino Acid Mutagenesis, ChemBioChem 11:1135-1137; Chin et al., (2002), Addition of p-Azido-L-phenylalanine to the Genetic code of Escherichia coli, J. Am. Chem. Soc. 124:9026-9027; Zhang et al., (2002), The selective incorporation of alkenes into proteins in Escherichia coli, Angew. Chem. Int. Ed. Engl. 41:2840-2842; and Wang and Schultz, (2002), Expanding the Genetic Code, Chem. Comm. 1-10.

Previously, unnatural amino acids have been introduced into the nicotinic acetylcholine receptor in Xenopus oocytes by microinjection of a chemically misacylated Tetrahymena thermophila tRNA (e.g., M.E. Saks, et al. (1996), An engineered Tetrahymena tRNAGln for in vivo incorporation of unnatural amino acids into proteins by nonsense suppression, J. Biol. Chem. 271:23169-23175) and the relevant mRNA (e.g., M.W. Nowak, et al. (1998), In vivo incorporation of unnatural amino acids into ion channels in Xenopus oocyte expression system, Method Enzymol. 293:504-529). This has allowed detailed biophysical studies of the receptor in oocytes through the introduction of amino acids whose side chains have unique physical or chemical properties. See, e.g., D.A. Dougherty (2000), Unnatural amino acids as probes of protein structure and function, Curr. Opin. Chem. Biol. 4:645-652. Unfortunately, this method is limited to proteins in cells that can be microinjected, and, because the tRNA is chemically acylated in vitro and cannot be re-acylated, protein yields are very low. This in turn necessitates sensitive techniques for assaying protein function.

There is interest in the genetic incorporation of unnatural amino acids into proteins in eukaryotic cells in response to an amber codon. See also, H.J. Drabkin et al., (1996), Amber suppression in mammalian cells dependent upon expression of an Escherichia coli aminoacyl-tRNA synthetase gene, Molecular & Cellular Biology 16:907-913; A.K. Kowal, et al., (2001), Twenty-first aminoacyl-tRNA synthetase-suppressor tRNA pairs for possible use in site-specific incorporation of amino acid analogues into proteins in eukaryotes and in eubacteria, [comment], Proc. Natl. Acad. Sci. U.S.A. 98:2268-2273; and K. Sakamoto, et al., (2002), Site-specific incorporation of an unnatural amino acid into proteins in mammalian cells, Nucleic Acids Res. 30:4692-4699. This would have significant technical and practical advantages, because the tRNA would be re-acylated by its cognate synthetase, leading to the production of large amounts of mutant protein. Moreover, genetically encoded aminoacyl-tRNA synthetases and tRNAs are, in principle, heritable, allowing the unnatural amino acid to be incorporated into proteins through many cell divisions without exponential dilution.

The steps necessary to add a new amino acid to the genetic code of E. coli have been described (see, e.g., D.R. Liu and P.G. Schultz, (1999), Progress toward the evolution of an organism with an expanded genetic code, Proc. Natl. Acad. Sci. U.S.A. 96:4780-4785), and similar principles can be used to expand the genetic code of eukaryotes. In the first step, an orthogonal aminoacyl-tRNA synthetase (aaRS)/tRNACUA pair is identified. This pair must function with the translational machinery of the host cell, but the aaRS should not charge any endogenous tRNA with an amino acid, and the tRNACUA should not be aminoacylated by any endogenous synthetase. See, e.g., D.R. Liu, et al., (1997), Engineering a tRNA and aminoacyl-tRNA synthetase for the site-specific incorporation of unnatural amino acids into proteins in vivo, Proc. Natl. Acad. Sci. U.S.A. 94:10092-10097. In the second step, those aaRS/tRNA pairs that are capable of using only the unnatural amino acid are selected from a library of mutant aaRSs. In E. coli, variants of MjTyrRS that utilize unnatural amino acids were selected using a two-step 'double sieve' selection. See, e.g., D.R. Liu and P.G. Schultz, (1999), Progress toward the evolution of an organism with an expanded genetic code, Proc. Natl. Acad. Sci. U.S.A. 96:4780-4785. Modified selection methods are used in eukaryotic cells.
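The two orthogonality requirements stated above can be written as a simple predicate. The following Python sketch is illustrative only: the synthetase and tRNA names, and the table of which synthetase charges which tRNA, are hypothetical placeholders and not experimental data from this disclosure.

```python
from typing import Iterable, Set, Tuple

# A (synthetase, tRNA) pair is listed when that synthetase aminoacylates that tRNA.
Charge = Tuple[str, str]


def is_orthogonal(o_rs: str, o_trna: str, charges: Set[Charge],
                  host_synthetases: Iterable[str],
                  host_trnas: Iterable[str]) -> bool:
    """Return True if the candidate pair meets the criteria in the text."""
    # The candidate synthetase must charge its own suppressor tRNA.
    pair_works = (o_rs, o_trna) in charges
    # The candidate synthetase must not charge any endogenous tRNA.
    rs_ignores_host = all((o_rs, t) not in charges for t in host_trnas)
    # No endogenous synthetase may aminoacylate the candidate tRNA.
    trna_ignored_by_host = all((s, o_trna) not in charges
                               for s in host_synthetases)
    return pair_works and rs_ignores_host and trna_ignored_by_host


# Hypothetical host and candidate pairs, for illustration only.
HOST_RS = ["host_rs_1", "host_rs_2"]
HOST_TRNA = ["host_trna_1", "host_trna_2"]
CHARGES = {
    ("host_rs_1", "host_trna_1"),
    ("host_rs_2", "host_trna_2"),
    ("rs_A", "trna_A"),
    ("rs_B", "trna_B"),
    ("host_rs_1", "trna_B"),  # cross-reaction: a host enzyme charges trna_B
}

print(is_orthogonal("rs_A", "trna_A", CHARGES, HOST_RS, HOST_TRNA))
print(is_orthogonal("rs_B", "trna_B", CHARGES, HOST_RS, HOST_TRNA))
```

In this toy table, pair A satisfies both requirements, whereas pair B fails because a host synthetase aminoacylates its tRNA.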

Saccharomyces cerevisiae (S. cerevisiae) was chosen as the eukaryotic host organism because it is unicellular, has a rapid generation time, and has relatively well-characterized genetics. See, e.g., D. Burke, et al., (2000) Methods in Yeast Genetics, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. Moreover, since the translational machinery of eukaryotes is highly conserved (see, e.g., (1996) Translational Control, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY; Y. Kwok and J.T. Wong, (1980), Evolutionary relationship between Halobacterium cutirubrum and eukaryotes determined by use of aminoacyl-tRNA synthetases as phylogenetic probes, Canadian Journal of Biochemistry 58:213-218; and (2001) The Ribosome, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY), it is likely that aaRS genes for the incorporation of unnatural amino acids discovered in S. cerevisiae can be 'cut and pasted' into higher eukaryotic organisms and used in partnership with cognate tRNAs to incorporate unnatural amino acids (see, e.g., K. Sakamoto, et al., (2002), Site-specific incorporation of an unnatural amino acid into proteins in mammalian cells, Nucleic Acids Res. 30:4692-4699; and C. Kohrer, et al., (2001), Import of amber and ochre suppressor tRNAs into mammalian cells: a general approach to site-specific insertion of amino acid analogues into proteins, Proc. Natl. Acad. Sci. U.S.A. 98:14310-14315). The expansion of the S. cerevisiae genetic code is therefore a gateway to expanding the genetic code of complex multicellular eukaryotic organisms. See, e.g., M. Buvoli, et al., (2000), Suppression of nonsense mutations in cell culture and mice by multimerized suppressor tRNA genes, Molecular & Cellular Biology 20:3116-3124. The tyrosyl pair derived from the Methanococcus jannaschii TyrRS (MjTyrRS)/tRNA, which was previously used to expand the genetic code of E. coli (see, e.g., L. Wang and P.G. Schultz, (2002), Expanding the Genetic Code, Chem. Comm. 1-10), is not orthogonal in eukaryotic organisms (e.g., P. Fechter, et al., (2001), Major tyrosine identity determinants in Methanococcus jannaschii and Saccharomyces cerevisiae tRNA(Tyr) are conserved but expressed differently, Eur. J. Biochem. 268:761-767), so a new orthogonal pair is required to expand the eukaryotic genetic code. Schimmel and coworkers have shown that the E. coli tyrosyl-tRNA synthetase (EcTyrRS)/tRNACUA pair suppresses amber codons in S. cerevisiae, and that E. coli tRNACUA is not charged by endogenous aminoacyl-tRNA synthetases in the yeast cytosol (Figure 2). See also, e.g., H. Edwards, et al., (1991), An Escherichia coli tyrosine transfer RNA is a leucine-specific transfer RNA in the yeast Saccharomyces cerevisiae, Proc. Natl. Acad. Sci. U.S.A. 88:1153-1156; and H. Edwards and P. Schimmel (1990), A bacterial amber suppressor in Saccharomyces cerevisiae is selectively recognized by a bacterial aminoacyl-tRNA synthetase, Molecular & Cellular Biology 10:1633-1641. In addition, EcTyrRS has been shown not to charge yeast tRNA in vitro. See, e.g., Y. Kwok and J.T. Wong, (1980), Evolutionary relationship between Halobacterium cutirubrum and eukaryotes determined by use of aminoacyl-tRNA synthetases as phylogenetic probes, Canadian Journal of Biochemistry 58:213-218; B.P. Doctor, et al., (1966), Studies on the species specificity of yeast and E. coli tyrosine tRNAs, Cold Spring Harbor Symp. Quant. Biol. 31:543-548; and K. Wakasugi, et al., (1998), Genetic code in evolution: switching species-specific aminoacylation with a peptide transplant, EMBO Journal 17:297-305. Thus, the EcTyrRS/tRNACUA pair is a candidate for an orthogonal pair in S. cerevisiae as well as in higher eukaryotes (e.g., A.K. Kowal, et al., (2001), Twenty-first aminoacyl-tRNA synthetase-suppressor tRNA pairs for possible use in site-specific incorporation of amino acid analogues into proteins in eukaryotes and in eubacteria, [comment], Proc. Natl. Acad. Sci. U.S.A. 98:2268-2273).

To broaden the substrate specificity of EcTyrRS in E. coli, Nishimura and coworkers screened a library of EcTyrRS mutants generated by error-prone PCR and discovered a mutant with an improved ability to incorporate 3-azatyrosine. See, e.g., F. Hamano-Takaku, et al., (2000), A mutant Escherichia coli tyrosyl tRNA synthetase utilizes the unnatural amino acid azatyrosine more efficiently than tyrosine, J. Biol. Chem. 275:40324-40328. However, this amino acid is incorporated throughout the proteome of E. coli, and the evolved enzyme still prefers tyrosine as a substrate. Yokoyama and coworkers screened a small collection of designed active-site variants of EcTyrRS in a wheat germ translation system and discovered an EcTyrRS variant that utilizes 3-iodotyrosine more effectively than tyrosine. See, D. Kiga, et al., (2002), An engineered Escherichia coli tyrosyl-tRNA synthetase for site-specific incorporation of an unnatural amino acid into proteins in eukaryotic translation and its application in a wheat germ cell-free system, Proc. Natl. Acad. Sci. U.S.A. 99:9715-9720. In contrast to the enzymes we have evolved in E. coli (e.g., J.W. Chin, et al., (2002), Addition of a Photocrosslinker to the Genetic Code of Escherichia coli, Proc. Natl. Acad. Sci. U.S.A. 99:11020-11024; J.W. Chin, et al., (2002), Addition of p-Azido-L-phenylalanine to the Genetic code of Escherichia coli, J. Am. Chem. Soc. 124:9026-9027; L. Wang, et al., (2001), Expanding the Genetic Code of Escherichia coli, Science 292:498-500; and L. Wang, et al., (2002), Adding L-3-(2-naphthyl)alanine to the genetic code of E-coli, J. Am. Chem. Soc. 124:1836-1837), this enzyme still incorporates tyrosine in the absence of the unnatural amino acid. See, e.g., D. Kiga et al., (2002), supra. Recently, Yokoyama and coworkers have also demonstrated that this EcTyrRS mutant functions with a tRNACUA from Bacillus stearothermophilus to suppress amber codons in mammalian cells. See, K. Sakamoto, et al., (2002), Site-specific incorporation of an unnatural amino acid into proteins in mammalian cells, Nucleic Acids Res. 30:4692-4699.

Any amino acid added to the eukaryotic genetic code is required to be incorporated with a fidelity similar to that of the common twenty amino acids. To accomplish this goal, a general in vivo selection method is used to discover EcTyrRS/tRNACUA variants that function in S. cerevisiae to incorporate unnatural amino acids, but none of the common amino acids, in response to the amber codon TAG. A major advantage of a selection is that enzymes that selectively incorporate unnatural amino acids can be rapidly selected and enriched from libraries of 10⁸ EcTyrRS active-site variants, a diversity 6-7 orders of magnitude greater than that screened in vitro. See, e.g., D. Kiga, et al., (2002), An engineered Escherichia coli tyrosyl-tRNA synthetase for site-specific incorporation of an unnatural amino acid into proteins in eukaryotic translation and its application in a wheat germ cell-free system, Proc. Natl. Acad. Sci. U.S.A. 99:9715-9720. This increase in diversity vastly increases the likelihood of isolating EcTyrRS variants that incorporate a diverse range of useful functionality with very high fidelity. See, e.g., L. Wang and P.G. Schultz, (2002), Expanding the Genetic Code, Chem. Comm. 1-10.

To extend the selection approach to S. cerevisiae, the transcriptional activator protein GAL4 was used (see Figure 1). See, e.g., A. Laughon, et al., (1984), Identification of two proteins encoded by the Saccharomyces cerevisiae GAL4 gene, Molecular & Cellular Biology 4:268-275; A. Laughon and R.F. Gesteland, (1984), Primary structure of the Saccharomyces cerevisiae GAL4 gene, Molecular & Cellular Biology 4:260-267; L. Keegan, et al., (1986), Separation of DNA binding from the transcription-activating function of a eukaryotic regulatory protein, Science 231:699-704; and M. Ptashne, (1988), How eukaryotic transcriptional activators work, Nature 335:683-689. The N-terminal 147 amino acids of this 881 amino acid protein form a DNA binding domain (DBD) that binds DNA sequence-specifically. See, e.g., M. Carey, et al., (1989), An amino-terminal fragment of GAL4 binds DNA as a dimer, J. Mol. Biol. 209:423-432; and E. Giniger, et al., (1985), Specific DNA binding of GAL4, a positive regulatory protein of yeast, Cell 40:767-774. The DBD is linked, by an intervening protein sequence, to a C-terminal 113 amino acid activation domain (AD) that can activate transcription when bound to DNA. See, e.g., J. Ma and M. Ptashne, (1987), Deletion analysis of GAL4 defines two transcriptional activating segments, Cell 48:847-853; and J. Ma and M. Ptashne, (1987), The carboxy-terminal 30 amino acids of GAL4 are recognized by GAL80, Cell 50:137-142. We envisioned that, by placing amber codons toward the N-terminal DBD of a single polypeptide that contains both the N-terminal DBD of GAL4 and its C-terminal AD, amber suppression by the EcTyrRS/tRNACUA pair can be linked to transcriptional activation by GAL4 (Figure 1, Panel A). By choosing appropriate GAL4-activated reporter genes, both positive and negative selections can be performed with the gene (Figure 1, Panel B). While many reporter genes based on complementing the amino acid auxotrophy of a cell can be used for positive selections (e.g., URA3, LEU2, HIS3, LYS2), the HIS3 gene is an attractive reporter because the activity of the protein it encodes (imidazole glycerol phosphate dehydratase) can be modulated in a dose-dependent manner by the addition of 3-aminotriazole (3-AT). See, e.g., G.M. Kishore and D.M. Shah, (1988), Amino acid biosynthesis inhibitors as herbicides, Annual Review of Biochemistry 57:627-663. In S. cerevisiae, fewer genes have been used for negative selections. One of the several negative selection strategies that have been used successfully (see, e.g., A.J. DeMaggio, et al., (2000), The yeast split-hybrid system, Method Enzymol. 328:128-137; H.M. Shih, et al., (1996), A positive genetic selection for disrupting protein-protein interactions: identification of CREB mutations that prevent association with the coactivator CBP, Proc. Natl. Acad. Sci. U.S.A. 93:13896-13901; M. Vidal, et al., (1996), Genetic characterization of a mammalian protein-protein interaction domain by using a yeast reverse two-hybrid system, [comment], Proc. Natl. Acad. Sci. U.S.A. 93:10321-10326; and M. Vidal, et al., (1996), Reverse two-hybrid and one-hybrid systems to detect dissociation of protein-protein and DNA-protein interactions, [comment], Proc. Natl. Acad. Sci. U.S.A. 93:10315-10320) is the URA3/5-fluoroorotic acid (5-FOA) negative selection system (e.g., J.D. Boeke, et al., (1984), A positive selection for mutants lacking orotidine-5'-phosphate decarboxylase activity in yeast: 5-fluoroorotic acid resistance, Molecular & General Genetics 197:345-346) described in the 'reverse two-hybrid' system developed by Vidal and coworkers. See, M. Vidal, et al., (1996), Proc. Natl. Acad. Sci. U.S.A. 93:10321-10326, supra; and M. Vidal, et al., (1996), Proc. Natl. Acad. Sci. U.S.A. 93:10315-10320, supra. In the reverse two-hybrid system, a genomically integrated URA3 reporter is placed under a tightly controlled promoter that contains GAL4 DNA binding sites. When two interacting proteins are produced as fusions to the GAL4 DBD and the GAL4 AD, they reconstitute the activity of GAL4 and activate transcription of URA3. In the presence of 5-FOA, the URA3 gene product converts 5-FOA to a toxic product, killing the cell. See, J.D. Boeke, et al., supra. This selection has been used to select for proteins that disrupt a protein-protein interaction and for mutations that disrupt a protein-protein interaction. A variant for screening small molecule inhibitors of protein-protein interactions has also been described. See, e.g., J. Huang and S.L. Schreiber, (1997), A yeast genetic system for selecting small molecule inhibitors of protein-protein interactions in nanodroplets, Proc. Natl. Acad. Sci. U.S.A. 94:13396-13401.

Appropriate placement of amber codons in full-length GAL4 allows efficient positive selection of active EcTyrRS variants using either a HIS3 or a URA3 GAL4-activated reporter to complement histidine or uracil auxotrophy in yeast cells. Moreover, the URA3 reporter can be used for negative selection of inactive EcTyrRS variants in the presence of 5-FOA. In addition, colorimetric assays using lacZ can be used to read out aminoacyl-tRNA synthetase activity in yeast cells.

Results and discussion

The EcTyrRS gene was expressed under the control of the constitutive ADH1 promoter, and the tRNACUA gene was expressed from the same high-copy yeast plasmid (pEcTyrRStRNACUA, Figure 1, panel C). Upon co-transformation of pEcTyrRStRNACUA and a low-copy reporter containing a single amber mutation between the DNA-binding domain and the activation domain of a chimeric GAL4 construct into MaV203, cells grew on selective media lacking histidine and containing 10-20 mM 3-AT (Figure 2). When MaV203 cells were transformed with the same GAL4 construct and either an inactive synthetase mutant (A5) or a construct lacking the EctRNA gene, no growth was observed on 10 mM 3-AT (Figure 2). These experiments establish that EcTyrRS can be constitutively expressed in a functional form from the ADH1 promoter, that there is minimal endogenous amber suppression in MaV203, and that there is little charging of EctRNACUA by yeast synthetases in this system. See, e.g., H. Edwards, et al., (1991), An Escherichia coli tyrosine transfer RNA is a leucine-specific transfer RNA in the yeast Saccharomyces cerevisiae, Proc. Natl. Acad. Sci. USA 88:1153-1156; and H. Edwards and P. Schimmel, (1990), A bacterial amber suppressor in Saccharomyces cerevisiae is selectively recognized by a bacterial aminoacyl-tRNA synthetase, Molecular & Cellular Biology 10:1633-1641. Since EcTyrRS does not charge S. cerevisiae tRNA (see, e.g., Y. Kwok and J.T. Wong, (1980), Evolutionary relationship between Halobacterium cutirubrum and eukaryotes determined by use of aminoacyl-tRNA synthetases as phylogenetic probes, Canadian Journal of Biochemistry 58:213-218; B.P. Doctor, et al., (1966), Studies on the species specificity of yeast and E. coli tyrosine tRNAs, Cold Spring Harbor Symp. Quant. Biol. 31:543-548; and K. Wakasugi, et al., (1998), Genetic code in evolution: switching species-specific aminoacylation with a peptide transplant, EMBO Journal 17:297-305), these experiments confirm that EcTyrRS/EctRNACUA is an orthogonal pair in S. cerevisiae.

While the first-generation GAL4 chimera was able to activate transcription of the weak HIS3 reporter, it was unable to activate transcription of the URA3 reporter in MaV203 sufficiently to allow significant growth at 3-AT concentrations greater than 20 mM or on -URA plates (Figure 2). For the purpose of selecting EcTyrRS variants, a second-generation GAL4 construct was made. This GAL4 reporter was designed to be more active, to have a greater dynamic range, and to avoid the accumulation of revertants. To increase the activity of the GAL4 reporter, full-length GAL4 (which has twice the transcriptional activation activity of a DBD-AD fusion; see, e.g., J. Ma and M. Ptashne, (1987), Deletion analysis of GAL4 defines two transcriptional activating segments, Cell 48:847-853) was used under the control of the strong ADH1 promoter, on a high-copy 2-micron plasmid (with a copy number 10-30 times that of the centromeric plasmid of the initial GAL4 chimera). The increase in both the plasmid copy number and the activity of the protein it encodes should extend the dynamic range of the reporter. Amber mutations were targeted to the region of the GAL4 gene that encodes amino acid residues 2 to 147 (Figure 3). This region is sufficient for sequence-specific DNA binding (see, e.g., M. Carey, et al., (1989), An amino-terminal fragment of GAL4 binds DNA as a dimer, J. Mol. Biol. 209:423-432) and lies 5' of the first cryptic activation domain in the GAL4 gene (see, e.g., J. Ma and M. Ptashne, (1987), Deletion analysis of GAL4 defines two transcriptional activating segments, Cell 48:847-853), so that the truncated products produced in the absence of amber suppression are not expected to activate transcription. The choice of amino acid codons to mutate was guided by previous saturation mutagenesis selections on GAL4 (see, e.g., M. Johnston and J. Dover, (1988), Mutational analysis of the GAL4-encoded transcriptional activator protein of Saccharomyces cerevisiae, Genetics 120:63-74), as well as by the X-ray structure of the N-terminal DNA-binding domain of GAL4 (see, e.g., R. Marmorstein, et al., (1992), DNA recognition by GAL4: structure of a protein-DNA complex, [Comment], Nature 356:408-414; and J.D. Baleja, et al., (1992), Solution structure of the DNA-binding domain of Cd2-GAL4 from S. cerevisiae, [Comment], Nature 356:450-453) and the NMR structure of its dimerization region. See, e.g., P. Hidalgo, et al., (2001), Recruitment of the transcriptional machinery through GAL11P: structure and interactions of the GAL4 dimerization domain, Genes & Development 15:1007-1020.

Full-length GAL4 was cloned into a small pUC-based vector to allow rapid construction of 10 single amber mutants (at the codons for amino acids L3, I13, T44, F68, R110, V114, T121, I127, S131 and T145) by site-directed mutagenesis. GAL4 and the resulting amber mutants were then subcloned into a 2-micron yeast vector under the control of the full-length ADH1 promoter to create pGADGAL4 and a series of amber mutants designated pGADGAL4(xxTAG) (Figure 1, panel C), where xx denotes the amino acid codon in the GAL4 gene that was mutated to the amber codon. Each GAL4 mutant was co-transformed with either EcTyrRS/tRNACUA or A5/tRNACUA into MaV203 cells, converting transformants to leucine and tryptophan prototrophy. pGADGAL4 itself transformed with very low efficiency (<10⁻³ times that of the GAL4 amber mutants) and is presumably toxic to MaV203 cells at such a high copy number; no such effect was observed with the amber mutants of GAL4.

The phenotypes of the GAL4 reporters in the presence of an active or a dead synthetase were assayed on -URA plates and on 0.1% 5-FOA plates (Figure 3, panel A). Five GAL4 mutants (L3TAG, I13TAG, T44TAG, F68TAG, S131TAG) grew on -URA plates and failed to grow on 0.1% 5-FOA in the presence of either wild-type or inactive EcTyrRS. In these amber mutants, endogenous suppression is evidently sufficient to push EcTyrRS/tRNACUA-mediated suppression beyond the dynamic range of the URA3 reporter in MaV203. Five GAL4 single amber mutants (R110TAG, V114TAG, T121TAG, I127TAG, T145TAG) grew in the absence of uracil in the presence of EcTyrRS/tRNACUA (but not A5/tRNACUA) and showed the reverse phenotype on 5-FOA. These mutants show EcTyrRS-dependent phenotypes that fall within the dynamic range of the URA3 reporter in MaV203. The cleanest EcTyrRS-dependent phenotype on both -URA and 0.1% 5-FOA was observed with the R110TAG mutant of GAL4. However, this mutant showed some blue color in the X-GAL assay when co-transformed with A5. To further improve the dynamic range, a series of six double amber mutants of GAL4 containing R110TAG was made (Figure 3, panel B): L3TAG, R110TAG; I13TAG, R110TAG; T44TAG, R110TAG; R110TAG, T121TAG; R110TAG, I127TAG; and R110TAG, T145TAG. Four of these double mutants (I13TAG, R110TAG; R110TAG, T121TAG; R110TAG, I127TAG; and T145TAG, R110TAG) were unable to grow in the absence of uracil but were able to grow on 0.1% 5-FOA. The activity of these double mutants lies outside the dynamic range of the plate assays. Two of the double mutants (L3TAG, R110TAG and T44TAG, R110TAG) grew on -URA plates in the presence of wild-type EcTyrRS/tRNACUA but not A5/tRNACUA; these mutants also showed the expected reciprocal phenotype on 5-FOA. pGADGAL4(T44TAG, R110TAG), the more active of these two GAL4 mutants, was selected for more detailed characterization (Figure 4). MaV203 containing pGADGAL4(T44TAG, R110TAG)/pEcTyrRS-tRNACUA was blue on X-GAL, whereas the corresponding strain containing pA5/tRNACUA was not. Similarly, MaV203 containing pGADGAL4(T44TAG, R110TAG)/pEcTyrRS/tRNACUA grew robustly on plates with 3-AT concentrations up to 75 mM and on -URA plates, whereas the corresponding strain containing pA5/tRNACUA failed to grow on 10 mM 3-AT or in the absence of uracil. Taken together, the EcTyrRS-dependent phenotypes of pGADGAL4(T44TAG, R110TAG) span the dynamic range of the URA3, HIS3 and lacZ reporters in MaV203.
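The plate phenotypes reported above amount to a three-way classification of each reporter by its growth on -URA with the active versus the dead synthetase. The following is a minimal sketch of that bookkeeping; the function names, the labels and the dictionary layout are illustrative assumptions, and only the -URA growth stated in the text is encoded (the 5-FOA phenotype is the reciprocal).

```python
# Illustrative sketch only: names and labels are hypothetical, not from the source.

def classify_reporter(grows_with_active: bool, grows_with_dead: bool) -> str:
    """Classify a GAL4 amber reporter from its growth on -URA plates."""
    if grows_with_active and grows_with_dead:
        return "above range"   # grows regardless of synthetase
    if grows_with_active and not grows_with_dead:
        return "within range"  # EcTyrRS-dependent growth
    if not grows_with_active and not grows_with_dead:
        return "below range"   # no growth even with active EcTyrRS
    return "unexpected"        # growth only with the dead synthetase

# (grows on -URA with EcTyrRS, grows on -URA with A5), as reported in the text.
URA_GROWTH = {
    "L3TAG": (True, True), "I13TAG": (True, True), "T44TAG": (True, True),
    "F68TAG": (True, True), "S131TAG": (True, True),
    "R110TAG": (True, False), "V114TAG": (True, False), "T121TAG": (True, False),
    "I127TAG": (True, False), "T145TAG": (True, False),
    "I13TAG+R110TAG": (False, False), "R110TAG+T121TAG": (False, False),
    "R110TAG+I127TAG": (False, False), "R110TAG+T145TAG": (False, False),
    "L3TAG+R110TAG": (True, False), "T44TAG+R110TAG": (True, False),
}

def usable_reporters(growth: dict) -> list:
    """Return reporters whose -URA phenotype depends on the active synthetase."""
    return sorted(
        name for name, (active, dead) in growth.items()
        if classify_reporter(active, dead) == "within range"
    )

if __name__ == "__main__":
    print(usable_reporters(URA_GROWTH))
```

Such a table only records the qualitative plate results; it does not capture the quantitative differences (for example, the residual blue color of R110TAG with A5) that motivated the choice of the T44TAG, R110TAG double mutant.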

It was of interest to determine the activity of GAL4 mutants in which T44 or R110 was substituted with amino acids other than tyrosine, since the ability to substitute varied amino acids without altering GAL4 activity is likely to be useful for selecting mutant aminoacyl-tRNA synthetases that can incorporate unnatural amino acids into proteins. See, e.g., M. Pasternak, et al., (2000), A new orthogonal suppressor tRNA/aminoacyl-tRNA synthetase pair for evolving an organism with an expanded genetic code, Helvetica Chimica Acta 83:2277. A series of five mutants of residue T44 in GAL4 (T44Y, T44W, T44F, T44D, T44K) was constructed in pGADGAL4(R110TAG), since pGADGAL4 is itself toxic. A similar series of mutants at position R110 in GAL4 (R110Y, R110W, R110F, R110D, R110K) was constructed in pGADGAL4(T44TAG). These mutants are biased toward the large hydrophobic amino acid side chains that we are interested in incorporating into proteins, but also include positively and negatively charged residues as a stringent test of permissiveness. Each mutant was co-transformed with pEcTyrRS/tRNACUA into MaV203 cells, and leu+ trp+ isolates were assayed for lacZ production by o-nitrophenyl-β-D-galactopyranoside (ONPG) hydrolysis (Figure 5). In all cases, the variation in activity between cells containing GAL4 with different amino acids substituted for T44 or R110 was less than 3-fold. This minimal variability demonstrates that these sites tolerate amino acid substitution without altering the transcriptional activity of GAL4. As expected from the activity of the single amber mutants assayed on selective plates, the T44 mutants made in the GAL4(R110TAG) background led to slower hydrolysis of ONPG than the R110 mutants made in the GAL4(T44TAG) background.

Model enrichment studies were performed to examine the ability of the system to select an active synthetase from a large excess of inactive synthetases (Table 1, Table 2, Figure 6). This selection models the ability to select active synthetases from a library of variants in the presence of an unnatural amino acid. MaV203 cells containing GAL4(T44TAG, R110TAG) and EcTyrRS/tRNACUA were mixed with a 10- to 10⁶-fold excess of cells containing GAL4(T44TAG, R110TAG) and A5/tRNACUA, as determined both by OD660 and by the fraction of colonies that turned blue in the X-Gal overlay assay when plated on nonselective -leu, -trp media. Those cells able to survive on 50 mM 3-AT or in the absence of uracil were selected. The ratio of blue to white cells in the X-Gal assay among cells surviving on 3-AT or -URA, when compared with the same ratio in the absence of selection, clearly demonstrates that the positive selections can enrich active synthetases from dead synthetases by a factor of >10⁵ (Table 1). Determination of accurate enrichments for starting ratios greater than 1:10⁵ was generally not possible, because no more than 10⁶ cells can be conveniently plated without significant crosstalk between cells leading to unreliable phenotypes.
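The enrichment factor described here is the ratio of blue to white colonies after selection divided by the same ratio without selection. A minimal sketch of that arithmetic follows; the function name and the example colony counts are hypothetical and are not taken from Table 1.

```python
# Illustrative sketch: enrichment as the fold change in the blue:white
# (active:inactive) colony ratio. All counts used here are hypothetical.

def enrichment_factor(active_before: int, inactive_before: int,
                      active_after: int, inactive_after: int) -> float:
    """Return (active_after/inactive_after) / (active_before/inactive_before)."""
    counts = (active_before, inactive_before, active_after, inactive_after)
    if min(counts) <= 0:
        raise ValueError("all colony counts must be positive")
    # Rearranged to multiply first, which keeps integer inputs exact
    # until the final division.
    return (active_after * inactive_before) / (inactive_after * active_before)

if __name__ == "__main__":
    # Hypothetical example: a 1:100000 starting mixture that yields
    # 10 blue and 10 white colonies after selection.
    print(enrichment_factor(1, 100_000, 10, 10))
```

When no colonies of one color survive, the ratio is undefined and only a bound on the enrichment can be stated, which is one reason the text reports the factors as inequalities.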

Table 1. Model positive selection for functional EcTyrRS.

Figure A20048002115500941

a) Determined by OD660

b) On X-Gal

Table 2. Model negative selection for non-functional EcTyrRS (A5).

a) Determined by OD660

b) On X-Gal

Following a positive selection in the presence of an unnatural amino acid, the selected cells are likely to contain both synthetases able to use natural amino acids and synthetases able to use the added unnatural amino acid. To isolate synthetases capable of using only the unnatural amino acid, cells encoding synthetases that use natural amino acids must be removed from the selected clones. This can be accomplished with a negative selection in which the unnatural amino acid is withheld, so that synthetases that function with a natural amino acid are removed. A model negative selection was performed in a manner analogous to the model positive selection. EcTyrRS/tRNACUA was mixed with a 10- to 10⁵-fold excess of A5/tRNACUA, and selection was performed on 0.1% 5-FOA. Comparison of the ratio of white to blue cells in the X-GAL assay among cells surviving on 0.1% 5-FOA with the same ratio under nonselective conditions (see Table 2) makes it clear that the negative selections can enrich dead synthetases from active synthetases by a factor of at least 0.6 x 10⁴. Determination of accurate enrichments for starting ratios greater than 1:10⁴ was generally not possible, because no more than 10⁵ cells can be conveniently plated without significant crosstalk between cells leading to unreliable phenotypes.

A general approach has been developed that allows both positive selection of aaRSs that recognize unnatural amino acids and negative selection of aaRSs that recognize natural amino acids. By varying the stringency of the selection, a variety of synthetase activities can be isolated. Application of this method to a model selection using variants of EcTyrRS showed enrichments of greater than 10⁵ in a single round of positive selection and greater than 0.6 x 10⁴ in a single round of negative selection. These observations suggest that this method can provide rapid access to orthogonal aminoacyl-tRNA synthetases that function to site-specifically incorporate unnatural amino acids with a variety of side chains into proteins in S. cerevisiae. Moreover, enzymes evolved in S. cerevisiae can be used in higher eukaryotes.

Materials and methods

Vector construction

The tRNACUA gene was PCR-amplified from pESCSU3URA using the primers tRNA5': GGGGGGACCGGTGGGGGGACCGGTAAGCTTCCCGATAAGGGAGCAGGCCAGTAAAAAGCATTACCCCGTGGTGGGTTCCCGA (SEQ ID NO:89) and tRNA3': GGCGGCGCTAGCAAGCTTCCCGATAAGGGAGCAGGCCAGTAAAAAGGGAAGTTCAGGGACTTTTGAAAAAAATGGTGGTGGGGGAAGGAT (SEQ ID NO:90). This and all other PCR reactions were performed using the Expand PCR kit from Roche according to the manufacturer's instructions. After digestion with the restriction endonucleases NheI and AgeI, this tRNA gene was inserted between the same sites in the 2-micron vector pESCTrp (Stratagene) to yield ptRNACUA. The full-length ADH1 promoter was PCR-amplified from pDBLeu (Invitrogen) with the primers PADHf: IGGGGGGACCGGTIGGGGGGACCGGTCGGGATCGAAGAAATGATGGTAAATGAAATAGGAAATCAAGG (SEQ ID NO:91) and pADHR: GGGGGGGAATTCAGTTGATTGTATGCTTGGTATAGCTTGAAATATTGTGCAGAAAAAGAAAC (SEQ ID NO:92), and digested with AgeI and EcoRI. EcTyrRS was amplified with the primers pESCTrp1: TCATAACGAGAATTCCGGGATCGAAGAAATGATGGTAAATGAAATAGGAAATCTCATAACGAGAATTCATGGCAAGCAGTAACTTG (SEQ ID NO:93) and pESCTrp2: TTACTACGTGCGGCCGCATGGCAAGCAGTAACTTGTTACTACGTGCGGCCGCTTATTTCCAGCAAATCAGAC (SEQ ID NO:94). The EcTyrRS PCR product was digested with EcoRI and NotI. ptRNACUA was then digested with AgeI and NotI. A triple ligation of these three DNAs yielded pEcTyrRS-tRNACUA. The plasmid pA5-tRNACUA, in which amino acid residues 37, 126, 182, 183 and 186 in the active site are mutated to alanine, was created by overlap PCR using the oligonucleotides F37Afwd: CCGATCGCGCTCGCTTGCGGCTTCGATC (SEQ ID NO:95), N126Afwd: ATCGCGGCGAACGCCTATGACTGGTTC (SEQ ID NO:96), 182,183,186A: GTTGCAGGGTTATGCCGCCGCCTGTGCGAACAAACAGTAC (SEQ ID NO:97) and their reverse complements, together with the flanking oligonucleotides 4783: GCCGCTTTGCTATCAAGTATAAATAG (SEQ ID NO:98) and 3256: CAAGCCGACAACCTTGATTGG (SEQ ID NO:99), and pEcTyrRS-tRNACUA as the template. The PCR product was digested with EcoRI and NotI and ligated into the large fragment of pEcTyrRS-tRNACUA released upon digestion with the same enzymes. To construct the first-generation DB-AD reporters, the GAL4 DNA binding domain was PCR-amplified from pGADT7 (Clontech) using the forward primer pADfwd: GGGGACAAGTTTGTACAAAAAAGCAGGCTACGCCAATTTTAATCAAAGTGGGAATATTGC (SEQ ID NO:100) or pADfwd(TAG): GGGGACAAGTTTGTACAAAAAAGCAGGCTAGGCCAATTTTAATCAAAGTGGGAATATTGC (SEQ ID NO:101) and ADrev: GGGGACCACTTTGTACAAGAAAGCTGGGTTACTCTTTTTTTGGGTTTGGTGGGGTATC (SEQ ID NO:102). These PCR products were cloned into the vector pDEST3-2 (Invitrogen) using the Clonase procedure according to the manufacturer's instructions, yielding pDB-AD and pDB-(TAG)-AD. To construct pGADGAL4 and variants, the GAL4 gene was PCR-amplified from pCL1 (Clontech) with the primers ADH1428-1429: AAGCTATACCAAGCATACAATC (SEQ ID NO:103) and GAL4C: ACAAGGCCTTGCTAGCTTACTCTTTTTTTGGGTTTGGTGGGGTATCTTC (SEQ ID NO:104). This fragment was cloned into the vector pCR2.1 TOPO (Invitrogen) according to the manufacturer's instructions. A clone containing the GAL4 gene (pCR2.1TOPOGAL4) was digested with HindIII, and the 2.7 kb GAL4 fragment was gel-purified and ligated to the large fragment of pGADT7 that had been digested with HindIII, treated with calf intestinal phosphatase, and gel-purified. Variants of the GAL4 gene were created in pCR2.1 by Quikchange reactions (Stratagene), performed according to the manufacturer's instructions, using the primers listed in the Supplementary Information. GAL4 mutants were cloned into pGADT7 in the same manner as the wild-type GAL4 gene. All final constructs were confirmed by DNA sequencing.

Yeast media and manipulations

The S. cerevisiae strain MaV203 (Invitrogen) is MATα; leu2-3,112; trp1109; his3Δ200; ade2-101; cyh2R; cyh1R; GAL4Δ; gal80Δ; GAL1::lacZ; HIS3UASGAL1::HIS3@LYS2; SPAL10UASGAL1::URA3. Yeast media were purchased from Clontech, 5-FOA and X-GAL from Invitrogen, and 3-AT from BIO 101. YPER (Yeast Protein Extraction Reagent) and ONPG were purchased from Pierce Chemicals. Plasmid transformations were performed by the PEG/lithium acetate method (see, e.g., D. Burke, et al., (2000), Methods in Yeast Genetics, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY), and transformants were selected on the appropriate synthetic complete dropout media. To test the phenotypes conferred by various plasmid combinations on MaV203, yeast colonies from the synthetic complete dropout plates of each transformation were resuspended in 15 microliters of sterile water and streaked on the selective media of interest. Each phenotype was confirmed with at least five independent colonies.

X-GAL assays were performed by the gel overlay method. See I.G. Serebriiskii and E.A. Golemis, (2000), Uses of lacZ to study gene function: evaluation of beta-galactosidase assays employed in the yeast two-hybrid system, Analytical Biochemistry 285:1-15. Briefly, colonies or patches of cells were lysed on agar plates by several additions of neat chloroform. After the chloroform had evaporated, 1% agarose containing 0.25 g/L X-GAL, buffered with 0.1 M Na2PO4, was applied to the plate surface. Once the agarose had set, the plates were incubated at 37°C for 12 hours. ONPG assays were performed by inoculating 1 ml of SD-leu, -trp in a 96-well plate with a single colony and incubating at 30°C with shaking. The OD660 of 100 microliters of cells, and of several dilutions of the cells, was recorded in parallel in a 96-well microtiter plate. Cells (100 microliters) were mixed with 100 microliters of YPER:ONPG (1x PBS, 50% v/v YPER, 20 mM MgCl2, 0.25% v/v β-mercaptoethanol and 3 mM ONPG) and incubated with shaking at 37°C. Upon color development, the cells were pelleted by centrifugation, the supernatant was transferred to a clean 96-well microtiter plate (Nunclon, catalog no. 167008), and the A420 was recorded. All data shown are the mean of trials from at least 4 independent clones, and error bars represent the standard deviation. ONPG hydrolysis was calculated using the equation: β-galactosidase units = 1000 × A420/(V × t × OD660), where V is the volume of cells in microliters and t is the incubation time in minutes. See, e.g., Serebriiskii and Golemis, supra. One β-galactosidase unit corresponds to the hydrolysis of 1 micromole of ONPG per minute per cell. See Serebriiskii and Golemis, supra. Spectrophotometric readings were performed on a SPECTRAmax 190 plate reader.
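The unit calculation above can be sketched in a few lines. The function and parameter names below are illustrative assumptions, and the use of the sample standard deviation is also an assumption, since the text does not say which estimator was used for the error bars.

```python
# Illustrative sketch of the ONPG calculation stated in the text:
# units = 1000 * A420 / (V * t * OD660), V in microliters, t in minutes.
from statistics import mean, stdev

def beta_gal_units(a420: float, volume_ul: float, minutes: float, od660: float) -> float:
    """Compute β-galactosidase units from a single well."""
    if volume_ul <= 0 or minutes <= 0 or od660 <= 0:
        raise ValueError("volume, time and OD660 must be positive")
    return 1000.0 * a420 / (volume_ul * minutes * od660)

def summarize(units: list) -> tuple:
    """Return (mean, sample standard deviation) over independent clones."""
    if len(units) < 2:
        raise ValueError("need at least two clones to estimate a deviation")
    return mean(units), stdev(units)

if __name__ == "__main__":
    # Hypothetical readings for four independent clones: (A420, OD660).
    readings = [(0.50, 0.50), (0.55, 0.52), (0.48, 0.49), (0.52, 0.51)]
    units = [beta_gal_units(a, 100.0, 10.0, od) for a, od in readings]
    print(summarize(units))
```

Normalizing by OD660 and incubation time is what makes wells with different cell densities and development times comparable across the T44 and R110 substitution series.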

Model selection

Positive selection: Two overnight cultures were grown in SD-Leu, -Trp. One contained MaV203 harboring pEcTyrRS-tRNACUA/pGADGAL4(T44, R110TAG), and the other contained MaV203 harboring pA5-tRNASU3/pGADGAL4(T44, R110TAG). The cells were harvested by centrifugation and resuspended in 0.9% NaCl by vortexing. The two cell suspensions were then diluted to identical OD660. MaV203 harboring pEcTyrRS-tRNACUA/pGADGAL4(T44, R110TAG) was serially diluted over 7 orders of magnitude, and each dilution was then mixed 1:1 (vol:vol) with undiluted MaV203 harboring pA5-tRNACUA/pGADGAL4(T44, R110TAG) to give defined ratios of cells containing active and inactive tyrosyl-tRNA synthetase. For each ratio, a second serial dilution was performed in which the number of cells was decreased but the ratio of cells harboring pEcTyrRS-tRNACUA/pGADGAL4(T44, R110TAG) to cells harboring pA5-tRNACUA/pGADGAL4(T44, R110TAG) was maintained. These dilutions were plated on SD-Leu, -Trp; SD-Leu, -Trp, -Ura; and SD-Leu, -Trp, -His + 50 mM 3-AT. After 60 hours, the colonies on each plate were counted using an Eagle Eye CCD camera (Stratagene), and the phenotype of the survivors was confirmed with an X-GAL β-galactosidase assay. Cells from several individual blue or white colonies were isolated and grown to saturation in SD-leu, -trp, and plasmid DNA was isolated by standard methods. The identity of the EcTyrRS variant was confirmed by DNA sequencing.
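The defined ratios produced by this mixing scheme can be worked out directly. The sketch below assumes tenfold dilution steps starting from the undiluted suspension (exponents 0 to 7) and equal starting cell densities, as implied by matching the OD660; the function name is hypothetical.

```python
from fractions import Fraction


def active_fraction(dilution_exponent: int) -> Fraction:
    """Fraction of cells carrying the active synthetase after 1:1 (vol:vol) mixing.

    Active cells are diluted 10**-dilution_exponent; inactive cells are undiluted.
    Both suspensions are assumed to start at the same cell density.
    """
    active = Fraction(1, 10 ** dilution_exponent)
    inactive = Fraction(1)
    return active / (active + inactive)


# Ratios spanning 7 orders of magnitude (assumed tenfold steps).
ratios = {k: active_fraction(k) for k in range(8)}
for k, frac in ratios.items():
    print(f"10^-{k} dilution: 1 active cell per {frac.denominator // frac.numerator} cells")
```

Because the second serial dilution reduces the total cell number without changing the ratio, these fractions apply to every plate derived from a given mixture.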

Negative selection: The model negative selection was performed in a manner analogous to the positive selection, except that MaV203 harboring pA5-tRNACUA/pGADGAL4(T44, R110TAG) was serially diluted and mixed with a fixed density of MaV203 harboring pEcTyrRS-tRNACUA/pGADGAL4(T44, R110TAG). Cells were plated on SD-leu, -trp + 0.1% 5-FOA, the number of colonies was counted after 48 hours, and the plates were processed as described above.

The following oligonucleotides (Table 3) were used in combination with their reverse complements to construct site-directed mutants by Quikchange mutagenesis. The position of the mutation is denoted by bold text.

Table 3: Oligonucleotides used to construct site-directed mutants.

Amber mutant    Oligonucleotide sequence

L3TAG           5'-ATGAAGTAGCTGTCTTCTATCGAACAAGCATGCG-3' (SEQ ID NO: 66)
I13TAG          5'-CGAACAAGCATGCGATTAGTGCCGACTTAAAAAG-3' (SEQ ID NO: 67)
T44TAG          5'-CGCTACTCTCCCAAATAGAAAAGGTCTCCGCTG-3' (SEQ ID NO: 68)
F68TAG          5'-CTGGAACAGCTATAGCTACTGATTTTTCCTCG-3' (SEQ ID NO: 69)
R110TAG         5'-GCCGTCACAGATTAGTTGGCTTCAGTGGAGACTG-3' (SEQ ID NO: 70)
V114TAG         5'-GATTGGCTTCATAGGAGACTGATATGCTCTAAC-3' (SEQ ID NO: 71)
T121TAG         5'-GCCTCTATAGTTGAGACAGCATAGAATAATGCG-3' (SEQ ID NO: 72)
I127TAG         5'-GAGACAGCATAGATAGAGTGCGACATCATCATCGG-3' (SEQ ID NO: 73)
S131TAG         5'-GAATAAGTGCGACATAGTCATCGGAAGAGAGTAGTAG-3' (SEQ ID NO: 74)
T145TAG         5'-GGTCAAAGACAGTTGTAGGTATCGATTGACTCGGC-3' (SEQ ID NO: 75)

Permissive site mutant    Oligonucleotide sequence

T44F            5'-CGCTACTCTCCCCAAATTTAAAAGGTCTCCGCTG-3' (SEQ ID NO: 76)
T44Y            5'-CGCTACTCTCCCCAAATATAAAAGGTCTCCGCTG-3' (SEQ ID NO: 77)
T44W            5'-CGCTACTCTCCCCAAATGGAAAAGGTCTCCGCTG-3' (SEQ ID NO: 78)
T44D            5'-CGCTACTCTCCCCAAAGATAAAAGGTCTCCGCTG-3' (SEQ ID NO: 79)
T44K            5'-CGCTACTCTCCCCAAAAAAAAAAGGTCTCCGCTG-3' (SEQ ID NO: 80)
R110F           5'-GCCGTCACAGATTTTTTGGCTTCAGTGGAGACTG-3' (SEQ ID NO: 81)
R110Y           5'-GCCGTCACAGATTATTTGGCTTCAGTGGAGACTG-3' (SEQ ID NO: 82)
R110W           5'-GCCGTCACAGATTGGTTGGCTTCAGTGGAGACTG-3' (SEQ ID NO: 83)
R110D           5'-GCCGTCACAGATGATTTGGCTTCAGTGGAGACTG-3' (SEQ ID NO: 84)
R110K           5'-GCCGTCACAGATAAATTGGCTTCAGTGGAGACTG-3' (SEQ ID NO: 85)
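Because each oligonucleotide in Table 3 is used together with its reverse complement, a small helper for deriving the partner primer may be useful. This is a minimal sketch; the function names are hypothetical, and only two of the Table 3 sequences are reproduced here.

```python
# Complement map for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_pair(forward: str) -> tuple[str, str]:
    """Forward mutagenic primer and its reverse-complement partner."""
    return forward, reverse_complement(forward)


# Sequences copied from Table 3 (SEQ ID NO: 68 and SEQ ID NO: 70).
T44TAG = "CGCTACTCTCCCAAATAGAAAAGGTCTCCGCTG"
R110TAG = "GCCGTCACAGATTAGTTGGCTTCAGTGGAGACTG"

for name, oligo in (("T44TAG", T44TAG), ("R110TAG", R110TAG)):
    fwd, rev = primer_pair(oligo)
    print(name, "forward 5'-" + fwd + "-3'", "reverse 5'-" + rev + "-3'")
```

The amber codon TAG in the forward primer appears as CTA in the reverse-complement primer.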

Example 2: An expanded eukaryotic genetic code

A general and rapid route for the addition of unnatural amino acids to the genetic code of Saccharomyces cerevisiae is described. Five amino acids were incorporated into proteins efficiently and with high fidelity in response to the nonsense codon TAG. The side chains of these amino acids include a keto group, which can be uniquely modified in vitro or in vivo with a wide range of chemical probes and reagents; a heavy atom for structural studies; and photocrosslinkers for cellular studies of protein interactions. This methodology not only removes the constraints imposed by the genetic code on our ability to manipulate protein structure and function in yeast, it also provides a route to the systematic expansion of the genetic codes of multicellular eukaryotes.

Although chemists have developed powerful methods and strategies to synthesize and manipulate the structures of small molecules (see, e.g., E. J. Corey and X.-M. Cheng, The Logic of Chemical Synthesis (Wiley-Interscience, New York, 1995)), the ability to rationally control protein structure and function is still in its infancy. Mutagenesis methods are limited to the common 20 amino acid building blocks, although in a number of cases it has been possible to competitively incorporate close structural analogs of common amino acids throughout the proteome. See, e.g., K. Kirshenbaum, et al., (2002), ChemBioChem 3:235-7; and V. Doring et al., (2001), Science 292:501-4. Total synthesis (see, e.g., B. Merrifield, (1986), Science 232:341-7) and semi-synthetic methods (see, e.g., D. Y. Jackson et al., (1994), Science 266:243-7; and P. E. Dawson and S. B. Kent, (2000), Annual Review of Biochemistry 69:923-60) have made it possible to synthesize peptides and small proteins, but are of more limited utility for proteins over 10 kilodaltons (kDa). Biosynthetic methods that involve chemically acylated orthogonal tRNAs (see, e.g., D. Mendel, et al., (1995), Annual Review of Biophysics and Biomolecular Structure 24:435-462; and V. W. Cornish, et al., (March 31, 1995), Angewandte Chemie-International Edition in English 34:621-633) have allowed unnatural amino acids to be incorporated into larger proteins in vitro (see, e.g., J. A. Ellman, et al., (1992), Science 255:197-200) or in microinjected cells (see, e.g., D. A. Dougherty, (2000), Current Opinion in Chemical Biology 4:645-52). However, the stoichiometric nature of chemical acylation severely limits the amount of protein that can be generated. Thus, despite considerable effort, the properties of proteins, and possibly of entire organisms, have been limited throughout evolution by the twenty genetically encoded amino acids (with the exceptions of pyrrolysine and selenocysteine (see, e.g., A. Bock et al., (1991), Molecular Microbiology 5:515-20; and G. Srinivasan, et al., (2002), Science 296:1459-62)).

To overcome this limitation, new components were added to the protein biosynthetic machinery of the prokaryote Escherichia coli (E. coli) (see, e.g., L. Wang, et al., (2001), Science 292:498-500), which made it possible to genetically encode unnatural amino acids in vivo. A number of new amino acids with novel chemical, physical, or biological properties have been incorporated efficiently and selectively into proteins in response to the amber codon, TAG. See, e.g., J. W. Chin et al., (2002), Journal of the American Chemical Society 124:9026-9027; J. W. Chin and P. G. Schultz, (2002), ChemBioChem 11:1135-1137; J. W. Chin, et al., (2002), PNAS United States of America 99:11020-11024; and L. Wang and P. G. Schultz, (2002), Chem. Comm., 1:1-10. However, because the translational machinery is not well conserved between prokaryotes and eukaryotes, components of the biosynthetic machinery added to E. coli cannot generally be used to site-specifically incorporate unnatural amino acids into proteins in eukaryotic cells in order to study or manipulate cellular processes.

Thus, translational components were created that expand the number of genetically encoded amino acids in eukaryotic cells. Saccharomyces cerevisiae was chosen as the initial eukaryotic host organism because it is a useful model eukaryote, it is easy to manipulate genetically (see, e.g., D. Burke, et al., (2000), Methods in Yeast Genetics (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY)), and its translational machinery is highly homologous to that of higher eukaryotes (see, e.g., T. R. Hughes, (2002), Funct. Integr. Genomics 2:199-211). The addition of new building blocks to the S. cerevisiae genetic code requires a unique codon, tRNA, and aminoacyl-tRNA synthetase ('aaRS') that do not cross-react with any components of the yeast translational machinery (see, e.g., Noren et al., (1989) Science 244:182; Furter (1998) Protein Sci. 7:419; and Liu et al., (1999) PNAS USA 96:4780). One candidate orthogonal pair is the amber suppressor tyrosyl-tRNA synthetase-tRNACUA pair from E. coli (see, e.g., H. M. Goodman, et al., (1968), Nature 217:1019-24; and D. G. Barker, et al., (1982), FEBS Letters 150:419-23). When both are genetically encoded in S. cerevisiae, E. coli tyrosyl-tRNA synthetase (TyrRS) efficiently aminoacylates E. coli tRNACUA but does not aminoacylate S. cerevisiae cytoplasmic tRNAs. See, e.g., H. Edwards and P. Schimmel, (1990), Molecular & Cellular Biology 10:1633-41; and H. Edwards, et al., (1991), PNAS United States of America 88:1153-6. In addition, E. coli tyrosyl tRNACUA is a poor substrate for S. cerevisiae aminoacyl-tRNA synthetases (see, e.g., V. Trezeguet, et al., (1991), Molecular & Cellular Biology 11:2744-51), but it is processed in S. cerevisiae, exported from the nucleus to the cytoplasm (see, e.g., S. L. Wolin and A. G. Matera, (1999) Genes & Development 13:1-10), and functions efficiently in protein translation. See, e.g., H. Edwards and P. Schimmel, (1990) Molecular & Cellular Biology 10:1633-41; H. Edwards, et al., (1991), PNAS United States of America 88:1153-6; and V. Trezeguet, et al., (1991), Molecular & Cellular Biology 11:2744-51. Moreover, E. coli TyrRS does not have an editing mechanism and therefore should not proofread an unnatural amino acid ligated to the tRNA.

To alter the amino acid specificity of the orthogonal TyrRS so that it aminoacylates tRNACUA with a desired unnatural amino acid and none of the endogenous amino acids, a large library of TyrRS mutants was generated and subjected to a genetic selection. On the basis of the crystal structure of the homologous TyrRS from Bacillus stearothermophilus (see, e.g., P. Brick, et al., (1989), Journal of Molecular Biology 208:83), five residues in the active site of E. coli TyrRS that lie within 6.5 Å of the para position of the aryl ring of bound tyrosine were mutated (B. stearothermophilus, Figure 7, panel A). For example, to create the EcTyrRS library of mutants, the five positions targeted for mutation were first converted to alanine codons to produce the A5RS gene. This gene was split between two plasmids at a unique PstI site in the gene. The library was created essentially as described by techniques known in the art (see, e.g., Stemmer et al., (1993) Biotechniques 14:256-265). One plasmid contained the 5' half of the A5RS gene, and the other contained the 3' half. Mutagenesis was performed on each fragment by PCR with oligonucleotide primers that amplify the whole plasmid. The doped primers contained NNK (N = A + G + T + C and K = G + T) and BsaI restriction endonuclease recognition sites. Digestion with BsaI and ligation yielded two circular plasmids, each containing mutant copies of one half of the EcTyrRS gene. The two plasmids were then digested with PstI and assembled into a single plasmid by ligation, leading to assembly of the full-length mutant genes. The mutant EcTyrRS genes were excised from this plasmid and ligated into pA5RS/tRNACUA between the EcoRI and NotI sites. The library was transformed into S. cerevisiae MaV203:pGADGAL4(2TAG) by the PEG-lithium acetate method, yielding approximately 10^8 independent transformants.
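The size of an NNK library at five positions can be compared with the approximately 10^8 transformants reported. The sketch below enumerates the NNK codons using the standard genetic code and estimates coverage under the simplifying assumption, not made in the text, that all DNA variants are equally likely (a Poisson sampling estimate).

```python
import math

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# NNK: N = A, C, G or T at the first two positions; K = G or T at the third.
NNK_CODONS = [b1 + b2 + b3 for b1 in "ACGT" for b2 in "ACGT" for b3 in "GT"]
ENCODED = {CODON_TABLE[codon] for codon in NNK_CODONS}
STOP_CODONS = [codon for codon in NNK_CODONS if CODON_TABLE[codon] == "*"]

RANDOMIZED_POSITIONS = 5   # five active-site residues, per the text
TRANSFORMANTS = 1e8        # approximate library size, per the text

dna_diversity = len(NNK_CODONS) ** RANDOMIZED_POSITIONS
protein_diversity = len(ENCODED - {"*"}) ** RANDOMIZED_POSITIONS
# Expected fraction of DNA variants sampled at least once (equal-probability assumption).
expected_coverage = 1 - math.exp(-TRANSFORMANTS / dna_diversity)

print(f"{len(NNK_CODONS)} NNK codons, stop codons: {STOP_CODONS}")
print(f"DNA variants: {dna_diversity:,}; protein variants: {protein_diversity:,}")
print(f"Estimated coverage: {expected_coverage:.1%}")
```

NNK keeps all twenty amino acids while leaving TAG as the only stop codon, which is one common reason for choosing this degenerate codon.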

A selection strain of S. cerevisiae [MaV203:pGADGAL4(2TAG) (see, e.g., M. Vidal, et al., (1996), PNAS United States of America 93:10321-6; M. Vidal, et al., (1996), PNAS United States of America 93:10315-20; and Chin et al., (2003) Chem. Biol. 10:511)] was transformed with the library to afford 10^8 independent transformants, which were grown in the presence of 1 mM unnatural amino acid (Figure 8, panel C). Suppression of two permissive amber codons in the transcriptional activator GAL4 leads to the production of full-length GAL4 and the transcriptional activation of the GAL4-responsive HIS3, URA3, and lacZ reporter genes (Figure 8, panel A). For example, the permissive codons are T44 and R110 of Gal4. Expression of HIS3 and URA3 in media lacking uracil (-ura), or containing 20 mM 3-aminotriazole (see, e.g., G. M. Kishore and D. M. Shah, (1988), Annual Review of Biochemistry 57:627-63) (3-AT, a competitive inhibitor of the His3 protein) and lacking histidine (-his), allows positive selection of clones expressing active aaRS-tRNACUA pairs. If a mutant TyrRS charges the tRNACUA with an amino acid, the cell biosynthesizes histidine and uracil and survives. Surviving cells were amplified in the absence of 3-AT and unnatural amino acid to remove full-length GAL4 from cells that selectively incorporate the unnatural amino acid. To remove clones that incorporate endogenous amino acids in response to the amber codon, cells were grown on media containing 0.1% 5-fluoroorotic acid (5-FOA) but lacking the unnatural amino acid. Those cells that express URA3, as a result of suppression of the GAL4 amber mutations with natural amino acids, convert 5-FOA to a toxic product, which kills the cell. See, e.g., J. D. Boeke, et al., (1984), Molecular & General Genetics 197:345-6. Surviving clones were amplified in the presence of unnatural amino acid and reapplied to the positive selection. The lacZ reporter allows active and inactive synthetase-tRNA pairs to be distinguished colorimetrically (Figure 8, panel B).
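The logic of alternating positive and negative selection can be written as a small filter over clone phenotypes. This is a deliberately idealized sketch: each clone is reduced to two hypothetical boolean properties, and survival is treated as all-or-nothing, which real selections are not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    name: str
    charges_unnatural: bool  # synthetase aminoacylates tRNACUA with the unnatural amino acid
    charges_natural: bool    # synthetase aminoacylates tRNACUA with an endogenous amino acid


def suppresses_amber(clone: Clone, unnatural_present: bool) -> bool:
    """Full-length GAL4 is made (reporters on) if the tRNA is charged with any amino acid."""
    return clone.charges_natural or (unnatural_present and clone.charges_unnatural)


def positive_round(clones: list[Clone]) -> list[Clone]:
    """-ura or -his + 3-AT with unnatural amino acid: reporter expression is required."""
    return [c for c in clones if suppresses_amber(c, unnatural_present=True)]


def negative_round(clones: list[Clone]) -> list[Clone]:
    """5-FOA without unnatural amino acid: URA3 expression is lethal."""
    return [c for c in clones if not suppresses_amber(c, unnatural_present=False)]


def three_rounds(clones: list[Clone]) -> list[Clone]:
    """Positive, negative, positive, as in the selection described in the text."""
    return positive_round(negative_round(positive_round(clones)))


library = [
    Clone("inactive", False, False),
    Clone("natural-only", False, True),
    Clone("promiscuous", True, True),
    Clone("selective", True, False),
]
survivors = [c.name for c in three_rounds(library)]
```

Only the clone that depends on the unnatural amino acid passes both kinds of round in this idealized model.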

Using this approach, five novel amino acids with distinct steric and electronic properties (Figure 7, panel B) were independently added to the genetic code of S. cerevisiae. These amino acids include p-acetyl-L-phenylalanine (1), p-benzoyl-L-phenylalanine (2), p-azido-L-phenylalanine (3), O-methyl-L-tyrosine (4), and p-iodo-L-phenylalanine (5) (indicated by these numbers in Figure 7, panel B). The unique reactivity of the keto functional group of p-acetyl-L-phenylalanine allows selective modification of proteins with an array of hydrazine- or hydroxylamine-containing reagents in vitro and in vivo (see, e.g., V. W. Cornish, et al., (Aug. 28, 1996), Journal of the American Chemical Society 118:8150-8151; and Zhang, Smith, Wang, Brock, Schultz, in preparation). The heavy atom of p-iodo-L-phenylalanine may prove useful for phasing X-ray structure data (using multi-wavelength anomalous diffraction). The benzophenone and phenylazide side chains of p-benzoyl-L-phenylalanine and p-azido-L-phenylalanine allow efficient photocrosslinking of proteins in vivo and in vitro (see, e.g., Chin et al., (2002) J. Am. Chem. Soc., 124:9026; Chin and Schultz, (2002) ChemBioChem 11:1135; and Chin et al., (2002) PNAS, USA 99:11020). The methyl group of O-methyl-L-tyrosine can be readily substituted with an isotopically labeled methyl group for use as a probe of local structure and dynamics in nuclear magnetic resonance and vibrational spectroscopy. After three rounds of selection (positive-negative-positive), several colonies were isolated whose survival on -ura or on 20 mM 3-AT, -his media was strictly dependent on the addition of the selected unnatural amino acid. See Figure 8, panel D. The same clones were blue on X-gal only in the presence of 1 mM unnatural amino acid. These experiments demonstrate that the observed phenotypes result from the combination of the evolved aminoacyl-tRNA synthetase-tRNACUA pairs and their cognate amino acids (see Table 4).

For example, to select mutant synthetases, cells (~10^9) were grown for 4 hours in liquid SD-leu, -trp + 1 mM amino acid. The cells were then harvested by centrifugation, resuspended in 0.9% NaCl, and plated on SD-leu, -trp, -his + 20 mM 3-AT + 1 mM unnatural amino acid or on SD-leu, -trp, -ura + 1 mM unnatural amino acid. After 48 to 60 hours at 30°C, the cells were scraped from the plates into liquid SD-leu, -trp and grown for 15 hours at 30°C. The cells were harvested by centrifugation, resuspended in 0.9% NaCl, and plated on SD-leu, -trp + 0.1% 5-FOA. After 48 hours at 30°C, the cells were scraped into liquid SD-leu, -trp + 1 mM unnatural amino acid and grown for 15 hours. The cells were then harvested by centrifugation, resuspended in 0.9% NaCl, and plated on SD-leu, -trp, -his + 20 mM 3-AT + 1 mM unnatural amino acid or on SD-leu, -trp, -ura + 1 mM unnatural amino acid. To screen the phenotypes of selected cells, colonies (192) from each selection were transferred to wells of 96-well plates containing 0.5 mL of SD-leu, -trp and grown at 30°C for 24 hours. Glycerol (50% v/v; 0.5 mL) was added to each well, and the cells were replica plated onto agar (SD-leu, -trp; SD-leu, -trp, -his + 20 mM 3-AT; SD-leu, -trp, -ura) in the presence or absence of 1 mM unnatural amino acid. X-Gal assays were performed on SD-leu, -trp plates using the agarose overlay method.

To further demonstrate that the observed phenotypes are due to site-specific incorporation of the unnatural amino acids by the orthogonal mutant TyrRS/tRNA pairs, mutants of human superoxide dismutase 1 (hSOD) (see, e.g., H. E. Parge, et al., (1992), PNAS United States of America 89:6109-13) containing each unnatural amino acid were generated and characterized.

For example, the addition of DNA encoding a C-terminal hexahistidine tag, and mutation of the codon for Trp33 to an amber codon in the human superoxide dismutase gene, were performed by overlap PCR using PS356 (ATCC) as a template. hSOD(Trp33TAG)HIS was cloned between the GAL1 promoter and the CYC1 terminator from pYES2.1 (Invitrogen, Carlsbad, CA USA). Mutant synthetase and tRNA genes on pEcTyrRS-tRNACUA-derived plasmids were co-transformed with pYES2.1 hSOD(Trp33TAG)HIS into the strain InvSc (Invitrogen). For protein expression, cells were grown in SD-trp, -ura + raffinose, and expression was induced at an OD660 of 0.5 by the addition of galactose. hSOD mutants were purified by Ni-NTA chromatography (Qiagen, Valencia, CA, USA).

Production of hexahistidine-tagged hSOD from a gene containing an amber codon at position 33 was strictly dependent on p-acetylPheRS-1-tRNACUA and 1 mM p-acetyl-L-phenylalanine (<0.1% by densitometry in the absence of either component) (see Figure 9). Full-length hSOD containing p-acetyl-L-phenylalanine was purified (e.g., by Ni-NTA affinity chromatography) with a yield of 50 ng/mL, comparable to the yield purified from cells containing E. coli TyrRS-tRNACUA. For comparison, wild-type hSODHIS was purified with a yield of 250 ng/mL under identical conditions.

Figure 9 illustrates protein expression of hSOD(33TAG)HIS in S. cerevisiae genetically encoding unnatural amino acids (the amino acids are shown in Figure 7, panel B, and are indicated in Figure 9 by their numbering from that panel). The top of Figure 9 illustrates a Coomassie-stained SDS-polyacrylamide gel of hSOD purified from yeast in the presence (+) and absence (-) of the unnatural amino acid indicated by number, corresponding to the unnatural amino acids in Figure 7, panel B. Cells contain the mutant synthetase-tRNA pair selected for the amino acid indicated. The center of Figure 9 illustrates a Western blot probed with an antibody against hSOD. The bottom of Figure 9 illustrates a Western blot probed with an antibody against the C-terminal His6 tag.

The identity of the incorporated amino acid was determined by subjecting a tryptic digest of the mutant protein to liquid chromatography and tandem mass spectrometry. For example, for mass spectrometry, protein bands were visualized by colloidal Coomassie stain. Gel bands corresponding to wild-type and mutant SOD were excised from polyacrylamide gels, sliced into 1.5 mm cubes, reduced and alkylated, and then subjected to trypsin hydrolysis essentially as described. See, e.g., A. Shevchenko, et al., (1996), Analytical Chemistry 68:850-858. Tryptic peptides containing the unnatural amino acid were analyzed by nanoflow reversed-phase HPLC/μESI/MS with an LCQ ion trap mass spectrometer. Liquid chromatography tandem mass spectrometry (LC-MS/MS) analysis was performed on a Finnigan LCQ Deca ion trap mass spectrometer (Thermo Finnigan) fitted with a Nanospray HPLC (Agilent 1100 series). See, e.g., Figure 10, panels A-H.

The precursor ions corresponding to the singly and doubly charged ions of the peptide Val-Y*-Gly-Ser-Ile-Lys (SEQ ID NO:87), which contains the unnatural amino acid (denoted Y*), were separated and fragmented with an ion trap mass spectrometer. The fragment ion masses could be unambiguously assigned, confirming the site-specific incorporation of p-acetyl-L-phenylalanine (see Figure 10, panel A). No incorporation of tyrosine or other amino acids in place of p-acetyl-L-phenylalanine was observed, and a minimum incorporation purity of 99.8% was obtained from the signal-to-noise ratio of the peptide spectra. Similar fidelity and efficiency of protein expression were observed when p-benzoylPheRS-1, p-azidoPheRS-1, O-meTyrRS-1, or p-iodoPheRS-1 was used to incorporate p-benzoyl-L-phenylalanine, p-azido-L-phenylalanine, O-methyl-L-tyrosine, or p-iodo-L-phenylalanine into hSOD (see Figure 9 and Figure 10, panels A-H). In these experiments, p-azido-L-phenylalanine was reduced to p-amino-L-phenylalanine during sample preparation, and the latter was observed in the mass spectra. The reduction does not occur in vivo, as judged by chemical derivatization of purified SOD containing p-azido-L-phenylalanine. In control experiments, hexahistidine-tagged hSOD containing tryptophan, tyrosine, or leucine at position 33 was prepared and subjected to mass spectrometry (see Figure 10, panels F, G, and H). Ions containing amino acid 33 were clearly visible in the mass spectra of these samples.
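To make the mass spectrometric assignment concrete, the expected precursor m/z of the tryptic peptide Val-Y*-Gly-Ser-Ile-Lys can be estimated from elemental compositions. The sketch below is illustrative only: it uses standard monoisotopic atomic masses, assumes Y* is p-acetyl-L-phenylalanine (residue composition C11H11NO2), and does not reproduce any value reported in the specification. The names `RESIDUES`, `precursor_mz`, and the label "pAcF" are hypothetical.

```python
# Monoisotopic atomic masses (Da) and the proton mass.
ATOM = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727647

# Elemental compositions of amino acid residues (free amino acid minus H2O).
RESIDUES = {
    "V": {"C": 5, "H": 9, "N": 1, "O": 1},
    "G": {"C": 2, "H": 3, "N": 1, "O": 1},
    "S": {"C": 3, "H": 5, "N": 1, "O": 2},
    "I": {"C": 6, "H": 11, "N": 1, "O": 1},
    "L": {"C": 6, "H": 11, "N": 1, "O": 1},
    "K": {"C": 6, "H": 12, "N": 2, "O": 1},
    "Y": {"C": 9, "H": 9, "N": 1, "O": 2},
    "W": {"C": 11, "H": 10, "N": 2, "O": 1},
    "pAcF": {"C": 11, "H": 11, "N": 1, "O": 2},  # p-acetyl-L-phenylalanine (assumed Y*)
}


def formula_mass(formula: dict[str, int]) -> float:
    """Monoisotopic mass of an elemental composition."""
    return sum(ATOM[element] * count for element, count in formula.items())


WATER = formula_mass({"H": 2, "O": 1})


def precursor_mz(residues: list[str], charge: int = 1) -> float:
    """m/z of the [M + zH]z+ ion of a linear peptide with free termini."""
    neutral = sum(formula_mass(RESIDUES[r]) for r in residues) + WATER
    return (neutral + charge * PROTON) / charge


def peptide(position_33: str) -> list[str]:
    """Val-X-Gly-Ser-Ile-Lys with X at the position corresponding to residue 33."""
    return ["V", position_33, "G", "S", "I", "K"]


for label in ("pAcF", "W", "Y", "L"):
    print(label, round(precursor_mz(peptide(label), 1), 3), round(precursor_mz(peptide(label), 2), 3))
```

Under these assumptions the p-acetyl-L-phenylalanine peptide differs from the tyrosine-containing control by the mass of C2H2, which is the kind of shift that distinguishes the two species in the spectra.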

The independent addition of five unnatural amino acids to the genetic code of S. cerevisiae demonstrates the generality of the method and suggests that it can be applied to other unnatural amino acids, including spin-labeled, metal-binding, or photoisomerizable amino acids. This methodology can allow the generation of proteins with new or enhanced properties and can facilitate control of protein function in yeast. Moreover, in mammalian cells the E. coli tyrosyl-tRNA synthetase forms an orthogonal pair with the B. stearothermophilus tRNACUA. See, e.g., Sakamoto et al., (2002) Nucleic Acids Res. 30:4692. Therefore, one can use the aminoacyl-tRNA synthetases that have been evolved in yeast to add unnatural amino acids to the genetic codes of higher eukaryotes.

Table 4. Sequences of selected aminoacyl-tRNA synthetases

Figure A20048002115501051

a These clones also contain an Asp165Gly mutation

Example 3: Adding amino acids with novel reactivity to the genetic code of eukaryotes

A site-specific, fast, reliable and irreversible method of bioconjugation to proteins based on a [3+2] cycloaddition is demonstrated. Chemical reactions that modify proteins in a highly selective manner under physiological conditions are in great demand. See, e.g., Lemieux and Bertozzi, (1996) TIBTECH, 16:506-513. Most reactions currently used for the selective modification of proteins involve covalent bond formation between nucleophilic and electrophilic reaction partners, e.g., the reaction of α-haloketones with histidine or cysteine side chains. In these cases selectivity is determined by the number and accessibility of the nucleophilic residues in the protein. For synthetic or semisynthetic proteins, other, more selective reactions can be used, such as the reaction of an unnatural keto-amino acid with hydrazides or aminooxy compounds. See, e.g., Cornish, et al., (1996) J. Am. Chem. Soc., 118:8150-8151; and Mahal, et al., (1997) Science, 276:1125-1128. Recently it has become possible to genetically encode unnatural amino acids in bacteria and yeast using orthogonal tRNA-synthetase pairs with altered amino acid specificities (see, e.g., Wang, et al., (2001) Science 292:498-500; Chin, et al., (2002) J. Am. Chem. Soc. 124:9026-9027; and Chin, et al., (2002) Proc. Natl. Acad. Sci., 99:11020-11024), including ketone-containing amino acids (see, e.g., Wang, et al., (2003) Proc. Natl. Acad. Sci., 100:56-61; Zhang, et al., (2003) Biochemistry, 42:6735-6746; and Chin, et al., (2003) Science, in press). This approach has made possible the selective labeling of virtually any protein with a host of reagents, including fluorophores, crosslinking agents and cytotoxic molecules.

A highly efficient method for the selective modification of proteins is described, which involves the genetic incorporation of azide- or acetylene-containing unnatural amino acids into proteins in response to, e.g., the amber nonsense codon, TAG. These amino acid side chains can then be modified by a Huisgen [3+2] cycloaddition reaction (see, e.g., Padwa, A. in Comprehensive Organic Synthesis, Vol. 4, (1991) Ed. Trost, B. M., Pergamon, Oxford, p. 1069-1109; and Huisgen, R. in 1,3-Dipolar Cycloaddition Chemistry, (1984) Ed. Padwa, A., Wiley, New York, p. 1-176) with alkynyl (acetylene) or azide derivatives, respectively. Because this method involves a cycloaddition rather than a nucleophilic substitution, proteins can be modified with extremely high selectivity (another method that can be used is ligand exchange on a biarsenical compound with a tetracysteine motif; see, e.g., Griffin, et al., (1998) Science 281:269-272). The reaction can be carried out at room temperature under aqueous conditions with excellent regioselectivity (1,4 > 1,5) by adding catalytic amounts of Cu(I) salts to the reaction mixture. See, e.g., Tornoe, et al., (2002) J. Org. Chem. 67:3057-3064; and Rostovtsev, et al., (2002) Angew. Chem. Int. Ed. Engl. 41:2596-2599. Indeed, Finn and coworkers have shown that this azide-alkyne [3+2] cycloaddition can be conducted on the surface of an intact cowpea mosaic virus. See, e.g., Wang, et al., (2003) J. Am. Chem. Soc., 125:3192-3193. For another recent example of the electrophilic introduction of an azido group into a protein followed by a [3+2] cycloaddition, see, e.g., Speers, et al., (2003) J. Am. Chem. Soc., 125:4686-4687.

To selectively introduce alkynyl (acetylene) or azide functional groups at unique sites in eukaryotic proteins, evolved orthogonal TyrRS/tRNACUA pairs were generated in yeast that genetically encode the acetylene and azido amino acids shown as 1 and 2 of Figure 11, respectively. The resulting proteins can be efficiently and selectively labeled with fluorophores in a subsequent cycloaddition reaction under physiological conditions.

Previously, an E. coli tyrosyl tRNA-tRNA synthetase pair was shown to be orthogonal in yeast, i.e., neither the tRNA nor the synthetase cross-reacts with the endogenous yeast tRNAs or synthetases. See, e.g., Chin, et al., (2003) Chem. Biol., 10:511-519. This orthogonal tRNA-synthetase pair has been used to selectively and efficiently incorporate a number of unnatural amino acids into proteins in yeast in response to the TAG codon (e.g., Chin, et al., (2003) Science, in press). To alter the amino acid specificity of the E. coli tyrosyl-tRNA synthetase so that it accepts amino acid 1 or 2 of Figure 11, a library of ~10^7 mutants was generated by randomizing the codons for Tyr37, Asn126, Asp182, Phe183 and Leu186. These five residues were chosen on the basis of the crystal structure of the homologous synthetase from Bacillus stearothermophilus. To obtain a synthetase for which the particular amino acid serves as a substrate, a selection scheme was used in which the codons for Thr44 and Arg110 of the gene for the transcriptional activator GAL4 were converted to amber nonsense codons (TAG). See, e.g., Chin, et al., (2003) Chem. Biol., 10:511-519. Suppression of these amber codons in the MaV203:pGADGAL4(2TAG) yeast strain leads to the production of full-length GAL4 (see, e.g., Keegan, et al., (1986) Science, 231:699-704; and Ptashne, (1988) Nature, 335:683-689), which in turn drives expression of the HIS3 and URA3 reporter genes. The products of these genes complement histidine and uracil auxotrophy, allowing clones harboring active synthetase mutants to be selected in the presence of 1 or 2 of Figure 11. Synthetases that load endogenous amino acids are removed by growth on medium that lacks 1 or 2 of Figure 11 but contains 5-fluoroorotic acid, which URA3 converts into a toxic product. By passing the library through three rounds of selection (positive, negative, positive), we identified synthetases selective for 1 of Figure 11 (pPR-EcRS1-5) and for 2 of Figure 11 (pAZ-EcRS1-6), as shown in Table 8.
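As a non-limiting illustration, the logic of the positive/negative/positive selection described above can be sketched in Python. The model is deliberately simplified and is not part of the disclosed method: each library member is represented only by the classes of amino acid it can load, and the names `Synthetase`, `makes_gal4` and the four example variants are hypothetical.

```python
from dataclasses import dataclass

NATURAL = "natural"   # any of the 20 common amino acids (always present in cells)
UAA = "unnatural"     # amino acid 1 or 2 of Figure 11 (present only if added)


@dataclass(frozen=True)
class Synthetase:
    """Hypothetical library member, described only by what it can load."""
    name: str
    substrates: frozenset


def makes_gal4(rs: Synthetase, uaa_in_medium: bool) -> bool:
    # Full-length GAL4 requires suppression of the TAG codons, which in this
    # simplified model happens whenever the variant has a substrate available.
    available = {NATURAL} | ({UAA} if uaa_in_medium else set())
    return bool(rs.substrates & available)


def positive_round(pool):
    # Medium lacking uracil (or histidine) plus 1 or 2: only GAL4 producers grow.
    return [rs for rs in pool if makes_gal4(rs, uaa_in_medium=True)]


def negative_round(pool):
    # Medium with 5-fluoroorotic acid and no 1 or 2: GAL4 producers express
    # URA3, convert 5-fluoroorotic acid to a toxic product, and are lost.
    return [rs for rs in pool if not makes_gal4(rs, uaa_in_medium=False)]


def three_round_selection(pool):
    # Positive, then negative, then positive, as in the text.
    return positive_round(negative_round(positive_round(pool)))


library = [
    Synthetase("inactive", frozenset()),
    Synthetase("natural-only", frozenset({NATURAL})),
    Synthetase("promiscuous", frozenset({NATURAL, UAA})),
    Synthetase("selective", frozenset({UAA})),
]
survivors = [rs.name for rs in three_round_selection(library)]
print(survivors)
```

In this sketch the inactive variant fails the first positive round, and the variants that load a natural amino acid are lost in the negative round, so only the variant that loads the unnatural amino acid alone remains.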

All of the synthetases show strong sequence similarity, including a conserved Asn126, which suggests an important functional role for this residue. Surprisingly, the synthetases pPR-EcRS-2 and pAZ-EcRS-6, which were evolved to bind 1 and 2 of Figure 11, respectively, converged on the same sequence (Tyr37→Thr37, Asn126→Asn126, Asp182→Ser182, Phe183→Ala183 and Leu186→Leu186). The hydrogen bonds between the phenolic hydroxyl group of bound tyrosine and Tyr37 and Asp182 are disrupted by the mutations to Thr and Ser, respectively. Phe183 is converted to Ala, possibly providing more space to accommodate the unnatural amino acid. To confirm the ability of this synthetase (and the other synthetases) to accept either amino acid as a substrate, selection strains harboring the synthetase plasmids were grown on medium lacking uracil (the same results were obtained with medium lacking histidine) but supplemented with 1 or 2 of Figure 11. The growth results revealed that four of the five alkyne synthetases were able to load both unnatural amino acids onto their tRNA. The azido synthetases appear to be more selective, since only pAZ-EcRS-6 (which is identical to pPR-EcRS-2) was able to aminoacylate its tRNA with both 1 and 2 of Figure 11. The fact that no growth was detected in the absence of 1 or 2 of Figure 11 suggests that the synthetases do not accept any of the 20 common amino acids as a substrate. See Figure 14.

For all further experiments pPR-EcRS-2 (pAZ-EcRS-6) was used, which allows one to control which unnatural amino acid is incorporated simply by adding 1 or 2 of Figure 11 to the medium containing the expression strain. For protein production, the codon for the permissive residue Trp33 of human superoxide dismutase-1 (SOD) fused to a C-terminal 6xHis tag was mutated to TAG. For example, human superoxide dismutase (Trp33TAG) HIS was cloned between the GAL1 promoter and the CYC1 terminator of pYES2.1 (Invitrogen, Carlsbad, CA, USA). The mutant synthetase and tRNA genes on pECTyrRS-tRNACUA-derived plasmids were co-transformed with pYES2.1SOD(Trp33TAG)HIS into the InvSc strain (Invitrogen). For protein expression, cells were grown in SD-tr, -ura+raffinose, and expression was induced at an OD660 of 0.5 by the addition of galactose. Protein was expressed in the presence or absence of 1 mM 1 or 2 of Figure 11 and purified by Ni-NTA chromatography (Qiagen, Valencia, CA, USA).

SDS-PAGE and Western blot analysis revealed unnatural amino acid-dependent protein expression with a fidelity of >99%, as judged by densitometric comparison with protein expression in the absence of 1 or 2 of Figure 11. See Figure 12. To further confirm the identity of the incorporated amino acid, a tryptic digest was subjected to liquid chromatography and tandem mass spectrometry.

For example, wild-type and mutant hSOD were purified on nickel affinity columns, and protein bands were visualized by colloidal Coomassie staining. Gel bands corresponding to wild-type and mutant SOD were excised from the polyacrylamide gels, sliced into 1.5 mm cubes, reduced and alkylated, and then subjected to tryptic hydrolysis essentially as described above. See, e.g., Shevchenko, A., et al., (1996) Anal. Chem. 68:850-858. Tryptic peptides containing the unnatural amino acid were analyzed by nanoflow reversed-phase HPLC/μESI/MS with an LCQ ion trap mass spectrometer. See Figure 15, panels A and B. Liquid chromatography tandem mass spectrometry (LC-MS/MS) analysis was performed on a Finnigan LCQ Deca ion trap mass spectrometer (Thermo Finnigan) fitted with a nanospray HPLC (Agilent 1100 series).

Precursor ions corresponding to the singly and doubly charged ions of the peptide VY*GSIK (SEQ ID NO: 87), which contains the unnatural amino acid (denoted Y*), were isolated and fragmented with an ion trap mass spectrometer. The fragment ion masses could be unambiguously assigned, confirming the site-specific incorporation of each unnatural amino acid. LC-MS/MS did not indicate incorporation of any natural amino acid at this position. The signal-to-noise ratio of the peptide for all mutants was >1000, suggesting a fidelity of incorporation better than 99.8%. See Figure 15, panels A and B.
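By way of illustration only, the m/z values expected for the singly and doubly charged precursor ions of VY*GSIK can be estimated from monoisotopic masses. The sketch below assumes that Y* is amino acid 1 of Figure 11 with the formula C12H13NO3 given in Example 4; the residue masses are standard rounded values, the function name `peptide_mz` is hypothetical, and the results are calculated values, not data reported in this disclosure.

```python
# Monoisotopic masses in Da (standard rounded values).
H, C, N, O = 1.00782503, 12.0, 14.0030740, 15.9949146
PROTON = 1.00727647
WATER = 2 * H + O

# Monoisotopic residue masses (free amino acid minus one water).
RESIDUE = {
    "V": 99.06841,
    "G": 57.02146,
    "S": 87.03203,
    "I": 113.08406,
    "K": 128.09496,
}

# Assumption: Y* is amino acid 1 (C12H13NO3); its residue mass is the
# mass of the free amino acid minus one water.
RESIDUE["Y*"] = 12 * C + 13 * H + N + 3 * O - WATER


def peptide_mz(residues, charge):
    """m/z of the [M + charge*H] ion of a linear, unmodified peptide."""
    neutral = sum(RESIDUE[r] for r in residues) + WATER
    return (neutral + charge * PROTON) / charge


PEPTIDE = ["V", "Y*", "G", "S", "I", "K"]
mz_1 = peptide_mz(PEPTIDE, 1)
mz_2 = peptide_mz(PEPTIDE, 2)
print(f"[M+H]+ = {mz_1:.3f}, [M+2H]2+ = {mz_2:.3f}")
```

Replacing the `Y*` entry with the residue mass of tyrosine or another natural amino acid gives the precursor masses that would indicate misincorporation at this position.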

To demonstrate that small organic molecules can be conjugated to proteins by an azide-alkyne [3+2] cycloaddition reaction, the dyes 3-6 shown in Figure 13, panel A, which contain either an acetylenic or an azido group and bear a dansyl or fluorescein fluorophore, were synthesized (see Example 5 herein). The cycloaddition itself was carried out with 0.01 mM protein in phosphate buffer (PB), pH 8, in the presence of 2 mM of 3-6 shown in Figure 13, panel A, 1 mM CuSO4 and ~1 mg of Cu wire for 4 hours at 37°C (see Figure 13, panel B).

For example, to 45 microliters of protein in PB buffer (pH = 8) were added 1 microliter of CuSO4 (50 mM in H2O), 2 microliters of dye (50 mM in EtOH), 2 microliters of tris(1-benzyl-1H-[1,2,3]triazol-4-ylmethyl)amine (50 mM in DMSO) and Cu wire. After 4 hours at room temperature or 37°C, or overnight at 4°C, 450 microliters of H2O were added and the mixture was spun through a dialysis membrane (10 kDa cut-off). After the supernatant was washed with 2x500 microliters by centrifugation, the solution was brought to a volume of 50 microliters. A 20 microliter sample was analyzed by SDS-PAGE. Occasionally, remaining dye could be removed from the gel by soaking it overnight in H2O/MeOH/AcOH (5:5:1). The use of tris(carboxyethyl)phosphine as the reducing agent generally led to less efficient labeling. In contrast to earlier observations (e.g., Wang, Q., et al., (2003) J. Am. Chem. Soc. 125:3192-3193), the presence or absence of the tris(triazolyl)amine ligand did not substantially influence the outcome of the reaction.
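As an illustrative check of the pipetting scheme above, the final reagent concentrations follow from simple dilution, C_final = C_stock × V_added / V_total. The short Python sketch below uses only the volumes and stock concentrations stated in the text; the function and variable names are hypothetical.

```python
def final_concentration_mM(stock_mM, added_uL, total_uL):
    """Concentration after dilution: C_final = C_stock * V_added / V_total."""
    return stock_mM * added_uL / total_uL


# (stock concentration in mM, volume in microliters); the protein stock
# concentration is not stated in the text, so it is left as None.
additions = {
    "protein in PB": (None, 45.0),
    "CuSO4": (50.0, 1.0),
    "dye": (50.0, 2.0),
    "tris(triazolyl)amine ligand": (50.0, 2.0),
}

total_uL = sum(volume for _, volume in additions.values())
final_mM = {
    name: final_concentration_mM(stock, volume, total_uL)
    for name, (stock, volume) in additions.items()
    if stock is not None
}

print(f"total volume: {total_uL} microliters")
for name, conc in final_mM.items():
    print(f"{name}: {conc} mM")
```

The total reaction volume is 50 microliters, and the calculated CuSO4 and dye concentrations agree with the 1 mM CuSO4 and 2 mM dye stated for the cycloaddition.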

After dialysis, the labeled proteins were analyzed by SDS-PAGE and imaged in-gel using a densitometer in the case of the dansyl dyes 3-4 shown in Figure 13, panel A (λex = 337 nm, λem = 506 nm) or a phosphorimager in the case of the fluorescein dyes 5-6 shown in Figure 13, panel A (λex = 483 nm, λem = 516 nm). See, e.g., Blake, (2001) Curr. Opin. Pharmacol., 1:533-539; Wouters, et al., (2001) Trends in Cell Biology 11:203-211; and Zacharias, et al., (2000) Curr. Opin. Neurobiol., 10:416-421. The labeled proteins were characterized by LC-MS/MS analysis of tryptic digests, which showed site-specific attachment of the fluorophores with an average conversion of 75% (e.g., as determined by comparing the A280/A495 values for SOD labeled with 5 or 6 shown in Figure 13, panel A). The selectivity of this bioconjugation is verified by the fact that there was no observable reaction between 3 shown in Figure 13, panel A and the alkyne protein, or between 4 shown in Figure 13, panel A and the azido protein.

Table 8. Evolved synthetases

pPR-EcRS selected for 1 and pAZ-EcRS selected for 2 (as shown in Figure 11)

Figure A20048002115501091

Example 4: Synthesis of an alkyne amino acid

In one aspect, the invention provides alkynyl amino acids. An example of a structure of an alkynyl amino acid is illustrated by Formula IV:

Figure A20048002115501092

An alkynyl amino acid is typically any structure having Formula IV, where R1 is a substituent used in one of the twenty natural amino acids and R2 is an alkynyl substituent. For example, 1 in Figure 11 illustrates the structure of p-propargyloxyphenylalanine. p-Propargyloxyphenylalanine can be synthesized, e.g., as outlined below. In this embodiment, the synthesis of p-propargyloxyphenylalanine can be completed in three steps starting from commercially available N-Boc-tyrosine.

For example, N-tert-butoxycarbonyl-tyrosine (2 g, 7 mmol, 1 equiv.) and K2CO3 (3 g, 21 mmol, 3 equiv.) were suspended in anhydrous DMF (15 mL). Propargyl bromide (2.1 mL, 21 mmol, 3 equiv., 80% solution in toluene) was slowly added and the reaction mixture was stirred for 18 hours at room temperature. Water (75 mL) and Et2O (50 mL) were added, the layers were separated and the aqueous phase was extracted with Et2O (2x50 mL). The combined organic layers were dried (MgSO4) and the solvent was removed under reduced pressure. The product was obtained as a yellow oil (2.3 g, 91%) and used in the next step without further purification. The Boc-protected product is illustrated below as chemical structure 8:

Figure A20048002115501101

2-tert-Butoxycarbonylamino-3-[4-(prop-2-ynyloxy)phenyl]-propionic acid propargyl ester

Acetyl chloride (7 mL) was added carefully to methanol (60 mL) at 0°C to give a 5 M solution of anhydrous HCl in MeOH. The product of the previous step (2 g, 5.6 mmol) was added and the reaction was stirred for 4 hours while it was allowed to warm to ambient temperature. After the volatiles were removed under reduced pressure, a yellowish solid (1.6 g, 98%) was obtained (see chemical structure 9), which was used directly in the next step.

2-Amino-3-[4-(prop-2-ynyloxy)phenyl]-propionic acid propargyl ester

The propargyl ester from the previous step (1.6 g, 5.5 mmol) was dissolved in a mixture of aqueous 2 N NaOH (14 mL) and MeOH (10 mL). After stirring for 1.5 hours at room temperature, the pH was adjusted to 7 by adding concentrated HCl. Water (20 mL) was added and the mixture was kept at 4°C overnight. The precipitate was filtered, washed with ice-cold H2O and dried under vacuum, yielding 1.23 g (90%) of 1 in Figure 11 (2-amino-3-[4-(prop-2-ynyloxy)phenyl]-propionic acid (1), also known as p-propargyloxyphenylalanine) as a white solid. 1H NMR (400 MHz, D2O) (as the potassium salt in D2O) δ 7.20 (d, J = 8.8 Hz, 2H), 6.99 (d, J = 8.8 Hz, 2H), 4.75 (s, 2H), 3.50 (dd, J = 5.6, 7.2 Hz, 1H), 2.95 (dd, J = 5.6, 13.6 Hz, 1H), 2.82 (dd, J = 7.2, 13.6 Hz, 1H); 13C NMR (100 MHz, D2O) δ 181.3, 164.9, 155.6, 131.4, 130.7, 115.3, 57.3, 56.1, 39.3; HRMS (CI) m/z 220.0969 [C12H13NO3 (M+1) requires 220.0968].
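As an illustrative cross-check of the HRMS entry above, the calculated m/z of the (M+1) ion of C12H13NO3 can be reproduced from monoisotopic atomic masses, and the deviation of the observed value expressed in parts per million. The sketch below is not part of the reported procedure; the helper names are hypothetical, and the formula parser handles only simple formulas without parentheses.

```python
import re

# Monoisotopic atomic masses in Da (standard rounded values).
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}
ELECTRON = 0.00054858


def parse_formula(formula):
    """Turn a simple formula such as 'C12H13NO3' into {'C': 12, 'H': 13, ...}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    return sum(MONOISOTOPIC[el] * n for el, n in parse_formula(formula).items())


def mz_m_plus_1(formula):
    # (M+1) ion: neutral molecule plus one hydrogen atom minus one electron.
    return monoisotopic_mass(formula) + MONOISOTOPIC["H"] - ELECTRON


def ppm_error(observed, calculated):
    return (observed - calculated) / calculated * 1e6


calculated = mz_m_plus_1("C12H13NO3")
print(f"calculated m/z: {calculated:.4f}")
print(f"deviation of 220.0969: {ppm_error(220.0969, calculated):.2f} ppm")
```

The same functions can be applied to the other HRMS entries in these examples, provided the formula passed in is that of the neutral molecule.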

Example 5: Addition of molecules to proteins with an unnatural amino acid through a [3+2] cycloaddition

In one aspect, the invention provides methods and related compositions for coupling proteins that contain unnatural amino acids to additional substituent molecules. For example, additional substituents can be added to an unnatural amino acid through a [3+2] cycloaddition. See, e.g., Figure 16. For example, the [3+2] cycloaddition of a desired molecule (e.g., one that includes a second reactive group, such as an alkyne triple bond or an azido group) to a protein with an unnatural amino acid (e.g., one having a first reactive group, such as an azido group or a triple bond) can be performed under the [3+2] cycloaddition conditions disclosed below. For example, a protein comprising the unnatural amino acid in PB buffer (pH = 8) is added to CuSO4, the desired molecule and Cu wire. After the mixture is incubated (e.g., about 4 hours at room temperature or 37°C, or overnight at 4°C), H2O is added and the mixture is filtered through a dialysis membrane. The sample can then be analyzed for the addition, e.g., by gel analysis.

Examples of such molecules include, but are not limited to, molecules having a triple bond or an azido group, such as molecules having the structures of Formulas 3, 4, 5 and 6 of Figure 13, panel A, and the like. Furthermore, triple bonds or azido groups can be incorporated into the structures of other molecules of interest, such as polymers (e.g., poly(ethylene glycol) and derivatives), crosslinking agents, additional dyes, photocrosslinkers, cytotoxic compounds, affinity labels, biotin, saccharides, resins, beads, a second protein or polypeptide, metal chelators, cofactors, fatty acids, carbohydrates, polynucleotides (e.g., DNA, RNA, etc.) and the like, which can then also be used in [3+2] cycloadditions.

In one aspect of the invention, molecules having Formula 3, 4, 5 or 6 of Figure 13, panel A can be synthesized as described below. For example, the alkyne dye shown as 3 in Figure 13, panel A and in chemical structure 3 below was synthesized by adding propargylamine (250 microliters, 3.71 mmol, 3 equiv.) to a solution of dansyl chloride (500 mg, 1.85 mmol, 1 equiv.) and triethylamine (258 microliters, 1.85 mmol, 1 equiv.) in CH2Cl2 (10 mL) at 0°C. After stirring for 1 hour, the reaction mixture was warmed to room temperature and stirred for an additional hour. The volatiles were removed in vacuo and the crude product was purified by chromatography on silica gel (Et2O/hexanes = 1:1), yielding 3 of Figure 13, panel A (418 mg, 78%) as a yellow solid. The analytical data are identical to those reported in the literature. See, e.g., Bolletta, F., et al., (1996) Organometallics 15:2415-17. An example of a structure of an alkyne dye that can be used in the invention is shown in chemical structure 3:

Figure A20048002115501111

The azide dye shown as 4 in Figure 13, panel A and in chemical structure 4 below was synthesized by adding 3-azidopropylamine (e.g., as described by Carboni, B., et al., (1993) J. Org. Chem. 58:3736-3741) (371 mg, 3.71 mmol, 3 equiv.) to a solution of dansyl chloride (500 mg, 1.85 mmol, 1 equiv.) and triethylamine (258 microliters, 1.85 mmol, 1 equiv.) in CH2Cl2 (10 mL) at 0°C. After stirring for 1 hour, the reaction mixture was warmed to room temperature and stirred for an additional hour. The volatiles were removed in vacuo and the crude product was purified by chromatography on silica gel (Et2O/hexanes = 1:1), yielding 4 of Figure 13, panel A (548 mg, 89%) as a yellow oil. 1H NMR (400 MHz, CDCl3) δ 8.55 (d, J = 8.4 Hz, 1H), 8.29 (d, J = 8.8 Hz, 1H), 8.23 (dd, J = 1.2, 7.2 Hz, 1H), 7.56-7.49 (comp, 2H), 7.18 (d, J = 7.6 Hz, 1H), 5.24 (br s, 1H), 3.21 (t, J = 6.4 Hz, 2H), 2.95 (dt, J = 6.4 Hz, 2H), 2.89 (s, 6H), 1.62 (quin, J = 6.4 Hz, 2H); 13C NMR (100 MHz, CDCl3) δ 134.3, 130.4, 129.7, 129.4, 128.4, 123.3, 118.8, 115.3, 48.6, 45.4, 40.6, 28.7 (not all signals of quaternary carbon atoms are visible in the 13C NMR spectrum); HRMS (CI) m/z 334.1336 [C15H20N5O2S (M+1) requires 334.1332]. An example of a structure of an azide dye is shown in chemical structure 4:

Figure A20048002115501121

The alkyne dye shown as 5 in Figure 13, panel A and in chemical structure 5 below was synthesized by adding EDCI (1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride) (83 mg, 0.43 mmol, 1 equiv.) to a solution of fluoresceinamine (150 mg, 0.43 mmol, 1 equiv.) and 10-undecynoic acid (79 mg, 0.43 mmol, 1 equiv.) in pyridine (2 mL) at room temperature. The suspension was stirred overnight and the reaction mixture was poured into H2O (15 mL). The solution was acidified (pH < 2) by adding concentrated HCl. After stirring for 1 hour, the precipitate was filtered off, washed with H2O (5 mL) and dissolved in a small amount of EtOAc. Addition of hexanes led to the precipitation of 5 of Figure 13, panel A as orange crystals, which were collected and dried under vacuum (138 mg, 63%). The analytical data are identical to those reported in the literature. See, e.g., Crisp, G. T. and Gore, J., (1997) Tetrahedron 53:1505-1522. An example of a structure of an alkyne dye is shown in chemical structure 5:

Figure A20048002115501122

The azide dye shown as 6 in Figure 13, panel A and in chemical structure 6 below was synthesized by adding EDCI (1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride) (83 mg, 0.43 mmol, 1 equiv.) to a solution of fluoresceinamine (150 mg, 0.43 mmol, 1 equiv.) and 4-(3-azidopropylcarbamoyl)-butyric acid (e.g., synthesized by reacting 3-azidopropylamine with glutaric acid anhydride) (92 mg, 0.43 mmol, 1 equiv.) in pyridine (2 mL) at room temperature. The suspension was stirred overnight and the reaction mixture was poured into H2O (15 mL). The solution was acidified (pH < 2) by adding concentrated HCl. After stirring for 1 hour, the precipitate was filtered off, washed with 1 N HCl (3x3 mL) and dissolved in a small amount of EtOAc. Addition of hexanes led to the precipitation of 6 of Figure 13, panel A as orange crystals, which were collected and dried under vacuum (200 mg, 86%). 1H NMR (400 MHz, CD3OD) δ 8.65 (s, 1H), 8.15 (d, J = 8.4 Hz, 1H), 7.61-7.51 (comp, 2H), 7.40 (d, J = 8.4 Hz, 1H), 7.35 (br s, 2H), 7.22-7.14 (comp, 2H), 6.85-6.56 (comp, 3H), 3.40-3.24 (comp, 4H), 2.54 (t, J = 7.2 Hz, 2H), 2.39-2.30 (comp, 2H), 2.10-1.99 (comp, 2H), 1.82-1.72 (comp, 2H); 13C NMR (100 MHz, CD3OD) δ 175.7, 174.4, 172.4, 167.9, 160.8, 143.0, 134.3, 132.9, 131.8, 129.6, 124.4, 123.3, 121.1, 118.5, 103.5, 50.2, 38.0, 37.2, 36.2, 29.8, 22.9 (not all signals of quaternary carbon atoms are visible in the 13C NMR spectrum); HRMS (CI) m/z 544.1835 [C28H25N5O7 (M+1) requires 544.1827]. An example of a structure of an azide dye is shown in chemical structure 6:

Figure A20048002115501131
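For illustration, the percent yields reported for the azide dyes 4 and 6 can be recomputed from the quantities given above. The sketch assumes that the limiting reagent is dansyl chloride (1.85 mmol) for 4 and fluoresceinamine (0.43 mmol) for 6, takes the neutral formula of 4 as C15H19N5O2S (the (M+1) formula quoted in the HRMS entry less one hydrogen) and that of 6 as C28H25N5O7, and uses rounded average atomic masses; the function names are hypothetical.

```python
# Average atomic masses in g/mol (standard rounded values).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}


def molar_mass(composition):
    """Molar mass in g/mol from an element-count mapping."""
    return sum(ATOMIC_MASS[el] * n for el, n in composition.items())


def percent_yield(isolated_mg, limiting_mmol, product_composition):
    # Theoretical mass (mg) = mmol of limiting reagent * molar mass (mg/mmol).
    theoretical_mg = limiting_mmol * molar_mass(product_composition)
    return 100.0 * isolated_mg / theoretical_mg


# Assumed neutral compositions of the products (see lead-in).
DYE_4 = {"C": 15, "H": 19, "N": 5, "O": 2, "S": 1}
DYE_6 = {"C": 28, "H": 25, "N": 5, "O": 7}

yield_4 = percent_yield(isolated_mg=548, limiting_mmol=1.85, product_composition=DYE_4)
yield_6 = percent_yield(isolated_mg=200, limiting_mmol=0.43, product_composition=DYE_6)

print(f"dye 4: {yield_4:.1f}% (reported 89%)")
print(f"dye 6: {yield_6:.1f}% (reported 86%)")
```

Under these assumptions the recomputed values round to the reported yields.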

In one embodiment, a PEG molecule can also be added to a protein with an unnatural amino acid, e.g., an azido amino acid or a propargyl amino acid. For example, a propargyl amide PEG (e.g., as illustrated in Figure 17, panel A) can be added to a protein with an azido amino acid through a [3+2] cycloaddition. See, e.g., Figure 17, panel A. Figure 17, panel B illustrates a gel analysis of a protein with an added PEG substituent.

In one aspect of the invention, a propargyl amide PEG (e.g., as illustrated in Figure 17, panel A) can be synthesized as described below. For example, a solution of propargylamine (30 microliters) in CH2Cl2 (1 mL) was added to 20 kDa PEG-hydroxysuccinimide ester (120 mg, purchased from Nektar). The reaction was stirred for 4 hours at room temperature. Et2O (10 mL) was then added, and the precipitate was filtered off and recrystallized twice from MeOH (1 mL) by the addition of Et2O (10 mL). The product was dried under vacuum, furnishing a white solid (105 mg, 88% yield). See, e.g., Figure 17, panel C.

Example 6: Exemplary O-RS and O-tRNA

An exemplary O-tRNA comprises SEQ ID NO.: 65 (see Table 5). Exemplary O-RSs include SEQ ID NOs.: 36-63 and 86 (see Table 5). Examples of polynucleotides that encode O-RSs or portions thereof (e.g., the active site) include SEQ ID NOs.: 3-35. In addition, exemplary amino acid changes of O-RSs are indicated in Table 6.

Table 6: Evolved EcTyrRS variants

Figure A20048002115501141

a These clones also contain an Asp165Gly mutation.
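The relationship between the polynucleotide and amino acid sequences in Table 5, and the effect of a point mutation such as Asp165Gly, can be illustrated by translating short in-frame fragments with the standard genetic code. The sketch below is illustrative only: the fragments are copied from the sequences printed for SEQ ID NO.: 1 and SEQ ID NO.: 17, the helper name `translate` is hypothetical, and the code assumes that the fragment starts on a codon boundary.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    protein = []
    dna = dna.upper()
    for i in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

# First ten codons of SEQ ID NO.: 1 (wild-type E. coli TyrRS polynucleotide).
print(translate("ATGGCAAGCAGTAACTTGATTAAACAATTG"))

# Codons for residues 162-167: wild type (GAT = Asp) and the corresponding
# stretch printed for SEQ ID NO.: 17 (GGT = Gly).
print(translate("AACCGTGAAGATCAGGGG"), translate("AACCGTGAAGGTCAGGGG"))
```

The first call reproduces the opening residues of SEQ ID NO.: 2, and the second pair differs only at the position where a single A-to-G change converts an Asp codon into a Gly codon.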

It is understood that the examples and embodiments described herein are for illustrative purposes only, and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and the scope of the appended claims.

While the foregoing invention has been described in some detail for purposes of clarity and understanding, it will be clear to one skilled in the art from a reading of this disclosure that various changes in form and detail can be made without departing from the true scope of the invention. For example, all of the techniques and apparatus described above can be used in various combinations. All publications, patents, patent applications, and/or other documents cited in this application are incorporated by reference in their entirety for all purposes to the same extent as if each individual publication, patent, patent application, and/or other document were individually indicated to be incorporated by reference for all purposes.

Table 5

  SEQ IDNO.:SEQ ID NO.:   标记mark   序列sequence   SEQ IDNO.:1SEQ ID NO.: 1   大肠杆菌野生型TyrRS(合成酶)多核苷酸Escherichia coli wild-type TyrRS (synthetase) polynucleotide   ATGGCAAGCAGTAACTTGATTAAACAATTGCAAGAGCGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCGCTCTATTGCGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCGAACAACTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGTTGCAGGGTTATGACTTCGCCTGTCTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTGTTTGGCCTGACCGTTCCGCTGATCACTAAAGCAGATGGCACCAAATTTGGTAAAACTGAAGGCGGCGCAGTCTGGTTGGATCCGAAGAAAACCAGCCCGTACAAATTCTACCAGTTCTGGATCAACACTGCGGATGCCGACGTTTACCGCTTCCTGAAGTTCTTCACCTTTATGAGCATTGAAGAGATCAACGCCCTGGAAGAAGAAGATAAAAACAGCGGTAAAGCACCGCGCGCCCAGTATGTACTGGCGGAGCAGGTGACTCGTCTGGTTCACGGTGAAGAAGGTTTACAGGCGGCAAAACGTATTACCGAATGCCTGTTCAGCGGTTCTTTGAGTGCGCTGAGTGAAGCGGACTTCGAACAGCTGGCGCAGGACGGCGTACCGATGGTTGAGATGGAAAAGGGCGCAGACCTGATGCAGGCACTGGTCGATTCTGAACTGCAACCTTCCCGTGGTCAGGCACGTAAAACTATCGCCTCCAATGCCATCACCATTAACGGTGAAAAACAGTCCGATCCTGAATACTTCTTTAAAGAAGAAGATCGTCTGTTTGGTCGTTTTACCTTACTGCGTCGCGGTAAAAAGAATTACTGTCTGATTTGCTGGAAATAA  
ATGGCAAGCAGTAACTTGATTAAACAATTGCAAGAGCGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCGCTCTATTGCGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCGAACAACTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGTTGCAGGGTTATGACTTCGCCTGTCTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTGTTTGGCCTGACCGTTCCGCTGATCACTAAAGCAGATGGCACCAAATTTGGTAAAACTGAAGGCGGCGCAGTCTGGTTGGATCCGAAGAAAACCAGCCCGTACAAATTCTACCAGTTCTGGATCAACACTGCGGATGCCGACGTTTACCGCTTCCTGAAGTTCTTCACCTTTATGAGCATTGAAGAGATCAACGCCCTGGAAGAAGAAGATAAAAACAGCGGTAAAGCACCGCGCGCCCAGTATGTACTGGCGGAGCAGGTGACTCGTCTGGTTCACGGTGAAGAAGGTTTACAGGCGGCAAAACGTATTACCGAATGCCTGTTCAGCGGTTCTTTGAGTGCGCTGAGTGAAGCGGACTTCGAACAGCTGGCGCAGGACGGCGTACCGATGGTTGAGATGGAAAAGGGCGCAGACCTGATGCAGGCACTGGTCGATTCTGAACTGCAACCTTCCCGTGGTCAGGCACGTAAAACTATCGCCTCCAATGCCATCACCATTAACGGTGAAAAACAGTCCGATCCTGAATACTTCTTTAAAGAAGAAGATCGTCTGTTTGGTCGTTTTACCTTACTGCGTCGCGGTAAAAAGAATTACTGTCTGATTTGCTGGAAATAA   SEQ IDNO.:2SEQ ID NO.: 2   大肠杆菌野生型TyrRS(合成酶)氨基酸(aa)Escherichia coli wild-type TyrRS (synthetase) amino acid (aa)   MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALYCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYDFACLNKQYGVVLQIGGSDQWGNITSGIDITRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK  
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALYCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYDFACLNKQYGVVLQIGGSDQWGNITSGIDITRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK   SEQ IDNO.:3SEQ ID NO.: 3   pOMe-1合成酶多核苷酸pOMe-1 synthetase polynucleotide   ATGGCAAGCAGTAACTTGATTAAACAATTGCAAGAGCGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGTGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAG  ATGGCAAGCAGTAACTTGATTAAACAATTGCAAGAGCGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGTGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAG

  GAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATAGTATGGCCTGTTTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTGTTTGGCCTGACCGTTCCGCTGATCACTAAAGCAGATGGCACCAAATTTGGTAAAACTGAAGGCGGCGCAGTCTGGTTGGATCCGAAGAAAACCAGCCCGTACAAATTCTACCAGTTCTGGATCAACACTGCGGATGCCGACGTTTACCGCTTCCTGAAGTTCTTCACCTTTATGAGCATTGAAGAGATCAACGCCCTGGAAGAAGAAGATAAAAACAGCGGTAAAGCACCGCGCGCCCAGTATGTACTGGCGGAGCAGGTGACTCGTCTGGTTCACGGTGAAGAAGGTTTACAGGCGGCAAAACGTATTACCGAATGCCTGTTCAGCGGTTCTTTGAGTGCGCTGAGTGAAGCGGACTTCGAACAGCTGGCGCAGGACGGCGTACCGATGGTTGAGATGGAAAAGGGCGCAGACCTGATGCAGGCACTGGTCGATTCTGAACTGCAACCTTCCcGTGGTCAGGCACGTAAAACTATCGCCTCCAATGCCATCACCATTAACGGTGAAAAACAGTCCGATCCTGAATACTTCTTTAAAGAAGAAGATCGTCTGTTTGGTCGTTTTACCTTACTGCGTCGCGGTAAAAAGAATTACTGTCTGATTTGCTGGAAATAA  GAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATAGTATGGCCTGTTTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTGTTTGGCCTGACCGTTCCGCTGATCACTAAAGCAGATGGCACCAAATTTGGTAAAACTGAAGGCGGCGCAGTCTGGTTGGATCCGAAGAAAACCAGCCCGTACAAATTCTACCAGTTCTGGATCAACACTGCGGATGCCGACGTTTACCGCTTCCTGAAGTTCTTCACCTTTATGAGCATTGAAGAGATCAACGCCCTGGAAGAAGAAGATAAAAACAGCGGTAAAGCACCGCGCGCCCAGTATGTACTGGCGGAGCAGGTGACTCGTCTGGTTCACGGTGAAGAAGGTTTACAGGCGGCAAAACGTATTACCGAATGCCTGTTCAGCGGTTCTTTGAGTGCGCTGAGTGAAGCGGACTTCGAACAGCTGGCGCAGGACGGCGTACCGATGGTTGAGATGGAAAAGGGCGCAGACCTGATGCAGGCACTGGTCGATTCTGAACTGCAACCTTCCcGTGGTCAGGCACGTAAAACTATCGCCTCCAATGCCATCACCATTAACGGTGAAAAACAGTCCGATCCTGAATACTTCTTTAAAGAAGAAGATCGTCTGTTTGGTCGTTTTACCTTACTGCGTCGCGGTAAAAAGAATTACTGTCTGATTTGCTGGAAATAA   SEQ IDNO.:4SEQ ID NO.: 4   
pOMe-2合成酶多核苷酸pOMe-2 synthetase polynucleotide   ATGGCAAGCAGTAACTTGATTAAACAATTGCAAGAGCGGGGGCTGGTAgCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCAGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATACGTATGCCTGTCTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTGTTTGGCCTGACCGTTCCGCTGATCACTAAAGCAGATGGCACCAAATTTGGTAAAACTGAAGGCGGCGCAGTCTGGTTGGATCCGAAGAAAACCAGCCCGTACAAATTCTACCAGTTCTGGATCAACACTGCGGATGCCGACGTTTACCGCTTCCTGAAGTTCTTCACCTTTATGAGCATTGAAGAGATCAACGCCCTGGAAGAAGAAGATAAAAACAGCGGTAAAGCACCGCGCGCCCAGTATGTACTGGCGGAGCAGGTGACTCGTCTGGTTCACGGTGAAGAAGGTTTACAGGCGGCAAAACGTATTACCGAATGCCTGTTCAGCGGTTCTTTGAGTGCGCTGAGTGAAGCGGACTTCGAACAGCTGGCGCAGGACGGCGTACCGATGGTTGAGATGGAAAAGGGCGCAGACCTGATGCAGGCACTGGTCGATTCTGAACTGCAACCTTCCCGTGGTCAGGCACGTAAAACTATCGCCTCCAATGCCATCACCATTAACGGTGAAAAACAGTCCGATCCTGAATACTTCTTTAAAGAAGAAGATCGTCTGTTTGGTCGTTTTACCTTACTGCGTCGCGGTAAAAAGAATTACTGTCTGATTTGCTGGAAATAA  
ATGGCAAGCAGTAACTTGATTAAACAATTGCAAGAGCGGGGGCTGGTAgCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCAGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATACGTATGCCTGTCTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTGTTTGGCCTGACCGTTCCGCTGATCACTAAAGCAGATGGCACCAAATTTGGTAAAACTGAAGGCGGCGCAGTCTGGTTGGATCCGAAGAAAACCAGCCCGTACAAATTCTACCAGTTCTGGATCAACACTGCGGATGCCGACGTTTACCGCTTCCTGAAGTTCTTCACCTTTATGAGCATTGAAGAGATCAACGCCCTGGAAGAAGAAGATAAAAACAGCGGTAAAGCACCGCGCGCCCAGTATGTACTGGCGGAGCAGGTGACTCGTCTGGTTCACGGTGAAGAAGGTTTACAGGCGGCAAAACGTATTACCGAATGCCTGTTCAGCGGTTCTTTGAGTGCGCTGAGTGAAGCGGACTTCGAACAGCTGGCGCAGGACGGCGTACCGATGGTTGAGATGGAAAAGGGCGCAGACCTGATGCAGGCACTGGTCGATTCTGAACTGCAACCTTCCCGTGGTCAGGCACGTAAAACTATCGCCTCCAATGCCATCACCATTAACGGTGAAAAACAGTCCGATCCTGAATACTTCTTTAAAGAAGAAGATCGTCTGTTTGGTCGTTTTACCTTACTGCGTCGCGGTAAAAAGAATTACTGTCTGATTTGCTGGAAATAA   SEQ IDNO.:5SEQ ID NO.: 5   pOMe-3合成酶多核苷酸pOMe-3 synthetase polynucleotide   
ATGGCAAGCAGTAACTTGATTAAACAATTGCAAGAGCGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGTGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATAGTATGGCCTGTTTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTGTTTGGCCTGACCGTTCCGCTGATCACTAAAGCAGATGGCACCAAATTTGGTAAAACTGAAGGCGGCGCAGTCTGGTTGGATCCGAAGAAAACCAGCCCGTACAAATTCTACCAGTTCTGGATCAACACT  ATGGCAAGCAGTAACTTGATTAAACAATTGCAAGAGCGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGTGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATAGTATGGCCTGTTTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTGTTTGGCCTGACCGTTCCGCTGATCACTAAAGCAGATGGCACCAAATTTGGTAAAACTGAAGGCGGCGCAGTCTGGTTGGATCCGAAGAAAACCAGCCCGTACAAATTCTACCAGTTCTGGATCAACACT

  GCGGATGCCGACGTTTACCGCTTCCTGAAGTTCTTCACCTTTATGAGCATTGAAGAGATCAACGCCCTGGAAGAAGAAGATAAAAACAGCGGTAAAGCACCGCGCGCCCAGTATGTACTGGCGGAGCAGGTGACTCGTCTGGTTCACGGTGAAGAAGGTTTACAGGCGGCAAAACGTATTACCGAATGCCTGTTCAGCGGTTCTTTGAGTGCGCTGAGTGAAGCGGACTTCGAACAGCTGGCGCAGGACGGCGTACCGATGGTTGAGATGGAAAAGGGCGCAGACCTGATGCAGGCACTGGTCGATTCTGAACTGCAACCTTCCCGTGGTCAGGCACGTAAAACTATCGCCTCCAATGCCATCACCATTAACGGTGAAAAACAGTCCGATCCTGAATACTTCTTTAAAGAAGAAGATCGTCTGTTTGGTCGTTTTACCTTACTGCGTCGCGGTAAAAAGAATTACTGTCTGATTTGCTGGAAATAA  GCGGATGCCGACGTTTACCGCTTCCTGAAGTTCTTCACCTTTATGAGCATTGAAGAGATCAACGCCCTGGAAGAAGAAGATAAAAACAGCGGTAAAGCACCGCGCGCCCAGTATGTACTGGCGGAGCAGGTGACTCGTCTGGTTCACGGTGAAGAAGGTTTACAGGCGGCAAAACGTATTACCGAATGCCTGTTCAGCGGTTCTTTGAGTGCGCTGAGTGAAGCGGACTTCGAACAGCTGGCGCAGGACGGCGTACCGATGGTTGAGATGGAAAAGGGCGCAGACCTGATGCAGGCACTGGTCGATTCTGAACTGCAACCTTCCCGTGGTCAGGCACGTAAAACTATCGCCTCCAATGCCATCACCATTAACGGTGAAAAACAGTCCGATCCTGAATACTTCTTTAAAGAAGAAGATCGTCTGTTTGGTCGTTTTACCTTACTGCGTCGCGGTAAAAAGAATTACTGTCTGATTTGCTGGAAATAA   SEQ IDNO.:6SEQ ID NO.: 6   pOMe-4合成酶多核苷酸pOMe-4 synthetase polynucleotide   
ATGGCAAGCAGTAACTTGATTAAACAATTGCAAGAgCGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGTGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATAGTATGGCCTGTTTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTGTTTGGCCTGACCGTTCCGCTGATCACTAAAGCAGATGGCACCAAATTTGGTAAAACTGAAGGCGGCGCAGTCTGGTTGGATCCGAAGAAAACCAGCCCGTACAAATTCTACCAGTTCTGGATCAACACTGCGGATGCCGACGTTTACCGCTTCCTGAAGTTCTTCACCTTTATGAGCATTGAAGAGATCAACGCCCTGGAAGAAGAAGATAAAAACAGCGGTAAAGCACCGCGCGCCCAGTATGTACTGGCGGAGCAGGTGACTCGTCTGGTTCACGGTGAAGAAGGTTTACAGGCGGCAAAACGTATTACCGAATGCCTGTTCAGCGGTTCTTTGAGTGCGCTGAGTGAAgCGGACTTCGAACAGCTGGCGCAGGACGGCGTACCGATGGTTGAGATGGAAAAGGGCGCAGACCTGATGCAGGCACTGGTCGATTCTGAACTGCAACCTTCCCGTGGTCAGGCACGTAAAACTATCGCCTCCAATGCCATCACCATTAACGGTGAAAAACAGTCCGATCCTGAATACTTCTTTAAAGAAGAAGATCGTCTGTTTGGTCGTTTTACCTTACTGCGTCGCGGTAAAAAGAATTACTGTCTGATTTGCTGGAAATAA  
ATGGCAAGCAGTAACTTGATTAAACAATTGCAAGAgCGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGTGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATAGTATGGCCTGTTTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTGTTTGGCCTGACCGTTCCGCTGATCACTAAAGCAGATGGCACCAAATTTGGTAAAACTGAAGGCGGCGCAGTCTGGTTGGATCCGAAGAAAACCAGCCCGTACAAATTCTACCAGTTCTGGATCAACACTGCGGATGCCGACGTTTACCGCTTCCTGAAGTTCTTCACCTTTATGAGCATTGAAGAGATCAACGCCCTGGAAGAAGAAGATAAAAACAGCGGTAAAGCACCGCGCGCCCAGTATGTACTGGCGGAGCAGGTGACTCGTCTGGTTCACGGTGAAGAAGGTTTACAGGCGGCAAAACGTATTACCGAATGCCTGTTCAGCGGTTCTTTGAGTGCGCTGAGTGAAgCGGACTTCGAACAGCTGGCGCAGGACGGCGTACCGATGGTTGAGATGGAAAAGGGCGCAGACCTGATGCAGGCACTGGTCGATTCTGAACTGCAACCTTCCCGTGGTCAGGCACGTAAAACTATCGCCTCCAATGCCATCACCATTAACGGTGAAAAACAGTCCGATCCTGAATACTTCTTTAAAGAAGAAGATCGTCTGTTTGGTCGTTTTACCTTACTGCGTCGCGGTAAAAAGAATTACTGTCTGATTTGCTGGAAATAA   SEQ IDNO.:7SEQ ID NO.: 7   pOMe-5合成酶多核苷酸pOMe-5 synthetase polynucleotide   
ATGGCAAGCAGTAACTTGATTAAACAATTGCAAGAGCGGGGGCTGGTAgCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAGCCTGCTGCAGGGTTATACGATGGCCTGTCTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTGTTTGGCCTGACCGTTCCGCTGATCACTAAAGCAGATGGCACCAAATTTGGTAAAACTGAAGGCGGCGCAGTCTGGTTGGATCCGAAGAAAACCAGCCCGTACAAATTCTACCAGTTCTGGATCAACACTGCGGATGCCGACGTTTACCGCTTCCTGAAGTTCTTCACCTTTATGAGCATTGAAGAGATCAACGCCCTGGAAGAAGAAGATAAAAACAGCGGTAAAGCACCGCGCGCCCAGTATGTACTGGCGGAGCAGGTGACTCGTCTGGTTCACGGTGAAGAAGGTTTACAGGCGGCAAAACGTATTACCGAATGCCTGTTCAGCGGTTCTTTGAGTGCGCTGAGTGAAGCGGACTTCGAACAGCTGGCGCAGGACGGCGTACCGATGGTTGAGATGGAAAAGGGCGCAGACCTGATGCAGGCACTGGTCGATTCTGAACTGCAACCTTCCCGTGGTCAGGCACGTAAAACTATCGCCTCCAATGCCATCACCATTAACGGTGAAAAACAGTCCGATCCTGAATACTTCTTTAAAGAAGAAGATCGTCTGTTTGGTCGTTTTACCTTACT  
ATGGCAAGCAGTAACTTGATTAAACAATTGCAAGAGCGGGGGCTGGTAgCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAGCCTGCTGCAGGGTTATACGATGGCCTGTCTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTGTTTGGCCTGACCGTTCCGCTGATCACTAAAGCAGATGGCACCAAATTTGGTAAAACTGAAGGCGGCGCAGTCTGGTTGGATCCGAAGAAAACCAGCCCGTACAAATTCTACCAGTTCTGGATCAACACTGCGGATGCCGACGTTTACCGCTTCCTGAAGTTCTTCACCTTTATGAGCATTGAAGAGATCAACGCCCTGGAAGAAGAAGATAAAAACAGCGGTAAAGCACCGCGCGCCCAGTATGTACTGGCGGAGCAGGTGACTCGTCTGGTTCACGGTGAAGAAGGTTTACAGGCGGCAAAACGTATTACCGAATGCCTGTTCAGCGGTTCTTTGAGTGCGCTGAGTGAAGCGGACTTCGAACAGCTGGCGCAGGACGGCGTACCGATGGTTGAGATGGAAAAGGGCGCAGACCTGATGCAGGCACTGGTCGATTCTGAACTGCAACCTTCCCGTGGTCAGGCACGTAAAACTATCGCCTCCAATGCCATCACCATTAACGGTGAAAAACAGTCCGATCCTGAATACTTCTTTAAAGAAGAAGATCGTCTGTTTGGTCGTTTTACCTTACT

  GCGTCGCGGTAAAAAGAATTACTGTCTGATTTGCTGGAAATAAGCGTCGCGGTAAAAAGAATTACTGTCTGATTTGCTGGAAATAA   SEQ IDNO.:8SEQ ID NO.: 8   pOMe-6(活性位点)合成酶多核苷酸pOMe-6 (active site) synthetase polynucleotide   CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCAGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATACGTATGCCTGTCTGAACAAACAGTACGGTGTG  CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCAGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATACGTATGCCTGTCTGAACAAACAGTACGGTGTG   SEQ IDNO.:9SEQ ID NO.: 9   pOMe-7(活性位点)合成酶多核苷酸pOMe-7 (active site) synthetase polynucleotide   CGGGGGCTGGTACCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCAGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATACGTATGCCTGTCTGAACAAACAGTACGGTGTG  
CGGGGGCTGGTACCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCAGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATACGTATGCCTGTCTGAACAAACAGTACGGTGTG   SEQ IDNO.:10SEQ ID NO.: 10   pOMe-8(活性位点)合成酶多核苷酸pOMe-8 (active site) synthetase polynucleotide   CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCAGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATACGTATGCCTGTCTGAACAAACAGTACGGTGTG  CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCAGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATACGTATGCCTGTCTGAACAAACAGTACGGTGTG   SEQ IDNO.:11SEQ ID NO.: 11   pOMe-9(活性位点)合成酶多核苷酸pOMe-9 (active site) synthetase polynucleotide   
CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATTCGTATGCCTGTGCGAACAAACAGTACGGTGTG  CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATTCGTATGCCTGTGCGAACAAACAGTACGGTGTG   SEQ IDNO.:12SEQ ID NO.: 12   pOMe-10(活性位点)合成酶多核苷酸pOMe-10 (active site) synthetase polynucleotide   CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCAGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATACGTATGCCTGTCTGAACAAACAGTACGGTGTG  
CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCAGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATACGTATGCCTGTCTGAACAAACAGTACGGTGTG   SEQ IDNO.:13SEQ ID NO.: 13   pOMe-11(活性位点)合成酶多核苷酸pOMe-11 (active site) synthetase polynucleotide   CGGGGGCTGGTACCcCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCCTTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGACGGGGGCTGGTACCcCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCCTTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCAGGACGGGTCTGATTGGCGAACTC

  AGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATTCTATTGCCTGTTCGAACAAACAGTACGGTGTG  AGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATTCTATTGCCTGTTCGAACAAACAGTACGGTGTG   SEQ IDNO.:14SEQ ID NO.: 14   pOMe-12(活性位点)合成酶多核苷酸pOMe-12 (active site) synthetase polynucleotide   CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGTGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTAGTATTGCCTGTTTGAACAAACAGTACGGTGTG  CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGTGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTAGTATTGCCTGTTTGAACAAACAGTACGGTGTG   SEQ IDNO.:15SEQ ID NO.: 15   pOMe-13(活性位点)合成酶多核苷酸pOMe-13 (active site) synthetase polynucleotide   
CGGGGGCTGGTACCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGTGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATAGTATTGCCTGTTTGAACAAACAGTACGGTGTG  CGGGGGCTGGTACCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGTGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATAGTATTGCCTGTTTGAACAAACAGTACGGTGTG   SEQ IDNO.:16SEQ ID NO.: 16   pOMe-14(活性位点)合成酶多核苷酸pOMe-14 (active site) synthetase polynucleotide   CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCTGGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAGGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATTGTTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATATGCGTGCCTGTGAGAACAAACAGTACGGTGTG  
CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCTGGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAGGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATTGTTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATATGCGTGCCTGTGAGAACAAACAGTACGGTGTG   SEQ IDNO.:17SEQ ID NO.: 17   对-乙酰基Phe-1(活性位点)合成酶多核苷酸p-Acetyl Phe-1 (active site) synthetase polynucleotide   CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCATTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGGTCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATGGTATGGCCTGTGCTAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAATGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG  CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCATTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGGTCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATGGTATGGCCTGTGCTAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAATGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG   SEQ IDNO.:18SEQ ID NO.: 18   对二苯甲酮-1(活性位点)合成酶多核苷酸p-Benzophenone-1 (active site) synthetase 
polynucleotide   CAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGGTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTAT  CAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGGTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTAT

  GGTTTTGCCTGTTTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTGGGTTTTGCCTGTTTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG   SEQ IDNO.:19SEQ ID NO.: 19   对二苯甲酮-2(活性位点)合成酶多核苷酸p-Benzophenone-2 (active site) synthetase polynucleotide   GCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGGGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATGGTTATGCCTGTATGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG  GCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGGGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATGGTTATGCCTGTATGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG   SEQ IDNO.:20SEQ ID NO.: 20   对叠氮基Phe-1(活性位点)合成酶多核苷酸p-azido Phe-1 (active site) synthetase polynucleotide   
GGGCTGGTAGCCCAGGTGACGGACGNAGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCCTTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATTCTATGGCCTGTGCGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCANAATCANGTG
SEQ ID NO.: 21   p-azido Phe-2 (active site) synthetase polynucleotide
TTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGTTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATTCTGCGGCCTGTGCGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG
SEQ ID NO.: 22   p-azido Phe-3 (active site) synthetase polynucleotide   GACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCCTGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAANAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATTCGGCTGCCTGTGCGAACAAACAGTACGGNGNGGNGCTGCAAATTGGNGGTTCTGACCAGGGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAAAATCAGGTG
SEQ ID NO.: 23   p-azido Phe-4 (active site) synthetase
polynucleotide   GCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGTTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTGTGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATAGTGCGGCCTGTGTTAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCANGTG
SEQ ID NO.: 24   p-azido Phe-5 (active site) synthetase polynucleotide   GACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCATTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATGATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATTTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATAATTTTGCCTGTGTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG
SEQ ID NO.: 25   p-azido Phe-6 (active site) synthetase polynucleotide
CGACTGGCGCAAGGCCCGATCGCACTCACGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATTTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAATCTGCTGCAGGGTTATTCGGCTGCCTGTCTTAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG
SEQ ID NO.: 26   pPR-EcRS-1 (propargyloxyphenylalanine synthetase) (active site) synthetase polynucleotide
CGGGGGCTGGTANCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGGGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATTCTATGGCCTGTTTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGANCCGTCGTCTGCATCAGAATCAGGTG
SEQ ID NO.: 27   pPR-EcRS-2 (active site) synthetase polynucleotide   CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAATCTGCTGCAGGGTTATTCGGCTGCCTGTCTTAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGAACCTGANCCGTCGTCTGCATCAAAATCAAGTG
SEQ ID NO.: 28   pPR-EcRS-3 (active site) synthetase polynucleotide   CGGGGGCTGGTACCCCAAGTGACGGACGAGGAAACGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCTCTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCAGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATACGATGGCCTGTGTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG
SEQ ID NO.: 29   pPR-EcRS-4 (active site) synthetase polynucleotide   CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGCGTGCGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAGGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATGAATGTGCTGACCTTCCTGCGCGATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATTCTTATGCCTGTCTTAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG
SEQ ID NO.: 30   pPR-EcRS-5 (active site)
synthetase polynucleotide   CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGCGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATACGATGGCCTGTTGTAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG
SEQ ID NO.: 31   pPR-EcRS-6 (active site) synthetase polynucleotide
CGGGGGCTGGTACCCCAAGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCGCTGAGTTTTCCTACAACCTGCTGCAGGGTTATACGTTTGCCTGTATGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG
SEQ ID NO.: 32   pPR-EcRS-7 (active site) synthetase polynucleotide
GTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAATCTGCTGCAGGGTTATTCGGCTGCCTGTCTTAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG
SEQ ID NO.: 33   pPR-EcRS-8 (active site) synthetase polynucleotide   CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCGTTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATTCGATGGCCTGTACGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG
SEQ ID NO.: 34   pPR-EcRS-9 (active site) synthetase polynucleotide
CGGGGGCTGGTANCCCAAGTGACGGACGGGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCAGTTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATCTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATAGTTTTGCCTGTCTGAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG
SEQ ID NO.: 35   pPR-EcRS-10 (active site) synthetase polynucleotide   CGGGGGCTGGTAGCCCAGGTGACGGACGAGGAAGCGTTAGCAGAGCGACTGGCGCAAGGCCCGATCGCACTCACGTGTGGCTTCGATCCTACCGCTGACAGCTTGCATTTGGGGCATCTTGTTCCATTGTTATGCCTGAAACGCTTCCAGCAGGCGGGCCACAAGCCGGTTGCGCTGGTAGGCGGCGCGACGGGTCTGATTGGCGACCCGAGCTTCAAAGCTGCCGAGCGTAAGCTGAACACCGAAGAAACTGTTCAGGAGTGGGTGGACAAAATCCGTAAGCAGGTTGCCCCGTTCCTCGATTTCGACTGTGGAGAAAACTCTGCTATCGCGGCCAATAATTATGACTGGTTCGGCAATATGAATGTGCTGACCTTCCTGCGCGATATTGGCAAACACTTCTCCGTTAACCAGATGATCAACAAAGAAGCGGTTAAGCAGCGTCTCAACCGTGAAGATCAGGGGATTTCGTTCACTGAGTTTTCCTACAACCTGCTGCAGGGTTATACGTTTGCCTGTACTAACAAACAGTACGGTGTGGTGCTGCAAATTGGTGGTTCTGACCAGTGGGGTAACATCACTTCTGGTATCGACCTGACCCGTCGTCTGCATCAGAATCAGGTG
SEQ ID NO.: 36   p-IodoPheRS-1 synthetase amino acid (aa)   MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALVCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSYACLNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK
SEQ ID NO.: 37   p-IodoPheRS-2 synthetase amino acid (aa)   MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALICGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSMACLNKQYGVVLQIGGSDQWGNITSGIILTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDUFGRFTLLRRGKKNYCLICWK
SEQ ID NO.: 38   p-IodoPheRS-3 synthetase amino acid (aa)
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALVCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSMACANKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEETNALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITTNGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK

SEQ ID NO.: 39   OMeTyrRS-1 synthetase amino acid (aa)   MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALVCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSMACLNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK
SEQ ID NO.: 40   OMeTyrRS-2 synthetase amino acid (aa)   MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALTCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYTMACLNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK
SEQ ID NO.: 41   OMeTyrRS-3 synthetase amino acid (aa)
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIAITCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYTYACLNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK
SEQ ID NO.: 42   OMeTyrRS-4 synthetase amino acid (aa)   MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALLCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSMACSNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK
SEQ ID NO.: 43   OMeTyrRS-5 synthetase amino acid (aa)
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALLCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSMACANKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK
SEQ ID NO.: 44   OMeTyrRS-6 synthetase amino acid (aa)   MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALTCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYRMACLNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK
SEQ ID NO.: 45   p-Acetyl PheRS-1 synthetase amino acid (aa)
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALICGFDPTADSLHLGHLVPLLCLKRFQQAGHKKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYGMACANKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGITVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK
SEQ ID NO.: 46   p-benzoyl PheRS-1 synthetase amino acid (aa)   MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALGCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYGFACANKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWITADADVYRFLKFFTFMSIEEIALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK
SEQ ID NO.: 47   p-benzoyl PheRS-2 synthetase amino acid (aa)   MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALGCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYGYACMNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK
SEQ ID NO.: 48   p-azido
PheRS-1 synthase amino acid (aa)   MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALLCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSMACANKQYGVVLQIGGSDQWGNITSGIDTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK  MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALLCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSMACANKQYGVVLQIGGSDQWGNITSGIDTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK   SEQ IDNO.:49SEQ ID NO.: 49   对-叠氮基PheRS-2合成酶氨基酸(aa)p-azido PheRS-2 synthase amino acid (aa)   MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALVCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSAACANKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK  MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALVCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSAACANKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK   SEQ IDNO.:50SEQ ID NO.: 50   对-叠氮基PheRS-3合成酶氨基酸(aa)p-azido PheRS-3 synthetase amino acid (aa)   
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALLCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSAACANKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK  MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALLCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSAACANKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK   SEQ IDNO.:51SEQ ID NO.: 51   对-叠氮基PheRS-4合成酶氨基酸(aa)p-azido PheRS-4 synthase amino acid (aa)   MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALVCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSAACVNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK  MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALVCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSAACVNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK   SEQ IDNO.:52SEQ ID NO.: 52   对-叠氮基PheRS-5合成酶氨基酸(aa)p-azido PheRS-5 synthase amino acid (aa)   
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALICGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANDYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYNFACVNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK

SEQ ID NO.: 53   p-azido PheRS-6 synthetase amino acid (aa)
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALTCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSALAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSAACLNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGITVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK

SEQ ID NO.: 54   pPR-EcRS-1 synthetase amino acid (aa); p-propargyloxyphenylalanine synthetase
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALGCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSMACLNKQYGVVLQIGGSDQWGNITSGIDLTRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK

SEQ ID NO.: 55   pPR-EcRS-2 synthetase amino acid (aa)
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALTCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSAACLNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK

SEQ ID NO.: 56   pPR-EcRS-3 synthetase amino acid (aa)
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALSCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYTMACVNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK

SEQ ID NO.: 57   pPR-EcRS-4 synthetase amino acid (aa)
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALACGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSYACLNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK

SEQ ID NO.: 58   pPR-EcRS-5 synthetase amino acid (aa)
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALACGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYTMACCNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK

SEQ ID NO.: 59   pPR-EcRS-6 synthetase amino acid (aa)
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALTCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYTFACMNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK

SEQ ID NO.: 60   pPR-EcRS-7 synthetase amino acid (aa)
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALTCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSVACLNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK

SEQ ID NO.: 61   pPR-EcRS-8 synthetase amino acid (aa)
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALVCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSMACTNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK

SEQ ID NO.: 62   pPR-EcRS-9 synthetase amino acid (aa)
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALSCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYSFACLNKQYGVVLQIGGSDQWGNITSGIDITRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK

SEQ ID NO.: 63   pPR-EcRS-10 synthetase amino acid (aa)
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALTCGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREDQGISFTEFSYNLLQGYTFACTNKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK

SEQ ID NO.: 64   tRNA/Tyr polynucleotide
AGCTTCCCGATAAGGGAGCAGGCCAGTAAAAAGCATTACCCCGTGGTGGGGTTCCCGAGCGGCCAAAGGGAGCAGACTCTAAATCTGCCGTCATCGACCTCGAAGGTTCGAATCCTTCCCCCACCACCA

SEQ ID NO.: 65   tRNA/Tyr
AGCUUCCCGAUAAGGGAGCAGGCCAGUAAAAAGCAUUACCCCGUGGUGGGGUUCCCGAGCGGCCAAAGGGAGCAGACUCUAAAUCUGCCGUCAUCGACCUCGAAGGUUCGAAUCCUUCCCCCACCACCA

SEQ ID NO.: 66   Amber mutant L3TAG
5'-ATGAAGTAGCTGTCTTCTATCGAACAAGCATGCG-3'

SEQ ID NO.: 67   Amber mutant I13TAG
5'-CGAACAAGCATGCGATTAGTGCCGACTTAAAAAG-3'

SEQ ID NO.: 68   Amber mutant T44TAG
5'-CGCTACTCTCCCAAATAGAAAAGGTCTCCGCTG-3'

SEQ ID NO.: 69   Amber mutant F68TAG
5'-CTGGAACAGCTATAGCTACTGATTTTTCCTCG-3'

SEQ ID NO.: 70   Amber mutant R110TAG
5'-GCCGTCACAGATTAGTTGGCTTCAGTGGAGACTG-3'

SEQ ID NO.: 71   Amber mutant V114TAG
5'-GATTGGCTTCATAGGAGACTGATATGCTCTAAC-3'

SEQ ID NO.: 72   Amber mutant T121TAG
5'-GCCTCTATAGTTGAGACAGCATAGAATAATGCG-3'

SEQ ID NO.: 73   Amber mutant I127TAG
5'-GAGACAGCATAGATAGAGTGCGACATCATCATCGG-3'

SEQ ID NO.: 74   Amber mutant S131TAG
5'-GAATAAGTGCGACATAGTCATCGGAAGAGAGTAGTAG-3'

SEQ ID NO.: 75   Amber mutant T145TAG
5'-GGTCAAAGACAGTTGTAGGTATCGATTGACTCGGC-3'

SEQ ID NO.: 76   Permissive site mutant T44F
5'-CGCTACTCTCCCCAAATTTAAAAGGTCTCCGCTG-3'

SEQ ID NO.: 77   Permissive site mutant T44Y
5'-CGCTACTCTCCCCAAATATAAAAGGTCTCCGCTG-3'

SEQ ID NO.: 78   Permissive site mutant T44W
5'-CGCTACTCTCCCCAAATGGAAAAGGTCTCCGCTG-3'

SEQ ID NO.: 79   Permissive site mutant T44D
5'-CGCTACTCTCCCCAAAGATAAAAGGTCTCCGCTG-3'

SEQ ID NO.: 80   Permissive site mutant T44K
5'-CGCTACTCTCCCCAAAAAAAAAAGGTCTCCGCTG-3'

SEQ ID NO.: 81   Permissive site mutant R110F
5'-GCCGTCACAGATTTTTTGGCTTCAGTGGAGACTG-3'

SEQ ID NO.: 82   Permissive site mutant R110Y
5'-GCCGTCACAGATTATTTGGCTTCAGTGGAGACTG-3'

SEQ ID NO.: 83   Permissive site mutant R110W
5'-GCCGTCACAGATTGGTTGGCTTCAGTGGAGACTG-3'

SEQ ID NO.: 84   Permissive site mutant R110D
5'-GCCGTCACAGATGATTTGGCTTCAGTGGAGACTG-3'

SEQ ID NO.: 85   Permissive site mutant R110K
5'-GCCGTCACAGATAAATTGGCTTCAGTGGAGACTG-3'

SEQ ID NO.: 86   p-acetyl PheRS-1 synthetase amino acid (aa) a
MASSNLIKQLQERGLVAQVTDEEALAERLAQGPIALICGFDPTADSLHLGHLVPLLCLKRFQQAGHKPVALVGGATGLIGDPSFKAAERKLNTEETVQEWVDKIRKQVAPFLDFDCGENSAIAANNYDWFGNMNVLTFLRDIGKHFSVNQMINKEAVKQRLNREGQGISFTEFSYNLLQGYGMACANKQYGVVLQIGGSDQWGNITSGIDLTRRLHQNQVFGLTVPLITKADGTKFGKTEGGAVWLDPKKTSPYKFYQFWINTADADVYRFLKFFTFMSIEEINALEEEDKNSGKAPRAQYVLAEQVTRLVHGEEGLQAAKRITECLFSGSLSALSEADFEQLAQDGVPMVEMEKGADLMQALVDSELQPSRGQARKTIASNAITINGEKQSDPEYFFKEEDRLFGRFTLLRRGKKNYCLICWK

a These clones also contain the Asp165Gly mutation.
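
The footnote names a substitution by position (Asp165Gly). As a minimal sketch that is not part of the original disclosure, substitutions of this kind can be listed by comparing an aligned variant against the wild-type sequence residue by residue. The function name `list_substitutions`, its `start` offset, and the choice of ten-residue fragments (residues 161 to 170 as given in SEQ ID NO.: 2 and SEQ ID NO.: 86) are illustrative assumptions.

```python
def list_substitutions(reference, variant, start=1):
    """Return substitutions such as 'D165G' between two aligned sequences.

    `start` is the 1-based position of the first residue of both fragments.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        f"{ref}{pos}{var}"
        for pos, (ref, var) in enumerate(zip(reference, variant), start=start)
        if ref != var
    ]


# Residues 161-170 of the wild-type sequence (SEQ ID NO.: 2) and of SEQ ID NO.: 86.
wild_type_fragment = "LNREDQGISF"
variant_fragment = "LNREGQGISF"

print(list_substitutions(wild_type_fragment, variant_fragment, start=161))
```

The comparison assumes the two sequences are already aligned with no insertions or deletions; several table entries appear to differ in length, so a real comparison would need an alignment step first.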

Sequence Listing

<110> The Scripps Research Institute

Deiters, Alexander

Cropp, T Ashton

Chin, Jason W

Anderson, J Christopher

Schultz, Peter G

<120> Unnatural reactive amino acid genetic code additions

<130> 54-000250US/PC

<160> 104

<170> PatentIn version 3.3

<210> 1

<211> 1275

<212> DNA

<213> Escherichia coli

<400> 1

atggcaagca gtaacttgat taaacaattg caagagcggg ggctggtagc ccaggtgacg     60

gacgaggaag cgttagcaga gcgactggcg caaggcccga tcgcgctcta ttgcggcttc    120

gatcctaccg ctgacagctt gcatttgggg catcttgttc cattgttatg cctgaaacgc    180

ttccagcagg cgggccacaa gccggttgcg ctggtaggcg gcgcgacggg tctgattggc    240

gacccgagct tcaaagctgc cgagcgtaag ctgaacaccg aagaaactgt tcaggagtgg    300

gtggacaaaa tccgtaagca ggttgccccg ttcctcgatt tcgactgtgg agaaaactct    360

gctatcgcgg cgaacaacta tgactggttc ggcaatatga atgtgctgac cttcctgcgc    420

gatattggca aacacttctc cgttaaccag atgatcaaca aagaagcggt taagcagcgt    480

ctcaaccgtg aagatcaggg gatttcgttc actgagtttt cctacaacct gttgcagggt    540

tatgacttcg cctgtctgaa caaacagtac ggtgtggtgc tgcaaattgg tggttctgac    600

cagtggggta acatcacttc tggtatcgac ctgacccgtc gtctgcatca gaatcaggtg    660

tttggcctga ccgttccgct gatcactaaa gcagatggca ccaaatttgg taaaactgaa    720

ggcggcgcag tctggttgga tccgaagaaa accagcccgt acaaattcta ccagttctgg    780

atcaacactg cggatgccga cgtttaccgc ttcctgaagt tcttcacctt tatgagcatt    840

gaagagatca acgccctgga agaagaagat aaaaacagcg gtaaagcacc gcgcgcccag    900

tatgtactgg cggagcaggt gactcgtctg gttcacggtg aagaaggttt acaggcggca    960

aaacgtatta ccgaatgcct gttcagcggt tctttgagtg cgctgagtga agcggacttc   1020

gaacagctgg cgcaggacgg cgtaccgatg gttgagatgg aaaagggcgc agacctgatg   1080

caggcactgg tcgattctga actgcaacct tcccgtggtc aggcacgtaa aactatcgcc   1140

tccaatgcca tcaccattaa cggtgaaaaa cagtccgatc ctgaatactt ctttaaagaa   1200

gaagatcgtc tgtttggtcg ttttacctta ctgcgtcgcg gtaaaaagaa ttactgtctg   1260

atttgctgga aataa                                                    1275
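
As an illustrative check that is not part of the original listing, a nucleotide sequence such as SEQ ID NO.: 1 can be translated with the standard genetic code and compared against the polypeptide given as SEQ ID NO.: 2. The sketch below assumes plain Python and a hypothetical helper named `translate`, and uses only the first 60 nucleotides of SEQ ID NO.: 1 for brevity.

```python
from itertools import product

BASES = "tcag"
# Standard genetic code with codons enumerated in t/c/a/g order; '*' is a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    dna = "".join(dna.lower().split())  # drop the spacing used in the listing
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


first_row = "atggcaagca gtaacttgat taaacaattg caagagcggg ggctggtagc ccaggtgacg"
print(translate(first_row))  # MASSNLIKQLQERGLVAQVT
```

The printed residues correspond to the first twenty residues listed under SEQ ID NO.: 2 (Met Ala Ser Ser Asn ... Ala Gln Val Thr).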

<210> 2

<211> 424

<212> PRT

<213> Escherichia coli

<400> 2

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Tyr Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Asp Phe Ala Cys Leu Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420
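
The sequence table earlier in this document gives the synthetase sequences in one-letter code, whereas this listing uses three-letter codes. The following sketch, offered as a convenience under stated assumptions rather than anything from the original filing, converts one notation to the other so the two can be compared; `THREE_TO_ONE` and `to_one_letter` are hypothetical names.

```python
# Standard three-letter to one-letter codes for the 20 canonical amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues):
    """Convert whitespace-separated three-letter residues to one-letter code."""
    return "".join(THREE_TO_ONE[name] for name in residues.split())


# First row of SEQ ID NO.: 2 as printed in the listing.
first_row = "Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val"
print(to_one_letter(first_row))  # MASSNLIKQLQERGLV
```

A residue name that is not in the table (for example one broken by extraction, such as a split "Hi s") raises a `KeyError`, which makes damaged rows easy to spot.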

<210> 3

<211> 1275

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 3

atggcaagca gtaacttgat taaacaattg caagagcggg ggctggtagc ccaggtgacg     60

gacgaggaag cgttagcaga gcgactggcg caaggcccga tcgcactcgt gtgtggcttc    120

gatcctaccg ctgacagctt gcatttgggg catcttgttc cattgttatg cctgaaacgc    180

ttccagcagg cgggccacaa gccggttgcg ctggtaggcg gcgcgacggg tctgattggc    240

gacccgagct tcaaagctgc cgagcgtaag ctgaacaccg aagaaactgt tcaggagtgg    300

gtggacaaaa tccgtaagca ggttgccccg ttcctcgatt tcgactgtgg agaaaactct    360

gctatcgcgg ccaataatta tgactggttc ggcaatatga atgtgctgac cttcctgcgc    420

gatattggca aacacttctc cgttaaccag atgatcaaca aagaagcggt taagcagcgt    480

ctcaaccgtg aagatcaggg gatttcgttc actgagtttt cctacaacct gctgcagggt    540

tatagtatgg cctgtttgaa caaacagtac ggtgtggtgc tgcaaattgg tggttctgac    600

cagtggggta acatcacttc tggtatcgac ctgacccgtc gtctgcatca gaatcaggtg    660

tttggcctga ccgttccgct gatcactaaa gcagatggca ccaaatttgg taaaactgaa    720

ggcggcgcag tctggttgga tccgaagaaa accagcccgt acaaattcta ccagttctgg    780

atcaacactg cggatgccga cgtttaccgc ttcctgaagt tcttcacctt tatgagcatt    840

gaagagatca acgccctgga agaagaagat aaaaacagcg gtaaagcacc gcgcgcccag    900

tatgtactgg cggagcaggt gactcgtctg gttcacggtg aagaaggttt acaggcggca    960

aaacgtatta ccgaatgcct gttcagcggt tctttgagtg cgctgagtga agcggactt    1020

gaacagctgg cgcaggacgg cgtaccgatg gttgagatgg aaaagggcgc agacctgatg   1080

caggcactgg tcgattctga actgcaacct tcccgtggtc aggcacgtaa aactatcgcc   1140

tccaatgcca tcaccattaa cggtgaaaaa cagtccgatc ctgaatactt ctttaaagaa   1200

gaagatcgtc tgtttggtcg ttttacctta ctgcgtcgcg gtaaaaagaa ttactgtctg   1260

atttgctgga aataa                                                    1275

<210> 4

<211> 1275

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 4

atggcaagca gtaacttgat taaacaattg caagagcggg ggctggtagc ccaggtgacg  60

gacgaggaag cgttagcaga gcgactggcg caaggcccga tcgcactcac ttgtggcttc  120

gatcctaccg ctgacagctt gcatttgggg catcttgttc cattgttatg cctgaaacgc  180

ttccagcagg cgggccacaa gccggttgcg ctggtaggcg gcgcgacggg tctgattggc  240

gacccgagct tcaaagctgc cgagcgtaag ctgaacaccg aagaaactgt tcaggagtgg  300

gtggacaaaa tccgtaagca ggttgccccg ttcctcgatt tcgactgtgg agaaaactct  360

gctatcgcgg ccaataatta tgactggttc agcaatatga atgtgctgac cttcctgcgc  420

gatattggca aacacttctc cgttaaccag atgatcaaca aagaagcggt taagcagcgt  480

ctcaaccgtg aagatcaggg gatttcgttc actgagtttt cctacaacct gctgcagggt  540

tatacgtatg cctgtctgaa caaacagtac ggtgtggtgc tgcaaattgg tggttctgac  600

cagtggggta acatcacttc tggtatcgac ctgacccgtc gtctgcatca gaatcaggtg  660

tttggcctga ccgttccgct gatcactaaa gcagatggca ccaaatttgg taaaactgaa  720

ggcggcgcag tctggttgga tccgaagaaa accagcccgt acaaattcta ccagttctgg  780

atcaacactg cggatgccga cgtttaccgc ttcctgaagt tcttcacctt tatgagcatt  840

gaagagatca acgccctgga agaagaagat aaaaacagcg gtaaagcacc gcgcgcccag  900

tatgtactgg cggagcaggt gactcgtctg gttcacggtg aagaaggttt acaggcggca  960

aaacgtatta ccgaatgcct gttcagcggt tctttgagtg cgctgagtga agcggacttc  1020

gaacagctgg cgcaggacgg cgtaccgatg gttgagatgg aaaagggcgc agacctgatg  1080

caggcactgg tcgattctga actgcaacct tcccgtggtc aggcacgtaa aactatcgcc  1140

tccaatgcca tcaccattaa cggtgaaaaa cagtccgatc ctgaatactt ctttaaagaa  1200

gaagatcgtc tgtttggtcg ttttacctta ctgcgtcgcg gtaaaaagaa ttactgtctg  1260

atttgctgga aataa  1275

<210> 5

<211> 1275

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 5

atggcaagca gtaacttgat taaacaattg caagagcggg ggctggtagc ccaggtgacg  60

gacgaggaag cgttagcaga gcgactggcg caaggcccga tcgcactcgt gtgtggcttc  120

gatcctaccg ctgacagctt gcatttgggg catcttgttc cattgttatg cctgaaacgc  180

ttccagcagg cgggccacaa gccggttgcg ctggtaggcg gcgcgacggg tctgattggc  240

gacccgagct tcaaagctgc cgagcgtaag ctgaacaccg aagaaactgt tcaggagtgg  300

gtggacaaaa tccgtaagca ggttgccccg ttcctcgatt tcgactgtgg agaaaactct  360

gctatcgcgg ccaataatta tgactggttc ggcaatatga atgtgctgac cttcctgcgc  420

gatattggca aacacttctc cgttaaccag atgatcaaca aagaagcggt taagcagcgt  480

ctcaaccgtg aagatcaggg gatttcgttc actgagtttt cctacaacct gctgcagggt  540

tatagtatgg cctgtttgaa caaacagtac ggtgtggtgc tgcaaattgg tggttctgac  600

cagtggggta acatcacttc tggtatcgac ctgacccgtc gtctgcatca gaatcaggtg  660

tttggcctga ccgttccgct gatcactaaa gcagatggca ccaaatttgg taaaactgaa  720

ggcggcgcag tctggttgga tccgaagaaa accagcccgt acaaattcta ccagttctgg  780

atcaacactg cggatgccga cgtttaccgc ttcctgaagt tcttcacctt tatgagcatt  840

gaagagatca acgccctgga agaagaagat aaaaacagcg gtaaagcacc gcgcgcccag  900

tatgtactgg cggagcaggt gactcgtctg gttcacggtg aagaaggttt acaggcggca  960

aaacgtatta ccgaatgcct gttcagcggt tctttgagtg cgctgagtga agcggacttc  1020

gaacagctgg cgcaggacgg cgtaccgatg gttgagatgg aaaagggcgc agacctgatg  1080

caggcactgg tcgattctga actgcaacct tcccgtggtc aggcacgtaa aactatcgcc  1140

tccaatgcca tcaccattaa cggtgaaaaa cagtccgatc ctgaatactt ctttaaagaa  1200

gaagatcgtc tgtttggtcg ttttacctta ctgcgtcgcg gtaaaaagaa ttactgtctg  1260

atttgctgga aataa  1275

<210> 6

<211> 1275

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 6

atggcaagca gtaacttgat taaacaattg caagagcggg ggctggtagc ccaggtgacg  60

gacgaggaag cgttagcaga gcgactggcg caaggcccga tcgcactcgt gtgtggcttc  120

gatcctaccg ctgacagctt gcatttgggg catcttgttc cattgttatg cctgaaacgc  180

ttccagcagg cgggccacaa gccggttgcg ctggtaggcg gcgcgacggg tctgattggc  240

gacccgagct tcaaagctgc cgagcgtaag ctgaacaccg aagaaactgt tcaggagtgg  300

gtggacaaaa tccgtaagca ggttgccccg ttcctcgatt tcgactgtgg agaaaactct  360

gctatcgcgg ccaataatta tgactggttc ggcaatatga atgtgctgac cttcctgcgc  420

gatattggca aacacttctc cgttaaccag atgatcaaca aagaagcggt taagcagcgt  480

ctcaaccgtg aagatcaggg gatttcgttc actgagtttt cctacaacct gctgcagggt  540

tatagtatgg cctgtttgaa caaacagtac ggtgtggtgc tgcaaattgg tggttctgac  600

cagtggggta acatcacttc tggtatcgac ctgacccgtc gtctgcatca gaatcaggtg  660

tttggcctga ccgttccgct gatcactaaa gcagatggca ccaaatttgg taaaactgaa  720

ggcggcgcag tctggttgga tccgaagaaa accagcccgt acaaattcta ccagttctgg  780

atcaacactg cggatgccga cgtttaccgc ttcctgaagt tcttcacctt tatgagcatt  840

gaagagatca acgccctgga agaagaagat aaaaacagcg gtaaagcacc gcgcgcccag  900

tatgtactgg cggagcaggt gactcgtctg gttcacggtg aagaaggttt acaggcggca  960

aaacgtatta ccgaatgcct gttcagcggt tctttgagtg cgctgagtga agcggacttc  1020

gaacagctgg cgcaggacgg cgtaccgatg gttgagatgg aaaagggcgc agacctgatg  1080

caggcactgg tcgattctga actgcaacct tcccgtggtc aggcacgtaa aactatcgcc  1140

tccaatgcca tcaccattaa cggtgaaaaa cagtccgatc ctgaatactt ctttaaagaa  1200

gaagatcgtc tgtttggtcg ttttacctta ctgcgtcgcg gtaaaaagaa ttactgtctg  1260

atttgctgga aataa  1275

<210> 7

<211> 1275

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 7

atggcaagca gtaacttgat taaacaattg caagagcggg ggctggtagc ccaggtgacg  60

gacgaggaag cgttagcaga gcgactggcg caaggcccga tcgcactcac gtgtggcttc  120

gatcctaccg ctgacagctt gcatttgggg catcttgttc cattgttatg cctgaaacgc  180

ttccagcagg cgggccacaa gccggttgcg ctggtaggcg gcgcgacggg tctgattggc  240

gacccgagct tcaaagctgc cgagcgtaag ctgaacaccg aagaaactgt tcaggagtgg  300

gtggacaaaa tccgtaagca ggttgccccg ttcctcgatt tcgactgtgg agaaaactct  360

gctatcgcgg ccaataatta tgactggttc ggcaatatga atgtgctgac cttcctgcgc  420

gatattggca aacacttctc cgttaaccag atgatcaaca aagaagcggt taagcagcgt  480

ctcaaccgtg aagatcaggg gatttcgttc actgagtttt cctacagcct gctgcagggt  540

tatacgatgg cctgtctgaa caaacagtac ggtgtggtgc tgcaaattgg tggttctgac  600

cagtggggta acatcacttc tggtatcgac ctgacccgtc gtctgcatca gaatcaggtg  660

tttggcctga ccgttccgct gatcactaaa gcagatggca ccaaatttgg taaaactgaa  720

ggcggcgcag tctggttgga tccgaagaaa accagcccgt acaaattcta ccagttctgg  780

atcaacactg cggatgccga cgtttaccgc ttcctgaagt tcttcacctt tatgagcatt  840

gaagagatca acgccctgga agaagaagat aaaaacagcg gtaaagcacc gcgcgcccag  900

tatgtactgg cggagcaggt gactcgtctg gttcacggtg aagaaggttt acaggcggca  960

aaacgtatta ccgaatgcct gttcagcggt tctttgagtg cgctgagtga agcggacttc  1020

gaacagctgg cgcaggacgg cgtaccgatg gttgagatgg aaaagggcgc agacctgatg  1080

caggcactgg tcgattctga actgcaacct tcccgtggtc aggcacgtaa aactatcgcc  1140

tccaatgcca tcaccattaa cggtgaaaaa cagtccgatc ctgaatactt ctttaaagaa  1200

gaagatcgtc tgtttggtcg ttttacctta ctgcgtcgcg gtaaaaagaa ttactgtctg  1260

atttgctgga aataa  1275

<210> 8

<211> 540

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 8

cgggggctgg tagcccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc  60

ccgatcgcac tcacttgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt  120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta  180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac  240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc  300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcagcaat  360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc  420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag  480

ttttcctaca acctgctgca gggttatacg tatgcctgtc tgaacaaaca gtacggtgtg  540

<210> 9

<211> 540

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 9

cgggggctgg taccccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc  60

ccgatcgcac tcacttgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt  120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta  180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac  240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc  300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcagcaat  360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc  420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag  480

ttttcctaca acctgctgca gggttatacg tatgcctgtc tgaacaaaca gtacggtgtg  540

<210> 10

<211> 540

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 10

cgggggctgg tagcccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc  60

ccgatcgcac tcacttgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt  120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta  180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac  240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc  300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcagcaat  360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc  420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag  480

ttttcctaca acctgctgca gggttatacg tatgcctgtc tgaacaaaca gtacggtgtg  540

<210> 11

<211> 540

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 11

cgggggctgg tagcccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc  60

ccgatcgcac tcacttgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt  120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta  180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac  240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc  300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcggcaat  360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc  420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag  480

ttttcctaca acctgctgca gggttattcg tatgcctgtg cgaacaaaca gtacggtgtg  540

<210> 12

<211> 540

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 12

cgggggctgg tagcccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc  60

ccgatcgcac tcacttgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt  120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta  180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac  240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc  300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcagcaat  360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc  420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag  480

ttttcctaca acctgctgca gggttatacg tatgcctgtc tgaacaaaca gtacggtgtg  540

<210> 13

<211> 540

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 13

cgggggctgg taccccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc  60

ccgatcgcac tcctttgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt  120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta  180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac  240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc  300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcggcaat  360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc  420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag  480

ttttcctaca acctgctgca gggttattct attgcctgtt cgaacaaaca gtacggtgtg  540

<210> 14

<211> 540

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 14

cgggggctgg tagcccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc  60

ccgatcgcac tcgtgtgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt  120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta  180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac  240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc  300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcggcaat  360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc  420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag  480

ttttcctaca acctgctgca gggttatagt attgcctgtt tgaacaaaca gtacggtgtg  540

<210> 15

<211> 540

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 15

cgggggctgg taccccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc  60

ccgatcgcac tcgtgtgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt  120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta  180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac  240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc  300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcggcaat  360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc  420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag  480

ttttcctaca acctgctgca gggttatagt attgcctgtt tgaacaaaca gtacggtgtg  540

<210> 16

<211> 540

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 16

cgggggctgg tagcccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc  60

ccgatcgcac tctggtgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt  120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta  180

ggcggcgcga cgggtctgat tggcgacccg agcttcaagg ctgccgagcg taagctgaac  240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc  300

gatttcgact gtggagaaaa ctctgctatc gcggccaatt gttatgactg gttcggcaat  360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc  420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag  480

ttttcctaca acctgctgca gggttatatg cgtgcctgtg agaacaaaca gtacggtgtg  540

<210> 17

<211> 624

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 17

cgggggctgg tagcccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc  60

ccgatcgcac tcatttgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt  120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta  180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac  240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc  300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcggcaat  360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc  420

aacaaagaag cggttaagca gcgtctcaac cgtgaaggtc aggggatttc gttcactgag  480

ttttcctaca acctgctgca gggttatggt atggcctgtg ctaacaaaca gtacggtgtg  540

gtgctgcaaa ttggtggttc tgaccaatgg ggtaacatca cttctggtat cgacctgacc  600

cgtcgtctgc atcagaatca ggtg  624

<210> 18

<211> 609

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 18

caggtgacgg acgaggaagc gttagcagag cgactggcgc aaggcccgat cgcactcggt  60

tgtggcttcg atcctaccgc tgacagcttg catttggggc atcttgttcc attgttatgc  120

ctggaacgct tccagcaggc gggccacaag ccggttgcgc tggtaggcgg cgcgacgggt  180

ctgattggcg acccgagctt caaagctgcc gagcgtaagc tgaacaccga agaaactgtt  240

caggagtggg tggacaaaat ccgtaagcag gttgccccgt tcctcgattt cgactgtgga  300

gaaaactctg ctatcgcggc caataattat gactggttcg gcaatatgaa tgtgctgacc  360

ttcctgcgcg atattggcaa acacttctcc gttaaccaga tgatcaacaa agaagcggtt  420

aagcagcgtc tcaaccgtga agatcagggg atttcgttca ctgagttttc ctacaacctg  480

ctgcagggtt atggttttgc ctgtttgaac aaacagtacg gtgtggtgct gcaaattggt  540

ggttctgacc agtggggtaa catcacttct ggtatcgacc tgacccgtcg tctgcatcag  600

aatcaggtg  609

<210> 19

<211> 591

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400> 19

gcgttagcag agcgactggc gcaaggcccg atcgcactcg ggtgtggctt cgatcctacc  60

gctgacagct tgcatttggg gcatcttgtt ccattgttat gcctgaaacg cttccagcag  120

gcgggccaca agccggttgc gctggtaggc ggcgcgacgg gtctgattgg cgacccgagc  180

ttcaaagctg ccgagcgtaa gctgaacacc gaagaaactg ttcaggagtg ggtggacaaa  240

atccgtaagc aggttgcccc gttcctcgat ttcgactgtg gagaaaactc tgctatcgcg  300

gccaataatt atgactggtt cggcaatatg aatgtgctga ccttcctgcg cgatattggc  360

aaacacttct ccgttaacca gatgatcaac aaagaagcgg ttaagcagcg tctcaaccgt  420

gaagatcagg ggatttcgtt cactgagttt tcctacaacc tgctgcaggg ttatggttat  480

gcctgtatga acaaacagta cggtgtggtg ctgcaaattg gtggttctga ccagtggggt  540

aacatcactt ctggtatcga cctgacccgt cgtctgcatc agaatcaggt g  591

<210> 20

<211> 621

<212> DNA

<213> Artificial

<220>

<223> Artificial synthetase

<220>

<221> misc_feature

<222> (26)..(26)

<223> n is a, c, g, or t

<220>

<221> misc_feature

<222> (612)..(612)

<223> n is a, c, g, or t

<220>

<221> misc_feature

<222> (618)..(618)

<223> n is a, c, g, or t

<400> 20

gggctggtag cccaggtgac ggacgnagaa gcgttagcag agcgactggc gcaaggcccg  60

atcgcactcc tttgtggctt cgatcctacc gctgacagct tgcatttggg gcatcttgtt  120

ccattgttat gcctgaaacg cttccagcag gcgggccaca agccggttgc gctggtaggc  180

ggcgcgacgg gtctgattgg cgacccgagc ttcaaagctg ccgagcgtaa gctgaacacc  240

gaagaaactg ttcaggagtg ggtggacaaa atccgtaagc aggttgcccc gttcctcgat  300

ttcgactgtg gagaaaactc tgctatcgcg gccaataatt atgactggtt cggcaatatg  360

aatgtgctga ccttcctgcg cgatattggc aaacacttct ccgttaacca gatgatcaac  420

aaagaagcgg ttaagcagcg tctcaaccgt gaagatcagg ggatttcgtt cactgagttt  480

tcctacaacc tgctgcaggg ttattctatg gcctgtgcga acaaacagta cggtgtggtg  540

ctgcaaattg gtggttctga ccagtggggt aacatcactt ctggtatcga cctgacccgt  600

cgtctgcatc anaatcangt g  621

<210>21

<211>588

<212>DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400>21

ttagcagagc gactggcgca aggcccgatc gcactcgttt gtggcttcga tcctaccgct    60

gacagcttgc atttggggca tcttgttcca ttgttatgcc tgaaacgctt ccagcaggcg   120

ggccacaagc cggttgcgct ggtaggcggc gcgacgggtc tgattggcga cccgagcttc   180

aaagctgccg agcgtaagct gaacaccgaa gaaactgttc aggagtgggt ggacaaaatc   240

cgtaagcagg ttgccccgtt cctcgatttc gactgtggag aaaactctgc tatcgcggcc   300

aataattatg actggttcgg caatatgaat gtgctgacct tcctgcgcga tattggcaaa   360

cacttctccg ttaaccagat gatcaacaaa gaagcggtta agcagcgtct caaccgtgaa   420

gatcagggga tttcgttcac tgagttttcc tacaacctgc tgcagggtta ttctgcggcc   480

tgtgcgaaca aacagtacgg tgtggtgctg caaattggtg gttctgacca gtggggtaac   540

atcacttctg gtatcgacct gacccgtcgt ctgcatcaga atcaggtg                588

<210>22

<211>600

<212>DNA

<213> Artificial

<220>

<223> Artificial synthetase

<220>

<221>misc_feature

<222>(403)..(403)

<223> n is a, c, g or t

<220>

<221>misc_feature

<222>(513)..(513)

<223> n is a, c, g or t

<220>

<221>misc_feature

<222>(515)..(515)

<223> n is a, c, g or t

<220>

<221>misc_feature

<222>(518)..(518)

<223> n is a, c, g or t

<220>

<221>misc_feature

<222>(531)..(531)

<223> n is a, c, g or t

<400>22

gacgaggaag cgttagcaga gcgactggcg caaggcccga tcgcactcct gtgtggcttc    60

gatcctaccg ctgacagctt gcatttgggg catcttgttc cattgttatg cctgaaacgc   120

ttccagcagg cgggccacaa gccggttgcg ctggtaggcg gcgcgacggg tctgattggc   180

gacccgagct tcaaagctgc cgagcgtaag ctgaacaccg aagaaactgt tcaggagtgg   240

gtggacaaaa tccgtaagca ggttgccccg ttcctcgatt tcgactgtgg agaaaactct   300

gctatcgcgg ccaataatta tgactggttc ggcaatatga atgtgctgac cttcctgcgc   360

gatattggca aacacttctc cgttaaccag atgatcaaca aanaagcggt taagcagcgt   420

ctcaaccgtg aagatcaggg gatttcgttc actgagtttt cctacaacct gctgcagggt   480

tattcggctg cctgtgcgaa caaacagtac ggngnggngc tgcaaattgg nggttctgac   540

caggggggta acatcacttc tggtatcgac ctgacccgtc gtctgcatca aaatcaggtg   600

<210>23

<211>591

<212>DNA

<213> Artificial

<220>

<223> Artificial synthetase

<220>

<221>misc_feature

<222>(588)..(588)

<223> n is a, c, g or t

<400>23

gcgttagcag agcgactggc gcaaggcccg atcgcactcg tttgtggctt cgatcctacc    60

gctgacagct tgcatttggg gcatcttgtt ccattgttgt gcctgaaacg cttccagcag   120

gcgggccaca agccggttgc gctggtaggc ggcgcgacgg gtctgattgg cgacccgagc   180

ttcaaagctg ccgagcgtaa gctgaacacc gaagaaactg ttcaggagtg ggtggacaaa   240

atccgtaagc aggttgcccc gttcctcgat ttcgactgtg gagaaaactc tgctatcgcg   300

gccaataatt atgactggtt cggcaatatg aatgtgctga ccttcctgcg cgatattggc   360

aaacacttct ccgttaacca gatgatcaac aaagaagcgg ttaagcagcg tctcaaccgt   420

gaagatcagg ggatttcgtt cactgagttt tcctacaacc tgctgcaggg ttatagtgcg   480

gcctgtgtta acaaacagta cggtgtggtg ctgcaaattg gtggttctga ccagtggggt   540

aacatcactt ctggtatcga cctgacccgt cgtctgcatc agaatcangt g            591

<210>24

<211>600

<212>DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400>24

gacgaggaag cgttagcaga gcgactggcg caaggcccga tcgcactcat ttgtggcttc    60

gatcctaccg ctgacagctt gcatttgggg catcttgttc cattgttatg cctgaaacgc   120

ttccagcagg cgggccacaa gccggttgcg ctggtaggcg gcgcgacggg tctgattggc   180

gacccgagct tcaaagctgc cgagcgtaag ctgaacaccg aagaaactgt tcaggagtgg   240

gtggacaaaa tccgtaagca ggttgccccg ttcctcgatt tcgactgtgg agaaaactct   300

gctatcgcgg ccaatgatta tgactggttc ggcaatatga atgtgctgac cttcctgcgc   360

gatattggca aacacttctc cgttaaccag atgatcaaca aagaagcggt taagcagcgt   420

ctcaaccgtg aagatcaggg gatttcgttc actgagtttt cctacaacct gctgcagggt   480

tataattttg cctgtgtgaa caaacagtac ggtgtggtgc tgcaaattgg tggttctgac   540

cagtggggta acatcacttc tggtatcgac ctgacccgtc gtctgcatca gaatcaggtg   600

<210>25

<211>579

<212>DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400>25

cgactggcgc aaggcccgat cgcactcacg tgtggcttcg atcctaccgc tgacagcttg    60

catttggggc atcttgttcc attgttatgc ctgaaacgct tccagcaggc gggccacaag   120

ccggttgcgc tggtaggcgg cgcgacgggt ctgattggcg acccgagctt caaagctgcc   180

gagcgtaagc tgaacaccga agaaactgtt caggagtggg tggacaaaat ccgtaagcag   240

gttgccccgt tcctcgattt cgactgtgga gaaaactctg ctatcgcggc caataattat   300

gactggttcg gcaatatgaa tgtgctgacc ttcctgcgcg atattggcaa acacttctcc   360

gttaaccaga tgatcaacaa agaagcggtt aagcagcgtc tcaaccgtga agatcagggg   420

atttcgttca ctgagttttc ctacaatctg ctgcagggtt attcggctgc ctgtcttaac   480

aaacagtacg gtgtggtgct gcaaattggt ggttctgacc agtggggtaa catcacttct   540

ggtatcgacc tgacccgtcg tctgcatcag aatcaggtg                          579

<210>26

<211>624

<212>DNA

<213> Artificial

<220>

<223> Artificial synthetase

<220>

<221>misc_feature

<222>(13)..(13)

<223> n is a, c, g or t

<220>

<221>misc_feature

<222>(599)..(599)

<223> n is a, c, g or t

<400>26

cgggggctgg tancccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc    60

ccgatcgcac tcgggtgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt   120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta   180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac   240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc   300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcggcaat   360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc   420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag   480

ttttcctaca acctgctgca gggttattct atggcctgtt tgaacaaaca gtacggtgtg   540

gtgctgcaaa ttggtggttc tgaccagtgg ggtaacatca cttctggtat cgacctganc   600

cgtcgtctgc atcagaatca ggtg                                          624

<210>27

<211>625

<212>DNA

<213> Artificial

<220>

<223> Artificial synthetase

<220>

<221>misc_feature

<222>(600)..(600)

<223> n is a, c, g or t

<400>27

cgggggctgg tagcccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc    60

ccgatcgcac tcacgtgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt   120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta   180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac   240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc   300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcggcaat   360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc   420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag   480

ttttcctaca atctgctgca gggttattcg gctgcctgtc ttaacaaaca gtacggtgtg   540

gtgctgcaaa ttggtggttc tgaccagtgg ggtaacatca cttctggtat cgaacctgan   600

ccgtcgtctg catcaaaatc aagtg                                         625

<210>28

<211>624

<212>DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400>28

cgggggctgg taccccaagt gacggacgag gaaacgttag cagagcgact ggcgcaaggc    60

ccgatcgcac tctcttgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt   120

gttccattgt tatgcctgaa acgcttccag caggcaggcc acaagccggt tgcgctggta   180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac   240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc   300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcggcaat   360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc   420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag   480

ttttcctaca acctgctgca gggttatacg atggcctgtg tgaacaaaca gtacggtgtg   540

gtgctgcaaa ttggtggttc tgaccagtgg ggtaacatca cttctggtat cgacctgacc   600

cgtcgtctgc atcagaatca ggtg                                          624

<210>29

<211>624

<212>DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400>29

cgggggctgg tagcccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc    60

ccgatcgcac tcgcgtgcgg cttcgatcct accgctgaca gcttgcattt ggggcatctt   120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta   180

ggcggcgcga cgggtctgat tggcgacccg agcttcaagg ctgccgagcg taagctgaac   240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc   300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcggcaat   360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc   420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag   480

ttttcctaca acctgctgca gggttattct tatgcctgtc ttaacaaaca gtacggtgtg   540

gtgctgcaaa ttggtggttc tgaccagtgg ggtaacatca cttctggtat cgacctgacc   600

cgtcgtctgc atcagaatca ggtg                                          624

<210>30

<211>624

<212>DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400>30

cgggggctgg tagcccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc    60

ccgatcgcac tcgcgtgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt   120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta   180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac   240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc   300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcggcaat   360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc   420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag   480

ttttcctaca acctgctgca gggttatacg atggcctgtt gtaacaaaca gtacggtgtg   540

gtgctgcaaa ttggtggttc tgaccagtgg ggtaacatca cttctggtat cgacctgacc   600

cgtcgtctgc atcagaatca ggtg                                          624

<210>31

<211>624

<212>DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400>31

cgggggctgg taccccaagt gacggacgag gaagcgttag cagagcgact ggcgcaaggc    60

ccgatcgcac tcacgtgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt   120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta   180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac   240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc   300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcggcaat   360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc   420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcgctgag   480

ttttcctaca acctgctgca gggttatacg tttgcctgta tgaacaaaca gtacggtgtg   540

gtgctgcaaa ttggtggttc tgaccagtgg ggtaacatca cttctggtat cgacctgacc   600

cgtcgtctgc atcagaatca ggtg                                          624

<210>32

<211>606

<212>DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400>32

gtgacggacg aggaagcgtt agcagagcga ctggcgcaag gcccgatcgc actcacgtgt    60

ggcttcgatc ctaccgctga cagcttgcat ttggggcatc ttgttccatt gttatgcctg   120

aaacgcttcc agcaggcggg ccacaagccg gttgcgctgg taggcggcgc gacgggtctg   180

attggcgacc cgagcttcaa agctgccgag cgtaagctga acaccgaaga aactgttcag   240

gagtgggtgg acaaaatccg taagcaggtt gccccgttcc tcgatttcga ctgtggagaa   300

aactctgcta tcgcggccaa taattatgac tggttcggca atatgaatgt gctgaccttc   360

ctgcgcgata ttggcaaaca cttctccgtt aaccagatga tcaacaaaga agcggttaag   420

cagcgtctca accgtgaaga tcaggggatt tcgttcactg agttttccta caatctgctg   480

cagggttatt cggctgcctg tcttaacaaa cagtacggtg tggtgctgca aattggtggt   540

tctgaccagt ggggtaacat cacttctggt atcgacctga cccgtcgtct gcatcagaat   600

caggtg                                                              606

<210>33

<211>624

<212>DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400>33

cgggggctgg tagcccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc    60

ccgatcgcac tcgtttgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt   120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta   180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac   240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc   300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcggcaat   360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc   420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag   480

ttttcctaca acctgctgca gggttattcg atggcctgta cgaacaaaca gtacggtgtg   540

gtgctgcaaa ttggtggttc tgaccagtgg ggtaacatca cttctggtat cgacctgacc   600

cgtcgtctgc atcagaatca ggtg                                          624

<210>34

<211>624

<212>DNA

<213> Artificial

<220>

<223> Artificial synthetase

<220>

<221>misc_feature

<222>(13)..(13)

<223> n is a, c, g or t

<400>34

cgggggctgg tancccaagt gacggacggg gaagcgttag cagagcgact ggcgcaaggc    60

ccgatcgcac tcagttgtgg cttcgatcct accgctgaca gcttgcattt ggggcatctt   120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta   180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac   240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc   300

gatctcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcggcaat   360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc   420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag   480

ttttcctaca acctgctgca gggttatagt tttgcctgtc tgaacaaaca gtacggtgtg   540

gtgctgcaaa ttggtggttc tgaccagtgg ggtaacatca cttctggtat cgacctgacc   600

cgtcgtctgc atcagaatca ggtg                                          624

<210>35

<211>624

<212>DNA

<213> Artificial

<220>

<223> Artificial synthetase

<400>35

cgggggctgg tagcccaggt gacggacgag gaagcgttag cagagcgact ggcgcaaggc    60

ccgatcgcac tcacgtgtgg cttcgactct accgctgaca gcttgcattt ggggcatctt   120

gttccattgt tatgcctgaa acgcttccag caggcgggcc acaagccggt tgcgctggta   180

ggcggcgcga cgggtctgat tggcgacccg agcttcaaag ctgccgagcg taagctgaac   240

accgaagaaa ctgttcagga gtgggtggac aaaatccgta agcaggttgc cccgttcctc   300

gatttcgact gtggagaaaa ctctgctatc gcggccaata attatgactg gttcggcaat   360

atgaatgtgc tgaccttcct gcgcgatatt ggcaaacact tctccgttaa ccagatgatc   420

aacaaagaag cggttaagca gcgtctcaac cgtgaagatc aggggatttc gttcactgag   480

ttttcctaca acctgctgca gggttatacg tttgcctgta ctaacaaaca gtacggtgtg   540

gtgctgcaaa ttggtggttc tgaccagtgg ggtaacatca cttctggtat cgacctgacc   600

cgtcgtctgc atcagaatca ggtg                                          624

<210>36

<211>424

<212>PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400>36

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Val Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Tyr Ala Cys Leu Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210>37<210>37

<211>424<211>424

<212>PRT<212>PRT

<213>人工<213> Artificial

<220><220>

<223>人工合成酶<223> Synthetic enzymes

<400>37<400>37

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Ile Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Met Ala Cys Leu Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 38

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 38

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Val Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Met Ala Cys Ala Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 39

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 39

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Val Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Met Ala Cys Leu Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 40

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 40

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Thr Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Thr Met Ala Cys Leu Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 41

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 41

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Thr Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Thr Tyr Ala Cys Leu Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 42

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 42

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Leu Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Met Ala Cys Ser Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 43

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 43

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Leu Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Met Ala Cys Ala Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 44

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 44

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Thr Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Arg Met Ala Cys Leu Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 45

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 45

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Ile Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Gly Met Ala Cys Ala Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 46

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 46

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Gly Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Gly Phe Ala Cys Ala Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 47

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 47

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Gly Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Gly Tyr Ala Cys Met Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 48

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 48

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Leu Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Met Ala Cys Ala Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 49

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 49

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Val Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Ala Ala Cys Ala Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 50

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 50

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Leu Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Ala Ala Cys Ala Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 51

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 51

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Val Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Ala Ala Cys Val Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 52

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 52

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Ile Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asp Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Asn Phe Ala Cys Val Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 53

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 53

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Thr Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Ala Ala Cys Leu Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 54

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 54

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Gly Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Met Ala Cys Leu Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 55

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 55

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Thr Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Ala Ala Cys Leu Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 56

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 56

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Ser Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Thr Met Ala Cys Val Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 57

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 57

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Ala Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Tyr Ala Cys Leu Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 58

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 58

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Ala Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Thr Met Ala Cys Cys Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 59

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 59

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Thr Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Thr Phe Ala Cys Met Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 60

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 60

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Thr Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Val Ala Cys Leu Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 61

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 61

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Val Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr AsnLeu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175165 170 175

Leu Leu Gln Gly Tyr Ser Met Ala Cys Thr Asn Lys Gln Tyr Gly ValLeu Leu Gln Gly Tyr Ser Met Ala Cys Thr Asn Lys Gln Tyr Gly Val

            180                 185                 190180 185 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser GlyVal Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205195 200 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu ThrIle Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220210 215 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr GluVal Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240225 230 235 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys PheGly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255245 250 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe LeuTyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270260 265 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu GluLys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285275 280 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu AlaGlu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300290 295 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala AlaGlu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320305 310 315 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu SerLys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335325 330 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val GluGlu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350340 345 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu LeuMet Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365355 360 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala IleGln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380370 375 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys GluThr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400385 390 395 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys LysGlu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415405 410 415

Asn Tyr Cys Leu Ile Cys Trp LysAsn Tyr Cys Leu Ile Cys Trp Lys

            420420

<210> 62

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 62

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Ser Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Ser Phe Ala Cys Leu Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 63

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 63

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Thr Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Asp Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Thr Phe Ala Cys Thr Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 64

<211> 129

<212> DNA

<213> Escherichia coli

<400> 64

agcttcccga taagggagca ggccagtaaa aagcattacc ccgtggtggg gttcccgagc   60

ggccaaaggg agcagactct aaatctgccg tcatcgacct cgaaggttcg aatccttccc  120

ccaccacca                                                          129

<210> 65

<211> 129

<212> RNA

<213> Escherichia coli

<400> 65

agcuucccga uaagggagca ggccaguaaa aagcauuacc ccgugguggg guucccgagc   60

ggccaaaggg agcagacucu aaaucugccg ucaucgaccu cgaagguucg aauccuuccc  120

ccaccacca                                                          129
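
As transcribed here, SEQ ID NO: 65 reads as the RNA form of the DNA in SEQ ID NO: 64. The following is a minimal Python sketch, not part of the patent, that checks this relationship and the declared `<211>` length. The helper name `transcribe` and the variable names are hypothetical, and the sketch assumes that a plain t-to-u substitution is all that separates the two entries.

```python
# Sequences copied from SEQ ID NOs: 64 and 65 of the listing; the grouping
# spaces are removed so that only the bases remain.
SEQ_64_DNA = (
    "agcttcccga taagggagca ggccagtaaa aagcattacc ccgtggtggg gttcccgagc"
    "ggccaaaggg agcagactct aaatctgccg tcatcgacct cgaaggttcg aatccttccc"
    "ccaccacca"
).replace(" ", "")

SEQ_65_RNA = (
    "agcuucccga uaagggagca ggccaguaaa aagcauuacc ccgugguggg guucccgagc"
    "ggccaaaggg agcagacucu aaaucugccg ucaucgaccu cgaagguucg aauccuuccc"
    "ccaccacca"
).replace(" ", "")


def transcribe(dna: str) -> str:
    """Hypothetical helper: rewrite a lowercase DNA string as RNA (t -> u)."""
    return dna.replace("t", "u")


# Both entries declare <211> 129; compare the actual lengths and contents.
lengths_ok = len(SEQ_64_DNA) == 129 and len(SEQ_65_RNA) == 129
same_molecule = transcribe(SEQ_64_DNA) == SEQ_65_RNA
print(lengths_ok, same_molecule)
```

The same length check can be applied to any entry of the listing by comparing the stripped sequence length with its `<211>` field.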

<210> 66

<211> 34

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 66

atgaagtagc tgtcttctat cgaacaagca tgcg                               34

<210> 67

<211> 34

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 67

cgaacaagca tgcgattagt gccgacttaa aaag                               34

<210> 68

<211> 33

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 68

cgctactctc ccaaatagaa aaggtctccg ctg                                33

<210> 69

<211> 32

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 69

ctggaacagc tatagctact gatttttcct cg                                 32

<210> 70

<211> 34

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 70

gccgtcacag attagttggc ttcagtggag actg                               34

<210> 71

<211> 33

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 71

gattggcttc ataggagact gatatgctct aac                                33

<210> 72

<211> 33

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 72

gcctctatag ttgagacagc atagaataat gcg                                33

<210> 73

<211> 35

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 73

gagacagcat agatagagtg cgacatcatc atcgg                              35

<210> 74

<211> 37

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 74

gaataagtgc gacatagtca tcggaagaga gtagtag                            37

<210> 75

<211> 35

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 75

ggtcaaagac agttgtaggt atcgattgac tcggc                              35

<210> 76

<211> 34

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 76

cgctactctc cccaaattta aaaggtctcc gctg                               34

<210> 77

<211> 34

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 77

cgctactctc cccaaatata aaaggtctcc gctg                               34

<210> 78

<211> 34

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 78

cgctactctc cccaaatgga aaaggtctcc gctg                               34

<210> 79

<211> 34

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 79

cgctactctc cccaaagata aaaggtctcc gctg                               34

<210> 80

<211> 34

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 80

cgctactctc cccaaaaaaa aaaggtctcc gctg                               34

<210> 81

<211> 34

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 81

gccgtcacag attttttggc ttcagtggag actg                               34

<210> 82

<211> 34

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 82

gccgtcacag attatttggc ttcagtggag actg                               34

<210> 83

<211> 34

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 83

gccgtcacag attggttggc ttcagtggag actg                               34

<210> 84

<211> 34

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 84

gccgtcacag atgatttggc ttcagtggag actg                               34

<210> 85

<211> 34

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 85

gccgtcacag ataaattggc ttcagtggag actg                               34

<210> 86

<211> 424

<212> PRT

<213> Artificial

<220>

<223> Artificial synthetase

<400> 86

Met Ala Ser Ser Asn Leu Ile Lys Gln Leu Gln Glu Arg Gly Leu Val

1               5                   10                  15

Ala Gln Val Thr Asp Glu Glu Ala Leu Ala Glu Arg Leu Ala Gln Gly

            20                  25                  30

Pro Ile Ala Leu Ile Cys Gly Phe Asp Pro Thr Ala Asp Ser Leu His

        35                  40                  45

Leu Gly His Leu Val Pro Leu Leu Cys Leu Lys Arg Phe Gln Gln Ala

    50                  55                  60

Gly His Lys Pro Val Ala Leu Val Gly Gly Ala Thr Gly Leu Ile Gly

65                  70                  75                  80

Asp Pro Ser Phe Lys Ala Ala Glu Arg Lys Leu Asn Thr Glu Glu Thr

                85                  90                  95

Val Gln Glu Trp Val Asp Lys Ile Arg Lys Gln Val Ala Pro Phe Leu

            100                 105                 110

Asp Phe Asp Cys Gly Glu Asn Ser Ala Ile Ala Ala Asn Asn Tyr Asp

        115                 120                 125

Trp Phe Gly Asn Met Asn Val Leu Thr Phe Leu Arg Asp Ile Gly Lys

    130                 135                 140

His Phe Ser Val Asn Gln Met Ile Asn Lys Glu Ala Val Lys Gln Arg

145                 150                 155                 160

Leu Asn Arg Glu Gly Gln Gly Ile Ser Phe Thr Glu Phe Ser Tyr Asn

                165                 170                 175

Leu Leu Gln Gly Tyr Gly Met Ala Cys Ala Asn Lys Gln Tyr Gly Val

            180                 185                 190

Val Leu Gln Ile Gly Gly Ser Asp Gln Trp Gly Asn Ile Thr Ser Gly

        195                 200                 205

Ile Asp Leu Thr Arg Arg Leu His Gln Asn Gln Val Phe Gly Leu Thr

    210                 215                 220

Val Pro Leu Ile Thr Lys Ala Asp Gly Thr Lys Phe Gly Lys Thr Glu

225                 230                 235                 240

Gly Gly Ala Val Trp Leu Asp Pro Lys Lys Thr Ser Pro Tyr Lys Phe

                245                 250                 255

Tyr Gln Phe Trp Ile Asn Thr Ala Asp Ala Asp Val Tyr Arg Phe Leu

            260                 265                 270

Lys Phe Phe Thr Phe Met Ser Ile Glu Glu Ile Asn Ala Leu Glu Glu

        275                 280                 285

Glu Asp Lys Asn Ser Gly Lys Ala Pro Arg Ala Gln Tyr Val Leu Ala

    290                 295                 300

Glu Gln Val Thr Arg Leu Val His Gly Glu Glu Gly Leu Gln Ala Ala

305                 310                 315                 320

Lys Arg Ile Thr Glu Cys Leu Phe Ser Gly Ser Leu Ser Ala Leu Ser

                325                 330                 335

Glu Ala Asp Phe Glu Gln Leu Ala Gln Asp Gly Val Pro Met Val Glu

            340                 345                 350

Met Glu Lys Gly Ala Asp Leu Met Gln Ala Leu Val Asp Ser Glu Leu

        355                 360                 365

Gln Pro Ser Arg Gly Gln Ala Arg Lys Thr Ile Ala Ser Asn Ala Ile

    370                 375                 380

Thr Ile Asn Gly Glu Lys Gln Ser Asp Pro Glu Tyr Phe Phe Lys Glu

385                 390                 395                 400

Glu Asp Arg Leu Phe Gly Arg Phe Thr Leu Leu Arg Arg Gly Lys Lys

                405                 410                 415

Asn Tyr Cys Leu Ile Cys Trp Lys

            420

<210> 87

<211> 6

<212> PRT

<213> Artificial

<220>

<223> Tryptic peptide including an unnatural amino acid

<220>

<221> MISC_FEATURE

<222> (2)..(2)

<223> X is an unnatural amino acid (p-acetyl-L-phenylalanine, p-benzoyl-L-phenylalanine, p-azido-L-phenylalanine, O-methyl-L-tyrosine or p-iodo-L-phenylalanine) or tryptophan, tyrosine or leucine

<400> 87

Val Xaa Gly Ser Ile Lys

1               5

<210> 88

<211> 11

<212> DNA

<213> Artificial

<220>

<223> B box

<220>

<221> misc_feature

<222> (8)..(8)

<223> n is a, c, g or t

<400> 88

ggttcgantc c                                                        11
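
The B box entry uses the ambiguity code `n` at position 8. As an illustration only, the Python sketch below expands that motif into a regular expression and searches the DNA of SEQ ID NO: 64 with it. The names `IUPAC` and `motif_to_regex` are hypothetical, only the four bases and `n` are handled, and the listing itself does not state where, or whether, the motif occurs in any other entry.

```python
import re

# Minimal ambiguity table: only the symbols that appear in SEQ ID NO: 88.
IUPAC = {"a": "a", "c": "c", "g": "g", "t": "t", "n": "[acgt]"}


def motif_to_regex(motif: str) -> re.Pattern:
    """Hypothetical helper: compile a lowercase motif with 'n' wildcards."""
    bases = motif.replace(" ", "")
    return re.compile("".join(IUPAC[base] for base in bases))


B_BOX = "ggttcgantc c"  # SEQ ID NO: 88 as printed, with its grouping space

# SEQ ID NO: 64 copied from the listing, grouping spaces removed.
SEQ_64_DNA = (
    "agcttcccga taagggagca ggccagtaaa aagcattacc ccgtggtggg gttcccgagc"
    "ggccaaaggg agcagactct aaatctgccg tcatcgacct cgaaggttcg aatccttccc"
    "ccaccacca"
).replace(" ", "")

match = motif_to_regex(B_BOX).search(SEQ_64_DNA)
if match is not None:
    # Report 1-based inclusive coordinates, as sequence listings count bases.
    print(match.start() + 1, match.end(), match.group())
```

In this sketch the motif is found once, at bases 105 to 115 of SEQ ID NO: 64, with `n` matched by `a`.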

<210> 89

<211> 82

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 89

ggggggaccg gtggggggac cggtaagctt cccgataagg gagcaggcca gtaaaaagca   60

ttaccccgtg gtgggttccc ga                                            82

<210> 90

<211> 90

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 90

ggcggcgcta gcaagcttcc cgataaggga gcaggccagt aaaaagggaa gttcagggac   60

ttttgaaaaa aatggtggtg ggggaaggat                                    90

<210> 91

<211> 68

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<220>

<221> misc_feature

<222> (1)..(1)

<223> n = I

<220>

<221> misc_feature

<222> (14)..(14)

<223> n = I

<400> 91

nggggggacc ggtngggggg accggtcggg atcgaagaaa tgatggtaaa tgaaatagga   60

aatcaagg                                                            68

<210> 92

<211> 62

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 92

gggggggaat tcagttgatt gtatgcttgg tatagcttga aatattgtgc agaaaaagaa   60

ac                                                                  62

<210> 93

<211> 86

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 93

tcataacgag aattccggga tcgaagaaat gatggtaaat gaaataggaa atctcataac   60

gagaattcat ggcaagcagt aacttg                                        86

<210> 94

<211> 72

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 94

ttactacgtg cggccgcatg gcaagcagta acttgttact acgtgcggcc gcttatttcc   60

agcaaatcag ac                                                       72

<210> 95

<211> 28

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 95

ccgatcgcgc tcgcttgcgg cttcgatc                                      28

<210> 96

<211> 27

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 96

atcgcggcga acgcctatga ctggttc                                       27

<210> 97

<211> 40

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 97

gttgcagggt tatgccgccg cctgtgcgaa caaacagtac                         40

<210> 98

<211> 26

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 98

gccgctttgc tatcaagtat aaatag                                        26

<210> 99

<211> 21

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 99

caagccgaca accttgattg g                                             21

<210> 100

<211> 60

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 100

ggggacaagt ttgtacaaaa aagcaggcta cgccaatttt aatcaaagtg ggaatattgc   60

<210> 101

<211> 60

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 101

ggggacaagt ttgtacaaaa aagcaggcta ggccaatttt aatcaaagtg ggaatattgc   60

<210> 102

<211> 58

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 102

ggggaccact ttgtacaaga aagctgggtt actctttttt tgggtttggt ggggtatc     58

<210> 103

<211> 22

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 103

aagctatacc aagcatacaa tc                                            22

<210> 104

<211> 49

<212> DNA

<213> Artificial

<220>

<223> Oligonucleotide primer

<400> 104

acaaggcctt gctagcttac tctttttttg ggtttggtgg ggtatcttc               49

Claims (61)

1. A composition comprising a protein, wherein the protein comprises at least one unnatural amino acid and at least one post-translational modification, and wherein the at least one post-translational modification comprises attachment of a molecule comprising a second reactive group, by a [3+2] cycloaddition, to the at least one unnatural amino acid comprising a first reactive group.
2. The composition of claim 1, wherein the molecule is a dye, a polymer, a polyethylene glycol derivative, a photocrosslinker, a cytotoxic compound, an affinity label, a biotin derivative, a resin, a second protein or polypeptide, a metal chelator, a cofactor, a fatty acid, a carbohydrate, or a polynucleotide.
3. The composition of claim 1, wherein the first reactive group is an alkynyl or azido moiety, and the second reactive group is an azido or alkynyl moiety.
4. The composition of claim 3, wherein the first reactive group is the alkynyl moiety and the second reactive group is the azido moiety.
5. The composition of claim 4, wherein the unnatural amino acid comprises p-propargyloxyphenylalanine.
6. The composition of claim 3, wherein the first reactive group is the azido moiety and the second reactive group is the alkynyl moiety.
7. The composition of claim 6, wherein the unnatural amino acid comprises p-azido-L-phenylalanine.
8. The composition of claim 6, wherein the at least one post-translational modification is made in vivo in a eukaryotic cell.
9. A composition comprising an unnatural amino acid having the following chemical structure:
Figure A2004800211550002C1
10. The composition of claim 9, further comprising an orthogonal tRNA.
11. The composition of claim 10, wherein the unnatural amino acid is covalently bonded to the orthogonal tRNA.
12. The composition of claim 10, wherein the unnatural amino acid is covalently bonded to the orthogonal tRNA through an amino-acyl bond.
13. The composition of claim 10, wherein the unnatural amino acid is covalently bonded to a 3'OH or a 2'OH of the terminal ribose of the orthogonal tRNA.
14. A protein comprising the unnatural amino acid of claim 9.
15. A cell comprising the unnatural amino acid of claim 9.
16. A composition comprising an azido dye having the following structure:
17. A composition comprising an azido dye having the following structure:
Figure A2004800211550003C2
18. A protein comprising the azido dye of claim 16 or 17.
19. The protein of claim 18, further comprising at least one unnatural amino acid, wherein the azido dye is attached to the unnatural amino acid through a [3+2] cycloaddition.
20. The protein of claim 19, wherein the unnatural amino acid comprises an alkynyl amino acid.
21. A composition comprising an alkynyl polyethylene glycol having the following structure:
Figure A2004800211550003C3
wherein n is an integer between 100 and 2,000.
22. The composition of claim 21, wherein the alkynyl polyethylene glycol has a molecular weight of about 5,000 to about 100,000 Da.
23. A protein comprising the alkynyl polyethylene glycol of claim 21.
24. The protein of claim 23, further comprising at least one unnatural amino acid, wherein the alkynyl polyethylene glycol is attached to the unnatural amino acid through a [3+2] cycloaddition.
25. The protein of claim 24, wherein the unnatural amino acid comprises an azido amino acid.
26. the method for synthetic right-(propargyloxy) phenylalanine compound, this method comprises:
With uncle N--butoxy carbonyl-tyrosine and K 2CO 3Be suspended in the dry DMF;
Propargyl bromide is added in the reaction mixture of (a), alkanisation hydroxyl and carboxylic group produce the protection intermediate compound with following structure:
Figure A2004800211550004C1
With
To protect intermediate compound in MeOH, to mix, and make amine moiety go protection with anhydrous HCl, thus synthetic right-(propargyloxy) phenylalanine compound.
27. The method of claim 26, further comprising:
dissolving the p-(propargyloxy)phenylalanine HCl in a solution of NaOH and MeOH and stirring at room temperature;
adjusting the pH to 7; and
precipitating the p-(propargyloxy)phenylalanine compound.
28. A method of synthesizing an azido dye, the method comprising:
providing a dye compound comprising a sulfonyl halide moiety;
warming the dye compound to room temperature in the presence of 3-azidopropylamine and triethylamine; and
coupling the amine moiety of the 3-azidopropylamine to the halide position of the dye compound, thereby synthesizing the azido dye.
29. The method of claim 28, wherein the dye compound comprises dansyl chloride and the azido dye comprises the composition of claim 16.
30. The method of claim 28, further comprising:
purifying the azido dye from the reaction mixture.
31. A method of synthesizing an azido dye, the method comprising:
providing an amine-containing dye compound;
combining the amine-containing dye compound with a carbodiimide and 4-(3-azidopropylcarbamoyl)-butyric acid in a suitable solvent, and coupling the carbonyl group of the acid to the amine moiety of the dye compound, thereby synthesizing the azido dye.
32. The method of claim 31, wherein the carbodiimide comprises 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride (EDCI).
33. The method of claim 31, wherein the amine-containing dye comprises fluoresceinamine and the suitable solvent comprises pyridine.
34. The method of claim 31, wherein the amine-containing dye comprises fluoresceinamine and the azido dye comprises the composition of claim 17.
35. The method of claim 31, further comprising:
precipitating the azido dye;
washing the precipitate with HCl;
dissolving the washed precipitate in EtOAc; and
precipitating the azido dye in hexane.
36. A method of synthesizing a propargyl amide polyethylene glycol, the method comprising: reacting propargylamine with a polyethylene glycol (PEG)-hydroxysuccinimide ester in an organic solvent at room temperature, producing the propargyl amide polyethylene glycol of claim 21.
37. The method of claim 36, wherein the organic solvent comprises CH2Cl2.
38. The method of claim 36, further comprising: precipitating the propargyl amide polyethylene glycol with ethyl acetate.
39. The method of claim 38, further comprising: recrystallizing the propargyl amide polyethylene glycol from methanol; and drying the product under vacuum.
40. A eukaryotic cell comprising an orthogonal aminoacyl-tRNA synthetase (O-RS), wherein the O-RS preferentially aminoacylates an orthogonal tRNA (O-tRNA) with at least one unnatural amino acid in the eukaryotic cell, wherein:
(a.) the O-RS, or a portion thereof, is encoded by a polynucleotide sequence set forth in any one of SEQ ID NO.: 20-25, a complementary polynucleotide sequence thereof, or a conservative variant thereof;
(b.) the O-RS comprises an amino acid sequence set forth in any one of SEQ ID NO.: 48-63, or a conservative variant thereof;
(c.) the O-RS comprises an amino acid sequence that is at least 90% identical to that of a naturally occurring tyrosyl aminoacyl-tRNA synthetase (TyrRS) and comprises two or more amino acids selected from the group consisting of: glycine, serine, or alanine at a position corresponding to Tyr37 of E. coli TyrRS; aspartic acid at a position corresponding to Asn126 of E. coli TyrRS; asparagine at a position corresponding to Asp182 of E. coli TyrRS; alanine or valine at a position corresponding to Phe183 of E. coli TyrRS; and methionine, valine, cysteine, or threonine at a position corresponding to Leu186 of E. coli TyrRS;
(d.) the O-RS aminoacylates the O-tRNA with the at least one unnatural amino acid with an efficiency that is at least 50% of that of an O-RS having the amino acid sequence set forth in SEQ ID NO.: 45.
41. The cell of claim 40, wherein the cell further comprises an orthogonal tRNA (O-tRNA), wherein the O-tRNA recognizes a selector codon and is preferentially aminoacylated with the at least one unnatural amino acid by the O-RS, wherein the O-tRNA is produced in the cell by cellular processing of a nucleic acid corresponding to SEQ ID NO.: 65, and the O-RS comprises a polypeptide sequence selected from SEQ ID NO.: 48-63 and conservative variants thereof.
42. A polypeptide selected from:
(a) a polypeptide comprising an amino acid sequence set forth in any one of SEQ ID NO.: 48-63;
(b) a polypeptide comprising an amino acid sequence encoded by a polynucleotide sequence set forth in any one of SEQ ID NO.: 20-35;
(c) a polypeptide that is specifically immunoreactive with an antibody specific for a polypeptide of (a) or (b);
(d) a polypeptide comprising an amino acid sequence that is at least 90% identical to that of a naturally occurring tyrosyl aminoacyl-tRNA synthetase (TyrRS) and comprising two or more amino acids selected from the group consisting of: glycine, serine, or alanine at a position corresponding to Tyr37 of E. coli TyrRS; aspartic acid at a position corresponding to Asn126 of E. coli TyrRS; asparagine at a position corresponding to Asp182 of E. coli TyrRS; alanine or valine at a position corresponding to Phe183 of E. coli TyrRS; and methionine, valine, cysteine, or threonine at a position corresponding to Leu186 of E. coli TyrRS;
(e) a polypeptide comprising at least 20 contiguous amino acids of SEQ ID NO.: 36-48 or 86 and two or more amino acid substitutions selected from the group consisting of: glycine, serine, or alanine at a position corresponding to Tyr37 of E. coli TyrRS; aspartic acid at a position corresponding to Asn126 of E. coli TyrRS; asparagine at a position corresponding to Asp182 of E. coli TyrRS; alanine or valine at a position corresponding to Phe183 of E. coli TyrRS; and methionine, valine, cysteine, or threonine at a position corresponding to Leu186 of E. coli TyrRS; and
(f) a polypeptide comprising an amino acid sequence that is a conservative variant of (a), (b), (c), (d), or (e).
43. A composition comprising the polypeptide of claim 42 and an excipient.
44. An antibody or antiserum specifically immunoreactive with the polypeptide of claim 42.
45. A composition comprising the polypeptide of claim 42 and an excipient.
46. An antibody or antiserum specifically immunoreactive with the polypeptide of claim 42.
47. A polynucleotide selected from the group consisting of:
(a) a polynucleotide comprising a nucleotide sequence set forth in any one of SEQ ID NO.: 20-35;
(b) a polynucleotide that is complementary to, or that encodes, a polynucleotide sequence of (a);
(c) a polynucleotide encoding a polypeptide that comprises an amino acid sequence set forth in any one of SEQ ID NO.: 48-63, or a conservative variant thereof;
(d) a polynucleotide encoding the polypeptide of claim 42;
(e) a nucleic acid that hybridizes to a polynucleotide of (a), (b), (c), or (d) under highly stringent conditions over substantially the entire length of the nucleic acid;
(f) a polynucleotide encoding a polypeptide that comprises an amino acid sequence at least 90% identical to that of a naturally occurring tyrosyl aminoacyl-tRNA synthetase (TyrRS) and comprises two or more mutations selected from the group consisting of: glycine, serine, or alanine at a position corresponding to Tyr37 of E. coli TyrRS; aspartic acid at a position corresponding to Asn126 of E. coli TyrRS; asparagine at a position corresponding to Asp182 of E. coli TyrRS; alanine or valine at a position corresponding to Phe183 of E. coli TyrRS; and methionine, valine, cysteine, or threonine at a position corresponding to Leu186 of E. coli TyrRS;
(g) a polynucleotide that is at least 98% identical to a polynucleotide of (a), (b), (c), (d), (e), or (f); and
(h) a polynucleotide comprising a conservative variant of (a), (b), (c), (d), (e), (f), or (g).
48. A vector comprising the polynucleotide of claim 47.
49. The vector of claim 48, wherein the vector comprises a plasmid, a cosmid, a phage, or a virus.
50. The vector of claim 48, wherein the vector is an expression vector.
51. A cell comprising the vector of claim 48.
52. A method of producing, in a eukaryotic cell, at least one protein comprising at least one unnatural amino acid, the method comprising:
growing, in an appropriate medium, a eukaryotic cell that comprises a nucleic acid, the nucleic acid comprising at least one selector codon and encoding the protein; wherein the medium comprises the unnatural amino acid, and the eukaryotic cell comprises:
an orthogonal tRNA (O-tRNA) that functions in the cell and recognizes the selector codon; and
an orthogonal aminoacyl-tRNA synthetase (O-RS) that preferentially aminoacylates the O-tRNA with the unnatural amino acid, wherein the O-RS comprises an amino acid sequence corresponding to SEQ ID NO.: 48-53.
53. A method of producing, in a eukaryotic cell, at least one protein comprising at least one unnatural amino acid, the method comprising:
growing, in an appropriate medium, a eukaryotic cell that comprises a nucleic acid, the nucleic acid comprising at least one selector codon and encoding the protein; wherein the medium comprises the unnatural amino acid, and the eukaryotic cell comprises an orthogonal tRNA (O-tRNA) that functions in the cell and recognizes the selector codon and an orthogonal aminoacyl-tRNA synthetase (O-RS) that preferentially aminoacylates the O-tRNA with the unnatural amino acid;
incorporating the unnatural amino acid into the protein in the eukaryotic cell, wherein the unnatural amino acid comprises a first reactive group; and
contacting the protein with a molecule that comprises a second reactive group; wherein the first reactive group reacts with the second reactive group to attach the molecule to the unnatural amino acid through a [3+2] cycloaddition.
54. The method of claim 53, wherein the molecule is a dye, a polymer, a polyethylene glycol derivative, a photocrosslinker, a cytotoxic compound, an affinity label, a biotin derivative, a resin, a second protein or polypeptide, a metal chelator, a cofactor, a fatty acid, a carbohydrate, or a polynucleotide.
55. The method of claim 53, wherein the first reactive group is an alkynyl or azido moiety and the second reactive group is an azido or alkynyl moiety.
56. The method of claim 55, wherein the first reactive group is the alkynyl moiety and the second reactive group is the azido moiety.
57. The method of claim 56, wherein the unnatural amino acid comprises p-propargyloxyphenylalanine.
58. The method of claim 55, wherein the first reactive group is the azido moiety and the second reactive group is the alkynyl moiety.
59. The method of claim 58, wherein the unnatural amino acid comprises p-azido-L-phenylalanine.
60. A protein produced by the method of claim 53.
61. The protein of claim 60, wherein the protein is modified by at least one post-translational modification in vivo, wherein the post-translational modification is selected from: N-glycosylation, O-glycosylation, acetylation, acylation, lipid modification, palmitoylation, palmitate addition, phosphorylation, and glycolipid-linkage modification.
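
The "two or more amino acids selected from the group" condition that recurs in claims 40, 42 and 47 can be pictured as a simple lookup. The following is an illustrative sketch only, not a claim-construction tool: it assumes the residues of a candidate synthetase have already been mapped to E. coli TyrRS numbering, the input format and function names are hypothetical, and the separate requirement of at least 90% identity to a naturally occurring TyrRS is not checked.

```python
# Listed residues per position (E. coli TyrRS numbering), transcribed from
# the claims; one-letter amino acid codes.
ALLOWED = {
    37: {"G", "S", "A"},         # Tyr37  -> glycine, serine or alanine
    126: {"D"},                  # Asn126 -> aspartic acid
    182: {"N"},                  # Asp182 -> asparagine
    183: {"A", "V"},             # Phe183 -> alanine or valine
    186: {"M", "V", "C", "T"},   # Leu186 -> methionine, valine, cysteine or threonine
}


def listed_substitutions(residues):
    """Return the sorted positions at which `residues` carries a listed residue.

    `residues` maps an E. coli TyrRS position to a one-letter code
    (hypothetical input format chosen for this sketch).
    """
    return sorted(
        pos for pos, allowed in ALLOWED.items() if residues.get(pos) in allowed
    )


def has_two_or_more(residues):
    """True when at least two of the listed substitutions are present."""
    return len(listed_substitutions(residues)) >= 2


if __name__ == "__main__":
    wild_type = {37: "Y", 126: "N", 182: "D", 183: "F", 186: "L"}
    # Hypothetical variant for illustration; not a sequence from the filing.
    variant = {37: "G", 126: "D", 182: "D", 183: "F", 186: "L"}
    print(listed_substitutions(wild_type), has_two_or_more(wild_type))
    print(listed_substitutions(variant), has_two_or_more(variant))
```

Keeping the table keyed by position means that each listed position is counted at most once, whichever of its permitted residues is present.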
CNA2004800211558A 2003-06-18 2004-04-16 Genetic code increase of non-natural active amino acid Pending CN101160525A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210057706.2A CN102618605B (en) 2003-06-18 2004-04-16 Unnatural reactive amino acid genetic code increases

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US47993103P 2003-06-18 2003-06-18
US60/479,931 2003-06-18
US60/493,014 2003-08-05
US60/496,548 2003-08-19

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN201210057706.2A Division CN102618605B (en) 2003-06-18 2004-04-16 Unnatural reactive amino acid genetic code increases

Publications (1)

Publication Number Publication Date
CN101160525A true CN101160525A (en) 2008-04-09

Family

ID=39307965

Family Applications (1)

Application Number Title Priority Date Filing Date
CNA2004800211558A Pending CN101160525A (en) 2003-06-18 2004-04-16 Genetic code increase of non-natural active amino acid

Country Status (1)

Country Link
CN (1) CN101160525A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104328086A (en) * 2006-09-08 2015-02-04 Ambrx公司 Site Specific Incorporation of Non-Natural Amino Acids by Vertebrate Cells
CN102307905A (en) * 2008-12-10 2012-01-04 斯克利普斯研究院 Production of carrier-peptide conjugates using chemically reactive unnatural amino acids
CN102307905B (en) * 2008-12-10 2015-11-25 斯克利普斯研究院 Chemical reactivity alpha-non-natural amino acid is utilized to produce carrier-peptide conjugate
CN104203971A (en) * 2012-01-20 2014-12-10 医药研究委员会 Polypeptides and methods
CN104203971B (en) * 2012-01-20 2019-08-09 医药研究委员会 Peptides and methods
CN110959041A (en) * 2017-06-02 2020-04-03 Ambrx公司 Methods and compositions for enhancing the production of proteins containing unnatural amino acids
CN110959041B (en) * 2017-06-02 2023-09-29 Ambrx公司 Methods and compositions for promoting production of proteins containing unnatural amino acids
US11851662B2 (en) 2017-06-02 2023-12-26 Ambrx, Inc. Methods and compositions for promoting non-natural amino acid-containing protein production
CN108752452A (en) * 2018-06-12 2018-11-06 中国医学科学院北京协和医院 The application of SARS and its mutant
CN108752452B (en) * 2018-06-12 2022-03-08 中国医学科学院北京协和医院 SARS and its mutant application
CN118291421A (en) * 2024-04-02 2024-07-05 湖南艾科瑞生物工程有限公司 Taq DNA polymerase mutant and its preparation method and application

Similar Documents

Publication Publication Date Title
CN102618605B (en) Unnatural reactive amino acid genetic code increases
CN101223272B (en) Expanding the eukaryotic genetic code
CN101160525A (en) Genetic code increase of non-natural active amino acid
AU2013201487B2 (en) Expanding the eukaryotic genetic code

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20080409