
CA2243533A1 - Ligand directed enzyme prodrug therapy - Google Patents

  • ️Thu Jul 31 1997

CA 02243533 1998-07-20
LIGAND DIRECTED ENZYME PRODRUG THERAPY

Introduction and background to the invention
The present invention relates to ligand directed enzyme prodrug therapy (LIDEPT) and its use in the treatment of disease, including tumours.

A therapeutic approach termed "antibody-directed enzyme prodrug therapy" (ADEPT) has been proposed as a method for treating tumour cells in patients using prodrugs. Tumour cells are targeted with an antibody conjugated to an enzyme capable of activating a prodrug. The antibody in the conjugate binds to tumour cells in order that the enzyme can convert the prodrug to an active drug outside the tumour cells (see e.g. WO88/07378). Alternatively, methods for the delivery of genes which encode an enzyme have been used (see e.g. EP-A-415 731). Such methods include calcium phosphate co-precipitation, microinjection, liposomes, directed DNA uptake, and receptor-mediated DNA transfer. These are reviewed in Morgan & French Anderson, Annu. Rev. Biochem., 1993, 62; 191. The term "GDEPT" (gene-directed enzyme prodrug therapy) is used to include both viral and non-viral delivery systems.

The present invention relates to a new class of enzyme delivery systems, involving the use of ligands which recognise receptors on the surface of tumour cells. Tumour cells are targeted with a ligand-enzyme conjugate or a ligand-enzyme fusion.

Disclosure of the invention
The present invention provides a two component system for use in association with one another comprising:
(a) a fusion protein or conjugate of a ligand with enzyme; and
(b) a prodrug which can be converted into an active drug by said enzyme.

The ligand component of the above system is preferably a ligand which is:
(i) a naturally occurring polypeptide whose biological role is to bind to a cognate receptor on the surface of the cell; or
(ii) a fragment of said polypeptide which still binds to its cognate receptor; or
(iii) derivatives of (i) and (ii) above with altered receptor specificity.

The invention also provides the system of the invention for use in a method of treatment of a patient, and a method of treating a tumour in a patient in need of treatment which comprises administering to said patient an effective ligand-enzyme capable of binding to the receptor and a prodrug capable of being converted by said enzyme to an active drug.
The invention further provides novel conjugates of a ligand (or fragments or derivatives thereof) with an enzyme.

Brief description of the drawings
Figure 1 shows schematically a ligand, an enzyme, and some fusion proteins containing them.

Detailed description of the invention.

A. Ligands
Ligands which are naturally occurring polypeptides whose biological role is to bind to a cognate receptor are preferably naturally occurring mammalian polypeptides. Desirably, such polypeptides will be selected from the species of mammal which is to be treated by the invention. Thus, when the system of the invention is intended for human use, the polypeptide will preferably be of human origin. In veterinary applications, the polypeptide may be, for example, bovine, ovine, porcine, canine or feline. In research applications, the polypeptide may be of these or other origin, e.g. primate, rodent (such as mouse, rat or rabbit).

WO 97/26918 PCT/GB97/00221
Examples of ligands include epidermal growth factor, EGF (which binds to epidermal growth factor receptor), heregulin and c-erbB2 ligand. EGFR and c-ErbB2 are expressed in a number of tumour types.

In general, examples of suitable ligands include those that recognise receptors that are expressed exclusively, or in greater number, on tumour cells or on vasculature that is in the vicinity of the tumour.

An example of the latter is the vascular endothelial growth factor (VEGF) receptor. VEGF is a regulator of tumour angiogenesis which is produced by malignant cells and acts on tumour endothelial cells which express high affinity VEGF receptors. The known VEGF receptors are tyrosine kinases called flt-1 and flk-1, which are specifically expressed in endothelial cells of the tumour and in the border between the tumour and normal tissue (Science 253, 989, 1992; Cell 72, 835, 1993). Flk-1 is specifically expressed in endothelial cells during embryonic development but is down-regulated in the adult, when angiogenesis stops. Tumours are dependent on angiogenesis since they require a continuous source of nutrients and oxygen from the tumour vasculature in order to grow further. Therefore the ligand VEGF, which binds to flt-1 and/or flk-1, may be used to selectively target an enzyme to the tumour environment. The advantage in targeting the tumour endothelium is that it overcomes the problem of poor penetration of large molecules (eg. antibodies) due to lack of accessibility. Also, the clearance time in vivo for a ligand is much shorter, so that less time is needed for clearance from normal tissues and blood between administration of the ligand-enzyme and injection of the prodrug than is required for antibody-enzyme conjugates.

The VEGF may be conjugated to, or made as a fusion protein with, a non-endogenous enzyme that can selectively catalyse the conversion of a prodrug to a toxic drug in LIDEPT. In this LIDEPT system, flt-1 and/or flk-1 would effectively be targeted with a non-endogenous enzyme that converts prodrugs to toxic drugs. Enzymes and prodrugs which may be used include those disclosed in WO88/07378, WO89/10140, WO90/02729, EP-A-415 731, WO91/03460, WO93/08288, WO94/25429, WO95/02420, WO95/03830, PCT/GB95/01783 and PCT/GB95/01782, the disclosures of which are incorporated herein by reference.

Using this anti-angiogenic approach will serve a triple purpose since the toxic drugs are small molecules that are able to diffuse through the tumour vasculature and the tumour. Firstly, it will kill the tumour vasculature endothelial cells. Secondly, it will kill the tumour cells by exerting a bystander effect. Thirdly, it will also kill the cells of the tumour by starving them of nutrients and oxygen.

Fragments of a ligand may also be used, providing they still bind to the cognate receptor to which the ligand binds. Desirably, the binding will be of substantially similar affinity, e.g. from about 10-fold lower to 10-fold higher, or higher affinity, e.g. up to about 20 to 100-fold higher. The binding will also be of substantially similar selectivity for the cognate receptor, so that the fragment does not cross react or bind to other cell surface receptors to a substantially greater degree than the ligand itself.
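The affinity ranges stated above can be sketched as a simple comparison. The following is an illustrative sketch only: it assumes affinity is reported as a dissociation constant (Kd, where a lower value means tighter binding), and the function names and category labels are hypothetical, not taken from the specification.

```python
def affinity_fold_change(kd_ligand: float, kd_fragment: float) -> float:
    """Fold change in affinity of the fragment relative to the ligand.

    Assumes both values are dissociation constants in the same units,
    so a value above 1 means the fragment binds more tightly.
    """
    if kd_ligand <= 0 or kd_fragment <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_ligand / kd_fragment


def classify_affinity(fold: float) -> str:
    """Place a fold change into the ranges mentioned in the text."""
    if fold < 0.1:
        return "more than 10-fold lower"
    if fold <= 10:
        return "substantially similar"
    if fold <= 100:
        return "higher"
    return "above the illustrated range"


# Hypothetical values: ligand Kd 10 nM, fragment Kd 2 nM
print(classify_affinity(affinity_fold_change(10.0, 2.0)))
```

The thresholds mirror the "about 10-fold" and "up to about 20 to 100-fold" figures in the text; since those are approximate, the hard cut-offs here are a simplification.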

The binding affinity of fragments may be determined by standard techniques, e.g. 125I ligand competition assays. The binding specificity may be determined by e.g. 125I ligand-enzyme biodistribution in vitro.

Fragments of ligands may be generated in any desired way. For example, ligands may be cleaved chemically, eg. using trypsin or other protein fragmentation compounds such as cyanogen bromide, or other proteolytic enzymes. Fragments of the ligands may be separated by chromatography and recovered.

Conditions for cleavage, chromatography and recovery of proteins are well known in the art. More usually, fragments will be generated by recombinant DNA techniques. DNA encoding the ligands can be spliced at restriction sites to delete regions of the DNA encoding the ligand. Where the fragment of the ligand is a C-terminal truncated fragment, the DNA may be altered by site directed mutagenesis to introduce a stop codon at the desired point of truncation. Other methods of manipulating DNA may be found in standard reference books, eg Sambrook et al, Molecular Cloning, CSH, 1987.
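The effect of introducing a stop codon at a chosen point of truncation can be illustrated in silico. This is a minimal sketch, assuming the standard genetic code and a short made-up coding sequence; the function names and the example sequence are hypothetical and do not come from the specification.

```python
from itertools import product

# Standard genetic code, built in TCAG order (TTT, TTC, TTA, TTG, TCT, ...)
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def introduce_stop(dna: str, keep_residues: int, stop: str = "TAA") -> str:
    """Replace the codon after the last retained residue with a stop codon."""
    start = 3 * keep_residues
    if keep_residues < 1 or start + 3 > len(dna):
        raise ValueError("truncation point lies outside the coding sequence")
    return dna[:start] + stop + dna[start + 3:]


# Hypothetical 5-residue coding sequence (not a real ligand)
cds = "ATGGCTAAAGGTTGGTAA"
print(translate(cds))                     # full-length product
print(translate(introduce_stop(cds, 3)))  # C-terminally truncated product
```

In practice the mutation would be made in a vector by site directed mutagenesis; the sketch only shows the expected consequence for the encoded polypeptide.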

The truncated DNA may then be expressed in a recombinant expression system to produce the ligand fragment, and the fragment recovered. As described herein, the DNA encoding the fragment may also be linked to DNA encoding the enzyme of the LIDEPT system so as to provide a fusion protein.

Derivatives of ligands and their fragments may also be made by recombinant DNA techniques. Site directed mutagenesis may be used to introduce amino acid substitutions, deletions or insertions into the ligand. One class of substitution which may be introduced is conservative substitutions.

Conservative substitutions may be made according to the following table, where amino acids in the same block in the second column and preferably in the same line in the third column may be substituted for each other:

ALIPHATIC   Non-polar           G A P
                                I L V
            Polar - uncharged   C S T M
                                N Q
            Polar - charged     D E
                                K R
AROMATIC                        H F W Y
OTHER                           N Q D E
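The table lookup described above can be sketched as follows. This is an illustrative sketch only: the table rows are transcribed from the text, while the function name and the three result labels ("preferred", "allowed", "not conservative") are hypothetical names chosen for the example.

```python
# Each row: (first column, second column, residues on one line of the third column)
SUBSTITUTION_TABLE = [
    ("ALIPHATIC", "Non-polar", "GAP"),
    ("ALIPHATIC", "Non-polar", "ILV"),
    ("ALIPHATIC", "Polar - uncharged", "CSTM"),
    ("ALIPHATIC", "Polar - uncharged", "NQ"),
    ("ALIPHATIC", "Polar - charged", "DE"),
    ("ALIPHATIC", "Polar - charged", "KR"),
    ("AROMATIC", "", "HFWY"),
    ("OTHER", "", "NQDE"),
]


def classify_substitution(original: str, replacement: str) -> str:
    """Classify a substitution using one-letter amino acid codes."""
    a, b = original.upper(), replacement.upper()
    if len(a) != 1 or len(b) != 1:
        raise ValueError("use one-letter amino acid codes")
    # Same line in the third column: the preferred case
    if any(a in line and b in line for _, _, line in SUBSTITUTION_TABLE):
        return "preferred"
    # Same block in the second column: still conservative
    blocks = {}
    for first, second, line in SUBSTITUTION_TABLE:
        blocks.setdefault((first, second), set()).update(line)
    if any(a in block and b in block for block in blocks.values()):
        return "allowed"
    return "not conservative"


print(classify_substitution("I", "L"))  # same line
print(classify_substitution("G", "V"))  # same block, different line
print(classify_substitution("G", "W"))  # different blocks
```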

As with ligand fragments, it is preferred that the derivatives bind to their cognate receptor with substantially similar or higher affinity and selectivity as the ligand itself, wherein the preferred degree of selectivity and affinity is as defined above for ligand fragments.

B. Enzymes
The enzyme may be any enzyme which is not normally expressed on the surface of a cell, nor released into the circulation, particularly a mammalian (especially human) cell, and which is capable of converting a prodrug into an active drug. The enzyme may be a mammalian enzyme which does not naturally occur in a human or a human enzyme which is not normally accessible to the prodrug. This includes enzymes from other species as well as mammalian enzymes which are altered in a manner which is selective for the prodrug. In other words, the alteration means that the conversion of the prodrug to an active drug by the natural enzyme will be at a rate one or more orders of magnitude less than the rate at which the altered enzyme operates. Altered enzymes may be made by standard recombinant DNA techniques, e.g. by cloning the enzyme, determining its gene sequence and altering the gene sequence by methods such as site-directed mutagenesis.

The enzyme will usually convert the prodrug into an active drug by removing a protecting group from the prodrug. In most cases, the protecting group will be cleaved as a whole from the prodrug. However, it is also possible for the enzyme to cleave or simply alter part of the protecting group, resulting in a partially cleaved or altered protecting group which is unstable, resulting in spontaneous removal of the remainder of the group. Such prodrugs are of particular use in association with the nitroreductase enzyme described below.
Preferably, the enzyme is a non-mammalian enzyme. Suitable non-mammalian enzymes include bacterial enzymes. Bacterial enzymes include carboxypeptidases, such as carboxypeptidase G2 (CPG2), as disclosed in WO88/07378, and Pseudomonas γ-glutamylhydrolase EC3.4.22.12 (Levy CC & Goldstein P, J. Biol. Chem. 242; p2933 (1967)) and nitroreductases, such as an E.coli nitroreductase as disclosed in WO93/08288. Examples of other suitable enzymes include thymidine kinase (tk), especially viral tk such as VZV or HSV tk; and β-lactamase and β-glucuronidase. Other enzymes include penicillin V amidase, penicillin G amidase and cytosine deaminase.

Fusion proteins will be expressed in both eukaryotic (e.g. insect, mammalian) cells and non-eukaryotic (bacterial) cells as required for each ligand. Many ligands are processed through the Golgi apparatus and endoplasmic reticulum, where they become glycosylated; this may be important for binding activity. In these cases, eukaryotic based expression systems will be employed. In the case of fusion proteins, especially for fusions between non-eukaryotic enzymes and ligands, where the ligand needs to pass through the Golgi, the enzyme may also pass through the Golgi/ER and may consequently also become glycosylated; this may lead to a reduction in activity of the enzyme compared to its non-glycosylated form.

In a preferred aspect of the invention the enzyme has been altered by substitution, deletion or insertion at one or more (e.g. two, three or four) glycosylation sites. For example, within the primary amino acid sequence of CPG2, there are three such consensus motifs, located at residues Asn 222, Asn 264 and Asn 272. Alteration of one or more of these sites is preferred when CPG2 is used in the present invention. Desirably, the alteration is substitution to leucine or glutamine.
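Locating and altering such sites can be sketched computationally. The example below is an illustration only: it assumes the consensus motif is the usual N-linked sequon Asn-X-Ser/Thr with X not proline (the text refers only to "consensus motifs"), it uses a short made-up sequence rather than CPG2, and the function names are hypothetical.

```python
import re

# Assumed N-linked glycosylation sequon: N, any residue except P, then S or T.
# The lookahead lets overlapping motifs be found as well.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of the Asn residue in each sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein)]


def remove_sequons(protein: str, replacement: str = "Q") -> str:
    """Substitute the Asn of every sequon (Q or L, following the text)."""
    residues = list(protein)
    for position in find_sequons(protein):
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical sequence: N3 and N12 sit in sequons, N8 is followed by proline
toy = "MKNGSALNPTQNVTE"
print(find_sequons(toy))
print(remove_sequons(toy))
```

Whether a given substitution preserves enzyme activity cannot be predicted from the sequence alone and would have to be measured, as the specification goes on to describe.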

In general, alterations to enzymes which are amino acid substitutions of from at least 1 to 11 glycosylation sites are particularly preferred, although deletions or insertions of for example 1, 2, 3, 4, 5 or more amino acids are also possible. In any event, the alteration will be such that the enzyme retains its ability to convert a prodrug to an active drug at substantially the same rate as the unchanged, unglycosylated enzyme. In this context, "substantially unchanged" will desirably be within 1 order of magnitude, and preferably from about 2-fold less to 2, 5 or 10 fold more.

In addition to specific changes the enzyme may otherwise be altered by truncation, substitution, deletion or insertion as long as the activity of the enzyme is substantially unchanged as defined above. For example, small truncations in the N- and/or C-terminal sequence may occur as a result of the manipulations required to produce a vector in which a nucleic acid sequence encoding the enzyme is linked to the various other signal sequences described herein. The activity of the altered enzyme may be measured in suitable model systems which can be prepared in routine ways known in the art.

In a further aspect of the present invention, there is provided a vector comprising a bacterial carboxypeptidase gene which has been altered by substitution, deletion or insertion at one or more glycosylation sites, fused in-frame to a gene encoding a ligand or fragment or derivative thereof. The ligand gene is fused at either the 5' or 3' end of the carboxypeptidase gene, and the two may be fused directly or spaced by a linker sequence encoding one or more (eg. up to 10, 20 or 100) amino acids. The fusion will be operably linked to a promoter capable of expressing the fusion construct in a host cell. The carboxypeptidase is desirably CPG2 with the amino acid sequence of SEQ ID No. 2 except for said one or more substitutions, deletions or insertions. Variants of such carboxypeptidases containing further substitutions, deletions or insertions but which retain substantially unchanged carboxypeptidase activity are a further part of this aspect of the invention. Such variants may for example include truncated enzymes as discussed above.
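The in-frame fusion described above, with the ligand gene at either end and an optional linker, can be sketched as a sequence assembly check. This is a hedged illustration: the function names, the placeholder sequences and the choice to reject any in-frame stop codon upstream of the second gene are assumptions made for the example, not details from the specification.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> list[str]:
    """Split a coding sequence into codons, requiring a whole number of them."""
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def build_fusion(ligand_cds: str, enzyme_cds: str,
                 linker_cds: str = "", ligand_first: bool = True) -> str:
    """Join ligand and enzyme coding sequences in frame.

    The upstream gene loses its terminal stop codon so that translation
    reads through the optional linker into the downstream gene.
    """
    first, second = ((ligand_cds, enzyme_cds) if ligand_first
                     else (enzyme_cds, ligand_cds))
    upstream = codons(first.upper())
    if upstream and upstream[-1] in STOP_CODONS:
        upstream.pop()
    upstream += codons(linker_cds.upper())
    if any(codon in STOP_CODONS for codon in upstream):
        raise ValueError("in-frame stop codon upstream of the second gene")
    return "".join(upstream + codons(second.upper()))


# Placeholder sequences only (not real ligand, enzyme or linker sequences)
ligand = "ATGAAATAA"
enzyme = "ATGCCCTGA"
linker = "GGTGGTTCT"
print(build_fusion(ligand, enzyme, linker))                      # ligand at 5' end
print(build_fusion(ligand, enzyme, linker, ligand_first=False))  # ligand at 3' end
```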

Alterations may also be introduced into the sequences encoding the ligand. Truncation, insertions, deletions and specific point mutations may be required to stabilise the fusion protein and to prevent it from being internalised. Also, additional "linker regions" may be required between the ligand and enzyme to give flexibility between the two components and thus allow each to be active.

The invention also provides a nucleic acid which may be RNA or DNA encoding such fusion protein and vectors comprising such a nucleic acid. The nucleic acid fusion is preferably between that of SEQ ID No. 2 for the enzyme (except where altered to remove one or more glycosylation sites such as in SEQ ID No. 4), and SEQ ID No. 6 for the ligand or fragments thereof encoding the above mentioned variants of carboxypeptidase. The vector may be an expression vector, wherein said nucleic acid is operably linked to a promoter compatible with a host cell for expression of the fusion protein. The invention thus also provides a host cell which contains an expression vector of the invention. The host cell may be bacterial (e.g. E.coli), insect, yeast or mammalian (e.g. hamster or human).

Host cells of the invention may be used in a method of making a fusion protein or fragment thereof as defined above which comprises culturing the host cell under conditions in which said enzyme or fragment thereof is expressed, and recovering the enzyme or fragment thereof in substantially isolated form. Polyhistidine residues in conjunction with nickel-affinity based systems may be used for this purpose.

C. Other vector components
In the system according to the invention the fusion may be linked to a signal sequence which directs the fusion to be exported from the cells. This will usually be a mammalian signal sequence or a derivative thereof which retains the ability to direct the enzyme to the cell surface. Some ligands have a naturally occurring signal sequence, which may also be used. For example the VEGF signal sequence may be employed for this purpose. If the fusion does have such a signal sequence, it can be replaced by another signal sequence where this is desirable or appropriate. Suitable signal sequences include those found in transmembrane receptor tyrosine kinases such as the c-erbB2 (HER2/neu) signal sequence or variants thereof which retain the ability to direct expression of the enzyme at the cell surface. The c-erbB2 signal sequence can be obtained by reference to Coussens et al (1985) Science 230; 1132-1139.

The experiments described in the Examples herein may be used to determine variants of this, or other signal sequences, for their ability to express the enzyme at the cell surface. The variants may be produced using standard techniques known as such in molecular biology, eg. site-directed mutagenesis of a vector containing the signal sequence.

Further suitable signal sequences include those which may be found in the review by von Heijne (1985) J. Mol. Biol. 184; 99-105.

Vectors encoding the ligand and enzyme, together with, when required, a signal sequence may be made using recombinant DNA techniques known per se in the art. The sequences encoding the ligand, enzyme and signal sequence may be constructed by splicing synthetic or recombinant nucleic acid sequences together, or modifying existing sequences by techniques such as site directed mutagenesis. Reference may be made to "Molecular Cloning" by Sambrook et al (1989, Cold Spring Harbor) for discussion of standard recombinant DNA techniques.

D. Promoters
The enzyme will be expressed in the vector using a promoter capable of being expressed in the cell in which the vector is to be expressed. The promoter will be operably linked to the sequences encoding the enzyme and its associated sequences. Suitable host cells include those mentioned above, and promoters from such host cells may be used. Viral promoters operable in such host cells may also be used.

E. Prodrugs
The prodrug for use in the system will be selected to be compatible with the enzyme, ie. such that the enzyme will be capable of converting the prodrug into an active drug. Desirably, the prodrug will be at least one order of magnitude less toxic to the patient being treated than the active drug. Preferably, the active drug will be several, eg 2, 3, 4 or more orders of magnitude more toxic. Suitable prodrugs include nitrogen mustard prodrugs and other compounds such as those described in WO88/07378, WO89/10140, WO90/02729, WO91/03460, EP-A-540 263, WO94/02450, WO95/02420 or WO95/03830 which are incorporated herein by reference.
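The toxicity margin between prodrug and active drug can be expressed as a number of orders of magnitude. The sketch below is illustrative only: it assumes toxicity is summarised by a concentration such as an IC50 measured in the same units for both compounds (a higher value meaning lower toxicity), and the function names and example values are hypothetical.

```python
import math


def orders_of_magnitude(ic50_prodrug: float, ic50_drug: float) -> float:
    """Orders of magnitude by which the active drug is more toxic.

    Assumes both IC50 values are in the same units; a higher IC50
    corresponds to a less toxic compound.
    """
    if ic50_prodrug <= 0 or ic50_drug <= 0:
        raise ValueError("IC50 values must be positive")
    return math.log10(ic50_prodrug / ic50_drug)


def meets_toxicity_margin(ic50_prodrug: float, ic50_drug: float,
                          min_orders: float = 1.0) -> bool:
    """True if the prodrug is at least `min_orders` orders less toxic."""
    return orders_of_magnitude(ic50_prodrug, ic50_drug) >= min_orders


# Hypothetical values in micromolar
print(round(orders_of_magnitude(1000.0, 1.0), 2))
print(meets_toxicity_margin(50.0, 10.0))
```

The default of one order of magnitude follows the "at least one order of magnitude" figure in the text; the preferred 2, 3, 4 or more orders can be tested by raising `min_orders`.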

E(i) Nitrogen mustard prodrugs
Nitrogen mustard prodrugs include compounds of the formula:
M-Ar-CONH-R
where Ar represents an optionally substituted ring aromatic ring system, R-NH is the residue of an α-amino acid R-NH2 or oligopeptide R-NH2 and contains at least one carboxylic acid group, and M represents a nitrogen mustard group.

The residue of the amino acid R-NH is preferably the residue of glutamic acid. It is disclosed in WO88/07378 that the enzyme carboxypeptidase G2 is capable of removing the glutamic acid moiety from compounds of the type shown above, and the removal of the glutamic acid moiety results in the production of an active nitrogen mustard drug.

Thus nitrogen mustard prodrugs of use in the invention include the prodrugs of generic formula I of WO94/02450 and salts thereof, and in particular those of formula (I):

[Structure of formula (I): drawing not recoverable from the source text]

wherein R1 and R2 each independently represent chlorine, bromine, iodine, OSO2Me, OSO2phenyl (wherein phenyl is optionally substituted with 1, 2, 3, 4 or 5 substituents independently selected from C1-4 alkyl, halogen, -CN or NO2);
R1a and R2a each independently represents hydrogen, C1-4 alkyl or C1-4 haloalkyl;
R3 and R4 each independently represents hydrogen, C1-4 alkyl or C1-4 haloalkyl;
n is an integer from 0 to 4;
each R5 independently represents hydrogen, C1-4 alkyl optionally containing one double bond or one triple bond, C1-4 alkoxy, halogen, cyano, -NH2, -CONR7R8 (wherein R7 and R8 are independently hydrogen, C1-6 alkyl or C3-6 cycloalkyl) or two adjacent R5 groups together represent
a) C4 alkylene optionally having one double bond;
b) C3 alkylene; or
c) -CH=CH-CH=CH-, -CH=CH-CH2- or -CH2-CH=CH- each optionally substituted with 1, 2, 3 or 4 substituents, said substituents each independently selected from the group consisting of C1-4 alkyl, C1-4 alkoxy, halogen, cyano and nitro;
X is a group -C(O)-, -O-C(O)-, -NH-C(O)- or -CH2-C(O)-; and
Z is a group -CH2-T-C(O)-OR6 where T is CH2, -O-, -S-, -(SO)- or -(SO2)-, and R6 is hydrogen, C1-6 alkyl, C3-6 cycloalkyl, amino, mono- or di-C1-6 alkylamino or mono- or di-C3-6 cycloalkylamino, provided that when R6 is hydrogen T is -CH2-;
and physiologically acceptable derivatives, including salts, of the compounds of formula (I).

Halogen includes fluorine, chlorine, bromine and iodine. Preferred values for the groups R1a and R2a are methyl and hydrogen, especially hydrogen. Preferred values for the groups R3 and R4 are hydrogen, methyl and trifluoromethyl, especially hydrogen. Preferred values for the groups R1 and R2 are I, Br, Cl, OSO2Me and OSO2phenyl wherein phenyl is substituted with one or two substituents in the 2 and/or 4 positions. I, Cl and OSO2Me are especially preferred.

Preferred values for R5 when n is an integer from 1 to 4 are fluorine, chlorine, methyl, -CONH2 and cyano. Preferably, n is 0, 1 or 2. When n is 1 or 2 it is preferred that R5 is fluorine at the 3 and/or 5 positions of the ring. The group X is preferably -C(O)-, -O-C(O)- or -NH-C(O)-. Z is preferably a group -CH2CH2-COOH.

Preferred specific compounds include:
N-4-[(2-chloroethyl)(2-mesyloxyethyl)amino]benzoyl-L-glutamic acid (referred to below as "CMDA") and salts thereof;
N-(4-[bis(2-chloroethyl)amino]-3-fluorophenylcarbamoyl)-L-glutamic acid and salts thereof;
N-(4-[bis(2-chloroethyl)amino]phenylcarbamoyl)-L-glutamic acid and salts thereof;
N-(4-[bis(2-chloroethyl)amino]phenoxycarbonyl)-L-glutamic acid and salts thereof; and
N-(4-[bis(2-iodoethyl)amino]phenoxycarbonyl)-L-glutamic acid (referred to below as "prodrug 2") and salts thereof.

Particular sub-groups of the compounds of the present invention of interest may be obtained by taking any one of the above mentioned particular or generic definitions for R1-R4, R5, X or W either singly or in combination with any other particular or generic definition for R1-R4, R5, X or W.

Other prodrugs include compounds of the formula (II):

[Structure of formula (II): drawing not recoverable from the source text]

wherein R1, R2, R5, n and Z are as defined for compounds of the formula (I) above;
m is an integer from 0 to 4;
Z1 and Z2 are each independently -O- or -NH-; and
R9 is hydrogen, t-butyl or allyl;
and physiologically acceptable derivatives of the compound of formula (I). Preferred values of R1, R2, R5, n and Z are as defined above for compounds of the formula (I). Preferred values of m are 0, 1 or 2 as defined for n above. R9 is preferably hydrogen, but can be protected especially during synthesis by groups such as allyl or t-butyl.

These prodrugs can be activated at the site of a tumour by a carboxypeptidase enzyme, for example, CPG2 as disclosed in WO88/07378 or WO94/02450.

Nitrogen mustard prodrugs of the formula (III):

[Structure of formula (III): drawing not recoverable from the source text]

wherein R1, R2, R5, Z1, n, and m are as defined for compounds of the formula (II), and physiologically acceptable derivatives thereof, may also be used in the present invention.

These prodrugs can be activated at the site of a tumour by a nitroreductase enzyme, for example, as disclosed in WO93/08288.

Usually to ensure enzyme activity a cofactor such as riboside or a ribotide of nicotinic acid or nicotinamide will be required and may be administered with the prodrug.

Compounds of the formulae (II) and (III) may be made using reactions and methods known per se in the art of chemistry, and also by reference to GB patent application 9501052.6 and PCT/GB96/ filed 19 January 1996 which claims priority from it. The following methods are of particular use:

A: Compounds of formula (II) where Z1 is -O-:

Compounds of the formula (II) in which Z1 is -O- may be made by reacting a nitrogen mustard of formula (IV)

[Structure of formula (IV): drawing not recoverable from the source text]

where R1, R2, and R5 and n are as defined above and Z4 is -O- with a linker of formula (V)

[Structure of formula (V): drawing not recoverable from the source text]

where R5, m, Z2, R9 and Z3 are as defined above, and Q is hydrogen or a leaving group. This reaction may be done in aprotic solvents in the presence of a base, for example DMF and triethylamine.

Preferred leaving groups Q include a succinimidyl group, a 4-nitrophenyl carbonate group, a pentafluorophenyl carbonate and a tetrachloroethyl group CH(Cl)CCl3.

(ii) Compounds of the formula (IV) may be made starting from 4-nitrophenol optionally substituted with the group(s) R5(n) (as defined above). The phenolic group is protected as an adamantanyloxycarbonyl-derivative (by reacting the starting materials with adamantanyl-fluoroformate and triethylamine in THF at rt). The protected 4-nitrophenyl carbonate is reduced to the corresponding amine by hydrogen transfer in ethanol using ammonium formate and Pd/C 10% as catalyst at room temperature. The amine is then hydroxyethylated with ethylene oxide in AcOH at 20°C and then reacted to the desired nitrogen mustard. Reference may be made to EP-A-433 360 or EP-A-490970 for suitable conditions. The compounds may be purified by column chromatography. Deprotection to remove the adamantyl group may be carried out in trifluoroacetic acid.

(iii) Alternatively, the nitrogen mustard of formula (IV) may be activated as a chloroformate by treatment with phosgene or triphosgene in an aprotic solvent and triethylamine followed by coupling with a compound of formula (VI):

[Structure of formula (VI): drawing not recoverable from the source text]

where R5, m, Z2, R9 and Z1 are as defined above. This may be carried out in THF or other aprotic solvents in the presence of a base (for example triethylamine or pyridine).

(iv) A further alternative route of synthesis of compounds of the formula (II) in which Z1 is -O- involves direct coupling of 4-nitrophenol optionally substituted with the group(s) R5(n) (as defined above) with the compound of the formula (V) or by reaction of the said optionally substituted 4-nitrophenol compound chloroformate with the compound of formula (V), followed in each case by the reaction described above to convert the nitro group, via an amine, to a mustard group.

B: Compounds of formula (II) where Z1 is -NH-:
(i) Compounds of the formula (II) in which Z1 is -NH- may be made by reaction of a compound of formula (IV) in which Z4 is -NH- with a linker of the formula (V) in aprotic solvents and in the presence of a base. Compounds of the formula (IV) in which Z4 is -NH- may be made from a 1-halo-4-nitrobenzyl compound, optionally substituted with the group(s) R5(n) (as defined above). This is converted to the corresponding 1-bis-hydroxyethylamino-4-nitro-benzyl compound by reaction with diethanolamine with heat and the resulting product purified by column chromatography. The corresponding 4-nitro nitrogen mustard may be made by for example mesylation using mesyl chloride in pyridine and subsequent reaction to other halo mustards, e.g. bromo or iodo mustards if required. The 4-nitro group may be reduced by hydrogen transfer in ethanol using ammonium formate and a Pd/C 10% catalyst at 20°C.

(ii) Alternatively the 1-bis-hydroxyethylamino-4-nitrobenzyl compound mentioned above can be reduced using ammonium formate and Pd/C 10% as catalyst in ethanol at 20°C to provide the corresponding phenylene-diamino derivative. This derivative can be converted into the corresponding 4-amino nitrogen mustard as described in the above paragraph, e.g. initially by reaction with mesyl chloride.

C: Compounds of formula (III):
(i) Compounds of the formula (III) may be obtained by coupling nitrogen mustard phenol compounds described in section A(i) above with 4-nitrobenzyl chloroformate optionally substituted with the group(s) R5 (as defined above) in the presence or absence of triethylamine at 20°C.

(ii) Alternatively aniline nitrogen mustards as described in section B(ii) above may be used with the chloroformate as described in section C(i) above.

D: Compounds of the formula (V) in which Z2 is -NH-:
(i) Compounds of the formula (IV) in which Z2 is -NH- may be made from a 4-nitro benzylic alcohol optionally substituted with the group(s) R5(n) (as defined above). The hydroxyl function is protected as a pyranyl- or t-butyl-dimethylsilyl (TBDMSi)-ether by treatment at 20°C with 3,4-2H-dihydropyran and pyridinium-p-toluenesulfonate (PPTS) in an aprotic solvent or with TBDMSi chloride and imidazole in dimethylformamide (DMAC), respectively. The intermediate thus obtained is reduced to the corresponding amine by hydrogen transfer in ethanol using ammonium formate and Pd/C 10% as catalyst at 20°C. This amine is converted to a glutamyl ester intermediate of formula (VII):

[Structure of formula (VII): drawing not recoverable from the source text]

where R5, m, R9 and Z1 are as defined above, Z2 is -NH- and Pr is the pyranyl- or t-butyl-dimethylsilyl (TBDMSi)-ether protecting group. This may be done by treating the amine with triphosgene and triethylamine in toluene at 60°C to provide the corresponding isocyanate, which is treated with a glutamate derivative of formula R9-C(O)-CH(NH2)-Z1 where R9 and Z1 are as defined above. Alternatively the corresponding glutamyl-isocyanate obtained from the corresponding glutamate by treatment with triphosgene and triethylamine in toluene at -78°C may be reacted with the amine in a one pot procedure.

(ii) The compound of formula (VII) is deprotected to remove the TBDMSi or pyranyl groups by treatment with mild acidic media (AcOH, THF and H2O or PPTS, EtOH, 55°C). This yields a compound of formula (VII) in which Pr is hydrogen. Compounds of the formula (V) in which Q is a leaving group may be prepared using standard reactions known in the art.

(iii) For example where Q is a succinimidyl group the compound of formula (VII) where Pr is hydrogen may be treated with disuccinimidyl-carbonate and triethylamine in acetonitrile. Where a 4-nitrophenyl carbonate group is desired treatment with 4-nitrophenyl chloroformate and triethylamine in THF may be used. A pentafluorophenyl carbonate may be added by in situ phosgenation of pentafluorophenol followed by coupling to the linker of formula (VII) in which Pr is hydrogen.

E: Compounds of the formula (V) in which Z2 is -O-:
(i) The starting materials for the linkers possessing a carbamic bond are unsubstituted or substituted (with the group(s) R5(n) (as defined above)) 4-hydroxy-benzylic alcohols. These types of linkers may require an extra electron withdrawing group on the aromatic nucleus in order to undergo 1,4-elimination. The 4-hydroxy group is specifically protected as an acetate by treating the starting material with acetyl-v-triazolo-[4,5-b]pyridine, 1N NaOH in THF at 20°C. The alcohol function of the acetate is further protected as pyranyl- or TBDMSi-ether by the procedures described in section D above. The acetate function is then deprotected to restore the 4-hydroxy group in NaHCO3 aq. MeOH at 20°C. The resulting phenol compounds are reacted in a one pot procedure with a protected glutamyl-isocyanate as described in section D(i) above. This yields a compound of the formula (VII) as shown above in which Z2 is -O- and Pr is the pyranyl- or t-butyl-dimethylsilyl (TBDMSi)-ether protecting group.


(ii) Deprotection of this compound yields a compound of the formula (VII) in which Pr is hydrogen. This may be converted to compounds of the formula (V) by methods analogous to those described in sections D(ii) and (iii) above.

F: Alternative synthesis of compounds of formula (IV):
Compounds of the formula (IV) in which Q is hydrogen, fluoro, chloro, bromo or -O-(N-succinimide) may also be obtained by reference to WO95/02420 or WO95/03830.

E(ii). Other prodrugs
Other compounds which may be used as prodrugs include p-nitrobenzyloxycarbonyl derivatives of cytotoxic compounds. Such compounds can be used in conjunction with a nitroreductase enzyme. It is believed that the nitroreductase enzyme converts the nitro group of the prodrug into a hydroxylamino or amino group, which results in the p-nitrobenzyloxycarbonyl moiety becoming activated and then self-immolating. This releases the active drug. These compounds include a compound of formula:

[Chemical structures not reproduced]

The nitroreductase enzyme of WO93/08288 requires a co-factor such as NADH or NADPH, and this may optionally be supplied as an additional component in the system of the invention.

Examples of other compounds described in the above references include prodrugs of actinomycin D, doxorubicin, daunomycin and mitomycin C. Prodrugs of the foregoing references are converted to active drugs by either nitroreductase or CPG2, although they may be modified to comprise protecting groups cleavable by other enzymes, e.g. β-lactamase or glucuronidase.

Further prodrugs suitable for use in the invention include those of the general formula: FTLi-(PRT)m', or salts thereof, where FTLi is a ras inhibitor such as a farnesyltransferase inhibitor compound and PRT represents m' protecting groups capable of being cleaved from the ras inhibitor by the action of an enzyme, where m' is an integer from 1 to 5. Such compounds are disclosed in WO95/03830.

Other suitable prodrugs for use in the system of the invention include those which are derivatized with a sugar or a β-lactam derivative. For example, suitable linkers which may be attached to active drugs of the type described above are:

[Chemical structures of the sugar and β-lactam linkers not reproduced]

where R' is hydrogen or acetyl and Y' is aryl such as phenyl, benzyl or tolyl, and these may be made in an analogous manner to the other prodrugs described above.

A further group of prodrugs are tyrphostin compounds of the general formula: PTKi-PRTm', where PTKi is a compound with PTK (protein tyrosine kinase) inhibitory activity, PRT is a protecting group capable of being cleaved from the PTK inhibitor by the action of an enzyme and m' is an integer from 1 to 5.

Suitable tyrphostins such as the above may be obtained by the methods disclosed in, or analogous to those of, WO95/02420, Gazit et al 1989 and 1991, ibid, which are incorporated herein by reference.

E(iii). Derivatives
Physiologically acceptable derivatives of prodrugs include salts, amides, esters and salts of esters. Esters include carboxylic acid esters in which the non-carbonyl moiety of the ester grouping is selected from straight or branched chain C1-6 alkyl (e.g. methyl, n-propyl, n-butyl or t-butyl), or C3-6 cyclic alkyl (e.g. cyclohexyl). Salts include physiologically acceptable base salts, e.g. derived from an appropriate base, such as alkali metal (e.g. sodium), alkaline earth metal (e.g. magnesium) salts, ammonium and NR4" (wherein R" is C1-4 alkyl) salts. Other salts include acid addition salts, including the hydrochloride and acetate salts. Amides include non-substituted and mono- and di-substituted derivatives.

F. Applications of the invention
The system of the invention can be used in a method of treatment of the human or animal body. Such treatment includes a method of treating the growth of neoplastic cells which comprises administering to a patient in need of treatment the system of the invention. It is also possible that the invention may be used to treat cells which are diseased through infection of the human or animal body by bacteria, viruses or parasites.

One suitable route of administration is by injection of the particles in a sterile solution. While it is possible for the prodrugs to be administered alone it is preferable to present them as pharmaceutical formulations. The formulations comprise a prodrug, together with one or more acceptable carriers thereof and optionally other therapeutic ingredients. The carrier or carriers must be "acceptable" in the sense of being compatible with the other ingredients of the formulation and not deleterious to the recipients thereof, for example, liposomes. Suitable liposomes include, for example, those comprising the positively charged lipid N-[1-(2,3-dioleyloxy)propyl]-N,N,N-triethylammonium (DOTMA), those comprising dioleoylphosphatidylethanolamine (DOPE), and those comprising 3β[N-(N',N'-dimethylaminoethane)-carbamoyl]cholesterol (DC-Chol).

Formulations suitable for parenteral or intramuscular administration include aqueous and non-aqueous sterile injection solutions which may contain anti-oxidants, buffers, bacteriostats, bactericidal antibiotics and solutes which render the formulation isotonic with the blood of the intended recipient; and aqueous and non-aqueous sterile suspensions which may include suspending agents and thickening agents, and liposomes or other microparticulate systems which are designed to target the compound to blood components or one or more organs. The formulations may be presented in unit-dose or multi-dose containers, for example sealed ampoules and vials, and may be stored in a freeze-dried (lyophilized) condition requiring only the addition of the sterile liquid carrier, for example water for injections, immediately prior to use. Injection solutions and suspensions may be prepared extemporaneously from sterile powders, granules and tablets of the kind previously described.

It should be understood that in addition to the ingredients particularly mentioned above the formulations may include other agents conventional in the art having regard to the type of formulation in question. Of the possible formulations, sterile pyrogen-free aqueous and non-aqueous solutions are preferred.

The doses may be administered sequentially, e.g. at daily, weekly or monthly intervals, or in response to a specific need of the patient. Preferred routes of administration are oral delivery and injection, typically parenteral or intramuscular injection or intratumoural injection.

In using the system of the present invention the prodrug will usually be administered following administration of the ligand-enzyme. Typically, the protein will be administered to the patient and then the uptake of the protein monitored, for example by recovery and analysis of a biopsy sample of targeted tissue or by injecting trace-labelled protein ligand enzyme.

The exact dosage regime will, of course, need to be determined by individual clinicians for individual patients and this, in turn, will be controlled by the exact nature of the prodrug and the cytotoxic agent to be released from the prodrug but some general guidance can be given. Chemotherapy of this type will normally involve parenteral administration of both the prodrug and modified protein and administration by the intravenous route is frequently found to be the most practical. For glioblastoma the route is often intratumoural. A typical dosage range of prodrug generally will be in the range of from about 1 to 150 mg per kg per patient per day, which may be administered in single or multiple doses. Preferably the dose range will be in the range from about 10 to 75, e.g. from about 10 to 40, mg per kg per patient per day. Other doses may be used according to the condition of the patient and other factors at the discretion of the physician.

Tumours which may be treated using the system of the present invention include any tumours capable of being treated by a LIDEPT system and thus are not limited to any one particular class of tumours. Particularly suitable tumour types include breast, colorectal and ovarian tumours, as well as pancreatic, melanoma, glioblastoma, hepatoma, small cell lung, non-small cell lung, muscle and prostate tumours. In the case of a LIDEPT system which uses VEGF, these and other types of solid tumours which comprise an actively growing vasculature are all candidates for treatment.

The system of the invention may also be used to treat infectious diseases, for example, and any other condition which requires eradication of a population of cells.

It will be understood that where treatment of tumours is concerned, treatment includes any measure taken by the physician to alleviate the effect of the tumour on a patient. Thus, although complete remission of the tumour is a desirable goal, effective treatment will also include any measures capable of achieving partial remission of the tumour as well as a slowing down in the rate of growth of a tumour including metastases. Such measures can be effective in prolonging and/or enhancing the quality of life and relieving the symptoms of the disease.

The following Examples illustrate the invention.

EXAMPLE 1
In order to demonstrate the invention, a fusion protein between CPG2 and VEGF was prepared. This fusion protein is formed between portions of c-ErbB2 (signal peptide), in order to direct expression outside of mammalian cells, CPG2 containing specific mutations which block glycosylation (N222, 264 and 272 → Q, as described in PCT/GB95/01782) and VEGF. The structure of the fusion protein is:
c-ErbB2 (amino acids 1-27): Gly Ser: CPG2 (amino acids 23-415): Glu Phe Gly Gly Gly Gly Gly Thr Ala: VEGF 165 (amino acids 28-191).
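As a rough cross-check on the construct just described, the segment lengths can be summed. The sketch below is illustrative only: the helper names are hypothetical, residue ranges are treated as inclusive, complete removal of the signal peptide is assumed, and the figure of about 110 Da per residue is a generic rule of thumb rather than a value from this specification.

```python
# Segment lengths of the Example 1 fusion, listed N- to C-terminus.
def span(first, last):
    """Number of residues in an inclusive range, e.g. span(1, 27) == 27."""
    return last - first + 1

SEGMENTS = [
    ("c-ErbB2 signal peptide (aa 1-27)", span(1, 27)),
    ("junction Gly Ser", 2),
    ("CPG2 (aa 23-415)", span(23, 415)),
    ("linker Glu Phe Gly Gly Gly Gly Gly Thr Ala", 9),
    ("VEGF165 (aa 28-191)", span(28, 191)),
]

AVG_RESIDUE_DA = 110  # assumption: generic average residue mass, not from the text

def total_residues(segments):
    """Sum the residue counts of (name, length) pairs."""
    return sum(length for _, length in segments)

precursor = total_residues(SEGMENTS)      # includes the signal peptide
secreted = total_residues(SEGMENTS[1:])   # assumes the signal peptide is fully cleaved
approx_mass_da = secreted * AVG_RESIDUE_DA

print(precursor, secreted, approx_mass_da)
```

Under these assumptions the secreted protein is 568 residues, or roughly 62 kDa, which is in line with the apparent molecular weight of 60-64,000 reported for the immunoprecipitated product.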

The DNA and protein sequences of CPG2 are shown in SEQ ID Nos. 1 and 2 respectively, and this sequence was used except for the altered glycosylation sites as shown above. The sequence of VEGF can be found in Conn et al Proc. Natl. Acad. Sci. (1990) 87; 2628.

This fusion was cloned into the mammalian expression vector pEF Plink 2 (Marais et al, (1995) EMBO J. 14, 3136-3145) and transfected into COS-7 cells. In order to assess expression of the fusion protein, the culture medium from the transfected cells was analysed by immunoprecipitation, CPG2 enzyme activity and heparin binding activity.

(1) Immuno-precipitation.
1 ml of tissue culture medium was immuno-precipitated with a CPG2 specific antiserum and the proteins eluted from the beads. The eluted proteins were analysed by immuno-protein blot, and shown to contain a fusion protein with an apparent molecular weight of 60-64,000, which was not present in control transfected cells.

(2) CPG2 activity.
The supernatants were also analysed for CPG2 activity using methotrexate as a substrate and found to contain CPG2 activity which was absent in the control transfection.

(3) Heparin-binding.
1 ml of tissue culture medium was mixed with heparin-agarose, and the heparin-agarose beads were collected and washed; bound proteins were eluted and analysed by immuno-protein blotting using a CPG2 antiserum. The results show that supernatants from cells transfected with the fusion gene contained the CPG2-VEGF fusion protein and that this was competent to bind to heparin, whereas a fusion between CPG2 and VEGF amino acids 28-136 did not bind to heparin.

These data demonstrate:
(1) The fusion protein can be synthesised and excreted by mammalian cells.
(2) The fusion protein contains CPG2 activity.
(3) The fusion protein is stable in tissue culture medium.
(4) The fusion protein is soluble.
(5) The fusion protein is able to bind to heparin, through the VEGF component.

Example 2
1. Introduction
The VEGF proteins are secreted from cells via the endoplasmic reticulum (ER) and Golgi apparatus, during which they become proteolytically processed and glycosylated to generate the mature proteins (Eur. J. Biochem., 211, 19, 1993). There are at least four splice variants of the gene, which encode different isoforms of the protein. The proteins that these genes encode are referred to by the number of amino acids present in the mature protein, with amino acid number 1 referring to the first amino acid in the mature protein. VEGF 165 is one of the isoforms of VEGF, which consists of 165 amino acids in the mature protein and which contains both a specific receptor interacting domain and a heparin-binding domain. By contrast, VEGF 121 is an isoform consisting of a protein of 121 amino acids which contains the specific receptor interacting domain, but not the heparin-binding domain (Science, 253, 989, 1992; Cell, 72, 835, 1993). The aim is to prepare VEGF-enzyme fusion proteins to activate a subsequently administered prodrug which will generate a cytotoxic agent at the tumour.

The bacterial enzyme Carboxypeptidase G2 (CPG2) is secreted from bacterial cells and is normally located in the periplasmic space. CPG2 catalyses the degradation of Methotrexate (MTX) and can also be used to cleave and thereby activate mustard prodrugs (J. Med. Chem, 33, 677, 1990). These prodrugs, which are relatively non-toxic, are cleaved by CPG2 into active bifunctional alkylating agents, which are highly toxic to mammalian cells. If the signal peptide from CPG2 is replaced with a mammalian signal peptide, and CPG2 is expressed in mammalian cells, the protein enters the secretory pathway. However, this form of CPG2 becomes inappropriately glycosylated on three sites, resulting in an inactive enzyme. The activity of this mammalian expressed form of CPG2 can be partially recovered if the three Asn residues that are in the core of the glycosylation motif are mutated into Gln residues. This results in a protein which retains about 10% of the activity of the bacterial enzyme.

The purpose of this invention is to target CPG2 to tumours, where it will be employed to cleave and activate prodrugs (including mustard prodrugs). When CPG2 is targeted to the tumour, the prodrug activation should only occur at that locus.

2. Description of the CPG2 and VEGF genes and proteins used
In considering the possible types of conjugated proteins that may be produced, this invention employs proteins produced from genes that are ligated so that the recombinant proteins are produced as ready-made fusions. Since VEGF is normally secreted from cells via the endoplasmic reticulum and Golgi apparatus, the aim was that the fusion proteins should pass through this secretory pathway so that the VEGF moiety would be expressed in its mature form. However, when CPG2 is processed through this secretory pathway in eukaryotic cells, it is subjected to inappropriate and inactivating glycosylation. Therefore fusions were constructed between VEGF and the form of CPG2 which cannot be glycosylated. The feasibility of fusions in which the VEGF moiety was ligated either to the amino-terminus (N-terminus) or to the carboxyl-terminus (C-terminus) of CPG2 was tested. In addition, the presence or absence of the VEGF heparin-binding domain was examined for its ability to affect the activity of the fusion proteins. A series of fusion proteins was therefore constructed for expression in eukaryotic cells that would allow these parameters to be tested.

2.1 The CPG2 gene and protein
The CPG2 gene used was modified for expression in mammalian cells. The codons for the CPG2 signal peptide were replaced with the signal peptide from the human c-erbB2 protein to ensure efficient secretion of the CPG2 protein. For the c-erbB2 signal sequence see Coussens et al (1985) Science 230; 1132-1139. Furthermore, the three Asn residues at the core of the glycosylation motifs were mutated to Gln residues to prevent glycosylation, and a polyhistidine tag was added to the 3' end of the gene so that the fusion proteins could be purified by nickel-NTA affinity chromatography. This construct is referred to as CPH6 and is represented schematically in Figure 1A. For expression in mammalian cells, the gene for this construct was cloned into the mammalian expression vector pEFPlink 2. This vector was chosen since it uses the promoter from the elongation factor 1α gene to direct expression of foreign genes in mammalian cells and has been shown to be highly active in a variety of cell types. The structure of the protein that the CPH6 gene is predicted to express is shown schematically in Fig. 1A; the complete DNA and protein sequences are given in SEQ ID Nos. 3 and 4.

2.2 The VEGF gene and protein
The VEGF gene in the plasmid pVEGF165 (in Bluescript) encodes the VEGF signal peptide and a splice variant in which the mature protein consists of 165 amino acids. This encodes the specific receptor binding domain of VEGF encompassed within residues 1-121 of the mature protein. VEGF165 also encodes a heparin-binding domain, which requires the region of amino acids from 122-165 of the mature protein. In this gene there is a single glycosylation motif located at residue 75 of the mature protein (VEGF165; JBC, 266, 11947, 1991). The structure of the protein that the VEGF gene is predicted to express is shown schematically in Fig 1B; the complete DNA and protein sequences are given in SEQ ID Nos. 4 and 5.

3. Generation of the Fusion Proteins
In order to generate the various genes that were to be tested in the LIDEPT approach, the CPH6 and VEGF165 genes were used. Mutations to produce the fusions were created by PCR directed mutagenesis, whose precision allows defined fusions to be generated at exact positions. The PCR techniques used are standard and can be found in any basic text, although an example of such a protocol is shown below.

DNA samples were heated through thermal cycles in a proprietary heat cycling instrument, in the presence of mutagenic primers, nucleotides and appropriate buffers to generate DNA samples that can then be cloned into appropriate recipient vectors. Typical heating cycles are presented below and the enzyme Taq polymerase was used to generate the DNA.

95°C 60 sec
55°C 30 sec
72°C 30 sec
25 cycles were generally employed.
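The cycling programme above implies a fixed amount of programmed hold time, which is easy to tabulate. The following is a minimal sketch rather than part of the protocol: the step labels (denature, anneal, extend) are the conventional reading of the three temperatures, the first step is taken as 95°C, and ramping time between temperatures is ignored.

```python
# Thermal cycling programme as (step label, temperature in degrees C, hold in seconds).
# The labels are an assumption; the text gives only temperatures and times.
STEPS = [("denature", 95, 60), ("anneal", 55, 30), ("extend", 72, 30)]
CYCLES = 25

def hold_time_seconds(steps, cycles):
    """Total programmed hold time, ignoring ramping between temperatures."""
    return cycles * sum(seconds for _, _, seconds in steps)

total = hold_time_seconds(STEPS, CYCLES)
print(f"{total} s = {total / 60:.0f} min")
```

Each cycle holds for 120 seconds, so 25 cycles amount to 3000 seconds (50 minutes) before any instrument ramp time is added.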

The products from these reactions were cloned into recipient vectors and analysis by dideoxy-sequencing techniques was used to verify the integrity of the fragments generated.

Fusion proteins were tested in which the VEGF and CPG2 proteins were fused to each other in both orientations and in which the VEGF moiety contains or lacks the heparin-binding domain. Four fusions were created, with CPG2 fused either to the N- or to the C-terminus of VEGF, where the VEGF either contained or lacked the heparin-binding domain. These constructs are represented schematically in Fig 1 and are described in detail below.

3.1 V161CPH6
In this clone, the predicted protein product would be a fusion in which the CPG2 moiety was fused to the C-terminus of amino acids 1-161 of mature VEGF. Since the VEGF moiety is located at the N-terminus of the fusion, its own signal peptide will direct the protein to the secretion pathway of the cell. The c-erbB2 signal peptide was removed from the CPG2 gene. This fusion would therefore contain both the receptor-binding domain and also the heparin-binding domain of VEGF. It also contains a C-terminal polyhistidine tag for purification purposes.

The cloning of this fusion was performed as follows. The PCR was used in conjunction with oligonucleotide primers 1 and 2 (Table 1) and plasmid pVEGF165 to introduce silent mutations within the VEGF open reading frame that would destroy the internal NcoI site located within that gene. This was performed to facilitate cloning at the later stages. Next, a PCR fragment generated with primers 3 and 4 (Table 1) in conjunction with the altered VEGF gene was digested with the restriction endonucleases NcoI and Bam HI and the resulting fragment was cloned into those sites in plasmid pEFCPH6. This strategy results in a gene that would encode the VEGF signal peptide and VEGF amino acids 1-161, fused to CPG2 amino acids 23-415, and containing a polyhistidine tag at the C-terminus for purification purposes. This construct is referred to as V161CPH6 and the protein is represented schematically in Fig 1C. The sequences of the V161CPH6 gene and the protein it is predicted to encode are given in SEQ ID Nos. 7 and 8.
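The idea of destroying a restriction site with a silent mutation can be shown on a toy fragment. The sketch below is an assumption-laden illustration: the nine-base sequence is hypothetical and is not taken from VEGF or from primers 1 and 2 of Table 1, and the codon dictionary holds only the four codons the example needs. It relies on the NcoI recognition sequence CCATGG.

```python
NCOI_SITE = "CCATGG"  # NcoI recognition sequence

# Only the codons used in this toy example; not a complete genetic code table.
CODONS = {"GCC": "A", "GCT": "A", "ATG": "M", "GAT": "D"}

def translate(dna):
    """Translate an in-frame DNA string using the partial table above."""
    return "".join(CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))

original = "GCCATGGAT"  # hypothetical fragment: Ala Met Asp, contains CCATGG
mutated = "GCTATGGAT"   # GCC -> GCT; both codons encode Ala

print(NCOI_SITE in original, NCOI_SITE in mutated)  # site present, then absent
print(translate(original), translate(mutated))      # same peptide both times
```

The single third-position change removes the recognition sequence while leaving the encoded peptide unchanged, which is what makes the mutation silent.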

3.2 V115CPH6
In this clone, the CPG2 is fused to the C-terminus of VEGF. The portion of VEGF used for this clone is the first 115 amino acids of the mature protein and therefore does not contain the heparin-binding domain. As with V161CPH6 the signal peptide of VEGF will direct the secretion of the fusion protein and therefore the c-erbB2 signal peptide was removed.

In cloning this fusion protein, the internal NcoI site within VEGF was destroyed with PCR and oligonucleotide primers 1 and 2 (Table 1) as described in Section 3.1, in order to facilitate cloning at the later stages. A PCR was performed with oligonucleotides 3 and 5 (Table 1) and the altered VEGF gene, and the product was digested with the restriction endonucleases NcoI and Bam HI. The resulting fragment was cloned into these sites in plasmid pEFCPH6. This strategy produces a gene which would encode the signal peptide from VEGF and the first 115 amino acids of the mature protein fused to amino acids 22-415 of CPG2 and the polyhistidine tag. This construct is referred to as V115CPH6 and the protein is represented schematically in Fig 1D. The sequences of the V115CPH6 gene and the protein that it is predicted to encode are given in SEQ ID Nos. 9 and 10.

3.3 CPV165
In this clone, CPG2 is fused to the N-terminus of VEGF. In this arrangement, the c-erbB2 signal peptide is used to direct secretion of the protein from the mammalian cells and so the VEGF signal peptide was removed. This fusion contains both the VEGF receptor binding domain and the heparin-binding domain. However, in the cloning of this gene, the polyhistidine tag was lost from the C-terminus of CPG2.

The cloning of this fusion was performed as follows. Standard cloning techniques employing oligonucleotides 6 and 7 (Table 1) were used to replace the polyhistidine tag at the 3' end of the CPH6 gene with a peptide spacer of 6 amino acids and also to create a unique Kpn I site at that position. The structure of the linker is (Gly)5Thr and was designed to give flexibility to the region between the two protein moieties. A PCR was performed with oligonucleotides 8 and 9 (Table 1) in conjunction with pVEGF165 to generate an altered VEGF gene. This PCR product was digested with the restriction endonucleases Kpn I and Xba I and cloned into the freshly created Kpn I site and the Xba I site of pEFCPH6. This arrangement results in a gene whose structure is the signal peptide from c-erbB2 (amino acids 1-27-CHECK) fused to CPG2(Q)3 (amino acids 22-415) fused, via the spacer, to codons 1-165 of mature VEGF. This construct is referred to as CPV165 and the protein is represented schematically in Fig 1E. The sequences of the CPV165 gene and the protein it is predicted to encode are given in SEQ ID Nos. 11 and 12.

3.4 CPV109
During the cloning of CPV165 it was noticed that one of the clones that was generated contained a mutation at codon 110 of the mature VEGF sequence. This converts an AGA (Arg) codon into a TGA (stop) codon and thus results in a truncation of VEGF at that position. Since this mutation fortuitously removes the heparin-binding domain of VEGF, we chose to characterise this fusion protein further to enable us to assess the results of having VEGF fused to the C-terminus of CPG2 in which the heparin-binding domain of VEGF was absent. This construct is referred to as CPV109 and the protein is represented schematically in Fig 1F. The sequences of the CPV109 gene and the protein that it is predicted to encode are given in SEQ ID Nos. 13 and 14.
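The effect of the AGA to TGA change at codon 110 can be illustrated by translating a stand-in reading frame. This is a sketch under stated assumptions: the sequence is a placeholder built from repeated GCT (Ala) codons with AGA at position 110, not the actual VEGF sequence, and the function names are hypothetical. The codon table is the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate in frame from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

def mutate_codon(dna, codon_number, new_codon):
    """Replace the 1-based codon at codon_number with new_codon."""
    start = 3 * (codon_number - 1)
    return dna[:start] + new_codon + dna[start + 3:]

# Placeholder 165-codon frame: Ala everywhere except Arg (AGA) at codon 110.
wild_type = "GCT" * 109 + "AGA" + "GCT" * 55
truncated = mutate_codon(wild_type, 110, "TGA")

print(len(translate(wild_type)), len(translate(truncated)))
```

The placeholder frame gives a 165-residue product before the mutation and a 109-residue product after it, matching the CPV109 designation.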

Figure 1. Schematic representation of CPG2, VEGF and the fusion proteins
Each of the proteins is represented as a bar which has variable shading to represent the different sections of the fusions. The proteins are all depicted with the amino-terminus to the left and the carboxyl-terminus to the right (N and C respectively). The fusion junctions between the proteins are represented by open boxes. In the cases of CPV165, CPV109, CPV165H6 and CPV109H6, the peptide linker between the CPG2 and VEGF moieties is represented by a stippled box (L). Two forms of signal peptides have been employed to target the proteins for the secretory pathway. One is the signal peptide from the mammalian tyrosine kinase receptor c-erb B2 (SPerb) and the other is the signal peptide from the VEGF gene (SPv). Numbers in brackets indicate the amino acids present in either the CPG2 or VEGF motif for each fusion. The position of the amino acids that are required for heparin-binding by VEGF is indicated by the hatched box. The position of glycosylation in the VEGF fragments is indicated by ~. The three sites in CPG2 which must be mutated to prevent glycosylation are indicated by O. Where the fusions express a polyhistidine tag, this is indicated (H6). The clones represented are: A, CPH6; B, VEGF165; C, V161CPH6; D, V115CPH6; E, CPV165; F, CPV109; G, CPV165H6; H, CPV109H6.

Figure 2. Expression and heparin binding of CPH6 and the fusion proteins
The tissue culture supernatants from the transfected COS cells as described in Table 2 were subjected to Western blot analysis and analysis for the ability to bind to heparin.

A. Immunoprecipitation and immunoprotein blot analysis
For each sample, 500 µL of the conditioned medium from cells transfected with the relevant expression vectors was made up to 0.1% v/v with Triton X-100 and then incubated with ~5 µL of a rabbit polyclonal antiserum specific for CPG2 (Marais et al 1996) immobilised on Protein G-Sepharose beads (Pharmacia). The complexes were formed for 6 hours at room temperature and then the beads were washed three times with 500 µL of PBS containing 0.1% Triton X-100. The bound proteins were eluted from the beads, resolved on an 8% SDS-gel and analysed by immunoprotein blotting following standard techniques and using the CPG2 specific antiserum. The protein expressed in each sample is indicated above the appropriate lanes: lane 1, pEFCPH6; lane 2, pEFVEGF165; lane 3, pEFV161CPH6; lane 4, pEFV115CPH6; lane 5, pEFCPV109; lane 6, pEFCPV165; lane 7, control (pEFPlink.2). The position of migration of protein markers (x 10^-3) is shown to the right of the figure and the positions of migration of the major products of each gene are indicated by the arrows to the left of the figure.

B. Heparin-binding of the fusion proteins
For each sample, 500 µL of the conditioned medium was made up to 0.1% v/v with Triton X-100 and then incubated with ~35 µL of Heparin-Sepharose beads at 4°C for ~16 hours. The beads were washed three times with 500 µL of PBS containing 0.1% Triton X-100 and the bound proteins were eluted from the beads, resolved on an 8% SDS-gel and analysed by immunoprotein blotting following standard techniques with the CPG2 specific antiserum (Marais et al 1996). The protein expressed in each sample is indicated above the appropriate lanes: lane 8, pEFCPH6; lane 9, pEFVEGF165; lane 10, pEFV161CPH6; lane 11, pEFV115CPH6; lane 12, pEFCPV109; lane 13, pEFCPV165; lane 14, control (pEFPlink.2). The position of migration of protein markers (x 10^-3) is shown to the right of the figure and the positions of migration of the major products of each gene are indicated by the arrows to the left of the figure.

Figure 3: Purification of CPV165 by Heparin-Sepharose chromatography
Conditioned culture medium from mammalian cells transfected with pEFCPV165 was subjected to a purification protocol by Heparin-Sepharose chromatography. For this protocol, 17 ml of conditioned medium harvested from COS cells 72 hours post-transfection was passed over a 700 µL Heparin-Sepharose CL 4B column which was then washed with 7 ml column buffer (100 mM Tris.HCl, 260 µM ZnCl2, pH 7.3). The column was eluted sequentially with 2.8 mL column buffer containing 400 mM NaCl, 2.8 mL column buffer containing 800 mM NaCl, 2.8 mL column buffer containing 1.2 M NaCl, 2.8 mL column buffer containing 1.6 M NaCl and 2.8 mL column buffer containing 2 M NaCl at the positions indicated by the arrows. 0.7 ml fractions were collected. The elution of bulk proteins is indicated (closed circles) and was determined using the Bio-Rad protein determination kit, with OD measurements at 595 nm. The elution of the CPG2 activity was determined using MTX as a substrate as described in Table 2 and is indicated by the open circles.
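The step elution above fixes how many fractions fall within each salt step, which helps when reading the elution profile. The sketch below is illustrative only: it assumes fraction numbering starts at the first elution step (load and wash fractions are not counted), ignores column dead volume, and uses hypothetical names. Volumes are kept as whole microlitres to avoid floating-point rounding.

```python
STEP_VOLUME_UL = 2800      # 2.8 mL of buffer per NaCl step
FRACTION_VOLUME_UL = 700   # 0.7 mL fractions
NACL_STEPS_MM = [400, 800, 1200, 1600, 2000]

def fraction_plan(steps_mm, step_ul, fraction_ul):
    """Return (fraction number, NaCl in mM) pairs for a step elution."""
    per_step = step_ul // fraction_ul
    plan = []
    for nacl in steps_mm:
        for _ in range(per_step):
            plan.append((len(plan) + 1, nacl))
    return plan

plan = fraction_plan(NACL_STEPS_MM, STEP_VOLUME_UL, FRACTION_VOLUME_UL)
for number, nacl in plan:
    print(f"fraction {number:2d}: {nacl} mM NaCl")
```

With these figures each salt step spans four fractions, giving 20 elution fractions and 14 mL of eluate in total.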

Figure 4. Purification of fusion proteins by Ni++-NTA agarose affinity chromatography.
Sf9 cells were infected with baculoviruses encoding CPH6 and V115CPH6 and the proteins were purified by Ni++-NTA agarose affinity chromatography as described in Table 5. Samples from the load fraction (LOAD, lanes 1, 3) and purified proteins (ELUATE, lanes 2, 4) were analysed by SDS-PAGE followed by silver staining to determine their purity. The samples for CPH6 are in lanes 1 and 2 and the samples for V115CPH6 are in lanes 3 and 4. The positions of migration of the purified proteins are indicated on the left of the figure, and the positions of migration (x 10^-3) of standard proteins on the right of the figure.

Figure 5. Dimer analysis of fusion proteins.
V115CPH6 (lanes 1, 3) and CPH6 (lanes 2, 4) expressed in Sf9 cells and purified by Ni++-NTA agarose affinity chromatography as described in Table 5 were used in this analysis. For each sample, ~50 ng of protein was heated to 65°C for ~ min in the absence (non-reduced; lanes 1, 2) or presence (reduced; lanes 3, 4) of 2-mercaptoethanol (5% v/v). The samples were resolved on a 12% SDS-gel and revealed by immunoprotein blotting with the CPG2 specific antiserum. The position of CPH6 is indicated, as are the positions of migration of dimeric (dimer-V115CPH6) or monomeric (monomer-V115CPH6) V115CPH6. The position of standard proteins (x 10^-3) is indicated to the right of the figure.

Figure 6. Schematic representation and DNA and protein sequences of KDR(BD)
A. Schematic representation of the KDR(BD).

The protein structures of KDR and KDR(BD) proteins are represented schematically. The proteins are in the conventional N-terminal to C-terminal orientation as indicated (N and C respectively). The top figure represents the complete KDR protein, where the individual regions are differently shaded. The regions represented are the signal peptide (SP), the extracellular domain (ECD), the transmembrane region (TM) and the kinase domain (KD). Also indicated are codons 809 and 810, where the EcoR1 site in the gene resides; this was used to truncate the protein to make KDR(BD) as shown in the lower schematic. The 9E10 epitope which was added to the C-terminus of the truncated protein is represented as a hatched box.

Figure 7: Expression of KDR(BD) in Sf9 Cells
Sf9 cells were infected with viruses expressing KDR(BD) or, as a control, a virus with a defective polyhedrin gene which does not encode any known protein (empty). For each sample, 10^7 cells were extracted in lysis buffer (20 mM Tris.HCl, 0.5 mM EDTA, 10% v/v glycerol, 0.3 M KCl, 1% v/v Triton X-100, 5 µg/ml Leupeptin, 5 µg/ml Pepstatin A, 50 µg/ml phenylmethylsulphonyl fluoride, 1 mM benzamidine, pH8) 60 hours post infection. The protein content of the extracts was determined using the Bio-Rad protein determination kit. For each extract (KDR(BD), lane 1; empty, lane 2), 4.5 µg of protein was loaded onto a 7% SDS-Gel and the proteins therein detected by immunoprotein blotting, using the 9E10 mouse monoclonal antibody. The position of migration of the KDR(BD) is indicated by the arrow, and the position of standard proteins (x 10^-3) is indicated to the right of the figure.

Figure 8 : Binding of fusion proteins to KDR(BD)
The ability of the fusion proteins to bind to the KDR(BD) was tested in vitro. The KDR(BD), expressed in Sf9 cells, was immunoprecipitated with the 9E10 monoclonal antibody and then the ability of purified fusion protein samples to bind to the 9E10/KDR(BD) immunocomplex was determined by measuring the CPG2 enzyme activity. For each sample, the KDR(BD) from 750µg of insect cell extract, prepared as in Fig 7, was immunoprecipitated with ~50µg of 9E10 monoclonal antibody immobilised on protein G-Sepharose beads (+ KDR(BD); lanes 6-10). As a control, extracts from cells infected with the control virus (- KDR(BD); lanes 1-5) were used instead of the KDR(BD) expressing virus. The VEGF/CPG2 fusion proteins were also expressed in Sf9 cells and prepared as Ni++-NTA Agarose purified proteins, as described in Table 5. The fusion proteins were balanced against each other by CPG2 enzyme activity and, for each fusion, the protein represented by 4.8mU of CPG2 activity was added to the 9E10/KDR(BD) immunocomplexes. The samples were incubated for 2 hours at 4°C and washed twice in wash buffer 500 (50mM Tris.HCl, 500mM NaCl, pH8) and then once in wash buffer 100 (50mM Tris.HCl, 500mM NaCl, pH8). The amount of CPG2 activity retained by the immunocomplexes was determined as described in Table 4.

Figure 9 : Non-specific cell cytotoxicity directed by the fusion proteins
A. Sensitivity of NIH3T3 cells to the CMDA prodrug.
Confluent NIH3T3 cells were cultured in the absence (control, lane 1) or presence of 1mM CMDA prodrug (+CMDA, lane 2) for 18 hours. After incubation, the cells were diluted into fresh dishes (at a dilution of 100 fold) and allowed to grow for a further 6 days, after which time the survival of the cells was determined by [3H] thymidine incorporation. The results are expressed as % growth relative to the control, which was considered to represent 100% cell growth.

B. Sensitivity of NIH3T3 cells to the CMDA prodrug in the presence of the fusion proteins.
Confluent NIH3T3 cells were cultured in the presence of 1mM CMDA prodrug in the absence of any additions (control, lane 3) or in the presence of the indicated fusion proteins (lanes 4-8). The additions were: lane 4, CPH6; lane 5, CPV161H6; lane 6, V161CPH6; lane 7, V115CPH6; lane 8, pEFCPV109H6. For each fusion, 0.3 to 0.9 nM was added, as determined by quantitative immunoprotein blotting. The cells were incubated and processed for cell growth as in section A above. The results are expressed as % growth relative to the control lane, which was considered to represent 100% cell growth.
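
The normalisation used in these legends (growth expressed as a percentage of the control, with the control set to 100%) can be sketched as follows. This is only an illustration: the function name `percent_growth` and the counts in the usage line are hypothetical and are not taken from the experiments.

```python
def percent_growth(sample_cpm: float, control_cpm: float) -> float:
    """Express [3H]thymidine incorporation as % of the control (control = 100%)."""
    if control_cpm <= 0:
        raise ValueError("control incorporation must be positive")
    return 100.0 * sample_cpm / control_cpm


# Hypothetical counts per minute, for illustration only (not measured data).
example = percent_growth(sample_cpm=500.0, control_cpm=10000.0)
print(f"{example:.1f}% of control growth")
```

On this scale, an inhibition of growth is simply 100 minus the percentage returned.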

Figure 10 : VEGF dependent cell cytotoxicity directed by the fusion protein V115CPH6
A. Sensitivity of Hu-V-ec cells to the CMDA prodrug.
Confluent Hu-V-ec cells were cultured in the absence (control, lane 1) or presence of 1mM CMDA prodrug (+ CMDA, lane 2) for 18 hours. After incubation, the cells were diluted into fresh dishes (at a dilution of 3 fold) and allowed to grow for a further 6 days, after which time the survival of the cells was determined by [3H] thymidine incorporation. The results are expressed as % growth relative to the control, which was considered to represent 100% cell growth.

B. Sensitivity of Hu-V-ec cells to the CMDA prodrug in the presence of the fusion proteins.
Confluent Hu-V-ec cells were cultured in the absence of any additions (control, lane 3) or in the presence of CPH6 (27 , lane 4), or of V115CPH6 (11nM, lane 5) for 30 min. The cells were then washed 6 times with fresh medium and incubated overnight in the presence of 1mM CMDA. The cells were re-plated at a dilution of 1/3 and allowed to grow for a further 4 days, after which time the cell growth was estimated by [3H]thymidine incorporation. The results are expressed as % growth relative to the control lane, which was considered to represent 100% cell growth.

TABLE 1 : OLIGONUCLEOTIDES USED IN MUTATION GENERATION

Oligo. 1 GCTGCACCTATGGCAGAAGG
Oligo. 2 CCTTCTGCCATAGGTGCAGC
Oligo. 3 CGGCCATGGACTTTCTGCTGTCTTGGGTGC
Oligo. 4 GGCGGATCCGTCACATCTGCAAGTACGTTC
Oligo. 5 GGCGGATCCATTTTCTTGTCTTGCTCTAl~lllC
Oligo. 6 AATTCGGAGGTGGCGGAGGTACCTCTGGAGGCGGTCCAGGAGGTGGC5G
GTC
Oligo. 7 CATGGACCCGCCACCTCCTGGACCGCCTCCAGAGGATACCTCCGCCA~C
TCCG
Oligo. 8 GGAAGCTTGGTACCGCACCCATGGCAGAAGG
Oligo. 9 GGTTCGAATCTAGACCCGGCTCACCGCCTCGG
Oligo. 10 AATTCCATCATCACCACCATCACGCTTCCTAGT
Oligo. 11 CTAGACTAGGAAGCGTGATGGTGGTGATGATGG
Oligo. 12 GGCGGATCCATCTTTCTTTGGTCTGCATTC
Oligo. 13 CGGATCACGGCCATGGAGAGCAAGGTGCTGCTG
Oligo. 14 AGCAATAAATGGAGATCTGTAATCTTG
Oligo. 15 CGATGAGCAGAAGCTGATATCCGAGGAGGACCTGAA
Oligo. 16 CTAGTTCAGGTCCTCCTCGGATATCAGCTTCTGCTCAT

The oligonucleotides used are presented in the conventional 5'-3' orientation.
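
Because the oligonucleotides are all written 5'-3', complementary pairs can be recognised by taking the reverse complement of one strand. The short sketch below checks this for Oligo. 1 and Oligo. 2 exactly as listed in Table 1; the helper name `reverse_complement` is illustrative only and is not part of the original disclosure.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'-3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


OLIGO_1 = "GCTGCACCTATGGCAGAAGG"  # Table 1, Oligo. 1
OLIGO_2 = "CCTTCTGCCATAGGTGCAGC"  # Table 1, Oligo. 2

# True when the two oligonucleotides are fully complementary strands.
print(reverse_complement(OLIGO_1) == OLIGO_2)
```

The same check can be applied to any other pair in the table once the garbled characters in the listing have been resolved against the original document.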

TABLE 2 : ANALYSIS OF CPG2 ACTIVITY IN CONDITIONED CULTURE MEDIUM FROM THE TRANSFECTED COS CELLS

FUSION PROTEIN EXPRESSED    CPG2 ACTIVITY U/ml*
Control                     0
pEFCPH6                     0.096
pEFV161CPH6                 0.011
pEFV115CPH6
pEFCPV165
pEFCPV109                   0.17
pEFVEGF165

The tissue culture medium from COS cells transfected with the indicated vector was harvested and examined for CPG2 activity. Transfections were performed by standard techniques using the LipofectAMINE reagent (Marais et al 1995). For CPG2 enzyme assay, 100µL of culture medium was incubated in CPG2 buffer (100 mM Tris.HCl, 260 µM ZnCl2, 50 µM MTX, pH 7.3) at 25°C and the rate of change in absorbance was monitored at 320nm. From this, the CPG2 activity in terms of units CPG2 per ml of culture medium (U/ml) was calculated.
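
The conversion from a rate of absorbance change to U/ml is not spelled out in the text. The following is a minimal sketch of one way it could be done via the Beer-Lambert law. Everything beyond the 100µL sample volume is an assumption and not taken from this document: a 1 ml assay volume, a 1 cm path length, a difference extinction coefficient of 8300 M^-1 cm^-1 for MTX hydrolysis at 320 nm (a commonly quoted value), and one unit defined as 1 µmol of MTX hydrolysed per minute. The absorbance reading in the usage lines is hypothetical.

```python
def cpg2_units_per_ml(delta_a320_per_min: float,
                      sample_volume_ml: float,
                      assay_volume_ml: float = 1.0,    # assumed cuvette volume
                      delta_epsilon: float = 8300.0,   # assumed, M^-1 cm^-1
                      path_cm: float = 1.0) -> float:  # assumed path length
    """Convert a rate of A320 change into CPG2 units per ml of sample.

    One unit is taken here (an assumption) as 1 micromole of MTX
    hydrolysed per minute. The sign of the absorbance change is ignored.
    """
    molar_rate = abs(delta_a320_per_min) / (delta_epsilon * path_cm)  # mol/L/min
    umol_per_min = molar_rate * (assay_volume_ml / 1000.0) * 1e6      # µmol/min
    return umol_per_min / sample_volume_ml


# Hypothetical reading: A320 changes by 0.083 per minute with 100 µL of medium.
activity = cpg2_units_per_ml(delta_a320_per_min=-0.083, sample_volume_ml=0.1)
print(f"{activity:.3f} U/ml")
```

If the actual assay used a different cuvette volume, path length or unit definition, only the default arguments need to change.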

TABLE 3 : PREDICTED AND OBSERVED SIZES OF THE EXPRESSED PROTEINS

Protein          No of aa   Mr (Predicted)   Apparent Mr (Observed)
CP(Q)3-H6        411        43731            45000
VEGF165          165        19152            N.D.
V161CP(Q)3-H6    566        61710            60-62000
V115CP(Q)3-H6    520        56386            54-58000
CP(Q)3V165       574        62272            58-60000
CP(Q)3V109       518        55657            52-55000
CP(Q)3V161-H6    578        62660            60-62000
CP(Q)3V109-H6    530        57135            59-61000

The table gives the number of amino acids predicted in each mature fusion protein and the predicted molecular masses calculated from them; these masses therefore take into account the loss of the signal peptides. The observed masses are determined from the SDS-PAGE analysis shown in Figure 2A.

N.D. - not determined
* The observed sizes are calculated from the predominant band seen in each case in Figure 2A.

TABLE 4 : EXPRESSION OF THE FUSION PROTEINS IN Sf9 CELLS

Fusion protein expressed    U/ml
Empty
CPH6                        0.4
V161CPH6                    0.06
V115CPH6                    0.2
CPV161H6                    0.01
CPV109H6

The tissue culture medium from Sf9 cells infected with viruses encoding the indicated proteins was harvested and examined for CPG2 activity. For this experiment, 10^7 insect cells were infected at a multiplicity of infection of ~4-10. The cells were incubated with 10 mL of tissue culture medium for 60 hours. The medium was harvested and the cells removed by centrifugation. For the MTX assays, samples of medium were incubated in CPG2 buffer (100 mM Tris.HCl, 260µM ZnCl2, 50 µM MTX, pH 7.3) at 25°C and the rate of change in absorbance was monitored at 320 nm. From this, the CPG2 activity in terms of units CPG2 per ml of culture medium (U/ml) was calculated.

TABLE 5 : PROTEIN PURIFICATION BY Ni++-NTA AFFINITY CHROMATOGRAPHY

            CPH6                                  V115CPH6
Fraction    % TOTAL PROTEIN   % CPG2 ACTIVITY*    % TOTAL PROTEIN   % CPG2 ACTIVITY

The table shows the purification of CPH6 and V115CPH6 by Ni++-NTA-affinity chromatography. For each protein, 10^7 insect cells were infected at a multiplicity of infection of 4-10. The cells were incubated with 10 mL of tissue culture medium for 60 hours and the medium was harvested, removing the cells by centrifugation. The conditioned medium was extensively dialysed against dialysis buffer (50mM Tris.HCl, 500 mM NaCl, pH8) and then loaded onto a 0.5 ml column in column buffer (50 mM Tris.HCl, 500 mM NaCl, 10% glycerol, 0.5% v/v NP40, 10 mM imidazole, 5µg/ml Leupeptin, 5µg/ml Pepstatin A, 50 µg/ml phenylmethylsulphonyl fluoride, 1mM benzamidine, pH8). The column was washed with 5ml of column buffer containing 20 mM imidazole and the bound proteins were eluted in column buffer containing 50 mM imidazole. The table shows the proportion of total protein and CPG2 activity located in each of the column fractions.

* The CPG2 activity was determined in enzyme assay, using MTX as a substrate as described in Table 4.

TABLE 6 : KINETIC ANALYSIS OF PURIFIED PROTEINS

CONSTRUCT    Km (µM)
CPH6         19.6 ± 3.2
V161CPH6     18.2 ± 1.
V115CPH6     17.0 ± 1.0
CPV109H6     9.1 ± 1.1

Sf9 insect cells were infected with viruses expressing CPH6, V161CPH6, V115CPH6 and CPV109H6. The culture medium was harvested and the fusion proteins purified by Ni++-NTA Agarose affinity chromatography as described in Table 5. The purified proteins were tested for ability to cleave MTX as described in Table 4 and standard regression analysis was used to determine the affinity (Km) of each fusion for the substrate.
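
The text does not say which regression was used to obtain Km. One common choice, shown here purely as a hedged sketch, is a linear least-squares fit of the double-reciprocal (Lineweaver-Burk) form 1/v = (Km/Vmax)(1/[S]) + 1/Vmax. The function name `fit_km_vmax` is hypothetical, and the rates below are synthetic, noise-free values generated from the Michaelis-Menten equation itself, not experimental data.

```python
def fit_km_vmax(substrate_uM, rates):
    """Estimate Km and Vmax by least squares on the double-reciprocal plot.

    Fits 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax, so that
    Vmax = 1/intercept and Km = slope * Vmax.
    """
    x = [1.0 / s for s in substrate_uM]
    y = [1.0 / v for v in rates]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


# Synthetic check: rates generated from the model with Km = 10 µM, Vmax = 1.0.
true_km, true_vmax = 10.0, 1.0
substrate = [2.0, 5.0, 10.0, 20.0, 50.0]
rates = [true_vmax * s / (true_km + s) for s in substrate]

km, vmax = fit_km_vmax(substrate, rates)
print(round(km, 3), round(vmax, 3))
```

With real, noisy rate data a direct non-linear fit of the Michaelis-Menten equation is usually preferred, since the reciprocal transform over-weights the low-substrate points.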

4. Characterisation of the Fusion Proteins
The first experiments were directed at determining whether the fusion proteins we had generated could be expressed in eukaryotic cells. We used a transient transfection system based on COS cells for this, since these cells are convenient to culture and transfect using the LipofectAMINE reagent. The genes for V161CPH6, V115CPH6, CPV165, CPV109 and VEGF165 were cloned into the mammalian expression vector pEFPlink.2. These vectors are referred to as pEFV161CPH6, pEFV115CPH6, pEFCPV165, pEFCPV109 and pEFVEGF165 respectively. COS cells were transfected with these plasmids or with pEFPlink.2. Since all of the proteins expressed by these clones would be expected to be secreted, the conditioned medium was collected from the transfected cells 72 hours after transfection and examined for the presence of the fusion proteins as outlined below.

4.1 CPG2 activity of fusion protein
First, the conditioned medium from the transfected cells was examined for the presence of CPG2 enzyme activity. Samples of culture medium were analysed for their ability to degrade the CPG2 substrate MTX. The results show that in cells transfected with pEFPlink.2 there is no detectable CPG2 activity, whereas the medium from the cells transfected with pEFCPH6 accumulated CPG2 activity to a level of 0.096 units of CPG2 per mL (U/mL) of medium (Table 2). The medium from the cells transfected with each of the fusion constructs was also found to accumulate CPG2 activity, with a range of 0.004 to 0.14 U/ml culture medium. Thus it can be seen that all four fusion proteins are expressed with CPG2 in a form that is active, is secreted from the cells and accumulates in the culture medium. The possibility that this accumulation of CPG2 activity is due to the expression of VEGF in these cells can be discounted, because the medium from the cells transfected with pEFVEGF165 does not accumulate CPG2 activity (Table 2).

4.2 SDS-analysis of fusion proteins
Although the results shown in section 4.1 indicate that CPG2 activity accumulates in the tissue culture medium, they do not show the presence of intact fusion proteins. In order to determine whether the secreted fusion proteins were subjected to proteolytic degradation, we examined them by sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE) and immunoprotein blotting. Samples of conditioned medium from each of the COS transfections were subjected to immunoprecipitation with a CPG2 specific antiserum. The immunoprecipitated proteins were resolved by SDS-PAGE and the CPG2 proteins revealed by immunoprotein blotting with the CPG2 specific antiserum.

No proteins were recognised by the CPG2 specific antiserum in the conditioned medium from the cells transfected with pEFPlink.2 (Fig 2, lane 7). A protein with a Mr of ~45,000 was recognised by the CPG2 antiserum in the medium of cells transfected with pEFCPH6 (Fig 2, lane 1). The size of this protein is similar to the mass predicted for CPH6 (Table 3). The culture medium from the cells transfected with pEFV161CPH6, pEFV115CPH6, pEFCPV165 and pEFCPV109 all accumulated proteins which were recognised by the CPG2 specific antibody and were all larger in mass than CPH6 (Figure 2, lanes 1, 3, 4, 5, 6). Each of the fusions exists as multiple bands on SDS-PAGE and this was attributed to variable glycosylation on the VEGF moiety. When the apparent molecular masses of the predominant bands were calculated from the SDS-gel, they were found to be close to the predicted masses expected for each fusion, when the size of the signal peptides is taken into account (Table 3). These data indicate that the fusion genes produce the expected and correct proteins. The proteins are of the correct sizes and there is no significant accumulation of free CPG2 in any of the media from the cells transfected with expression vectors encoding the fusion proteins. This indicates that the fusions are stable and not subjected to proteolytic degradation in the tissue culture medium.

4.3 Heparin-binding of the fusion proteins
The fusion proteins were tested for their abilities to bind to heparin with Heparin-Sepharose CL-6B (Pharmacia), which is a column matrix consisting of Sepharose beads to which heparin is covalently bound. The conditioned medium from the transfected cells was incubated with Heparin-Sepharose CL-6B and the proteins which were found to bind to the matrix were identified by immunoprotein blotting.

CPH6 is shown by this analysis to be unable to bind to heparin (Fig 2B, lane 9). The two fusion proteins which contain the heparin binding domain of VEGF (V161CPH6 and CPV165) bound to heparin with high efficiency (Fig 2B, lanes 10 and 13). In fact, when these results were quantitated it was seen that even under these crude conditions, more than 60% of V161CPH6 and CPV165 were able to bind to the heparin-matrix. Weak binding of V115CPH6 to the heparin-matrix was seen (Fig 2B, lane 11), but when this was quantitated it was shown to represent less than 7% of the fusion bound to the matrix. Furthermore, it should be noted that the amount of V115CPH6 that was used in this analysis was far greater than either V161CPH6 or CPV165, and so the figure of 7% bound probably represents an over-estimation of the true binding relative to that seen with V161CPH6 or CPV165. CPV109 was also found to bind to the heparin-matrix (Fig 2B, lane 12), but when quantitated this was found to be only ~1.3% bound and was deemed not to be significant. These results show that, as predicted, when the VEGF heparin-binding domain is present in the fusion proteins it enables them to bind to the heparin-matrix, indicating that in the context of either an N- or a C-terminal CPG2 fusion, the heparin-binding domain of VEGF functions autonomously. When this region is excluded from the fusions, the resulting proteins show only very weak heparin binding activity.

4.4 Purification of the fusion proteins by Heparin-Sepharose affinity chromatography
Having shown that at least two of the fusions could bind to Heparin-Sepharose beads with high efficiency, this technique was used to purify the fusion proteins. Conditioned medium from COS cells transfected with the plasmid pEFCPV165 was passed over a Heparin-Sepharose column. The column was washed and the bound proteins were eluted with buffer containing increasing concentrations of NaCl. The elution of the fusion protein was detected by assaying the fractions for CPG2 activity.

When COS-conditioned medium containing CPV165 is processed over a Heparin-Sepharose column, ~96% of the protein loaded does not bind. However, all of the fusion protein bound to the column, as determined by CPG2 enzyme assay. When the column was washed with buffer containing 400mM NaCl, ~50% of the bound protein (2% of the protein in the load sample) eluted, and this fraction did not contain any of the CPG2 activity (Fig 3). When buffer containing 800mM NaCl was applied to the column, the majority of the protein remaining (~2% of the protein loaded) eluted from the column. This fraction was found to contain the CPG2 activity and at higher NaCl concentrations no more CPG2 activity was eluted (Fig 3).

The apparent yield of CPG2 enzyme activity eluted from the column is ~140%. This suggests that the activity of the purified protein is greater than that of the unpurified protein, possibly due to the presence of an inhibitory activity in the conditioned culture medium. This makes it impossible to assess the purification fold of the sample exactly; however, the 800mM NaCl fraction contains all of the CPG2 loaded and eluted from the column, but only ~2% of the protein loaded. Thus, the Heparin-Sepharose column gives a ~50 fold purification, demonstrating the feasibility of this approach.
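
The fold purification follows from the two fractions quoted above: the share of CPG2 activity recovered divided by the share of total protein recovered. A minimal sketch of that arithmetic is given below; the function name `fold_purification` is illustrative, and the activity fraction is taken as 1.0 (all of the loaded CPG2 recovered) rather than the apparent 140% yield, which is an assumption made for the estimate.

```python
def fold_purification(activity_fraction: float, protein_fraction: float) -> float:
    """Enrichment of specific activity: fraction of activity recovered
    divided by fraction of total protein recovered."""
    if protein_fraction <= 0:
        raise ValueError("protein fraction must be positive")
    return activity_fraction / protein_fraction


# All of the CPG2 activity in ~2% of the loaded protein (Heparin-Sepharose step).
heparin_step = fold_purification(activity_fraction=1.0, protein_fraction=0.02)
print(f"~{heparin_step:.0f} fold")
```

The same calculation applies to any step for which the fractions of activity and protein in the eluate are known.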

4.5. Expression of the fusion proteins in Sf9 insect cells
The bulk production of the fusion proteins was attempted in order to allow them to be further characterised. The insect cell system was chosen for this analysis because it can produce high levels of foreign proteins when the cells are infected with recombinant baculoviruses. Insect cells usually allow correct formation of structures such as disulphide bonds and often perform post-translational modifications such as proteolytic cleavages and glycosylation, which occur in mammalian cells but are not seen in bacterial expression systems. Before proceeding to make the insect cell viruses, however, polyhistidine tags were incorporated onto the C-termini of CPV165 and CPV109 to allow these proteins to be purified by Ni++-NTA-Agarose affinity column chromatography.

4.5.1 Cloning the baculovirus vectors
A polyhistidine tag was inserted on the 3' ends of the genes for CPV165 and CPV109.

4.5.1.1 CPV165H6
The cloning of CPV165H6 was performed by using oligonucleotides 4 and 8 in a PCR reaction with pVEGF165. The PCR product was digested with the restriction endonucleases KpnI and BamHI and this fragment was cloned back into pEFCPV165, but in conjunction with oligonucleotides 10 and 11 to generate the polyhistidine tag at the 3' end of the gene. This cloning strategy results in the addition of the polyhistidine tag to the 3' end of the VEGF moiety of the fusion gene and therefore to the C-terminus of the protein. In constructing this gene, the terminal four codons of the VEGF gene were removed. This construct is referred to as CPV161H6 and the protein is represented schematically in Fig 1G. The sequence of the CPV161H6 gene and the protein that it is predicted to encode is shown in SEQ ID No. 16.

4.5.1.2 CPV109H6
The cloning of CPV109H6 was performed by using oligonucleotides 8 and 12 in a PCR reaction with pVEGF165. The PCR product was digested with the restriction endonucleases KpnI and BamHI and this fragment was then cloned back into pEFCPV165, but in conjunction with oligonucleotides 10 and 11 to generate the polyhistidine tag at the 3' end of the gene. This cloning strategy results in truncation of the VEGF gene at codon 110 and the addition of the polyhistidine tag at that position. Thus the residues required for heparin-binding are removed. This construct is referred to as CPV109H6 and the protein is represented schematically in Fig 1H. The sequence of the CPV109H6 gene and the protein that it is predicted to encode is shown in SEQ ID No. 18.

The genes for CPH6, V161CPH6, V115CPH6, CPV165H6 and CPV109H6 were cloned into the insect cell shuttle vector pVLPlink.2. These vectors were used to produce recombinant Baculovirus particles following standard protocols.

4.5.2 Expression of CPH6 and the fusion proteins in Sf9 cells
Sf9 insect cells were infected with recombinant viruses directing expression of CPH6, V161CPH6, V115CPH6, CPV165H6 and CPV109H6. Control infections were also performed using a virus which has an empty polyhedron locus (the 'empty' virus). Since these cells are eukaryotic, all the proteins were expected to be secreted into the medium in which the cells were growing. We therefore examined the conditioned medium from the infected cells for the presence of CPG2 enzyme activity using MTX as a substrate.

The conditioned medium from the cells infected with the empty virus was unable to degrade MTX (Table 4), indicating the absence of any CPG2 activity in infected insect cells. By contrast, the medium from the cells which were infected with the virus which expresses CPH6 accumulated CPG2 protein to a level of ~0.4 units of CPG2 per mL (U/mL) of tissue culture medium (Table 4). The medium from the cells infected with viruses encoding each of the other fusion proteins also accumulated CPG2 activity to between 0.01 and 0.2 U/mL (Table 4), indicating the secretion of the fusion proteins.

4.6 Ni++-Agarose affinity chromatography of the fusion proteins
We used the insect cell produced material to determine whether the polyhistidine tags could be used to purify the expressed proteins. For this analysis, we used CPH6 and V115CPH6. Sf9 cells were infected with the appropriate viruses and the proteins secreted into the conditioned tissue culture medium were subjected to a purification protocol using Ni++-NTA-Agarose (Qiagen: used according to manufacturer's instructions). The conditioned media were dialysed to remove histidine present in the medium (which may interfere with binding of H6-proteins because histidine has an imidazole ring) and then passed over the column. The column was washed and the bound proteins were eluted with imidazole. Only 6% of the protein in the conditioned medium bound to the Ni++-NTA column, and ~33% of the bound protein was eluted from the column in the wash cycles (Table 5). When the flow through and wash fractions were assayed for CPG2 activity, none could be detected in these fractions. In the case of both CPH6 and V115CPH6, the amount of protein that eluted from the 90mM imidazole wash represented ~4% of the protein loaded onto the column (Table 5). When this was assayed, it was found to contain ~99% of the CPG2 activity loaded onto the column in the case of CPH6 and ~100% of the CPG2 activity loaded onto the column in the case of V115CPH6. Thus, the recovery from these columns is excellent.
In order to determine the purity of the eluted protein, the samples were subjected to SDS-PAGE and silver staining. The results show that both the preparations of CPH6 and V115CPH6 were extremely pure, representing ~95% of the proteins in the purified samples (Fig 4). This demonstrates the feasibility of using this approach to purify the polyhistidine tagged proteins and also underscores the fact that the insect cells can produce large amounts of the fusion proteins, which are secreted and accumulate to represent ~4% of the protein present in the medium; hence the fold purification required to prepare pure samples is only ~25.

Using this protocol, CPH6, V161CPH6, V115CPH6 and CPV109H6 were all purified from Sf9 cells by Ni++-NTA-Agarose affinity chromatography as described above. These samples were used to determine the affinity of the CPG2 moiety for the substrate MTX. The results show that the Km of all of the fusion proteins is similar to that seen for CPH6 (Table 6), which is similar to the Km determined for bacterially produced CPG2 of ~10µM. This indicates that fusion of CPG2 to VEGF does not impair the affinity of CPG2 for its substrate.

4.7 Dimer analysis of fusion proteins
VEGF produced in mammalian cells is a dimeric protein which is stabilised by inter-molecular and intra-molecular cysteine bridges. CPG2 is also a dimeric protein, but these dimers are stabilised by non-covalent interactions. In order to determine whether the VEGF/CPG2 fusion proteins which were produced in Sf9 cells are dimers stabilised by cysteine bridges, a dimer analysis was performed. CPH6 and V115CPH6, both produced in insect cells, were heated either in the presence or absence of the reducing agent 2-mercaptoethanol. The resultant proteins were resolved by SDS-PAGE and subjected to immunoprotein blot analysis using the CPG2 specific antiserum. The results show that following heating, either in the presence or absence of reducing agents, CPH6 migrates as a protein with a mass of ~45kDa (Fig 5, lanes 2, 4), showing that this protein does not exist in the form of dimers stabilised by cysteine bridges. V115CPH6, by contrast, is seen as a protein with a Mr of 120 kDa in the absence of reducing agents, but as a protein with a Mr of ~60 kDa in the presence of reducing agents (Fig 5, lanes 1, 3). These data show that V115CPH6 produced in Sf9 cells is a dimeric protein which is stabilised by cysteine bridges, as in mammalian cells.

4.8 Binding of the fusion proteins to the VEGF receptor
We used the insect cell expressed protein to determine by biochemical methods whether the fusion proteins could bind to the VEGF receptor. For this analysis, we developed an in vitro assay to test the ability of the fusion proteins to bind to the external domain of the VEGF receptor, KDR. Receptor tyrosine kinases, of which KDR is a member, span the plasma membrane and consist of three domains. The external domains, which are usually highly glycosylated, contain the growth factor binding domain and thus are the sites of specific interaction with their ligands. There is then a short transmembrane domain, made of a single amino acid chain, and the C-terminus, which is located in the cytoplasm, consists of the tyrosine kinase domain. In the assay we developed, we wished to use the extracellular domain (ECD) as a specific probe for VEGF binding. In order to achieve this, we cloned the KDR external domain and transmembrane domain for expression in Sf9 cells. The internal domain was removed in this cloning and a tag for the monoclonal antibody 9E10 was added to the 3' end of the cloned fragment; this construct is referred to as the KDR binding domain (BD) (See Fig 6A). The rationale for this approach was to express the KDR(BD) in Sf9 cells and prepare extracts from the infected cells. The KDR(BD) could then be immunoprecipitated with the 9E10 monoclonal antibody and used as a probe to detect binding of the VEGF/CPG2 fusions to the VEGF receptor.
4.8.1. Cloning of KDR BD

For this approach, the KDR gene in the plasmid pBK-CMV~/KDRsense was used. PCR, using oligonucleotides 13 and 14 in conjunction with the KDR gene, was used to create an Nco I restriction endonuclease site at the 5' end of the gene. The KDR(BD) was then cloned as an Nco I/Eco RI fragment into the baculovirus vector pVLPlink.2. The Eco RI site in KDR is located at codons 809 and 810 of KDR, which is just on the 3' end of the sequence encoding the transmembrane domain (See Fig 6A). A tag for the 9E10 monoclonal antibody was cloned into the Eco RI site using standard cloning techniques with oligonucleotides 15 and 16. The sequence of this KDR fragment, referred to as KDR(BD), is shown in SEQ ID No. 20 and it is represented schematically in Fig 6A.

4.8.2. Expression of KDR BD in Sf9 cells
The expression of the KDR(BD) in insect cells was verified by immunoprotein blotting of infected insect cells using the 9E10 monoclonal antibody. The results show that in cells infected with the KDR(BD) virus there is a band with a Mr of ~120kDa which is recognised by the 9E10 monoclonal antibody and which is not present in extracts from cells infected with the empty virus (Fig 7). The size of this band is greater than the size expected for the cloned fragment (expected size = 92,769 Da) and this is probably due to glycosylation of KDR(BD). The data show that KDR(BD) can be expressed in Sf9 cells.

4.8.3. Binding of fusion proteins to the KDR(BD)
The abilities of the four VEGF/CPG2 fusion proteins to bind to KDR(BD) were tested. For this assay, extracts from insect cells that had been infected with either the KDR(BD) expressing virus or a control (empty) virus were incubated with the 9E10 monoclonal antibody immobilised on Protein G-Sepharose beads, to precipitate the KDR(BD). The immunocomplexes were washed and then incubated with CPH6, V161CPH6, V115CPH6, CPV161H6 and CPV109H6 protein which had been expressed in insect cells and purified by Ni++-NTA affinity chromatography. The complexes were washed once more and the presence of the fusion proteins was judged by measuring CPG2 activity in the immunocomplexes.

CPH6 was found not to interact with the KDR(BD), since no CPG2 activity is found in KDR(BD)/9E10 immunocomplexes (Fig 8, lane 6). However, V161CPH6, V115CPH6, CPV161H6 and CPV109H6 were all able to bind to the KDR(BD), as seen by the fact that with each of these fusions CPG2 activity could be detected associated with the KDR(BD)/9E10 immunocomplex (Fig 8, lanes 7-10). That this interaction was specific could be shown by the fact that when the 9E10 immunocomplexes were prepared with extracts from Sf9 cells infected with empty virus instead of the KDR(BD) expressing virus, no binding of the fusion proteins could be detected (Fig 8, lanes 2-5). These data establish that all four of the fusion proteins can interact specifically with the ligand binding domain of KDR, whereas CPH6 cannot.

5. CELL CYTOTOXICITY ASSAYS
Taken together, the above data establish that fusion proteins can be constructed between CPG2 and VEGF in two orientations and with VEGF moieties that do, or do not, contain the heparin-binding domain of VEGF. These proteins are secreted by eukaryotic cells and are stable in terms of VEGF functions (receptor binding and, where appropriate, heparin-binding) and in terms of CPG2 enzyme activity. In order to prevent glycosylation of the CPG2 moiety, the sites of inappropriate glycosylation on CPG2 were mutated. This results in a protein with only ~10% enzyme activity compared to the wild type protein, but its affinity for its substrate is indistinguishable from bacterially expressed wild type CPG2 (see WO96/03515). The presence of a polyhistidine tag at the C-terminus of the fusions allows the purification of the fusions by Ni++-NTA Agarose chromatography, and the fusions which contain the heparin-binding domain of the VEGF moiety can be purified on Heparin-Sepharose columns. We therefore progressed to an analysis of the ability of the fusion proteins to direct prodrug dependent cytotoxicity of mammalian cells.

5.1 Non-specific cytotoxicity assay
For the first assay, we wished to establish that CPH6 and the fusion proteins were able to direct prodrug dependent cytotoxicity of NIH3T3 cells in a non-targeted manner. For this experiment, CPH6, V161CPH6, V115CPH6, CPV165H6 and CPV109H6 were all purified by Ni++-NTA Agarose. Samples of the purified proteins were added to NIH3T3 cells in the presence of the CMDA prodrug and the effects on cell viability were determined by measuring cell growth by [3H]-thymidine incorporation.

Treatment of these cells with CMDA alone does not significantly affect their growth (Fig 9, lanes 1, 2). When these cells are treated with sub-nanomolar concentrations of CPH6 or each of the fusion proteins in the presence of CMDA, complete cell death occurs (Fig 9, lanes 3-7). Thus, it can be seen that each of the fusion proteins is highly efficient in directing the specific killing of mammalian cells in the presence of the CMDA prodrug.

5.2 VEGF-directed cytotoxicity assay
The fusion proteins were assessed for their ability to direct the CMDA-dependent killing of cells which express specific receptors for VEGF on their surface. For this analysis, we employed human umbilical vein endothelial (Hu-V-ec) cells, which are dependent for growth on VEGF (Prog Growth Factor Res, 5, 89, 1994). These cells are not significantly sensitive to CMDA as judged by the effects of 1mM CMDA on their growth rate (Fig 10A). The cells were incubated in the presence of CPH6 or V115CPH6 for 30 min to allow the fusion proteins to bind to the cell surface and then the cells were washed to remove unbound protein. CMDA was added and cell survival was determined by [3H]-thymidine incorporation.

When the analysis is performed with CPH6, there is no significant effect on the growth rate of the cells (Fig 10B, lanes 1, 2). By contrast, when the analysis is performed with V115CPH6, there is a >90% inhibition of cell growth (Fig 10B, lanes 1, 3). These data show that CPH6 does not appear to bind to the surface of Hu-V-ec cells and so is unable to direct their killing in the presence of the prodrug, as it is washed off the cells prior to the addition of the prodrug. However, V115CPH6 remains bound to the cell surfaces, is not washed off the cells and so is able to direct their killing when the prodrug is added. This provides a proof of the feasibility of the LIDEPT approach.
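The growth inhibition figures quoted for the [3H]-thymidine incorporation assays can be expressed as a simple ratio against the untreated control. The following is a minimal sketch of that arithmetic only; the function name, the optional background correction and the counts used are illustrative assumptions and are not data or methods taken from this document.

```python
def percent_inhibition(treated_cpm: float, control_cpm: float,
                       background_cpm: float = 0.0) -> float:
    """Percent inhibition of growth relative to an untreated control.

    All arguments are counts per minute from a thymidine incorporation
    assay; background_cpm is an optional blank to subtract from both.
    """
    control = control_cpm - background_cpm
    if control <= 0:
        raise ValueError("control counts must exceed background")
    treated = treated_cpm - background_cpm
    return 100.0 * (1.0 - treated / control)


# Hypothetical counts, for illustration only (not results from this document).
example_control_cpm = 20000.0
example_treated_cpm = 1500.0
inhibition = percent_inhibition(example_treated_cpm, example_control_cpm)
print(f"{inhibition:.1f}% inhibition")
```

With these made-up counts the treated wells retain 7.5% of control incorporation, so the sketch would report an inhibition above the 90% threshold mentioned in the text.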

SEQUENCE INFORMATION
In the sequences SEQ ID NOS. 1 to 20, the DNA sequences are presented in the conventional 5' → 3' orientation and the protein sequences in the conventional N- → C-terminal orientation as indicated.

SEQ ID Nos. 3 and 4: DNA and Protein Sequences of CPG2
The DNA sequence is presented and below it, the predicted protein sequence. Within the DNA sequence, the engineered restriction endonuclease sites are shown in lower case. The position of the Nco 1 site at the 5' end of the gene is also indicated. Four mutations within the gene are indicated by *. The mutation at *1 represents a silent mutation at the Asn codon at position 172 of CPG2 which occurred during the PCR cloning of this gene. The mutations at *2, *3 and *4 are the Asn → Gln mutations required to prevent glycosylation of CPG2 in mammalian cells. Numbers below the protein sequence indicate the codons of the genes from which the individual fragments are derived. The region derived from c-erbB2 is underlined and this contains codons 1 to 27 of that gene, which encompasses the signal peptide used to direct this protein into the secretory pathway. The region derived from CPG2 contains codons 23 to 415. The polyhistidine tag at the C-terminus of the clone is shown in italics within the protein sequence and the stop codon is indicated by ~.
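The Asn → Gln mutations described above can be illustrated with a short sketch that looks for N-linked glycosylation sequons and replaces the Asn residue. Two points are assumptions rather than statements from this document: that the mutated sites follow the general Asn-X-Ser/Thr rule (X not Pro), and the helper names used. The two peptide fragments are short stretches transcribed from the CPG2 protein sequence (SEQ ID NO. 2) in one-letter code.

```python
import re

# Asn-X-Ser/Thr sequon, X != Pro. A lookahead is used so that the match
# consumes only the Asn and closely spaced sites are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 0-based indices of Asn residues that start a sequon."""
    return [m.start() for m in SEQUON.finditer(protein)]


def asn_to_gln(protein: str, positions: list[int]) -> str:
    """Replace Asn with Gln at the given 0-based positions."""
    residues = list(protein)
    for pos in positions:
        if residues[pos] != "N":
            raise ValueError(f"residue {pos} is {residues[pos]}, not Asn")
        residues[pos] = "Q"
    return "".join(residues)


# Fragments of wild type CPG2 (one-letter code), for illustration.
fragment_a = "VNITGKASHA"            # Val Asn Ile Thr Gly Lys Ala Ser His Ala
fragment_b = "LRFNWTIAKAGNVSNIIPAS"  # Leu Arg Phe Asn Trp Thr ... Pro Ala Ser

for fragment in (fragment_a, fragment_b):
    sites = find_sequons(fragment)
    print(fragment, sites, asn_to_gln(fragment, sites))
```

On these fragments the sketch reports one site in the first and two in the second, and the substituted peptides read Val-Gln-Ile-Thr, Phe-Gln-Trp-Thr and Gly-Gln-Val-Ser at the positions marked *2, *3 and *4 in the CPH6 protein sequence.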

SEQ ID Nos. 5 and 6: DNA and Protein Sequences of VEGF
The DNA sequence is presented and below it, the predicted protein sequence. Within the protein sequence, the signal peptide which is proteolytically cleaved from the mature protein is underlined. The amino-acid numbering therefore begins with residue -26 to indicate that fact, and residue 1 is the first amino acid in the mature protein. The position of the single glycosylation site is indicated by *. The stop codon is indicated by ~.

In the fusion protein sequences the DNA sequence is presented and below it, the predicted protein sequence. The VEGF derived sequences are shown in bold type, and the CPG2 derived sequences in normal type. The two DNA sequences in SEQ ID Nos. 7, 9, 11 and 13 are taken from SEQ ID Nos. 3 and 5.

SEQ ID Nos. 7 and 8: DNA and Protein Sequences of V161CPH6
A silent mutation within the VEGF sequence at codon 2 (CCC→CCT) of the mature protein was created to destroy the Nco I site located in this gene (*). The engineered restriction endonuclease sites are shown in lower case. The numbers below the protein sequence indicate the codons for the genes from which the individual fragments are derived. The VEGF sequences are from -26 to 161 and the CPG2 sequences from 23 to 415. The signal peptide from VEGF which is cleaved proteolytically from the mature protein is underlined. The polyhistidine tag is shown in italics within the protein sequence and the stop codon is indicated by ~.
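The effect of the silent CCC→CCT mutation at codon 2 of mature VEGF can be checked with a few lines of code: the encoded peptide should be unchanged while the Nco I recognition sequence disappears. The sketch below assumes the standard genetic code and the usual Nco I recognition sequence CCATGG; the two 12-base fragments are taken from the start of the mature VEGF coding region as shown in SEQ ID NOS. 5 and 7, and the helper names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

NCO_I_SITE = "CCATGG"  # recognition sequence of Nco I


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acids."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Codons 1 to 4 of mature VEGF: Ala Pro Met Ala.
wild_type = "GCACCCATGGCA"  # as in SEQ ID NO. 5
mutated = "GCACCTATGGCA"    # as in SEQ ID NO. 7 (CCC -> CCT at codon 2)

print(translate(wild_type), NCO_I_SITE in wild_type)
print(translate(mutated), NCO_I_SITE in mutated)
```

Both fragments translate to the same four residues, and only the wild type fragment still contains the Nco I site, which is the behaviour the text describes for a silent mutation.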

SEQ ID Nos. 9 and 10: DNA and Protein Sequences of V115CPH6
A silent mutation within the VEGF sequence at codon 2 (CCC→CCT) of the mature protein was created to destroy the Nco I site located in this gene (*). Additionally, the VEGF gene has been truncated at codon 115 as indicated by the numbering below the protein sequence. The engineered restriction endonuclease sites are shown in lower case. The numbers below the protein sequence indicate the codons for the genes from which the individual fragments are derived. The VEGF sequences are from -26 to 115 and the CPG2 sequences from 23 to 415. This fusion therefore does not contain the amino acids responsible for heparin-binding of VEGF. The signal peptide from VEGF which is cleaved proteolytically from the mature protein is underlined. The polyhistidine tag is shown in italics within the protein sequence and the stop codon is indicated by ~.

SEQ ID Nos. 11 and 12: DNA and Protein Sequences of CPV165
A peptide spacer was generated by additional DNA sequences between the two, encompassed by the Eco RI and Kpn 1 sites. The spacer peptide which these encode is indicated in italics and underlined, within the protein sequence. The cloning of this construct resulted in the loss of the polyhistidine tag from the 3' end of the CPG2 derived sequences and also in the removal of the sequences for the VEGF signal peptide. At the 5' end of the gene are the first 27 codons of c-erbB2, followed by CPG2 amino acids 23-415, fused via the peptide spacer to VEGF amino acids 1-165. The stop codon is indicated by ~.
SEQ ID Nos. 13 and 14: DNA and Protein Sequences of CPV109
A peptide spacer was generated by additional DNA sequences between the two, encompassed by the Eco RI and Kpn 1 sites. The spacer peptide which these encode is indicated in italics and underlined, within the protein sequence. The cloning of this construct resulted in the loss of the polyhistidine tag from the 3' end of the CPG2 derived sequences and also in the removal of the sequences for the VEGF signal peptide. This clone was created when a PCR generated mutation occurred during the creation of CPV165. This converts the Arg codon 110 of mature VEGF into a stop codon, resulting in truncation of the VEGF fraction at codon 109 as indicated in the protein sequence derived from this clone (*). At the 5' end of the gene are the first 27 codons of c-erbB2, followed by CPG2 amino acids 23-415, fused via the peptide spacer to VEGF amino acids 1-109. The mutation resulting in the stop codon is indicated by ~.
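The truncation caused by the PCR generated mutation can be sketched by translating the region around VEGF codon 110 before and after the change. In the sequences shown for SEQ ID NOS. 11 and 13 the Arg codon AGA reads TGA in the mutant. The code assumes the standard genetic code; the 21-base fragments cover codons 108 to 114 of mature VEGF, and the function name is hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_to_stop(dna: str) -> str:
    """Translate in frame and halt at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3].upper()]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Codons 108 to 114 of mature VEGF: Lys Asp Arg Ala Arg Gln Glu.
parental = "AAAGATAGAGCAAGACAAGAA"  # as in SEQ ID NO. 11
mutant = "AAAGATTGAGCAAGACAAGAA"    # as in SEQ ID NO. 13 (AGA -> TGA)

print(translate_to_stop(parental))
print(translate_to_stop(mutant))
```

The parental fragment translates through all seven codons, whereas the mutant stops after Lys108 and Asp109, consistent with a VEGF portion that ends at codon 109.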

SEQ ID Nos. 15 and 16: DNA and Protein Sequences of CPV165H6
The DNA sequences of VEGF and CPG2 are taken from SEQ ID No. 11. A peptide spacer was generated by additional DNA sequences between the two, encompassed by the Eco RI and Kpn 1 sites. The spacer peptide which these encode is indicated in italics and underlined, within the protein sequence. In this clone, a polyhistidine tag has been included in the gene at the position 3' to codon 161 of mature VEGF by creating a Bam HI site at codon 162 of mature VEGF as shown. The polyhistidine tag was then cloned into this position and this encodes a C-terminal protein tag as shown in italics at the end of the protein sequence. During the cloning, for convenience, an Eco RI site was created 3' to the Bam HI site as indicated. At the 5' end of the gene are the first 27 codons of c-erbB2, followed by CPG2 amino acids 23-415, fused via the peptide spacer to VEGF amino acids 1-161 and then the polyhistidine tag. The stop codon is indicated by ~.

SEQ ID Nos. 17 and 18: DNA and Protein Sequences of CPV109H6
The two DNA sequences of VEGF and CPG2 are taken from SEQ ID No. 13. A peptide spacer was generated by additional DNA sequences between the two, encompassed by the Eco RI and Kpn 1 sites. The spacer peptide which these encode is indicated in italics and underlined, within the protein sequence. In this clone, a polyhistidine tag has been included in the gene at the position 3' to codon 109 of mature VEGF by creating a Bam HI site at codon 110 of mature VEGF as shown. The polyhistidine tag was then cloned into this position and this encodes a C-terminal protein tag as shown in italics at the end of the protein sequence. During the cloning, for convenience, an Eco RI site was created 3' to the Bam HI site as indicated. At the 5' end of the gene are the first 27 codons of c-erbB2, followed by CPG2 amino acids 23-415, fused via the peptide spacer to VEGF amino acids 1-109 and then the polyhistidine tag. The stop codon is indicated by ~.

SEQ ID NOS. 19 AND 20: DNA AND PROTEIN SEQUENCES OF KDR(BD)
The position of the engineered Nco 1 site at the 5' end and the endogenous Eco RI site at codons 809 and 810 are indicated. The signal peptide is indicated in the protein sequence as the underlined region. The transmembrane region, located between codons 746 and 770, is underlined and in italics in the protein sequence. The DNA sequence used to encode the C-terminal 9E10 tag is shown in lower case and the protein sequence it encodes is in italics. This region encodes amino acids 1-810 of KDR fused to the 9E10 epitope. The stop codon is indicated by ~.

SEQUENCE INFORMATION
Sequences of CPG2
SEQ ID NO. 1: Genomic DNA
SEQ ID NO. 2: Protein Sequence GCCCGACMC AGGCGTCCAC CAG(~ CAl~CCGACA ACCCGMCGA ACAATGCGTA 180 Met Arg Pro Ser Ile His Arg Thr Ala Ile Ala Ala GTG CT& GCC ACC GCC TTC GTG GCG GGC ACC GCC CTG GCC CAG MG CGC Z78 Val Leu Ala Thr Ala Phe Val Ala Gly Thr Ala Leu Ala Gln Lys Arg Asp Asn Val Leu Phe Gln Ala Ala Thr Asp Glu Gln Pro Ala Val Ile Lys Thr Leu Glu Lys Leu Val Asn Ile Glu Thr Gly Thr Gly Asp Ala Glu Gly Ile Ala Ala Ala Gly Asn Phe Leu Glu Ala Glu Leu Lys Asn CTC GGC TTC ACG GTC ACG CGA AGC MG TCG GCC &GC CTG GTG GTG GGC 470 Y 80 ~Y35 90 Asp Asn Ile Val Gly Lys Ile Lys Gly Arg Gly Gly Lys Asn Leu Leu Leu Met Ser His Met Asp Thr Val Tyr Leu Lys Gly Ile Leu Ala Lys 110 115 12û

Ala Pro Phe Arg Val Glu Gly Asp Lys Ala Tyr Gly Pro Gly Ile Ala Asp Asp Lys Gly iGly5 Asn Ala Val Ile lL5uO His Thr Leu Lys Leu Leu =

W O 97126'918 PCT/GB97/00221 Lys Glu Tyr Gly Val Arg Asp Tyr Gly Thr Ile Thr Val Leu Phe Asn ACC GAC GAG GM MG GGT TCC ~TC GGC TCG CGC GAC CTG ATC CAG GAA 7~8 Thr Asp Glu Glu Lys Gly Ser Phe Gly Ser Arg Asp Leu Ile Gln Glu Glu Ala Lys Leu Ala Asp Tyr Val Leu Ser Phe Glu Pro Thr Ser Ala GGC GAC GM MA CTC TCG CTG GGC ACC TCG GGC ATC GCC TAC GTG CAG 8~4 Gly Asp Glu Lys Leu Ser Leu Gly Thr Ser Gly Ile Ala Tyr Val Gln 205 210 215 22û

Val Asn Ile Thr Gly Lys Ala Ser His Ala Gly Ala Ala Pro Glu Leu GGC GTG MC GCG CTG GTC GAG GCT TCC GAC CTC GTG CTG CGC ACG ATG g50 Gly Val Asn Ala Leu Val Glu Ala Ser Asp Leu Val Leu Arg Thr Met Asn Ile Asp Asp Lys Ala Lys Asn Le~ Arg Phe Asn Trp Thr Ile Ala 2~5 260 265 Lys Ala Gly Asn Val Ser Asn Ile Ile Pro Ala Ser Ala Thr Leu Asn GCC GAC GTG CGC TAC GCG CGC MC GAG GAC ~TC GAC GCC GCC ATG MG 1094 Ala Asp Val Arg Tyr Ala Arg Asn Glu Asp Phe Asp Ala Ala Met Lys Thr Leu Glu Glu Arg Ala Gln Gln Lys Lys Leu Pro Glu Ala Asp Val Lys Val Ile Val Thr Arg Gly Arg Pro Ala Phe Asn Ala Gly Glu Gly Gly Lys Lys Leu Val Asp Lys Ala Val Ala Tyr Tyr Lys Glu Ala Gly Gly Thr Leu Gly Val Glu Glu Arg Thr Gly Gly Gly Thr Asp Ala Ala ~50 355 360 CA 02243~33 1998-07-20 W O 97/26918 PCT/GB97/~0221 - 68 -Tyr Ala Ala Leu Ser Gly Lys Pro Val Ile Glu Ser Leu Gly Leu Pro 36~ 370 375 380 Gly Phe Gly Tyr His Ser Asp Lys Ala Glu Tyr Val Asp Ile Ser Ala Ile Pro Arg Arg Leu Tyr Met Ala Ala Arg Leu ~le Met Asp Le~ Gly Ala Gly Lys 41~

GTCACATAGA AGGMCTGCC ATGTTGTTGA CAGCAGACCA GGMGCCATC CGCGACGCGG lS00 ATGGCGGCGC CGGCCTCGAC TACCTCACCT CGCGCTGGTG CTGGAGGAGA TCGCGGCCGG ~780 CGGCMGMC GGCCAGGTGG CGGGATCC . 2048 CA 02243533 l998-07-20 W O 97/26~18 PCT/GB97/00221 Sequenc:es of CPH6 SEQ ID NO.3: DNA Sequence SEQ ID NO.4: Protein Sequence ~co ~' CCAl~GAGCT&GCG~
~- Met~luLel~l.A ~1~T .e~C~ys CGCTGGGGGCTC~ C~ l~CCC CCCGGAGCCGCGAGCA~CCAAGTGTGCACC
A~aTr~G:lvLeuLeuLeuAlaLeuLell~ro ~roGlvAlapl2serT~rGlnvalcvsThr Bam HI
gyatccGCCCTGGCCCAG~AGCGCGACAAc GTGCTGTTCCAGGCAGCTACCGACGAGCAG
GlySerAla:Le ~laGlnLysAr:rAspAsn ValLeupheG~ aThrAspGl~Gln CCGGCCGTGA~CAAGACGCTGGAGAAGCTG GTcAAc~TcGAGAccGGcAccGGTGAc&~c ProAl~ValIleLysThrLeuGl~LysLeu ~alAsnIleGluThrGlyThrGlyAspA12 G~C-GGCATCGCCGCTGCGGGCA~CTTCCTC GAGGCCGAGCTCAAGAACCTCGGCTTCACG
GluC-lyIleAlaAlaAlaGlyAsnPheheu GluAlaGlu~eu~ysAsnLeuGlyPhe~h~

GTCACGCGAAGCAAGTCGGCCGGCCTC-GTG GTC~ CGACAACATCGTGGGCAAGATCAAG
ValThrArgSerLysSerAlaGlyLeuV~l Val~lyAspAsnIleValGlyLysIleLys GGCCGC~GCG5C~AGAACCTGCTGCTGATG TCGCACATGGACACC~ ACCTCAAGG~C
GlyArgC-ly&lyLysAsnLeuLeuLeuMet SerHis~etAspThr~alTyrLeu~ysGly ATTcTcGcGAAGGccccGTTccGcGTcGAA GGCGACAAGGCCTACGGCCCGGGCATCGCC
IleLeuAlaLysAlaProPheArgValGlu GlyAspLysAlaTyrGlyProGlyIle~Ja GACGACAAG.r-~CGGCAACGCGGTCATCCTG CACACGCTCAAGCTGCTGAAGGAATACGGC
AspAspI.ysGlyGly~sn~la~alIleLeu HisThrLeuLys~euLeuL~sGluTyrGly *l GTGCGCGACTACGGCACCATCACCGTGCTG TTCAATACC&ACGAGGAAAAGG~~ C
ValArgAsp~yrGlyThrIleThrValLeu PheAsnThrAspGluGluLysGlySerPhe GGCTCGCGCGACCTGATCCAGGAAGAAGCC AAGC~GGCCG~CTACGTG~TCTCC~ ~-~AG
GlySerArgAspLeuIleGlnGluGluAla LysLeuAlaAs~Tyr~al~euserphe&lu CCCACC~GCGCAGGCGACGAAA~CTCTCG CT~GGCACCTCGGGCATCGCCTACGTGC~G
ProThrserAlaGlyAspGlllLysLeuser LeuGlyThrserGlyI~ aTyrvalGln ~2 GTCCAA~TCACCGGCAAGGCCTCGCATGCC GGCGCCGCGCCCGAGCTGGGCGTGAACGCG
Va1Gln~leThr~lyLysAlaSerHisAla ~lyAlaAlaProGluLeuGlyValAsnAla CTGGTCGAGGCTICCGACC~CGTGCTGCGC ACGATGAACATCGACGACAAGGCGAAGAAC
LeuValGluAlaSerAspLeu~alLe~Ara Thr~etAsnIleA~pAspLysAlaLysAsn ~3 *4 CTGCGCTTCCAATGGACCATCGCCA~GGCC GGCCAAG~CTCGAACATCATCCCCGCCAGC
LeuArgpheGlnTrpThrIleAlaLysAla GlyGlnValSerAsnIleIleProAlaSer AlaThr~euAsnAlaAspValArgTyrAla ArgAsnGluAspPheAspAlaAlaMetLys ACGCTGGAAGAGCGCGCGCAGCAGAAGAAG CTGCCCGAGGCCGACGTGAAGGTGATCGTC
ThrLeuGluGluArgAlaGlnGlnLysLys LeuProGluAlaAspVal~ysValIle~al A~GCGCGGCCGCCCGGCCTTCAATGCCGGC GAAGGCGGCAAGAAGCTGGTCGACAAGGCG
ThrArgGlyArgProAlaPhcAsnAlaGly GluGlyGlyLysLys~eu~alAspLysAla GTGGCCTACTACAAGGAAGCCGGCGGCACG CTGGGCGT&GAAGAGCGCACCGGCG~CG~C
ValAlaTyrTyrLysGluAlaGlyGlyThr LeuGlyValGluGlUArgTh~GlyGlyGly ACCGACGCGGCCTACGCCG~l~lGAGGC AAGCCAGTGATCGAGAGCCTGGGCCTGCCG
ThrAspAlaAlaTyrAlaAlaLeuSerGly LysPro~alIleGluSerLeuGlyLeuPro GGCT~CGGCTACCACAGCGACAAGGCCGAG ~ACGTGGACATCAGCGCGAITCCGCGCCGC
GlyPheGlyTyrHisSerAspL~sAlaGlu ~yrValAspIleSerAlaIleProArgArg Eco RI
GTGTACATGGCTGCGCGCCTGATCATGGAT CTGGGCGCCGGCAAGgaattc~ATCATCAC
~euTyrMetAlaAlaArgLeuIl~MetAsp LeuGl~AlaGlyLysCluPhe~isHisHis XbaI
CACCA~C~ACGCl~TCC~AGtctaga 3 Hi sHi sHi sAl 2 Ser ~ -C

SEQUENCES OF VEGF
SEQ ID NO. 5: DNA SEQUENCE
SEQ ID NO. 6: PROTEIN SEQUENCE
5'ATG&~CTTTCTGCTGTCTTGGGTGCATTGGAGCCTTGCCTTGCTGCTCTACCTCCACCAT
N-MetA~ppheLeuLeuserTrpvalHIaTrpserLe~AlaLeuLeuLeuTyr~euHisHi - . -26 .~co 1 GCCAAG'~GTCCCAGGCTGCACCCAT&GCA GAAGGAGGAGGGrA~-AA~CATCACGAAGTG
AlaLYsTrpserG~n~ proMetAla AspGlyGlyGly~l nA~n~; ~; sGluVal GTGAAGTTCATGGATGTCTATCAGCGCAGC TACTGCCATCCAATCr.~ACCCT~Gq~C
ValLysPheMetAspValTyrGlnArgSer TyrCys~isProIleGluThrLeuValAsp ATCTTCCAGGAGTACCCTGATGhGATCGAG TAC~TCTTCA~GCCATCCTGTG~GCCCCTG
IlePher,l n~l uTyrProAspGluIleGlu TyrIlePheLysProSerCysValPrO~eu ATG~GAq~C&GGGGCTGCTGCAATGACGAG GGCCTGGAGTGTGTGCCCACTGAGGAGTCC
MetArgCysGluGluCysCys~n ~ ~pGlu GlyLeuAspCysValProThrGluGluSer AACATC~CCATGCAGATTA~GCGGATCAAA CcTcAccAAGGccAG~A~ TG
A~Tl el~rMetGlnIle~etArgIleLys Pro~isGlnGlyGlnHisIleGlyGlUMet *

AGCTTCCTACAG~A~AACAbA~GTGAATGC AGAcc~A~GAAA-~-ATAG~rAA~-AcAAG~A
SerPhe~e~ ~;~A~T.ys~ysGluCys ArgProLysLysAspArgAl~A~gGlnGlu AaTCCCTGTGGGCCTTGCTCAGAGCGGAGA AAGCATTTGT ~ Gl'A(~AAf-'Aq'CCGCAGACG
A5nProCysGlyProCysSerGluArgArg LysHis~euPheValGlnAspProGlnThr TGTAAATGTTCCTGC~ A AA~ GACTCG CGTTGCAAGGCGAGGCAGCTTGAGTTAAAC
CysLysCysSerCys~ysAsnThrAspSer ArgCys~ysAlaArgGlnLeuGluLeuAsn GAACGTACTTGCAGATGTrA~A~CGAGG CGGTGA3' GluAr~ThrCysArgCysAspLysProArg Arg ~ -C

SEQUENCES OF V161CPH6
SEQ ID NO. 7: DNA SEQUENCE
SEQ ID NO. 8: PROTEIN SEQUENCE
Nco 1
5' ccATGGACTTTCTGCTGTCTTGG
N- MetAs~heLe~LeuSerTr~
GTGCATTGGAGCCTTGCCTTGCTGCTCTAC CTCCACCATGCCAAGTGGTCCCA&GCTGCA
Val~Ia~Se~Leu~laLeuLeuLe~Ty~ ~euHisHisAlaLY~Tr~serG~nAlaAla *
CCTATGGCAGAAGGAGGAGGGCAGAATCAT CACGAAGTGGTGAAGTTCATGGATGTCTAT
ProMetAlaAspGlyGlyGlyGlnAsnHis HisGluValValLysPheMetAspValTyr CAGCGCAGCTAC~GCCATCCAATCGAGACC CTGGTGGACATCTTCCAGGAGTACCCTGAT
GlnA~SerTyrCy~HisProIleGluThr LeuValA8pIl~Ph~GlnGluTyrProAsp GAGATCGAQTACATCTTCAAGCCATCCTGT GTGCCCCT.GATGCGATGCGGGGGCTGCTGC
GluIleGluTyrIlePheLy~ProSerCyfi ValProLeuMetArgCy8GluGluCyaCys AATGACGAGGGCCTGGAGTGTGTGCCCACT GAGGAGTCCAACATCACCATGCAGATTATG
AsnAspGl~GlyL~uA~pCycValProThr GluGluS~rAsnIleThrMetGl~IleMet CGGATCAAACCTCACCAAGGCCAGCACATA GG~GAGATGAGCTTCCTACAGCACAACA~A
Ar~IlQ~ycProHi8GlnGlyGlnHi8Ile GlyGluMetSerPheLeuGlnHisAsnLys TGTGAATGCAGACCAAAGAAAGATAGAGCA AGACAAGAAAATCCCTGTGGGCCTTGCTCA
CysGluCysArgProLy~Ly~AspArgAla ArgG~nGl AsnProCy~GlyProCy~Ser GAGCGGAGAAAGCATTTGTTTGTACAAGAT CCGCAGACGTGTAAATGTTCCTGCPA~A~C
GluArsJAr~LysHicLeuPheValGlnAsp ProGlnThrCysLysCysSerCysLysAcn AC~GACTCGCGTTGCAAGGCGAGGCAGCTT GAGTTAAACGAACGTACTTGCAGATGTGAC
ThrAspSer~r~Cy~Ly8AlaAr~GlnLeu GluLeuAsnGluArgThrCysArgCysA8p Bam HI
ggatccGCCCTGGCCCAGAAGCGCGACAAC GTGCTGTTCCAGGCAGCTACCGACG~GCAC-GlySerAlaLeuAlaGlnLysArgAspAsn ValLeuPheGlnAlaAlaThrAspGluGln CCGGCCGTGATCAAGACGCTGGAGAAGCTG GTCAACATCGAGACCGGCACC&GTGACGCC
ProAlaValIleLysThrLeuGluLysLeu ValAsnIleGluThrGlyThrGlyAspAla GAGGGCATCGCCGCTGCGGGCAACTTCCTC GAGGCCGAGCTCAAGAACCTCGGCTTCACG
GluGlyIleAlaAlaAlaGlyAsnPheLeu GluAlaGluLeuLysAsnLeuGlyPheThr GTCACGCGAAGCAAGTCGGCCGGCCTGGTG GTGGGCGACAACATCGTGGGCAAGATCAAG
ValThrArgSerLysSerAlaGlyLeuVal ValGlyAspAsnIleValGly~ysIle~ys GGCCGCGGCGGCAAGAACCTGCTGCTGATG TCGCACATGGACACCGTCTACCTCAAGGGC
GlyArgGlyGlyLysAsnLeuLeuLeuMet SerHisMetAspThrValTyrLeuLysGly ATTCTCGCGAAGGCCCCGTTCCGCGTCGAA GGCGACAAGGCCTACGGCCCGGGCATCGCC
IleLeuAlaLysAlaProPheArgValGlu GlyAspLysAlaTyrGlyProGlyIleAla S~S ~ TE SHEET(RULE26) CA 02243533 l998-07-20 W O 97/26'318 PCT/GB97/00221 GACGAC}~GGGCGGCAACGCGGTCATCCTG CACACGCTCAAGCTGCTGAAGGAATACGGC
AspAspl,ysGlyGlyAsnAlaValIleLeu His~hrLeuLysLeuLeuLysGluTyrGly GTGCGCGACTACGG~ACCATCACCGTGCTG TTcAATAccGAcGAGGAAAAGG~ ~c~
ValArgAspTyrGlyTh~IleT~rVal~eu PheAsnThrAspGluGluLysGlySerPhe GGCTCGCGCGACCTGAICCAGGAAGAAGCC ~GCTGGCCGACTACGTGCTCTCCTTCGAG
GlySer.~raAspLe~IleGlnGluGluAla ~ysLeu~laAsp ~ ValIJeuSerPheGlu CCCACC~GCGCA5GCGACG,~AAAAC~ CTGGGCACCTCGGGCATCGCCT~CGTGCAG
ProThrSerAlaGlyAspGluLysLeuser ~eu&lyThsSerGlyIleAlaTyr~alGln GTcc ~ ~TcAccGGcA;~GGccTcGcATGcc GGCGCCGCGCCCGAGCTG~j~C~;il~AACGCG
ValGlnlleThr&lyLysAl~SerHisAla GlyAlaAlaP~oGluLeuGlY~alAsnAla CTGGTcGAGGcTTccGAccTcGTGcTGcGc ACGATGAACATCGACGACAAGGCGAAGAAC
~euVa~GluAlaSerAspkeuval~euArg ThrMetAsnIleAspAspLysAlaLysAsn CTGCGCTTCCAATGGACCATCGCCAAGGCC GGCCAAGTCTCGAACATCATCCCCGCCA~C
~euArgpheGl~TrpThrIleAlaLysAla GlyGlnV~lSerAsnI~eIle~roAlaSer ~CCACGCTC-~ACGCCGACGTGCGCTACGCG CGCAACGAGGACTTCGACGCCGCCATGAAG
AlaThr~euAsnAla~spValArgTyrAla ArgAsnGluAsppheAspAlaplaMetLy~
ACGCTGG~AGAGCGC~CGCAGCAGAAGAAG CTGCCCGAGGCCGACGTGA~GGT~AT~GTC
~hrLeuGluGluArgAlaGlnGlnLysLys LeuProGluAlaAspVal~ys~alIle~al ACGCGCGGC~GCCCGGCCTT~AATGCCGGC GAAGG~GGCAAGA~GC~GGTCGACAAGGCG
ThrAraGlyAr~ProAlaPheAsnAlaGly GluGlyGlyLysLysLeuValAsp~ysAla GTGGCCTACTACAAGGAAGCCGGCGGCACG CTGGGCGTGGAAGAGCGCACCGGCGGCGGC
ValAla~yrTyrLysGluAlaGlyGlyThr LeuGlyValGluGluAryThrGlyGlyGly ACCGACGCGGCCTACCCCGCGCTCTCAGGC AAGCCAGTGATCGAGAGCCT~GGCCTGCCG
Th~A-cpAlaAlaTyrAlaAlaLeuS~rGly LysProValIleGluSerLeuGlyLeuPro GGCTTCGGCTACCACAGCGACAAGGCCGAG TACGTGGACATCAGCGCGATT~CGCGCCGC
GlyPheGlvTyrHisSerAspLysAlaGlu TyT~alAspIleSerAlaIleProArgArg Eco ~I
CTGTAC~TGGCT&CGCGCCT~ATCATGGAT CTGGGCGCCGGCA~Gg~attcCATCATCAC
LeuTyrMe~AlaAlaArgLeuIleMetAsp LeuGlyAlaGlyLysGluPhe~i,~HisHis XbaI
CACCA~CACGC~CCTAGtc~aga 3 HisHisHisAlaSer ~ -C

SEQUENCES OF V115CPH6
SEQ ID NO. 9: DNA SEQUENCE
SEQ ID NO. 10: PROTEIN SEQUENCE
Nco 1
5' ccATG
N- ~et GACTTTCTGCTGTC~TGGGTGCATTGGAGC CTTGCCTTGCTGCTCTACCTCCACCATGCC

Asl~PhçI~e.uI,euS~3rT~ValHIaTrPS~3r LeuAla1e7lhçuLeuTYrI,çuHislIisAlz AAGTGGTCCCAGGCTGCACCTATGGCAGAA GGAGGAGGGCAGAATCATC~CGAAGTGGTG
LY~Tx~SerGln~laAlaProMetAlaA~p GlyGlyGlyGlnA8n~isHi8GluValVal AAGTTCATGGATGTCTATCAGCGC~GCTAC TGCCATCCAATCGAGACCCTGGTGGACATC
Ly~PheUetA8pValTyrGinArgSerTyr Cy8Xi8ProIleGluThrLeuValA8pIle TTCCAGGAGTACCCTGATGAGATCGAGTAC ATCTTCAAGCCATCCTGTGTGCCCCTGATG
PheGlnGluTyrProA8pGluIl~GluTyr IlePh~hy5ProSerCy8ValProLeuM~t CGATGCGGGGGCTGCTGCAATGACGAGGGC CTGGAGTGTGTGCCCACTGAGGAGTCCAAC
Ar~CysGlUGl~CysCy~AsnAspGluGly ~euA8pCy8ValProThrGluGluSerA~
A~CACCATGCAGATTATGCGGATCAAACCT CACCAAGGCCAGCACATAGGAGAGATGAGC
IleThr~etGlnIleMetAr~Ile~ysPro ~i8GlnGlyGln~i8IlQGlyGluMetSer TTCCTACAGCACAACAAATGTGAATGCAGA CCAAAGAAAGATAGAGCAAGACAAGAAAAT
PheL~uGln~i8A8nLy~Cy8GluCy~Ar~ Prohy8Ly8A8~ArgAla~r~GlnGluA;n Bam HI
ggatccGCCCTGGCCCAGAAGCGCGACAAC GTGCTGTTCCAGGCAGCTACCGACGAGCAG
GlySerAlaLeuAlaGlnLysArgAspAsn ValLeuPheGlnAlaAlaThrAspGluGln ~3 CCGGCCGTGATCAAGACGCTGGAGAAGCTG GTCAACATCGAGACCGGCACCGGTGACGCC
ProAlaValIleLysThrLeuGluLysLeu ValAsnIleGluThrGlyThrGlyAspAla GAGGGCATCGCCGCTGCGGGCAACTTCCTC GAGGCCGAGCTCAAGAACCTCGGCTTCACG
GluGlyIleAlaAlaAlaGlyAsnPheLeu GluAlaGluLeuLysAsnLeuGlyPheThr GTCACGCGAAGCAAGTCGGCCGGCCTGGTG GTGGGCGACAACATCGTGGGCAAGATCAAG
ValThrArgSerLysSerAlaGlyLeuVal ValGlyAspAsnIleValGlyLIsIleLys GGCCGCGGCGGCAAGAACCTGCTGCTGAT~ TCGCACATGGACACCGTCTACCTCAAGGGC
GlyArgGlyGlyLysAsnLeuLeuLeuMet SerHisMetAspThrValTyrLeuLysGly ATTCTCGCGAAGGCCCCGTTCCGCGTCGAA GGCGACAAGGCCTACGGCCCGGGCATCGCC
IleLe~uAlaLysAlaProPheArgValGlu GlyAspLysAlaTyrGlyProGlyIleAla GACGACAAGGGCGGCAACGCGGTCATCCTG CACACGCTCAAGCTGCTGAAGGAATACGGC
AspAspLysGlyGlyAsnAlaValIleLeu HisThrLeuLysLeuLeuLysGluTyrGly SUD~ 111 ~JTE SHEET (RULE Z6) -GTGCGCG~CTACGGCACCATCACCGTGCTG TTcAAT~ccGAcGAGGAAAAGGGTTccTTc ValArg~spTy~GlyThrIleThrValLeu PheAsn~hrAspGluGluLysGlyserphe GGCTCGCGCGACCTGATCC~GGAAGAAGCC AAGCT~GCCGACTACGTGC~ r~GAG
GlySerAr~AspLeuIleGlnGluGlu~la Lys~euAlaAspTyrValLe~Se~PheGlu CCCACCA~CGCAGGCGACGAAAAA~l~l~ CTG~GCACCTCGG~CATCGCCTACGTGCAG
ProThrSerAlaGlyAspGlu~ysLe~er LeuGlyThrSerGlyIleAl~Tyr~alGln GTccAAA~cAccGGcAAGGccTcGcATGcc GGCGCCGCGCCCGAGCTGG&CGTGAAC~CG
Val51nIleThrGlyLysAlaSerHisAla ~ rAlaAlaProt~;luLeuGlyValAc~
CTGGTCGAGGCllCC~ACCTCGTGCTGCGC ACGATGAACATCGACGACAAGGCGAAGAAC
LeuValGluAlaSerAspLeuValLeuArg ThrMetAsnIleAspAspLysAlaLysAsn CTC-CGCTTCCAATGGACCATCGCCAAGGCC GGccAAGi~AAcATc~Tc~ccGccAGc LeuArgpheG~ TrpThrIleAlaI,ysAla G~ nValSerAsnIleIlePr~AlaSer GcGAcGcTGAA~Gcc&A~GT~cGcTAcGcG CGCAACGAGGACT~CGACGCCGCCATGAAG
AlaThrLeuAsnAlaAspvalArgTyrAla ArgAsnGluAsp~heAspAlaAlaMetLys ACGCTGGAAGAGCGCGCGCAGCAGAAGAAG CTGCCCGAGGCCGACGTGAAGGTGATCGTC
ThrLeuGluGluArgAlaGl~GlnLysLys LeuPr~GluAlaAspValLys~alIleVal AC~.CGCC~CCGCCC~G~CT~CAAT&CCGGC GAAGGCGGCAAGAAGCTGGTCGACAAG~CG
ThrArgGlyAr~ProAlaPheAsnAlaGiy GluGlyGlyLysLysLeU~alAspLysAla GTGGCC~ACTA~AAGGAAGCCGGCGGCAC& ~lG~G~l~GAAGAGCGCACCGGCGGCGGC
Val~la~yrTyrLysGluAlaGlyGlyThr LeuGlyValGluGlUAraThr&lyGlyGly ACCGACGCGGCCTACGCCGCGCTCTCAG~C AAGCCAGTGATCGAG~GCCTGGGCCTGCCG
ThrAsp~laAlaTyrAlaAlaLeuSerGly LysProValIleGluSerLeuGlyLeuPr~
GGCTTC~GCTACCACAGCGAC~AGGCCGAG TACGTGGACATCAGCGCGATTCCGCGCCGC
GlyPheGlyTyr~isSerAsp~ysA~aGlu TyrValAspIleSerAlaIl~ProArgArg ~co RI
CTGTACATGGCTGCGCGCCTGATCATGGAT CTG5GCGCCGGCAAGga~ttcCATCATCAC
LeuTyrMetAlaAlaAraLeuIleMetAsp LeuGlyAla~ly~SGluPh~i.~iF~is XbaI
CACCAT~ACGCTTCCTA~tctaga 3 ~isHis~lisAlaSer ~ -C

Sequences of CPVI6s SEQ ID NO. 11: DNA Sequence SEQ ID NO. 12: Protein Sequence N~o 1 ~' CCATGGAGCTGGCGG~C~
N-- MetG11~T.e~lAl~l~T~euCys CGCTGGGGGCTCCTCCTCG~ ~CCC CCCGGAGCCGCGAG~ACCC'AAGTGTGCACC
AraTr~GlvLeuT,ellT~eu~la~ellT.euPro ProGly~laAl~serThrGlnv~lcvsThr Bam HI
g~atccGCCCTGGCCCAGAAGCGCG~cAAC ~~ ~CAGGCAGCTACCGACGAGCAG
Glyser~laLeuAlaG~LysArgAspAsn ValLeuPheGlnAlaAlaThrAspGluGl~

CCGGCCGTGATCAAG~CGCTGGAGAAGCTG GTcAAcATcGAGAccGGcAccGGTGAcGcc ProAlaValIleLysThrLeuGluL~sLeu ValAsnIleGluThrGlyThrGly~spAl~
GAGGGCATCGCCGCTGCGG~CA~CTTCCTC GAGGCCGAGCTCAAGAACCTCGGCTTCACG
GluGlyIleAlaAlaAlzGlyAsnPheLPu GluAlaGluLeuLy~Asn~euGlyPheThr GTCACGCGAAGCAAGTC&GCCGGCCTGGTG GTGGGCGAC~ACAICGTGGGCAAGATCA~G
ValThrArgSerLysScrAlaGlyLeuVal ValGlyAspAsnIlevalGlyLysIleLys GGCCGCGGCGGCAAGAACCTGCTG~TGATG TCGCACATGGACA~ ACCTCAAGGGC
GlyArgGlyGly~ysAsnLeuLeuLeu~et SerHis~etAspThrValTyrLeuLysGly PTTCTCGCGA~&GCCCCGTTCCGCG~CGAA GGCGACAAGGCCTACGGCCCGGGCATCGCC
IleLeuAlaLysAlaProPheArcValGlu GlyAsp~ysAlaTyrGlyProGlyIleAla GACGACAAGGGC&GCAACGCGGTCA~CCTG CACACGCTCAAGCTGCT&AAGGAATACGGC
AspAspLysGlyGlyAsnAlaValIleLeu ~isThrLeuLysLeuLeuLysGluTy~Gly GTGCGCGACTACGGCACCATCACCGTGCTG T~CAATACCGACGAGGAAA~GGG~
ValArg~spTyrGlyThrIleThrValLeu PheAs~ThrAspGluGluLysGlySerPhe GGCTCGCGCGACCTGATCCAGGAAGAAGCC AAGCTGGCCGACTACGTG~~ GA&
GlySe~ArgAspLeuIleGlnGluGluAla LysLeuAlaAspTyr~al~euSerPheGl~
CccAccAGcGcAGGcGAcGAAAAA~ cG CTGGGCACCTCGGGCATCGCCTACGTGCAG
ProThrSerAlaGlyAspGl~ysLeuSer LeuG~yThrSerGlyIleAlaTyrValGln GTCCAA~TCACCGGCAAGGCCTCGCATGCC GGcGccGcGcccGAGcTGGGcGTGAAcG~G
ValGlnIleThrGlyLysAlaser~isAla ~lyAlaA~proGluLeuGlyvalAsnAla C~GGTCGAGGCTTCCGACCTCGTGCTGC~C AcGAT~AAcATcGAcGAcAAGGcGAAGAAc LeuvalGluA~aserAspLeuvalLeuA~g ThrMetAsnIleAspAspLysAla~ysAsn CTG~ ~AATGGACCATCGCCAAGGCC GGCCAAG~CTCG~CATC~TCCCCGCCAGC
LeuArgpheGlnTrpThrIleAlaL~sAla GlyGlnvalserAsnIleIleproAlaser ~CCACGCTGAACGCCGACGTGCGCTACGCC CGCAACGAGGACTTCGACGCCGCCATGAAG
~1aThrLeuAsnAlaAspValArgTyrAla ArgAsnGluAsppheAspAlaAla~et~ys AcGcTGGAAGAGcGcGcGcAGcAGAAGAAG CTGcccGAGGccGAcGTGA~GGTGATcGTc ThrLeuGluGluArgAlaGlnGlnLysLys LeuProGlu~laAspValLysValIleVal ACGCGCGGCCGCCCGGCCTTCAATGCCGGc GAAGGcGGcA~GAAGcTGGTcGAcAAGGcG
ThrArgGlyArgProAlaPheAsnAlaGly GluGlyGlyLysLys~euvalAspLysAla GTGGCCTACTACAAGGAAGCCGGCGGCACG CTGGGCGTGGAAGAGCGCACCGGCGGCGGC
ValAlarryrrryrLysGluAlaGly~lyThr LeuGlyValGluGluArgThrGlyGlyGly ACCGACGCGGCCTACGCCGCGCTCTCAGGC AAGCCA~TGATCGAGAGCCTGGGCCTGCCG
ThrAspAlaAlar~rrAlaAlaLeu~erGly LysProValIleGluSerLeuGlyLeuPro GGCTTCGGCTACCACAGCGACAAGGC~GAG TACGTGGACATCAGCGCGATTCCGCGCCGC
GlyPheGlyTyrHisSerAspLysAlaGlu TyrValAspIleSerAlaIleProArgAr~
Eco RI
CTGTACATGGCTGCGCGCCTGATCATGGAT CTGGGCGCCGGCAAGgaattcGGAGGTGGC
LeuTyr~etAlaAlaArg~euIleMetAsp ~euGlyAlaGlyLysGluPheGlvGlY~lY

Kpr 1 GGAggtacc GCACCCATGGCAGAAGGAGGA GGGCAGAATCATCACGAAGTGGTGAAGTTC
Gl V~l VThrAlaProMetAlaAspGlyGly GlyGlnAsn~i~isGluValValLysPhe ATGGATGTCTATCAGCGCAGCTACTGCCAT CCAATCGAGACCCTGGTGGACATCTTCCAG
MetAspV~lTyrGlnArgSerTyrCysHis ProIleGluThrLeuValAspIlePheGln GAGTACCCTGATGAGATCGAGTACATCTTC AAGCCATCCTGTGTGCCCCTGATGCGATGC
GluTyrProAspGluIleGluTyrIlePhe LysProSerCysValProLeuMetArgCys GGGGGCT~CTGCAATGACGAGGGCCTGGAG TGTGTGCCCACTGAGGAGTCCAACATCACC
GluGluCysCysAsnAspGluGlyLeuAsp CysValProThrGluGluSerAsnIleThr ATGcAGA~rTATGcGGATcAAACcTcAccAA GGCCAGCACATAGGAGAGATGAGCTTCCTA
MetGl~IleMetArgIleLysProHisGln GlyGlnHisIleGlyGluMetSerPheLeu CAGCACAACAAATGTGAATGCAGACCAAAG AAAGATAGAGCAAGACAAGAAAATCCCTGT
GlnHisA~nLysCysGluCysArgProLys LysAspArgAlaArgGlnGluAsnProCys GGGCCTTGCTCAGAGCGGAGAAAGCATTTG TTTGTACAAGATCCGCAGACGTGTAAATGT
GlyProCysSerGluArgArgLys~isLeu PheValGlnAspProGlnThrCysLysCys TCCTGCAAAAACACAGACTCGCGTTGCAAG GCGAGGCAGCTTGAGTTAAACGAACGTACT
SerCysL~AsnThrAspSerArgCysLys AlaArgGlnLeuGluLeuAsnGlUArgThr TGCAGATGTGACAAGCCGAGGCGGTGA 3' CysArgCy~;AspLy~;ProArgArg ~ -C


S~ UTE SHEET (RULE 26) Sequences of CPV,~,g SEQ ID NO. 13: DNA Sequence SEQ ID NO. 14: Protein Sequence ~co 1 5' CCATGG~GCT&GCG~~ ~C
N- MetGluLeu~la~laLeuC~s CGCTGGGGGCTCCTCCTCGCC~-lCll~CCC ~CCGGAGCCGCGAGCAC~CAA~l~l~CC
Ar~TrDr-l~L~ T.el lT ~euAlaLe--T.ellPro ~roÇlYAl~laSerThr~1nValC~sThr Bam HI
ggatccGCCCTG~CCCAGAAGCGCGACAAC GTG~ ~AGGCAGCTACCGACGAGCAG
GlySerA aLeuAlaGlnLysArgAspA5n ValLeuPheGlnAlaAlaThrAspGluGln CCGGCCGTGATCAAGACGCTGGAG~AGCTG GTCAACATCGAGACCGGCACCGGTGACGCC
ProAlaValIleLys~hrLeuGluLysLeu ValAsnIleGlu~hrGlyThrGlyAspAla GAGGGCATCGCCGCTGCGGGCAACTTCCTC &AGGCCGAGCTCAAGAACC~CG&CTTCACG
GluGlyIleAlaAlaAlaGlyAsnPheLeu GluAlaGluLeuLysAsnLeuGlyPheT~lr GTCACGCGAAGCAAGTCGGCC~GCCTGGTG GTG~GCGACAACATCGTGGGCAAGATCAAG
~alThr~rgSerLysSerAlaGlyLeuVal ValGlyAspAsnIleValGlyLysIle~s GGCCGCGGCGGCAAGAACCTGCTGCTGATG TCGCACATGGACACCGTCTACCTCAAGGGC
GlyArgGlyGlyLysAsnheuLeuLeuMet SerHisMetAspThrValTyrLeu~ysGly ATTCTCGC~-AAC-GCCCCGTTCCGCGTCGAA GGCGACAAGGCCTACGGCCCGGGCATCGCC
IleLeuAla~ysAlaProPheArgValGlu GlyAspLysAlalyrGlyProGlyIleAla GACGACAAGGGCGGCAAC&CGGTCATCCTG CACACGCTCAAGCTGCTGAAGGAATACGGC
AspAspLysGlyGlyAsnAlaValIle~eu HisThrLeuLys~eu~euLysGl~TyrGly GTGCGCGACTACGGCACCATCACCG~GCTG TTCAATACCGACGAGGAAAAGGGTTCCTTC
V~lArgAspTylGlyThrIleThrV~l~eu PheAsnT~rAspGluGlULysGlySerPhe GGCTCGCGCGACCTGATCCA~GAAGAAGCC AAGCTGGCCGACTACGTGCTCTCCTTCGAG
GlySerArgAspLeuIleGlnGluGlu~la LysLeuAlaAspTyrVal~euSerPheGlu CCCACCAGCGCAGGCGACGAAAAAC~CTCG CTGGGCACCTCG~GCATCGCCTACGTGCAG
ProThrSerAlaGlyASPGlULysLeuSer LeuGlyThrserGlylleAlal~rrvalGln GTccAAATcAccGGcAAGGccTcGcATGcc GGCGCCGCGCCCGAGCTGGGCGIGAACGCG
ValGlnIleThrGlYLysAlaSerHisAla GlyAlaAlaProGluLeuGlyValAsnAla CTGGTCGAGGCTTCCGACCTCGTGCTGCGC ACGATGAACATCGACGACAAGGCG~AGAAC
IeuValGlu;~laSerA5pLeuValLeuAr~ ThrMetAsnIleAspAspLysAlaLysAsn CTGCGCTTCCAATGGACCATCGCCAAGGCC ~GCCAAGIC~CGA~CAT~ATCCCCGCCAGC
LeuA~gpheGlnTrpThrIleAlaLysAla GlyGlnvalserAsnIleIleproAlaser GCCACGCTGAACGCCGACGTGCGCTACGCG CGCAACGAGGACTTCGACGCCGCCATGAAG
AIal~rLeu~snAlaAspvalArgTyrAla ArgAsnGlUAsPPheAsPAlaA}aMetLys W O 97/26~18 PCT/GB97/00221 AcGcTGGAAGAGcGcGc~GcAGcAGAAGAAG CTc~;cccGAGGccGAcGTGA~GGTGATcGTc ThrLeuGluGluArgAlaGlnGlnLysLys LeuproGluAlaAspvalLysvalIleval ACGCGCGGCCGCCCGGCCTTCAATGCGGGC GAAGGCGGCAAGAAGCTGGTCGACAAGGCG
ThrArgGlyArgProAlaPheAsnAlaGly GluGlyGlyLys~ys~euValAsPLYSAla GTGGCCTACTACAAGGAAGCCGGCGGCACG CTGGGCGTGGAAGAGCGCACCGGCGGCGGC
ValAla'~yrTyrLysGluAlaGlyGlyThr LeuGlyValGluGluArgThrGlyGlyGly ACCGACGGGGCCTACGCCGCGCTCTCAGGC AAGCCAGTGATCGAGAGCCTGGGCCTGCCG
ThrAspAlaAlaTyrAlaAlaLeuSerGly LysProValIleGlu~rLeuGlyLeuPro GGCTTCGGCTACCACAGCGACAAGGCCGAG TACGTGGACATCAGCGCGATTCCGCGCCGC
GlyPheGlyTyrHisSerAspLysAlaGlu T- ~VGlAspIleSerAlcIleProArgAra ~co RI
CTGTACATGGCTGCGCGCCTGATCATGGAT CTGGGCGCCGGCAAGgaattc~GAGGTGGC
LeuTyr~IetAlaAlaArgLeuIleMetAsp LeuGlyAlaGlyLysGluPheGlvGlY~lY

~pn 1 GG~ggtacc GCACCCATGGCAGAAGGAGGA GGGCAGAATCATCACGAAGTGGTGAAGTTC
GlvGl ~'Th~AlaProMetAlaAspGlyGly GlyGlnAsnHisHisGluValVal~ysPhe ATGGATGTCTATCAGCGCAGCTACTGCCAT CCAATCGAGACCCTGGTGGACATCTTCCAG
MetAspV~lTyrGlnAr~SerTyrCy8His ProIleGluThrLeuValAspIlePheGln GAGTACCCTGATGAGATCGAGTACATCTTC AAGCCATCCTGTGTGCCCCTGATGCGATGC
GluTyrProAspGluIleGluTyrIlePhe LysProS~rCysValPro~euMetArgCy~

GGGGGCTGCTGCAATGACGAGG¢CCTGGAG TGTGTGCCCACTGAGGAGTCCAACATCACC
GluGluCysCysA~nAspGluGlyLeuAsp CysValProThrGluGluSerAsnIlQThr ~TGCAGATTATGCGGATCAAACCTCACCAA GGCCAGCACATAGGAGAGATGAGCTTCCTA
NetGlnIleNetArgIleLysPro~isGln GlyGlnHisIleGlyGluMetSerPhQLeu *

CAGCACAACAAATGTGAATGCAGACCAAAG AAAGATTGAGCAAGACAAGAAAATCCCTGT
GlnHisAsnLysCysGluCysAr~ProLy~ LysAsp ~ -C

GGGCCTTGCTCAGAGCGGAGAAAGCATTTG TTTGTACAAGATCCGCAGACGTGTAAATGT
TCCTG~ A ~1~ ACACAGACTCGCGTTGCAAG GCGAGGCAGCTTGAGTTAAACGAACGTACT
TGCAGATGTGACAAGCCGAGGCGGTGA 3' SIJ~ JTE SHEET (RULE 26) -S~QUENCE ID NOS. 15 and 16 lVco 1 5- Cc~yiGAGcTGGcGGc~ Gl~c N- MetGll]r~e~ aLeuC~rs CGCTGGGG~CTCC-'lCC-lC~-~ ~CCC CCCGGAG~-~GCGAGCACCCAAGTGTGCA~C
~rqTrDGlvT.el~T~el~T~ ~LeuLeupro ProG~AlaAlaS~rThrGlnValCysThr Ba~ HI
~g~tccGCCCT~GCCCA~AAGc~CGACAAC G~ C~AG&CAGCTACCGACGAGCAG
~lySerAlaLeuAla~lnLysArgAspAsn ValLeuPheGlnAlaAl~ThrAspGluGln CCGGCC~TGATCAAGACGCTGGAGAAGCTG GTCAACATCGAGACC~GCACCGGT~ACGCC
ProAlaVallleLysThrLeuGluLysLeu ValAsnIleGluThrGlyThrGlyAspA~.
G~GGGCAT~CG~lGC~GGCAA~l~l~-C~ C GAGGCCGAGCTCAAGAACCTCGGCTTCACG
Gl~GlyIleAlaAlaAlaGlyAsnPh~Leu GluAlaGluLeuLys~snL~uGly~heThr GTCACGCGAAGCAAGTCGGCCG~C'~ G C-TGGG~GACAACATCGTGG&CAAGATCAAG
V~l~hrArgSerLysSerAlaG~yl.euVal ValGlyAspAsnIlevalGly~ysIleLys GGCCGCG~CGGCAAGAACCT~C~G~l'GATG TCGCACATGGACA~-C~~ ACCTCAA~GGC
GlyArgGlyGlyLysAsnLeuLeuLeuMet SerHisMetAspThr~alTyrLeuLysGlY
ATTC~CGC~AAGGCCCC~ll~-~CGTCGAA GGCGACA~GGCCTACGGCCC~C~TCGCC
IleLeuAlaT,ysAlaProPheArgValGlu G~yAspLysAlaTyrGlyproGlyIleAla GACGACAAG~GC~GCAACGCGGTCATCCT~ CACACGCTCAAGCTGCTGAAGGAATACG&C
AspAspLysGlyGlyAsnAla~alIleLeu ~isThrLeuLysLeuLeuLysGluTyrGly GTGCC~GACTACG&CACCATCACCGTGCTG TTCAATACCGACGAG&AAAA~ll~C~llC
ValArgAspTyrGlyThr~lcThrValLeu PheAsnThrAspGluGlu~ysGlySerPhe GGCl~GC~ACCTGATCCAGGAAGAAGCC AAGCTGGCCGA~TACGT~~ CCl-l~CGAG
GlyS~rArgAspLeuIleGlnGluGluAla LysLeuAlaAspTyrVal~euSerPheGlu CCCACCAGCGCA~ÇCGACG~AAAA~ C~ CTGGGCACCTCGGGCATCGCCTACGTGCAG
ProThrserAlaGlyAspGluLysLeuser LeuGlyThrSer~lyIleAlaTyrValGln GTCCAAATCACCGGCAAGG~-lCGCATGCC G~CGC~GCGC~CGAGC~GGGCGTGAACGCG
ValGlnIleThrGlyLysAlaSerXisAla GlyAlaAlaPr~GluLeuGlyValAsnAla CTGGTCGAGGCTTCCGACClC~l ~l~CGC ACGATGAACA~CGACGACAAGGCGAAGAAC
LeuValGluAlaSerAspLeuValLeuArg ThrMetAsnIleAspAspLysAlaLysAsn CTGCGCTTCCAATGGACCATCGCCAAGGCC GGCCAAGTCTCGAACATCATCCCCGCCAGC
LeuArgPheGlnTrpThrIleAlaLysAla GlyGlnValSerAsnIleIleProAlaSer GCCACGCTGAACGCCGACGTGCGCTACGCG CGCAACGAGGACTTCGACGCCGCCATGAAG
AlaThrLeuAsnAlaAspValArgTyrAla ArgAsnGluAspPheAspAlaAlaMetLys ACGCTGGAAGAGCGCGCGCAGCAGAAGAAG CTGCCCGAGGCCGACGTGAAGGTGATCGTC
ThrLeuGluGluArgAlaGlnGlnLysLys LeuProGluAlaAspValLysValIleVal ACGCGCGGCCGCCCGGCCTTCAATGCCGGC GAAGGCGGCAAGAAGCTGGTCGACAAGGCG
ThrArgGlyArgProAlaPheAsnAlaGly GluGlyGlyLysLysLeuValAspLysAla GTGGCCTACTACAAGGAAGCCGGCGGCACG CTGGGCGTGGAAGAGCGCACCGGCGGCGGC
ValAlaTyrTyrLysGluAlaGlyGlyThr LeuGlyValGluGluArgThrGlyGlyGly ACCGACGCGGCCTACGCCGCGCTCTCAGGC AAGCCAGTGATCGAGAGCCTGGGCCTGCCG
ThrAspAlaAlaTyrAlaAlaLeuSerGly LysProValIleGluSerLeuGlyLeuPro GGCTTCGGCTACCACAGCGACAAGGCCGAG TACGTGGACATCAGCGCGATTCCGCGCCGC
GlyPheGlyTyrHisSerAspLysAlaGlu TyrValAspIleSerAlaIleProArgArg Eco RI
CTGTACATGGCTGCGCGCCTGATCATGGAT CTGGGCGCCGGCAAGgaattcGGAGGTGGC
LeuTyrMetAlaAlaArgLeuIleMetAsp LeuGlyAlaGlyLysGluPheGlyGlyGly

Kpn I GGAggtaccGCACCCATGGCAGAAGGAGGA GGGCAGAATCATCACGAAGTGGTGAAGTTC
GlyGlyThrAlaProMetAlaAspGlyGly GlyGlnAsnHisHisGluValValLysPhe ATGGATGTCTATCAGCGCAGCTACTGCCAT CCAATCGAGACCCTGGTGGACATCTTCCAG
MetAspValTyrGlnArgSerTyrCysHis ProIleGluThrLeuValAspIlePheGln GAGTACCCTGATGAGATCGAGTACATCTTC AAGCCATCCTGTGTGCCCCTGATGCGATGC
GluTyrProAspGluIleGluTyrIlePhe LysProSerCysValProLeuMetArgCys
GGGGGCTGCTGCAATGACGAGGGCCTGGAG TGTGTGCCCACTGAGGAGTCCAACATCACC
GluGluCysCysAsnAspGluGlyLeuAsp CysValProThrGluGluSerAsnIleThr ATGCAGATTATGCGGATCAAACCTCACCAA GGCCAGCACATAGGAGAGATGAGCTTCCTA
MetGlnIleMetArgIleLysProHisGln GlyGlnHisIleGlyGluMetSerPheLeu CAGCACAACAAATGTGAATGCAGACCAAAG AAAGATAGAGCAAGACAAGAAAATCCCTGT
GlnHisAsnLysCysGluCysArgProLys LysAspArgAlaArgGlnGluAsnProCys
GGGCCTTGCTCAGAGCGGAGAAAGCATTTG TTTGTACAAGATCCGCAGACGTGTAAATGT
GlyProCysSerGluArgArgLysHisLeu PheValGlnAspProGlnThrCysLysCys TCCTGCAAAAACACAGACTCGCGTTGCAAG GCGAGGCAGCTTGAGTTAAACGAACGTACT
SerCysLysAsnThrAspSerArgCysLys AlaArgGlnLeuGluLeuAsnGluArgThr Bam HI Eco RI Xba I
TGCAGATGTGACggattcgaattcCATCAT CACCACCATCACGCTTCCTAGtctaga 3'
CysArgCysAspGlySerGluPheHisHis HisHisHisHisAlaSer ~ -C

SEQUENCE ID NOS. 17 and 18
Nco I
5' CCATGGAGCTGGCGG~=~ ~C
N- MetGluLeuAlaAlaLeuCys CGCT~GGGGCTC~lLCl~GCG~l~l~ G CCCGGAGCCGCGAGCACCCAAGTGTGCACC
ArgTrpGlyLeuLeuLeuAlaLeuLeuPro ProGlyAlaAlaSerThrGlnValCysThr
Bam HI
ggatccGCCCTGGCCCAGAAGCGCGACAAC GTGCTCTTCCAGGCAGCTACCGACGAGCAG
GlySerAlaLeuAlaGlnLysArgAspAsn ValLeuPheGlnAlaAlaThrAspGluGln CCGGCCGTGATCAAGACGCTGGAGAAGCTG GTCAACATCGAGACCGGCACCGGTGACGCC
ProAlaValIleLysThrLeuGluLysLeu ValAsnIleGluThrGlyThrGlyAspAla GAGGGCATCGCCGCTGCGGGCAACTTCCTC GAGGCCGAGCTCAAGAACCTCGGCTTCACG
GluGlyIleAlaAlaAlaGlyAsnPheLeu GluAlaGluLeuLysAsnLeuGlyPheThr
GTCACGCGAAGCAA~'~G~GCGTG~TG GTGGGCGACAACATCGTGGGCAAGATCAAG
ValThrArgSerLysSerAlaGlyLeuVal ValGlyAspAsnIleValGlyLysIleLys GGCCGCGGCGGCAAGAA~-l~GG~ GATG ~CGCACATGGAC~ lACCTCAAGGGC
GlyArgGlyGlyLysAsnLeuLeuLeuMet SerHisMetAspThrValTyrLeuLysGly ATTCTCGC&AAGGCCCC~llC~GC~l~GAA GGC&ACAA&GCCTACGGCC~ ~ ATCGCC
IleLeuAlaLysAlaProPheArgValGlu GlyAspLysAlaTyrGlyProGlyIleAla GACGACAAGGGCGGCAACGCGGTCATCCTG CACACGCTCAAGCTGCTGAAGGAATACGGC
AspAspLysGlyGlyAsnAlaValIleLeu HisThrLeuLysLeuLeuLysGluTyrGly GTGCGCGACTACGGCACCATCACCGTGCTG TTCAATACCGACGAGGAAAAGGGTTCCTTC
ValArgAspTyrGlyThrIleThrValLeu PheAsnThrAspGluGluLysGlySerPhe GGCTCGCGCGACCTGATCCAGGAAGAAGCC AAGCTGGCCGACTACGTG~-lC~C~ ~AG
GlySerArgAspLeuIleGlnGluGluAla LysLeuAlaAspTyrValLeuSerPheGlu CCCACCAGCGCAGGCGACGAAAAACTCTCG CTGGGCACCTCGGGCATCGCCTACGTGCAG
ProThrSerAlaGlyAspGluLysLeuSer LeuGlyThrSerGlyIleAlaTyrValGln GTCCAAATCACCGGCAAGGCCTCGCATGCC GGCGCCGCGCCCGAGCTGGGCGTGAACGCG
ValGlnIleThrGlyLysAlaSerHisAla GlyAlaAlaProGluLeuGlyValAsnAla AGG~ GA~C~l~Gl~C~GCGC ACGATGAACATCGACGACAAGGCGAAGAAC
LeuValGluAlaSerAspLeuValLeuArg ThrMetAsnIleAspAspLysAlaLysAsn CTGCGCTTCCAATGGACCATCGCCAAGGCC GGCCAAGTCTCGAACATCATCCCCGCCAGC
LeuArgPheGlnTrpThrIleAlaLysAla GlyGlnValSerAsnIleIleProAlaSer GCCACGCTGAACGCCGACGTGCGCTACGCG CGCAACGAGGACTTCGACGCCGCCATGAAG
AlaThrLeuAsnAlaAspValArgTyrAla ArgAsnGluAspPheAspAlaAlaMetLys ACGCTGGAAGAGCGCGCGCAGCAGAAGAAG CTGCCCGAGGCCGACGTGAAGGTGATCGTC
ThrLeuGluGluArgAlaGlnGlnLysLys LeuProGluAlaAspValLysValIleVal ACGCGCGGCCGCCCGGCCTTCAATGCCGGC GAAGGCGGCAAGAAGCTGGTCGACAAGGCG
ThrArgGlyArgProAlaPheAsnAlaGly GluGlyGlyLysLysLeuValAspLysAla GTGGCCTACTACAAGGAAGCCGGCGGCACG CTGGGCGTGGAAGAGCGCACCGGCGGCGGC
ValAlaTyrTyrLysGluAlaGlyGlyThr LeuGlyValGluGluArgThrGlyGlyGly ACCGACGCGGCCTACGCCGCGCTCTCAGGC AAGCCAGTGATCGAGAGCCTGGGCCTGCCG
ThrAspAlaAlaTyrAlaAlaLeuSerGly LysProValIleGluSerLeuGlyLeuPro GGCTTCGGCTACCACAGCGACAAGGCCGAG TACGTGGACATCAGCGCGATTCCGCGCCGC
GlyPheGlyTyrHisSerAspLysAlaGlu TyrValAspIleSerAlaIleProArgArg Eco RI
CTGTACATGGCTGCGCGCCTGATCATGGAT CTGGGCGCCGGCAAGgaattcGGAGGTGGC
LeuTyrMetAlaAlaArgLeuIleMetAsp LeuGlyAlaGlyLysGluPheGlyGlyGly Kpn I GGAggtaccGCACCCATGGCAGAAGGAGGA GGGCAGAATCATCACGAAGTGGTGAAGTTC
GlyGlyThrAlaProMetAlaAspGlyGly GlyGlnAsnHisHisGluValValLysPhe ATGGATGTCTATCAGCGCAGCTACTGCCAT CCAATCGAGACCCTGGTGGACATCTTCCAG
MetAspValTyrGlnArgSerTyrCysHis ProIleGluThrLeuValAspIlePheGln GAGTACCCTGATGAGATCGAGTACATCTTC AAGCCATCCTGTGTGCCCCTGATGCGATGC
GluTyrProAspGluIleGluTyrIlePhe LysProSerCysValProLeuMetArgCys GGGGGCTGCTGCAATGACGAGGGCCTGGAG TGTGTGCCCACTGAGGAGTCCAACATCACC
GluGluCysCysAsnAspGluGlyLeuAsp CysValProThrGluGluSerAsnIleThr ATGCAGATTATGCGGATCAAACCTCACCAA GGCCAGCACATAGGAGAGATGAGCTTCCTA
MetGlnIleMetArgIleLysProHisGln GlyGlnHisIleGlyGluMetSerPheLeu Bam HI Eco RI
CAGCACAACAAATGTGAATGCAGACCAAAG AAAGATggattcgaattcCATCATCACCAC
GlnHisAsnLysCysGluCysArgProLys LysAspGlySerGluPheHisHisHisHis 109 Xba I
CATCACGCTTCCTAGtctaga 3' HisHisAlaSer ~ -C

SEQUENCE ID NOS. 19 and 20
Nco I
5' CCATGCAC~GC&~G~l-GClGClGGCCGTC G~ Cl~CG~YX~AGAC-~ CC
N- MetGlnSerLysValLeuLeuAlaVal AlaLeuTrpLeuCysValGluThrArgAla ~lC~ CCTA~l~~ lL-ll GA1~1~CCCA~-l~AGCATAC~AAAA&AC
AlaSerValGlyLeuProSerValSerLeu AspLeuProArgLeuSerIleGlnLysAsp ATAcTTAcAAlur~GGcT~TAc ~ CTCTT CP~ ~ A~ 45XXG37~:~GAGC~CTI~
IleLeuThrIleLysAlaAsnThrThrLeu GlnIleThrCysArgGlyGlnArgAspLeu GACTGG~ ~CCCAATAATCAGAGTGGC AGTGAGCAAAGGGTGGAGGTGACTGAGTGC
AspTrpLeuTrpProAsnAsnGlnSerGly SerGluGlnArgValGluValThrGluCys AGCGATG~-~-r~rl~lGlAAGACACTCACA ATTCCAAAAGTGATCGGAAATGACACTGGA
SerAspGlyLeuPheCysLysThrLeuThr IleProLysValIleGlyAsnAspThrGly GCCTACAA~~ ACCGGGAAACTGAC TTGGC~1~W1~ATTTA1~1~1ATGTTCAA
AlaTyrLysCysPheTyrArgGluThrAsp LeuAlaSerValIleTyrValTyrValGln GATTACAG~TCTCCATTT~ AGTGACCAACATGGA~~ ACATT~CT
AspTyrArgSerProPheIleAlaSerVal SerAspGlnHisGlyValValTyrIleThr G~GAAcAAAAAcAAAA~~ ~ATTccA ~ ~l~CATTTCAAATCTCAACGTG
GluAsnLysAsnLysThrValValIlePro CysLeuGlySerIleSerAsnLeuAsnVal TCA~~ AAGATACCCAGAAAAGA~A ~ ATGGTAA~hGA~l~ ~&
SerLeuCysAlaArgTyrProGluLysArg PheValProAspGlyAsnArgIleSerTrp GACAGCAAGAA~G~ ACTATTCCCAGC TACATG~TCAGCTA~l~-l~&C~
AspSerLysLysGlyPheThrIleProSer TyrMetIleSerTyrAlaGlyMetValPhe T&TGAAG~AAAAATTA~GATGAAAGTTAC CAGTCTATTATGTACATA~ A
CysGluAlaLysIleAsnAspGluSerTyr GlnSerIleMetTyrIleValValValVal GGGTATAGGATTTATG~l~~ ~AGT CCGTCTCAT&GAATT&AACTAlCu'Gll~GA
GlyTyrArgIleTyrAspValValLeuSer ProSerHisGlyIleGluLeuSerValGly GAAAAGCl~ AA~TTGTACAGCAAGA ACTGAA~TAAATGTGGGGATTGACTTCAAC
GluLysLeuValLeuAsnCysThrAlaArg ThrGluLeuAsnValGlyIleAspPheAsn T~GGAATAGCC~ ~AAGCATCAGCAT AAGAAA~ AAAccGAGAccTAAAA~cc
TrpGluTyrProSerSerLysHisGlnHis LysLysLeuValAsnArgAspLeuLysThr CAGTC~G~Ç~GTGAGATGAAGAAA~ lG AGCACCTTAACTATAGATGGTGTAACCCGG
GlnSerGlySerGluMetLysLysPheLeu SerThrLeuThrIleAspGlyValThrArg AGTGACCPL~G~TTGTACA~-l~l~X~4GCA TCC~ XX~C-l~AIY~CC~iGh~;G~C~GC
SerAspGlnGlyLeuTyrThrCysAlaAla SerSerGlyLeuMetThrLysLysAsnSer A~A~ 'CA~:GGTCCA5Y ~ C~ VJ' GlUrG~ GTGGCATC~;AA'lCTClG
ThrPheValArgValHisGluLysProPhe ValAlaPheGlySerGlyMetGluSerLeu GT&GA~GCCA~ ~AGCG~l~AGA A~ GAAGTAL~-l-lw llACCCACCC
ValGluAlaThrValGlyGluArgValArg IleProAlaLysTyrLeuGlyTyrProPro CCAGAAArrAA~ATG~TATAAAAATGGAATA CC~ ~AGTCCAATCACACAATTAAAGCG
ProGluIleLysTrpTyrLysAsnGlyIle ProLeuGluSerAsnHisThrIleLysAla ~GGCATG'~ACTGACGATTAT&G~AGTGAGT GAAAGAGACACAGGAAATTACACTGTCATC
GlyHisValLeuThrIleMetGluValSer GluArgAspThrGlyAsnTyrThrValIle C~TAC~ TCCCATTTCAAA~GAGAAGCAG AGccA~ ATGTc
LeuThrAsnProIleSerLysGluLysGln SerHisValValSerLeuValValTyrVal CCACCCCAGA~ GAGAAATCTCTAATC ~l~iCC~l~A1~-lACCAGTAC&GCACC
ProProGlnIleGlyGluLysSerLeuIle SerProValAspSerTyrGlnTyrGlyThr ACTCAAACGC~&ACATGTAC~-lATGCC Al-lC~l~L~-~L~'ATCACATCCACT ~ T~T
ThrGlnThrLeuThrCysThrValTyrAla IleProProProHisHisIleHisTrpTyr TGGCAGTTGGAGGAAGAGTGCGCCAACGAG CCCAGCCAAG~~ AGTGAC~AACCCA
TrpGlnLeuGluGluGluCysAlaAsnGlu ProSerGlnAlaValSerValThrAsnPro TA~ ~AAGAATGGAGA~lGl~AG GA-~ ~A~Gr-Arr-~A~T~AAATT&AAGTT
TyrProCysGluGluTrpArgSerValGlu AspPheGlnGlyGlyAsnLysIleGluVal AATAL~TCAA~ AAT~GAAGG~ AAAAACAAAACTGT~AGTA~-GC~ ATC
AsnLysAsnGlnPheAlaLeuIleGluGly LysAsnLysThrValSerThrLeuValIle CAAGCGGCAAAl~l~l~A~Gl~ ACAAA TGTGAA~-~lCAACAAAGTC&GGA~AGGA
GlnAlaAlaAsnValSerAlaLeuTyrLys CysGluAlaValAsnLysValGlyArgGly GAGAGGG~A~ ~Cl~ -ACGTGACCAGG ~l~-l~AAATTACrl-l~LAACCTGA~ATG
GluArgValIleSerPheHisValThrArg GlyProGluIleThrLeuGlnProAspMet CAGCCCACTGAGCAGGAGAGCGl~l~W~ GC~CTr~r~ GATCTAC~~ ~AG
GlnProThrGluGlnGluSerValSerLeu TrpCysThrAlaAspArgSerThrPheGlu AACCTCAC:ATC~TAC~AG~~ CCACAG ~Cl~-l~-AATCCAl~l~GAGAGTTGCCC
AsnLeuThrTrpTyrLysLeuGlyProGln ProLeuProIleHisValGlyGluLeuPro ACACCTGlYT~C~GA~CTTGGAT~CTCTT T~GAAATTGAATGCCACCA~ AAT
ThrProValCysLysAsnLeuAspThrLeu TrpLysLeuAsnAlaThrMetPheSerAsn A~CACAAATGACATTrTGATCAT&GAGCTT AAGAATGCA~ ~AGGACCAAGGAGAC
SerThrAsnAspIleLeuIleMetGluLeu LysAsnAlaSerLeuGlnAspGlnGlyAsp T~lG~ GCCTTGCTCAA~ACAGGAAGACC AAGAAAAGAcAllb~l&GTcAGGcAGcTc
TyrValCysLeuAlaGlnAspArgLysThr LysLysArgHisCysValValArgGlnLeu ~CA&-lG~lAG~GC-~l ~ CCCACGATC ~CAGGAAACCTGGAGAATCAGA~G~CAAGT
ThrValLeuGluArgValAlaProThrIle ThrGlyAsnLeuGluAsnGlnThrThrSer A'~ ~AAGCATCGAAGTCTCATGC~CG GCATCTGGGAATCCCCC*CCACAGATCATG
IleGlyGluSerIleGluValSerCysThr AlaSerGlyAsnProProProGlnIleMet ~ AAAGATAATGAGACCCl~ AGAA GACTCAGGCATTGTATTGAAGGATGGGAAC
TrpPheLysAspAsnGluThrLeuValGlu AspSerGlyIleValLeuLysAspGlyAsn CGGAA~CTCACTA~CCGCAGAGTGAGGAAG &AG&Ac&AAGGcL~ AcA~-~~ AGGcA
ArgAsnLeuThrIleArgArgValArgLys GluAspGluGlyLeuTyrThrCysGlnAla
TGCA~L7~l-lC~ ~l~AAGTGGAG GC~~ ~TAz~TAGAAG~l~Lc~
CysSerValLeuGlyCysAlaLysValGlu AlaPhePheIleIleGluGlyAlaGlnGlu AAGACGAAC~u~ AAATCATTATTCTAGTA &~CA~'~G~ w I~ATTGCCAl~~ w
LysThrAsnLeuGluIleIleIleLeuVal GlyThrAlaValIleAlaMetPhePheTrp

CTAC~ ~ATC~TCCl'ACGGACCGTT AAGC&GGCCAATGGAGGGGAACTGAAGACA
LeuLeuLeuValIleIleLeuArgThrVal LysArgAlaAsnGlyGlyGluLeuLysThr GGC~AL~ CATCGTCAT~GATC~AGAT GAAC~CCCATTGGATCAAC~~ ~AACGA
GlyTyrLeuSerIleValMetAspProAsp GluLeuProLeuAspGluHisCysGluArg Eco RI Hind III
~l-~c~ AT&ATGccAGcAAATGGGAATTc aag~ttcc.yy~Lcgacatcgat~gcag
LeuProTyrAspAlaSerLysTrpGluPhe LysLeuProGlyValAspIleAspGluGln Xba I
a~tgatatccgagga~gacctg~actagt~taga 3'
LysLeuIleSerGluGluAspLeuAsn ~ -C