PKS gene constructs were transformed into E. coli BL21-. For protein isolation, cultures were grown at 37 °C to an OD of approximately 0.6, at which point transcription of the cDNA inserts was induced with IPTG, and the cultures were then grown for an additional 10–12 h at 28 °C. His-tagged protein was isolated from the bacteria using the MagneHis Protein Purification System. PKS enzyme assays contained 4 µg protein, 100 mM KPO4, 5 mM malonyl-CoA, and either 250 µM hexanoyl-CoA or 67.5 µM 4-coumaroyl-CoA as substrates. Boiled protein was assayed in parallel with all reactions, and all of these negative controls showed a lack of product formation. Reactions were incubated for 1 h at 30 °C, dried in a Speed Vac, resuspended in 40 µl methanol, and applied to an Agilent Technologies 1200 HPLC with a Spherisorb 6 µm ODS2 separation column. Products and reactants were resolved across a gradient of 1% H3PO4 to 100% acetonitrile, with molecular weights determined by LC-MS.
Quantitative PCR (qPCR) reactions were performed as described previously using the primers listed in Supplementary Table 4B at JXB online. Equivalent quantities of RNA isolated from glands and from inflorescence-associated leaves were used to generate the respective single-stranded cDNAs. qPCR reactions containing equal quantities of gland or leaf cDNA were run in duplicate on a LightCycler qPCR machine, along with reactions containing standards consisting of 100-fold sequential dilutions of isolated target fragments. LightCycler software was used to generate standard curves covering a 10^6-fold range, to which the gland and leaf data were compared. Two biological replicates were used to generate the means and standard deviations shown in Supplementary Table 4A at JXB online. These values were used to compute the gland over leaf ratios and P-values shown in Supplementary Table 4A at JXB online.
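The standard-curve comparison described above can be illustrated with a minimal sketch. It is not the LightCycler software's algorithm; it assumes the usual linear relationship between threshold cycle (Ct) and log10 of starting template quantity. All Ct values, template quantities, and function names below are hypothetical and chosen only for illustration.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate starting quantity from a Ct."""
    return 10 ** ((ct - intercept) / slope)


def mean_quantity(ct_replicates, slope, intercept):
    """Average the quantities estimated from technical replicates."""
    estimates = [quantity_from_ct(ct, slope, intercept) for ct in ct_replicates]
    return sum(estimates) / len(estimates)


# Hypothetical standards: 100-fold serial dilutions spanning a 10^6-fold range.
standard_quantities = [1e7, 1e5, 1e3, 1e1]
standard_ct = [12.0, 18.6, 25.2, 31.8]  # invented Ct values

slope, intercept = fit_standard_curve(standard_quantities, standard_ct)

# Hypothetical duplicate reactions for one target in gland and leaf cDNA.
gland_quantity = mean_quantity([15.3, 15.3], slope, intercept)
leaf_quantity = mean_quantity([25.2, 25.2], slope, intercept)

gland_over_leaf = gland_quantity / leaf_quantity
print(f"slope={slope:.2f}, intercept={intercept:.2f}, ratio={gland_over_leaf:.1f}")
```

In this sketch the duplicate reactions are converted to quantities before averaging, so the gland over leaf ratio is a ratio of estimated template amounts rather than a difference of Ct values.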
Raw relative expression data, means, standard deviations, P-values from gland versus leaf t tests, qPCR primer sequences, and representative real-time qPCR tracings are shown in Supplementary Table 4A at JXB online.
Anatomical study has revealed that glands located on mature floral bracts of female plants are the site of enhanced secondary metabolism leading to the production of THCA and other compounds in Cannabis sativa. These glands are located on multicellular stalks and are typically composed of eight cells. The outer gland surface is composed of a smooth capsule covered by a membrane. The capsule contains exudates derived from the gland cells. The weakly attached glands can easily be separated from the bracts and purified, as shown in Fig. 1E and F. An EST library was constructed using RNA isolated from purified glands. Over 100 000 ESTs were cloned. Plasmid DNA was isolated and sequenced from over 2000 clones. Because of the directed orientation of cDNA insertion, sequences are expected to represent the coding strand. After the removal of vector-only sequences, poor quality sequences, and sequences obviously originating from organelles or ribosomal RNA, the remaining sequences were clustered into 1075 unigenes. Overall, 111 of the unigenes were contigs containing two or more closely related ESTs. Only 14 contigs lacked a similar sequence in the NCBI database. Nine hundred and sixty-four of the ESTs were found only once and, of these, 710 were similar to sequences in the NCBI database. The three unigenes representing the greatest number of ESTs encoded proteins related to metallothionein, RD22-like BURP domain-containing proteins, and chitin-binding hevein-like proteins. All three of these proteins have functions related to biotic or abiotic stress responses. Gene Ontology analysis was performed on the sequence dataset. An analysis of biological function indicates that 27% of the unigenes encode proteins with metabolic activity.
Unigenes with NCBI matches encoding proteins of unknown function comprise 14% of the total, and another 28% are predicted to be involved in various cellular processes such as protein synthesis and protein degradation.
The specific biochemical steps leading to THCA are proposed to begin with a reaction involving a type III PKS enzyme that catalyses the synthesis of olivetolic acid from hexanoyl-CoA and three molecules of malonyl-CoA. Malonyl-CoA is derived from the carboxylation of acetyl-CoA. ESTs encoding acetyl-CoA carboxylase were identified. Hexanoyl-CoA could be produced by more than one pathway in the trichomes. One route to hexanoyl-CoA would involve the early termination of the fatty acid biosynthetic pathway, yielding hexanoyl-ACP. The hexanoyl moiety would then either be transferred to CoA by the action of an ACP-CoA transacylase, or be cleaved by the action of a thioesterase, yielding n-hexanol, which would then be converted into n-hexanoyl-CoA by the action of acyl-CoA synthase. Most of the enzymes needed for this route are represented in the EST database, except for the transacylase and 2,3-trans-enoyl-ACP reductase. A second route to hexanoyl-CoA would involve the production of hexanol from the breakdown of the fatty acid linoleic acid via the lipoxygenase (LOX) pathway. A survey of the sequenced ESTs revealed candidate genes encoding the enzymes needed to synthesize linoleic acid from acetyl-CoA by the typical fatty acid biosynthetic pathway in plastids, followed by the production of hexanol from linoleic acid via the LOX pathway. A third pathway, related to the biosynthesis of branched-chain amino acids, has been proposed to be involved in the production of short-chain and medium-chain fatty acids. However, the enzymes in this pathway (2-isopropylmalate synthase, 3-isopropylmalate dehydratase, 3-isopropylmalate dehydrogenase, and 2-oxoisovalerate dehydrogenase) were not represented in the Cannabis trichome EST library.
After the formation of olivetolic acid, a prenyltransferase is proposed to add a prenyl group derived from geranyl diphosphate (GPP) to create cannabigerolic acid. GPP is derived from the fusion of two isoprene units. Two different biochemical pathways support the synthesis of isoprenoids in plants. Within the list of unigenes, all but one of the enzymatic activities needed to convert pyruvate and glyceraldehyde-3-phosphate into isopentenyl and dimethylallyl diphosphate via the methylerythritol 4-phosphate pathway were represented. This finding is consistent with isotopic studies showing that the GPP cannabinoid precursors are synthesized via this pathway. The formation of GPP is mediated by GPP synthase. Several unigenes related to GPP synthase were identified; however, they were more closely related to other terpene synthases. In particular, CAN36 and CAN55, which were possibly derived from the same gene, and the closely related CAN37 are most similar to hop sesquiterpene synthases HISTS1 and HISTS2, with an average identity of 56% over the first 160 amino acid residues. CAN41 is most similar to hop monoterpene synthase HIMTS2.
The nature of the prenyltransferase is unknown. However, previous studies identified a soluble aromatic geranylpyrophosphate:olivetolate geranyltransferase with the appropriate activity in the extract of young leaves. The only EST encoding a predicted prenyltransferase was CAN121. However, the encoded protein is more closely related to members of the membrane-bound, chloroplast-localized family of prenyltransferases than to soluble prenyltransferases. The final step in the pathway is catalysed by THCA synthase, which converts cannabigerolic acid to THCA.
Two ESTs with sequences identical to the previously reported THCA synthase were identified.
Whereas the nature of the prenyltransferase responsible for the synthesis of cannabigerolic acid is unknown, three unigenes, CAN24, CAN383, and CAN1069, comprising eight, one, and two ESTs, respectively, could encode the PKS activity needed to synthesize olivetolic acid. These were therefore characterized in more detail. All three unigenes were represented by individual ESTs encoding complete PKS polypeptides. These were sequenced and compared to related PKS sequences. CAN1069 was identical to a previously identified Cannabis gene encoding a chalcone synthase (CHS), and is the most closely related of the PKS sequences to other known chalcone synthases from hop and Arabidopsis. The relationship of hop phlorisovalerophenone synthase (VPS), which mediates the conversion of malonyl-CoA and isovaleryl-CoA to phlorisovalerophenone, to CAN24 and CAN383 is less clear. CAN24 and CAN383 show 64.6% identity to each other and are nearly equally similar to hop VPS, at 71.2% and 72.0%, respectively. The enzymatic activities encoded by CAN24 and CAN1069 were explored in detail. The coding regions of the two genes were inserted into the pHis8 vector in frame with a His8 tag. The tagged proteins were purified on a nickel-containing magnetic bead matrix and were assayed for chalcone synthase and olivetol/olivetolic acid synthase activities. Recombinant protein from CAN1069, but not CAN24, produced reaction products when incubated with coumaroyl-CoA and malonyl-CoA. The reaction products were analysed by LC-MS, and peak 2 was found to have a molecular mass and absorption spectrum consistent with naringenin, the major product of chalcone synthases.
Both CAN24 and CAN1069 were capable of using malonyl-CoA and hexanoyl-CoA as reaction substrates, and LC-MS indicated that the products of the two enzymes were the same; however, neither the molecular mass nor the absorption spectrum of this product matched olivetol or olivetolic acid. Results similar to those for CAN24 were obtained using protein purified from CAN383.
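The pairwise identities quoted above for CAN24, CAN383, and hop VPS depend on how identity is counted over an alignment. The source does not state the convention used, so the following is only a minimal sketch under one common assumption: identical residues are counted over aligned columns in which neither sequence has a gap. The function name and the toy sequences are hypothetical and are not taken from the Cannabis or hop proteins.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the ungapped columns of a pairwise alignment.

    Both inputs must already be aligned (equal length, gaps as `gap`).
    Assumption: columns with a gap in either sequence are excluded from
    the denominator; other conventions give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy pre-aligned peptide fragments (invented for illustration only).
seq_a = "MKTAYIAKQR"
seq_b = "MKTA-IAKHR"

identity = percent_identity(seq_a, seq_b)
print(f"{identity:.1f}% identity over ungapped columns")
```

Counting over the full alignment length, or over the length of the shorter sequence, would lower the value whenever gaps are present, which is one reason identity figures from different tools are not always directly comparable.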
Genes required for THCA production are probably more highly expressed in glands of pistillate inflorescences because this is where THCA is most highly concentrated. To test this hypothesis, the relative expression levels of selected unigenes in isolated glands versus young inflorescence-associated leaves were compared using real-time qPCR. The identity of the genes assayed and the differences in relative expression levels are listed in Table 2 and in Supplementary Table 4A at JXB online. Consistent with this hypothesis, THCA synthase expression was 437 times higher in isolated glands than in leaves. CAN24 expression was 1600 times higher in glands of the inflorescence than in associated leaves. CAN1069, encoding CHS, was also more highly expressed in glands than in leaves. A third PKS, CAN383, was expressed at similar levels in glands and leaves. These results are not explained by poor RNA isolation from leaves, as unigene CAN219, encoding chlorophyll A/B binding protein, showed elevated leaf expression levels. The activities of several housekeeping genes were also tested. A relatively modest increase in the expression of histone H2A and beta tubulin in glands compared to leaves was detected. The increase in expression levels of these latter two genes might reflect a combination of the heightened metabolic activity and the unique cellular structure of glandular trichomes. Two different pathways could provide the hexanol required for olivetolic acid synthesis, as shown in Fig. 2. Expression levels provide support for the de novo pathway as a primary source, given that CAN498, CAN82, and CAN915 were much more highly expressed in glands than in leaves, whereas the relative expression of genes encoding enzymes in the lipid breakdown pathway was depressed or only modestly elevated in glands.
Eighty Cannabis unigenes were similar to transcription factors found in Arabidopsis, and 11 contain MYB DNA binding domains.
Expression of four MYB genes in isolated glands and leaves was compared by real-time qPCR. CAN833 and CAN738 exhibited 954-fold and 586-fold higher expression in glands, respectively, whereas CAN483 and CAN792 showed more modest induction in glands. None of the other putative transcription factors that were assayed showed the same degree of differential expression as CAN833 and CAN738.
The identities of the most abundant ESTs derived from the glandular trichomes of Cannabis sativa are consistent with the protective function of plant glands. For example, the most abundant ESTs encoded a protein closely related to type II metallothioneins. These proteins bind heavy metals such as Cd, Zn, and Cu, and their proposed primary function is the maintenance of Cu tolerance. The second most abundant class of ESTs encoded an RD22-like BURP domain-containing protein. This class of proteins contains a hydrophobic N-terminal signal peptide and an N-terminal conserved region followed by a series of small repeats. The BURP domain of approximately 230 amino acids is located in the C-terminal region. The function of RD22-like proteins is unknown, but some members of this class of genes are induced by dehydration. The third most abundant ESTs encoded a protein containing a hevein domain. Hevein domains contain a conserved 43-amino acid motif that binds chitin, and members of this protein class are known for anti-fungal activity. The unique secondary metabolism in Cannabis may also play a role in plant defence. Synthesis of THCA is extracellular and results in hydrogen peroxide production, which has general antimicrobial properties, and a recent report further indicates that THCA may directly inhibit microbial growth. The analysis of gland-derived ESTs has identified nearly all the candidate genes required for THCA synthesis from primary metabolic products.