Secretory expression of human insulin precursor in Pichia pastoris employing truncated α-factor leader sequence and a short C-peptide

In the past ten years, diabetes prevalence has increased rapidly in low- and middle-income countries due to lifestyle changes. The growing number of diabetic patients has escalated the demand for recombinant insulin and created a large global insulin market. Pichia pastoris has emerged as an alternative host for producing recombinant proteins and is well suited to large-scale production of recombinant proteins for therapeutic use. In this study, we attempted to express the insulin precursor (IP) in P. pastoris. We used a synthetic IP-encoding gene constructed in frame with the truncated α-factor secretory signal, with a short C-peptide (DGK) linking the A- and B-chains of human insulin, in a pD902 expression vector. Several zeocin-resistant clones were obtained and verified by PCR using AOX1-specific primers, both for integration of the expression cassette into the P. pastoris genome and for identification of Mut phenotypes. Secretion of IP by the P. pastoris clone into the culture supernatant was confirmed by SDS-PAGE, where a single band of secreted IP with a molecular mass above 6.5 kDa was found.


Introduction
Insulin is a naturally occurring polypeptide hormone produced by the β-cells of the pancreas, and it plays an essential role in regulating blood glucose levels in human metabolism (Kjeldsen et al. 2002). Consequently, it is also the most important drug for reducing blood sugar in the treatment of all types of insulin-deficient diabetic patients. Insulin demand continues to increase every year, together with the increasing number of diabetic patients around the world. The World Health Organization (WHO) estimated that around 422 million people suffered from diabetes mellitus in 2014 (WHO 2016), and this number is predicted to increase to 642 million by 2040 (Ogurtsova et al. 2017). A prolonged hyperglycemic condition caused by diabetes often leads to multiple organ damage and failure. As such, diabetes has become one of the leading causes of death globally; it caused 1.5 million deaths in 2012 (WHO 2016).
To meet insulin demand, large-scale recombinant human insulin production currently employs two major systems, which use Escherichia coli and Saccharomyces cerevisiae as hosts (Baeshen et al. 2014). With E. coli as the expression host, the insulin precursor (IP) is expressed in the form of inclusion bodies that need solubilization and refolding procedures (Nilsson et al. 1996). The other system uses yeast expression hosts (mainly S. cerevisiae), where the soluble IP is secreted into the culture supernatant (Thim et al. 1986; Kjeldsen 2000). Though half of the world insulin supply is produced using the S. cerevisiae system (Meehl and Stadheim 2014), several alternative yeast hosts are available for recombinant protein expression (Porro et al. 2005). The methylotrophic yeast Pichia pastoris is one of the promising yeast hosts for recombinant protein expression. It has several excellent characteristics, such as its ability to grow to a very high cell density and to express a high amount of recombinant protein under the control of strong and tightly regulated promoters (Ahmad et al. 2014). It also secretes low levels of native proteins, which simplifies the purification of secreted recombinant proteins (Cregg et al. 1993). Several reports on the secretory expression of IP in P. pastoris have shown that P. pastoris is applicable for IP production (Kjeldsen et al. 1999; Wang et al. 2001; Xie et al. 2008; Gurramkonda et al. 2009; Baeshen et al. 2016; Polez et al. 2016).
Human insulin has a molecular weight of 5.8 kDa and is composed of 51 amino acids. Insulin synthesis begins with the formation of a single polypeptide, named preproinsulin, in the pancreatic β-cells. Preproinsulin has a 24-residue signal peptide that directs the translocation of the immature polypeptide into the endoplasmic reticulum (ER). Proinsulin is then formed in the ER after removal of the signal peptide and folds into its correct conformation with three disulphide bonds. Cellular endopeptidases cleave 34 amino acids, termed the C-peptide, from the folded proinsulin to form mature insulin. Thus, mature insulin consists of two chains, the A- and B-chain, containing 21 and 30 amino acids, respectively. Two interchain disulphide bonds link the A- and B-chain polypeptides. Additionally, the A-chain has an intrachain disulphide bond (Fu et al. 2013; Baeshen et al. 2014).
The secretory expression of various heterologous proteins in S. cerevisiae and P. pastoris has utilized the S. cerevisiae mating factor α (α-factor) prepro-leader (Kjeldsen et al. 1999). It consists of a 19-residue signal (pre) sequence followed by a 67-residue pro-sequence containing three consensus N-linked glycosylation sites and a dibasic Kex2 endoprotease processing site (Kurjan and Herskowitz 1982). Since the secretion of proinsulin expressed in S. cerevisiae is inefficient, a cDNA encoding a proinsulin-like molecule (with deletion of Thr at B30) was fused with the S. cerevisiae α-factor leader and the human proinsulin C-peptide was replaced with a short C-peptide such as Ala-Ala-Lys (AAK), which results in efficient secretion of proinsulin-like molecules into the culture supernatant (Kjeldsen 2000). Several studies employed the full-length α-factor leader sequence and the short C-peptide AAK for secretory expression of insulin in P. pastoris (Kjeldsen et al. 1999; Xie et al. 2008; Gurramkonda et al. 2009; Polez et al. 2016), which yielded up to 3.6 g/L of secreted IP. Another study reported that a strain employing the short C-peptide Asp-Gly-Lys (DGK) expressed a higher yield of IP than a strain employing the C-peptide AAK (Kjeldsen et al. 2012). On the other hand, Lin-Cereghino et al. (2013) determined the influence of deletions and substitutions in the prepro region of the α-mating factor secretion signal sequence by site-directed mutagenesis. They reported that deletion of part of the second alpha helix (MATα: ∆30-43) and removal of the entire last helix, which included some N-terminal residues (MATα: ∆57-70), increased secretion of the reporter protein horseradish peroxidase (HRP). In this study, we employed a truncated α-factor leader sequence consisting of a 57-amino-acid secretion signal, derived from the S. cerevisiae α-factor with deleted regions at the second alpha helix (MATα: ∆30-43) and the entire last helix (MATα: ∆57-70), plus a Kex2 (KR) cleavage site and a short C-peptide (DGK), for secretory expression of the insulin precursor in P. pastoris.
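The deletion notation used above (MATα: ∆30-43, ∆57-70) refers to residue positions in the original, undeleted leader. The following is a minimal sketch of how two such deletions could be applied to a sequence without the first deletion shifting the coordinates of the second. The function name `apply_deletions` and the placeholder sequence are hypothetical; the placeholder is not the real α-factor leader and its length is arbitrary.

```python
def apply_deletions(sequence, deletions):
    """Remove 1-based, inclusive residue ranges from a protein sequence.

    All ranges are interpreted in the numbering of the ORIGINAL sequence,
    which is how notation such as "delta 30-43" is normally read.
    """
    drop = set()
    for start, end in deletions:
        if not (1 <= start <= end <= len(sequence)):
            raise ValueError(f"range {start}-{end} is outside the sequence")
        drop.update(range(start, end + 1))
    # Keep every residue whose original position was not marked for deletion.
    return "".join(aa for pos, aa in enumerate(sequence, start=1) if pos not in drop)


# Hypothetical placeholder leader (NOT the real alpha-factor sequence):
# 80 letters cycling through the 20 amino acid codes, just so the code runs.
PLACEHOLDER_LEADER = "".join("ACDEFGHIKLMNPQRSTVWY"[i % 20] for i in range(80))

truncated = apply_deletions(PLACEHOLDER_LEADER, [(30, 43), (57, 70)])
removed = len(PLACEHOLDER_LEADER) - len(truncated)
print(f"residues removed by the two deletions: {removed}")
```

Each range spans 14 positions, so whatever the true leader length is, the two deletions together remove 28 residues.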

Cloning of IP expression vector in E. coli
The Pichia integrative vector pD902-IP, harbouring an IP expression cassette consisting of a truncated α-factor secretion signal sequence, a spacer peptide, a synthetic IP-encoding gene codon-optimized for expression in P. pastoris, and a short linker connecting the insulin B-chain and A-chain, was provided by ATUM. For subsequent transformation into P. pastoris X33 (Invitrogen, Carlsbad, CA), the pD902-IP plasmid was first cloned in E. coli TOP10 by the CaCl2 heat-shock method as previously described by Sambrook and Russell (2001). The E. coli transformants were selected on LB plates supplemented with Zeocin™ (100 μg/mL). Zeocin™-resistant transformants were inoculated into 2 mL LB medium with 100 μg/mL Zeocin™ and cultured overnight at 37°C with shaking. Plasmid DNA was isolated with the QIAprep Spin Miniprep Kit (Qiagen, Hilden, Germany) following the manufacturer's instructions. The plasmid was confirmed by restriction analysis using the SacI restriction enzyme.

Transformation of pD902-IP into P. pastoris
The pD902-IP construct was transformed into the P. pastoris X33 strain by electroporation following the EasySelect™ Pichia Expression Kit manual (Invitrogen 2010). About 5-10 μg of pD902-IP plasmid DNA was linearized with SacI in order to obtain efficient integration of the recombinant construct into the Pichia genome. SacI cuts once in the 5′ AOX1 region to linearize the pD902-IP plasmid. A small aliquot of the plasmid digestion mix was run on an agarose gel to check for complete linearization of the plasmid. The linearized plasmid was then purified with the QIAquick PCR Purification Kit (Qiagen, Hilden, Germany). In parallel, P. pastoris cells were prepared for electroporation. An overnight culture (0.01-0.05 mL) of the P. pastoris X33 strain was inoculated into 50 mL of fresh YPD medium (1% (w/v) yeast extract, 2% (w/v) peptone, 2% (w/v) glucose) in a 250 mL flask and allowed to grow overnight at 30°C to an OD600 of 1.3-1.5. The cells were harvested by centrifugation at 1,500 × g for 5 min at 4°C, and the cell pellet was washed twice with ice-cold sterile water (50 mL, then 25 mL). The cell pellet was then washed once with 2 mL of ice-cold 1 M sorbitol. Cells were centrifuged at 1,500 × g for 5 min at 4°C, the cell pellet was resuspended in 1 mL of ice-cold 1 M sorbitol, and the cells were kept on ice. An 80 μL aliquot of the cells was mixed with ∼1 μg of linearized plasmid DNA in 10 μL sterile water. The mixture was then transferred to an ice-cold 0.2 cm electroporation cuvette. The cuvette with the cells was incubated on ice for 5 min and pulsed following the manufacturer's instructions for P. pastoris (Gene Pulser Xcell™ Electroporation System, Bio-Rad). Immediately afterwards, 1 mL of ice-cold 1 M sorbitol was added to the cuvette. The cuvette contents were transferred to a sterile 15 mL tube and incubated at 30°C without shaking for 2 h. To obtain more transformant colonies, 1 mL of YPD medium was added to each tube and the tubes were shaken at 200 rpm for 1 h. Cell suspension in 50-100 μL YPD was plated on YPDS plates (1% (w/v) yeast extract, 2% (w/v) peptone, 2% (w/v) glucose, 1% (w/v) sorbitol, 2% (w/v) agar) containing 100 μg/mL zeocin and incubated at 30°C for 3-10 d until colonies formed.

Protein analysis
The supernatant of culture broth samples was concentrated 10-fold using Centrifugal Filter Units (Merck Millipore, Germany) and analyzed by denaturing 10% polyacrylamide gel electrophoresis using the Tricine buffer system (Haider et al. 2012). Samples (20 μL) were mixed with an equal volume of Tricine sample buffer and boiled for 15 min. Samples were loaded onto the gel (20 μL per lane) and electrophoresed, and the separated polypeptides were visualized using Blue Lightning Stain (Vivantis Technologies Sdn. Bhd., Malaysia). IP protein concentration was analysed by Tris/Tricine/Urea SDS-PAGE, which employs 6 M urea in 15% polyacrylamide to resolve low-molecular-weight protein bands (Okita et al. 2017). Protein concentration in Tris/Tricine/Urea SDS-PAGE was determined with ImageJ, using lysozyme standards ranging from 0.01 to 1.25 mg/mL.
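The densitometric quantification described above amounts to fitting a standard curve of band intensity against known lysozyme concentration and reading the sample concentration off that curve. The sketch below illustrates the idea with an ordinary least-squares line. It assumes a linear response over the standard range, and all intensity values are made-up numbers; the function names are hypothetical and nothing here reproduces the authors' actual ImageJ workflow.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_band(intensity, slope, intercept):
    """Invert the standard curve to estimate mg/mL from a band intensity."""
    return (intensity - intercept) / slope


# Hypothetical lysozyme standards (mg/mL) within the 0.01-1.25 mg/mL range,
# paired with invented integrated band intensities (arbitrary units).
standards_mg_ml = [0.05, 0.25, 0.5, 1.0, 1.25]
band_intensity = [500.0, 2500.0, 5000.0, 10000.0, 12500.0]

slope, intercept = fit_line(standards_mg_ml, band_intensity)

# Invented intensity for an IP band on the same gel.
sample_mg_ml = concentration_from_band(4000.0, slope, intercept)
print(f"estimated IP concentration: {sample_mg_ml:.2f} mg/mL")
```

In practice the sample band should fall inside the range spanned by the standards, and any concentration step applied to the supernatant before loading has to be divided out afterwards.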

Cloning of synthetic IP constructed in the expression vector pD902 in E. coli
In this study, an IP cassette for human IP expression was constructed in the Pichia integrative vector pD902 under the control of the strong, inducible AOX1 promoter. The IP cassette consisted of the S. cerevisiae truncated α-factor secretory signal at the 5' terminus of the cassette, which contains a Kex2 cleavage site (LEKR) at its carboxy-terminal end, followed by a spacer peptide (EEAEAEAEPK) for more efficient Kex2 processing and secretion (Kjeldsen et al. 1999), followed by 29 amino acid residues of the insulin B-chain, a short connecting linker (DGK), and finally 21 amino acid residues of the insulin A-chain (Figure 1a). The α-factor is a leader peptide (prepro-leader peptide) commonly used in S. cerevisiae and P. pastoris for secreted expression of heterologous proteins. However, P. pastoris is often unable to secrete some proteins even when they carry a proper secretion leader. The α-factor leader is divided into two parts: a pre- and a pro-peptide (Lin-Cereghino et al. 2013). The pre-peptide is thought to form an alpha helix that binds the signal recognition particle required for entry into the ER (Stern et al. 2007). Deletion of the pre-peptide abolished the activity of the HRP reporter protein. The pro-peptide, in contrast, plays an important role in secretion efficiency: deletion of parts of it, such as MATα: ∆30-43 and MATα: ∆57-70, increased secretion by about 20-30% and 50%, respectively (Lin-Cereghino et al. 2013). Therefore, in this study we used a truncated α-factor with deleted regions at the second alpha helix (MATα: ∆30-43) and the entire last helix (MATα: ∆57-70) to obtain more efficient secretion of the IP (Figure 1c). The spacer peptide (EEAEAEAEPK) was inserted between the α-factor and the insulin B-chain to increase the fermentation yield of IP in P. pastoris, as Kjeldsen et al. (1999) previously reported that adding the spacer peptide EEAEAEAEPK to the IP fusion protein enhanced yield by up to 167%.
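The cassette layout described above can be checked by simple bookkeeping: joining the segments downstream of the Kex2 site and counting residues. The sketch below does this. The spacer and linker are quoted from the text, whereas the B- and A-chain strings are the standard human insulin chain sequences (B-chain without Thr B30) and are an assumption about the actual synthetic gene. Counting the spacer as part of the secreted product is also an assumption, made because it reproduces the 63-residue length that the paper predicts for secreted IP.

```python
# Segments downstream of the Kex2 (LEKR) site, in cassette order.
# Spacer and linker are quoted from the text; the chain sequences are the
# standard human insulin sequences and are ASSUMED to match the construct.
SEGMENTS = [
    ("spacer", "EEAEAEAEPK"),
    ("B-chain (1-29)", "FVNQHLCGSHLVEALYLVCGERGFFYTPK"),
    ("linker", "DGK"),
    ("A-chain", "GIVEQCCTSICSLYQLENYCN"),
]

# Concatenate the segments to obtain the predicted precursor sequence.
ip_sequence = "".join(seq for _, seq in SEGMENTS)

for name, seq in SEGMENTS:
    print(f"{name:15s} {len(seq):3d} residues")
print(f"{'total':15s} {len(ip_sequence):3d} residues")

# Six cysteines are needed for the three disulphide bonds of folded insulin.
print(f"cysteines: {ip_sequence.count('C')}")
```

The same tally without the spacer gives 53 residues, which is the size of the B-chain, linker and A-chain alone.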
The IP expression cassette was constructed in the pD902 plasmid, and the resulting construct was named pD902-IP (Figure 1b). pD902-IP was then cloned in E. coli TOP10, and the E. coli clone harbouring pD902-IP was confirmed by digestion with the SacI restriction enzyme, which yielded a single fragment of ∼3921 bp (Figure 2).
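A single band after SacI digestion is what one expects when the enzyme has exactly one recognition site in a circular plasmid: the circle opens into one linear fragment of full plasmid length. The sketch below shows how fragment sizes follow from the site positions on a circular sequence. SacI recognizes GAGCTC; the toy sequences are invented for illustration and are not the pD902-IP sequence, and the function names are hypothetical.

```python
def circular_cut_positions(plasmid, site):
    """Return 0-based start positions of `site` in a circular sequence."""
    plasmid = plasmid.upper()
    # Append the first len(site) - 1 bases so sites spanning the origin are found.
    extended = plasmid + plasmid[: len(site) - 1]
    return [i for i in range(len(plasmid)) if extended.startswith(site, i)]


def fragment_sizes(plasmid, site):
    """Fragment lengths after complete digestion of a circular plasmid."""
    cuts = circular_cut_positions(plasmid, site)
    if not cuts:
        return []  # no site: the plasmid stays circular and uncut
    n = len(plasmid)
    # Distance from each cut to the next, wrapping around the circle.
    # With a single cut the distance is 0, which means one full-length fragment.
    return [(cuts[(k + 1) % len(cuts)] - cuts[k]) % n or n for k in range(len(cuts))]


SACI_SITE = "GAGCTC"

# Invented toy plasmids (NOT pD902-IP): one with a single SacI site, one with two.
toy_single = "ATGC" * 10 + SACI_SITE + "TTAA" * 10
toy_double = SACI_SITE + "A" * 10 + SACI_SITE + "T" * 20

print(fragment_sizes(toy_single, SACI_SITE))
print(fragment_sizes(toy_double, SACI_SITE))
```

Because every cut falls at the same offset within its site, measuring from site start to site start gives the same fragment lengths as measuring from the true cleavage points.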

FIGURE 2 Agarose electrophoresis of pD902-IP digestion by SacI. M = marker 1 kb plus DNA ladder; 1 = pD902-IP; 2 = pD902-IP digested by SacI.

Transformation of P. pastoris and identification of Mut phenotypes
P. pastoris is a methylotrophic yeast that is able to metabolize methanol as its sole carbon source through the methanol utilization (Mut) pathway. Alcohol oxidase (AOX) enzymes are involved in the first step of methanol metabolism. They are encoded by two genes in P. pastoris, AOX1 and AOX2 (Juturu and Wu 2018). AOX1 gene expression is regulated by the strong AOX1 promoter, which was used in this study to drive the heterologous expression of IP in P. pastoris. In the transformation step, the SacI-linearized pD902-IP was introduced into P. pastoris X33 electrocompetent cells to facilitate single crossover recombination at the AOX1 locus. Transformants were selected on YPDS medium supplemented with zeocin (100 μg/mL). After 3 d of incubation, 19 transformant colonies had formed on the plate of P. pastoris competent cells transformed with pD902-IP, whereas no colonies grew on the negative control plate (Figure 3). In order to select putative multicopy recombinant strains, the transformants were further cultured on YPD agar supplemented with increasing zeocin concentrations (100-2000 µg/mL). All 19 transformants were able to grow up to the highest zeocin concentration (2000 µg/mL) (Figure 4). This result suggests that the transformants are multicopy clones (>2 integrated copies), since clone survival at zeocin concentrations up to 500 µg/mL indicates a copy number of 2 (Vassileva et al. 2001). A higher expression level of IP can be expected from a multicopy transformant, since the expression level of IP in the P. pastoris/pPIC9K system increased with gene dosage (Wang et al. 2001). However, qRT-PCR showed that clone No. 4, which grew at both 100 µg/mL and 2000 µg/mL zeocin, carries a single integrated copy in its genome (data not shown).
Since the transformation was conducted using a linearized construct that favours single crossover recombination in the 5' region of the AOX1 locus, most of the transformants should be Mut+ (methanol utilization plus). A Mut+ strain has the wild-type ability to metabolize methanol as its sole carbon source. However, recombination can also occur at the 3' region of the AOX1 locus, disrupting the wild-type AOX1 gene and creating Muts (methanol utilization slow) transformants (Invitrogen 2010). Integration of the IP cassette into the Pichia genome was confirmed by PCR using a primer pair specific for the AOX1 gene (AOX1F/AOX1R) (Figure 5). PCR of 18 transformant colonies yielded two fragments, corresponding to the AOX1 gene (∼2000 bp) and the IP cassette (∼500 bp), which identifies these 18 transformants as Mut+. Conversely, PCR of one transformant colony yielded only the IP cassette fragment (∼500 bp), which identifies this transformant as Muts, where recombination may have occurred through a double crossover event. The ∼500 bp amplicon of clone No. 4 was also sequenced to confirm integration of the IP cassette into the genome. The 511 bp amplicon has 100% identity to the region of pD902-IP that harbours the IP cassette (data not shown).
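The PCR readout described above follows a simple decision rule: a clone showing both the ∼2000 bp AOX1 band and the ∼500 bp cassette band is called Mut+, and a clone showing only the cassette band is called Muts. The sketch below encodes that rule. The function name, the ±200 bp tolerance and the example band sizes are hypothetical choices made for illustration, not values from the study.

```python
def classify_mut_phenotype(band_sizes_bp, tolerance_bp=200):
    """Classify a clone from the sizes (bp) of its AOX1-primer PCR bands.

    The expected sizes (~2000 bp and ~500 bp) come from the text; the
    tolerance used to match gel bands is an arbitrary, hypothetical value.
    """
    def has_band(expected_bp):
        return any(abs(size - expected_bp) <= tolerance_bp for size in band_sizes_bp)

    has_aox1 = has_band(2000)      # intact wild-type AOX1 gene
    has_cassette = has_band(500)   # integrated IP expression cassette

    if has_cassette and has_aox1:
        return "Mut+"
    if has_cassette:
        return "Muts"
    if has_aox1:
        return "no cassette detected"
    return "inconclusive"


# Invented gel readings for three hypothetical lanes.
example_lanes = {
    "clone A": [2000, 511],
    "clone B": [511],
    "wild type": [2000],
}
for lane, bands in example_lanes.items():
    print(lane, "->", classify_mut_phenotype(bands))
```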

FIGURE 5 Agarose electrophoresis of PCR confirmation of P. pastoris recombinant clones. M = marker 1 kb plus DNA ladder; 1-19 = positive clones; P = control plasmid pD902-IP; G = control wild-type; W = negative control (water). The methanol utilization phenotype of the 19 clones tested was identified as Mut+ (18 clones = 1-12, 14-19) and Muts (1 clone = 13).

IP expression and protein analysis
The secretory expression of recombinant proteins in yeast requires a signal sequence that helps the recombinant protein enter the ER, the initial step of secretory expression. In P. pastoris, both the full-length S. cerevisiae α-factor signal sequence (89 amino acids) and its truncated version have been successfully utilized for secretory expression of recombinant proteins (Juturu and Wu 2018). Protein export from the ER lumen occurs after folding and is frequently the rate-limiting step in protein secretion. Therefore, improving folding stability by engineering the connecting peptide can enhance the secretion efficiency of the IP (Kjeldsen et al. 2002). Expression analysis of IP was done using one of the Mut+ P. pastoris recombinant strains. The P. pastoris recombinant and wild-type strains were grown in BMMY medium, with 100% methanol added to a final concentration of 0.5% every 24 h to maintain induction. The supernatant was collected from the culture after 72 h of methanol induction. Secreted IP is predicted to consist of 63 amino acids with a molecular weight of 7053 Da. Protein analysis of IP from the culture supernatant was conducted by SDS-PAGE using the Tricine buffer system. After 72 h of induction, the culture supernatant contained a protein with a size above 6.5 kDa, which may represent the IP, as it was found neither in the culture supernatant of the wild-type strain nor at 0 h of methanol induction (Figure 6). The secreted IP, comprising 29 residues of the insulin B-chain and 21 residues of the insulin A-chain connected by the short linker peptide DGK, has a smaller