Wednesday 17 November 2010

GGS LIVE - Making a fusion protein

Yo BioFreakers,

today in the GGS LIVE section we will learn how to virtually design and generate a fusion protein.

Method: protein tagging.

About: protein tagging makes it possible to perform specific experiments that are not possible with the endogenous (wild-type) protein.

What: design and generation of a fusion of protein X with a GFP (green fluorescent protein) tag.

There are many different tags available for protein tagging. The choice of tag depends on the experiments that we want to perform with the fusion protein. So, if we want to:
- purify the protein of interest, we would use MBP (maltose binding protein), GST (glutathione S-transferase), FLAG or His (hexahistidine) tags,
- easily detect our protein, we would use epitope tags like myc, V5 or HA,
- follow the cellular localization of the protein, we would use a fluorescent tag like GFP.
Most importantly, remember that a fusion protein might behave differently in vivo than the wild-type protein (folding, solubility, activity, etc. may change), so it is crucial to check that the fusion is functional before we start our experiments.

OK, let's start with the design of the fusion protein. In our case study we will use the sequence of protein X (shown below) and the pEGFP-C1 and pEGFP-N1 plasmids for N- and C-terminal tagging, respectively.

The red triplets are the start and stop codons, respectively. We are also going to need the multiple cloning site (MCS) sequences of the pEGFP-N1 and pEGFP-C1 vectors, where the DNA sequence of protein X will be inserted. Information about those is shown below.

As you can see, there are many restriction sites available in both plasmids. Our aim here is to find restriction enzymes that cut in the MCS of our vectors but not in the sequence of protein X (we can do that with any cloning software, such as the free pDRAW32). In our case study two enzymes, XhoI and EcoRI, cut in the MCSs but not in the sequence of protein X. What we have to do now is to virtually introduce the XhoI and EcoRI sites at the 5' and 3' ends, respectively (XhoI recognises CTCGAG and EcoRI recognises GAATTC). There is one more issue to look at before we add our restriction sites (see picture below).
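If you do not have cloning software at hand, the screening step can be roughly sketched in Python. This is only a minimal sketch: the toy MCS and insert sequences below are made up for illustration (they are NOT the real pEGFP MCS or protein X), and the enzyme table holds just a handful of common enzymes.

```python
# Recognition sequences (5'->3') of a few common restriction enzymes.
RECOGNITION_SITES = {
    "XhoI": "CTCGAG",
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "KpnI": "GGTACC",
    "SalI": "GTCGAC",
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def cuts(enzyme, seq):
    """True if the enzyme's site occurs on either strand of seq."""
    site = RECOGNITION_SITES[enzyme]
    seq = seq.upper()
    return site in seq or reverse_complement(site) in seq


def usable_enzymes(mcs, insert):
    """Enzymes that cut in the MCS but not in the insert."""
    return [e for e in RECOGNITION_SITES if cuts(e, mcs) and not cuts(e, insert)]


# Hypothetical toy sequences, for illustration only.
toy_mcs = "CTCGAGAAGCTTGAATTCGGATCC"
toy_insert = "ATGGCCAAGCTTGGATCCTGA"

print(usable_enzymes(toy_mcs, toy_insert))  # ['XhoI', 'EcoRI']
```

In the toy example HindIII and BamHI are rejected because their sites also occur inside the insert, which is exactly the situation we want to avoid.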

It is important to remove the start codon of the protein when tagging it at the N-terminus, to avoid expression of an untagged form of the protein. Remember that the promoter will drive expression of any open reading frame that is downstream of it. It is also crucial to remove the stop codon when tagging the protein at the C-terminus, to prevent premature termination and allow expression of the fusion protein. Taking the above into account, we have:
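As a rough illustration of the trimming rule (not the actual protein X sequence, which was shown as a picture), here is a small Python sketch. The function name `prepare_insert` and the toy coding sequence are made up for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def prepare_insert(cds, tag_position):
    """Trim a coding sequence for tagging.

    tag_position "N": tag sits at the N-terminus, so drop the start codon.
    tag_position "C": tag sits at the C-terminus, so drop the stop codon.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    if tag_position == "N":
        return cds[3:] if cds.startswith("ATG") else cds
    if tag_position == "C":
        return cds[:-3] if cds[-3:] in STOP_CODONS else cds
    raise ValueError("tag_position must be 'N' or 'C'")


# Hypothetical toy coding sequence: ATG GCT GAA AGC TGA
toy_cds = "ATGGCTGAAAGCTGA"

print(prepare_insert(toy_cds, "N"))  # GCTGAAAGCTGA (start codon removed)
print(prepare_insert(toy_cds, "C"))  # ATGGCTGAAAGC (stop codon removed)
```

Note that the N-terminally tagged version keeps its stop codon and the C-terminally tagged version keeps its start codon.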

Now we put this sequence into our plasmid and we get this:

Let's have a look at our constructs now. We are going to focus on the reading frame for the moment. In the case of N-terminal tagging, our reading frame is determined by the last two codons of the GFP tag. To get the right fusion, the DNA sequence of protein X has to be in the same reading frame as the GFP tag. The reading frame is indicated by the horizontal brackets (each triplet codes for one amino acid). As you can see, the stop codon of protein X (indicated in red) is not in the reading frame of the GFP tag. To shift the reading frame we have to add two extra nucleotides to our protein X, just between the XhoI restriction site and the coding sequence of protein X. A similar situation occurs with C-terminal tagging. You can see that the start codon is now not in frame with the GFP tag, and adding a single nucleotide between the EcoRI restriction site and the protein sequence will solve that problem (see the pictures below).
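The counting behind "two extra nucleotides" and "a single nucleotide" can be sketched in a few lines of Python. Treat this as a sketch under assumptions: the two junction strings are illustrative stand-ins for the stretch between an in-frame codon and the insert, and should be checked against the real vector map before use.

```python
def filler_length(bases_since_codon_start):
    """Nucleotides to add so that the next base starts a new codon."""
    return (-bases_since_codon_start) % 3


# N-terminal tag: bases from the start of an in-frame tag codon
# up to and including the XhoI site (illustrative stand-in).
upstream = "TACAAGTCCGGACTCAGATCTCGAG"
n_term_filler = filler_length(len(upstream))

# C-terminal tag: bases from the start of the EcoRI site up to the
# next in-frame codon of the tag (illustrative stand-in).
downstream = "GAATTCTGCAG"
c_term_filler = filler_length(len(downstream))

print(n_term_filler, c_term_filler)  # 2 1
```

The idea is simply that the total number of bases between two in-frame codons must be a multiple of three, so the filler makes up the difference.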

It is important to remember that adding extra nucleotides to our sequence may introduce a stop codon. We have to check our sequence before we proceed further. If everything is ready we should obtain this:
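The stop codon check is easy to automate. Below is a minimal Python sketch; the junction sequences are hypothetical and only show how one filler nucleotide can create a stop codon while another does not.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(seq, frame=0):
    """Return the positions of stop codons read in the given frame."""
    seq = seq.upper()
    return [
        i for i in range(frame, len(seq) - 2, 3)
        if seq[i:i + 3] in STOP_CODONS
    ]


# Hypothetical junction: last codon of the insert + filler + EcoRI site...
last_codon = "GCC"
after_filler = "GAATTCTGCAG"

bad = last_codon + "T" + after_filler   # GCC TGA ATT CTG CAG
good = last_codon + "C" + after_filler  # GCC CGA ATT CTG CAG

print(in_frame_stops(bad))   # [3]
print(in_frame_stops(good))  # []
```

Here a T as the filler nucleotide creates TGA together with the first two bases of the EcoRI site, so a different nucleotide has to be chosen.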

As you can see now, the protein X sequence is in frame with the GFP tag in both cases. The protein sequence including the XhoI / EcoRI restriction sites and the extra nucleotides can now be used to design primers for cloning the protein X DNA. This DNA will then be sequenced to check for potential mutations and, if correct, subcloned into the pEGFP-N1 or pEGFP-C1 plasmid. The resulting construct can later be used for expression and localization studies of protein X.
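To give an idea of how the designed sequence turns into primers, here is a simplified Python sketch. It only glues together the restriction site, the extra nucleotides and the ends of the coding sequence; the function name, the toy sequence and the short annealing length are made up for illustration, and real primer design would also consider melting temperature and a few extra bases 5' of the site.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def design_primers(cds, site_5, site_3, filler_5="", filler_3="", anneal_len=20):
    """Build forward and reverse primers (both written 5'->3')."""
    cds = cds.upper()
    forward = site_5 + filler_5 + cds[:anneal_len]
    # The reverse primer is the reverse complement of the 3' end of the product.
    reverse = reverse_complement(cds[-anneal_len:] + filler_3 + site_3)
    return forward, reverse


# Hypothetical insert for N-terminal tagging (start codon already removed).
toy_cds = "GCTGAAAGCTGA"
fwd, rev = design_primers(
    toy_cds,
    site_5="CTCGAG",  # XhoI
    site_3="GAATTC",  # EcoRI
    filler_5="GG",    # two extra nucleotides to restore the frame
    anneal_len=9,     # unrealistically short, just to keep the toy readable
)

print(fwd)  # CTCGAGGGGCTGAAAGC
print(rev)  # GAATTCTCAGCTTTC
```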

I hope u enjoyed it.




  1. I'm a total beginner in this field, and I can't tell you how much this helped, thank you so much!

  2. not a beginner in this field but am embarrassingly lacking in my skills as a true molecular biologist. this is very well-written and helpful!! thanks!!

  3. is it not crucial to remove the atg from the c-terminal tag?

  4. Your article is a godsend!

    Is there a book or a helpful article that elaborates on this with more examples and details? Much appreciated!

  5. This seriously saved my arse on my homework. I finally get what we were supposed to do. Thanks so much for making this!

  6. I just wanted to say that your blog is really useful, especially for new PhD students. Putting these kinds of methods in a blog is genius and I hope you will continue outlining and explaining as well as you do!

  7. Perfect Explanation for the beginners like me!! Too good seriously!

  8. Can you clarify why you have to remove the start codon from the gene for an N-terminal fusion? Translational start is determined by the Kozak sequence, not ATG.

  9. Dear Guest,

    it is not crucial to remove the ATG codon from your cDNA; the construct will still produce the fusion protein. But remember that, even without an optimal Kozak sequence, any ATG codon downstream of a strong promoter will be transcribed and potentially translated (as you mentioned, the Kozak sequence matters for translation initiation). A non-canonical Kozak sequence can still be used as a translation start. Kozak found that some sequences facilitate translation initiation better than others, but this is not a 100% deterministic rule, like many other observations in cell biology. Believe me, I have cloned a lot of cDNAs and always found the untagged version of the protein if the ATG codon was not removed. Therefore, if you do not want the untagged protein to be produced in the cells (regardless of whether they are bacterial or mammalian), it is better to remove the start codon when generating the plasmid construct. I hope this answers your question.

  10. You have added extra nucleotides in your sequence to get a proper open reading frame. Would they not have an effect on the coded protein?
    Thank you

  11. I have one doubt. If u want to isolate the protein of interest from GST tag how will u isolate it in this example?

  12. Dear Sree,

    I assume you meant: how do you isolate your protein from GST after GST cleavage? If so, you can pass the protein prep after the cleavage through GSH beads to capture the GST protein. However, you may first have to remove the reduced glutathione (for instance by dialysis or gel filtration), since it will prevent the GST protein from binding to the GSH beads. This step is usually done before adding the protease that separates GST from the protein, as it may interfere with the digest step.

    I hope this helped.

    If you have any other question, please do not hesitate to contact me by email:


    GGS Team

  13. When tagging at the C-terminus, do you need to add a linker between the ORF of your GOI and the HA-tag?
