Protein splicing is an intramolecular reaction in which an internal protein segment (called an intein) is removed from a precursor protein and the flanking N-terminal and C-terminal external segments (called exteins) are ligated. The residue at the splicing junction is usually a cysteine or a serine, both of which have a nucleophilic side chain. The protein splicing reactions known so far do not require exogenous cofactors or energy sources such as adenosine triphosphate (ATP) or guanosine triphosphate (GTP). The term splicing is normally associated only with pre-mRNA splicing. The precursor protein contains three segments: an N-extein, followed by the intein, followed by a C-extein. After splicing has taken place, the resulting protein contains the N-extein linked to the C-extein; this splicing product is also termed an extein.
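The three-segment layout of the precursor can be illustrated with a minimal string-level sketch. It only models the outcome of splicing (intein removed, exteins joined), not the chemistry. The function name `splice`, the index convention, and the toy sequence are assumptions made for illustration and do not come from any real protein or library.

```python
def splice(precursor: str, intein_start: int, intein_end: int) -> tuple[str, str]:
    """Return (spliced_product, intein) for a precursor sequence.

    intein_start and intein_end are 0-based, end-exclusive indices
    (a hypothetical convention chosen for this sketch).
    """
    if not 0 < intein_start < intein_end < len(precursor):
        raise ValueError("the intein must lie strictly inside the precursor")
    n_extein = precursor[:intein_start]           # segment before the intein
    intein = precursor[intein_start:intein_end]   # excised internal segment
    c_extein = precursor[intein_end:]             # segment after the intein
    # The product is the N-extein joined directly to the C-extein.
    return n_extein + c_extein, intein


# Toy sequence (made up): N-extein "MKTAY", intein "CLSFGHN", C-extein "SEQ".
product, intein = splice("MKTAYCLSFGHNSEQ", 5, 12)
print(product, intein)
```

In this toy case the intein begins with a cysteine (C), matching the statement that the junction residue is usually a cysteine or a serine.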
Protein splicing was unanticipated; it was discovered by two groups (Anraku and Stevens) in 1990. Both found it in the precursor of a vacuolar ATPase enzyme encoded by Saccharomyces cerevisiae VMA1. The amino acid sequences of the N- and C-terminal regions showed about 70% similarity to the vacuolar H+-ATPase of other organisms, while the amino acid sequence of the central region showed about 30% similarity to the yeast HO nuclease.
Many genes have unrelated intein-coding segments inserted at different positions. For these and other reasons, inteins (or more properly, the gene segments coding for inteins) are sometimes called selfish genetic elements, but it may be more accurate to call them parasitic. According to the gene-centered view of evolution, most genes are "selfish" only insofar as they compete with other genes or alleles, but they usually fulfill a function for the organism, whereas "parasitic genetic elements", at least initially, do not make a positive contribution to the fitness of the organism.
As of December 2019, the UniProtKB database contains 188 entries manually annotated as inteins, ranging in length from just tens of amino acid residues to thousands. The first intein was found encoded within the VMA gene of Saccharomyces cerevisiae. Inteins were later found across fungi (ascomycetes, basidiomycetes, zygomycetes and chytrids) and in diverse proteins. A protein from Glomeromycota that is distantly related to known intein-containing proteins, but closely related to metazoan hedgehog proteins, has been described as containing an intein sequence. Many of the newly described inteins contain homing endonucleases, and some of these are apparently active. The abundance of inteins in fungi indicates lateral transfer of intein-containing genes. In eubacteria and archaea, 289 and 182 inteins are currently known, respectively. Not surprisingly, most inteins in eubacteria and archaea, as in fungi, are inserted into proteins of nucleic acid metabolism.
Inteins vary greatly, but many of the same intein-containing proteins are found in a number of species. For example, pre-mRNA processing factor 8 (Prp8) protein, instrumental in the spliceosome, has seven different intein insertion sites across eukaryotic species. Intein-containing Prp8 is most commonly found in fungi, but is also seen in Amoebozoa, Chlorophyta, Capsaspora, and Choanoflagellida. Many mycobacteria contain inteins within DnaB (bacterial replicative helicase), RecA (bacterial DNA recombinase), and SufB (FeS cluster assembly protein). There is remarkable variety in the structure and number of DnaB inteins, both within the genus Mycobacterium and beyond. Interestingly, intein-containing DnaB is also found in the chloroplasts of algae. Intein-containing proteins found in archaea include RadA (RecA homolog), RFC, PolB, and RNR. Many of the same intein-containing proteins (or their homologs) are found in two or even all three domains of life. Inteins are also seen in the proteomes encoded by bacteriophages and eukaryotic viruses. Viruses may have acted as vectors of intein distribution across the wide variety of intein-containing organisms.
Class 2 inteins have no nucleophilic first side chain, only an alanine. Instead, the reaction starts directly with a nucleophilic displacement, with the first residue of the C-extein attacking the peptide carboxyl on the final residue of the N-extein. The rest proceeds as usual, starting with Asn turning into a cyclic imide.
Class 3 inteins have no nucleophilic first side chain, only an alanine, yet they have an internal noncontiguous "WCT" motif. The internal C (cysteine) residue attacks the peptide carboxyl on the final residue of the N-extein (nucleophilic displacement). Transesterification occurs when the first residue of the C-extein attacks the newly formed thioester. The rest proceeds as usual.
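The distinction between the classes comes down to which group makes the first attack on the N-extein junction. The following sketch encodes that decision as a small lookup. The class 1 rule (a first residue of cysteine or serine) is inferred from the description of the splicing junction and is a simplification; the names `InteinClass` and `classify`, and the use of a boolean flag for the noncontiguous WCT motif, are assumptions for illustration rather than an established tool.

```python
from enum import Enum


class InteinClass(Enum):
    CLASS_1 = "side chain of the first intein residue makes the first attack"
    CLASS_2 = "first residue of the C-extein attacks the N-extein junction directly"
    CLASS_3 = "internal Cys of the WCT motif attacks the N-extein junction"


# Simplifying assumption: only Cys (C) and Ser (S) count as nucleophilic here.
NUCLEOPHILIC_FIRST_RESIDUES = {"C", "S"}


def classify(first_residue: str, has_wct_motif: bool = False) -> InteinClass:
    """Toy classifier based on the first intein residue and the WCT motif.

    has_wct_motif is supplied by the caller because the motif is
    noncontiguous and cannot be found by a simple substring search.
    """
    if first_residue.upper() in NUCLEOPHILIC_FIRST_RESIDUES:
        return InteinClass.CLASS_1
    if has_wct_motif:
        return InteinClass.CLASS_3
    return InteinClass.CLASS_2


for residue, wct in [("C", False), ("A", False), ("A", True)]:
    result = classify(residue, wct)
    print(residue, wct, result.name, "-", result.value)
```

Real classification relies on sequence alignment and experimental evidence; the sketch only mirrors the decision logic described in the text.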
The splicing mechanism is a naturally occurring analogue of native chemical ligation, a technique for chemically generating medium-sized proteins.
Normally, as in this example, just three letters suffice to specify the organism, but there are variations. For example, additional letters may be added to indicate a strain. If more than one intein is encoded in the corresponding gene, the inteins are given a numerical suffix, numbered from 5′ to 3′ or in order of their identification (for example, "Msm dnaB-1").
The segment of the gene that encodes the intein is usually given the same name as the intein, but to avoid confusion the name of the intein proper is usually capitalized (e.g., Pfu RIR1-1), whereas the name of the corresponding gene segment is italicized (e.g., Pfu rir1-1). A different disambiguating convention is to place a lowercase "i" after the source protein name, e.g. "Msm DnaBi1".
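The two naming styles described above can be assembled mechanically from an organism abbreviation, a host protein or gene name, and an optional intein number. The sketch below is a hypothetical helper, not part of any registry's tooling: the class name `InteinLabel` and its methods are invented for illustration, the caller is assumed to supply the host name already in the desired case, and italics are a typographic matter that plain strings cannot express.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InteinLabel:
    organism: str                # e.g. "Pfu" or "Msm"
    host: str                    # host name, already cased by the caller
    index: Optional[int] = None  # numerical suffix when a gene has several inteins

    def hyphen_style(self) -> str:
        """Build names such as 'Pfu RIR1-1' (intein) or 'Pfu rir1-1' (gene segment)."""
        suffix = f"-{self.index}" if self.index is not None else ""
        return f"{self.organism} {self.host}{suffix}"

    def i_style(self) -> str:
        """Build names such as 'Msm DnaBi1' (lowercase 'i' after the protein name)."""
        if self.index is None:
            # Assumption of this sketch: the 'i' style always carries a number.
            raise ValueError("the 'i' style in this sketch requires an index")
        return f"{self.organism} {self.host}i{self.index}"


print(InteinLabel("Pfu", "RIR1", 1).hyphen_style())
print(InteinLabel("Pfu", "rir1", 1).hyphen_style())
print(InteinLabel("Msm", "DnaB", 1).i_style())
```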
Inhibitors of intein excision may be useful tools for drug development: a protein that contains an intein will not carry out its normal function if the intein is not excised, since its structure will be disrupted.
It has been suggested that inteins could prove useful for achieving allotopic expression of certain highly hydrophobic proteins normally encoded by the mitochondrial genome, for example in gene therapy. The hydrophobicity of these proteins is an obstacle to their import into mitochondria. The insertion of a non-hydrophobic intein may therefore allow this import to proceed. Excision of the intein after import would then restore the protein to wild-type.
Affinity tags have been widely used to purify recombinant proteins, as they allow the accumulation of recombinant protein with few impurities. However, the affinity tag must be removed by proteases in the final purification step. This extra proteolysis step raises two problems: the specificity of the protease in removing the affinity tag from the recombinant protein, and the removal of the digestion products. These problems can be avoided by fusing the affinity tag to a self-cleavable intein whose cleavage can be triggered under controlled conditions. The first generation of expression vectors of this kind used a modified Saccharomyces cerevisiae VMA (Sce VMA) intein. Chong et al. used a chitin binding domain (CBD) from Bacillus circulans as an affinity tag, and fused this tag with a modified Sce VMA intein. The modified intein undergoes a self-cleavage reaction at its N-terminal peptide linkage in the presence of 1,4-dithiothreitol (DTT), β-mercaptoethanol (β-ME), or cysteine at low temperatures over a broad pH range. After the recombinant protein is expressed, the cell homogenate is passed through a column containing chitin, which allows the CBD of the chimeric protein to bind to the column. When the temperature is lowered and one of the thiol reagents described above is passed through the column, the chimeric protein undergoes self-cleavage and only the target protein is eluted. This technique eliminates the need for a proteolysis step, and the modified Sce VMA intein stays on the column, attached to chitin through the CBD.
Recently, inteins have been used to purify proteins based on self-aggregating peptides. Elastin-like polypeptides (ELPs) are a useful tool in biotechnology. Fused with a target protein, they tend to form aggregates inside the cells. This eliminates the chromatographic step needed in protein purification. ELP tags have been used in intein fusion proteins, so that the aggregates can be isolated without chromatography (by centrifugation) and the intein and tag can then be cleaved in a controlled manner to release the target protein into solution. This protein isolation can be done using continuous media flow, yielding high amounts of protein and making the process more economically efficient than conventional methods. Another group of researchers used smaller self-aggregating tags to isolate the target protein. The small amphipathic peptides 18A and ELK16 (figure 5) were used to form self-cleaving aggregating proteins.
Current research on intein splicing inhibitors has focused on developing antimycobacterials (M. tb. has three intein-containing proteins), as well as agents active against the pathogenic fungi Cryptococcus and Aspergillus. Cisplatin and similar platinum-containing compounds inhibit splicing of the M. tb. RecA intein by coordinating to catalytic residues. Divalent cations, such as copper(II) and zinc(II) ions, act similarly and inhibit splicing reversibly. However, neither of these approaches is currently suitable as an effective and safe antibiotic. The fungal Prp8 intein is also inhibited by divalent cations and cisplatin, which interfere with the catalytic Cys1 residue. In 2021, Li et al. showed that small-molecule inhibitors of Prp8 intein splicing were selective and effective at slowing the growth of C. neoformans and C. gattii, providing evidence for the antimicrobial potential of intein splicing inhibitors.
Intein
Naming conventions
Types of inteins
Full and mini inteins
Split inteins
Applications in biotechnology
Applications in antimicrobial development