Cas9 (CRISPR-associated protein 9, formerly called Cas5, Csn1, or Csx12) is a 160-kilodalton protein that plays a vital role in the immunological defense of certain bacteria against DNA viruses and plasmids, and is heavily used in genetic engineering applications. Its main function is to cut DNA and thereby alter a cell's genome. The CRISPR-Cas9 genome editing technique was a significant contributor to the Nobel Prize in Chemistry in 2020 being awarded to Emmanuelle Charpentier and Jennifer Doudna.
More technically, Cas9 is an RNA-guided DNA endonuclease enzyme associated with the Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) adaptive immune system in Streptococcus pyogenes. S. pyogenes uses CRISPR to memorize foreign DNA, such as invading bacteriophage DNA or plasmid DNA, and uses Cas9 to later interrogate and cleave it. Cas9 performs this interrogation by unwinding foreign DNA and checking for sites complementary to the 20-nucleotide spacer region of the guide RNA (gRNA). If the DNA substrate is complementary to the guide RNA, Cas9 cleaves the invading DNA. In this sense, the CRISPR-Cas9 mechanism has a number of parallels with the RNA interference (RNAi) mechanism in eukaryotes.
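The interrogation step described above can be pictured as a string search. The following is a minimal sketch, not a model of the real enzyme: it assumes the DNA is given as a 5'→3' string, ignores mismatches and the PAM requirement, and uses hypothetical function names. Because the spacer is complementary to the target strand, its sequence (with T in place of U) reads the same as the opposite, non-target strand, so the search looks for the spacer sequence itself on either strand.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_spacer_matches(dna: str, spacer: str):
    """Return (strand, start) pairs where a strand reads exactly like the spacer.

    For the "-" strand, start is an index into the reverse complement of dna.
    """
    dna = dna.upper()
    spacer = spacer.upper().replace("U", "T")  # accept RNA or DNA letters
    hits = []
    for strand, sequence in (("+", dna), ("-", reverse_complement(dna))):
        start = sequence.find(spacer)
        while start != -1:
            hits.append((strand, start))
            start = sequence.find(spacer, start + 1)
    return hits


if __name__ == "__main__":
    spacer = "GATTACAGATTACAGATTAC"  # illustrative 20-nt spacer
    dna = "TTT" + spacer + "CGG" + "AAA"
    print(find_spacer_matches(dna, spacer))
```

A real search would also tolerate a limited number of mismatches and require a PAM next to the match.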
Apart from its original function in bacterial immunity, the Cas9 protein has been heavily used as a genome engineering tool to induce site-directed double-strand breaks in DNA. In many laboratory model organisms, these breaks can lead to gene inactivation through non-homologous end joining or to the introduction of heterologous genes through homologous recombination. Research on Cas9 variants has been a promising way of overcoming the limitations of CRISPR-Cas9 genome editing. Examples include Cas9 nickase (Cas9n), a variant that induces single-stranded breaks (SSBs), and variants that recognize different PAM sequences. Alongside zinc finger nucleases and transcription activator-like effector nuclease (TALEN) proteins, Cas9 is becoming a prominent tool in the field of genome editing.
Cas9 has gained traction in recent years because it can cleave nearly any sequence complementary to the guide RNA. Because the target specificity of Cas9 stems from guide RNA:DNA complementarity and not from modifications to the protein itself (as with TALENs and zinc fingers), engineering Cas9 to target new DNA is straightforward. Versions of Cas9 that bind but do not cleave cognate DNA can be used to localize transcriptional activators or repressors to specific DNA sequences in order to control transcriptional activation and repression. Native Cas9 requires a guide RNA composed of two disparate RNAs that associate: the CRISPR RNA (crRNA) and the trans-activating crRNA (tracrRNA). Cas9 targeting has been simplified through the engineering of a chimeric single guide RNA (chiRNA). Scientists have suggested that Cas9-based gene drives may be capable of editing the genomes of entire populations of organisms. In 2015, Cas9 was used to modify the genome of human embryos for the first time.
Groups led by Feng Zhang and George Church simultaneously published the first descriptions of genome editing in human cell cultures using CRISPR-Cas9. It has since been used in a wide range of organisms, including baker's yeast (Saccharomyces cerevisiae), the opportunistic pathogen Candida albicans, zebrafish (Danio rerio), fruit flies (Drosophila melanogaster), ants (Harpegnathos saltator and Ooceraea biroi), mosquitoes (Aedes aegypti), nematodes (Caenorhabditis elegans), plants, mice (Mus musculus domesticus), monkeys and human embryos.
CRISPR has been modified to make programmable transcription factors that allow activation or silencing of targeted genes. The CRISPR-Cas9 system has been shown to make effective gene edits in human tripronuclear zygotes, as first described in a 2015 paper by Chinese scientists P. Liang and Y. Xu. The system successfully cleaved mutant beta-hemoglobin (HBB) in 28 out of 54 embryos. Four of the 28 embryos were successfully recombined using a donor template. The scientists showed that during DNA recombination of the cleaved strand, the homologous endogenous sequence HBD competes with the exogenous donor template. DNA repair in human embryos is much more complicated and particular than in derived stem cells.
Cas9 is widely used as a genome-editing tool, and recent work has applied it to preventing viruses from manipulating their hosts' DNA. Because CRISPR-Cas9 was developed from a bacterial genome system, it can be used to target the genetic material of viruses, which makes the Cas9 enzyme a potential solution to many viral infections. Cas9 targets a specific virus by targeting specific strands of its genetic information. More specifically, the enzyme targets sections of the viral genome whose disruption prevents the virus from carrying out its normal function.
CRISPR-Cas systems are divided into three major types (type I, type II, and type III) and twelve subtypes, which are based on their genetic content and structural differences. However, the core defining features of all CRISPR-Cas systems are the cas genes and their proteins: cas1 and cas2 are universal across types and subtypes, while cas3, cas9, and cas10 are signature genes for type I, type II, and type III, respectively.
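The signature-gene rule in the preceding paragraph can be written down as a small lookup. This is an illustrative sketch only: the function name and the list-of-gene-names input are assumptions, and real classification also takes locus architecture and subtype-specific genes into account.

```python
# Signature genes for the three major types, as listed in the text above.
SIGNATURE_GENES = {"cas3": "type I", "cas9": "type II", "cas10": "type III"}
# Genes described above as universal across types and subtypes.
UNIVERSAL_GENES = {"cas1", "cas2"}


def classify_locus(genes):
    """Return the sorted list of major types whose signature gene is present."""
    present = {gene.lower() for gene in genes}
    return sorted({ctype for gene, ctype in SIGNATURE_GENES.items() if gene in present})


def has_universal_genes(genes):
    """True if both cas1 and cas2 appear in the gene list."""
    return UNIVERSAL_GENES <= {gene.lower() for gene in genes}


if __name__ == "__main__":
    locus = ["cas9", "cas1", "cas2", "csn2"]  # hypothetical gene list
    print(classify_locus(locus), has_universal_genes(locus))
```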
Loss of single spacers, and even of groups of several spacers, has also been observed by Aranaz et al. 2004 and Pourcel et al. 2007. This probably occurs through homologous recombination of the between-repeat material.
Stage 1: CRISPR spacer integration. Protospacers and protospacer-associated motifs (shown in red) are acquired at the "leader" end of a CRISPR array in the host DNA. The CRISPR array is composed of spacer sequences (shown in colored boxes) flanked by repeats (black diamonds). This process requires Cas1 and Cas2 (and Cas9 in type II), which are encoded in the cas locus, which is usually located near the CRISPR array.
Stage 2: CRISPR expression. Pre-crRNA is transcribed starting at the leader region by the host RNA polymerase and then cleaved by Cas proteins into smaller crRNAs containing a single spacer and a partial repeat (shown as hairpin structure with colored spacers).
Stage 3: CRISPR interference. A crRNA with a spacer that has strong complementarity to the incoming foreign DNA begins a cleavage event (depicted with scissors), which requires Cas proteins. DNA cleavage interferes with viral replication and provides immunity to the host. The interference stage can be functionally and temporally distinct from CRISPR acquisition and expression (depicted by the white line dividing the cell).
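The repeat-spacer layout that the three stages rely on can be illustrated with a short sketch. It assumes an idealized array in which every repeat is an exact copy of one known sequence. Real repeats can vary and the cleavage points of pre-crRNA are not modeled, and the function name is hypothetical.

```python
def extract_spacers(array: str, repeat: str):
    """Return the spacers of an idealized CRISPR array, in leader-to-trailer order.

    Splitting on the repeat leaves the leader-side flank first and the
    trailing flank last; everything in between is a spacer.
    """
    parts = array.upper().split(repeat.upper())
    return [part for part in parts[1:-1] if part]


if __name__ == "__main__":
    repeat = "GTTTTAGAGC"                       # illustrative repeat, not a real one
    spacers = ["AAACCCGGG", "TTGGCCAAT"]        # illustrative spacers
    array = "ATATAT" + "".join(repeat + s for s in spacers) + repeat + "CCCC"
    print(extract_spacers(array, repeat))
```

In this picture, a newly acquired spacer would appear first in the returned list, since integration happens at the leader end.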
When the effects of transcriptional repression are examined further, H3K27 (lysine 27 of histone H3) becomes methylated through the interaction of dCas9 (a catalytically inactive form of Cas9) and a peptide called FOG1. Essentially, this interaction causes gene repression at the C- and N-terminal sections of the amino acid complex at the specific junction of the gene and, as a result, terminates transcription.
dCas9 has also proved effective at altering the expression of certain disease-causing proteins. When dCas9 attaches to a guide RNA, it prevents the proliferation of repeating codons and DNA sequences that might be harmful to an organism's genome. Essentially, when multiple repeat codons are produced, this recruits an abundance of dCas9 to combat their overproduction, which results in the shutdown of transcription. dCas9 works synergistically with the gRNA and directly prevents RNA polymerase II from continuing transcription.
Further insight into how the dCas9 protein works comes from its use in plant genomes, where it regulates gene expression to increase or decrease certain characteristics. The CRISPR-Cas9 system can either upregulate or downregulate genes. dCas9 proteins are a component of the CRISPR-Cas9 system and can repress certain regions of a plant gene. This happens when dCas9 binds to repressor domains; in plants, this results in the deactivation of a regulatory gene such as AtCSTF64.
Bacteria are another focus of the use of dCas9 proteins. Because bacteria have much smaller genomes than eukaryotes, they are easier to manipulate. As a result, dCas9 is used in bacteria to inhibit RNA polymerase from continuing the transcription of genetic material.
A key feature of the target DNA is that it must contain a protospacer adjacent motif (PAM) consisting of the three-nucleotide sequence NGG. This PAM is recognized by the PAM-interacting domain (PI domain, orange) located near the C-terminal end of Cas9. Cas9 undergoes distinct conformational changes between the apo, guide RNA-bound, and guide RNA:DNA-bound states.
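The PAM requirement lends itself to a simple scan. The sketch below lists every 20-nucleotide stretch on the given strand that is immediately followed by an NGG motif. It is an assumption-laden illustration: only the strand as written is scanned, the 20-nucleotide length is taken from the spacer length mentioned earlier, and the function name is hypothetical.

```python
import re


def find_ngg_sites(dna: str, spacer_len: int = 20):
    """Return (start, protospacer, pam) for each protospacer followed by NGG.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna.upper())]


if __name__ == "__main__":
    protospacer = "GATTACAGATTACAGATTAC"  # illustrative sequence
    for site in find_ngg_sites("C" + protospacer + "TGG"):
        print(site)
```

Sites on the opposite strand could be found by running the same scan on the reverse complement.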
Cas9 recognizes the stem-loop architecture inherent in the CRISPR locus, which mediates the maturation of the crRNA-tracrRNA ribonucleoprotein complex. Cas9 in complex with CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA) further recognizes and degrades the target dsDNA. In the co-crystal structure shown here, the crRNA-tracrRNA complex is replaced by a chimeric single-guide RNA (sgRNA, in red), which has been shown to have the same function as the natural RNA complex. The sgRNA base-paired with target ssDNA is anchored by Cas9 in a T-shaped architecture. This crystal structure of the DNA-bound Cas9 enzyme reveals distinct conformational changes in the alpha-helical lobe with respect to the nuclease lobe, as well as the location of the HNH domain. The protein consists of a recognition lobe (REC) and a nuclease lobe (NUC). All regions except the HNH domain form tight interactions with each other and with the sgRNA-ssDNA complex, while the HNH domain forms few contacts with the rest of the protein. In another conformation of the Cas9 complex observed in the crystal, the HNH domain is not visible. These structures suggest that the HNH domain is conformationally flexible.
To date, at least three crystal structures have been studied and published: one representing a conformation of Cas9 in the apo state, and two representing Cas9 in the DNA-bound state.
Wild-type S. pyogenes Cas9 requires magnesium (Mg²⁺) cofactors for RNA-mediated DNA cleavage; however, Cas9 has been shown to exhibit varying levels of activity in the presence of other divalent metal ions. For instance, Cas9 in the presence of manganese (Mn²⁺) has been shown to be capable of RNA-independent DNA cleavage. The enzyme kinetics of DNA cleavage by Cas9 have been of great interest to the scientific community, as these data provide insight into the intricacies of the reaction. While the cleavage of DNA by RNA-bound Cas9 has been shown to be relatively rapid (k ≥ 700 s⁻¹), the release of the cleavage products is very slow (t1/2 = ln(2)/k ≈ 43–91 h), essentially rendering Cas9 a single-turnover enzyme. Additional studies of Cas9 kinetics have shown that engineered Cas9 variants can reduce off-target effects by modifying the rate of the reaction.
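The half-life relation quoted above, t1/2 = ln(2)/k, can be inverted to see how slow product release is relative to cleavage. The sketch below assumes simple first-order kinetics and uses only the figures given in the text; the function names are illustrative.

```python
import math


def rate_from_half_life(t_half_hours: float) -> float:
    """First-order rate constant (per second) for a half-life given in hours."""
    return math.log(2) / (t_half_hours * 3600.0)


def half_life_hours(k_per_second: float) -> float:
    """Half-life in hours for a first-order rate constant given per second."""
    return math.log(2) / k_per_second / 3600.0


if __name__ == "__main__":
    k_cleave = 700.0  # lower bound on the cleavage rate from the text, per second
    for t_half in (43.0, 91.0):
        k_release = rate_from_half_life(t_half)
        print(f"t1/2 = {t_half:.0f} h -> k_release = {k_release:.2e} s^-1, "
              f"cleavage/release ratio >= {k_cleave / k_release:.1e}")
```

Under these assumptions, product release is on the order of 10⁸ times slower than cleavage, which is why the enzyme effectively turns over only once.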
The cleavage efficiency of Cas9 depends on numerous factors. A key requirement is the presence of a valid PAM on the non-target strand, 3 nucleotides downstream from the cleavage site. The canonical PAM sequence for S. pyogenes Cas9 is NGG, but alternative motifs are tolerated with lower cleavage activity. The most efficient alternative PAM motifs for wild-type S. pyogenes Cas9 are NAG and NGA. The sequence composition at the target DNA site complementary to the 20-nucleotide spacer region of the gRNA also affects cleavage efficiency. The most relevant nucleotide composition properties that impact efficiency are those in the PAM-proximal region. In addition to efficiency, the composition of the five nucleotides closest to the PAM in the target sequence also affects the scission profile, influencing whether DNA cleavage is blunt or staggered. Free energy changes of nucleic acids are also highly relevant in defining cleavage activity. Guide RNAs whose duplex with the DNA falls within a restricted range of binding free energy changes, excluding extremely weak and extremely stable binding, generally perform efficiently. Stable guide RNA folding conformations can also impair cleavage.
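The PAM rules in this paragraph can be condensed into a small classifier, together with the position of the cleavage site relative to the PAM. This is a hedged sketch: the category labels and function names are made up for illustration, and no quantitative activity is implied for any motif.

```python
def classify_pam(pam: str) -> str:
    """Classify a 3-nt PAM for wild-type S. pyogenes Cas9, following the text above."""
    pam = pam.upper()
    if len(pam) != 3 or any(base not in "ACGT" for base in pam):
        raise ValueError("PAM must be three DNA bases")
    if pam[1:] == "GG":
        return "canonical"        # NGG
    if pam[1:] in ("AG", "GA"):
        return "alternative"      # NAG or NGA: tolerated, lower activity
    return "other"


def cut_index(pam_start: int) -> int:
    """Index of the first base after the cleavage site, 3 nt upstream of the PAM."""
    return pam_start - 3


if __name__ == "__main__":
    site = "GATTACAGATTACAGATTAC" + "TGG"   # illustrative protospacer plus PAM
    pam_start = 20
    print(classify_pam(site[pam_start:pam_start + 3]), cut_index(pam_start))
```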
Using molecular dynamics simulation, a study reported that cleavage of the non-target strand (NTS) between 17|16 of the target sequence was more energetically favored than cleavage between 18|17, generating 1-nucleotide 5' ssDNA overhangs. Notably, the authors demonstrated that the 5' overhangs are filled in and that the products of DNA repair are templated insertions, in which the 5' overhang is used as a template by Pol4 for the repair reaction. The association between staggered cleavage and precise templated insertions was supported by additional studies in human cells.
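The link between a staggered cut and a templated insertion can be made concrete with an idealized sketch. It assumes that the target strand is always cut 3 nucleotides from the PAM, that the non-target strand is cut a further n nucleotides away, and that both overhangs are completely filled in before the ends are rejoined. Under those assumptions the n overhang bases end up duplicated. The function name and the indexing convention (protospacer written 5'→3' on the non-target strand) are illustrative choices, not taken from the studies cited.

```python
def templated_insertion(protospacer: str, overhang: int):
    """Return (repaired_sequence, inserted_bases) for an idealized fill-in repair.

    protospacer: 20-nt non-target-strand sequence, 5'->3', PAM follows its 3' end.
    overhang:    length of the 5' overhang in nucleotides (0 means a blunt cut).
    """
    cut = len(protospacer) - 3            # target-strand cut, 3 nt from the PAM
    if not 0 <= overhang <= cut:
        raise ValueError("overhang out of range")
    duplicated = protospacer[cut - overhang:cut]
    # Filled-in PAM-distal end plus PAM-proximal end that starts at the NTS cut.
    repaired = protospacer[:cut] + protospacer[cut - overhang:]
    return repaired, duplicated


if __name__ == "__main__":
    seq = "GATTACAGATTACAGATTAC"          # illustrative protospacer
    print(templated_insertion(seq, 0))    # blunt: sequence unchanged
    print(templated_insertion(seq, 1))    # 1-nt overhang: one base duplicated
```

A blunt cut restores the original sequence in this model, while a 1-nucleotide overhang yields a predictable single-base duplication of the nucleotide at the cut.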
Recently, a high-throughput investigation of Cas9 scission profile revealed that ~85% of on-target cleavage is blunt, whereas ~15% had a 1 nucleotide 5' overhang. Off-targets had a higher staggered cleavage rate compared to on-target sites, with approximately 1/3 of off-targets displaying 5' overhangs from 1 to 3 nucleotides. The scission profile analysis revealed that sequence patterns in the target sequence favor the formation of blunt or staggered DNA cuts, and staggered cleavage favored the formation of predictable indels.
The interaction of dCas9 with target dsDNA is so tight that a high-molarity urea protein denaturant cannot fully dissociate the dCas9 RNA-protein complex from its dsDNA target. dCas9 has been targeted with engineered single guide RNAs to the transcription initiation sites of loci, where dCas9 can compete with RNA polymerase at promoters to halt transcription. dCas9 can also be targeted to the coding region of loci such that inhibition of RNA polymerase occurs during the elongation phase of transcription. In eukaryotes, silencing of gene expression can be extended by targeting dCas9 to enhancer sequences, where dCas9 can block the assembly of transcription factors, leading to silencing of specific gene expression. Moreover, the guide RNAs provided to dCas9 can be designed to include specific mismatches to the complementary cognate sequence that quantitatively weaken the interaction of dCas9 with its programmed cognate sequence, allowing a researcher to tune the extent of silencing applied to a gene of interest. This technology is similar in principle to RNAi in that gene expression is modulated at the RNA level. However, the dCas9 approach has gained much traction because it produces fewer off-target effects and, in general, larger and more reproducible silencing effects than RNAi screens. Furthermore, because the dCas9 approach to gene silencing can be quantitatively controlled, a researcher can now precisely control the extent to which a gene of interest is repressed, allowing more questions about gene regulation and gene stoichiometry to be answered.
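The idea of tuning repression by deliberately mismatching the guide can be sketched as follows. The code only builds a mismatched spacer and counts the mismatches. It does not predict how much binding is weakened, which depends on position and sequence context. The substitution table and the function names are arbitrary, hypothetical choices.

```python
# Arbitrary substitution table: any base different from the original would do.
SUBSTITUTE = {"A": "C", "C": "A", "G": "T", "T": "G"}


def introduce_mismatches(spacer: str, positions):
    """Return a copy of the spacer with a different base at each 0-based position."""
    bases = list(spacer.upper())
    for pos in set(positions):            # each position is changed only once
        bases[pos] = SUBSTITUTE[bases[pos]]
    return "".join(bases)


def count_mismatches(guide: str, target: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(guide) != len(target):
        raise ValueError("sequences must have the same length")
    return sum(g != t for g, t in zip(guide.upper(), target.upper()))


if __name__ == "__main__":
    target = "GATTACAGATTACAGATTAC"      # illustrative cognate sequence
    tuned = introduce_mismatches(target, [0, 5])
    print(tuned, count_mismatches(tuned, target))
```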
Beyond direct binding of dCas9 to transcriptionally sensitive positions of loci, dCas9 can be fused to a variety of modulatory protein domains to carry out a myriad of functions. Recently, dCas9 has been fused to chromatin remodeling proteins (HDACs/HATs) to reorganize the chromatin structure around various loci. This is important in targeting various eukaryotic genes of interest, as heterochromatin structures hinder Cas9 binding. Furthermore, because Cas9 can react to heterochromatin, it is theorized that this enzyme can be further applied to studying the chromatin structure of various loci. Additionally, dCas9 has been employed in genome-wide screens of gene repression: large libraries of guide RNAs capable of targeting thousands of genes have been used to conduct genome-wide genetic screens with dCas9.
Another method for silencing transcription with Cas9 is to directly cleave mRNA products with the catalytically active Cas9 enzyme. This approach is made possible by hybridizing ssDNA carrying a PAM complement sequence to ssRNA, which creates a dsDNA-RNA PAM site for Cas9 binding. This technology makes it possible to isolate endogenous RNA transcripts in cells without chemically modifying the RNA or using RNA tagging methods.