Cas9 Protein – Structure, Types, Functions, Applications
The field of biology is now experiencing a transformative phase with the advent of facile genome engineering in animals and plants using RNA-programmable CRISPR-Cas9. The CRISPR-Cas9 technology originates from type II CRISPR-Cas systems, which provide bacteria with adaptive immunity to viruses and plasmids. The CRISPR-associated protein Cas9 is an endonuclease that uses a guide sequence within an RNA duplex, tracrRNA:crRNA, to form base pairs with DNA target sequences, enabling Cas9 to introduce a site-specific double-strand break in the DNA. The dual tracrRNA:crRNA was engineered as a single guide RNA (sgRNA) that retains two critical features: a sequence at the 5′ side that determines the DNA target site by Watson-Crick base-pairing and a duplex RNA structure at the 3′ side that binds to Cas9. This finding created a simple two-component system in which changes in the guide sequence of the sgRNA program Cas9 to target any DNA sequence of interest. The simplicity of CRISPR-Cas9 programming, together with a unique DNA cleaving mechanism, the capacity for multiplexed target recognition, and the existence of many natural type II CRISPR-Cas system variants, has enabled remarkable developments using this cost-effective and easy-to-use technology to precisely and efficiently target, edit, modify, regulate, and mark genomic loci of a wide array of cells and organisms.
Name | Cas9 endonuclease |
Alternative name | SpCas9/SpyCas9 |
Organism | Streptococcus pyogenes serotype M1 |
Molecular weight | ~163 kDa |
Gene | cas9 |
Location on chromosome | 0.85 to 0.86 Mb |
Protein | CRISPR-associated endonuclease Cas9/Csn1 |
Cofactor | Mg2+ |
Biological process | Interference: defense response to phage |
Functions | DNA and RNA binding |
- Cas9 is an RNA-guided nuclease that degrades phage DNA by binding the target and cleaving both DNA strands.
- Cas9 is the signature protein of bacterial type II CRISPR systems.
- It requires both crRNA and tracrRNA to function properly.
- Catalytic activity also requires a PAM sequence on the target DNA.
- Cas9 has been engineered for a variety of functions, including gene activation and repression of gene expression.
- Cas9’s significance in CRISPR-mediated gene editing and in applications such as disease modelling, gene-function research, and therapeutic and gene-expression studies is well established.
- Using the PAM sequence as a marker, it locates, binds, and cleaves the target nucleic acid. To identify the target, the sgRNA (a fusion of crRNA and tracrRNA) seeks complementarity with the target site.
- However, this two-level authentication (sgRNA complementarity plus a PAM) restricts the sites that can be edited and can reduce editing efficiency. Engineered and alternative Cas9 nucleases such as dCas9, SaCas9, and XCas9 are therefore available alongside SpCas9.
What is Cas9 Protein?
- Cas9 (CRISPR-associated protein 9) is one of the best-studied and most widely used nucleases, and it is commercially available. It functions not only in bacterial immune systems but also in in vitro gene-editing techniques.
- Cas9 is a DNA endonuclease that cleaves dsDNA at a precise site, and it is exclusive to type II CRISPR systems. The best-characterised version comes from Streptococcus pyogenes and is referred to as a dual RNA-guided DNA endonuclease.
- To understand why Cas9 is so commonly employed for gene editing, it helps to understand the structure, function, and significance of the protein, formerly known as Cas5, Csx12, and Csn1.
- S. pyogenes SpyCas9 is a large (1,368 amino acids), multidomain, and multifunctional DNA endonuclease.
- It uses its two unique nuclease domains to snip dsDNA 3 bp upstream of the PAM: an HNH-like nuclease domain that cleaves the DNA strand complementary to the guide RNA sequence (target strand), and a RuvC-like nuclease domain that cleaves the DNA strand opposite the complementary strand (nontarget strand).
- In addition to its essential role in CRISPR interference, Cas9 also contributes to crRNA maturation and spacer acquisition.
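The cut-site rule described above (a 5′-NGG-3′ PAM, with cleavage 3 bp upstream of it) can be sketched in a few lines of Python. This is a minimal illustration, not a guide-design tool: the function name `find_spcas9_sites` and the 20-nt default spacer length are assumptions made for the example, and only the strand that is passed in is scanned.

```python
import re


def find_spcas9_sites(dna, spacer_len=20):
    """Return (protospacer, pam, cut_index) for each 5'-NGG-3' PAM on this strand.

    cut_index is 0-based: the predicted cut falls between
    dna[cut_index - 1] and dna[cut_index], i.e. 3 bp upstream of the PAM.
    """
    dna = dna.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs (e.g. in "GGG") are all found.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        if pam_start < spacer_len:
            continue  # not enough upstream sequence for a full protospacer
        protospacer = dna[pam_start - spacer_len:pam_start]
        cut_index = pam_start - 3
        sites.append((protospacer, match.group(1), cut_index))
    return sites


if __name__ == "__main__":
    for site in find_spcas9_sites("GATTACAGATTACAGATTACAGGTTT"):
        print(site)
```

To cover the opposite strand as well, the same function could be called on the reverse complement of the input.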
Structure of Cas9
- Cas9 in its apo state has two different lobes: the alpha-helical recognition (REC) lobe and the nuclease (NUC) lobe, which contains the conserved HNH and split RuvC nuclease domains as well as the more variable C-terminal domain (CTD).
- Two linking segments join the two lobes, one created by the arginine-rich bridge helix and the other by a disordered linker (residues 712–717).
- The REC lobe consists of three alpha-helical domains (Hel-I, Hel-II, and Hel-III) and is structurally distinct from all other known proteins.
- The extended CTD has a Cas9-specific fold and contains PAM-interacting sites necessary for PAM interrogation. Nonetheless, this PAM-recognition region is highly disordered in the apo–Cas9 structure, indicating that the apo–Cas9 enzyme is maintained in an inactive state, unable to detect target DNA prior to binding to a guide RNA.
- This structural finding is consistent with so-called DNA curtains assays demonstrating that apo–Cas9 binds nonspecifically to DNA and can be rapidly displaced from nonspecific sites in the presence of a competitor (guide RNA or heparin).
- The structural superimposition of apo–Cas9 with sgRNA-bound and DNA-bound structures indicates further that the enzyme adopts a catalytically inactive conformation in its apo state, requiring RNA-induced structural activation for DNA recognition and cleavage.
- This structural result corroborates the biochemical findings that Cas9 enzymes are inactive as nucleases in the absence of bound guide RNAs and further supports their activity as RNA-guided endonucleases.
HNH and RuvC Nuclease Domains
- Comparing the structures of Cas9 nuclease domains to those of other DNA-bound nucleases shows that the Cas9 RuvC nuclease domain is similar to members of the retroviral integrase superfamily that have an RNase H fold. This suggests that RuvC probably uses a two-metal-ion catalytic mechanism to cut the nontarget DNA strand.
- The HNH nuclease domain, on the other hand, has the same ββα-metal fold as other HNH endonucleases and most likely uses a single metal ion to cut the target-strand DNA.
- One-metal-ion-dependent and two-metal-ion-dependent nucleic acid-cleaving enzymes are characterised by a conserved general-base histidine and a conserved aspartate residue, respectively.
- This is in line with Cas9 mutagenesis studies showing that mutating either the HNH domain (H840A) or the RuvC domain (D10A) turns Cas9 into a nickase, while mutating both nuclease domains (so-called “dead Cas9” or dCas9) preserves RNA-guided DNA binding but abolishes DNA cleavage.
- However, these proposed catalytic mechanisms still need to be verified experimentally.
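The relationship between the D10A and H840A mutations and the resulting activity can be written down as a small truth table. The sketch below is only a bookkeeping model of the statements above (HNH cuts the target strand, RuvC cuts the non-target strand); the class name `Cas9Variant` and its methods are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cas9Variant:
    ruvc_active: bool = True  # D10A inactivates the RuvC domain
    hnh_active: bool = True   # H840A inactivates the HNH domain

    @classmethod
    def from_mutations(cls, mutations):
        muts = set(mutations)
        return cls(ruvc_active="D10A" not in muts,
                   hnh_active="H840A" not in muts)

    def cleaved_strands(self):
        strands = []
        if self.hnh_active:
            strands.append("target")      # strand complementary to the guide
        if self.ruvc_active:
            strands.append("non-target")  # strand opposite the complementary one
        return strands

    def classify(self):
        # Two cut strands: nuclease; one: nickase; none: dCas9.
        return {2: "nuclease", 1: "nickase", 0: "dCas9"}[len(self.cleaved_strands())]


if __name__ == "__main__":
    for muts in ([], ["D10A"], ["H840A"], ["D10A", "H840A"]):
        variant = Cas9Variant.from_mutations(muts)
        print(muts, variant.classify(), variant.cleaved_strands())
```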
Mechanism of working
- In the first step of bacterial interference, the REC lobe and the gRNA work together to form the ribonucleoprotein (RNP) complex. The nuclease domains RuvC and HNH then each break one phosphodiester bond, one on each DNA strand, producing a double-strand break in the dsDNA.
- In more detail, the HNH active site hydrolyzes the phosphodiester bond of the complementary strand, while the RuvC active site hydrolyzes the phosphodiester bond of the non-complementary strand. Both reactions depend on metal ions: RuvC uses two metal ions and HNH uses one (Tang H et al., 2021).
Lobe | Domain | Residues | Function |
REC | Bridge helix | 60-93 | Recognition of DNA |
REC | REC1 | 94-179, 308-713 | RNA-guided DNA targeting |
REC | REC2 | 180-307 | DNA binding |
NUC | RuvC (RuvCI, RuvCII and RuvCIII) | 1-59, 718-769, 909-1098 | RNase H-like fold; nuclease activity on the non-complementary (non-target) strand |
NUC | HNH | 775-908 | Nuclease activity on the complementary (target) strand |
NUC | PAM-interacting domain | 1099-1368 | Finds the PAM sequence on the target DNA |
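The residue ranges in the table above lend themselves to a simple lookup. The following sketch maps an SpCas9 residue number to its domain and lobe using only those ranges; the names `DOMAINS` and `domain_of` are illustrative, and residues that fall in the short linkers not listed in the table (714-717 and 770-774) return `None`.

```python
# (domain, lobe, residue segments), copied from the table above.
DOMAINS = [
    ("RuvC", "NUC", [(1, 59), (718, 769), (909, 1098)]),
    ("Bridge helix", "REC", [(60, 93)]),
    ("REC1", "REC", [(94, 179), (308, 713)]),
    ("REC2", "REC", [(180, 307)]),
    ("HNH", "NUC", [(775, 908)]),
    ("PAM-interacting", "NUC", [(1099, 1368)]),
]

SPCAS9_LENGTH = 1368


def domain_of(residue):
    """Return (domain, lobe) for a 1-based SpCas9 residue number, or None for linkers."""
    if not 1 <= residue <= SPCAS9_LENGTH:
        raise ValueError(f"residue {residue} is outside 1-{SPCAS9_LENGTH}")
    for name, lobe, segments in DOMAINS:
        for start, end in segments:
            if start <= residue <= end:
                return name, lobe
    return None  # linker residue not assigned to a domain in the table


if __name__ == "__main__":
    for res in (10, 840, 1333, 715):
        print(res, domain_of(res))
```

As a sanity check, the catalytic residues mentioned earlier land where expected: D10 falls in RuvC and H840 in HNH.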
Types of Cas9 nucleases
There are different kinds of Cas9 nucleases, both natural and engineered. They are grouped by function or by the species from which they came. A few of them are listed and explained here.
1. SpCas9
Structure | Bilobed (REC and NUC) |
Domains | NUC (nuclease lobe): HNH and RuvC |
Bacterial CRISPR system | Type II |
PAM sequence | 5′-NGG-3′ (N is any nucleotide) |
sgRNA | Required (crRNA:tracrRNA) |
Variants | SpCas9-NRRH, SpG, SpCas9-NRCH, SpCas9-NRTH |
- SpCas9 comes from Streptococcus pyogenes and is one of the most popular, well-studied, and widely used Cas9 nucleases in genetic engineering experiments.
- As noted above, it needs both crRNA and tracrRNA (combined as an sgRNA) and a PAM sequence to find the target.
- Once SpCas9 finds a PAM (5′-NGG-3′), the sgRNA is checked against the adjacent sequence; if it matches, SpCas9 cuts through both strands of DNA at the target site.
- Its structure follows the general Cas9 architecture, with the nuclease lobe for catalytic activity and the recognition lobe for recognising and identifying the target DNA.
Advantages of SpCas9
- Readily available and well researched.
- Easy to isolate.
- Highly efficient.
- Simple to use.
Disadvantages of SpCas9
- Requires a PAM sequence.
- Also acts at incorrect sites, producing off-target effects.
- Recognises non-canonical PAMs as well, such as 5′-NAG-3′ and 5′-NGA-3′.
- Its large size makes it hard to deliver.
- Difficult to express.
Applications of SpCas9
- As noted above, SpCas9 has been studied carefully and a lot of data is available for it, which makes it popular in gene therapy research. Among the most common uses are transcriptional repression, transcriptional activation, epigenetic modulation, gene disruption, and single-base-pair conversion.
2. SaCas9
Structure | Bilobed (REC and NUC) |
Domains | NUC (nuclease lobe): HNH and RuvC; REC (recognition lobe): Rec1, Rec2 and Rec3 |
Bacterial CRISPR system | Type II |
PAM sequence | 5′-NNGRRT-3′ (N is any nucleotide) |
sgRNA | Required (crRNA:tracrRNA) |
Variants | efSaCas9, KKHSaCas9 and SaCas9-HF |
- SaCas9 is another very popular Cas9 nuclease. Its structure is similar to that of SpCas9, but it is considerably smaller, which is its main advantage. It can therefore be used in place of SpCas9 where size is limiting.
- SaCas9 comes from the bacterium Staphylococcus aureus. It is made up of only 1,053 amino acids, so its coding sequence is about 1 kb shorter than that of SpCas9.
- It also needs a PAM sequence, 5′-NNGRRT-3′, to tell the difference between its own DNA and foreign DNA. Like SpCas9, it cuts both strands to make a double-strand break, and the resulting ends are predominantly blunt rather than sticky.
Advantages
- Small in size
- High accuracy
- Versatile
- Easy to package into viral vectors
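The size advantage can be made concrete with a little arithmetic on the amino-acid counts given in this article (1,368 for SpCas9 and 1,053 for SaCas9). The sketch below is a rough estimate only. Two numbers in it are assumptions that do not come from the article: a packaging limit of about 4.7 kb, which is the figure commonly quoted for AAV vectors, and a purely hypothetical 1,000 bp allowance for a promoter, a polyadenylation signal, and an sgRNA cassette.

```python
AAV_CAPACITY_BP = 4700  # assumed approximate AAV packaging limit (not from the article)


def coding_length_bp(n_amino_acids):
    """Length of the open reading frame in bp, including one stop codon."""
    return 3 * (n_amino_acids + 1)


def fits_in_vector(n_amino_acids, extra_bp=0, capacity_bp=AAV_CAPACITY_BP):
    """True if the coding sequence plus extra_bp of other cargo fits in the vector."""
    return coding_length_bp(n_amino_acids) + extra_bp <= capacity_bp


if __name__ == "__main__":
    extra = 1000  # hypothetical allowance for regulatory elements and sgRNA cassette
    for name, length in (("SpCas9", 1368), ("SaCas9", 1053)):
        print(name, coding_length_bp(length), "bp, fits:", fits_in_vector(length, extra))
```

Under these assumptions the SpCas9 open reading frame (4,107 bp) leaves too little room for the rest of the cargo, whereas the SaCas9 open reading frame (3,162 bp, 945 bp shorter) fits.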
Disadvantages
- Required PAM sequence
- Requires a longer sgRNA, and off-target effects can still occur.
Applications
- SaCas9 is widely used to edit plant genomes in studies of how plants and pests interact.
- Research on stress tolerance
- Research into pathogen resistance
- It can also be used to treat diseases that are caused by viruses or by genetic defects.
- Recently, a special kind of SpCas9 was used to investigate the role of the myostatin gene in muscular atrophy.
3. ScCas9
Name | ScCas9 |
Species derived | Streptococcus canis |
PAM sequence | 5′-NNG-3′ |
sgRNA requirement | Yes, as crRNA:tracrRNA |
Variants | ScCas9++, ScCas9n++ |
- ScCas9 was found in Streptococcus canis. It recognises a slightly less restrictive PAM, 5′-NNG-3′ (instead of NGG).
- Its structure is similar to that of other Cas9 nucleases, but it is used less widely because its editing activity is lower.
- Plant genome editing is often done with ScCas9 and its variants, such as ScCas9+, ScCas9++, and ScCas9n++.
4. dCas9
dCas9 variant | Function |
dCas9-TadA | Retains adenosine deaminase activity; can repair a faulty or mutated gene, such as a mutated resistance gene in bacteria |
dCas9-rAPOBEC1 | Retains cytidine deaminase activity |
dCas9-APOBEC3A | Retains cytidine deaminase activity |
dCas9-AID | Retains cytidine deaminase activity |
SunTag-VP64 | Transcriptional activator used to study the effect of overexpression |
dCas9-VPR | Tripartite complex and transcriptional activator |
dCas9-CBP | Rearranges chromatin structure through its histone acetyltransferase domain |
Falk-fused dCas9 | Transcriptional activator module |
dCas9 is one of the most flexible versions of the Cas9 nuclease precisely because it lacks nucleolytic activity, which is the main job of a nuclease. It is therefore called the dead Cas9 (dCas9) system.
When the catalytic residues of both nuclease domains are inactivated, Cas9 can still find and bind the target DNA, but it cannot cut it. As a result, fused effectors such as transcriptional activators or repressors can be delivered to a chosen target location.
5. ThermoCas9
Property | SpCas9 | GeoCas9 |
Size | 1368 aa | 1087 aa |
PAM | NGG | CRAA (R = A or G) |
Spacer length | 20 nt | 22 nt |
Temperature | 33-45 °C | 50-70 °C |
- Mougiakos et al. (2017) characterised ThermoCas9, a nuclease that works well at elevated temperatures. It comes from the thermophilic bacterium Geobacillus thermodenitrificans T12.
- They also reported that it can delete genes and silence transcription at higher temperatures (55°C) without losing its sensitivity or its PAM requirement. It is active between 20°C and 70°C.
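- It is closely related to GeoCas9, a thermostable Cas9 from Geobacillus stearothermophilus, and the two names are sometimes used interchangeably.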
6. HypaCas9
- HypaCas9 is a hyper-accurate Cas9 variant that enhances genome-wide specificity without diminishing on-target activity in human and mouse cells.
- It thereby reduces off-target activity. HypaCas9 is created by introducing the mutations N692A, M694A, Q695A, and H698A into SpCas9.
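Mutation labels such as N692A follow the usual convention of wild-type residue, 1-based position, then replacement residue. A minimal sketch of how such labels could be parsed and applied to a protein sequence is shown below. The helper names `parse_mutation` and `apply_mutations` are hypothetical, and the demonstration uses a five-residue toy peptide rather than the real SpCas9 sequence, which is not reproduced here.

```python
import re

MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(code):
    """Split a label like 'N692A' into (wild_type, position, replacement)."""
    match = MUTATION_RE.match(code)
    if match is None:
        raise ValueError(f"not a point-mutation label: {code!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def apply_mutations(protein, codes):
    """Return the protein with each point mutation applied (positions are 1-based)."""
    residues = list(protein)
    for code in codes:
        wild_type, position, replacement = parse_mutation(code)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{code}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{code}: expected {wild_type} at position {position}")
        residues[position - 1] = replacement
    return "".join(residues)


if __name__ == "__main__":
    print(parse_mutation("N692A"))
    print(apply_mutations("MNKQH", ["N2A", "H5A"]))  # toy peptide, not SpCas9
```

Checking the wild-type residue before substituting guards against applying a label to the wrong sequence or numbering scheme.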
7. eSpCas9
- Enhanced-specificity SpCas9 (eSpCas9) is an engineered version of the natural SpCas9 that carries point mutations reducing off-target activity.
- It is sometimes described as a high-fidelity or highly specific SpCas9.
8. XCas9
- XCas9 is an engineered nuclease with reduced off-target effects at both NGG and non-NGG PAMs.
- Cas9 requires a PAM sequence in order to function, which boosts its specificity but restricts the sites that can be targeted.
- XCas9 can effectively recognise many PAM sequences, including NGG, GAA, and GAT.
- It therefore covers more target sites than SpCas9 or SaCas9 and significantly relaxes the PAM requirement (Hu et al., 2018).
Cas9 type | Origin | PAM sequence (5′ to 3′) | Specialization |
SpCas9 | Streptococcus pyogenes | NGG | Cleaves dsDNA using the sgRNA |
SaCas9 | Staphylococcus aureus | NNGRRT or NNGRR(N) | Low off-target effect |
ScCas9 | Streptococcus canis | NNG | The PAM sequence can be altered depending upon the variant used |
ThermoCas9 | Geobacillus thermodenitrificans T12 | CRAA (R = A or G) | Works efficiently at higher temperatures |
StCas9 | Streptococcus thermophilus | NNAGAAW | High on-target cleavage activity |
HypaCas9 | Streptococcus pyogenes | NGG | Greater genome-wide specificity |
eSpCas9 | Streptococcus pyogenes | NGG | Enhanced SpCas9 that works more specifically than native SpCas9 |
NmCas9 | Neisseria meningitidis | NNNNGATT | Needs a longer crRNA, which increases accuracy |
XCas9 | Streptococcus pyogenes | NGG and non-NGG | Evolved Cas9 that recognises both NGG and non-NGG PAMs |
dCas9 | Streptococcus pyogenes | NGG | Specialized Cas9 that lacks nuclease activity |
Cas9-DD | Streptococcus pyogenes | NGG | Destabilized Cas9 designed to increase accuracy and efficiency |
SpCas9-VQR | Streptococcus pyogenes | NGA | Altered PAM recognition, expanding the range of SpCas9 target sites |
SpCas9-EQR | Streptococcus pyogenes | NGAG | Altered PAM recognition, expanding the range of SpCas9 target sites |
SpCas9-VRER | Streptococcus pyogenes | NGCG | Altered PAM recognition, expanding the range of SpCas9 target sites |
SpCas9-NG | Streptococcus pyogenes | NG | Altered PAM recognition, expanding the range of SpCas9 target sites |
SpCas9-HF1 | Streptococcus pyogenes | NGG | High-fidelity variant with reduced off-target activity |
evoCas9 | Streptococcus pyogenes | NGG | High-fidelity variant with reduced off-target activity |
Sniper-Cas9 | Streptococcus pyogenes | NGG | High-fidelity variant with reduced off-target activity |
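The PAM column above uses IUPAC nucleotide codes (N for any base, R for A or G, W for A or T). One way to check which nucleases could act at a given site is to translate each PAM into a regular expression and test it against the sequence immediately 3′ of the protospacer. The sketch below does this for a handful of entries from the table; the dictionary `PAMS` is an illustrative subset, and `compatible_variants` is a hypothetical helper name.

```python
import re

# IUPAC codes needed for the PAMs listed below.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "W": "[AT]",
}

# Subset of the table above (PAMs written 5' to 3').
PAMS = {
    "SpCas9": "NGG",
    "SaCas9": "NNGRRT",
    "ScCas9": "NNG",
    "StCas9": "NNAGAAW",
    "NmCas9": "NNNNGATT",
    "SpCas9-VQR": "NGA",
    "SpCas9-NG": "NG",
}


def pam_regex(pam):
    """Compile an IUPAC PAM string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in pam))


def compatible_variants(flanking):
    """Names of nucleases whose PAM matches the start of the 3' flanking sequence."""
    flanking = flanking.upper()
    return sorted(
        name for name, pam in PAMS.items() if pam_regex(pam).match(flanking)
    )


if __name__ == "__main__":
    print(compatible_variants("TGGAAT"))
    print(compatible_variants("AGA"))
```

A match here only means the PAM is compatible; it says nothing about guide complementarity or editing efficiency at that site.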
CRISPR–Cas9 Effector Complex Assembly
- Cas9 must be associated with guide RNA (a natural crRNA–tracrRNA or a sgRNA) to create an active DNA surveillance complex for site-specific DNA recognition and cleavage.
- The 20-nt spacer sequence of crRNA confers DNA target selectivity, whereas tracrRNA is indispensable for Cas9 recruitment.
- Genetic and biochemical studies have established the significance of a so-called seed sequence of RNA nucleotides within the spacer region of crRNAs for target selectivity.
- In type II CRISPR systems, the seed region is defined as the 10–12 nucleotides at the 3′ end of the 20-nt spacer sequence, which are closest to the PAM.
- Mismatches in this seed region severely impede or abrogate target DNA binding and cleavage, but close homology in the seed region frequently results in off-target binding events, even in the presence of numerous mismatches elsewhere.
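The distinction between seed and non-seed mismatches can be illustrated by counting them separately. The sketch below compares a spacer with a candidate protospacer, both written 5′ to 3′ in the same sense so that the 3′ end is the PAM-proximal end. The function name `mismatch_profile` and the default seed length of 12 (the upper end of the 10–12 nt range given above) are choices made for this example; no scoring or cleavage prediction is attempted.

```python
def mismatch_profile(spacer, protospacer, seed_len=12):
    """Count mismatches in the PAM-proximal seed and in the rest of the spacer.

    Both sequences are read 5' to 3' in the same sense; the last seed_len
    positions (the 3' end) are treated as the seed region.
    """
    if len(spacer) != len(protospacer):
        raise ValueError("spacer and protospacer must have the same length")
    if not 0 <= seed_len <= len(spacer):
        raise ValueError("seed_len must be between 0 and the spacer length")
    # Allow the spacer to be given as RNA by treating U as T.
    spacer = spacer.upper().replace("U", "T")
    protospacer = protospacer.upper().replace("U", "T")
    seed_start = len(spacer) - seed_len
    counts = {"seed": 0, "non_seed": 0}
    for index, (a, b) in enumerate(zip(spacer, protospacer)):
        if a != b:
            counts["seed" if index >= seed_start else "non_seed"] += 1
    return counts


if __name__ == "__main__":
    guide = "GAUUACAGAUUACAGAUUAC"
    print(mismatch_profile(guide, "GATTACAGATTACAGATTAC"))  # perfect match
    print(mismatch_profile(guide, "CATTACAGATTACAGATTAC"))  # PAM-distal mismatch
    print(mismatch_profile(guide, "GATTACAGATTACAGATTAG"))  # PAM-proximal mismatch
```

In line with the text, a site with mismatches only in the `non_seed` count is a more plausible off-target candidate than one with mismatches in the seed.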
Conformational Rearrangement Upon sgRNA Binding
- The sgRNA-bound crystal structure best illustrates the principles of Cas9–sgRNA assembly and the placement of guide RNA prior to target recognition.
- Comparison of the sgRNA-bound structure to that of apo–Cas9 reveals precisely how guide RNA binding induces Cas9 to undergo a substantial structural rearrangement from an inactive conformation to a DNA recognition–competent conformation, as previously suggested by lower-resolution electron microscopy studies.
- Upon sgRNA binding, the most notable conformational shift occurs in the REC lobe, namely in Hel-III, which moves ~65 Å toward the HNH domain.
- Cas9 exhibits much smaller conformational changes upon binding to target DNA and PAM sequence, indicating that the majority of the extensive structural rearrangements occur prior to target DNA binding and reinforcing the notion that guide RNA loading is an essential regulator of Cas9 enzyme function.
Interactions with sgRNA
- Cas9 interacts extensively with the sgRNA. It forms several direct interactions with the repeat–antirepeat duplex, stem loop 1, and the linker region between stem loops 1 and 2 via Hel-I, the arginine-rich bridge helix, and the CTD domain.
- Cas9 makes far fewer interactions with stem loop 2 of the sgRNA, mostly through its RuvC and CTD domains.
- Because the sgRNA construct used for crystallography lacked the 3′ tracrRNA tail, no protein–RNA interaction was detected for stem loop 3 in the Cas9–sgRNA structure.
- However, the DNA-target-bound structures demonstrate that Cas9 has very few interactions with stem loop 3.
- According to biochemical investigations, sgRNAs lacking the linker region and stem loops 2 and 3 are still capable of inducing Cas9-mediated DNA cleavage, albeit with diminished efficiency, but stem loop 1 deletion entirely abolishes cleavage.
- Nevertheless, functional studies demonstrate that stem loops 2 and/or 3 are necessary for substantial Cas9 activation in vivo.
- These observations suggest that the repeat–antirepeat duplex and stem loop 1 are required for Cas9–sgRNA complex formation, whereas the linker, stem loop 2, and stem loop 3 are not required for function but may stabilise guide RNA binding to promote active complex formation, thereby enhancing catalytic efficiency in vivo.
Preordered Seed RNA and PAM-Interacting Cleft
- Cas9 creates extensive interactions with the ribose–phosphate backbone of the guide RNA, thereby establishing the A-form conformation of the 10-nt RNA seed sequence required for initial DNA interrogation.
- This preordering is assumed to be thermodynamically advantageous for target binding, similar to the positioning of guide RNA reported in other small regulatory RNA processes, such as the bacterial Hfq protein–RNA complex and eukaryotic Argonaute-mediated RNA silencing.
- Notably, in the type I CRISPR interference complex Cascade, the guide RNA is preordered throughout the entire crRNA, not just in the seed region. This is likely due to the helical assembly of the complex and the release of topological constraints by completely flipped-out nucleotides at every sixth position.
- The PAM-interacting residues R1333 and R1335, which are responsible for 5′-NGG-3′ PAM recognition and are disordered in the apo structure, are prepositioned before contact is made with target DNA, demonstrating that sgRNA loading allows Cas9 to form a DNA recognition–competent structure.
- Notably, although the 5′ 10-nt nonseed RNA sequence is completely disordered in the sgRNA-bound crystal structure, the electron microscopy (EM) structure of SpyCas9 bound to a full-length sgRNA (EMD-3276) reveals that the 5′ end of the guide RNA lies within the cavity formed between the HNH and RuvC nuclease domains.
- This structural observation shows that the 5′ end of the sgRNA is shielded from degradation and that an additional conformational change is necessary to release the 5′ distal end from this constraint during target DNA binding.
References
Jiang, F., & Doudna, J. A. (2017). CRISPR-Cas9 structures and mechanisms. Annual Review of Biophysics, 46, 505–529. https://doi.org/10.1146/annurev-biophys-062215-010822
Wada, N., Ueta, R., Osakabe, Y., et al. (2020). Precision genome editing in plants: state-of-the-art in CRISPR/Cas9-based genome engineering. BMC Plant Biology, 20, 234.
Nishimasu, H., et al. (2014). Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell, 156(5), 935–949. https://doi.org/10.1016/j.cell.2014.02.001
Zuo, Z., & Liu, J. (2017). Structure and dynamics of Cas9 HNH domain catalytic state. Scientific Reports, 7, 17271. https://doi.org/10.1038/s41598-017-17578-6
Mougiakos, I., Mohanraju, P., Bosma, E. F., et al. (2017). Characterizing a thermostable Cas9 for bacterial genome editing and silencing. Nature Communications, 8, 1647. https://doi.org/10.1038/s41467-017-01591-4
Hu, J., Miller, S., Geurts, M., et al. (2018). Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature, 556, 57–63. https://doi.org/10.1038/nature26155
Qi, L. S., Larson, M. H., Gilbert, L. A., Doudna, J. A., Weissman, J. S., Arkin, A. P., & Lim, W. A. (2013). Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell, 152(5), 1173–1183. https://doi.org/10.1016/j.cell.2013.02.022
https://geneticeducation.co.in/cas9-protein-structure-function-types-and-importance/