Cas9 Protein – Structure, Types, Functions, Applications

The field of biology is now experiencing a transformative phase with the advent of facile genome engineering in animals and plants using RNA-programmable CRISPR-Cas9. The CRISPR-Cas9 technology originates from type II CRISPR-Cas systems, which provide bacteria with adaptive immunity to viruses and plasmids. The CRISPR-associated protein Cas9 is an endonuclease that uses a guide sequence within an RNA duplex, tracrRNA:crRNA, to form base pairs with DNA target sequences, enabling Cas9 to introduce a site-specific double-strand break in the DNA. The dual tracrRNA:crRNA was engineered as a single guide RNA (sgRNA) that retains two critical features: a sequence at the 5′ side that determines the DNA target site by Watson-Crick base-pairing and a duplex RNA structure at the 3′ side that binds to Cas9. This finding created a simple two-component system in which changes in the guide sequence of the sgRNA program Cas9 to target any DNA sequence of interest. The simplicity of CRISPR-Cas9 programming, together with a unique DNA cleaving mechanism, the capacity for multiplexed target recognition, and the existence of many natural type II CRISPR-Cas system variants, has enabled remarkable developments using this cost-effective and easy-to-use technology to precisely and efficiently target, edit, modify, regulate, and mark genomic loci of a wide array of cells and organisms.

Property | Value
--- | ---
Name | Cas9 endonuclease
Alternative name | SpCas9/SpyCas9
Organism | Streptococcus pyogenes serotype M1
Molecular weight | ~163 kDa
Gene | cas9
Location on chromosome | 0.85 to 0.86 Mb
Protein | CRISPR-associated endonuclease Cas9/Csn1
Cofactor | Mg2+
Biological process | Interference (defense response to phage); maintenance of CRISPR repeat sequences
Functions | DNA and RNA binding; metal ion binding; 3′-5′ exonuclease activity; endonuclease activity

  • Cas9 is an RNA-guided nuclease that degrades phage DNA by binding it and cleaving both strands.
  • Cas9 protein is prominent in CRISPR systems of bacterial type II.
  • It requires both crRNA and tracrRNA to function properly.
  • Catalytic activity also requires a PAM sequence on the target DNA.
  • Cas9 has been modified for a variety of functions, including gene activation and repression of gene expression.
  • Cas9’s significance in CRISPR-mediated gene editing and in applications such as disease modelling, gene function studies, and therapeutic and gene expression investigations is well established.
  • Using the PAM sequence as a marker, it locates, binds, and cleaves the target nucleic acid. To identify the target, the sgRNA, which combines crRNA and tracrRNA, seeks complementarity with the target site.
  • However, this two-level authentication (the sgRNA and the PAM) can significantly limit gene editing efficiency in vitro. Therefore, natural and engineered Cas9 nucleases such as SpCas9, dCas9, SaCas9, and xCas9 are available.


What is Cas9 Protein?

  • Cas9, also known as CRISPR-associated protein 9, is one of the well-studied, significant, and commercially available nucleases employed not only in bacterial systems, but also in in vitro gene-editing techniques.
  • Cas9 is a DNA nuclease that can accurately cleave dsDNA, and it is exclusive to type II CRISPR systems. The best-studied version comes from Streptococcus pyogenes, and it is referred to as a dual RNA-guided DNA endonuclease.
  • To understand why Cas9 is so commonly employed for gene editing, it is necessary to understand the structure, function, and significance of the Cas9 protein, formerly known as Cas5, Csx12, and Csn1.
  • S. pyogenes SpyCas9 is a large (1,368 amino acids), multidomain, and multifunctional DNA endonuclease.
  • It uses its two unique nuclease domains to snip dsDNA 3 bp upstream of the PAM: an HNH-like nuclease domain that cleaves the DNA strand complementary to the guide RNA sequence (target strand), and a RuvC-like nuclease domain that cleaves the DNA strand opposite the complementary strand (nontarget strand).
  • In addition to its essential role in CRISPR interference, Cas9 also contributes to crRNA maturation and spacer acquisition.
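
The targeting rule described above (a 20-nt protospacer followed by a 5′-NGG-3′ PAM, with the cut 3 bp upstream of the PAM) can be sketched in a few lines of Python. This is a minimal illustration that scans only the forward strand; the function name `find_spcas9_sites` and the returned tuple layout are assumptions made for this example, not part of any published tool.

```python
import re

def find_spcas9_sites(dna, guide_len=20):
    """Return candidate SpCas9 sites on the forward strand of `dna`.

    Each hit is (protospacer, pam, cut_index). cut_index is the 0-based
    index of the first base after the cut, i.e. 3 bp upstream of the PAM.
    """
    dna = dna.upper()
    # A lookahead keeps overlapping candidates; the PAM is any base then GG.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    sites = []
    for match in pattern.finditer(dna):
        pam_start = match.start() + guide_len
        sites.append((match.group(1), match.group(2), pam_start - 3))
    return sites

if __name__ == "__main__":
    example = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
    for protospacer, pam, cut in find_spcas9_sites(example):
        print(protospacer, pam, cut)
```

A real guide-design workflow would also scan the reverse complement and score off-target matches; both are left out here to keep the sketch short.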

Structure of Cas9

  • Cas9 in its apo state has two different lobes: the alpha-helical recognition (REC) lobe and the nuclease (NUC) lobe, which contains the conserved HNH and split RuvC nuclease domains as well as the more variable C-terminal domain (CTD).
  • Two linking segments join the two lobes, one created by the arginine-rich bridge helix and the other by a disordered linker (residues 712–717).
  • The REC lobe consists of three alpha-helical domains (Hel-I, Hel-II, and Hel-III) and is structurally distinct from all other known proteins.
  • The extended CTD has a Cas9-specific fold and contains PAM-interacting sites necessary for PAM interrogation. Nonetheless, this PAM-recognition region is highly disordered in the apo–Cas9 structure, indicating that the apo–Cas9 enzyme is maintained in an inactive state, unable to detect target DNA prior to binding to a guide RNA.
  • This structural finding is consistent with so-called DNA curtain assays demonstrating that apo–Cas9 binds nonspecifically to DNA and can be swiftly removed from nonspecific locations in the presence of a competitor (guide RNA or heparin).
  • The structural superimposition of apo–Cas9 with sgRNA-bound and DNA-bound structures indicates further that the enzyme adopts a catalytically inactive conformation in its apo state, requiring RNA-induced structural activation for DNA recognition and cleavage.
  • This structural result corroborates the biochemical findings that Cas9 enzymes are inactive as nucleases in the absence of bound guide RNAs and further supports their activity as RNA-guided endonucleases.

HNH and RuvC Nuclease Domains

  • Comparing the structures of Cas9 nuclease domains to those of other DNA-bound nucleases shows that the Cas9 RuvC nuclease domain is similar to members of the retroviral integrase superfamily that have an RNase H fold. This suggests that RuvC probably uses a two-metal-ion catalytic mechanism to cut the nontarget DNA strand.
  • The HNH nuclease domain, on the other hand, has the same ββα-metal fold as other HNH endonucleases and most likely uses a single metal ion to cut the target-strand DNA.
  • One-metal-ion-dependent and two-metal-ion-dependent nucleic acid cleaving enzymes are characterised by a conserved general-base histidine and a conserved aspartate residue, respectively.
  • This is in line with Cas9 mutagenesis studies showing that mutating either the HNH domain (H840A) or the RuvC domain (D10A) turns Cas9 into a nickase, while mutating both nuclease domains (so-called “dead Cas9” or dCas9) preserves RNA-guided DNA binding but abolishes DNA cleavage.
  • These proposed catalytic mechanisms still require experimental validation.
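
The mutagenesis logic above (D10A inactivates RuvC, H840A inactivates HNH) can be written as a small lookup. The sketch below is only an illustration of that reasoning; the function name `cleavage_outcome` and its return format are assumptions made for this example.

```python
def cleavage_outcome(mutations):
    """Classify an SpCas9 variant from its active-site substitutions.

    Returns (kind, strands_cut). Only D10A and H840A are considered;
    any other mutation names are ignored in this sketch.
    """
    muts = set(mutations)
    strands_cut = []
    if "H840A" not in muts:      # HNH cuts the target strand
        strands_cut.append("target")
    if "D10A" not in muts:       # RuvC cuts the non-target strand
        strands_cut.append("non-target")

    if len(strands_cut) == 2:
        kind = "nuclease (double-strand break)"
    elif len(strands_cut) == 1:
        kind = "nickase"
    else:
        kind = "dCas9 (binding only)"
    return kind, strands_cut

if __name__ == "__main__":
    for variant in ([], ["D10A"], ["H840A"], ["D10A", "H840A"]):
        print(variant, cleavage_outcome(variant))
```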

Mechanism of working

  • In the first step of bacterial interference, the REC lobe and the gRNA work together to form the ribonucleoprotein complex (RNP). Then, the nuclease domains RuvC and HNH each break one phosphodiester bond, one on each DNA strand, which produces a double-strand break.
  • An in-depth study shows that the HNH active site hydrolyzes the phosphodiester bond of the complementary strand, while the RuvC active site hydrolyzes the phosphodiester bond of the non-complementary strand. Hydrolysis depends on metal ions: RuvC uses two metal ions and HNH uses one (Tang H et al., 2021).

Lobe | Domain | Residues | Function
--- | --- | --- | ---
REC | Bridge helix | 60-93 | Recognition of DNA
REC | REC1 | 94-179, 308-713 | RNA-guided DNA targeting
REC | REC2 | 180-307 | DNA binding
NUC | RuvC (RuvCI, RuvCII and RuvCIII) | 1-59, 718-769, 909-1098 | RNase H-like fold; nuclease activity on the non-complementary (non-target) strand
NUC | HNH | 775-908 | Nuclease activity on the complementary (target) strand
NUC | PAM-interacting domain | 1099-1368 | Finds the PAM sequence on the target DNA
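
As a quick way to read the residue ranges in the table above, the following sketch maps a residue number to its domain. The ranges are copied from the table; the dictionary name `DOMAINS` and the function `domain_of` are hypothetical names chosen for this example. Residues that fall in the gaps between listed ranges (for example the linker) return `None`.

```python
# Residue ranges (1-based, inclusive) taken from the domain table above.
DOMAINS = {
    "Bridge helix": [(60, 93)],
    "REC1": [(94, 179), (308, 713)],
    "REC2": [(180, 307)],
    "RuvC": [(1, 59), (718, 769), (909, 1098)],
    "HNH": [(775, 908)],
    "PAM-interacting": [(1099, 1368)],
}

def domain_of(residue):
    """Return the domain containing `residue`, or None if it is unlisted."""
    for name, segments in DOMAINS.items():
        if any(start <= residue <= end for start, end in segments):
            return name
    return None

if __name__ == "__main__":
    # Catalytic residues D10 and H840, and PAM-contacting R1333 and R1335.
    for position in (10, 840, 1333, 1335):
        print(position, domain_of(position))
```

The lookup places D10 in RuvC and H840 in HNH, which matches the nickase mutations discussed earlier.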


Types of Cas9 nucleases

There are different kinds of Cas9 nucleases that come from both nature and labs. They are put into groups based on their function or the species from which they came. I’ll list and explain a few of them here.

1. SpCas9

Feature | SpCas9
--- | ---
Structure | Bilobed (REC and NUC)
Domains | NUC (nuclease domain): HNH and RuvC; REC (recognition domain): Rec1, Rec2 and Rec3
Bacterial CRISPR system | Type II
PAM sequence | 5′-NGG-3′ (N is any nucleotide)
sgRNA | Required (crRNA:tracrRNA)
Variants | SpCas9-NRRH, SpG, SpCas9-NRCH, SpCas9-NRTH

  • SpCas9 comes from Streptococcus Pyogenes and is one of the most popular, well-studied, and widely used Cas9 nucleases in genetic engineering experiments.
  • As noted above, it needs both crRNA and tracrRNA (combined as an sgRNA) and the PAM sequence to find the target.
  • Once the SpCas9 finds the PAM (5′-NGG-3′) sequence, the sgRNA sends the nuclease right to the target region, where the spCas9 cuts through both strands of DNA.
  • The structure is similar to the general structure of Cas9, with the nuclease lobe for catalytic activity and the recognition lobe for recognising and identifying the target DNA.

Advantages of SpCas9

  • Easy to obtain and well researched.
  • Simple to isolate.
  • Very efficient.
  • Simple to use.

Disadvantages of SpCas9

  • Requires a PAM sequence.
  • Can also bind false PAM sites and produce off-target effects.
  • Can also recognise non-canonical PAMs, such as 5′-NAG-3′ and 5′-NGA-3′.
  • Its large size makes it hard to deliver.
  • Can be difficult to express.

Applications of SpCas9

  • As noted above, SpCas9 has been studied carefully and is backed by a large body of data, which makes it popular in gene therapy. Common uses include transcriptional repression, transcriptional activation, epigenetic modulation, gene disruption, and conversion of a single base pair.

2. SaCas9

Feature | SaCas9
--- | ---
Structure | Bilobed (REC and NUC)
Domains | NUC (nuclease domain): HNH and RuvC; REC (recognition domain): Rec1, Rec2 and Rec3
Bacterial CRISPR system | Type II
PAM sequence | 5′-NNGRRT-3′ (N is any nucleotide)
sgRNA | Required (crRNA:tracrRNA)
Variants | efSaCas9, KKHSaCas9 and SaCas9-HF

  • SaCas9 is another very popular Cas9 nuclease. Its structure is similar to that of SpCas9, but it is smaller, which is its main advantage. It can therefore be used in place of SpCas9.
  • SaCas9 comes from the bacterium Staphylococcus aureus. It is made up of only 1053 amino acids, so its coding sequence is about 1 kb shorter than that of SpCas9.
  • It also needs a PAM sequence, such as 5′-NNGRRT-3′, to tell the difference between its own DNA and foreign DNA. Like other Cas9 nucleases, it makes a double-strand break with predominantly blunt ends.
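
The size difference quoted above is a difference in coding sequence length, and it can be checked with simple arithmetic from the two protein lengths given in this article (1368 and 1053 amino acids). The helper name `coding_length_bp` is an assumption made for this example, and stop codons are ignored.

```python
SPCAS9_AA = 1368  # SpCas9 length in amino acids, as stated in this article
SACAS9_AA = 1053  # SaCas9 length in amino acids, as stated in this article

def coding_length_bp(n_residues):
    """Coding sequence length in base pairs (3 per codon, no stop codon)."""
    return 3 * n_residues

saving_bp = coding_length_bp(SPCAS9_AA) - coding_length_bp(SACAS9_AA)
print(saving_bp)  # 945, i.e. roughly 1 kb
```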

Advantages

  • Small in size
  • High accuracy
  • Versatile
  • Easy to package into a viral vector

Disadvantages

  • Required PAM sequence
  • Needs a longer sgRNA, which may contribute to off-target effects.

Applications

  • SaCas9 is widely used to edit plant genomes in studies of how plants and pests interact.
  • Research on stress tolerance
  • Research into pathogen resistance
  • It can also be used to treat diseases that are caused by viruses or genes.
  • Recently, a special kind of SpCas9 was used to figure out what role the Myostatin gene plays in Muscular atrophy.

3. ScCas9

Feature | ScCas9
--- | ---
Name | ScCas9
Species derived | Streptococcus canis
PAM sequence | 5′-NNG-3′
sgRNA requirement | Yes, as crRNA:tracrRNA
Variants | ScCas9++, ScCas9n++

  • The ScCas9 nuclease was found in Streptococcus canis. It recognises a slightly less restrictive PAM, 5′-NNG-3′ (instead of NGG).
  • Its structure is similar to that of other Cas9 nucleases, but it is a less common choice because its activity is generally lower.
  • ScCas9 and its variants, such as ScCas9++, ScCas9n++, and ScCas9+, are used mainly for plant genome editing.

4. dCas9

dCas9 variant | Function
--- | ---
dCas9-TadA | Preserves adenosine deaminase activity; can repair faulty or mutated resistance genes in bacteria for various gene editing purposes
dCas9-rAPOBEC1 | Preserves cytidine deaminase activity
dCas9-APOBEC3A | Preserves cytidine deaminase activity
dCas9-AID | Preserves cytidine deaminase activity
SunTag-VP64 | Transcriptional activator used to study the effect of overexpression
dCas9-VPR | Tripartite transcriptional activator complex
dCas9-CBP | Rearranges chromatin structure through a histone acetyltransferase domain
Falk-fused dCas9 | Transcriptional activator module

dCas9 is one of the most flexible versions of the Cas9 nuclease because it lacks nucleolytic activity, which is the main function of a nuclease. For this reason it is called the dead Cas9 system.

With the catalytic residues inactivated, the protein can still find the target DNA but cannot cut it. Effector proteins fused to dCas9, such as transcription factors, can therefore be delivered to a chosen target location.

5. ThermoCas9

Feature | SpCas9 | GeoCas9
--- | --- | ---
Size | 1368 aa | 1087 aa
PAM | NGG | CRAA (R = A or G)
Spacer length | 20 nt | 22 nt
Temperature (°C) | 33-45 | 50-70

  • Mougiakos et al. (2017) characterised ThermoCas9, a nuclease that works well at higher temperatures. It comes from the thermophilic bacterium Geobacillus thermodenitrificans T12.
  • They also reported that it can delete genes and silence transcription even at higher temperatures (55°C) without affecting the sensitivity or the need for a PAM. It is active between 20°C and 70°C.
  • GeoCas9, a closely related thermostable Cas9 from Geobacillus stearothermophilus, is often discussed alongside it.

6. HypaCas9

  • HypaCas9 is a hyper-accurate Cas9 variant that enhances genome-wide specificity without diminishing on-target activity in human and mouse cells.
  • It thereby reduces off-target activity. HypaCas9 is created by introducing the mutations N692A, M694A, Q695A, and H698A into SpCas9.

7. eSpCas9

  • Enhanced-specificity Cas9 (eSpCas9) is a mutant version of natural SpCas9 that carries point mutations reducing off-target activity.
  • It is sometimes referred to as high-fidelity SpCas9 or highly specific Cas9.

8. XCas9

  • xCas9 is an engineered nuclease that shows reduced off-target effects at both NGG and non-NGG PAMs.
  • Cas9 requires a PAM sequence in order to function, which boosts its specificity but restricts the sites that can be targeted.
  • xCas9 can recognise many PAM sequences, including NGG, GAA, and GAT.
  • It can therefore target more sites than SpCas9 or SaCas9 and significantly relaxes the PAM requirement (Hu et al., 2018).

Cas9 type | Origin | PAM sequence (5′ to 3′) | Specialization
--- | --- | --- | ---
SpCas9 | Streptococcus pyogenes | NGG | Cleaves dsDNA using the sgRNA
SaCas9 | Staphylococcus aureus | NNGRRT or NNGRR(N) | Low off-target effect
ScCas9 | Streptococcus canis | NNG | The PAM sequence can be altered depending on the variant used
ThermoCas9 | Geobacillus thermodenitrificans T12 | CRAA (R = A or G) | Works efficiently at higher temperatures
StCas9 | Streptococcus thermophilus | NNAGAAW | High on-target cleavage activity
HypaCas9 | Streptococcus pyogenes | N/A | Greater genome-wide specificity
eSpCas9 | Streptococcus pyogenes | NGG | Enhanced SpCas9 that works more effectively than native SpCas9
NmCas9 | Neisseria meningitidis | NNNNGATT | Needs a longer crRNA, which increases accuracy
xCas9 | Streptococcus pyogenes | NGG and non-NGG | Engineered Cas9 that recognises NGG and some non-NGG PAMs
dCas9 | Streptococcus pyogenes | NGG | Specialized Cas9 that lacks nuclease activity
Cas9-DD | Streptococcus pyogenes | NGG | Destabilized Cas9 prepared to increase accuracy and efficiency
SpCas9-VQR | Streptococcus pyogenes | NGA | Altered PAM recognition, expanding the range of SpCas9 target sites
SpCas9-EQR | Streptococcus pyogenes | NGAG | Altered PAM recognition, expanding the range of SpCas9 target sites
SpCas9-VRER | Streptococcus pyogenes | NGCG | Altered PAM recognition, expanding the range of SpCas9 target sites
SpCas9-NG | Streptococcus pyogenes | NG | Altered PAM recognition, expanding the range of SpCas9 target sites
SpCas9-HF1 | Streptococcus pyogenes | NGG | High-fidelity variant with reduced off-target activity
evoCas9 | Streptococcus pyogenes | NGG | High-fidelity variant with reduced off-target activity
Sniper-Cas9 | Streptococcus pyogenes | NGG | High-fidelity variant with reduced off-target activity
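
The PAM column above uses IUPAC ambiguity codes (N = any base, R = A or G, W = A or T). The sketch below shows one way to check which nucleases could, in principle, recognise the bases immediately 3′ of a protospacer. The PAM strings are copied from the table for a subset of nucleases; the names `IUPAC`, `PAMS`, `matches_pam`, and `compatible_nucleases` are hypothetical and chosen only for this example.

```python
# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG",
    "K": "GT", "M": "AC", "N": "ACGT",
}

# PAMs (5' to 3') as listed in the comparison table; a subset only.
PAMS = {
    "SpCas9": "NGG",
    "SaCas9": "NNGRRT",
    "StCas9": "NNAGAAW",
    "NmCas9": "NNNNGATT",
    "SpCas9-VQR": "NGA",
    "SpCas9-NG": "NG",
}

def matches_pam(downstream, pam):
    """True if the start of `downstream` satisfies the IUPAC pattern `pam`."""
    downstream = downstream.upper()
    if len(downstream) < len(pam):
        return False
    return all(base in IUPAC[code] for base, code in zip(downstream, pam))

def compatible_nucleases(downstream):
    """List nucleases whose PAM matches the bases 3' of the protospacer."""
    return [name for name, pam in PAMS.items() if matches_pam(downstream, pam)]

if __name__ == "__main__":
    print(compatible_nucleases("AGGTCATT"))
    print(compatible_nucleases("TTGAATAA"))
```

A PAM match is necessary but not sufficient for editing; guide complementarity and cellular context still determine whether cleavage occurs.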


CRISPR–Cas9 Effector Complex Assembly

  • Cas9 must be associated with guide RNA (a natural crRNA–tracrRNA or a sgRNA) to create an active DNA surveillance complex for site-specific DNA recognition and cleavage.
  • The 20-nt spacer sequence of crRNA confers DNA target selectivity, whereas tracrRNA is indispensable for Cas9 recruitment.
  • Genetic and biochemical studies have elucidated the significance of a so-called seed sequence of RNA nucleotides within the spacer region of crRNAs for target selectivity.
  • In type II CRISPR systems, the seed region is described as the 10–12 nucleotides positioned at the 3′ end of the 20-nt spacer sequence that are closest to the PAM.
  • Mismatches in this seed region severely impede or abrogate target DNA binding and cleavage, but close homology in the seed region frequently results in off-target binding events, even in the presence of numerous mismatches elsewhere.
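
The seed rule above can be illustrated by counting mismatches separately in the PAM-proximal seed and in the PAM-distal remainder of the spacer. This is a minimal sketch, assuming both sequences are written 5′ to 3′ in protospacer orientation so that the seed is the last `seed_len` positions; the function name `mismatch_profile` is hypothetical, and the sketch counts mismatches only, without predicting cleavage.

```python
def mismatch_profile(spacer, target, seed_len=10):
    """Count mismatches in the seed and in the PAM-distal region.

    `spacer` and `target` are equal-length strings written 5'->3' in
    protospacer orientation, so the PAM-proximal seed is the 3' end.
    """
    if len(spacer) != len(target):
        raise ValueError("spacer and target must have the same length")
    seed_start = len(spacer) - seed_len
    seed = distal = 0
    for index, (a, b) in enumerate(zip(spacer.upper(), target.upper())):
        if a != b:
            if index >= seed_start:
                seed += 1
            else:
                distal += 1
    return {"seed": seed, "distal": distal}

if __name__ == "__main__":
    guide = "ACGTACGTACGTACGTACGT"
    # One mismatch near the 5' end (index 1) and one in the seed (index 18).
    site = "ATGTACGTACGTACGTACAT"
    print(mismatch_profile(guide, site))
```

Under the rule described above, the seed mismatch in this example would be expected to matter far more for binding than the PAM-distal one.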

Conformational Rearrangement Upon sgRNA Binding

  • The sgRNA-bound crystal structure best illustrates the principles of Cas9–sgRNA assembly and the placement of guide RNA prior to target identification.
  • Comparison of the sgRNA-bound structure to that of apo–Cas9 reveals precisely how guide RNA binding induces Cas9 to undergo a substantial structural rearrangement from an inactive conformation to a DNA recognition–competent conformation, as suggested by earlier, lower-resolution electron microscopy studies.
  • Upon sgRNA binding, the most notable conformational shift occurs in the REC lobe, namely Hel-III, which moves about 65 Å toward the HNH domain.
  • Cas9 exhibits much smaller conformational changes upon binding to target DNA and PAM sequence, indicating that the majority of the extensive structural rearrangements occur prior to target DNA binding and reinforcing the notion that guide RNA loading is an essential regulator of Cas9 enzyme function.

Interactions with sgRNA

  • Cas9 interacts extensively with the sgRNA. It forms several direct interactions with the repeat–antirepeat duplex, stem loop 1, and the linker region between stem loops 1 and 2 via Hel-I, the arginine-rich bridge helix, and the CTD domain.
  • Cas9 makes far fewer interactions with stem loop 2 of the sgRNA, mostly through its RuvC and CTD domains.
  • Due to the absence of a 3′ tracrRNA tail in the sgRNA construct used for crystallography, no protein–RNA interaction was detected for stem loop 3 in the Cas9–sgRNA structure.
  • However, the DNA-target-bound structures demonstrate that Cas9 has very few interactions with stem loop 3.
  • According to biochemical investigations, sgRNAs lacking the linker region and stem loops 2 and 3 are still capable of inducing Cas9-mediated DNA cleavage, albeit with diminished efficiency, but stem loop 1 deletion entirely abolishes cleavage.
  • Nevertheless, functional studies demonstrate that stem loops 2 and/or 3 are necessary for substantial Cas9 activation in vivo.
  • These observations suggest that the repeat–antirepeat duplex and stem loop 1 are required for Cas9–sgRNA complex formation, whereas the linker, stem loop 2, and stem loop 3 are not required for function but may stabilise guide RNA binding to promote active complex formation, thereby enhancing catalytic efficiency in vivo.

Preordered Seed RNA and PAM-Interacting Cleft

  • Cas9 creates extensive interactions with the ribose–phosphate backbone of the guide RNA, thereby establishing the A-form conformation of the 10-nt RNA seed sequence required for initial DNA interrogation.
  • This preordering is assumed to be thermodynamically advantageous for target binding, similar to the positioning of guide RNA reported in other small regulatory RNA processes, such as the bacterial Hfq protein–RNA complex and eukaryotic Argonaute-mediated RNA silencing.
  • Notably, in the type I CRISPR interference complex Cascade, the guide RNA is preordered throughout the entire crRNA, not just in the seed region. This is likely due to the helical assembly of the complex and the release of topological constraints by completely flipped-out nucleotides at every sixth position.
  • The PAM-interacting sites R1333 and R1335, which are responsible for 5′-NGG-3′ PAM recognition and disordered in the apo structure, are prepositioned prior to establishing contact with target DNA, demonstrating that sgRNA loading permits Cas9 to form a DNA recognition-competent structure.
  • Notably, despite the fact that the 5′ 10-nt nonseed RNA sequence is completely disordered in the sgRNA-bound crystal structure, the electron microscopy (EM) structure of SpyCas9 bound to a full-length sgRNA (EMD-3276) reveals that the 5′ end of the guide RNA lies within the cavity formed between the HNH and RuvC nuclease domains.
  • This structural observation shows that the 5′ end of sgRNA is shielded from degradation and that an additional conformational change is necessary to liberate the 5′ distal end from constraint during target DNA binding.


References

Jiang, F., & Doudna, J. A. (2017). CRISPR-Cas9 Structures and Mechanisms. Annual review of biophysics, 46, 505–529. https://doi.org/10.1146/annurev-biophys-062215-010822.

Wada, N., Ueta, R., Osakabe, Y. et al. Precision genome editing in plants: state-of-the-art in CRISPR/Cas9-based genome engineering. BMC Plant Biol 20, 234 (2020).

Nishimasu, Hiroshi et al. “Crystal structure of Cas9 in complex with guide RNA and target DNA.” Cell vol. 156,5 (2014): 935-49. doi:10.1016/j.cell.2014.02.001.

Zuo, Z., Liu, J. Structure and Dynamics of Cas9 HNH Domain Catalytic State. Sci Rep 7, 17271 (2017). https://doi.org/10.1038/s41598-017-17578-6

Mougiakos, I., Mohanraju, P., Bosma, E.F. et al. Characterizing a thermostable Cas9 for bacterial genome editing and silencing. Nat Commun 8, 1647 (2017). https://doi.org/10.1038/s41467-017-01591-4

Hu, J., Miller, S., Geurts, M. et al. Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature 556, 57–63 (2018). https://doi.org/10.1038/nature26155.

Qi, L. S., Larson, M. H., Gilbert, L. A., Doudna, J. A., Weissman, J. S., Arkin, A. P., & Lim, W. A. (2013). Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell, 152(5), 1173–1183. https://doi.org/10.1016/j.cell.2013.02.022.

https://geneticeducation.co.in/cas9-protein-structure-function-types-and-importance/
