Poster

ONE-seq™ for Variant-Aware Therapeutic Guide Selection

Summary

ONE-seq leverages computational tools and biochemical assays to nominate candidate off-target sites across thousands of genomes with high sensitivity. Here, we present an application of ONE-seq to identify guides with the lowest potential off-target editing risk. The biochemical in vitro cleavage data are augmented with biological annotation so that high-scoring sites can be prioritized by their potential impact. We applied ONE-seq in a variant-aware manner to nominate off-target sites across global human populations for three therapeutically relevant guide RNAs targeting the PCSK9 gene. Comprehensive ONE-seq libraries, including sites with up to six differences relative to the on-target site, were generated by screening the hg38 reference sequence and over 4,000 genomes from the 1000 Genomes and Human Genome Diversity Project data sets. Candidate off-target sites were identified by ONE-seq analysis and classified into multiple tiers based on their ONE-seq editing score and combined annotation concern score. Screening multiple guides in a single run using Guide Select™ (a modified ONE-seq workflow) streamlines the guide selection process and reduces costs when analyzing larger numbers of candidate guides.

Candidate guides

We evaluated three guide RNAs selected from 20 candidates reported by Musunuru et al., 2021. The three guides selected had the highest reported editing efficiencies and a range of MIT scores:

Name      Guide Sequence (protospacer/PAM)   MIT Score   Editing %
PCSK9-1   CCCGCACCTTGGCGCAGCGG/TGG           90          26
PCSK9-3   GCTTACCTGTCTGTGGAAGC/GGG           66          25.4
PCSK9-4   TGCTTACCTGTCTGTGGAAG/CGG           63          23.8

ONE-seq incorporates global human genetic diversity

A map of the world with different colored dots representing the varying number of samples across different geographic regions.

Nomination of candidate off-target sites incorporates genetic variation across diverse populations. Shown are sample collection sites for the 1kG + HGDP samples (4,150 genomes). Color indicates superpopulation (SP); dot size indicates number of samples.

Libraries are synthesized and cleaved in vitro

Comprehensive oligonucleotide libraries are synthesized, representing candidate off-target sites from over 4,000 human genomes with up to six differences compared to the on-target site. Some sites occur multiple times in the genome.

Library               PCSK9-1   PCSK9-3   PCSK9-4
Reference Targets      60,136   121,650   156,334
Variant Targets        38,195    64,324    80,715
Total Targets          98,331   185,974   237,049
Total Genomic Sites   505,371   220,263   279,816
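
As a rough illustration of the "up to six differences" criterion, the sketch below counts base mismatches between a guide protospacer and a candidate site. It is a simplification under stated assumptions: only substitutions are counted (the actual library design may also allow insertions and deletions), the candidate sequence is invented for illustration, and the function names are hypothetical rather than part of any ONE-seq software.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Count positions where two equal-length sequences differ (substitutions only)."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(a != b for a, b in zip(guide.upper(), site.upper()))


def within_budget(guide: str, site: str, max_diffs: int = 6) -> bool:
    """True if the site differs from the guide at no more than max_diffs positions."""
    return count_mismatches(guide, site) <= max_diffs


# PCSK9-1 protospacer from the candidate guide table (PAM omitted).
pcsk9_1 = "CCCGCACCTTGGCGCAGCGG"
# Hypothetical candidate site with two substitutions (positions 1 and 10).
candidate = "ACCGCACCTAGGCGCAGCGG"

n_diffs = count_mismatches(pcsk9_1, candidate)
keep = within_budget(pcsk9_1, candidate)
print(n_diffs, keep)
```

A production search would scan every genome position and variant haplotype and would handle bulges; this sketch only shows the per-site comparison.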

Amplified libraries are subjected to in vitro cleavage in triplicate using complexed Cas9/gRNA at three RNP to DNA ratios (10:1, 1:1 and 0.1:1).

Schematic diagram of ONE-seq library and in vitro cleavage. Three sets of two parallel lines showing first the target, then the PAM side, and finally the PS side, between which there is a vertical dotted line showing the cut site.

Schematic diagram of ONE-seq library and in vitro cleavage. A pair of barcodes are uniquely associated with each candidate off-target site.

Candidate sites are scored and annotated

In vitro cleavage products are sequenced and the resulting reads (~4 million per sample) are processed using a custom analysis pipeline. A ONE-seq score is calculated for each library member (ratio of off-target to on-target read counts). Functional annotation is added, including:

  • Proximity of high-scoring sites to the on-target locus.
  • Variant effect analysis with gene feature, frameshift and splice site prediction.
  • Genetic constraint (tolerance to loss-of-function variation).
  • Potential gene expression impact (ENCODE promoter and enhancer elements, lncRNAs, and GTEx tissue-specific expression profiles).
  • Clinical cancer and disease gene information with classification data, CGC, ClinGen and Mondo links.

A combined annotation concern score is assigned, based on component scores for each of these categories. Candidate off-target sites are classified into multiple tiers based on their ONE-seq and combined annotation concern scores.
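
The scoring and tiering described above can be sketched as follows. The ONE-seq score is taken as the ratio of off-target to on-target read counts, as stated in the text. Everything else is an assumption made for illustration: the cutoff values, the 0-to-1 scale for the concern score, the four-tier layout, and the function names are hypothetical and do not reflect the actual pipeline.

```python
def one_seq_score(off_target_reads: int, on_target_reads: int) -> float:
    """Ratio of off-target to on-target read counts for one library member."""
    if on_target_reads <= 0:
        raise ValueError("on-target read count must be positive")
    return off_target_reads / on_target_reads


def assign_tier(score: float, concern: float,
                score_cutoff: float = 0.01,     # hypothetical threshold
                concern_cutoff: float = 0.5) -> int:  # hypothetical threshold
    """Place a site in one of four illustrative tiers (1 = highest priority)."""
    high_score = score >= score_cutoff
    high_concern = concern >= concern_cutoff
    if high_score and high_concern:
        return 1  # cleaved efficiently and biologically concerning
    if high_score:
        return 2  # cleaved efficiently, lower annotation concern
    if high_concern:
        return 3  # weakly cleaved, but in a sensitive region
    return 4      # weakly cleaved and low concern


# Invented read counts and concern scores, for illustration only.
on_target = 1000
examples = {"site_A": (50, 0.8), "site_B": (50, 0.1), "site_C": (1, 0.8)}
tiers = {
    name: assign_tier(one_seq_score(reads, on_target), concern)
    for name, (reads, concern) in examples.items()
}
print(tiers)
```

Separating the score from the tier rule keeps the biochemical measurement independent of the annotation thresholds, which may be tuned per program.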

Tiered off-target editing analysis

Classifying ONE-seq results in the context of annotation concern scores enables assessment of off-target editing risk and potential biological impact. The figure below shows results, including reference and variant sites, for each of the three targets. Sites with higher in vitro cleavage efficiency are at the top, and sites with higher potential biological impact are toward the right.

Three scatter plots showing results of reference and variant sites from each of the three PCSK9 targets. Sites with higher in vitro cleavage efficiency are at the top of the graph, and sites with higher potential biological impact are towards the right.

Variant off-target sites with significant AF and editing impact

A graphical summary of variant and reference sites is shown below, with variants of higher global allele frequency displayed in warmer colors. This provides a visual indication of the likelihood of encountering high-scoring variant off-target sites for each guide. Accompanying information on allele frequencies in different populations permits a more detailed analysis in relation to the ethnogeographic prevalence of the disease being treated. The dot indicated by the arrow represents a variant off-target site with a global allele frequency of 26 percent and 874-fold elevated editing of the variant allele. This variant occurs in a lncRNA gene, ENSG00000232325, that is expressed at significant levels in reproductive tissues and whole blood, based on GTEx data.
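
One way to surface sites like the one highlighted above is to combine global allele frequency with the fold change in score between the variant and reference alleles. The sketch below assumes that fold change is the variant-allele score divided by the reference-allele score; that definition, the cutoffs, the field names, and the example values are all illustrative assumptions, not the poster's data or method.

```python
def fold_change(variant_score: float, reference_score: float,
                floor: float = 1e-6) -> float:
    """Variant-to-reference score ratio; the floor avoids division by zero."""
    return variant_score / max(reference_score, floor)


def flag_variant_sites(sites, min_af: float = 0.05, min_fold: float = 10.0):
    """Return (id, fold change) for common variants with strongly elevated editing."""
    flagged = []
    for s in sites:
        fc = fold_change(s["variant_score"], s["reference_score"])
        if s["global_af"] >= min_af and fc >= min_fold:
            flagged.append((s["id"], round(fc, 1)))
    return flagged


# Invented records; site_a is tuned to mimic a 26% AF, 874-fold case.
sites = [
    {"id": "site_a", "global_af": 0.26, "variant_score": 0.0874, "reference_score": 0.0001},
    {"id": "site_b", "global_af": 0.001, "variant_score": 0.5, "reference_score": 0.001},
    {"id": "site_c", "global_af": 0.30, "variant_score": 0.002, "reference_score": 0.001},
]
flagged = flag_variant_sites(sites)
print(flagged)
```

Here site_b is dropped because the variant is rare and site_c because the variant barely changes the score. Population-specific allele frequencies could be substituted for the global value when the target patient population is known.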

Three graphs of variant and reference sites, with variants of higher global allele frequency being displayed in warmer colors.

Off-target proximity to target site

High-frequency off-target editing at sites close to the on-target locus increases the probability of intrachromosomal rearrangements. The figure below highlights candidate off-target sites with high ONE-seq scores that are present on the same chromosome as the on-target locus. PCSK9-1 has the lowest risk.
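
A minimal sketch of this proximity filter is shown below: it keeps high-scoring sites on the same chromosome as the on-target locus and reports their distance from it. The score cutoff, the optional distance window, the record layout, and the coordinates are placeholders chosen for illustration and are not actual PCSK9 guide coordinates.

```python
def same_chromosome_hits(sites, on_chrom: str, on_pos: int,
                         min_score: float = 0.01, window=None):
    """Return (id, distance) for high-scoring sites on the on-target chromosome,
    sorted from nearest to farthest. If window is given, drop sites beyond it."""
    hits = []
    for s in sites:
        if s["chrom"] != on_chrom or s["score"] < min_score:
            continue
        distance = abs(s["pos"] - on_pos)
        if window is not None and distance > window:
            continue
        hits.append((s["id"], distance))
    return sorted(hits, key=lambda h: h[1])


# Placeholder on-target position and invented candidate sites.
ON_CHROM, ON_POS = "chr1", 55_000_000
sites = [
    {"id": "a", "chrom": "chr1", "pos": 56_000_000, "score": 0.2},
    {"id": "b", "chrom": "chr2", "pos": 55_000_100, "score": 0.9},
    {"id": "c", "chrom": "chr1", "pos": 200_000_000, "score": 0.05},
    {"id": "d", "chrom": "chr1", "pos": 55_500_000, "score": 0.001},
]
all_hits = same_chromosome_hits(sites, ON_CHROM, ON_POS)
near_hits = same_chromosome_hits(sites, ON_CHROM, ON_POS, window=5_000_000)
print(all_hits, near_hits)
```

Comparing the number and distance of hits across guides gives a simple per-guide ranking for this risk category.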

Visualizations, for each of the three guides, of candidate off-target sites with high ONE-seq scores that are present on the same chromosome as the on-target locus. PCSK9-1 has the lowest risk.

Reduced complexity libraries power ONE-seq screen

Variant-aware ONE-seq libraries vary significantly in size depending on the guide sequence. The chart below shows 13 Cas9 libraries with various complexity cutoffs including some highly specific and highly promiscuous guides. The average size of libraries with up to four differences and two indels is approximately 1,900 sites. Multiple reduced-complexity libraries such as these can be combined and processed in a single ONE-seq run for cost-effective selection of guides based on in vitro editing and annotation, as described in this poster.

A bar graph shows 13 Cas9 libraries (guides on the x axis) with 4 diffs in orange, 5 diffs in light blue, and 6 diffs in dark blue by number of targets from 1 to a million (on the y axis). Average size of libraries with up to 4 diffs is about 1,900 sites.

Related resources

A table for guides that lists off-target score (by MIT and CFD), Activity (CRISPRater), number of loci (L. Distance), and Risk Profiler number.
Poster
Guide Profiler™: A Genetic Variant-Aware Computational Tool for Improved Guide RNA Selection for CRISPR-Based Therapeutic Applications
Cover image of Guide Select Example Report linked pdf
Example Customer Report
Guide Select™ Screening Assay Example Report