Poster

Variant-Aware Off-Target Analysis for Informed Guide RNA Selection

Abstract

Background: The specificity of CRISPR-based gene editing is critical for therapeutic success and safety. Off-target activity may cause unintended modifications, potentially disrupting normal gene function or introducing harmful genetic alterations. Careful consideration of off-target effects during guide RNA selection and development can significantly reduce these risks. It is also important to account for genetic diversity within human populations when evaluating off-target effects, so that risks can be assessed and mitigated in target patient populations. In this study, we combined in silico and biochemical methods to implement a variant-aware approach for selecting a guide RNA (gRNA) for editing the PCSK9 gene while considering population-scale genetic diversity. We then used the ONE-seq assay to obtain in-depth biochemical data and thoroughly characterize off-target risk.

Methods: As described below, we generated data for guides targeting the PCSK9 gene using the CRISPR-Cas9 nuclease system while considering common variants in the major human superpopulations.

  • Guide Profiler™ - This in silico computational tool was used to assess off-target risk profiles for eight candidate gRNAs against the human reference genome (hg38) and 3,502 haplotype-phased genomes. The guides were ranked based on off-target burden, functional annotation and proximity to coding and regulatory regions. To evaluate the tool's utility in ranking off-target risks, PCSK9-1, PCSK9-3 and PCSK9-4 were selected for further screening.
  • Guide Select™ - A multiplexed, variant-aware biochemical assay was performed for these three guides, covering all off-target sites with an edit distance of 4 from the guide sequence. The assay identified PCSK9-1 as the most specific, based on cleavage efficiency and off-target activity across diverse genetic backgrounds.
  • ONE-seq™ - A high-sensitivity, variant-aware biochemical assay was performed for a comprehensive off-target risk assessment of PCSK9-1, mapping cleavage sites genome-wide and prioritizing high-risk loci through biochemical validation and deep sequencing.
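The Guide Select bullet above describes its candidate sites in terms of edit distance from the guide sequence. As a rough illustration of that metric only, the sketch below computes the standard Levenshtein distance (substitutions, insertions and deletions each cost 1). Whether Guide Select uses exactly this definition is an assumption, and the candidate site shown is made up for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distance from empty prefix of a to each prefix of b
    for i, base_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, base_b in enumerate(b, start=1):
            cost = 0 if base_a == base_b else 1
            curr.append(min(
                prev[j] + 1,         # delete base_a
                curr[j - 1] + 1,     # insert base_b
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


# PCSK9-1 protospacer as listed in the sgRNA table of this poster.
guide = "CCCGCACCTTGGCGCAGCGG"
# Hypothetical candidate site (illustration only): two substituted bases.
site = "CCCGCACGTTGGCGCAGAGG"

print(edit_distance(guide, site))
```

A site would then be a candidate for the multiplexed screen when this distance falls within the chosen cutoff.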

Results: Guide PCSK9-1 was ranked highest by the Guide Profiler among all guides analyzed. The biochemical screen using Guide Select provided results consistent with the in silico analysis, identifying the fewest off-targets for PCSK9-1, confirming it as the optimal candidate. ONE-seq provided deep risk characterization, confirming minimal cleavage at high-risk loci and suitability for therapeutic applications.

Conclusion: This variant-aware, multi-step approach is a practical strategy for gRNA selection, systematically reducing off-target risk at each stage. By integrating computational and biochemical assays with population-scale variant analysis, we improve confidence in therapeutic genome editing and regulatory readiness.

Computational screening and off-target risk stratification of PCSK9 sgRNAs using Guide Profiler

A) Guide Profiler workflow

Graphic and text depiction of Guide Profiler workflow for in silico guide RNA screening. Column on left lists three icons with summary points: a magnifying glass indicates the in silico preview of off-target search space; the target symbol behind a human figure indicates the avoidance of common and rare genetic variation on the on-target site; and the dollar sign indicates the integration with standard ONE-seq workflow. Row on the right features five laptop icons with process steps below them: 1. Computational search of multiple genomes to identify putative off-targets; 2. On-target variant analysis; 3. Off-target library size benchmarking; 4. Analysis of biological impacts at off-target loci; 5. Summarize and report guide RNA risk profile.

B) sgRNA sequences and IDs

Target GENCODE region | Sponsor guide ID | Sequence
PCSK9 | PCSK9-1 | CCCGCACCTTGGCGCAGCGGtgg
PCSK9 | PCSK9-2 | GGTGGCTCACCAGCTCCAGCagg
PCSK9 | PCSK9-3 | GCTTACCTGTCTGTGGAAGCggg
PCSK9 | PCSK9-4 | TGCTTACCTGTCTGTGGAAGcgg
PCSK9 | PCSK9-5 | TTGGAAAGACGGAGGCAGCCtgg
PCSK9 | PCSK9-6 | GAAAGACGGAGGCAGCCTGGtgg
PCSK9 | PCSK9-7 | TCCCAGGCCTGGAGTTTATTcgg
PCSK9 | PCSK9-8 | AGCACCTACCTCGGGAGCTGagg
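The sequences in this table appear to follow a common notation: an uppercase protospacer followed by a lowercase PAM. The sketch below splits an entry on that convention and checks for an NGG PAM, the motif usually associated with SpCas9. Both the case convention and the NGG check are assumptions for illustration, and the helper names are hypothetical rather than part of Guide Profiler.

```python
import re


def split_guide(entry: str) -> tuple[str, str]:
    """Split 'PROTOSPACERpam' into (protospacer, PAM) using letter case."""
    match = re.fullmatch(r"([ACGT]+)([acgt]+)", entry)
    if match is None:
        raise ValueError(f"unexpected guide format: {entry!r}")
    return match.group(1), match.group(2).upper()


def is_ngg(pam: str) -> bool:
    """True for a 3-nt PAM whose last two bases are GG (assumed SpCas9 motif)."""
    return len(pam) == 3 and pam[1:] == "GG"


# Three entries copied from the table above.
guides = {
    "PCSK9-1": "CCCGCACCTTGGCGCAGCGGtgg",
    "PCSK9-2": "GGTGGCTCACCAGCTCCAGCagg",
    "PCSK9-7": "TCCCAGGCCTGGAGTTTATTcgg",
}

for guide_id, entry in guides.items():
    protospacer, pam = split_guide(entry)
    print(guide_id, protospacer, len(protospacer), pam, is_ngg(pam))
```

A check like this is also a quick way to spot entries whose protospacer is not 20 nt or that contain characters outside A, C, G and T.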

C) Search parameters for library enumeration

Search parameters | Enumeration
Maximum total differences from target sequence | 6
Maximum mismatches | 6
Maximum PAM mismatches | 1
Maximum proto mismatches | 6
Maximum gap bases | 2
Maximum number of inserted bases | 2
Maximum number of deleted bases | 2
Maximum number of gaps between proto and PAM | 1
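One way to read these parameters is as a set of independent upper bounds that a candidate site must satisfy to enter the enumerated library. The sketch below encodes the table values and tests a site's difference counts against them. The `Alignment` type, the `within_limits` helper, and the way the counts roll up (mismatches = protospacer + PAM, gap bases = inserted + deleted, total = mismatches + gap bases) are assumptions for illustration, not Guide Profiler's documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SearchLimits:
    """Enumeration limits copied from the search-parameter table."""
    max_total_differences: int = 6
    max_mismatches: int = 6
    max_pam_mismatches: int = 1
    max_proto_mismatches: int = 6
    max_gap_bases: int = 2
    max_inserted_bases: int = 2
    max_deleted_bases: int = 2
    max_gaps_between_proto_and_pam: int = 1


@dataclass(frozen=True)
class Alignment:
    """Hypothetical per-site difference counts (not a Guide Profiler type)."""
    proto_mismatches: int
    pam_mismatches: int = 0
    inserted_bases: int = 0
    deleted_bases: int = 0
    gaps_between_proto_and_pam: int = 0


def within_limits(aln: Alignment, limits: SearchLimits = SearchLimits()) -> bool:
    """Return True when every count is at or below its limit."""
    # Assumed roll-ups; the poster does not define how the counts combine.
    mismatches = aln.proto_mismatches + aln.pam_mismatches
    gap_bases = aln.inserted_bases + aln.deleted_bases
    return (
        mismatches + gap_bases <= limits.max_total_differences
        and mismatches <= limits.max_mismatches
        and aln.pam_mismatches <= limits.max_pam_mismatches
        and aln.proto_mismatches <= limits.max_proto_mismatches
        and gap_bases <= limits.max_gap_bases
        and aln.inserted_bases <= limits.max_inserted_bases
        and aln.deleted_bases <= limits.max_deleted_bases
        and aln.gaps_between_proto_and_pam <= limits.max_gaps_between_proto_and_pam
    )


print(within_limits(Alignment(proto_mismatches=3)))
print(within_limits(Alignment(proto_mismatches=1, pam_mismatches=2)))
```

Keeping the limits in one immutable object makes it easy to rerun the same check with a tighter search space, for example a lower maximum for total differences.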

D) Guides in order of low to high off-target risk

This table shows a high-level summary of findings for all eight PCSK9 guides in the “All Loci” and “Loci ≤ 3 Diffs” sections. Columns in this section are colored by best to worst values (cream to burgundy). ‘Risk Profiler’ functions as a guide for interpreting relative off-target risk of guides. Columns in this section are colored by best to worst values/ranks (dark green to white). Order of guides from best to worst (leftmost column of table): PCSK9-1, -7, -3, -5, -8, -4, then -2 and -6. PCSK9-1 had the best risk profile score, 9.50; PCSK9-6 and PCSK9-2 tied for the worst risk score at 41.50.

Figure 1: A) Example of Guide Profiler workflow for in silico guide RNA screening. B) Sequences and IDs of sgRNAs targeting PCSK9 evaluated using Guide Profiler. C) Search parameters for off-target enumeration. D) Guides ordered by off-target risk (low to high) are shown here. This table shows a high-level summary of findings for all guides in the “All Loci” and “Loci ≤ 3 Diffs” sections. Columns in this section are colored by best to worst values (cream to burgundy). "Risk Profiler" functions as a guide for interpreting relative off-target risk of guides. Columns in this section are colored by best to worst values/ranks (dark green to white).

Population-scale variant analysis enhances off-target profiling of PCSK9 guides

A) Counts of haplotype-phased genomes via Guide Profiler

A world map provides a visual representation of the number of haplotype-phased genomes included in the Guide Profiler analysis. The size of each circle is proportional to the number of individuals in a subpopulation that is mapped to a superpopulation by color. AFR (African) is green, EUR (European) is blue, SAS (South Asian) is turquoise, MID (Middle Eastern) is yellow, EAS (East Asian) is dark purple, AMR (Mixed American) is pink and OCE (Oceanian) is light purple. The circles are fairly evenly distributed, except that the Middle Eastern and Oceanian superpopulations have the fewest and smallest circles.

B) Number of off-target loci for all guides

Bar plot shows the number of off-target loci for the eight guides present in the reference and individual genomes (variants). Reference loci are shown in blue, variant loci in green. PCSK9-7 has the highest counts for both reference and variant, 550k and 150k respectively. PCSK9-3 has the lowest, with 140k and 67k respectively.

C) Counts of potential off-targets by number of differences from PCSK9-1 guide sequence

A heat map using cool-toned colors shows potential off-targets by number of differences from PCSK9-1 guide sequence. Lighter colors are lower, darker are higher. Top half of map is bar graph showing number of loci or off-targets up to 100k and bottom half consists of rows indicating (from top to bottom) differences: deleted bases, inserted bases, mismatches, and total differences. As the number of total differences from the guide increases (from left to right in the bottom row), the number of potential off-targets (bars in top graph) increases.

D) Superpopulation Allele Frequency (AF) for off-targets with PCSK9-4 guide

This horizontal bar plot shows superpopulation Allele Frequency for off-targets (number of loci up to 40) using the PCSK9-4 guide. Superpopulations (left side) are listed in alphabetical order top-down: African, Mixed American, East Asian, European, Middle Eastern, Oceanian and South Asian. Allele Frequency, or AF, is depicted by color. Orange: 0.1-1; yellow: 0.05-0.1; teal: 0.01-0.05; light blue: 0-0.01. The African superpopulation has the highest number of off-targets. Most of these have AF under 0.01 (light blue), followed in order by 0.1-1 (orange), 0.01-0.05 (teal) and 0.05-0.1 (yellow). Information is shown for off-targets present in individual genomes (and not in hg38 genome) with fewer than three differences from the guide sequence.

E) Variant analysis at the on-target loci for PCSK9-2

Variant analysis at the on-target locus revealed a high-frequency variant (AF = 16.7%) for the PCSK9-2 guide. Image shows the color-coded DNA sequence with positions 1 (G), 10 (C), 20 (C) and 23 (G) highlighted, and lists the variant identifier 1-55044044-C-G(-).
Superpopulation | Population | 1-55044044-C-G(-)
AFR | SAN (San) | 0.16700
SAS | ITU (Telugu) | 0.00485
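The allele-frequency color legend described for panel D (0-0.01, 0.01-0.05, 0.05-0.1, 0.1-1) can be applied to the two population frequencies listed above. The sketch below is a minimal binning helper. The function name is hypothetical, and treating each bin's lower edge as inclusive is an assumption, since the poster does not say how boundary values are assigned.

```python
from bisect import bisect_right

# Bin edges and labels taken from the AF color legend in panel D.
AF_EDGES = [0.01, 0.05, 0.1]
AF_LABELS = ["0-0.01", "0.01-0.05", "0.05-0.1", "0.1-1"]


def af_bin(af: float) -> str:
    """Map an allele frequency in [0, 1] to its legend bin (lower edge inclusive)."""
    if not 0.0 <= af <= 1.0:
        raise ValueError(f"allele frequency out of range: {af}")
    return AF_LABELS[bisect_right(AF_EDGES, af)]


# Population frequencies for variant 1-55044044-C-G(-) from the table above.
on_target_af = {"SAN": 0.16700, "ITU": 0.00485}

for population, af in on_target_af.items():
    print(population, af, af_bin(af))
```

Under these assumptions the San frequency falls in the highest bin and the Telugu frequency in the lowest.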

Figure 2: A) The map provides a visual representation of the counts of haplotype-phased genomes included in the Guide Profiler analysis. The size of each circle is proportional to the number of individuals in a subpopulation that is mapped to a superpopulation by color. AFR (African), EUR (European), SAS (South Asian), MID (Middle Eastern), EAS (East Asian), AMR (Mixed American) and OCE (Oceanian). B) Bar plot shows counts of off-target loci for each guide present in the reference and individual genomes (variants). C) Heat map showing the counts of potential off-targets by number of differences from guide sequence. As the number of differences from the guide increases (as indicated by the total row in the heat map), the number of potential off-targets increases. D) Superpopulation Allele Frequency (AF) information is shown for off-targets present in individual genomes (and not in hg38 genome) with fewer than three differences from the guide sequence. E) Variant analysis at the on-target loci revealed a high-frequency variant (AF = 16.7%) for the PCSK9-2 guide.

Biochemical validation of top-ranked PCSK9 sgRNAs from Guide Profiler using multiplexed Guide Select assay

A) Guide Select workflow

Graphic depiction of the Guide Select workflow for multiplexed, biochemical guide RNA screening.

B) Candidate off-targets binned by cleavage frequency

Guide | Higher cleavage frequency (ONE-seq Screen score ≥ 0.01) | Lower cleavage frequency (ONE-seq Screen score ≥ 0.001 & < 0.01) | Total off-target loci screened
PCSK9-1 | 8 | 22 | 1910
PCSK9-3 | 11 | 42 | 2014
PCSK9-4 | 15 | 47 | 2718
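The two score thresholds in this table define the cleavage-frequency bins, and the counts give the share of screened sites that were cleaved at an appreciable frequency. The sketch below reproduces both steps from the numbers above. The helper names and the "below threshold" label for scores under 0.001 are assumptions added for illustration.

```python
def cleavage_bin(screen_score: float) -> str:
    """Bin a ONE-seq Screen score using the thresholds from the table."""
    if screen_score >= 0.01:
        return "higher"
    if screen_score >= 0.001:
        return "lower"
    return "below threshold"  # assumed label; the table does not name this bin


def cleaved_fraction(higher: int, lower: int, total: int) -> float:
    """Share of screened loci that fall in either cleavage bin."""
    return (higher + lower) / total


# (higher, lower, total screened) per guide, copied from the table above.
counts = {
    "PCSK9-1": (8, 22, 1910),
    "PCSK9-3": (11, 42, 2014),
    "PCSK9-4": (15, 47, 2718),
}

for guide_id, (higher, lower, total) in counts.items():
    print(guide_id, f"{cleaved_fraction(higher, lower, total):.2%}")
```

On these counts, PCSK9-1 has both the fewest binned sites and the smallest cleaved share, in line with its ranking in the Results.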

C) Guide Select nominated off-targets by tier

Three scatter plots for the guides (from left to right) PCSK9-1, -3, and -4. Each plot shows ONE-seq score (up to 10) against annotation concern score (up to 15) and is divided into four quadrants: tier 2a (top left), tier 1 (top right), tier 3 (bottom left), and tier 2b (bottom right). The on-target site is not shown in this figure.

D) Candidate off-target loci from tier 1 and tier 2a

Candidate off-target loci from tier 1 and tier 2a (high cleavage frequency) on the same chromosome as the on-target locus for all guides (PCSK9-1, PCSK9-3, and PCSK9-4, from top to bottom) are shown here. On-target locus is depicted with a green arrow, tier 1 with an orange arrow, and tier 2a with a yellow one. Due to scaling, arrows indicating off-target sites very close in location may not be individually discernible.

E) Gene, transcript and regulatory region annotation

A table titled Gene, transcript and regulatory region annotation, with secondary labels for GENCODE annotation (loci in protein-coding intronic regions, loci with a GENCODE transcript ID available, loci in long non-coding RNA) and ENCODE annotation (loci in PLS, promoter-like signature, regions; dELS, distal enhancer-like signature, regions; and pELS, proximal enhancer-like signature, regions).
Guide | Total loci | Protein-coding intronic | Transcript ID available | Long non-coding RNA | PLS | dELS | pELS
PCSK9-1 | 11 | 8 | 10 | 2 | 1 | 3 | 0
PCSK9-3 | 20 | 14 | 17 | 3 | 0 | 2 | 1
PCSK9-4 | 21 | 19 | 21 | 2 | 0 | 3 | 3

Figure 3: A) Example of Guide Select workflow for multiplexed, biochemical guide RNA screening. B) Count of candidate off-targets binned by cleavage frequency. A small number of sites screened in the Guide Select assay are cleaved at appreciable frequencies. C) Guide Select nominated off-targets are shown by tier for guides of interest. The on-target is not presented in this figure. D) Candidate off-target loci from tier 1 and tier 2a (high cleavage frequency) on the same chromosome as the on-target locus for all guides are shown here. Note that, due to scaling, arrows indicating off-target sites very close in location may not be individually discernible. E) Counts of tier 1, 2a and 2b putative off-target loci that are linked to annotations from the GENCODE and ENCODE databases.

Comprehensive off-target nomination and risk tiering of PCSK9-1 with ONE-seq

A) ONE-seq workflow

Graphic and text of ONE-seq workflow featuring a column of summary points (left) and a row of steps in the process (middle and right). Column: A magnifying glass over an illustration of people signifies ONE-seq’s population-scale, variant-aware off-target detection. Beneath this is a human figure surrounded by a target symbol, indicating the assay’s high sensitivity to low-frequency off-target events. Row: A blank laptop screen indicates the assay’s computational enrichment of candidate off-target sequences from multiple genomes. Next is a microchip icon, representing the DNA chip that synthesizes target oligos (~50-240k), followed by an illustration of documents representing the uniform ONE-seq library. After that is a magnifying glass with a zigzagging line of electricity through it, representing in vitro editing (Cas9, Cas12a, ABE, CBE, etc.). Next is a hard drive icon signifying deep NGS sequencing, and finally a laptop screen with data on it, for the off-target nomination and analysis of biological impacts.

B) ONE-seq nominated off-targets by tier for PCSK9-1

Scatter plot of ONE-seq nominated off-targets for the PCSK9-1 guide. Plot shows 1:1 mean ONE-seq score (up to 1) against annotation concern score (up to 15). The plot is divided into four quadrants: tier 2a (top left), tier 1 (top right), tier 3 (bottom left), and tier 2b (bottom right), with most points in tier 3 (ONE-seq score below 0.01 and annotation concern score between 0 and 8).

C) Description of ERC tiers assigned to candidate off-target loci

ERC tier | Description | ONE-seq score | Concern score | # Loci
1 | High cleavage frequencies; high concern scores; higher chances of functional impact; higher chances of structural changes | ≥0.01 | ≥8 | 2
2a | High cleavage frequencies; low concern scores; lower chances of functional impact; higher chances of structural changes | ≥0.01 | <8 | 6
2b | Low cleavage frequencies; high concern scores; higher chances of functional impact; lower chances of structural changes | <0.01 | ≥8 | 5
3 | Low cleavage frequencies; low concern scores; lower chances of functional impact; lower chances of structural changes | <0.01 | <8 | 30
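The tier definitions above reduce to two threshold tests, one on the ONE-seq score (0.01) and one on the annotation concern score (8). The sketch below expresses that rule directly. The function name is hypothetical and the example scores are made up. Only the thresholds and the per-tier locus counts come from the table.

```python
def erc_tier(one_seq_score: float, concern_score: float) -> str:
    """Assign an ERC tier from the two thresholds given in the tier table."""
    high_cleavage = one_seq_score >= 0.01
    high_concern = concern_score >= 8
    if high_cleavage and high_concern:
        return "1"
    if high_cleavage:
        return "2a"
    if high_concern:
        return "2b"
    return "3"


# Hypothetical (ONE-seq score, concern score) pairs, for illustration only.
examples = [(0.2, 10), (0.05, 3), (0.002, 12), (0.0005, 1)]
for score, concern in examples:
    print(score, concern, "-> tier", erc_tier(score, concern))

# Locus counts per tier for PCSK9-1, copied from the table above.
loci_per_tier = {"1": 2, "2a": 6, "2b": 5, "3": 30}
print("total tiered loci:", sum(loci_per_tier.values()))
```

Because both tests use "greater than or equal to", a locus sitting exactly on a threshold lands in the higher-risk tier.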

D) & E) Contribution of individual genomes to nominated off-target loci

A circle graph describes the breakdown of tiered off-targets that are variants: 70% of all tiered off-targets are variants, including 80% of tier 2b, 66% of tier 2a, and 50% of tier 1.
A horizontal bar graph describes contribution of individual genomes to nominated off-target loci by eight superpopulations (y axis) and allelic fractions, represented by bar colors. Orange is an allelic fraction 0.1-1, yellow 0.05-0.1, teal 0.01-0.05, and light blue is 0-0.01.

Figure 4: A) Example of ONE-seq workflow for comprehensive off-target nomination and risk assessment. B) ONE-seq nominated off-targets are shown by tier for PCSK9-1. No on-target variation is observed. C) Description of ERC tiers assigned to candidate off-target loci. Higher ONE-seq scores indicate observation of higher cleavage frequencies in the ONE-seq assay. Annotation concern scores represent summary of genomic, functional and known disease associations of loci. D) & E) Contribution of individual genomes to the list of nominated off-target loci.
