The design of CRISPR gRNAs requires accurate on-target efficiency predictions, which demand high-quality gRNA activity data and efficient modeling. To this end, we report the generation of on-target gRNA activity data for 10,592 SpCas9 gRNAs. Integrating these with complementary published data, we train a deep learning model, CRISPRon, on 23,902 gRNAs. Compared to existing tools, CRISPRon exhibits significantly higher prediction performance on four test datasets that do not overlap with the training data used to develop these tools. Furthermore, we present an interactive gRNA design webserver based on the CRISPRon standalone software, both available via https://rth.dk/resources/crispr/. CRISPRon advances CRISPR applications by providing more accurate gRNA efficiency predictions than the existing tools.
High-quality gRNA activity data is needed for accurate on-target efficiency predictions. Here the authors generate activity data for over 10,000 gRNAs and build a deep learning model, CRISPRon, for improved performance predictions.
When working with CRISPR, it is important to realize that gRNAs differ in how well they work. Some are highly efficient and generate a high frequency of the predicted modifications, while others are less efficient. Many of these differences are not obvious and cannot be predicted by simply looking at the sequences. In recent years, many efforts have therefore been made to develop models that help scientists select the most efficient gRNAs. Recently, researchers created a new deep learning model, CRISPRon, using a novel approach in which a barcoded gRNA and its endogenous substitution site are placed in the same lentiviral vector. In a single experiment, thousands of different lentiviral vectors can be used to transduce human cells. The gene editing events were then analyzed, yielding a massively parallel quantification of the editing activity of more than 10,000 gRNAs from this lentiviral library. This large dataset was used to train CRISPRon, which was found to predict gRNA efficiency significantly better than existing models.
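To make the idea of parallel quantification concrete, the following is a minimal sketch of how per-gRNA editing activity could be tallied from sequencing reads once each read has been assigned to a barcode and classified as edited or unedited. It is an illustration under stated assumptions, not the authors' pipeline: the function name `editing_efficiency`, the `(barcode, is_edited)` input format, and the `min_reads` cutoff are all hypothetical.

```python
from collections import defaultdict


def editing_efficiency(reads, min_reads=1):
    """Return the percentage of edited reads for each barcode.

    Hypothetical input format: `reads` is an iterable of
    (barcode, is_edited) pairs, one pair per sequencing read.
    Barcodes with fewer than `min_reads` reads are dropped, because
    an efficiency estimated from very few reads is unreliable.
    """
    totals = defaultdict(int)   # reads observed per barcode
    edited = defaultdict(int)   # edited reads per barcode
    for barcode, is_edited in reads:
        totals[barcode] += 1
        if is_edited:
            edited[barcode] += 1
    return {
        barcode: 100.0 * edited[barcode] / count
        for barcode, count in totals.items()
        if count >= min_reads
    }


# Toy example with three barcodes (made-up data, for illustration only).
example_reads = [
    ("BC1", True), ("BC1", False),
    ("BC2", True), ("BC2", True),
    ("BC3", False),
]
print(editing_efficiency(example_reads, min_reads=2))
```

In this toy input, `BC3` is excluded because it has a single read, which falls below the assumed `min_reads=2` cutoff. Efficiencies computed this way for each gRNA are the kind of activity values a prediction model would be trained on.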