Synthetic lethal (SL) gene pairs are pairs of genes for which simultaneous loss of function of both genes causes cell death, while loss of either gene alone does not compromise cell survival. Identifying these pairs is crucial for developing targeted therapies, particularly in cancer, where an SL interaction can be exploited to selectively kill cancer cells that harbor a mutation in one gene of the pair while sparing normal cells. However, experimental identification of SL pairs remains a significant challenge: the number of possible gene combinations is vast, and SL interactions are rare and context-dependent, which makes exhaustive experimental screening impractical in both time and cost. Foundation models offer an alternative, because they can learn complex gene dependencies and can be used to simulate genetic perturbations, such as gene knockouts, at single-cell resolution. Geneformer is a single-cell foundation model pre-trained on gene expression data from 95 million cells that represents each cellular state as an ordered list of genes ranked by expression. Its architecture supports in silico perturbation, which makes it well suited to predicting SL interactions. We constructed a classifier that predicts cell viability and SL interactions by combining Geneformer’s pre-trained knowledge with data from the Cancer Dependency Map (DepMap), which includes cell viability scores for a variety of gene knockouts, and from SynLethDB, a comprehensive database of experimentally validated SL pairs. We identify SL gene pairs by measuring how both the Geneformer-generated gene embeddings and the predicted viability scores diverge between single-gene and dual-gene knockouts.
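The comparison between single-gene and dual-gene knockouts might be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names (cosine_distance, sl_divergence), the use of cosine distance as the embedding divergence, and the toy vectors and viability scores are all hypothetical. In practice the embeddings would come from in silico perturbation with Geneformer and the viability scores from the trained classifier.

```python
import numpy as np


def cosine_distance(u, v):
    """Cosine distance between two embedding vectors (assumed divergence metric)."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def sl_divergence(emb_ctrl, emb_a, emb_b, emb_ab, via_a, via_b, via_ab):
    """Compare a dual knockout (A+B) with the two single knockouts.

    Returns two hypothetical quantities:
      excess_shift: how much further the dual-knockout embedding moves from
                    the unperturbed state than the larger single-knockout shift
      excess_drop:  how much lower the predicted dual-knockout viability is
                    than the lower of the two single-knockout viabilities
    Large positive values of both would be consistent with an SL interaction.
    """
    shift_a = cosine_distance(emb_ctrl, emb_a)
    shift_b = cosine_distance(emb_ctrl, emb_b)
    shift_ab = cosine_distance(emb_ctrl, emb_ab)
    excess_shift = shift_ab - max(shift_a, shift_b)
    excess_drop = min(via_a, via_b) - via_ab
    return excess_shift, excess_drop


# Toy inputs (made up for illustration; not real model outputs)
emb_ctrl = [1.0, 0.0]   # unperturbed cell state
emb_a = [1.0, 0.1]      # knockout of gene A alone: small shift
emb_b = [1.0, 0.1]      # knockout of gene B alone: small shift
emb_ab = [0.0, 1.0]     # dual knockout: large shift

excess_shift, excess_drop = sl_divergence(
    emb_ctrl, emb_a, emb_b, emb_ab, via_a=0.9, via_b=0.95, via_ab=0.2
)
print(excess_shift, excess_drop)
```

Taking the larger single-knockout shift and the lower single-knockout viability as baselines is one simple design choice: it flags a pair only when the dual knockout goes beyond what either single knockout already explains.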
The Identification of Synthetically Lethal Gene Pairs Using Single-cell Foundation Models
Category
Student Abstract Submission