# scVAR

**scVAR** is a computational tool for extracting and integrating genetic variants from single-cell RNA-seq (scRNA-seq) data. It uses variational autoencoders to build a latent space that combines transcriptional and genetic signals, helping resolve cellular heterogeneity, especially in complex diseases like leukemia.

## 🔍 Motivation

Leukemias such as AML and B-ALL show high genetic and transcriptomic heterogeneity, making clonal analysis challenging. While scRNA-seq is widely used to study gene expression, it also contains useful information about genetic variants. scVAR takes advantage of this to jointly analyze transcriptional and genetic signals from the same dataset, without the need for parallel DNA sequencing.

## 🧠 What it does

- Detects expressed genetic variants directly from scRNA-seq data
- Integrates transcriptomic and variant information using multi-input variational autoencoders
- Builds a shared latent space capturing both omics layers
- Improves detection of rare subclones and subtle transcriptional states
- Recovers structure that is often missed by transcriptomic or genomic data alone

## 📊 Use cases

- Clonal architecture analysis in AML and B-ALL
- Interpretation of relapse samples
- Joint modeling of gene expression and mutation signals
- Making use of sparse variant data from 10x Genomics 5′ scRNA-seq

## 📁 Data & Results

In AML samples, scVAR revealed subclones with distinct transcriptional programs that were not identifiable using gene expression or variants alone. In B-ALL, it uncovered fine-grained cellular structure and helped disentangle overlapping signals from transcriptomic and genetic data.

## 🚀 Getting Started

See the `notebooks/` folder for example workflows.

To install dependencies:

```
pip install -r requirements.txt
```

Compatible with **Python ≥ 3.8**.

## 📜 License

Distributed under the MIT License. See `LICENSE` for details.
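
## 🧩 Illustrative sketch: a two-input VAE

The "What it does" list describes multi-input variational autoencoders that place expression and variant signals in one shared latent space. The block below is a minimal NumPy sketch of that general idea, not scVAR's actual architecture or API. Every name, layer size, and loss choice in it is an assumption made for illustration: untrained random weights, toy data, a squared-error reconstruction term for log-transformed expression, and a Bernoulli term for binary variant calls. It runs a single forward pass and evaluates the loss; no training is performed.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    """Random weights and zero bias for one linear layer (untrained)."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical sizes: cells, genes, variant sites, hidden units, latent dims.
n_cells, n_genes, n_variants, n_hidden, n_latent = 50, 200, 30, 32, 8

# Toy inputs standing in for real data (assumption, not scVAR's input format).
expr = np.log1p(rng.poisson(1.0, size=(n_cells, n_genes)).astype(float))
variants = rng.binomial(1, 0.05, size=(n_cells, n_variants)).astype(float)

# One encoder branch per modality, merged into a shared latent layer.
W_e, b_e = dense(n_genes, n_hidden)
W_v, b_v = dense(n_variants, n_hidden)
W_mu, b_mu = dense(2 * n_hidden, n_latent)
W_lv, b_lv = dense(2 * n_hidden, n_latent)

# One decoder head per modality, both reading the same latent code.
W_de, b_de = dense(n_latent, n_genes)
W_dv, b_dv = dense(n_latent, n_variants)

def forward(expr, variants):
    """Encode both inputs, sample z, and decode each modality."""
    h = np.concatenate(
        [relu(expr @ W_e + b_e), relu(variants @ W_v + b_v)], axis=1
    )
    mu = h @ W_mu + b_mu
    logvar = h @ W_lv + b_lv
    # Reparameterization: z = mu + sigma * eps, with eps ~ N(0, I).
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
    expr_hat = z @ W_de + b_de
    var_prob = 1.0 / (1.0 + np.exp(-(z @ W_dv + b_dv)))
    return mu, logvar, z, expr_hat, var_prob

def vae_loss(expr, variants, mu, logvar, expr_hat, var_prob):
    """Negative ELBO (up to constants): two reconstruction terms plus KL."""
    recon_expr = np.mean(np.sum((expr - expr_hat) ** 2, axis=1))
    p = np.clip(var_prob, 1e-7, 1.0 - 1e-7)
    recon_var = np.mean(
        -np.sum(variants * np.log(p) + (1 - variants) * np.log(1 - p), axis=1)
    )
    kl = np.mean(-0.5 * np.sum(1 + logvar - mu**2 - np.exp(logvar), axis=1))
    return recon_expr + recon_var + kl

mu, logvar, z, expr_hat, var_prob = forward(expr, variants)
total = vae_loss(expr, variants, mu, logvar, expr_hat, var_prob)
print(z.shape, float(total))
```

In a trained model of this shape, the rows of `z` would serve as the joint embedding used for clustering or visualization. The example workflows in `notebooks/` remain the reference for how scVAR itself is run.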