## Introduction

**Whole Exome** data analysis: from FASTQ data to variant calling and annotation

As stated in the supplementary methods, we performed WES analysis relying mostly on the GATK best practices. The main steps and the corresponding commands are listed below.

# Alignment #

FASTQ files were aligned with the [BWA aligner](https://github.com/lh3/bwa) (v0.7.17) to the GRCh38 reference genome (GRCh38.p13 from GENCODE) using default parameters, except for the `-M` option, which is needed for [Picard](https://broadinstitute.github.io/picard/) compatibility when marking duplicates.

`bwa mem -t 12 -R '@RG\tID:BALL_10_395_3_L1\tSM:BALL_10_395_3\tPL:ILLUMINA' -M GRCh38.p13.genome.fa 3_S8_L001_R1_001.fastq.gz 3_S8_L001_R2_001.fastq.gz`
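
The read group passed to `-R` has to reach BWA as a single argument, with its fields separated by the two-character sequence `\t` (BWA converts it to a tab itself). The following is a minimal sketch of a helper that assembles such a string from the ID, sample and platform values. The function name `make_read_group` is hypothetical and is not part of the original pipeline.

```shell
# Hypothetical helper (not from the original pipeline): build an @RG string for bwa mem -R.
# Arguments: read group ID, sample name, platform.
make_read_group() {
    # '\\t' in the printf format prints a literal backslash followed by 't';
    # bwa mem turns that sequence into a real tab when it writes the SAM header.
    printf '@RG\\tID:%s\\tSM:%s\\tPL:%s\n' "$1" "$2" "$3"
}

make_read_group BALL_10_395_3_L1 BALL_10_395_3 ILLUMINA

# Possible use (requires bwa to be installed, so left commented out):
# bwa mem -t 12 -M -R "$(make_read_group BALL_10_395_3_L1 BALL_10_395_3 ILLUMINA)" \
#     GRCh38.p13.genome.fa 3_S8_L001_R1_001.fastq.gz 3_S8_L001_R2_001.fastq.gz
```

Quoting the command substitution keeps the whole read group as one argument, which avoids the shell splitting it on whitespace.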

We aligned the reads to both the human GRCh38 and the mouse GRCm38 reference genomes so that reads could be disambiguated: reads in the human alignment that actually originate from mouse cells are discarded.
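
Aligning against two references means running the same `bwa mem` invocation once per genome on the same pair of FASTQ files. The sketch below is a dry run that only prints the two commands and does not execute BWA. The mouse FASTA name `GRCm38.genome.fa` and the function name `print_alignment_commands` are assumptions made for illustration, and the `-R` read group is omitted for brevity.

```shell
# Dry run: print (do not execute) one bwa mem command per reference genome.
# GRCm38.genome.fa is an assumed file name; only the human FASTA is named in the text above.
print_alignment_commands() {
    r1="$1"   # forward reads
    r2="$2"   # reverse reads
    for ref in GRCh38.p13.genome.fa GRCm38.genome.fa; do
        printf 'bwa mem -t 12 -M %s %s %s\n' "$ref" "$r1" "$r2"
    done
}

print_alignment_commands 3_S8_L001_R1_001.fastq.gz 3_S8_L001_R2_001.fastq.gz
```

The two resulting alignments are then the input to the disambiguation step, which is not shown here.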