## Introduction
**Whole Exome** data analysis: from FASTQ data to variant calling and annotation

As stated in the supplementary methods, we performed the WES analysis largely following the GATK best practices for WES analysis.
The main steps and the corresponding commands are listed below.
# Alignment #
FASTQ files were aligned with the [BWA aligner](https://github.com/lh3/bwa) (v0.7.17) to the GRCh38 reference genome (GRCh38.p13, GENCODE) using default parameters, except for the -M option, which is required for compatibility with [Picard](https://broadinstitute.github.io/picard/) when marking duplicates.
Code
```
bwa mem -t 12 -R '@RG\tID:BALL_10_395_3_L1\tSM:BALL_10_395_3\tPL:ILLUMINA' -M GRCh38.p13.genome.fa 3_S8_L001_R1_001.fastq.gz 3_S8_L001_R2_001.fastq.gz > BALL_10_395_3_L1.sam
picard SortSam INPUT=BALL_10_395_3_L1.sam OUTPUT=BALL_10_395_3_L1_mouse.bam SORT_ORDER=coordinate
picard MergeSamFiles I=BALL_10_395_3_L1_mouse.bam I=BALL_10_395_3_L2_mouse.bam OUTPUT=BALL_10_395_3_mouse.bam
samtools index BALL_10_395_3_mouse.bam
picard MarkDuplicates INPUT=BALL_10_395_3_mouse.bam OUTPUT=BALL_10_395_3.dedup_reads_mouse.bam METRICS_FILE=BALL_10_395_3.metrics_mouse.txt
gatk BaseRecalibrator --input BALL_10_395_3.dedup_reads_mouse.bam --reference $genome --known-sites $vreference --output BALL_10_395_3_recal_data_mouse.table
gatk ApplyBQSR --reference $genome --input BALL_10_395_3.dedup_reads_mouse.bam --output BALL_10_395_3.dedup_reads_mouse_recal.bam --bqsr-recal-file BALL_10_395_3_recal_data_mouse.table --static-quantized-quals 10 --static-quantized-qual
gatk BaseRecalibrator --input BALL_10_395_3.dedup_reads_mouse_recal.bam --reference $genome --known-sites $vreference --output BALL_10_395_3_post_recal_data_mouse.table
```
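The read-group string passed to `bwa mem -R` is easy to get wrong, because it must be quoted and its fields must be separated by literal `\t` sequences. The following is a minimal sketch of how the per-lane command could be assembled. The helper names `make_read_group` and `align_lane_cmd` are hypothetical and not part of the original pipeline, and the command is only printed, not executed.

```shell
# Build the read-group header for one lane of one sample.
# printf '%s' keeps the "\t" sequences literal; bwa mem expands them itself.
make_read_group() {
    local sample="$1" lane="$2"
    printf '%s' "@RG\\tID:${sample}_${lane}\\tSM:${sample}\\tPL:ILLUMINA"
}

# Print (dry run) the bwa mem command for one lane.
# Arguments: sample, lane, reference FASTA, R1 FASTQ, R2 FASTQ.
align_lane_cmd() {
    local sample="$1" lane="$2" ref="$3" r1="$4" r2="$5"
    printf "bwa mem -t 12 -M -R '%s' %s %s %s > %s_%s.sam\n" \
        "$(make_read_group "$sample" "$lane")" "$ref" "$r1" "$r2" "$sample" "$lane"
}

align_lane_cmd BALL_10_395_3 L1 GRCh38.p13.genome.fa \
    3_S8_L001_R1_001.fastq.gz 3_S8_L001_R2_001.fastq.gz
```

Printing the command first makes it possible to check the quoting and the output file name for every lane before anything is run.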
## Disambiguation ##
We aligned the reads to both the human and the mouse reference genomes so that [disambiguate](https://pubmed.ncbi.nlm.nih.gov/27990269/) could assign each read to its species of origin. Reads in the human alignment that originate from mouse cells were then discarded.
Code
`python disambiguate.py -a bwa -s Sample_10_395_3 02-Alignment_human/BALL_10_395_3.dedup_reads_rehead_recal.bam 02-Alignment_mouse/BALL_10_395_3.dedup_reads_mouse_recal.bam`
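The idea behind disambiguation can be illustrated with a toy example: each read has an alignment score against both genomes, and it is assigned to whichever species gives the better score. The sketch below is a simplified illustration only, not the actual logic of `disambiguate.py`. The function name `classify_reads`, the three-column input format, and the scores are all made up for the example.

```shell
# Toy classifier: input lines are "<read name> <human score> <mouse score>".
# A read goes to the species with the higher score; ties are ambiguous.
classify_reads() {
    awk '{
        if ($2 > $3)      label = "human"
        else if ($3 > $2) label = "mouse"
        else              label = "ambiguous"
        print $1 "\t" label
    }'
}

# Hypothetical scores for three reads.
printf 'read1 60 35\nread2 40 58\nread3 50 50\n' | classify_reads
```

In this toy setting, only reads labelled `human` would be kept for the downstream human variant calling.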
# Variant calling #