Matteo Barcella committed
## Introduction

**Whole Exome** data analysis - from fastq data to variant calling and annotation

As described in the supplementary methods, we performed the WES analysis largely following the GATK best practices.

The main steps and the corresponding commands are listed below:

# Alignment #

Fastq files were aligned with the [BWA aligner](https://github.com/lh3/bwa) (v0.7.17) to the GRCh38 reference genome (GRCh38.p13 gencodegenes) using default parameters, except for the `-M` option, which is needed for compatibility with [Picard](https://broadinstitute.github.io/picard/) when marking duplicates.

<details><summary>Click to expand</summary>
`bwa mem -t 12 -R '@RG\tID:BALL_10_395_3_L1\tSM:BALL_10_395_3\tPL:ILLUMINA' -M GRCh38.p13.genome.fa 3_S8_L001_R1_001.fastq.gz 3_S8_L001_R2_001.fastq.gz`
</details>
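
The `-R` value has to reach `bwa` as a single shell argument, with a literal `\t` between read-group fields (`bwa` converts these to tabs itself). A minimal sketch of assembling that string in shell follows; the `make_read_group` helper is hypothetical and not part of the original pipeline.

```shell
#!/bin/sh
# Hypothetical helper (not from the original pipeline): build the value
# passed to `bwa mem -R` from a read-group ID, a sample name and a platform.
make_read_group() {
    # '\\t' in the printf format emits a literal backslash followed by 't',
    # so the fields stay separated by the two-character sequence \t.
    printf '@RG\\tID:%s\\tSM:%s\\tPL:%s' "$1" "$2" "$3"
}

RG=$(make_read_group BALL_10_395_3_L1 BALL_10_395_3 ILLUMINA)

# Print with printf rather than echo: some shells' echo would expand \t.
printf '%s\n' "$RG"
```

When the string is passed on, quoting it as `-R "$RG"` keeps it in one piece.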

Reads were aligned to both the human GRCh38 and the mouse GRCm38 reference genomes so that read disambiguation could be performed and reads in the human alignment that originate from mouse cells could be discarded.
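
The dual alignment can be sketched as a dry run that prints one `bwa mem` command per reference instead of executing it. The mouse FASTA file name (`GRCm38.genome.fa`) and the `align_cmd` helper are assumptions made for illustration; only the human reference file name and the other options come from the command above.

```shell
#!/bin/sh
# Dry run: print the alignment command for each reference genome.
HUMAN_REF=GRCh38.p13.genome.fa
MOUSE_REF=GRCm38.genome.fa   # assumption: the mouse FASTA name is not given in this README
R1=3_S8_L001_R1_001.fastq.gz
R2=3_S8_L001_R2_001.fastq.gz
RG='@RG\tID:BALL_10_395_3_L1\tSM:BALL_10_395_3\tPL:ILLUMINA'

# Hypothetical helper: $1 is the reference FASTA; prints the command line.
align_cmd() {
    printf "bwa mem -t 12 -R '%s' -M %s %s %s\n" "$RG" "$1" "$R1" "$R2"
}

# Same reads and same options, one command per reference.
CMDS=$(for ref in "$HUMAN_REF" "$MOUSE_REF"; do align_cmd "$ref"; done)
printf '%s\n' "$CMDS"
```

The two resulting alignments of the same reads are the inputs to the disambiguation step; the tool used for that step is not named in this section, so it is left out of the sketch.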