Commit bc10093f authored by Stefano Beretta

Update README.md

parent ebc0635b
@@ -17,3 +17,11 @@ RNA-seq analysis to compare Treated (BE4, ABE8, Cas9, Mock electro) vs Untreated
- Differential Gene Expression (DGE) analysis with the R/Bioconductor package `DESeq2`: genes with FDR < 0.05 were considered differentially expressed
- post-analyses with the R/Bioconductor package `clusterProfiler` using the Hallmark collection from MSigDB as reference database.
- visualization of the (spliced) alignments on the TP73 gene was done with the Integrative Genomics Viewer (`IGV`).
Variant calling analysis on RNA-Seq base editing data:
- merging of reads from the replicates of each condition and downsampling to 120M reads with `SeqTK`;
- alignment to the human genome assembly (GRCh38) with `STAR`;
- duplicate marking with Picard `MarkDuplicates` and splitting of reads containing Ns with GATK `SplitNCigarReads`;
- variant calling using three different tools: `HaplotypeCaller` (with options `--min-base-quality-score 20`, `--dont-use-soft-clipped-bases`, and `--standard-min-confidence-threshold-for-calling 20`), `Mutect2` (in tumor-only mode, with option `--disable-read-filter MateOnSameContigOrNoMappedMateReadFilter`), and `FreeBayes`.
Nucleotide composition at each position was also assessed with REDItools (https://github.com/tflati/reditools2.0) on each sample, discarding all positions with coverage lower than 20 and base quality lower than 30 to avoid errors due to low sampling. Next, variants called by each tool in the untreated controls were removed from the treated samples to enrich for private mutations. This procedure retained only variants at high-quality genomic positions in both treated and untreated samples, for which the untreated sample showed ≥ 99% of reads supporting the reference (non-mutant) base at the position of the mutation (based on REDItools). The final list of variants for each sample consists of those called by all three tools and passing the filtering procedure (intersection).
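
The filtering and intersection logic described above can be illustrated with a minimal Python sketch. It is not the code used in this analysis: the function name `private_variants`, the data structures, and the toy values are hypothetical. It assumes that variants are matched by chromosome, position, and alleles, that control variants are subtracted per tool, and that the high-quality sites and reference fractions have already been derived from the REDItools tables.

```python
from typing import Dict, Set, Tuple

# A variant is keyed by (chromosome, 1-based position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]
Site = Tuple[str, int]


def private_variants(
    treated_calls: Dict[str, Set[Variant]],
    untreated_calls: Dict[str, Set[Variant]],
    treated_ok_sites: Set[Site],
    untreated_ref_fraction: Dict[Site, float],
    min_ref_fraction: float = 0.99,
) -> Set[Variant]:
    """Return variants private to the treated sample and called by all tools.

    treated_ok_sites: sites passing the coverage/quality filters in the
        treated sample (assumed precomputed).
    untreated_ref_fraction: fraction of reads supporting the reference base
        in the untreated sample, only for sites passing the same filters.
    """
    kept_per_tool = []
    for tool, calls in treated_calls.items():
        # Drop variants that the same tool also called in the untreated control.
        control = untreated_calls.get(tool, set())
        kept = set()
        for variant in calls - control:
            site = (variant[0], variant[1])
            # Require a high-quality position in the treated sample.
            if site not in treated_ok_sites:
                continue
            # Require a high-quality, (almost) purely reference position in
            # the untreated sample; sites missing from the table are rejected.
            if untreated_ref_fraction.get(site, 0.0) < min_ref_fraction:
                continue
            kept.add(variant)
        kept_per_tool.append(kept)
    # Final list: intersection across all tools.
    return set.intersection(*kept_per_tool) if kept_per_tool else set()


# Toy data, invented for illustration only.
treated = {
    "HaplotypeCaller": {("chr1", 100, "C", "T"), ("chr1", 200, "A", "G"), ("chr2", 50, "G", "A")},
    "Mutect2": {("chr1", 100, "C", "T"), ("chr1", 200, "A", "G"), ("chr2", 50, "G", "A"), ("chr3", 10, "T", "C")},
    "FreeBayes": {("chr1", 100, "C", "T"), ("chr1", 200, "A", "G"), ("chr2", 50, "G", "A")},
}
untreated = {
    "HaplotypeCaller": {("chr1", 200, "A", "G")},
    "Mutect2": {("chr1", 200, "A", "G")},
    "FreeBayes": set(),
}
ok_sites = {("chr1", 100), ("chr1", 200), ("chr2", 50)}
ref_fraction = {("chr1", 100): 0.995, ("chr1", 200): 0.60, ("chr2", 50): 0.97}

result = private_variants(treated, untreated, ok_sites, ref_fraction)
print(sorted(result))
```

In this toy example only the chr1:100 C>T variant is retained: it is called by all three tools, is absent from the controls, and sits at a position where the untreated sample is at least 99% reference.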