**Bulk RNAseq** data analysis workflow includes the following steps:
1. Quality control by [FastQC](https://www.bioinformatics.babraham.ac.uk/projects/fastqc/)
2. Trimming of bad quality reads with [TrimGalore](https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/)
3. Alignment with [STAR](https://github.com/alexdobin/STAR)
Running command
"STAR " +
"--runThreadN {threads} " +
"--genomeDir {input.genome} " +
"--readFilesIn {params.trim_seq} " +
"--outSAMstrandField intronMotif " +
"--outFileNamePrefix {params.aln_seq_prefix} " +
"--outSAMtype BAM SortedByCoordinate " +
"--outSAMmultNmax 1 " +
"--outFilterMismatchNmax 10 " +
"--outReadsUnmapped Fastx " +
"--readFilesCommand zcat "
4. Gene expression quantification with [FeatureCounts](https://academic.oup.com/bioinformatics/article/30/7/923/232889)
Running command
"featureCounts " +
"-a {input.annot} " +
"-o {output.fcount} " +
"-g gene_name " +
"-p -B -C " +
"-s {params.strand} " +
"--minOverlap 10 " +
"-T {threads} " +
"{input.bams} "
5. Differential Expression analysis with [Deseq2](https://bioconductor.org/packages/release/bioc/html/DESeq2.html) and [edgeR](https://doi.org/10.1093/bioinformatics/btp616).
For Thal vs HD comparison intra population (HSC or MPP - n = 2) we used the simple design with counts ~ condition, whereas for assessing differencies between Thal and HD in HSC and MPP togheter we included the celltype covariate : count ~ celltype + condition. Genes with an FDR < 0.05 or < 0.001 were considered DEGs for intra population comparisons and combined (HSC + MPP) with covariate respectively.
6. Downstream analysis
We performed Over Representation Analyses (ORA) and Gene Set Enrichment Analyses (GSEA) using R/Bioconductor package [clusterProfiler](https://bioconductor.org/packages/release/bioc/html/clusterProfiler.html) (v.3.8.1). Pre-ranked gene list according to log2FC was used to perform Gene Set Enrichment Analysis (GSEA). Terms with adjusted p-value (Benjiamini-Hochberg correction) less than 0.05 were considered significantly enriched.
7. Supplementary analysis: Music deconvolution analysis
We also applied Music to perform BulkRNAseq deconvolution according to celltype specific transcriptomic profiles obtained from scRNAseq dataset (Tusi et al 2021).
For details see content of Music folder.