# Amodio_Infertility_2023

Amodio G, Giacomini G, Boeri L, et al. **Specific types of male infertility are correlated with T cell exhaustion or senescence signatures**

### Single-Cell RNA Sequencing Analysis ###

**scRNAseq** (from 10X Genomics) analysis of CD3+ T cells purified from the peripheral blood of men diagnosed with oligo-astheno-teratozoospermia (OAT, n=4) or idiopathic non-obstructive azoospermia (iNOA, n=6), and of a control group (FER, n=5).

scRNAseq analysis was performed using a standard [Seurat](https://satijalab.org/seurat/) pipeline that goes from a minimal object, created after loading the 10X data, to marker identification and includes the following steps:

- Preprocessing and cell filtering
  - Each sample was pre-processed separately. Cells with a mitochondrial RNA percentage higher than 10, or with a number of features <1200 or >6000, were filtered out. Samples were then merged into a single Seurat dataset.
- Normalization
  - Default Seurat settings [(NormalizeData function)](https://satijalab.org/seurat/reference/normalizedata)
- Scaling:
  - During scaling, the UMI count, the percentage of mitochondrial genes, and the difference between the cell cycle phase scores were regressed out, as described in the Seurat [vignette](https://satijalab.org/seurat/articles/cell_cycle_vignette.html#alternate-workflow-1).
- Dimensionality reduction and Harmony batch removal:
  - A principal component analysis (PCA) with 100 principal components (PCs) was performed, and the UMAP representation and the clusters were computed on the top 55 components (orig.ident as batch variable).
- Clustering:
  - A K-nearest neighbor (KNN) graph was first constructed based on the Euclidean distance using the [FindNeighbors](https://satijalab.org/seurat/reference/findneighbors) function, with the number of nearest neighbors (k) set to 20.
  - Modularity optimization was then applied using the Louvain algorithm through the [FindClusters](https://satijalab.org/seurat/reference/findclusters) function, with the resolution parameter set to 1.2.
- Marker identification:
  - Marker genes for each cluster were identified using the [FindAllMarkers](https://satijalab.org/seurat/reference/findallmarkers) function with the logfc.threshold argument set to 0.25. Only genes expressed in at least 25% of cells in one of the compared clusters were considered (min.pct = 0.25). Genes with p-values < 1e-5 from the Wilcoxon Rank Sum test were considered markers for a specific cluster.
- Cluster annotation
- Gene set enrichment analysis (GSEA):
  - Intra-cluster comparisons among the experimental conditions were conducted using the [FindMarkers](https://satijalab.org/seurat/reference/findmarkers) function, setting test.use = "wilcox", a logFC threshold = 0, min.cells.group = 5 and the return.thresh parameter equal to 1.
  - The GSEA function of the [clusterProfiler R package](https://bioconductor.org/packages/release/bioc/manuals/clusterProfiler/man/clusterProfiler.pdf) was applied, using the full marker gene list ranked by decreasing logFC and the hallmark gene sets. Gene sets were considered enriched if their adjusted p-value was <0.1.

### Directories and Files ###

- sampleSheet.csv: names of samples and corresponding conditions
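
The cell-filtering rule from the preprocessing step can be spelled out in a few lines. The following is a minimal, language-neutral sketch in Python, not the Seurat code used for the analysis. The `CellQC` record, its field names, and the helper functions are hypothetical and exist only for illustration. It assumes that a cell is removed when it fails either criterion (mitochondrial percentage above 10, or a feature count outside 1200 to 6000).

```python
from dataclasses import dataclass
from typing import List

# Thresholds quoted in the preprocessing step of this README.
MAX_PERCENT_MT = 10.0   # cells above this mitochondrial percentage are removed
MIN_FEATURES = 1200     # cells with fewer detected features are removed
MAX_FEATURES = 6000     # cells with more detected features are removed


@dataclass
class CellQC:
    """Hypothetical per-cell QC record (names are illustrative only)."""
    barcode: str
    n_features: int     # number of detected genes in the cell
    percent_mt: float   # percentage of counts from mitochondrial genes


def passes_qc(cell: CellQC) -> bool:
    """Return True if the cell is kept.

    Assumed reading of the rule: a cell is dropped if it fails
    either the mitochondrial or the feature-count criterion.
    """
    return (
        cell.percent_mt <= MAX_PERCENT_MT
        and MIN_FEATURES <= cell.n_features <= MAX_FEATURES
    )


def filter_cells(cells: List[CellQC]) -> List[CellQC]:
    """Keep only the cells that pass both QC criteria."""
    return [c for c in cells if passes_qc(c)]


if __name__ == "__main__":
    # Made-up example values, not data from the study.
    example = [
        CellQC("cell_1", n_features=2500, percent_mt=4.2),
        CellQC("cell_2", n_features=800, percent_mt=3.0),    # too few features
        CellQC("cell_3", n_features=3000, percent_mt=15.0),  # too much mito RNA
    ]
    print([c.barcode for c in filter_cells(example)])
```

In Seurat itself the same rule would typically be expressed as a `subset` call on the per-cell metadata, applied to each sample before merging.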