This repository contains the code to reproduce all the analyses in our paper
introducing the SuperCellCyto R package: https://github.com/phipsonlab/SuperCellCyto.
SuperCellCyto is an adaptation of the SuperCell R package.
Initially developed for scRNAseq data, the SuperCell package aggregates cells
with similar transcriptomic profiles into "supercells" (also known as "metacells" in the scRNAseq literature).
The preprint of the paper is available on bioRxiv:
Putri, G. H., Howitt, G., Marsh-Wakefield, F., Ashhurst, T. M., & Phipson, B. (2023). SuperCellCyto: enabling efficient analysis of large scale cytometry datasets. bioRxiv; DOI: https://doi.org/10.1101/2023.08.14.553168
To reproduce all the figures in the paper, refer to the Rmd files in the analysis folder:
- `explore_supercell_purity_clustering` for Supercells Preserve Biological Heterogeneity and Facilitate Efficient Cell Type Identification
- `b_cells_identification` for Identifying Rare B Cells Subsets by Clustering Supercells
- `batch_correction` for Mitigating Batch Effects in the Integration of Multi-Batch Cytometry Data at the Supercell Level
- `de_test` for Recovery of Differentially Expressed Cell State Markers Across Stimulated and Unstimulated Human Peripheral Blood Cells
- `da_test` for Identification of Differentially Abundant Rare Monocyte Subsets in Melanoma Patients
- `label_transfer` for Efficient Cell Type Label Transfer Between CITEseq and Cytometry Data
- `run_time` for measuring the run time of SuperCellCyto and clustering process applicable for the first 3 items above.
The code folder contains the scripts used to generate the results that are
processed in the Rmd files in the analysis folder.
Please note that some of these scripts take a long time to run, which is why they are kept as separate R scripts.
Otherwise, every rebuild of the workflowr website would take hours.
The data folder stores the raw data, and the output folder stores the processed data
generated by the scripts in the code folder.
The contents of these folders are purposely not committed to the repository
because they are enormous (over 40GB in total).
If you would like to reproduce our analysis, please download the contents of the data and output folders from Zenodo: .
Instructions after downloading the files:
- Uncompress `data_20232308.tar.gz` (using `tar -zxvf <filename>.tar.gz`). You should get one `data` folder. This is the `data` folder for the workflowr website.
- Uncompress each of the `tar.gz` files starting with the word `output`. Each file should uncompress into one folder.
- Create a new folder called `output` and place all the folders uncompressed in the previous step into it.
- Run `wflow_build()`.
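
The unpacking steps above can be sketched as a small shell helper. This is only an illustration, not a script shipped with the repository: the function name `unpack_archives` is hypothetical, and it assumes that all downloaded archives sit in the repository root and that the output archives match `output*.tar.gz`. As a shortcut, it extracts each output archive directly into `output` instead of extracting first and moving the folders afterwards.

```shell
# Hypothetical helper: unpack the downloaded archives inside the repository root.
# Assumes the data archive is named data_20232308.tar.gz and that the output
# archives match output*.tar.gz, as described in the steps above.
unpack_archives() {
  repo_root="${1:-.}"
  (
    # Work in a subshell so the caller's working directory is left unchanged.
    cd "$repo_root" || exit 1

    # The data archive should unpack into a single data/ folder.
    tar -zxf data_20232308.tar.gz || exit 1

    # Unpack every output archive directly into output/, so that each
    # archive's top-level folder ends up inside it.
    mkdir -p output || exit 1
    for archive in output*.tar.gz; do
      [ -e "$archive" ] || continue   # nothing matched: skip the literal pattern
      tar -zxf "$archive" -C output || exit 1
    done
  )
}
```

Calling `unpack_archives /path/to/repository` (the path is a placeholder) should leave a `data` folder and an `output` folder in the repository root, after which `wflow_build()` can be run from R as in the final step.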