Input type
Reference genome source
ArchR genome source
Preprocessing strategy
Is genome index (bwa) available?
Is genome index (chromap) available?
Is genome index (cellranger) available?
Other pipeline-level parameters
CSV file (fragment)
Full path to CSV file containing fragment file info.
CSV file (FASTQ)
Provide the full path to a CSV file in which each row contains four columns: the absolute paths to R1, R2, and R3 from an individual run of a sample, and the sample name. An example can be found here. If sequencing data from multiple lanes/runs is provided for a sample, the corresponding FASTQ files for each lane/run must be supplied in separate rows with the same sample name. Importantly, if users have sequencing data from multiple libraries of the same sample, the sample name for each library should be distinct to avoid collapsing data from different libraries. The INPUT_CHECK_FASTQ sub-workflow checks whether the input CSV file is valid. To speed up the preprocessing of large FASTQ files, users can set the command-line parameter split_fastq to split the files into chunks of 20 million reads each using the SPLIT_FASTQ module.
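The row layout described above can be sketched as a small pre-flight check. This is only an illustration, not the pipeline's INPUT_CHECK_FASTQ logic: the function name group_runs_by_sample, the column order (R1, R2, R3, sample name), and the presence of a header line are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict


def group_runs_by_sample(csv_text, has_header=True):
    """Group (R1, R2, R3) path triplets by sample name.

    Assumed column order: R1, R2, R3, sample name (hypothetical).
    Raises ValueError for rows that do not match the described layout.
    """
    runs = defaultdict(list)
    reader = csv.reader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=1):
        if has_header and line_no == 1:
            continue  # skip the assumed header line
        if not row or all(not field.strip() for field in row):
            continue  # ignore blank lines
        if len(row) != 4:
            raise ValueError(f"line {line_no}: expected 4 columns, got {len(row)}")
        r1, r2, r3, sample = (field.strip() for field in row)
        for path in (r1, r2, r3):
            if not path.startswith("/"):
                raise ValueError(f"line {line_no}: path is not absolute: {path!r}")
        if not sample:
            raise ValueError(f"line {line_no}: empty sample name")
        # Rows sharing a sample name are treated as lanes/runs of one library.
        runs[sample].append((r1, r2, r3))
    return dict(runs)
```

Rows that share a sample name end up in the same list, which mirrors the rule that lanes/runs of one library share a name while separate libraries need distinct names.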
output folder
Full path or path relative to working directory.
species latin name
genome FASTA
genome GTF
ENSEMBL genome name
UCSC genome name
whether or not to split FASTQ
Set to "true" to split reads into 20M chunks to gain more speed.
BWA index file
barcode correction algorithm
whitelist barcode folder
Full path or path relative to working directory.
How to filter BAM files
Choose from 'false' (no BAM filtering will be performed), 'improper' (reads with low mapping quality, extreme fragment size (outside of 38-2000 bp), etc. will be filtered out), and 'both' ('improper' filtering plus removal of mitochondrial reads).
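As a rough sketch of how the three modes relate to each other, the decision for a single read could look like the following. The 38-2000 bp window comes from the description above; the function name keep_read, the mapping-quality cutoff min_mapq, the inclusive bounds, and the mitochondrial contig names are assumptions for illustration, and the additional criteria covered by "etc." are not modeled.

```python
def keep_read(mode, mapq, fragment_size, chrom,
              min_mapq=30, mito_names=("chrM", "MT")):
    """Return True if a read would survive filtering under the given mode.

    min_mapq and mito_names are hypothetical values, not pipeline defaults.
    """
    if mode == "false":
        return True  # no BAM filtering at all
    if mode not in ("improper", "both"):
        raise ValueError(f"unknown filter mode: {mode!r}")
    # 'improper': drop low mapping quality and extreme fragment sizes.
    passes_improper = mapq >= min_mapq and 38 <= abs(fragment_size) <= 2000
    if mode == "improper":
        return passes_improper
    # 'both': the 'improper' criteria plus removal of mitochondrial reads.
    return passes_improper and chrom not in mito_names
```

For instance, a well-mapped 150 bp fragment on a mitochondrial contig passes under 'improper' but is removed under 'both'.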
cellranger index folder
chromap index folder
doublet removal algorithm
Amulet rmsk bed
Full path to your Amulet rmsk BED file. Set to 'false' to skip.
Amulet autosomes file
Full path (or path relative to the working directory) to your Amulet autosomes file. E.g. assets/homo_sapiens_autosomes.txt
ArchR thread
ArchR genome
ArchR genome FASTA
TxDb
Bioconductor TxDb name for building ArchR genome.
OrgDb
Bioconductor OrgDb name for building ArchR genome.
BSgenome
Bioconductor BSgenome name for building ArchR genome.
ArchR blacklist
Full path to a blacklist BED file, which will be used in building the ArchR genome. Set to 'false' to skip.
batch correction (Harmony)
filter sample
Samples to exclude from downstream analysis. Refer to the header line of archr_clustering/Cluster_xxx_matrix.csv for valid sample names. Defaults to 'false', meaning that no samples will be excluded. E.g. 'PBMC_1K_N, PBMC_5K_V'.
filter cluster ILSI
Filters out undesired clusters (e.g. outliers). Filtered clusters will not appear in downstream analysis. The clusters are generated from the dimension-reduced matrix produced by ILSI. Refer to archr_clustering/Cluster_xxx_matrix.csv for valid cluster names. Defaults to 'false', meaning that no clusters will be excluded. E.g. 'C1, C2'.
filter cluster harmony
Filters out undesired clusters (e.g. outliers). Filtered clusters will not appear in downstream analysis. Refer to archr_clustering/Cluster_xxx_matrix.csv for valid cluster names. Defaults to 'false', meaning that no clusters will be excluded. E.g. 'C1, C2'.
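The sample and cluster filters above all take either 'false' or a comma-separated list of names. A minimal sketch of how such a value could be turned into a list is shown below; parse_filter is a hypothetical helper written for this example, not a function of the pipeline.

```python
def parse_filter(value):
    """Turn a filter setting into a list of names to exclude.

    'false' (or the boolean False) means nothing is excluded.
    """
    if value is False or str(value).strip().lower() == "false":
        return []
    # Split on commas and drop surrounding whitespace and empty entries.
    return [name.strip() for name in str(value).split(",") if name.strip()]
```

With the examples given above, 'PBMC_1K_N, PBMC_5K_V' yields two sample names and 'C1, C2' yields two cluster names, while 'false' yields an empty list.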
custom peaks
Name and path to custom peak file in .bed.gz format, used for motif enrichment and deviation analyses. E.g. 'Encode_K562_GATA1 = "https://www.encodeproject.org/files/ENCFF632NQI/@@download/ENCFF632NQI.bed.gz"'
scRNAseq Seurat object
Full path to a scRNA-seq Seurat object. When supplied, integrated scRNA-seq analysis will be performed.
scRNAseq grouplist
scRNA-seq cluster grouping information for constrained integration. For an example, see conf/test.config.
profile
Profile refers to a set of pre-defined parameters that are bundled together. E.g. "lsf" bundles "executor = 'lsf'" and "queue = 'long'".
lsf
singularity local test