scRNA-seq Assay Metadata Template

A template outlining metadata to be collected for each scRNA-seq library in a dataset.

Attribute Description Required Valid Values
Component A high-level attribute for grouping attributes into templates. TRUE
specimenModality Label assigned to experimental data files indicating whether the data they contain correspond to a single biospecimen or to multiple biospecimens. TRUE single specimen, multispecimen
biospecimenID A unique identifier assigned to specimens collected from study participants. For multi-specimen data files provide all IDs in a comma-separated list. TRUE
assay The technology used to generate the data in this file. For multimodal datasets with concomitant profiling of a biospecimen, select all assays that apply. For example, the GEX files from a CITE-seq experiment should be labeled with both 'scRNASeq' and 'CITESeq'. TRUE scRNASeq, CyTOF, Xenium, Olink Explore HT, CITESeq, snRNASeq, snATACSeq, RNASeq, multiplexed ELISA, SNP array, imaging mass cytometry, H&E, ASAPSeq, CosMX, serial IHC, imaging mass spectrometry, LC-MS/MS, CE-MS, VDJSeq, scVDJSeq, feature barcode sequencing, SomaScan, WES, WGS, flow cytometry, NULISA
libraryID A library label or name, unique within an experiment, used to distinguish sequencing libraries. FALSE
batch A label or identifier, unique within a given experiment or dataset, used to group data by sample processing, sample pooling, or data collection batch. These batches often introduce technical effects that can be important to account for in downstream analyses. FALSE
libraryPrepMethod Sequencing library preparation method or kit used to create the library. If no commercially available kit was used, please select 'in-house library prep'. TRUE NEBNext Ultra II Directional RNA Library, QIAseq miRNA Library, SMART-Seq v4 Ultra Low Input RNA, Nextera XT, SMARTer Stranded Total RNA v2, Nextera XT DNA, TruSeq Stranded mRNA, Chromium Single Cell Human TCR Amplification Kit, Chromium Single Cell Human BCR Amplification Kit, SMART-Seq Human BCR with UMI, SMART-Seq Human TCR with UMI, Takara Human BCR profiling for Illumina, Takara Human TCR profiling for Illumina, Takara Human TCRv2 profiling for Illumina, Takara Human scTCR profiling for Illumina, NEBNext Human Immune Sequencing Kit, 10x Chromium 3 GEX, 10x Chromium 3 GEX v3.1, 10x Chromium 5 GEX, 10x Universal 3 GEX, 10x Universal 5 GEX, 10x Chromium Flex, CEL-Seq2, Chromium scATAC Kit v2, in-house library prep, Chromium Next GEM Chip G, Fluidigm C1 HT
nucleicAcidSource The source of the nucleic acid used as input for sequencing library fragments. Select all that apply, though in most cases only a single label is expected. TRUE poly(A) RNA, rRNA-depleted RNA, gDNA, surface protein feature barcode, intracellular protein feature barcode, antigen capture barcode, multiplexing oligo, BCR mRNA, TCR mRNA, Tn5-accessible gDNA, globin-depleted RNA, CRISPR protospacer feature barcode
totalReads Total number of reads sequenced from the library. TRUE
percentCellViability A measure of the proportion of viable cells within a cell suspension. Scale is 0-100. TRUE
expectedCellCount An estimate of the number of cells expected to be sequenced in a library. Software that processes single-cell sequencing data often includes an option for users to specify this value to improve processing results. TRUE
platform The specific version (manufacturer, model, etc.) of a technology that is used to carry out a laboratory or computational experiment. Specify where applicable for experimental data files; otherwise enter 'none'. In most cases only a single label is expected, but multiple selections can be provided in a comma-delimited list where applicable. For example, for 10x Genomics fastq files, specify both the 10x instrument and the sequencing platform. FALSE CyTOF XT, Helios Mass Cytometer, Hyperion, Illumina NextSeq 500, Illumina HiSeq 2500, Illumina NovaSeq 6000, Illumina NovaSeq X, Chromium X, Chromium iX, Chromium Xo, Chromium Controller, Visium CytAssist, Xenium, none
sequencingSaturation A measure of the fraction of library complexity that was sequenced in a library. This metric quantifies the fraction of reads originating from an already-observed UMI. More specifically, this is the fraction of confidently mapped, valid cell-barcode, valid UMI reads that are non-unique. Scale is 0-1. FALSE
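
The rules in the table above (required attributes, controlled values, numeric scales) could be checked before submission with something like the following. This is a minimal sketch, not an official validator: the function names, the dictionary-per-row representation, and the rule that a 'single specimen' row should carry exactly one biospecimenID are all assumptions made for illustration. Only specimenModality is checked against its valid values here; the longer vocabularies would be handled the same way.

```python
# Hypothetical helper names; the template itself does not define a validator.
REQUIRED = [
    "Component", "specimenModality", "biospecimenID", "assay",
    "libraryPrepMethod", "nucleicAcidSource", "totalReads",
    "percentCellViability", "expectedCellCount",
]
SPECIMEN_MODALITIES = {"single specimen", "multispecimen"}


def split_list(value):
    """Split a comma-separated cell into trimmed, non-empty items."""
    return [item.strip() for item in str(value).split(",") if item.strip()]


def in_range(value, low, high):
    """Return True if value parses as a number within [low, high]."""
    try:
        return low <= float(value) <= high
    except (TypeError, ValueError):
        return False


def validate_row(row):
    """Return a list of human-readable problems found in one metadata row."""
    problems = []
    for name in REQUIRED:
        if not str(row.get(name, "")).strip():
            problems.append(f"missing required attribute: {name}")

    modality = row.get("specimenModality", "")
    if modality and modality not in SPECIMEN_MODALITIES:
        problems.append(f"invalid specimenModality: {modality!r}")

    # Assumption: a single-specimen file should list exactly one ID.
    ids = split_list(row.get("biospecimenID", ""))
    if modality == "single specimen" and len(ids) > 1:
        problems.append("single specimen row lists more than one biospecimenID")

    viability = row.get("percentCellViability", "")
    if viability != "" and not in_range(viability, 0, 100):
        problems.append("percentCellViability must be between 0 and 100")

    saturation = row.get("sequencingSaturation", "")
    if saturation != "" and not in_range(saturation, 0, 1):
        problems.append("sequencingSaturation must be between 0 and 1")
    return problems
```

Calling validate_row on each row of a filled-in template (for example, rows read with csv.DictReader) returns an empty list when no problem is found, so the results can be collected and reported per library.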
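
To make the sequencingSaturation definition concrete, the sketch below computes the fraction of reads that repeat an already-observed molecule. It assumes the reads have already been filtered to confidently mapped reads with a valid cell barcode and a valid UMI, and it assumes that a molecule is identified by its (cell barcode, UMI, gene) combination; the template only speaks of an "already-observed UMI", so that key and the function name are illustrative choices. In practice this value is usually taken from the processing pipeline's summary report rather than recomputed.

```python
def sequencing_saturation(reads):
    """Fraction of reads that are non-unique, on a 0-1 scale.

    reads: iterable of (cell_barcode, umi, gene) tuples, one per read that
    has already passed mapping, barcode, and UMI filters (an assumption).
    """
    total = 0
    unique = set()
    for read in reads:
        total += 1
        unique.add(tuple(read))
    if total == 0:
        raise ValueError("no reads supplied")
    # Non-unique fraction = 1 - (distinct molecules / total reads).
    return 1 - len(unique) / total


# Four reads, three distinct molecules: one read is a duplicate.
example = [
    ("AAAC", "UMI1", "GeneA"),
    ("AAAC", "UMI1", "GeneA"),
    ("AAAC", "UMI2", "GeneA"),
    ("TTTG", "UMI1", "GeneB"),
]
print(sequencing_saturation(example))  # 0.25
```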