Cell Selection

TL;DR

Problem: Identify individual cells from scRNA-Seq data.

The first step in scRNA-Seq analysis is to identify which reads come from individual cells. This requires removing empty GEMs (Figure 1), heterotypic and homotypic multiplets, and cells with low expression diversity. I selected the following as putative cells for clustering and downstream analysis (Table 1). Replicates 3 and 4 are puzzling because they have more captured cells than expected. Either I am over-calling cells or more cells were loaded than thought.

Table 1. Final cell selection counts.

Replicate	Approx. Num. Cells Loaded	Expected Num. Cells Captured	Num. of Cells Selected	Num. of Reads
Testis 1	6,000	3,000	2,710	119,932,060
Testis 2	6,000	3,000	4,226	148,921,414
Testis 3	12,000	8,000	11,946	131,590,010
Testis 4	12,000	8,000	14,697	159,678,110

Overview

10X Genomics is a droplet-based method in which cells are mixed with gel beads (GEMs) that carry indices and all the materials needed for first-strand cDNA synthesis (Figure 1). Cells are distributed across GEMs following a Poisson distribution, which leaves most GEMs empty. The 10X Genomics kit that we are using has 737,280 GEMs per sample. Oligonucleotides attached to each GEM contain a cell index and a molecule index. I only want to use GEMs that captured a single cell, so I need to remove empty GEMs and GEMs that captured more than one cell.
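To make the Poisson loading concrete, here is a minimal sketch. It assumes that cells land in GEMs independently and uniformly, so the number of cells per GEM is Poisson with mean λ = cells / GEMs. The function name `gem_occupancy` and the choice of 3,000 cells (the expected capture for Testis 1 in Table 1) are illustrative assumptions, not part of the 10X workflow.

```python
import math


def gem_occupancy(n_cells, n_gems=737_280):
    """Return Poisson probabilities that a GEM holds 0, 1, or >=2 cells.

    Assumes cells land in GEMs independently and uniformly, so the
    per-GEM cell count is Poisson with mean lam = n_cells / n_gems.
    """
    lam = n_cells / n_gems
    p_empty = math.exp(-lam)
    p_single = lam * math.exp(-lam)
    p_multi = 1.0 - p_empty - p_single
    return {"empty": p_empty, "single": p_single, "multiplet": p_multi}


# Hypothetical input: 3,000 captured cells spread over the GEM count above.
occupancy = gem_occupancy(3_000)
for label, prob in occupancy.items():
    print(f"{label:>9}: {prob:.6f}")
```

Under these assumptions nearly every GEM is empty, which is why removing empty GEMs is the first filtering step.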

We can model total unique molecular identifier (UMI) counts to separate GEMs that contain ≥1 cell from empty GEMs. This accounts for any ambient RNA in the media. 10X Genomics provides its own software for calling cells (cell ranger). Recently, 10X Genomics released version 3 of cell ranger, which boasts improved cell calling for low RNA content cells. Here I evaluate cell ranger v2 and v3 as well as a third-party tool, DropletUtils.

In addition to removing GEMs with 0 cells, we also want to remove GEMs with ≥2 cells (a.k.a. multiplets). Two types of multiplets can occur:

Heterotypic: ≥2 cells from different cell types.
Homotypic: ≥2 cells from the same cell type.

Heterotypic multiplets can be identified in silico. Essentially, I create synthetic doublets by mixing two or more cells from different cell type clusters. I then re-cluster the real cells together with the synthetic cells. Cells that cluster with synthetic doublets are likely to be heterotypic multiplets and are removed. Several tools are available for this analysis; I decided to use scrublet because it works with raw 10X data.
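The synthetic-doublet idea can be sketched in a few lines of Python. This is only an illustration of the description above, not scrublet's actual code: the function name, the toy counts, and the rule of summing exactly two cells from different clusters are all assumptions made for the example.

```python
import random


def make_synthetic_doublets(counts, clusters, n_doublets, seed=0):
    """Sum the UMI vectors of random cell pairs drawn from different clusters.

    counts   : dict mapping barcode -> list of per-gene UMI counts
    clusters : dict mapping barcode -> cluster label
    """
    rng = random.Random(seed)
    barcodes = sorted(counts)
    if len({clusters[b] for b in barcodes}) < 2:
        raise ValueError("need at least two clusters to build heterotypic doublets")
    doublets = []
    while len(doublets) < n_doublets:
        a, b = rng.sample(barcodes, 2)
        if clusters[a] == clusters[b]:
            continue  # same cluster would be a homotypic doublet; skip it
        doublets.append([x + y for x, y in zip(counts[a], counts[b])])
    return doublets


# Toy data: three genes, two clusters (all values are made up).
counts = {"c1": [9, 0, 1], "c2": [8, 1, 0], "c3": [0, 7, 5], "c4": [1, 6, 6]}
clusters = {"c1": "A", "c2": "A", "c3": "B", "c4": "B"}
synthetic = make_synthetic_doublets(counts, clusters, n_doublets=3)
```

The synthetic profiles would then be clustered together with the real cells, and real cells that co-cluster with them flagged as likely heterotypic multiplets.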

We do expect biases in the types of multiplets that we capture. We do not expect to capture doublets of large cells or large aggregates because there is a filtration step (35 μm) during sample prep. In particular, fused late-stage primary spermatocytes are too large (~50 μm) to pass through the filter, while multiplets of smaller somatic cells, spermatogonia, or early-stage primary spermatocytes may easily pass through. Sharvani does see evidence of multiple small cells stuck together after filtration. She is working on an estimate of the multiplet frequency to give better insight into how big a problem this may be.

Results

NOTE: Below I display results for each replicate on different tabs. Click the replicate name to view results. Complete descriptions of the data are included with Testis 1.

Empty GEM Identification

To identify empty GEMs I used several software tools and parameterizations.

cellranger-wf uses cell ranger v2 with default settings. This workflow calls the fewest cells (~500 cells / rep). According to 10X Genomics we expect to capture ~50% of loaded cells, but this method gives <10% (Tables 1 and 2). This method is known to perform poorly when there is a large dynamic range of RNA content. Our samples are from the Drosophila testis, which is known to contain both nearly quiescent cells and extremely high RNA content cells. Therefore, we think this method is under-calling captured cells.

cellranger-force-wf uses cell ranger v2 with the --force-cells option. When samples contain a high dynamic range of RNA content, 10X Genomics suggests using the --force-cells option. Essentially, this method orders cells by total UMI content and then keeps the highest N cells. We specified our forced settings to capture 50% of loaded cells, but this is ad hoc and likely over-calls captured cells.
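The top-N selection described above can be sketched as follows. This is a simplified illustration of the idea, not cell ranger's implementation; the function name and the toy barcodes and counts are made up, and ties are broken by whatever order the sort leaves them in.

```python
def force_cells(umi_totals, n_cells):
    """Keep the n_cells barcodes with the highest total UMI count.

    umi_totals : dict mapping barcode -> total UMI count
    """
    ranked = sorted(umi_totals, key=umi_totals.get, reverse=True)
    return set(ranked[:n_cells])


# Toy example (hypothetical barcodes and counts).
umi_totals = {"AAAC": 5200, "AAAG": 15, "AACA": 880, "AACC": 3, "AACG": 2400}
called = force_cells(umi_totals, n_cells=3)
print(sorted(called))
```

Because N is fixed in advance, the result depends entirely on how well N matches the true number of captured cells, which is why the setting is ad hoc.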

cellranger3-wf uses cell ranger v3 with default settings. Recently, 10X Genomics released version 3 of cell ranger. This version adds a step to the cell calling algorithm that improves calling of low RNA content cells. Interestingly, this method called more cells than cellranger-force-wf in all replicates except Testis 1.

droputils uses DropletUtils v1.2.1 with default settings. DropletUtils is an R package with several algorithms for scRNA-Seq. I used the emptyDrops function, which models UMI counts to classify empty GEMs. This method performs very differently across replicates: it seems to over-call cells in Testis 1, Testis 3, and Testis 4 while performing similarly to cell ranger on Testis 2.

Without a ground truth data set it is impossible to know which method best approximates the real cell calls. Consensus is often used in the absence of truth. However, since all of these methods use total UMI in their models, the 4-way consensus would be the same as the most conservative cell calls (cellranger-wf). Given the known limitation of cellranger-wf and the ad hoc nature of cellranger-force-wf, I decided to use the consensus of cellranger3-wf and droputils.
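A consensus call is simply the intersection of the barcode sets returned by each method. The sketch below uses made-up barcodes and a hypothetical helper name. It also shows why adding a very conservative method collapses the consensus: when the conservative calls are a subset of the others, the intersection is just that subset.

```python
from functools import reduce


def consensus(*call_sets):
    """Return the barcodes called as cells by every method (set intersection)."""
    if not call_sets:
        raise ValueError("need at least one set of cell calls")
    return reduce(set.intersection, (set(s) for s in call_sets))


# Toy call sets (hypothetical barcodes).
cellranger_v2 = {"AAAC"}
cellranger_v3 = {"AAAC", "AACA", "AACG", "AAGT"}
droputils = {"AAAC", "AACG", "AAGT", "ACCA", "ACGT"}

two_way = consensus(cellranger_v3, droputils)
three_way = consensus(cellranger_v2, cellranger_v3, droputils)
print(sorted(two_way), sorted(three_way))
```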

Table 2. Cell count after removing empty GEMs.

Replicate	`cellranger-wf`	`cellranger-force-wf`	`cellranger3-wf`	`droputils`	2-way Consensus^‡
Testis 1	483	3,000	2,826	13,884	2,790
Testis 2	550	3,000	6,385	5,765	4,801
Testis 3	423	8,000	12,485	39,872	12,515
Testis 4	349	8,000	15,033	34,761	15,033

^‡Intersection of cellranger3-wf and droputils.

Testis 1

`cellranger-wf` (v2.1.1; defaults)

Summary statistics and barcode rank plot provided by cell ranger. Things to pay attention to:

The three numbers in green at the top (Estimated Number of Cells, Mean Reads per Cell, and Median Genes per Cell). We are looking for a balance between these numbers. As cells with lower UMI are added, the Mean Reads per Cell and Median Genes per Cell will decrease. We do not want these numbers to get too low as the added cells will not have enough signal.
The Fraction of Reads in Cells (2nd number below plot). We expect that most reads should be in a cell and not in empty GEMs.
The percent of Reads Mapped to Genome. This number will be the same across cell ranger runs, but it indicates how good our samples are. Testis 1 has the lowest mapping rate, 73%. This would be considered moderate to low quality in bulk RNA-Seq studies, where we typically aim for ≥90% mapping.
The Barcode Rank Plot. This plot shows the Log Total UMI Count (Y-axis) vs the Log Rank Ordered Cell Count (X-axis). Light gray points are empty GEMs and colored points are considered non-empty cells. The goal of cell selection is to find the cutoff along this curve.
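The coordinates of a barcode rank plot can be computed as sketched below. This is an assumed minimal reconstruction of the plot described above (log total UMI against log rank), not cell ranger's plotting code. The function name and input values are illustrative, and log10 is used as the log base for the example.

```python
import math


def barcode_rank_curve(umi_totals):
    """Return (log10 rank, log10 total UMI) points, highest UMI first.

    Barcodes with zero UMIs are dropped because log10(0) is undefined.
    """
    ordered = sorted((u for u in umi_totals if u > 0), reverse=True)
    return [(math.log10(rank), math.log10(umi))
            for rank, umi in enumerate(ordered, start=1)]


# Toy total-UMI counts per barcode (made up).
points = barcode_rank_curve([5200, 15, 880, 3, 2400, 0])
for x, y in points:
    print(f"{x:.3f}\t{y:.3f}")
```

Cell selection then amounts to choosing a rank cutoff along this curve: barcodes to the left are kept as cells and barcodes to the right are treated as empty GEMs.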

`cellranger-force-wf` (v2.1.1; force)

Note that the Fraction of Reads in Cells is now closer to >70%.

`cellranger3-wf` (v3.0.1; defaults)

These results are very similar to cellranger-force-wf.

`droputils` (DropletUtils v1.2.1)

Finally, I wanted to use an external tool for cell selection. The most popular tool I found was an R package called DropletUtils.

Lun ATL, Riesenfeld S, Andrews T, Dao T, Gomes T, participants in the 1st Human Cell Atlas Jamboree, Marioni JC (2019). “EmptyDrops: distinguishing cells from empty droplets in droplet-based single-cell RNA sequencing data.” Genome Biol., 20, 63. doi: 10.1186/s13059-019-1662-y.

This tool also builds a classification model based on UMI content. I do not have all of the summary stats for these cell calls, but I provide the barcode rank plot. The tool estimates the knee (blue line) and the inflection point (green line) automatically. Putative empty GEMs are in light gray and non-empty cells are in black. In most replicates this method calls far more cells than the cell ranger methods (Testis 2 is the exception; see Table 2).

Pairwise Similarity (Jaccard)

In this section I look at the similarity of calls between the different methods using the pairwise Jaccard similarity score. cellranger-force-wf and cellranger3-wf always have the highest similarity. droputils and cellranger-wf always have the lowest similarity.

method	cellranger-wf	cellranger-force-wf	cellranger3-wf	droputils
cellranger-wf	1	0.8219	0.8342	0.0537
cellranger-force-wf	0.8219	1	0.9771	0.1919
cellranger3-wf	0.8342	0.9771	1	0.2146
droputils	0.0537	0.1919	0.2146	1
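For reference, the Jaccard similarity of two call sets is the size of their intersection divided by the size of their union. The sketch below computes it for every pair of methods. The barcodes are toy values, the helper names are hypothetical, and returning 1.0 for two empty sets is a convention chosen for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two barcode collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # assumed convention: two empty sets count as identical
    return len(a & b) / len(union)


def pairwise_jaccard(calls):
    """Return {(method_1, method_2): similarity} for every pair of methods."""
    return {(m1, m2): jaccard(calls[m1], calls[m2])
            for m1, m2 in combinations(sorted(calls), 2)}


# Toy call sets (hypothetical barcodes).
calls = {
    "cellranger-wf": {"AAAC"},
    "cellranger3-wf": {"AAAC", "AACA", "AACG"},
    "droputils": {"AAAC", "AACA", "AACG", "AAGT", "ACCA", "ACGT"},
}
similarity = pairwise_jaccard(calls)
for pair, score in similarity.items():
    print(pair, round(score, 4))
```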

Consensus

I provide a summary for the 3-way consensus (cellranger-force-wf, cellranger3-wf, droputils) and the 2-way consensus (cellranger3-wf and droputils).

Number of cells with 3-way consensus: 2,717
Number of cells with 2-way consensus: 2,790
Number of different calls between consensus with 3 vs 2 measures: 73
Jaccard similarity of consensus with 3 vs 2 measures: 0.9948479074034865

This is an UpSet plot, which is like a high-dimensional Venn diagram. Each bar represents the number of cells in a given set. The set is represented below the bar as black dots. For example, the first bar is the 4-way consensus (intersection of all 4 methods).
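The bar heights of such a plot can be tallied as sketched below. The sketch assumes each bar counts the barcodes called by exactly that combination of methods (exclusive intersections), which may differ from how the figure here was drawn. The function name and toy barcodes are hypothetical.

```python
from collections import Counter


def upset_counts(calls):
    """Count barcodes for each exact combination of methods that called them.

    calls : dict mapping method name -> set of barcodes called as cells
    Returns a Counter keyed by a sorted tuple of method names.
    """
    membership = {}
    for method, barcodes in calls.items():
        for bc in barcodes:
            membership.setdefault(bc, []).append(method)
    return Counter(tuple(sorted(methods)) for methods in membership.values())


# Toy call sets (hypothetical barcodes).
toy_calls = {
    "cellranger-wf": {"AAAC"},
    "cellranger3-wf": {"AAAC", "AACA", "AACG"},
    "droputils": {"AAAC", "AACA", "AACG", "AAGT", "ACCA", "ACGT"},
}
bars = upset_counts(toy_calls)
for combo, n in bars.most_common():
    print(n, " & ".join(combo))
```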

Testis 2