# Spatial Decomposition

Calling cell-type compositions for spot-based spatial transcriptomics data

## Description

Spatial decomposition (also often referred to as spatial deconvolution) applies to spatial transcriptomics data in which the transcription profile of each capture location (spot, voxel, bead, etc.) does not share a bijective relationship with the cells in the tissue, i.e., multiple cells may contribute to the same capture location. The task of spatial decomposition is then to estimate the composition of cell types/states present at each capture location. The cell type/state estimates are presented as proportion values, representing the proportion of the cells at each capture location that belong to a given cell type.

We distinguish between *reference-based* decomposition and *de novo* decomposition: the former leverages external data (e.g., scRNA-seq or scNuc-seq) to guide the inference process, while the latter works with the spatial data only. We require that all datasets have an associated single-cell reference dataset, but methods are free to ignore this information.

## Summary

## Metrics

**r2**^{1}: R2, or the “coefficient of determination”, reports the fraction of the true proportion values’ variance that can be explained by the predicted proportion values. The best score, and upper bound, is 1.0. There is no fixed lower bound for the metric. The uniform/non-weighted average across all cell types/states is used to summarise performance.
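As a rough illustration of the metric described above, the following sketch computes R2 separately for each cell type and takes the unweighted mean. The function name `r2_uniform_average` and the toy proportions are assumptions made for this example, not taken from the benchmark code, and the sketch assumes every cell type's true proportion varies across spots (otherwise the denominator is zero).

```python
import numpy as np

def r2_uniform_average(true_props, pred_props):
    """R2 per cell type (column), averaged with equal weights."""
    true_props = np.asarray(true_props, dtype=float)
    pred_props = np.asarray(pred_props, dtype=float)
    # Residual and total sums of squares, computed per cell type across spots.
    ss_res = ((true_props - pred_props) ** 2).sum(axis=0)
    ss_tot = ((true_props - true_props.mean(axis=0)) ** 2).sum(axis=0)
    return float(np.mean(1.0 - ss_res / ss_tot))

# Toy example: 4 spots x 2 cell types (made-up values for illustration).
true_props = np.array([[0.2, 0.8], [0.5, 0.5], [0.7, 0.3], [0.9, 0.1]])
perfect = r2_uniform_average(true_props, true_props)
constant = r2_uniform_average(true_props, np.full_like(true_props, 0.5))
```

A perfect prediction reaches the upper bound of 1.0, while a constant prediction that is worse than predicting the per-type mean yields a negative score, which is why the metric has no fixed lower bound.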

## Results

Results table of the scores per method, dataset and metric (after scaling). Use the filters to make a custom subselection of methods and datasets. The “Overall mean” dataset is the mean value across all datasets.

## Details

## Methods

**Cell2location (alpha=20, amortised, hard-coded)**^{2}: Cell2location is a decomposition method based on Negative Binomial regression that is able to account for batch effects in estimating the single-cell gene expression signature used for the spatial decomposition step. Note that since batch information is unavailable in this task, here we use either a hard-coded reference, or a negative-binomial learned reference without batch labels. The parameter alpha refers to the detection efficiency prior. Links: Docs.

**Cell2location (alpha=1, reference hard-coded)**^{2}: Cell2location is a decomposition method based on Negative Binomial regression that is able to account for batch effects in estimating the single-cell gene expression signature used for the spatial decomposition step. Note that since batch information is unavailable in this task, here we use either a hard-coded reference, or a negative-binomial learned reference without batch labels. The parameter alpha refers to the detection efficiency prior. Links: Docs.

**Cell2location (alpha=20, reference hard-coded)**^{2}: Cell2location is a decomposition method based on Negative Binomial regression that is able to account for batch effects in estimating the single-cell gene expression signature used for the spatial decomposition step. Note that since batch information is unavailable in this task, here we use either a hard-coded reference, or a negative-binomial learned reference without batch labels. The parameter alpha refers to the detection efficiency prior. Links: Docs.

**Cell2location (alpha=200, reference hard-coded)**^{2}: Cell2location is a decomposition method based on Negative Binomial regression that is able to account for batch effects in estimating the single-cell gene expression signature used for the spatial decomposition step. Note that since batch information is unavailable in this task, here we use either a hard-coded reference, or a negative-binomial learned reference without batch labels. The parameter alpha refers to the detection efficiency prior. Links: Docs.

**Cell2location (alpha=20, NB reference)**^{2}: Cell2location is a decomposition method based on Negative Binomial regression that is able to account for batch effects in estimating the single-cell gene expression signature used for the spatial decomposition step. Note that since batch information is unavailable in this task, here we use either a hard-coded reference, or a negative-binomial learned reference without batch labels. The parameter alpha refers to the detection efficiency prior. Links: Docs.

**DestVI**^{3}: DestVI is a decomposition method that leverages a conditional generative model of spatial transcriptomics down to the sub-cell-type variation level, which is then used to estimate the cell-type proportions that determine the spatial organization of a tissue. Links: Docs.

**Non-Negative Matrix Factorization (NMF)**^{7}: NMF is a decomposition method based on Non-negative Matrix Factorization (NMF) that reconstructs the expression of each spatial location as a weighted combination of cell-type signatures defined by scRNA-seq. It is a simpler baseline than NMFreg, as it only performs the NMF step based on the mean expression signatures of cell types and returns the NMF weight loadings as (normalized) cell type proportions, without the regression step. Links: Docs.

**NMF-reg**^{8}: NMFreg is a decomposition method based on Non-negative Matrix Factorization Regression (NMFreg) that reconstructs expression of each spatial location as a weighted combination of cell-type signatures defined by scRNA-seq. It was originally developed for Slide-seq data. Links: Docs.

**Non-Negative Least Squares**^{4}: NNLS is a decomposition method based on Non-Negative Least Squares (NNLS) regression. It was originally introduced by the method AutoGenes. Links: Docs.
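To make the NNLS idea concrete, here is a minimal sketch that regresses each spot's expression onto cell-type signatures under a non-negativity constraint and then normalizes the weights into proportions. It uses `scipy.optimize.nnls`; the helper name `nnls_proportions`, the toy signature matrix, and the normalization step are assumptions for illustration and may differ from how the benchmarked method is implemented.

```python
import numpy as np
from scipy.optimize import nnls

def nnls_proportions(signatures, spots):
    """signatures: genes x cell types; spots: spots x genes."""
    # Solve one non-negative least squares problem per spot.
    weights = np.array([nnls(signatures, spot)[0] for spot in spots])
    totals = weights.sum(axis=1, keepdims=True)
    # Normalize weights to proportions; spots with all-zero weights stay zero.
    return np.divide(weights, totals, out=np.zeros_like(weights), where=totals > 0)

# Toy data: 3 genes, 2 cell types (made-up signatures for illustration).
signatures = np.array([[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]])
true_props = np.array([[0.25, 0.75], [1.0, 0.0]])
# Each spot is a noise-free mixture of the signatures, scaled by 4 "cells".
spots = 4.0 * true_props @ signatures.T
estimated = nnls_proportions(signatures, spots)
```

In this noise-free toy case the estimated proportions match the true ones; with real data the fit is only approximate and depends heavily on the quality of the signatures.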

**Random Proportions**^{13}: Random assignment of predicted cell-type proportions from a Dirichlet distribution. Links: Docs.

**RCTD**^{5}: RCTD (Robust Cell Type Decomposition) is a decomposition method that uses signatures learnt from single-cell data to decompose the spatial expression of tissues. It includes a platform effect normalization step, which normalizes the scRNA-seq cell type profiles to match the platform effects of the spatial transcriptomics dataset. Links: Docs.

**SeuratV3**^{9}: SeuratV3 is a decomposition method that is based on Canonical Correlation Analysis (CCA). Links: Docs.

**Stereoscope**^{6}: Stereoscope is a decomposition method based on Negative Binomial regression. It is similar in scope and implementation to cell2location but less flexible in incorporating additional covariates such as batch effects and other types of experimental design annotations. Links: Docs.

**Tangram**^{10}: Tangram is a method to map gene expression signatures from scRNA-seq data to spatial data. It performs the cell type mapping by learning a similarity matrix between single-cell and spatial locations based on gene expression profiles. Links: Docs.

**True Proportions**^{13}: Perfect assignment of predicted cell-type proportions from the ground truth. Links: Docs.

## Baseline methods

**Random Proportions**: Random assignment of predicted cell-type proportions from a Dirichlet distribution.

**True Proportions**: Perfect assignment of predicted cell-type proportions from the ground truth.
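The Random Proportions baseline can be sketched with NumPy's Dirichlet sampler. The function name `random_proportions`, the symmetric concentration `alpha=1.0`, and the fixed seed are assumptions for this example; the source does not state which Dirichlet parameters the baseline uses.

```python
import numpy as np

def random_proportions(n_spots, n_cell_types, alpha=1.0, seed=0):
    """Draw one proportion vector per spot from a symmetric Dirichlet."""
    rng = np.random.default_rng(seed)
    # Each row is non-negative and sums to 1 by construction.
    return rng.dirichlet(np.full(n_cell_types, alpha), size=n_spots)

baseline = random_proportions(n_spots=5, n_cell_types=3)
```

Because the draws ignore the expression data entirely, this baseline marks the performance expected from an uninformed method, while True Proportions marks the best attainable score.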

## Datasets

**DestVI**^{3}: scRNA-seq data is generated from the NB parameters learned in the DestVI manuscript, leveraging sparsePCA. The number of cells and the cell types present in each spatial spot are computed via a combination of a kernel-based parametrization of a categorical distribution and the NB model.

**Pancreas (alpha=0.5)**^{11}: Human pancreas cells aggregated from single-cell (Dirichlet alpha=0.5).

**Pancreas (alpha=1)**^{11}: Human pancreas cells aggregated from single-cell (Dirichlet alpha=1).

**Pancreas (alpha=5)**^{11}: Human pancreas cells aggregated from single-cell (Dirichlet alpha=5).

**Tabula muris senis (alpha=0.5)**^{12}: Mouse lung cells aggregated from single-cell (Dirichlet alpha=0.5).

**Tabula muris senis (alpha=1)**^{12}: Mouse lung cells aggregated from single-cell (Dirichlet alpha=1).

**Tabula muris senis (alpha=5)**^{12}: Mouse lung cells aggregated from single-cell (Dirichlet alpha=5).
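One way to picture how spots can be "aggregated from single-cell" data with a Dirichlet parameter is the sketch below. It is an assumption-laden illustration, not the benchmark's actual generator: the function name `simulate_spots`, the fixed `cells_per_spot`, the multinomial draw of cells per type, and sampling with replacement are all choices made for this example.

```python
import numpy as np

def simulate_spots(counts, labels, n_spots, cells_per_spot=10, alpha=1.0, seed=0):
    """counts: cells x genes; labels: cell type of each cell."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    cell_types = np.unique(labels)
    spots = np.zeros((n_spots, counts.shape[1]))
    proportions = np.zeros((n_spots, len(cell_types)))
    for i in range(n_spots):
        # Small alpha favours spots dominated by few cell types;
        # large alpha favours more even mixtures.
        p = rng.dirichlet(np.full(len(cell_types), alpha))
        n_per_type = rng.multinomial(cells_per_spot, p)
        for cell_type, n in zip(cell_types, n_per_type):
            candidates = np.flatnonzero(labels == cell_type)
            chosen = rng.choice(candidates, size=n, replace=True)
            # Sum the expression of the sampled cells into the spot.
            spots[i] += counts[chosen].sum(axis=0)
        # Ground truth is the realised fraction of cells per type.
        proportions[i] = n_per_type / cells_per_spot
    return spots, proportions

# Toy single-cell data: 20 cells x 5 genes, two cell types (made-up values).
toy_rng = np.random.default_rng(1)
toy_counts = toy_rng.poisson(3.0, size=(20, 5))
toy_labels = ["a"] * 10 + ["b"] * 10
sim_spots, sim_props = simulate_spots(toy_counts, toy_labels, n_spots=6, alpha=0.5)
```

Under these assumptions, the three alpha values listed for each dataset would correspond to progressively more evenly mixed spots, from alpha=0.5 (few dominant types) to alpha=5 (near-uniform mixtures).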

## Quality control results

Category | Name | Value | Condition | Severity |
---|---|---|---|---|
Scaling | Worst score seuratv3 r2 | -4.847695 | worst_score >= -1 | ✗✗✗ |
Scaling | Worst score tangram r2 | -2.638332 | worst_score >= -1 | ✗✗ |