Human pancreas

Human pancreas cells dataset from the scIB benchmarks



Reference: Luecken et al. (2021)
Size: 1.26 GiB
Dimensions (n_obs × n_vars): 16382 × 18771



Human pancreatic islet scRNA-seq data from 6 datasets across technologies (CEL-seq, CEL-seq2, Smart-seq2, inDrop, Fluidigm C1, and SMARTER-seq).


The dataset is an AnnData object with n_obs × n_vars = 16382 × 18771 and the following slots:


| Name | Description | Type | Data type | Size |
|---|---|---|---|---|
| batch | A batch identifier. This label is very context-dependent and may be a combination of the tissue, assay, donor, etc. | vector | category | 16382 |
| cell_type | Classification of the cell type based on its characteristics and function within the tissue or organism. | vector | category | 16382 |
| size_factors | The size factors created by the normalisation method, if any. | vector | float32 | 16382 |
| feature_name | A human-readable name for the feature, usually a gene symbol. | vector | object | 18771 |
| hvg | Whether or not the feature is considered to be a ‘highly variable gene’ | vector | bool | 18771 |
| hvg_score | A ranking of the features by hvg. | vector | float64 | 18771 |
| knn_connectivities | K nearest neighbors connectivities matrix. | sparsematrix | float32 | 16382 × 16382 |
| knn_distances | K nearest neighbors distance matrix. | sparsematrix | float64 | 16382 × 16382 |
| X_pca | The resulting PCA embedding. | densematrix | float32 | 16382 × 50 |
| pca_loadings | The PCA loadings matrix. | densematrix | float64 | 18771 × 50 |
| counts | Raw counts | sparsematrix | float32 | 16382 × 18771 |
| normalized | Normalised expression values | sparsematrix | float32 | 16382 × 18771 |
| dataset_description | Long description of the dataset. | atomic | str | 1 |
| dataset_id | A unique identifier for the dataset. This is different from the obs.dataset_id field, which is the identifier for the dataset from which the cell data is derived. | atomic | str | 1 |
| dataset_name | A human-readable name for the dataset. | atomic | str | 1 |
| dataset_organism | The organism of the sample in the dataset. | atomic | str | 1 |
| dataset_reference | Bibtex reference of the paper in which the dataset was published. | atomic | str | 1 |
| dataset_summary | Short description of the dataset. | atomic | str | 1 |
| dataset_url | Link to the original source of the dataset. | atomic | str | 1 |
| knn | Supplementary K nearest neighbors data. | dict | | 3 |
| normalization_id | Which normalization was used | atomic | str | 1 |
| pca_variance | The PCA variance objects. | dict | | 2 |
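
The slot listing above can be turned into a quick sanity check on a loaded object. The following is a minimal sketch, not part of the dataset's tooling. The names `EXPECTED_SHAPES`, `check_shapes` and `make_stub_adata` are hypothetical. The assignment of each entry to an AnnData attribute (`obs`, `var`, `obsm`, `obsp`, `varm`, `layers`) is an assumption inferred from the sizes listed, since the listing itself does not name the attribute. The demonstration runs against a stand-in object that carries only shapes, so no data file is needed.

```python
from types import SimpleNamespace

N_OBS, N_VARS, N_PCS = 16382, 18771, 50

# Expected shapes, keyed by (AnnData attribute, entry name).
# ASSUMPTION: the attribute for each entry is inferred from its listed size;
# the slot listing does not state it explicitly.
EXPECTED_SHAPES = {
    ("obs", "batch"): (N_OBS,),
    ("obs", "cell_type"): (N_OBS,),
    ("obs", "size_factors"): (N_OBS,),
    ("var", "feature_name"): (N_VARS,),
    ("var", "hvg"): (N_VARS,),
    ("var", "hvg_score"): (N_VARS,),
    ("obsp", "knn_connectivities"): (N_OBS, N_OBS),
    ("obsp", "knn_distances"): (N_OBS, N_OBS),
    ("obsm", "X_pca"): (N_OBS, N_PCS),
    ("varm", "pca_loadings"): (N_VARS, N_PCS),
    ("layers", "counts"): (N_OBS, N_VARS),
    ("layers", "normalized"): (N_OBS, N_VARS),
}


def check_shapes(adata, expected=EXPECTED_SHAPES):
    """Return a list of problems; an empty list means every shape matches."""
    problems = []
    for (attr, name), shape in expected.items():
        container = getattr(adata, attr, None)
        if container is None or name not in container:
            problems.append(f"{attr}[{name!r}] is missing")
            continue
        actual = tuple(container[name].shape)
        if actual != shape:
            problems.append(
                f"{attr}[{name!r}] has shape {actual}, expected {shape}"
            )
    return problems


def make_stub_adata(expected=EXPECTED_SHAPES):
    """Build a stand-in with the expected layout (shapes only, no real data)."""
    slots = {}
    for (attr, name), shape in expected.items():
        slots.setdefault(attr, {})[name] = SimpleNamespace(shape=shape)
    return SimpleNamespace(**slots)


stub = make_stub_adata()
print(check_shapes(stub))  # prints []

# Break one entry on purpose to show what a report looks like.
stub.obsm["X_pca"] = SimpleNamespace(shape=(N_OBS, 30))
for problem in check_shapes(stub):
    print(problem)
```

Because `check_shapes` relies only on attribute access, membership tests and `.shape`, an object returned by `anndata.read_h5ad` for a local copy of the file should be usable in place of the stub, assuming the entries live under the attributes guessed here.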


Luecken, Malte D., M. Büttner, K. Chaichoompu, A. Danese, M. Interlandi, M. F. Mueller, D. C. Strobl, et al. 2021. “Benchmarking Atlas-Level Data Integration in Single-Cell Genomics.” Nature Methods 19 (1): 41–50.