cellrank.datasets.reprogramming(subset='full', path='datasets/reprogramming.h5ad', **kwargs)

Reprogramming of mouse embryonic fibroblasts to induced endoderm progenitors at 8 time points from [Morris18].

scRNA-seq dataset comprising 104,887 cells recorded using 10X Chromium and Drop-seq [Macosko15] at 8 time points spanning days 0-28 after reprogramming initiation.

Contains raw spliced and unspliced count data, low-dimensional embedding coordinates, and clonal information from CellTagging [Morris18]. It also contains the following anndata.AnnData.obs annotations:

  • ‘reprogramming_day’ - time-point information.

  • ‘reprogramming’ - whether this clone is enriched for cells from successfully reprogrammed populations.

  • ‘CellTagDN_XXk’ - CellTag from day N from the XXk cells subset.

Parameters

  • subset (str) –

    Whether to return the full object or just a subset. Can be one of:

    • ‘full’ - return the complete dataset containing 104,887 cells.

    • ‘85k’ - return the subset as described in [Morris18] Fig. 1, containing 85,010 cells.

    • ‘48k’ - return the subset as described in [Morris18] Fig. 3, containing 48,515 cells.

  • path (Union[str, Path]) – Path where the dataset is saved.

  • **kwargs – Keyword arguments for scanpy.read().


Returns

adata – Annotated data object.

Return type

anndata.AnnData

The dataset is approximately 1.5 GiB in size. Subsetting is performed locally, and only after the full dataset has been downloaded.
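
A minimal usage sketch, assuming CellRank is installed and importable as `cellrank`. The wrapper `load_reprogramming` and the `EXPECTED_CELLS` table are hypothetical helpers introduced here for illustration and are not part of CellRank; the cell counts are the ones listed for the `subset` parameter. The wrapper rejects unknown subset names before any download starts and then checks the size of the returned object.

```python
from pathlib import Path

# Cell counts per subset, taken from the description of the `subset` parameter.
# This table is a hypothetical helper, not part of CellRank.
EXPECTED_CELLS = {"full": 104_887, "85k": 85_010, "48k": 48_515}


def load_reprogramming(subset="48k", path="datasets/reprogramming.h5ad"):
    """Hypothetical wrapper: validate `subset`, load the data, check its size."""
    if subset not in EXPECTED_CELLS:
        raise ValueError(
            f"subset must be one of {sorted(EXPECTED_CELLS)}, got {subset!r}"
        )

    # Imported lazily so the validation above works without CellRank installed.
    import cellrank as cr

    # The full file (approximately 1.5 GiB) is saved at `path`;
    # the requested subset is extracted locally afterwards.
    adata = cr.datasets.reprogramming(subset=subset, path=Path(path))

    if adata.n_obs != EXPECTED_CELLS[subset]:
        raise RuntimeError(
            f"expected {EXPECTED_CELLS[subset]} cells, got {adata.n_obs}"
        )
    return adata
```

Calling `adata = load_reprogramming("48k")` would then return the Fig. 3 subset, and an annotation such as `adata.obs["reprogramming_day"]` can be inspected as an ordinary pandas column, for example with `.value_counts()`.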