cellrank.datasets.reprogramming_morris(subset=ReprogrammingSubset.FULL, path='datasets/reprogramming_morris.h5ad', **kwargs)

Reprogramming of mouse embryonic fibroblasts to induced endoderm progenitors at 8 time points from [Biddy et al., 2018].

scRNA-seq dataset comprising 104 887 cells recorded using 10X Chromium and Drop-seq [Macosko et al., 2015] at 8 time points spanning days 0-28 after reprogramming initiation.

Contains raw spliced and unspliced count data, low-dimensional embedding coordinates, and clonal information from CellTagging [Biddy et al., 2018]. It also contains the following anndata.AnnData.obs annotations:

  • ‘reprogramming_day’ - time-point information.

  • ‘reprogramming’ - whether this clone is enriched for cells from successfully reprogrammed populations.

  • ‘CellTagDN_XXk’ - CellTag from day N from the XXk cells subset.

Parameters

  • subset (Literal[‘full’, ‘48k’, ‘85k’]) –

    Whether to return the full object or just a subset. Can be one of:

    • ‘full’ - return the complete dataset containing 104 887 cells.

    • ‘85k’ - return the subset as described in [Biddy et al., 2018] Fig. 1, containing 85 010 cells.

    • ‘48k’ - return the subset as described in [Biddy et al., 2018] Fig. 3, containing 48 515 cells.

  • path (Union[str, Path]) – Path where the dataset is saved.

  • kwargs (Any) – Keyword arguments for scanpy.read().

Return type

anndata.AnnData

Returns

adata : anndata.AnnData – Annotated data object.


The dataset is approximately 1.5 GiB in size, and subsetting is performed locally after the full download.
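As a minimal sketch of how the parameters above fit together, the helper below loads one subset and checks its size against the cell counts listed for `subset`. The names `load_reprogramming` and `EXPECTED_CELLS` are hypothetical and not part of cellrank. The size check assumes that `adata.n_obs` equals the documented cell count for each subset.

```python
from pathlib import Path

# Cell counts per subset, taken from the parameter description above.
EXPECTED_CELLS = {"full": 104_887, "85k": 85_010, "48k": 48_515}


def load_reprogramming(subset="48k", path="datasets/reprogramming_morris.h5ad"):
    """Hypothetical helper: load one subset and sanity-check its size."""
    if subset not in EXPECTED_CELLS:
        raise ValueError(
            f"Unknown subset {subset!r}; expected one of {sorted(EXPECTED_CELLS)}."
        )

    # Imported lazily so the validation above works without cellrank installed.
    import cellrank as cr

    # The full file (about 1.5 GiB) is downloaded to `path`; the subset is
    # selected locally afterwards, so a smaller subset does not shrink the download.
    adata = cr.datasets.reprogramming_morris(subset=subset, path=Path(path))

    # Assumption: `n_obs` matches the cell count documented for the subset.
    if adata.n_obs != EXPECTED_CELLS[subset]:
        raise RuntimeError(
            f"Expected {EXPECTED_CELLS[subset]} cells, found {adata.n_obs}."
        )
    return adata
```

For example, `adata = load_reprogramming("48k")` followed by `adata.obs["reprogramming_day"].value_counts()` would tabulate the number of cells per time point.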