cellrank.pl.cluster_lineage(adata, model, genes, lineage, backward=False, time_range=None, clusters=None, n_points=200, time_key='latent_time', covariate_key=None, ratio=0.05, cmap='viridis', norm=True, recompute=False, callback=None, ncols=3, sharey=False, key=None, random_state=None, use_leiden=False, show_progress_bar=True, n_jobs=1, backend='loky', figsize=None, dpi=None, save=None, pca_kwargs=mappingproxy({'svd_solver': 'arpack'}), neighbors_kwargs=mappingproxy({'use_rep': 'X'}), clustering_kwargs=mappingproxy({}), return_models=False, **kwargs)

Cluster gene expression trends within a lineage and plot the clusters.

This function is based on Palantir, see [Setty et al., 2019]. It can be used to discover modules of genes that drive development along a given lineage. Consider running this function on a subset of genes which are potential lineage drivers, identified e.g. by running cellrank.tl.lineage_drivers().
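One way to narrow the gene list before clustering is to keep only the strongest candidate drivers. The following is a minimal sketch, assuming the driver scores have already been extracted into a plain mapping from gene name to a score for the lineage. The helper top_driver_genes and the example scores are hypothetical and not part of CellRank.

```python
from typing import List, Mapping


def top_driver_genes(scores: Mapping[str, float], n: int = 50) -> List[str]:
    """Return the `n` genes with the highest score, best first (hypothetical helper)."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [gene for gene, _ in ranked[:n]]


# Made-up scores, for illustration only.
example_scores = {"gene_a": 0.82, "gene_b": -0.10, "gene_c": 0.47, "gene_d": 0.65}
selected = top_driver_genes(example_scores, n=2)
```

A list obtained this way could then be passed as genes.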

  • adata (anndata.AnnData) – Annotated data object.

  • model (Union[BaseModel, Mapping[str, Mapping[str, BaseModel]]]) –

    Model based on cellrank.ul.models.BaseModel to fit.

    If a dict, gene and lineage specific models can be specified. Use '*' to indicate all genes or lineages, for example {'gene_1': {'*': ...}, 'gene_2': {'lineage_1': ..., '*': ...}}.

  • genes (Sequence[str]) – Genes in adata.var_names or in adata.raw.var_names, if use_raw=True.

  • lineage (str) – Name of the lineage for which to cluster the genes.

  • backward (bool) – Direction of the process.

  • time_range (Union[float, Tuple[Optional[float], Optional[float]], None]) –

    Specify start and end times:

    • If a tuple, it specifies the minimum and maximum pseudotime. Both values can be None, in which case the minimum is the earliest pseudotime and the maximum is automatically determined.

    • If a float, it specifies the maximum pseudotime.

  • clusters (Optional[Sequence[str]]) – Cluster identifiers to plot. If None, all clusters will be considered. Useful when plotting previously computed clusters.

  • n_points (int) – Number of points used for prediction.

  • time_key (str) – Key in adata.obs where the pseudotime is stored.

  • covariate_key (Union[str, Sequence[str], None]) – Key(s) in adata.obs containing observations to be plotted at the bottom of each plot.

  • ratio (float) – Height ratio of each covariate in covariate_key.

  • cmap (Optional[str]) – Colormap to use for continuous covariates in covariate_key.

  • norm (bool) – Whether to z-normalize each trend to have zero mean and unit variance.

  • recompute (bool) – If True, recompute the clustering; otherwise, try to reuse an existing one.

  • callback (Union[Callable, Mapping[str, Mapping[str, Callable]], None]) – Function which takes a cellrank.ul.models.BaseModel and some keyword arguments for cellrank.ul.models.BaseModel.prepare() and returns the prepared model. Can be specified in gene- and lineage-specific manner, similarly to model.

  • ncols (int) – Number of columns for the plot.

  • sharey (Union[str, bool]) – Whether to share y-axis across multiple plots.

  • key (Optional[str]) – Key in adata.uns where the results are saved. If None, they are saved under lineage_{lineage}_trend.

  • random_state (Optional[int]) – Random seed for reproducibility.

  • use_leiden (bool) – If True, cluster with scanpy.tl.leiden(); otherwise, use scanpy.tl.louvain().

  • show_progress_bar (bool) – Whether to show a progress bar. Disabling it may slightly improve performance.

  • n_jobs (Optional[int]) – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.

  • backend (str) – Which backend to use for parallelization. See joblib.Parallel for valid options.

  • figsize (Optional[Tuple[float, float]]) – Size of the figure.

  • dpi (Optional[int]) – Dots per inch.

  • save (Union[str, Path, None]) – Filename where to save the plot.

  • pca_kwargs (Dict) – Keyword arguments for scanpy.pp.pca().

  • neighbors_kwargs (Dict) – Keyword arguments for scanpy.pp.neighbors().

  • clustering_kwargs (Dict) – Keyword arguments for scanpy.tl.louvain() or scanpy.tl.leiden().

  • return_models (bool) – If True, return the fitted models for each gene in genes and the given lineage.

  • kwargs – Keyword arguments for cellrank.ul.models.BaseModel.prepare().
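The '*' wildcard accepted by model and callback can be pictured as a two-level lookup with a fallback at each level. The sketch below is not CellRank's implementation: the function resolve is hypothetical, strings stand in for model objects, and it assumes that an exact gene or lineage name takes precedence over '*'.

```python
from collections.abc import Mapping
from typing import Any


def resolve(spec: Any, gene: str, lineage: str) -> Any:
    """Pick the entry for (gene, lineage), falling back to '*' at each level.

    Hypothetical helper; assumes exact names win over the '*' wildcard.
    """
    if not isinstance(spec, Mapping):
        return spec  # a single object applies to every gene and lineage
    per_gene = spec[gene] if gene in spec else spec["*"]
    if not isinstance(per_gene, Mapping):
        return per_gene
    return per_gene[lineage] if lineage in per_gene else per_gene["*"]


# Strings stand in for model objects here.
spec = {
    "gene_1": {"*": "model_a"},
    "gene_2": {"lineage_1": "model_b", "*": "model_c"},
}
chosen_1 = resolve(spec, "gene_1", "lineage_2")  # falls back to '*' for the lineage
chosen_2 = resolve(spec, "gene_2", "lineage_1")  # exact lineage match
chosen_3 = resolve(spec, "gene_2", "lineage_3")  # falls back to '*' for the lineage
```

In this sketch, a gene that is neither listed nor covered by a top-level '*' raises a KeyError, which makes an incomplete specification visible early.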

Return type

Optional[Mapping[str, Mapping[str, BaseModel]]]


Returns

  • None – If return_models=False, only plots the figure and, if save is given, saves it.

  • Dict[str, Dict[str, cellrank.ul.models.BaseModel]] – Otherwise returns the fitted models as {'gene_1': {'lineage_1': <model_11>, ...}, ...}. Models which have failed will be instances of cellrank.ul.models.FailedModel.

    Also updates adata.uns with the following:

    • key or lineage_{lineage}_trend - an anndata.AnnData object of shape (n_genes, n_points) containing the clustered genes.
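When return_models=True, it can be useful to check which fits failed and to know under which key the clustering was stored. The sketch below illustrates both under stated assumptions: FittedStandIn and FailedStandIn are hypothetical placeholders (in practice the check would be against cellrank.ul.models.FailedModel), the lineage name 'Alpha' is made up, and failed_fits and default_key are helper names introduced here, not CellRank functions.

```python
from typing import Dict, List, Optional, Tuple


class FittedStandIn:
    """Hypothetical placeholder for a successfully fitted model."""


class FailedStandIn:
    """Hypothetical placeholder for cellrank.ul.models.FailedModel."""


def failed_fits(
    models: Dict[str, Dict[str, object]], failed_type: type
) -> List[Tuple[str, str]]:
    """List the (gene, lineage) pairs whose model is an instance of `failed_type`."""
    return [
        (gene, lineage)
        for gene, per_lineage in models.items()
        for lineage, model in per_lineage.items()
        if isinstance(model, failed_type)
    ]


def default_key(lineage: str, key: Optional[str] = None) -> str:
    """Return `key` if given, else the documented default lineage_{lineage}_trend."""
    return key if key is not None else f"lineage_{lineage}_trend"


# Nested dict in the documented {'gene': {'lineage': model}} layout.
models = {
    "gene_1": {"Alpha": FittedStandIn()},
    "gene_2": {"Alpha": FailedStandIn()},
}
failed = failed_fits(models, FailedStandIn)
uns_key = default_key("Alpha")
```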