Fit estimator

This example shows how to fit an estimator in order to compute lineage probabilities.

CellRank is composed of cellrank.tl.kernels and cellrank.tl.estimators. Kernels compute transition matrices from directed data, e.g. RNA velocity [Manno18] [Bergen20], while estimators perform inference using these kernels.

Here, we show how to fit an estimator, given a kernel. An estimator may be used, e.g., to identify initial and terminal states or to compute, for each individual cell, lineage probabilities towards the terminal states. The cellrank.tl.estimators.BaseEstimator.fit() method combines the individual methods that compute terminal states and absorption probabilities.

To have maximum control over the inference performed, use these individual methods directly.
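As a rough intuition for what "absorption probabilities" means, the following sketch computes them for a tiny Markov chain in plain Python. It is an illustration under assumptions, not CellRank's implementation: the 4-state transition matrix is made up, the function name absorption_probabilities is hypothetical, and a simple fixed-point iteration stands in for whatever solver the library uses. States 0 and 1 play the role of cells, states 2 and 3 the role of terminal states.

```python
def absorption_probabilities(T, absorbing, n_iter=1000):
    """Probability that a walk started in each transient state ends in each absorbing state.

    T is a row-stochastic matrix (list of lists); absorbing lists the indices of
    states that are never left. Iterates B <- R + Q B, which converges because
    every transient state eventually leaks into an absorbing one.
    """
    transient = [i for i in range(len(T)) if i not in absorbing]
    B = {i: [0.0] * len(absorbing) for i in transient}
    for _ in range(n_iter):
        B = {
            i: [
                # one-step jump into absorbing state a, or move to another
                # transient state j and get absorbed from there
                T[i][a] + sum(T[i][j] * B[j][k] for j in transient)
                for k, a in enumerate(absorbing)
            ]
            for i in transient
        }
    return B


# Made-up toy chain: states 0 and 1 are transient, states 2 and 3 are absorbing.
T = [
    [0.5, 0.3, 0.2, 0.0],
    [0.2, 0.4, 0.0, 0.4],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

probs = absorption_probabilities(T, absorbing=[2, 3])
for state, row in probs.items():
    print(state, [round(p, 3) for p in row])
```

In this toy chain, state 0 is split evenly between the two absorbing states, while state 1 ends in state 3 with probability 5/6. Each row sums to one, which is the property that lets such values be read as lineage probabilities.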

import cellrank as cr

adata = cr.datasets.pancreas_preprocessed("../example.h5ad")
adata

Out:

AnnData object with n_obs × n_vars = 2531 × 2000
    obs: 'day', 'proliferation', 'G2M_score', 'S_score', 'phase', 'clusters_coarse', 'clusters', 'clusters_fine', 'louvain_Alpha', 'louvain_Beta', 'initial_size_unspliced', 'initial_size_spliced', 'initial_size', 'n_counts', 'velocity_self_transition', 'dpt_pseudotime'
    var: 'highly_variable_genes', 'gene_count_corr', 'means', 'dispersions', 'dispersions_norm', 'fit_r2', 'fit_alpha', 'fit_beta', 'fit_gamma', 'fit_t_', 'fit_scaling', 'fit_std_u', 'fit_std_s', 'fit_likelihood', 'fit_u0', 'fit_s0', 'fit_pval_steady', 'fit_steady_u', 'fit_steady_s', 'fit_variance', 'fit_alignment_scaling', 'velocity_genes'
    uns: 'clusters_colors', 'clusters_fine_colors', 'diffmap_evals', 'iroot', 'louvain_Alpha_colors', 'louvain_Beta_colors', 'neighbors', 'pca', 'recover_dynamics', 'velocity_graph', 'velocity_graph_neg', 'velocity_params'
    obsm: 'X_diffmap', 'X_pca', 'X_umap', 'velocity_umap'
    varm: 'PCs', 'loss'
    layers: 'Ms', 'Mu', 'fit_t', 'fit_tau', 'fit_tau_', 'spliced', 'unspliced', 'velocity', 'velocity_u'
    obsp: 'connectivities', 'distances'

First, we prepare the kernel using the high-level pipeline and create the cellrank.tl.estimators.GPCCA estimator.

k = cr.tl.transition_matrix(
    adata, weight_connectivities=0.2, softmax_scale=4, show_progress_bar=False
)
g = cr.tl.estimators.GPCCA(k)
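To give a rough intuition for the two parameters passed above, the following sketch builds a tiny transition matrix in plain Python. It is an illustration under assumptions, not CellRank's implementation: the similarity scores and connectivities are made-up numbers, and the helpers softmax_row and combine are hypothetical names. It assumes that softmax_scale sharpens a softmax over directed similarity scores and that weight_connectivities is the weight given to the undirected part in a convex combination of the two matrices.

```python
import math


def softmax_row(scores, scale):
    """Turn raw similarity scores into probabilities; a larger scale sharpens them."""
    weights = [math.exp(scale * s) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def combine(directed, undirected, weight_connectivities):
    """Convex combination of two row-stochastic matrices of the same shape."""
    w = weight_connectivities
    return [
        [(1 - w) * d + w * u for d, u in zip(d_row, u_row)]
        for d_row, u_row in zip(directed, undirected)
    ]


# Made-up directed similarity scores between 3 cells (assumption, not real data).
similarity = [
    [0.0, 0.9, 0.1],
    [-0.5, 0.0, 0.8],
    [0.2, -0.3, 0.0],
]
# Made-up undirected connectivities, already normalized per row.
connectivity = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]

velocity_T = [softmax_row(row, scale=4) for row in similarity]
combined_T = combine(velocity_T, connectivity, weight_connectivities=0.2)

for row in combined_T:
    print([round(p, 3) for p in row])
```

Because both inputs are row-stochastic and the weights sum to one, every row of the combined matrix still sums to one, so it remains a valid transition matrix.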

Afterwards, we simply call cellrank.tl.estimators.BaseEstimator.fit(). It offers a quick and easy way to compute the terminal states and, optionally, the absorption probabilities, following steps similar to those described in Compute terminal states using GPCCA.

g.fit(n_lineages=3, cluster_key="clusters", compute_absorption_probabilities=True)

Out:

INFO: Using pre-computed schur decomposition

To verify that the absorption probabilities have been computed, we plot them below.

g.plot_absorption_probabilities()

[Figure: absorption probabilities (fwd)]

