Note
Click here to download the full example code
Fit estimator¶
This example shows how to fit an estimator in order to compute the lineages.
CellRank is composed of cellrank.tl.kernels
and cellrank.tl.estimators
. Kernels compute transition
matrices on the basis of directed data, given by i.e. RNA velocity [Manno18] [Bergen20] while estimators perform
inference making use of kernels.
Here, we show how to fit an estimator, given a kernel. An estimator may be used e.g. to identify initial and terminal
states or to compute lineage probabilities towards the terminal states, for each individual cell.
The cellrank.tl.estimators.BaseEstimator.fit()
method is a combination of individual methods to compute terminal
states and absorption probabilities.
To have maximum control over the inference performed, use these individual methods directly.
import cellrank as cr
adata = cr.datasets.pancreas_preprocessed("../example.h5ad")
adata
Out:
AnnData object with n_obs × n_vars = 2531 × 2000
obs: 'day', 'proliferation', 'G2M_score', 'S_score', 'phase', 'clusters_coarse', 'clusters', 'clusters_fine', 'louvain_Alpha', 'louvain_Beta', 'initial_size_unspliced', 'initial_size_spliced', 'initial_size', 'n_counts', 'velocity_self_transition', 'dpt_pseudotime'
var: 'highly_variable_genes', 'gene_count_corr', 'means', 'dispersions', 'dispersions_norm', 'fit_r2', 'fit_alpha', 'fit_beta', 'fit_gamma', 'fit_t_', 'fit_scaling', 'fit_std_u', 'fit_std_s', 'fit_likelihood', 'fit_u0', 'fit_s0', 'fit_pval_steady', 'fit_steady_u', 'fit_steady_s', 'fit_variance', 'fit_alignment_scaling', 'velocity_genes'
uns: 'clusters_colors', 'clusters_fine_colors', 'diffmap_evals', 'iroot', 'louvain_Alpha_colors', 'louvain_Beta_colors', 'neighbors', 'pca', 'recover_dynamics', 'velocity_graph', 'velocity_graph_neg', 'velocity_params'
obsm: 'X_diffmap', 'X_pca', 'X_umap', 'velocity_umap'
varm: 'PCs', 'loss'
layers: 'Ms', 'Mu', 'fit_t', 'fit_tau', 'fit_tau_', 'spliced', 'unspliced', 'velocity', 'velocity_u'
obsp: 'connectivities', 'distances'
First, we prepare the kernel using high-level pipeline and the cellrank.tl.estimators.GPCCA
estimator.
k = cr.tl.transition_matrix(
adata, weight_connectivities=0.2, softmax_scale=4, show_progress_bar=False
)
g = cr.tl.estimators.GPCCA(k)
Afterwards, we simply call cellrank.tl.estimators.BaseEstimator.fit()
. It offers a quick and easy way
to compute the terminal states and optionally the absorption probabilities by following similar steps as defined in
Compute terminal states using GPCCA.
g.fit(n_lineages=3, cluster_key="clusters", compute_absorption_probabilities=True)
Out:
0%| | 0/90 [00:00<?, ?/s]
8%|7 | 7/90 [00:00<00:01, 49.81/s]
17%|#6 | 15/90 [00:00<00:01, 54.95/s]
26%|##5 | 23/90 [00:00<00:01, 57.02/s]
34%|###4 | 31/90 [00:00<00:01, 58.21/s]
43%|####3 | 39/90 [00:00<00:00, 58.82/s]
52%|#####2 | 47/90 [00:00<00:00, 59.02/s]
61%|######1 | 55/90 [00:00<00:00, 59.24/s]
70%|####### | 63/90 [00:01<00:00, 57.19/s]
79%|#######8 | 71/90 [00:01<00:00, 52.36/s]
87%|########6 | 78/90 [00:01<00:00, 50.02/s]
94%|#########4| 85/90 [00:01<00:00, 48.43/s]
100%|##########| 90/90 [00:01<00:00, 52.90/s]
In order to verify that the absorption probabilities have been computed, we plot them below.
g.plot_absorption_probabilities()

Out:
/home/docs/checkouts/readthedocs.org/user_builds/cellrank/envs/latest/lib/python3.8/site-packages/scvelo/plotting/utils.py:115: MatplotlibDeprecationWarning: Calling gca() with keyword arguments was deprecated in Matplotlib 3.4. Starting two minor releases later, gca() will take no keyword arguments. The gca() function should only be used to get the current axes, or if no axes exist, create new axes with default keyword arguments. To create a new axes with non-default arguments, use plt.axes() or plt.subplot().
ax = pl.figure(None, figsize, dpi=dpi).gca(projection=projection)
Total running time of the script: ( 0 minutes 23.258 seconds)
Estimated memory usage: 711 MB