Plot cluster lineage

This example shows how to cluster and plot genes in a specific lineage.

import cellrank as cr

adata = cr.datasets.pancreas_preprocessed("../example.h5ad")
adata

Out:

AnnData object with n_obs × n_vars = 2531 × 2000
    obs: 'day', 'proliferation', 'G2M_score', 'S_score', 'phase', 'clusters_coarse', 'clusters', 'clusters_fine', 'louvain_Alpha', 'louvain_Beta', 'initial_size_unspliced', 'initial_size_spliced', 'initial_size', 'n_counts', 'velocity_self_transition', 'dpt_pseudotime'
    var: 'highly_variable_genes', 'gene_count_corr', 'means', 'dispersions', 'dispersions_norm', 'fit_r2', 'fit_alpha', 'fit_beta', 'fit_gamma', 'fit_t_', 'fit_scaling', 'fit_std_u', 'fit_std_s', 'fit_likelihood', 'fit_u0', 'fit_s0', 'fit_pval_steady', 'fit_steady_u', 'fit_steady_s', 'fit_variance', 'fit_alignment_scaling', 'velocity_genes'
    uns: 'clusters_colors', 'clusters_fine_colors', 'diffmap_evals', 'iroot', 'louvain_Alpha_colors', 'louvain_Beta_colors', 'neighbors', 'pca', 'recover_dynamics', 'velocity_graph', 'velocity_graph_neg', 'velocity_params'
    obsm: 'X_diffmap', 'X_pca', 'X_umap', 'velocity_umap'
    varm: 'PCs', 'loss'
    layers: 'Ms', 'Mu', 'fit_t', 'fit_tau', 'fit_tau_', 'spliced', 'unspliced', 'velocity', 'velocity_u'
    obsp: 'connectivities', 'distances'

First, we compute the absorption probabilities and select a model that will be used for gene trend smoothing.

cr.tl.terminal_states(
    adata,
    cluster_key="clusters",
    weight_connectivities=0.2,
    n_states=3,
    softmax_scale=4,
    show_progress_bar=False,
)
cr.tl.lineages(adata)

model = cr.ul.models.GAM(adata)

Out:

INFO: Using pre-computed schur decomposition
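
For intuition about what an absorption probability is, the sketch below computes one on a toy Markov chain in pure Python. It is an illustration only, not CellRank's solver: the helper name absorption_probabilities, the fixed-point iteration, and the 5-state random walk are all assumptions made for this example.

```python
def absorption_probabilities(Q, r, n_iter=10_000, tol=1e-12):
    """Probability of ending in one target absorbing state, per transient state.

    Q[i][j]: transition probability between transient states i and j.
    r[i]:    one-step probability of jumping from transient state i
             directly into the target absorbing state.
    Solves a = Q a + r by fixed-point iteration (toy approach).
    """
    n = len(Q)
    a = [0.0] * n
    for _ in range(n_iter):
        new = [r[i] + sum(Q[i][j] * a[j] for j in range(n)) for i in range(n)]
        if max(abs(x - y) for x, y in zip(new, a)) < tol:
            return new
        a = new
    return a


# Symmetric random walk on states 0..4: states 0 and 4 are absorbing,
# states 1, 2, 3 are transient and step left or right with probability 0.5.
Q = [
    [0.0, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.5, 0.0],
]
r_to_state_4 = [0.0, 0.0, 0.5]

probs = absorption_probabilities(Q, r_to_state_4)
print([round(p, 3) for p in probs])  # [0.25, 0.5, 0.75]
```

In the toy chain, a walker starting closer to state 4 is more likely to be absorbed there. By analogy, each cell receives one such probability per terminal state.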

Next, we fit the model for a subset of genes in a specific lineage, as shown below. Once the model has been fitted, we use it to predict the smoothed gene expression at the test points (by default, 200 points uniformly spaced along the pseudotime). We then reduce the dimensionality using PCA and cluster the genes using the Louvain algorithm.
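
As a rough picture of the test points mentioned above, the following sketch builds a uniform grid over a pseudotime range. The helper name uniform_test_points, the made-up pseudotime values, and the choice to span exactly the minimum to the maximum are assumptions for illustration; the library may pick the range differently.

```python
def uniform_test_points(pseudotime, n_points=200):
    """Return n_points evenly spaced values from min to max of pseudotime."""
    lo, hi = min(pseudotime), max(pseudotime)
    step = (hi - lo) / (n_points - 1)
    return [lo + i * step for i in range(n_points)]


# Made-up pseudotime values standing in for adata.obs["dpt_pseudotime"].
pseudotime = [0.0, 0.12, 0.4, 0.77, 1.0]
grid = uniform_test_points(pseudotime)
print(len(grid))  # 200
```

Evaluating every fitted gene model on the same grid puts all smoothed trends on a common axis, which is what allows them to be compared and clustered.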

Note that calling this function twice will use the already computed values, unless recompute=True is specified.
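
The reuse behaviour described above can be pictured with a small caching sketch. This is not CellRank's code: cached_trend and expensive_fit are hypothetical names, and a plain dict stands in for adata.uns. Only the key pattern lineage_<name>_trend and the recompute flag come from this example.

```python
def cached_trend(uns, lineage, compute, recompute=False):
    """Store the result of compute() under a lineage-specific key and reuse it."""
    key = f"lineage_{lineage}_trend"
    if recompute or key not in uns:
        uns[key] = compute()
    return uns[key]


calls = []


def expensive_fit():
    # Stand-in for model fitting, PCA and clustering.
    calls.append(1)
    return {"n_fits": len(calls)}


uns = {}  # plain dict standing in for adata.uns
cached_trend(uns, "Alpha", expensive_fit)                  # computes
cached_trend(uns, "Alpha", expensive_fit)                  # reuses stored result
cached_trend(uns, "Alpha", expensive_fit, recompute=True)  # computes again
print(len(calls))  # 2
```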

cr.pl.cluster_lineage(
    adata,
    model,
    adata.var_names[:200],
    lineage="Alpha",
    time_key="dpt_pseudotime",
    show_progress_bar=False,
)
Figure: one panel per gene cluster, titled Cluster 0 through Cluster 6.

The clustered genes can be accessed as shown below. In general, adata.uns['lineage_..._trend'] contains an AnnData object of shape (n_genes, n_test_points), so the cluster label of each gene is stored in its .obs.

adata.uns["lineage_Alpha_trend"].obs["clusters"]

Out:

Snhg6      4
Ncoa2      0
Stau2      5
Uggt1      0
Tmem131    5
          ..
Ank3       0
Zwint      1
Gnaz       0
Rab36      0
Bcr        5
Name: clusters, Length: 200, dtype: category
Categories (7, object): ['0', '1', '2', '3', '4', '5', '6']
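
To list the genes that belong to each cluster, the labels can be grouped by value. The sketch below is a minimal pure-Python version: genes_by_cluster is a hypothetical helper, and the dict is a stand-in for the Series above converted with .to_dict(), filled with only the ten genes visible in the printed output.

```python
from collections import defaultdict

# Stand-in for adata.uns["lineage_Alpha_trend"].obs["clusters"].to_dict(),
# restricted to the genes shown in the output above.
gene_to_cluster = {
    "Snhg6": "4", "Ncoa2": "0", "Stau2": "5", "Uggt1": "0", "Tmem131": "5",
    "Ank3": "0", "Zwint": "1", "Gnaz": "0", "Rab36": "0", "Bcr": "5",
}


def genes_by_cluster(labels):
    """Invert a gene -> cluster mapping into cluster -> list of genes."""
    groups = defaultdict(list)
    for gene, cluster in labels.items():
        groups[cluster].append(gene)
    return dict(groups)


groups = genes_by_cluster(gene_to_cluster)
print(groups["5"])  # ['Stau2', 'Tmem131', 'Bcr']
```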

Total running time of the script: 1 minute 20.733 seconds

Estimated memory usage: 599 MB
