Plot potential lineage drivers

This example shows how to compute and plot expression trends for genes which may be involved in lineage decisions.

We identify these by correlating gene expression with absorption probabilities towards a specific terminal state.

import cellrank as cr

adata = cr.datasets.pancreas_preprocessed("../example.h5ad")
adata

Out:

AnnData object with n_obs × n_vars = 2531 × 2000
    obs: 'day', 'proliferation', 'G2M_score', 'S_score', 'phase', 'clusters_coarse', 'clusters', 'clusters_fine', 'louvain_Alpha', 'louvain_Beta', 'initial_size_unspliced', 'initial_size_spliced', 'initial_size', 'n_counts', 'velocity_self_transition', 'dpt_pseudotime'
    var: 'highly_variable_genes', 'gene_count_corr', 'means', 'dispersions', 'dispersions_norm', 'fit_r2', 'fit_alpha', 'fit_beta', 'fit_gamma', 'fit_t_', 'fit_scaling', 'fit_std_u', 'fit_std_s', 'fit_likelihood', 'fit_u0', 'fit_s0', 'fit_pval_steady', 'fit_steady_u', 'fit_steady_s', 'fit_variance', 'fit_alignment_scaling', 'velocity_genes'
    uns: 'clusters_colors', 'clusters_fine_colors', 'diffmap_evals', 'iroot', 'louvain_Alpha_colors', 'louvain_Beta_colors', 'neighbors', 'pca', 'recover_dynamics', 'velocity_graph', 'velocity_graph_neg', 'velocity_params'
    obsm: 'X_diffmap', 'X_pca', 'X_umap', 'velocity_umap'
    varm: 'PCs', 'loss'
    layers: 'Ms', 'Mu', 'fit_t', 'fit_tau', 'fit_tau_', 'spliced', 'unspliced', 'velocity', 'velocity_u'
    obsp: 'connectivities', 'distances'

First, we need to compute the terminal states and the absorption probabilities towards them.

cr.tl.terminal_states(
    adata,
    cluster_key="clusters",
    n_cells=30,
    n_states=3,
    softmax_scale=4,
    show_progress_bar=False,
)
cr.tl.lineages(adata)

Out:

/home/docs/checkouts/readthedocs.org/user_builds/cellrank/checkouts/latest/examples/plotting/plot_lineage_drivers.py:17: DeprecationWarning: `cellrank.tl.terminal_states` will be removed in version `2.0`. Please use the `cellrank.kernels` or `cellrank.estimators` interface instead.
  cr.tl.terminal_states(
/home/docs/checkouts/readthedocs.org/user_builds/cellrank/checkouts/latest/cellrank/tl/_init_term_states.py:156: DeprecationWarning: `cellrank.tl.transition_matrix` will be removed in version `2.0`. Please use the `cellrank.kernels` or `cellrank.estimators` interface instead.
  kernel = transition_matrix(
/home/docs/checkouts/readthedocs.org/user_builds/cellrank/checkouts/latest/examples/plotting/plot_lineage_drivers.py:25: DeprecationWarning: `cellrank.tl.lineages` will be removed in version `2.0`. Please use the `cellrank.kernels` or `cellrank.estimators` interface instead.
  cr.tl.lineages(adata)

  0%|          | 0/3 [00:00<?, ?/s]
100%|##########| 3/3 [00:00<00:00, 36.78/s]

Once the lineages have been computed, we can compute the potential driver genes for each of them. It is also possible to restrict this computation to just a few clusters, defined by cluster_key and clusters.

By default we are computing the driver genes for all lineages.

drivers = cr.tl.lineage_drivers(adata, lineages="Alpha")
drivers

Out:

/home/docs/checkouts/readthedocs.org/user_builds/cellrank/checkouts/latest/examples/plotting/plot_lineage_drivers.py:32: DeprecationWarning: `cellrank.tl.lineage_drivers` will be removed in version `2.0`. Please use the `cellrank.kernels` or `cellrank.estimators` interface instead.
  drivers = cr.tl.lineage_drivers(adata, lineages="Alpha")
Alpha_corr Alpha_pval Alpha_qval Alpha_ci_low Alpha_ci_high
index
Gcg 0.866438 0.000000e+00 0.000000e+00 0.856386 0.875833
Irx2 0.519357 4.508562e-184 4.508562e-181 0.490317 0.547245
Peg10 0.515331 1.303002e-180 8.686677e-178 0.486130 0.543383
Wnk3 0.482853 1.564785e-154 7.823924e-152 0.452402 0.512179
Tmsb15l 0.456772 9.197172e-136 3.678869e-133 0.425381 0.487066
... ... ... ... ... ...
Nkx6-1 -0.313467 8.401476e-60 7.637706e-58 -0.348176 -0.277899
Nnat -0.318801 6.114820e-62 6.436653e-60 -0.353374 -0.283359
Gng12 -0.324276 3.495796e-64 4.112701e-62 -0.358706 -0.288965
Pdx1 -0.330744 6.754179e-67 8.442723e-65 -0.365003 -0.295592
Ptma -0.373322 1.357914e-86 2.468935e-84 -0.406373 -0.339295

2000 rows × 5 columns



Finally, we can plot the potential drivers. Below we plot top 3 driver genes for the ‘Alpha’ lineage.

cr.pl.lineage_drivers(adata, lineage="Alpha", n_genes=3)
Gcg qval=0.0000e+00, Irx2 qval=4.5086e-181, Peg10 qval=8.6867e-178

Total running time of the script: ( 0 minutes 21.158 seconds)

Estimated memory usage: 682 MB

Gallery generated by Sphinx-Gallery