Plot potential lineage drivers

This example shows how to compute and plot expression trends for genes which may be involved in lineage decisions.

We identify these by correlating gene expression with absorption probabilities towards a specific terminal state.

import cellrank as cr

adata = cr.datasets.pancreas_preprocessed("../example.h5ad")
adata

Out:

AnnData object with n_obs × n_vars = 2531 × 2000
    obs: 'day', 'proliferation', 'G2M_score', 'S_score', 'phase', 'clusters_coarse', 'clusters', 'clusters_fine', 'louvain_Alpha', 'louvain_Beta', 'initial_size_unspliced', 'initial_size_spliced', 'initial_size', 'n_counts', 'velocity_self_transition', 'dpt_pseudotime'
    var: 'highly_variable_genes', 'gene_count_corr', 'means', 'dispersions', 'dispersions_norm', 'fit_r2', 'fit_alpha', 'fit_beta', 'fit_gamma', 'fit_t_', 'fit_scaling', 'fit_std_u', 'fit_std_s', 'fit_likelihood', 'fit_u0', 'fit_s0', 'fit_pval_steady', 'fit_steady_u', 'fit_steady_s', 'fit_variance', 'fit_alignment_scaling', 'velocity_genes'
    uns: 'clusters_colors', 'clusters_fine_colors', 'diffmap_evals', 'iroot', 'louvain_Alpha_colors', 'louvain_Beta_colors', 'neighbors', 'pca', 'recover_dynamics', 'velocity_graph', 'velocity_graph_neg', 'velocity_params'
    obsm: 'X_diffmap', 'X_pca', 'X_umap', 'velocity_umap'
    varm: 'PCs', 'loss'
    layers: 'Ms', 'Mu', 'fit_t', 'fit_tau', 'fit_tau_', 'spliced', 'unspliced', 'velocity', 'velocity_u'
    obsp: 'connectivities', 'distances'

First, we need to compute the terminal states and the absorption probabilities towards them.

cr.tl.terminal_states(
    adata,
    cluster_key="clusters",
    n_cells=30,
    n_states=3,
    softmax_scale=4,
    show_progress_bar=False,
)
cr.tl.lineages(adata)

Out:

INFO: Using pre-computed schur decomposition

Once the lineages have been computed, we can compute the potential driver genes for each of them. It is also possible to restrict this computation to just a few clusters, defined by cluster_key and clusters.

By default we are computing the driver genes for all lineages.

drivers = cr.tl.lineage_drivers(adata, lineages="Alpha")
drivers
Alpha corr Alpha pval Alpha qval Alpha ci low Alpha ci high
index
Gcg 0.866438 0.000000e+00 0.000000e+00 0.856386 0.875833
Irx2 0.519357 4.506551e-184 4.506551e-181 0.490317 0.547246
Peg10 0.515331 1.302594e-180 8.683959e-178 0.486130 0.543383
Wnk3 0.482853 1.564546e-154 7.822729e-152 0.452402 0.512179
Tmsb15l 0.456772 9.195784e-136 3.678314e-133 0.425381 0.487066
... ... ... ... ... ...
Nkx6-1 -0.313467 8.400535e-60 7.636850e-58 -0.348176 -0.277899
Nnat -0.318801 6.115121e-62 6.436970e-60 -0.353374 -0.283359
Gng12 -0.324276 3.496286e-64 4.113277e-62 -0.358706 -0.288965
Pdx1 -0.330744 6.755869e-67 8.444837e-65 -0.365002 -0.295591
Ptma -0.373322 1.357200e-86 2.467637e-84 -0.406373 -0.339296

2000 rows × 5 columns



Finally, we can plot the potential drivers. Below we plot top 3 driver genes for the ‘Alpha’ lineage.

cr.pl.lineage_drivers(adata, lineage="Alpha", n_genes=3)
Gcg q-value=0.0000e+00, Irx2 q-value=4.5066e-181, Peg10 q-value=8.6840e-178

Total running time of the script: ( 0 minutes 22.028 seconds)

Estimated memory usage: 598 MB

Gallery generated by Sphinx-Gallery