Compute eigendecomposition

This example shows how to compute and plot the spectrum of a transition matrix.

The spectrum of a rectangular matrix \(T\) is given by its eigenvalues. For a general, real valued matrix \(T\), eigenvalues will either be real or appear in pairs of complex conjugates. CellRank can plot the spectrum in the complex plane or restrict the visualization to the real part of the eigenvalues, which may be helpful to spot the eigengap.

import cellrank as cr

adata = cr.datasets.pancreas_preprocessed("../example.h5ad")
adata

Out:

example.h5ad: 0.00B [00:00, ?B/s]
example.h5ad:   0%|          | 8.19k/147M [00:00<3:33:07, 11.5kB/s]
example.h5ad:   0%|          | 57.3k/147M [00:00<2:31:14, 16.2kB/s]
example.h5ad:   0%|          | 213k/147M [00:00<1:46:25, 23.0kB/s]
example.h5ad:   1%|          | 893k/147M [00:01<1:14:18, 32.8kB/s]
example.h5ad:   2%|2         | 3.63M/147M [00:01<51:04, 46.8kB/s]
example.h5ad:   7%|6         | 9.59M/147M [00:01<34:16, 66.8kB/s]
example.h5ad:  11%|#         | 15.6M/147M [00:01<22:57, 95.4kB/s]
example.h5ad:  14%|#4        | 20.8M/147M [00:01<15:27, 136kB/s]
example.h5ad:  17%|#6        | 24.7M/147M [00:01<10:29, 194kB/s]
example.h5ad:  20%|##        | 29.7M/147M [00:01<07:03, 277kB/s]
example.h5ad:  23%|##3       | 33.9M/147M [00:01<04:46, 394kB/s]
example.h5ad:  27%|##6       | 39.0M/147M [00:02<03:12, 561kB/s]
example.h5ad:  29%|##9       | 43.0M/147M [00:02<02:10, 797kB/s]
example.h5ad:  33%|###2      | 48.2M/147M [00:02<01:27, 1.13MB/s]
example.h5ad:  36%|###5      | 52.3M/147M [00:02<00:59, 1.59MB/s]
example.h5ad:  39%|###9      | 57.6M/147M [00:02<00:39, 2.25MB/s]
example.h5ad:  42%|####1     | 61.5M/147M [00:02<00:27, 3.13MB/s]
example.h5ad:  45%|####5     | 66.6M/147M [00:02<00:18, 4.35MB/s]
example.h5ad:  48%|####8     | 70.8M/147M [00:02<00:12, 5.92MB/s]
example.h5ad:  51%|#####1    | 75.7M/147M [00:02<00:08, 8.04MB/s]
example.h5ad:  54%|#####4    | 80.1M/147M [00:03<00:06, 10.6MB/s]
example.h5ad:  58%|#####7    | 85.1M/147M [00:03<00:04, 13.8MB/s]
example.h5ad:  61%|######    | 89.3M/147M [00:03<00:03, 17.0MB/s]
example.h5ad:  64%|######3   | 93.8M/147M [00:03<00:02, 20.9MB/s]
example.h5ad:  67%|######7   | 98.6M/147M [00:03<00:01, 24.6MB/s]
example.h5ad:  70%|#######   | 103M/147M [00:03<00:01, 28.6MB/s]
example.h5ad:  73%|#######3  | 108M/147M [00:03<00:01, 31.5MB/s]
example.h5ad:  77%|#######6  | 113M/147M [00:03<00:00, 35.1MB/s]
example.h5ad:  80%|#######9  | 117M/147M [00:03<00:00, 36.5MB/s]
example.h5ad:  83%|########2 | 122M/147M [00:03<00:00, 39.5MB/s]
example.h5ad:  86%|########5 | 126M/147M [00:04<00:00, 39.2MB/s]
example.h5ad:  89%|########9 | 131M/147M [00:04<00:00, 42.1MB/s]
example.h5ad:  92%|#########2| 135M/147M [00:04<00:00, 40.9MB/s]
example.h5ad:  95%|#########4| 139M/147M [00:04<00:00, 41.1MB/s]
example.h5ad:  98%|#########8| 144M/147M [00:04<00:00, 42.1MB/s]
example.h5ad: 147MB [00:04, 32.2MB/s]

AnnData object with n_obs × n_vars = 2531 × 2000
    obs: 'day', 'proliferation', 'G2M_score', 'S_score', 'phase', 'clusters_coarse', 'clusters', 'clusters_fine', 'louvain_Alpha', 'louvain_Beta', 'initial_size_unspliced', 'initial_size_spliced', 'initial_size', 'n_counts', 'velocity_self_transition', 'dpt_pseudotime'
    var: 'highly_variable_genes', 'gene_count_corr', 'means', 'dispersions', 'dispersions_norm', 'fit_r2', 'fit_alpha', 'fit_beta', 'fit_gamma', 'fit_t_', 'fit_scaling', 'fit_std_u', 'fit_std_s', 'fit_likelihood', 'fit_u0', 'fit_s0', 'fit_pval_steady', 'fit_steady_u', 'fit_steady_s', 'fit_variance', 'fit_alignment_scaling', 'velocity_genes'
    uns: 'clusters_colors', 'clusters_fine_colors', 'diffmap_evals', 'iroot', 'louvain_Alpha_colors', 'louvain_Beta_colors', 'neighbors', 'pca', 'recover_dynamics', 'velocity_graph', 'velocity_graph_neg', 'velocity_params'
    obsm: 'X_diffmap', 'X_pca', 'X_umap', 'velocity_umap'
    varm: 'PCs', 'loss'
    layers: 'Ms', 'Mu', 'fit_t', 'fit_tau', 'fit_tau_', 'spliced', 'unspliced', 'velocity', 'velocity_u'
    obsp: 'connectivities', 'distances'

First, we prepare the kernel using the high-level pipeline and the cellrank.tl.estimators.GPCCA estimator.

k = cr.tl.transition_matrix(
    adata, weight_connectivities=0.2, softmax_scale=4, show_progress_bar=False
)
g = cr.tl.estimators.GPCCA(k)

In this example, we visualize the eigenvectors in an embedding, therefore we specify only_evals=False, but for the method cellrank.tl.estimators.GPCCA.plot_spectrum(), they are not necessary, because there we plot only the eigenvalues.

g.compute_eigendecomposition(k=20, only_evals=False)
g.eigendecomposition

Out:

{'D': array([1.        +0.j        , 0.99137053+0.j        ,
       0.9758474 +0.j        , 0.88897151+0.j        ,
       0.87896326+0.04901534j, 0.87896326-0.04901534j,
       0.84816328+0.j        , 0.82121909+0.j        ,
       0.80365766+0.j        , 0.77571644+0.10474434j,
       0.77571644-0.10474434j, 0.74923267+0.j        ,
       0.70915182+0.j        , 0.6812836 +0.j        ,
       0.66361949+0.j        , 0.64270154+0.13980109j,
       0.64270154-0.13980109j, 0.58896217-0.03061993j,
       0.58896217+0.03061993j, 0.58650816+0.0068823j ]), 'stationary_dist': array([7.40750026e-09, 1.94359730e-08, 1.01516602e-09, ...,
       2.21219747e-08, 3.63003411e-07, 1.45711587e-07]), 'V_l': array([[-4.28536196e-08+0.00000000e+00j,  2.13794301e-04+0.00000000e+00j,
        -6.23963743e-04+0.00000000e+00j, ...,
         3.28594078e-06+1.83528622e-05j,  3.28594078e-06-1.83528622e-05j,
        -1.16405233e-05-1.05792194e-05j],
       [-1.12440332e-07+0.00000000e+00j,  5.69645773e-04+0.00000000e+00j,
        -1.46368005e-03+0.00000000e+00j, ...,
         1.92676053e-05+1.32364672e-04j,  1.92676053e-05-1.32364672e-04j,
        -8.25691126e-05-7.35627582e-05j],
       [-5.87290407e-09+0.00000000e+00j,  3.33130265e-05+0.00000000e+00j,
        -9.58710260e-05+0.00000000e+00j, ...,
        -1.51334581e-05-3.30118326e-06j, -1.51334581e-05+3.30118326e-06j,
        -5.03302972e-08+4.03516736e-06j],
       ...,
       [-1.27979299e-07+0.00000000e+00j,  2.65002636e-04+0.00000000e+00j,
        -8.03040395e-04+0.00000000e+00j, ...,
        -6.87706505e-06-3.39744627e-06j, -6.87706505e-06+3.39744627e-06j,
         7.64981754e-07+2.63658991e-06j],
       [-2.10003503e-06+0.00000000e+00j,  1.07217683e-02+0.00000000e+00j,
         1.19457924e-02+0.00000000e+00j, ...,
         1.25770913e-04+2.65788454e-04j,  1.25770913e-04-2.65788454e-04j,
        -5.71172331e-05-1.25479548e-04j],
       [-8.42965738e-07+0.00000000e+00j,  6.15551551e-03+0.00000000e+00j,
        -1.38295927e-02+0.00000000e+00j, ...,
        -6.57367705e-05+8.23952515e-05j, -6.57367705e-05-8.23952515e-05j,
        -1.06806376e-04-5.28000495e-05j]]), 'V_r': array([[ 1.98771414e-02+0.00000000e+00j,  1.20876005e-02+0.00000000e+00j,
         5.77124398e-04+0.00000000e+00j, ...,
         2.04110000e-08+5.35808117e-09j,  2.04110000e-08-5.35808117e-09j,
         1.23945930e-09-1.46462684e-08j],
       [ 1.98771414e-02+0.00000000e+00j,  2.03996611e-02+0.00000000e+00j,
         6.31972192e-04+0.00000000e+00j, ...,
        -1.76238318e-08-1.93858897e-09j, -1.76238318e-08+1.93858897e-09j,
        -1.17032152e-10+7.39284792e-09j],
       [ 1.98771414e-02+0.00000000e+00j,  1.36962901e-02+0.00000000e+00j,
        -1.08637079e-03+0.00000000e+00j, ...,
         3.97098243e-06+2.03388287e-06j,  3.97098243e-06-2.03388287e-06j,
        -1.02966671e-06-1.12472803e-06j],
       ...,
       [ 1.98771414e-02+0.00000000e+00j,  1.01817921e-02+0.00000000e+00j,
        -9.79193393e-05+0.00000000e+00j, ...,
         7.54772915e-09+1.48398113e-08j,  7.54772915e-09-1.48398113e-08j,
        -3.17590875e-10-1.43531463e-08j],
       [ 1.98771414e-02+0.00000000e+00j,  5.07558046e-02+0.00000000e+00j,
         5.17799342e-02+0.00000000e+00j, ...,
         2.28134198e-09+3.17919252e-10j,  2.28134198e-09-3.17919252e-10j,
        -9.50512998e-11-1.03554018e-09j],
       [ 1.98771414e-02+0.00000000e+00j,  2.31866713e-02+0.00000000e+00j,
        -5.74426441e-02+0.00000000e+00j, ...,
        -2.67052800e-08-6.03125122e-09j, -2.67052800e-08+6.03125122e-09j,
         1.34924181e-09+1.03072449e-08j]]), 'eigengap': 2, 'params': {'which': 'LR', 'k': 20, 'alpha': 1}}

Let’s start first by plotting the real spectrum. The eigengap is shown by the vertical line. Below we plot only the first 10 real eigenvalues, sorted by their real part.

g.plot_spectrum(10, real_only=True)
real part of top 10 eigenvalues according to their real part

Next, we plot the eigenvalues in the complex plane.

g.plot_spectrum(real_only=False)
top 20 eigenvalues according to their real part

Finally, we can plot the left and right eigenvectors in an embedding. We can also specify which or how many eigenvectors to plot using the parameter use. If not specified, vectors up to the eigengap are shown.

g.plot_eigendecomposition(left=False)
g.plot_eigendecomposition(left=True, use=2)
  • $\lambda 0$=1.00+0.00j, $\lambda 1$=0.99+0.00j, $\lambda 2$=0.98+0.00j
  • $\lambda 0$=1.00+0.00j, $\lambda 1$=0.99+0.00j

Left and right eigenvectors are used in cellrank.tl.estimators.CFLARE estimator to compute the initial or the states of the process.

Total running time of the script: ( 0 minutes 17.821 seconds)

Estimated memory usage: 431 MB

Gallery generated by Sphinx-Gallery