Note
Click here to download the full example code
Compute eigendecomposition¶
This example shows how to compute and plot the spectrum of a transition matrix.
The spectrum of a rectangular matrix \(T\) is given by its eigenvalues. For a general, real valued matrix \(T\), eigenvalues will either be real or appear in pairs of complex conjugates. CellRank can plot the spectrum in the complex plane or restrict the visualization to the real part of the eigenvalues, which may be helpful to spot the eigengap.
import cellrank as cr
adata = cr.datasets.pancreas_preprocessed("../example.h5ad")
adata
Out:
0%| | 0.00/140M [00:00<?, ?B/s]
0%| | 56.0k/140M [00:00<07:13, 339kB/s]
0%| | 240k/140M [00:00<03:05, 790kB/s]
1%| | 1.00M/140M [00:00<00:56, 2.58MB/s]
3%|2 | 3.53M/140M [00:00<00:18, 7.70MB/s]
5%|4 | 6.42M/140M [00:00<00:12, 11.3MB/s]
7%|6 | 9.45M/140M [00:01<00:09, 13.8MB/s]
9%|8 | 12.5M/140M [00:01<00:08, 15.3MB/s]
11%|#1 | 15.5M/140M [00:01<00:08, 16.3MB/s]
13%|#3 | 18.5M/140M [00:01<00:07, 17.0MB/s]
15%|#5 | 21.5M/140M [00:01<00:07, 17.5MB/s]
18%|#7 | 24.5M/140M [00:01<00:06, 17.8MB/s]
20%|#9 | 27.5M/140M [00:02<00:06, 18.0MB/s]
22%|##1 | 30.6M/140M [00:02<00:06, 18.1MB/s]
24%|##3 | 33.6M/140M [00:02<00:06, 18.2MB/s]
26%|##6 | 36.6M/140M [00:02<00:05, 18.3MB/s]
28%|##8 | 39.6M/140M [00:02<00:05, 18.3MB/s]
30%|### | 42.6M/140M [00:02<00:05, 18.4MB/s]
33%|###2 | 45.7M/140M [00:03<00:05, 18.6MB/s]
35%|###4 | 48.7M/140M [00:03<00:05, 18.4MB/s]
37%|###6 | 51.7M/140M [00:03<00:05, 18.4MB/s]
39%|###9 | 54.7M/140M [00:03<00:04, 18.4MB/s]
41%|####1 | 57.7M/140M [00:03<00:04, 18.4MB/s]
43%|####3 | 60.8M/140M [00:03<00:04, 18.5MB/s]
45%|####5 | 63.8M/140M [00:04<00:04, 18.7MB/s]
48%|####7 | 66.8M/140M [00:04<00:04, 18.7MB/s]
50%|####9 | 69.8M/140M [00:04<00:04, 18.3MB/s]
52%|#####1 | 72.8M/140M [00:04<00:03, 18.4MB/s]
54%|#####4 | 75.8M/140M [00:04<00:03, 18.5MB/s]
56%|#####6 | 78.9M/140M [00:04<00:03, 18.5MB/s]
58%|#####8 | 81.9M/140M [00:05<00:03, 18.8MB/s]
60%|#####9 | 84.1M/140M [00:05<00:03, 19.6MB/s]
62%|######1 | 86.4M/140M [00:05<00:03, 18.0MB/s]
64%|######3 | 89.4M/140M [00:05<00:02, 18.2MB/s]
66%|######5 | 92.4M/140M [00:05<00:02, 18.3MB/s]
68%|######8 | 95.5M/140M [00:05<00:02, 18.3MB/s]
70%|####### | 98.5M/140M [00:06<00:02, 18.4MB/s]
72%|#######2 | 102M/140M [00:06<00:02, 18.4MB/s]
75%|#######4 | 105M/140M [00:06<00:02, 18.4MB/s]
77%|#######6 | 108M/140M [00:06<00:01, 18.4MB/s]
79%|#######8 | 111M/140M [00:06<00:01, 18.4MB/s]
81%|########1 | 114M/140M [00:06<00:01, 18.5MB/s]
83%|########3 | 117M/140M [00:07<00:01, 18.5MB/s]
85%|########5 | 120M/140M [00:07<00:01, 18.5MB/s]
87%|########7 | 123M/140M [00:07<00:00, 18.5MB/s]
90%|########9 | 126M/140M [00:07<00:00, 18.5MB/s]
92%|#########1| 129M/140M [00:07<00:00, 18.5MB/s]
94%|#########3| 132M/140M [00:07<00:00, 18.5MB/s]
96%|#########6| 135M/140M [00:08<00:00, 18.6MB/s]
98%|#########8| 138M/140M [00:08<00:00, 18.4MB/s]
100%|##########| 140M/140M [00:08<00:00, 17.5MB/s]
AnnData object with n_obs × n_vars = 2531 × 2000
obs: 'day', 'proliferation', 'G2M_score', 'S_score', 'phase', 'clusters_coarse', 'clusters', 'clusters_fine', 'louvain_Alpha', 'louvain_Beta', 'initial_size_unspliced', 'initial_size_spliced', 'initial_size', 'n_counts', 'velocity_self_transition', 'dpt_pseudotime'
var: 'highly_variable_genes', 'gene_count_corr', 'means', 'dispersions', 'dispersions_norm', 'fit_r2', 'fit_alpha', 'fit_beta', 'fit_gamma', 'fit_t_', 'fit_scaling', 'fit_std_u', 'fit_std_s', 'fit_likelihood', 'fit_u0', 'fit_s0', 'fit_pval_steady', 'fit_steady_u', 'fit_steady_s', 'fit_variance', 'fit_alignment_scaling', 'velocity_genes'
uns: 'clusters_colors', 'clusters_fine_colors', 'diffmap_evals', 'iroot', 'louvain_Alpha_colors', 'louvain_Beta_colors', 'neighbors', 'pca', 'recover_dynamics', 'velocity_graph', 'velocity_graph_neg', 'velocity_params'
obsm: 'X_diffmap', 'X_pca', 'X_umap', 'velocity_umap'
varm: 'PCs', 'loss'
layers: 'Ms', 'Mu', 'fit_t', 'fit_tau', 'fit_tau_', 'spliced', 'unspliced', 'velocity', 'velocity_u'
obsp: 'connectivities', 'distances'
First, we prepare the kernel using the high-level pipeline and the cellrank.tl.estimators.GPCCA
estimator.
k = cr.tl.transition_matrix(
adata, weight_connectivities=0.2, softmax_scale=4, show_progress_bar=False
)
g = cr.tl.estimators.GPCCA(k)
In this example, we visualize the eigenvectors in an embedding, therefore we specify only_evals=False
,
but for the method cellrank.tl.estimators.GPCCA.plot_spectrum()
, they are not necessary, because
there we plot only the eigenvalues.
g.compute_eigendecomposition(k=20, only_evals=False)
g.eigendecomposition
Out:
{'D': array([1. +0.j , 0.99137053+0.j ,
0.9758474 +0.j , 0.88897151+0.j ,
0.87896326+0.04901534j, 0.87896326-0.04901534j,
0.84816328+0.j , 0.82121909+0.j ,
0.80365766+0.j , 0.77571644+0.10474434j,
0.77571644-0.10474434j, 0.74923267+0.j ,
0.70915182+0.j , 0.6812836 +0.j ,
0.66361949+0.j , 0.64270154+0.13980109j,
0.64270154-0.13980109j, 0.58896217-0.03061993j,
0.58896217+0.03061993j, 0.58650816+0.0068823j ]), 'stationary_dist': array([7.40750026e-09, 1.94359730e-08, 1.01516602e-09, ...,
2.21219747e-08, 3.63003411e-07, 1.45711587e-07]), 'V_l': array([[-4.28536196e-08+0.00000000e+00j, 2.13794301e-04+0.00000000e+00j,
-6.23963743e-04+0.00000000e+00j, ...,
3.28594078e-06+1.83528622e-05j, 3.28594078e-06-1.83528622e-05j,
-1.16405233e-05-1.05792194e-05j],
[-1.12440332e-07+0.00000000e+00j, 5.69645773e-04+0.00000000e+00j,
-1.46368005e-03+0.00000000e+00j, ...,
1.92676053e-05+1.32364672e-04j, 1.92676053e-05-1.32364672e-04j,
-8.25691126e-05-7.35627582e-05j],
[-5.87290407e-09+0.00000000e+00j, 3.33130265e-05+0.00000000e+00j,
-9.58710260e-05+0.00000000e+00j, ...,
-1.51334581e-05-3.30118326e-06j, -1.51334581e-05+3.30118326e-06j,
-5.03302972e-08+4.03516736e-06j],
...,
[-1.27979299e-07+0.00000000e+00j, 2.65002636e-04+0.00000000e+00j,
-8.03040395e-04+0.00000000e+00j, ...,
-6.87706505e-06-3.39744627e-06j, -6.87706505e-06+3.39744627e-06j,
7.64981754e-07+2.63658991e-06j],
[-2.10003503e-06+0.00000000e+00j, 1.07217683e-02+0.00000000e+00j,
1.19457924e-02+0.00000000e+00j, ...,
1.25770913e-04+2.65788454e-04j, 1.25770913e-04-2.65788454e-04j,
-5.71172331e-05-1.25479548e-04j],
[-8.42965738e-07+0.00000000e+00j, 6.15551551e-03+0.00000000e+00j,
-1.38295927e-02+0.00000000e+00j, ...,
-6.57367705e-05+8.23952515e-05j, -6.57367705e-05-8.23952515e-05j,
-1.06806376e-04-5.28000495e-05j]]), 'V_r': array([[ 1.98771414e-02+0.00000000e+00j, 1.20876005e-02+0.00000000e+00j,
5.77124398e-04+0.00000000e+00j, ...,
2.04110000e-08+5.35808117e-09j, 2.04110000e-08-5.35808117e-09j,
1.23945930e-09-1.46462684e-08j],
[ 1.98771414e-02+0.00000000e+00j, 2.03996611e-02+0.00000000e+00j,
6.31972192e-04+0.00000000e+00j, ...,
-1.76238318e-08-1.93858897e-09j, -1.76238318e-08+1.93858897e-09j,
-1.17032152e-10+7.39284792e-09j],
[ 1.98771414e-02+0.00000000e+00j, 1.36962901e-02+0.00000000e+00j,
-1.08637079e-03+0.00000000e+00j, ...,
3.97098243e-06+2.03388287e-06j, 3.97098243e-06-2.03388287e-06j,
-1.02966671e-06-1.12472803e-06j],
...,
[ 1.98771414e-02+0.00000000e+00j, 1.01817921e-02+0.00000000e+00j,
-9.79193393e-05+0.00000000e+00j, ...,
7.54772915e-09+1.48398113e-08j, 7.54772915e-09-1.48398113e-08j,
-3.17590875e-10-1.43531463e-08j],
[ 1.98771414e-02+0.00000000e+00j, 5.07558046e-02+0.00000000e+00j,
5.17799342e-02+0.00000000e+00j, ...,
2.28134198e-09+3.17919252e-10j, 2.28134198e-09-3.17919252e-10j,
-9.50512998e-11-1.03554018e-09j],
[ 1.98771414e-02+0.00000000e+00j, 2.31866713e-02+0.00000000e+00j,
-5.74426441e-02+0.00000000e+00j, ...,
-2.67052800e-08-6.03125122e-09j, -2.67052800e-08+6.03125122e-09j,
1.34924181e-09+1.03072449e-08j]]), 'eigengap': 2, 'params': {'which': 'LR', 'k': 20, 'alpha': 1}}
Let’s start first by plotting the real spectrum. The eigengap is shown by the vertical line. Below we plot only the first 10 real eigenvalues, sorted by their real part.
g.plot_spectrum(10, real_only=True)

Next, we plot the eigenvalues in the complex plane.
g.plot_spectrum(real_only=False)

Finally, we can plot the left and right eigenvectors in an embedding. We can also specify which or how many
eigenvectors to plot using the parameter use
. If not specified, vectors up to the eigengap are shown.
g.plot_eigendecomposition(left=False)
g.plot_eigendecomposition(left=True, use=2)
Left and right eigenvectors are used in cellrank.tl.estimators.CFLARE
estimator to compute the initial or
the states of the process.
Total running time of the script: ( 0 minutes 21.852 seconds)
Estimated memory usage: 416 MB