Lineage tricks

This example shows some niche but useful features of cellrank.tl.Lineage.

cellrank.tl.Lineage is a lightweight wrapper around numpy.ndarray that stores the data in columns, attaches a name and a color to each column, and allows pandas-like indexing. It also provides various methods, such as one for plotting aggregate information for each column.

We use it primarily to store either the fate probabilities or the macrostate memberships; see Compute absorption probabilities or Compute macrostates to learn how to compute them.

import cellrank as cr
import numpy as np

np.random.seed(42)

The lineage class behaves like a numpy array for the most part. The key differences are that only 2-dimensional arrays are allowed as input and that it always tries to preserve its shape, even if a scalar is requested.

The constructor requires the underlying array and the lineage names, which must be unique. The colors are optional and by default they are automatically generated.

lin = cr.tl.Lineage(
    np.abs(np.random.normal(size=(10, 4))), names=["foo", "bar", "baz", "quux"]
)
lin /= lin.sum(1)
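The in-place division above works because the row sums keep their second dimension. As a rough plain-NumPy analogy (a sketch of the idea only, not how cellrank.tl.Lineage is implemented), the same row normalization needs keepdims=True; the array X and the variable names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
X = np.abs(rng.normal(size=(10, 4)))

# Plain NumPy drops the reduced axis by default.
flat_sums = X.sum(axis=1)                 # shape (10,)
# keepdims=True keeps a column vector that broadcasts against (10, 4).
kept_sums = X.sum(axis=1, keepdims=True)  # shape (10, 1)

# Dividing by flat_sums would fail to broadcast: (10, 4) vs (10,).
X_norm = X / kept_sums

print(flat_sums.shape, kept_sums.shape)
print(np.allclose(X_norm.sum(axis=1), 1.0))
```

With a shape-preserving container, the keepdims step is implicit, which is why lin.sum(1) can be used directly as the divisor.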

In some cases, the shape-preserving behavior is not desirable or can have unintended consequences. To access the underlying numpy array, use the cellrank.tl.Lineage.X attribute.

lin.X

Out:

array([[0.17703771, 0.04927984, 0.23084765, 0.54283479],
       [0.08318243, 0.0831766 , 0.5610116 , 0.27262937],
       [0.24184977, 0.27949985, 0.23872966, 0.23992072],
       [0.05446598, 0.43068153, 0.38828094, 0.12657155],
       [0.27768531, 0.08615638, 0.24895063, 0.38720768],
       [0.46035999, 0.07091629, 0.0212106 , 0.44751312],
       [0.24948831, 0.05083536, 0.52749551, 0.17218082],
       [0.17949245, 0.08716859, 0.17981159, 0.55352737],
       [0.00433354, 0.33959804, 0.26409355, 0.39197487],
       [0.05654772, 0.53056103, 0.35959305, 0.0532982 ]])

Lineages can also be transposed.

lin.T
foo   0.177038  0.083182  0.241850  0.054466  0.277685  0.460360  0.249488  0.179492  0.004334  0.056548
bar   0.049280  0.083177  0.279500  0.430682  0.086156  0.070916  0.050835  0.087169  0.339598  0.530561
baz   0.230848  0.561012  0.238730  0.388281  0.248951  0.021211  0.527496  0.179812  0.264094  0.359593
quux  0.542835  0.272629  0.239921  0.126572  0.387208  0.447513  0.172181  0.553527  0.391975  0.053298

4 lineages x 10 cells



Indexing into a lineage can also be done via the names.

lin[["foo", "bar"]]
foo       bar
0.177038  0.049280
0.083182  0.083177
0.241850  0.279500
0.054466  0.430682
0.277685  0.086156
0.460360  0.070916
0.249488  0.050835
0.179492  0.087169
0.004334  0.339598
0.056548  0.530561

10 cells x 2 lineages
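Name-based selection can be pictured as a lookup from names to column positions. The following is a minimal sketch on a plain NumPy array, assuming unique names; the helper select and the dictionary name_to_col are hypothetical and are not part of the CellRank API.

```python
import numpy as np

names = ["foo", "bar", "baz", "quux"]
X = np.arange(12, dtype=float).reshape(3, 4)

# Hypothetical lookup table: each unique name maps to one column index.
name_to_col = {name: i for i, name in enumerate(names)}


def select(X, wanted):
    """Return the columns of X whose names are listed in `wanted`."""
    cols = [name_to_col[n] for n in wanted]
    # Indexing with a list keeps the result 2-dimensional.
    return X[:, cols]


sub = select(X, ["foo", "bar"])
print(sub.shape)
```

Because the lookup is a dictionary, duplicate names would silently overwrite each other, which illustrates why unique names are a sensible requirement.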



Two or more lineages can be combined into one by joining their names with “,”. This also automatically updates the color based on the combined lineages’ colors.

lin[["bar, baz, quux"]]
bar or baz or quux
0.822962
0.916818
0.758150
0.945534
0.722315
0.539640
0.750512
0.820508
0.995666
0.943452

10 cells x 1 lineage
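The values shown above are consistent with a row-wise sum of the selected columns. Assuming that this is what combining does, a plain-NumPy sketch on the first two rows of lin.X looks as follows; the variable names are illustrative.

```python
import numpy as np

names = ["foo", "bar", "baz", "quux"]
# First two rows of lin.X from the output above.
X = np.array([
    [0.17703771, 0.04927984, 0.23084765, 0.54283479],
    [0.08318243, 0.0831766, 0.5610116, 0.27262937],
])

# Split the comma-joined key into individual names and look up their columns.
cols = [names.index(n.strip()) for n in "bar, baz, quux".split(",")]

# Sum the selected columns per cell, keeping a single-column 2-D result.
combined = X[:, cols].sum(axis=1, keepdims=True)
print(combined.round(6))
```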



cellrank.tl.Lineage supports most numpy methods. One can also calculate the entropy, which [Setty19] defines as the differentiation potential of cells.

lin.entropy(axis=1)
entropy of foo, bar, baz, quux
1.124933
1.092287
1.384020
1.150247
1.280556
0.986338
1.138118
1.156893
1.109079
1.022771

10 cells x 1 lineage
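The printed values agree with the Shannon entropy computed per row with the natural logarithm, H = -sum_i p_i log p_i. A minimal NumPy check on the first two rows of lin.X, assuming strictly positive probabilities so that the logarithm is defined:

```python
import numpy as np

# First two rows of lin.X from the output above; each row sums to 1.
p = np.array([
    [0.17703771, 0.04927984, 0.23084765, 0.54283479],
    [0.08318243, 0.0831766, 0.5610116, 0.27262937],
])

# Shannon entropy per cell (row), natural logarithm, result kept 2-dimensional.
entropy = -(p * np.log(p)).sum(axis=1, keepdims=True)
print(entropy.round(6))
```

A cell whose probability mass is spread evenly over all lineages has the highest entropy, log(4) for four lineages, while a fully committed cell has entropy close to 0.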



When only a subset of the lineages is selected, the rows no longer sum to 1 and cannot be interpreted as probability distributions. The method cellrank.tl.Lineage.reduce can be used to solve this issue. Below we show only one of many normalization techniques.

lin.reduce("foo, quux", "baz", normalize_weights="softmax")
foo or quux  baz
0.754916     0.245084
0.414961     0.585039
0.680529     0.319471
0.487305     0.512695
0.726161     0.273839
0.958303     0.041697
0.457819     0.542181
0.795007     0.204993
0.637804     0.362196
0.487140     0.512860

10 cells x 2 lineages



Lastly, we can plot aggregate information about the lineages, computed with a reduction function such as numpy.mean().

lin.plot_pie(np.mean, legend_loc="on data")
[Figure: pie chart titled "mean"]
