# Lineage tricks

This example shows some niche but useful features of `cellrank.tl.Lineage`.

`cellrank.tl.Lineage` is a lightweight wrapper around `numpy.ndarray` that stores the data in columns, one per lineage, together with the lineage names and colors, and allows for `pandas`-like indexing. It also provides various methods, such as one for plotting aggregate information about each column.

We use it primarily to store either the fate probabilities or the macrostate memberships; see Compute absorption probabilities or Compute macrostates to learn how to compute them.

```python
import cellrank as cr

import numpy as np

np.random.seed(42)
```

The lineage class behaves like a `numpy` array for the most part. The key differences are that only 2-dimensional arrays are allowed as input and that it always tries to preserve its shape, even if a scalar is requested.

The constructor requires the underlying array and the lineage names, which must be unique. The colors are optional and by default they are automatically generated.

```python
lin = cr.tl.Lineage(
    np.abs(np.random.normal(size=(10, 4))), names=["foo", "bar", "baz", "quux"]
)
# normalize each row (cell) so that it sums to 1
lin /= lin.sum(1)
```
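The in-place division above relies on the lineage keeping its 2-dimensional shape, so that the row sums broadcast against the 4 columns. As a rough comparison, the sketch below does the same row normalization on a plain `numpy` array, where `keepdims=True` has to be requested explicitly. The variable names and the random data are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.abs(rng.normal(size=(10, 4)))

# Without keepdims=True the row sums would have shape (10,), which does
# not broadcast against a (10, 4) array.
row_sums = X.sum(axis=1, keepdims=True)

# Divide every row by its own sum so that each row sums to 1.
P = X / row_sums

print(row_sums.shape)
print(np.allclose(P.sum(axis=1), 1.0))
```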

In some cases, this shape-preserving behavior is undesirable or can have unintended consequences. To access the underlying `numpy` array, use the `cellrank.tl.Lineage.X` attribute.

```python
lin.X
```

Out:

```
array([[0.17703771, 0.04927984, 0.23084765, 0.54283479],
       [0.08318243, 0.0831766 , 0.5610116 , 0.27262937],
       [0.24184977, 0.27949985, 0.23872966, 0.23992072],
       [0.05446598, 0.43068153, 0.38828094, 0.12657155],
       [0.27768531, 0.08615638, 0.24895063, 0.38720768],
       [0.46035999, 0.07091629, 0.0212106 , 0.44751312],
       [0.24948831, 0.05083536, 0.52749551, 0.17218082],
       [0.17949245, 0.08716859, 0.17981159, 0.55352737],
       [0.00433354, 0.33959804, 0.26409355, 0.39197487],
       [0.05654772, 0.53056103, 0.35959305, 0.0532982 ]])
```
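Once you work with the raw array, the usual `numpy` indexing rules apply again, and dimensions are dropped where the lineage class would keep them. The following minimal sketch uses a small made-up array to show the difference between the common indexing forms.

```python
import numpy as np

X = np.array([[0.2, 0.8],
              [0.6, 0.4]])

scalar = X[0, 0]       # a single element: 0-dimensional NumPy scalar
column = X[:, 0]       # integer index drops the axis: shape (2,)
column_2d = X[:, [0]]  # list index keeps the axis: shape (2, 1)

print(np.ndim(scalar), column.shape, column_2d.shape)
```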

Lineages can also be transposed.

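```python
lin.T
```

| | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| **foo** | 0.177038 | 0.083182 | 0.241850 | 0.054466 | 0.277685 | 0.460360 | 0.249488 | 0.179492 | 0.004334 | 0.056548 |
| **bar** | 0.049280 | 0.083177 | 0.279500 | 0.430682 | 0.086156 | 0.070916 | 0.050835 | 0.087169 | 0.339598 | 0.530561 |
| **baz** | 0.230848 | 0.561012 | 0.238730 | 0.388281 | 0.248951 | 0.021211 | 0.527496 | 0.179812 | 0.264094 | 0.359593 |
| **quux** | 0.542835 | 0.272629 | 0.239921 | 0.126572 | 0.387208 | 0.447513 | 0.172181 | 0.553527 | 0.391975 | 0.053298 |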

4 lineages x 10 cells

A lineage can also be indexed by name.

```python
lin[["foo", "bar"]]
```

| foo | bar |
|---|---|
| 0.177038 | 0.049280 |
| 0.083182 | 0.083177 |
| 0.241850 | 0.279500 |
| 0.054466 | 0.430682 |
| 0.277685 | 0.086156 |
| 0.460360 | 0.070916 |
| 0.249488 | 0.050835 |
| 0.179492 | 0.087169 |
| 0.004334 | 0.339598 |
| 0.056548 | 0.530561 |

10 cells x 2 lineages
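Conceptually, name-based indexing amounts to translating names into column positions. The sketch below imitates this on a plain `numpy` array using the first two rows of the data above. The helper `select` and the dictionary `name_to_col` are hypothetical and are not part of CellRank.

```python
import numpy as np

names = ["foo", "bar", "baz", "quux"]
X = np.array([
    [0.17703771, 0.04927984, 0.23084765, 0.54283479],
    [0.08318243, 0.0831766, 0.5610116, 0.27262937],
])

# Map every lineage name to its column position.
name_to_col = {name: i for i, name in enumerate(names)}


def select(X, keys):
    """Return the columns of X named in keys, in the requested order."""
    return X[:, [name_to_col[k] for k in keys]]


sub = select(X, ["foo", "bar"])
print(sub.shape)
```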

Two or more lineages can be combined into one by joining their names with a comma (","). This also automatically updates the color based on the combined lineages' colors.

```python
lin[["bar, baz, quux"]]
```

| bar, baz, quux |
|---|
| 0.822962 |
| 0.916818 |
| 0.758150 |
| 0.945534 |
| 0.722315 |
| 0.539640 |
| 0.750512 |
| 0.820508 |
| 0.995666 |
| 0.943452 |

10 cells x 1 lineage
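Judging by the numbers above, the combined column is the row-wise sum of its member columns. The following sketch reproduces this on a plain `numpy` array with the first two rows of the data. The helper `combine` is hypothetical and only mimics the comma-separated key. It does not handle colors.

```python
import numpy as np

names = ["foo", "bar", "baz", "quux"]
X = np.array([
    [0.17703771, 0.04927984, 0.23084765, 0.54283479],
    [0.08318243, 0.0831766, 0.5610116, 0.27262937],
])


def combine(X, names, key):
    """Sum the columns listed in a comma-separated key, keeping a 2-D shape."""
    members = [part.strip() for part in key.split(",")]
    cols = [names.index(m) for m in members]
    return X[:, cols].sum(axis=1, keepdims=True)


combined = combine(X, names, "bar, baz, quux")
print(np.round(combined, 6))
```

The two printed values should agree with the first two rows of the output above.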

Most `numpy` methods are supported by `cellrank.tl.Lineage`. One can also calculate the entropy of each cell's distribution over lineages, which has been used to define the differentiation potential of cells.

```python
lin.entropy(axis=1)
```

| entropy of foo, bar, baz, quux |
|---|
| 1.124933 |
| 1.092287 |
| 1.384020 |
| 1.150247 |
| 1.280556 |
| 0.986338 |
| 1.138118 |
| 1.156893 |
| 1.109079 |
| 1.022771 |

10 cells x 1 lineage
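For intuition, the row-wise Shannon entropy can be written in a few lines of `numpy`. The sketch below assumes the natural logarithm, which is consistent with the first value of the output above, and uses a hypothetical helper `row_entropy`. The second row is a made-up uniform distribution, for which the entropy reaches its maximum of `log(4)`.

```python
import numpy as np


def row_entropy(P):
    """Shannon entropy (natural log) of each row, returned with shape (n, 1)."""
    P = np.asarray(P, dtype=float)
    logs = np.zeros_like(P)
    # Only take the log where P > 0, so that 0 * log(0) is treated as 0.
    np.log(P, out=logs, where=P > 0)
    return -(P * logs).sum(axis=1, keepdims=True)


P = np.array([
    [0.17703771, 0.04927984, 0.23084765, 0.54283479],  # first cell above
    [0.25, 0.25, 0.25, 0.25],                          # uniform, made up
])
H = row_entropy(P)
print(H)
```

A cell that is equally likely to reach every lineage has the highest entropy, while a cell committed to a single lineage has entropy 0.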

When only a subset of the lineages is selected, the rows no longer sum to 1 and cannot be interpreted as probability distributions. The method `cellrank.tl.Lineage.reduce` can be used to solve this issue. Below we show only one of several normalization techniques.

```python
lin.reduce("foo, quux", "baz", normalize_weights="softmax")
```

| foo, quux | baz |
|---|---|
| 0.754916 | 0.245084 |
| 0.414961 | 0.585039 |
| 0.680529 | 0.319471 |
| 0.487305 | 0.512695 |
| 0.726161 | 0.273839 |
| 0.958303 | 0.041697 |
| 0.457819 | 0.542181 |
| 0.795007 | 0.204993 |
| 0.637804 | 0.362196 |
| 0.487140 | 0.512860 |

10 cells x 2 lineages
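As a point of comparison, the simplest possible way to turn a subset back into a probability distribution is to divide each row by its new sum. The sketch below does this for the same two groups, using the first two rows of the data. This is not the algorithm behind `reduce`, and it yields slightly different numbers than the output above. The helper `renormalize_rows` is hypothetical.

```python
import numpy as np

names = ["foo", "bar", "baz", "quux"]
X = np.array([
    [0.17703771, 0.04927984, 0.23084765, 0.54283479],
    [0.08318243, 0.0831766, 0.5610116, 0.27262937],
])


def renormalize_rows(A):
    """Divide every row by its sum so that each row sums to 1 again."""
    return A / A.sum(axis=1, keepdims=True)


# Build the two groups: "foo, quux" (summed) and "baz"; "bar" is dropped.
foo_quux = X[:, [names.index("foo"), names.index("quux")]].sum(axis=1)
baz = X[:, names.index("baz")]
subset = np.column_stack([foo_quux, baz])

reduced = renormalize_rows(subset)
print(reduced.sum(axis=1))
```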

Lastly, we can plot aggregate information about the lineages, for example their mean computed with `numpy.mean()`.

```python
lin.plot_pie(np.mean, legend_loc="on data")
```
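To get a feeling for the numbers behind such a pie chart, the sketch below computes the per-lineage mean on a plain `numpy` array, assuming that the reduction is applied column-wise. It uses the first two rows of the data and does no plotting. Because every row sums to 1, the column means also sum to 1 and can be read directly as pie fractions.

```python
import numpy as np

names = ["foo", "bar", "baz", "quux"]
X = np.array([
    [0.17703771, 0.04927984, 0.23084765, 0.54283479],
    [0.08318243, 0.0831766, 0.5610116, 0.27262937],
])

# One aggregate value per lineage (column).
col_means = X.mean(axis=0)
fractions = dict(zip(names, col_means))

print(fractions)
print(col_means.sum())
```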
