CytoTRACEKernel.compute_cytotrace(layer='Ms', aggregation=CytoTRACEAggregation.MEAN, use_raw=False)

Re-implementation of the CytoTRACE algorithm [Gulati et al., 2020] to estimate cellular plasticity.

Computes the number of genes expressed per cell and ranks genes according to their correlation with this measure. Next, it selects the top-correlating genes and aggregates their (imputed) expression to obtain the CytoTRACE score. A high score indicates high differentiation potential (naive, plastic cells) and a low score indicates low differentiation potential (mature, differentiated cells).
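The steps described above can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the library's implementation: the function name `cytotrace_sketch` and the parameter `n_top` are hypothetical, the input is assumed to be a dense cells-by-genes matrix, Pearson correlation is assumed, and the same matrix is used for every step (the real method can take the counts from anndata.AnnData.raw, see use_raw).

```python
import numpy as np


def cytotrace_sketch(X, n_top):
    """Illustrative only; `cytotrace_sketch` and `n_top` are hypothetical names."""
    X = np.asarray(X, dtype=float)  # cells x genes, (imputed) expression

    # Step 1: number of genes expressed per cell.
    num_exp_genes = (X > 0).sum(axis=1).astype(float)

    # Step 2: Pearson correlation of every gene with that count.
    Xc = X - X.mean(axis=0)
    yc = num_exp_genes - num_exp_genes.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    with np.errstate(divide="ignore", invalid="ignore"):
        gene_corr = (Xc * yc[:, None]).sum(axis=0) / denom
    # Genes with zero variance give NaN; push them to the bottom of the ranking.
    gene_corr = np.nan_to_num(gene_corr, nan=-1.0)

    # Step 3: indices of the top-correlating genes.
    top_genes = np.argsort(gene_corr)[::-1][:n_top]

    # Step 4: aggregate their expression per cell (arithmetic mean here).
    raw_score = X[:, top_genes].mean(axis=1)
    return num_exp_genes, gene_corr, top_genes, raw_score


# Toy data: 4 cells x 4 genes; the last gene is constant across cells.
X = np.array(
    [
        [1, 0, 0, 3],
        [2, 1, 0, 3],
        [3, 2, 1, 3],
        [4, 3, 2, 3],
    ]
)
num_exp_genes, gene_corr, top_genes, raw_score = cytotrace_sketch(X, n_top=2)
```

In this toy example the constant gene is never selected, and the raw score increases with the number of genes expressed per cell.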

  • layer (str) – Key in anndata.AnnData.layers, or ‘X’ for anndata.AnnData.X, from which to read the expression.

  • aggregation (Literal[‘mean’, ‘median’, ‘hmean’, ‘gmean’]) –

    How to aggregate expression of the top-correlating genes. Valid options are:

    • ’mean’ - arithmetic mean.

    • ’median’ - median.

    • ’hmean’ - harmonic mean.

    • ’gmean’ - geometric mean.
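To make the difference between the four options concrete, the sketch below applies each of them to one cell's expression values using Python's standard `statistics` module. The `AGGREGATORS` mapping and the sample values are hypothetical and only illustrate the definitions; the library may compute these differently.

```python
import statistics

# Hypothetical mapping from option name to a standard-library aggregator.
AGGREGATORS = {
    "mean": statistics.mean,
    "median": statistics.median,
    "hmean": statistics.harmonic_mean,
    "gmean": statistics.geometric_mean,
}

# Expression of three top-correlating genes in a single cell (made-up values).
values = [1.0, 2.0, 4.0]

results = {name: fn(values) for name, fn in AGGREGATORS.items()}
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

For positive values the harmonic mean is never larger than the geometric mean, which is never larger than the arithmetic mean, so ‘hmean’ and ‘gmean’ give less weight to a few highly expressed genes than ‘mean’ does.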

  • use_raw (bool) – Whether to use the anndata.AnnData.raw to compute the number of genes expressed per cell (#genes/cell) and the correlation of gene expression across cells with #genes/cell.

Return type

None

Nothing, just modifies anndata.AnnData.obs with the following keys:

  • ’ct_score’ - the normalized CytoTRACE score.

  • ’ct_pseudotime’ - associated pseudotime, essentially 1 - CytoTRACE score.

  • ’ct_num_exp_genes’ - the number of genes expressed per cell, basis of the CytoTRACE score.

It also modifies anndata.AnnData.var with the following keys:

  • ’ct_gene_corr’ - the correlation as specified above.

  • ’ct_correlates’ - indication of the genes used to compute the CytoTRACE score, i.e. the ones that correlated best with ‘ct_num_exp_genes’.
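The relationship between these keys can be sketched as follows. The helper `summarize_outputs` and its arguments are hypothetical. Min-max scaling to [0, 1] is an assumption about what "normalized" means here, and the boolean mask is an assumption about how the selected genes are marked.

```python
import numpy as np


def summarize_outputs(raw_score, gene_corr, n_top):
    """Hypothetical helper relating the documented keys to each other."""
    raw_score = np.asarray(raw_score, dtype=float)
    gene_corr = np.asarray(gene_corr, dtype=float)

    # Assumption: "normalized" means min-max scaling to [0, 1].
    span = raw_score.max() - raw_score.min()
    if span > 0:
        ct_score = (raw_score - raw_score.min()) / span
    else:
        ct_score = np.zeros_like(raw_score)

    # Pseudotime runs opposite to the score.
    ct_pseudotime = 1.0 - ct_score

    # Assumption: selected genes are flagged with a boolean mask.
    ct_correlates = np.zeros(gene_corr.shape, dtype=bool)
    ct_correlates[np.argsort(gene_corr)[::-1][:n_top]] = True

    return {
        "ct_score": ct_score,
        "ct_pseudotime": ct_pseudotime,
        "ct_gene_corr": gene_corr,
        "ct_correlates": ct_correlates,
    }


out = summarize_outputs(
    raw_score=[0.5, 1.5, 2.5, 3.5],
    gene_corr=[0.9, 0.8, -1.0, 0.95],
    n_top=2,
)
```

Under these assumptions the cell with the highest raw score gets ’ct_score’ 1 and ’ct_pseudotime’ 0, which places the most plastic cells at the start of the pseudotime.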


This will not exactly reproduce the results of the original CytoTRACE algorithm [Gulati et al., 2020] because any normalization and imputation technique can be used here, whereas CytoTRACE has specific built-in methods for both.