cellrank.tl.lineage_drivers

cellrank.tl.lineage_drivers(adata, backward=False, lineages=None, method='fischer', cluster_key=None, clusters=None, layer='X', use_raw=False, confidence_level=0.95, n_perms=1000, seed=None, return_drivers=True, **kwargs)

Compute driver genes per lineage.

Correlates gene expression with lineage probabilities, for a given lineage and set of clusters. Often, it makes sense to restrict this to a set of clusters which are relevant for the specified lineages.
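The effect of restricting the correlation to relevant clusters can be illustrated with a minimal, library-free sketch. The toy values, the cluster labels and the helper pearson are assumptions made for illustration; they are not part of CellRank.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy data (made up): one gene and one lineage, six cells in three clusters.
expression = [0.1, 0.4, 0.9, 1.5, 2.0, 0.2]
lineage_prob = [0.05, 0.20, 0.45, 0.70, 0.95, 0.90]
cluster = ["A", "A", "A", "B", "B", "C"]

# Correlation over all cells.
corr_all = pearson(expression, lineage_prob)

# Correlation restricted to the clusters assumed relevant for this lineage.
keep = [i for i, c in enumerate(cluster) if c in {"A", "B"}]
corr_restricted = pearson(
    [expression[i] for i in keep],
    [lineage_prob[i] for i in keep],
)
```

In this toy example the single cell from cluster "C" weakens the correlation computed over all cells, which is the kind of situation where restricting the clusters helps.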

Parameters
  • adata (anndata.AnnData) – Annotated data object.

  • backward (bool) – Direction of the process.

  • lineages (Union[str, Sequence, None]) – Either a set of lineage names from absorption_probabilities.names or None, in which case all lineages are considered.

  • method (str) –

    Mode to use when calculating p-values and confidence intervals. Can be one of:

    • 'fischer' - use the Fisher transformation [Fischer21].

    • 'perm_test' - use a permutation test.
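The two modes can be sketched in plain Python as follows. This is a rough illustration of the general techniques, assuming a normal approximation for the Fisher-transformed correlation; it is not necessarily how CellRank implements them, and the function names fisher_ci_pval, perm_test_pval and pearson are hypothetical.

```python
import math
import random
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci_pval(r, n, confidence_level=0.95):
    """Two-sided p-value and confidence interval for a correlation r from n samples.

    Assumes |r| < 1 and n > 3; atanh is undefined at r = +/-1.
    """
    z = math.atanh(r)                # Fisher transformation of r
    se = 1.0 / math.sqrt(n - 3)      # approximate standard error of z
    norm = NormalDist()
    pval = 2.0 * (1.0 - norm.cdf(abs(z) / se))
    q = norm.inv_cdf(0.5 + confidence_level / 2.0)
    # Transform the interval bounds back to the correlation scale.
    return pval, math.tanh(z - q * se), math.tanh(z + q * se)


def perm_test_pval(x, y, n_perms=1000, seed=None):
    """Two-sided permutation p-value for the correlation between x and y."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perms):
        rng.shuffle(y_perm)          # break any association between x and y
        if abs(pearson(x, y_perm)) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perms + 1)
```

The Fisher variant is cheap because it needs only the correlation and the sample size, whereas the permutation variant recomputes the correlation n_perms times and depends on seed for reproducibility.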

Returns

DataFrame of shape (n_genes, n_lineages * 5) containing the following columns, one set per lineage:

  • {lineage} corr - correlation between the gene expression and absorption probabilities.

  • {lineage} pval - calculated p-values for a two-sided test.

  • {lineage} qval - p-values corrected using the Benjamini-Hochberg method at level 0.05.

  • {lineage} ci low - lower bound of the confidence_level correlation confidence interval.

  • {lineage} ci high - upper bound of the confidence_level correlation confidence interval.

Returned only if return_drivers=True.
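The relationship between the pval and qval columns can be illustrated with a minimal sketch of the Benjamini-Hochberg procedure. The function name benjamini_hochberg and the toy p-values are assumptions for illustration; the library's own correction routine may differ in detail.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


# Toy p-values (made up), one per gene.
pvals = [0.001, 0.04, 0.03, 0.2]
qvals = benjamini_hochberg(pvals)

# Genes whose adjusted p-value passes the 0.05 level.
significant = [q <= 0.05 for q in qvals]
```

Note that a gene with a raw p-value below 0.05 can still fail after correction, as the second and third toy genes do here.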

Return type

pandas.DataFrame

References

Fischer21

Fisher, R. A. (1921), On the “probable error” of a coefficient of correlation deduced from a small sample, Metron 1, 3–32.