cellrank.tl.lineage_drivers

cellrank.tl.lineage_drivers(adata, backward=False, lineages=None, method=TestMethod.FISCHER, cluster_key=None, clusters=None, layer='X', use_raw=False, confidence_level=0.95, n_perms=1000, seed=None, **kwargs)

Compute driver genes per lineage.

Correlates gene expression with lineage probabilities, for a given lineage and set of clusters. Often, it makes sense to restrict this to a set of clusters which are relevant for the specified lineages.

Parameters
  • adata (anndata.AnnData) – Annotated data object.

  • backward (bool) – Direction of the process.

  • lineages (Union[str, Sequence[str], None]) – Lineage names from absorption_probabilities. If None, use all lineages.

  • method (Literal['fischer', 'perm_test']) –

    Mode to use when calculating p-values and confidence intervals. Valid options are:

    • 'fischer' - use the Fisher transformation [Fisher, 1921].

    • 'perm_test' - use a permutation test.

  • cluster_key (Optional[str]) – Key in anndata.AnnData.obs containing the cluster annotations. The names given in clusters are looked up in this annotation.

  • clusters (Union[str, Sequence[str], None]) – Restrict the correlations to these clusters.

  • layer (str) – Key from anndata.AnnData.layers from which to get the expression. If None or 'X', use anndata.AnnData.X.

  • use_raw (bool) – Whether or not to use anndata.AnnData.raw to correlate gene expression.

  • confidence_level (float) – Confidence level for the confidence interval calculation. Must be in the interval [0, 1].

  • n_perms (int) – Number of permutations to use when method = 'perm_test'.

  • seed (Optional[int]) – Random seed when method = 'perm_test'.

  • show_progress_bar – Whether to show a progress bar. Disabling it may slightly improve performance.

  • n_jobs – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.

  • backend – Which backend to use for parallelization. See joblib.Parallel for valid options.
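The two method options can be illustrated for a single gene with a minimal, standard-library sketch. This is not CellRank's implementation: the helper names pearson, fisher_ci and perm_test_pval are hypothetical, the inputs are assumed to be non-constant sequences of equal length, and fisher_ci assumes n > 3 and a confidence level strictly between 0 and 1.

```python
import math
import random
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, confidence_level=0.95):
    """Two-sided p-value and confidence interval via the Fisher z-transformation.

    Assumes -1 < r < 1, n > 3 and 0 < confidence_level < 1.
    """
    z = math.atanh(r)                # Fisher transformation of the correlation
    se = 1.0 / math.sqrt(n - 3)      # approximate standard error of z
    norm = NormalDist()
    z_crit = norm.inv_cdf(0.5 + confidence_level / 2.0)
    ci_low = math.tanh(z - z_crit * se)   # back-transform to the correlation scale
    ci_high = math.tanh(z + z_crit * se)
    pval = 2.0 * (1.0 - norm.cdf(abs(z) / se))
    return pval, ci_low, ci_high


def perm_test_pval(x, y, n_perms=1000, seed=None):
    """Two-sided permutation p-value for the correlation of x and y."""
    rng = random.Random(seed)        # seeded generator for reproducibility
    observed = abs(pearson(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perms):
        rng.shuffle(y_perm)          # break the pairing between x and y
        if abs(pearson(x, y_perm)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perms + 1)
```

In this sketch, x would stand for one gene's expression across the selected cells and y for the absorption probabilities towards one lineage. The permutation variant makes no distributional assumption but costs n_perms correlation evaluations per gene.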

Return type

DataFrame

Returns

DataFrame of shape (n_genes, n_lineages * 5) containing the following five columns for each lineage:

  • {lineage}_corr - correlation between the gene expression and absorption probabilities.

  • {lineage}_pval - calculated p-values for the two-sided test.

  • {lineage}_qval - p-values corrected using the Benjamini-Hochberg method at level 0.05.

  • {lineage}_ci_low - lower bound of the confidence_level correlation confidence interval.

  • {lineage}_ci_high - upper bound of the confidence_level correlation confidence interval.
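How the {lineage}_qval column relates to {lineage}_pval can be sketched with a plain-Python Benjamini-Hochberg adjustment. The function name benjamini_hochberg is hypothetical, and this is an illustration of the standard procedure rather than the code CellRank runs.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


# Hypothetical p-values for four genes of one lineage.
pvals = [0.01, 0.04, 0.03, 0.005]
qvals = benjamini_hochberg(pvals)
# Genes that pass the 0.05 level mentioned above.
significant = [q <= 0.05 for q in qvals]
```

Each adjusted value is at least as large as its raw p-value, so filtering on the q-value column is the more conservative way to select driver genes.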