cellrank.tl.estimators.GPCCA.compute_lineage_drivers
- GPCCA.compute_lineage_drivers(lineages=None, method=TestMethod.FISCHER, cluster_key=None, clusters=None, layer=None, use_raw=False, confidence_level=0.95, n_perms=1000, seed=None, **kwargs)
Compute driver genes per lineage.
Correlates gene expression with lineage probabilities, for a given lineage and set of clusters. Often, it makes sense to restrict this to a set of clusters which are relevant for the specified lineages.
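As a rough illustration of the idea above, the following sketch correlates each gene with the probabilities of a single lineage, using only the cells that belong to the selected clusters. It is not the library's implementation: it assumes Pearson correlation, and the names `pearson`, `correlate_genes`, the dict-based data layout and the toy values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either input is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def correlate_genes(expression, lineage_probs, cell_clusters=None, clusters=None):
    """Correlate every gene with one lineage, optionally within some clusters.

    expression:    dict mapping gene name -> per-cell expression values
    lineage_probs: per-cell probabilities for a single lineage
    cell_clusters: per-cell cluster labels (only needed if `clusters` is given)
    clusters:      labels of the clusters to keep, or None to keep all cells
    """
    if clusters is None:
        keep = list(range(len(lineage_probs)))
    else:
        wanted = set(clusters)
        keep = [i for i, label in enumerate(cell_clusters) if label in wanted]
    probs = [lineage_probs[i] for i in keep]
    return {
        gene: pearson([values[i] for i in keep], probs)
        for gene, values in expression.items()
    }


# Toy data: six cells, two genes, two clusters (all values are made up).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 0.0, 9.0],
    "geneB": [4.0, 3.0, 2.0, 1.0, 5.0, 5.0],
}
lineage_probs = [0.1, 0.2, 0.3, 0.4, 0.9, 0.8]
cell_clusters = ["a", "a", "a", "a", "b", "b"]

# Restrict the correlation to cluster "a" only.
corr = correlate_genes(expression, lineage_probs, cell_clusters, clusters=["a"])
```

Within cluster "a", geneA rises with the lineage probability and geneB falls, so restricting to that cluster gives a clear positive and a clear negative correlation that the two cells of cluster "b" would otherwise blur.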
- Parameters
lineages (Union[str, Sequence, None]) – Lineage names from absorption_probabilities. If None, use all lineages.
method (Literal['fischer', 'perm_test']) – Mode to use when calculating p-values and confidence intervals. Valid options are:
'fischer' - use the Fisher transformation [Fisher, 1921].
'perm_test' - use a permutation test.
cluster_key (Optional[str]) – Key from anndata.AnnData.obs to obtain cluster annotations. These are considered for clusters.
clusters (Union[str, Sequence, None]) – Restrict the correlations to these clusters.
layer (Optional[str]) – Key from anndata.AnnData.layers from which to get the expression. If None or 'X', use anndata.AnnData.X.
use_raw (bool) – Whether or not to use anndata.AnnData.raw to correlate gene expression.
confidence_level (float) – Confidence level for the confidence interval calculation. Must be in interval [0, 1].
n_perms (int) – Number of permutations to use when method = 'perm_test'.
seed (Optional[int]) – Random seed when method = 'perm_test'.
.show_progress_bar – Whether to show a progress bar. Disabling it may slightly improve performance.
n_jobs – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.
backend – Which backend to use for parallelization. See joblib.Parallel for valid options.
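To make the two `method` options and the roles of `confidence_level`, `n_perms` and `seed` more concrete, here is a minimal sketch of both approaches for a single gene. It is an illustration under stated assumptions, not the library's code: it assumes Pearson correlation, uses the usual normal approximation with standard error 1/sqrt(n - 3) for the Fisher transformation, and uses one common permutation p-value convention. The function names `fisher_test` and `perm_test` are hypothetical, and the sketch does not show how confidence intervals are obtained in permutation mode.

```python
import random
from math import atanh, erfc, sqrt, tanh
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def fisher_test(r, n, confidence_level=0.95):
    """Two-sided p-value and confidence interval via the Fisher transformation.

    Requires -1 < r < 1, n > 3 and 0 < confidence_level < 1 in this sketch.
    """
    z = atanh(r)              # Fisher z-transformation of the correlation
    se = 1.0 / sqrt(n - 3)    # approximate standard error of z
    # Two-sided normal tail probability, 2 * (1 - Phi(|z| / se)).
    pval = erfc(abs(z) / se / sqrt(2.0))
    crit = NormalDist().inv_cdf(0.5 + confidence_level / 2.0)
    # Build the interval on the z scale, then map it back to correlations.
    return pval, tanh(z - crit * se), tanh(z + crit * se)


def perm_test(x, y, n_perms=1000, seed=None):
    """Two-sided permutation p-value for the correlation of x and y."""
    rng = random.Random(seed)   # a fixed seed makes the result reproducible
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perms):
        rng.shuffle(shuffled)   # break any association between x and y
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # The +1 terms keep the estimate strictly positive.
    return (hits + 1) / (n_perms + 1)


# Toy data: expression of one gene (x) and lineage probabilities (y).
x = [float(i) for i in range(10)]
y = [1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 8.1, 8.8, 10.0]

r = pearson(x, y)
pval, ci_low, ci_high = fisher_test(r, len(x), confidence_level=0.95)
perm_pval = perm_test(x, y, n_perms=1000, seed=0)
```

The Fisher route is a closed-form approximation and therefore cheap, while the permutation route costs one correlation per permutation per gene, which is presumably why the progress bar and parallelization options are relevant.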
- Return type
pandas.DataFrame
- Returns
DataFrame of shape (n_genes, n_lineages * 5) containing the following columns for each lineage:
{lineage}_corr - correlation between the gene expression and absorption probabilities.
{lineage}_pval - calculated p-values for the two-sided test.
{lineage}_qval - corrected p-values using the Benjamini-Hochberg method at level 0.05.
{lineage}_ci_low - lower bound of the confidence_level correlation confidence interval.
{lineage}_ci_high - upper bound of the confidence_level correlation confidence interval.
Also updates the following field:
lineage_drivers - the same pandas.DataFrame as described above.
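The `{lineage}_qval` column holds Benjamini-Hochberg corrected p-values. For readers unfamiliar with the procedure, the sketch below shows the standard adjustment on a plain list of p-values. It is a generic illustration, not the library's code; the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    qvals = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # keeping a running minimum so the q-values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        qvals[i] = running_min
    return qvals


pvals = [0.01, 0.04, 0.03, 0.005]
qvals = benjamini_hochberg(pvals)   # approximately [0.02, 0.04, 0.04, 0.02]

# Genes whose q-value does not exceed the chosen level (0.05 here) are kept.
significant = [i for i, q in enumerate(qvals) if q <= 0.05]
```

The adjusted values themselves do not depend on the level. The level only matters when deciding which genes to call significant, as in the last line.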