cellrank.estimators.GPCCA#
 class cellrank.estimators.GPCCA(object, **kwargs)[source]#
Generalized Perron Cluster Cluster Analysis [Reuter et al., 2018] as implemented in pyGPCCA.
Coarse-grains a discrete Markov chain into a set of macrostates and computes coarse-grained transition probabilities among the macrostates. Each macrostate corresponds to an area of the state space, i.e. to a subset of cells. The assignment is soft, i.e. each cell is assigned to every macrostate with a certain weight, where weights sum to one per cell. Macrostates are computed by maximizing the ‘crispness’, which can be thought of as a measure of minimal overlap between macrostates in a certain inner-product sense. Once the macrostates have been computed, we project the large transition matrix onto a coarse-grained transition matrix among the macrostates via a Galerkin projection. This projection is based on invariant subspaces of the original transition matrix which are obtained using the real Schur decomposition [Reuter et al., 2018].
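The projection step can be illustrated on a toy chain. The following is a minimal sketch, assuming a hand-written soft membership matrix `chi` and a uniform input distribution `eta` (both hypothetical; in GPCCA the memberships are derived from the Schur vectors, which is not reproduced here). It uses the weighted Galerkin-style formula P_c = (chi^T D chi)^{-1} chi^T D P chi with D = diag(eta).

```python
import numpy as np

# Toy 4-state chain with two metastable groups: {0, 1} and {2, 3}.
P = np.array([
    [0.90, 0.08, 0.01, 0.01],
    [0.08, 0.90, 0.01, 0.01],
    [0.01, 0.01, 0.90, 0.08],
    [0.01, 0.01, 0.08, 0.90],
])

# Hypothetical soft membership matrix (cells x macrostates); rows sum to one.
chi = np.array([
    [0.95, 0.05],
    [0.90, 0.10],
    [0.10, 0.90],
    [0.05, 0.95],
])

# Input distribution over cells (uniform here), used as a diagonal weighting.
eta = np.full(P.shape[0], 1.0 / P.shape[0])
D = np.diag(eta)

# Projection: solve (chi^T D chi) P_c = chi^T D P chi for P_c.
P_c = np.linalg.solve(chi.T @ D @ chi, chi.T @ D @ P @ chi)
print(P_c)
```

Because the rows of `chi` and `P` each sum to one, the rows of `P_c` also sum to one. Individual entries are not guaranteed to be non-negative for arbitrary memberships.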
 Parameters:
object (Union[str, bool, ndarray, spmatrix, AnnData, KernelExpression]) – Can be one of the following types:
anndata.AnnData - annotated data object.
scipy.sparse.spmatrix, numpy.ndarray - row-normalized transition matrix.
cellrank.kernels.KernelExpression - kernel expression.
str - key in anndata.AnnData.obsp where the transition matrix is stored. adata must be provided in this case.
bool - directionality of the transition matrix that will be used to infer its storage location. If None, the directionality will be determined automatically. adata must be provided in this case.
kwargs (Any) – Keyword arguments for cellrank.kernels.PrecomputedKernel.
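As an illustration of the numpy.ndarray input type, the sketch below builds a row-normalized matrix from hypothetical random affinities. The helper `build_estimator` is an illustrative name, not part of the library; it imports cellrank lazily and simply passes the matrix as `object`, as the parameter list above describes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical non-negative cell-cell affinities for 5 cells.
weights = rng.random((5, 5))

# Row-normalize so that every row is a probability distribution.
T = weights / weights.sum(axis=1, keepdims=True)


def build_estimator(transition_matrix):
    """Wrap a row-normalized matrix in a GPCCA estimator (requires cellrank)."""
    import cellrank as cr  # imported lazily so the rest runs without it

    return cr.estimators.GPCCA(transition_matrix)
```

Calling `build_estimator(T)` is only expected to work in an environment where cellrank is installed.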
Attributes table#
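absorption_probabilities - Absorption probabilities.
absorption_times - Mean and variance of the time until absorption.
adata - Annotated data object.
backward - Direction of kernel.
coarse_T - Coarse-grained transition matrix.
coarse_initial_distribution - Coarse-grained initial distribution.
coarse_stationary_distribution - Coarse-grained stationary distribution.
eigendecomposition - Eigendecomposition of transition_matrix.
kernel - Underlying kernel expression.
lineage_drivers - Potential lineage drivers.
macrostates - Macrostates of the transition matrix.
macrostates_memberships - Macrostate membership matrix.
params - Estimator parameters.
priming_degree - Priming degree.
schur_matrix - Schur matrix.
schur_vectors - Real Schur vectors of the transition matrix.
shape - Shape of the kernel.
terminal_states - Categorical annotation of terminal states.
terminal_states_memberships - Terminal state membership matrix.
terminal_states_probabilities - Aggregated probability of cells to be in terminal states.
transition_matrix - Transition matrix of kernel.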
Methods table#

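compute_absorption_probabilities - Compute absorption probabilities.
compute_eigendecomposition - Compute eigendecomposition of transition_matrix.
compute_lineage_drivers - Compute driver genes per lineage.
compute_lineage_priming - Compute the degree of lineage priming.
compute_macrostates - Compute the macrostates.
compute_schur - Compute Schur decomposition.
compute_terminal_states - Compute terminal states of the process.
copy - Return a copy of self.
fit - Prepare self for terminal states prediction.
from_adata - Deserialize self from anndata.AnnData.
plot_absorption_probabilities - Plot continuous or categorical observations in an embedding or along pseudotime.
plot_coarse_T - Plot the coarse-grained transition matrix between macrostates.
plot_lineage_drivers - Plot lineage drivers discovered by compute_lineage_drivers().
plot_lineage_drivers_correlation - Show scatter plot of gene-correlations between two lineages.
plot_macrostate_composition - Plot stacked histogram of macrostates over categorical annotations.
plot_macrostates - Plot continuous or categorical observations in an embedding or along pseudotime.
plot_schur_matrix - Plot the Schur matrix.
plot_spectrum - Plot the top eigenvalues in real or complex plane.
plot_terminal_states - Plot continuous or categorical observations in an embedding or along pseudotime.
predict - Automatically select terminal states from macrostates.
read - Deserialize self from a file.
rename_terminal_states - Rename categories in terminal_states.
set_terminal_states - Manually define terminal states.
set_terminal_states_from_macrostates - Manually select terminal states from macrostates.
to_adata - Serialize self to anndata.AnnData.
write - Serialize self to a file.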
Attributes#
absorption_probabilities#
 GPCCA.absorption_probabilities#
Absorption probabilities.
Informally, given a (finite, discrete) Markov chain with a set of transient states \(T\) and a set of absorbing states \(A\), the absorption probability for cell \(i\) from \(T\) to reach cell \(j\) from \(A\) is the probability that a random walk initialized in \(i\) will reach absorbing state \(j\).
In our context, states correspond to cells; in particular, absorbing states correspond to cells in terminal states.
 Return type:
Optional[Lineage]
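The definition above can be made concrete with the textbook linear system for absorbing chains. This is a minimal sketch on a hypothetical 5-state gambler's-ruin chain, not the library's solver: with Q the transient-to-transient block and R the transient-to-absorbing block of the transition matrix, the absorption probabilities B solve (I - Q) B = R.

```python
import numpy as np

# Gambler's-ruin chain on states 0..4; states 0 and 4 are absorbing.
P = np.zeros((5, 5))
P[0, 0] = P[4, 4] = 1.0
for i in (1, 2, 3):
    P[i, i - 1] = P[i, i + 1] = 0.5

transient = [1, 2, 3]
absorbing = [0, 4]

Q = P[np.ix_(transient, transient)]  # transient -> transient block
R = P[np.ix_(transient, absorbing)]  # transient -> absorbing block

# Solve (I - Q) B = R; B[i, j] is the probability that a walk started in
# transient state i is absorbed in absorbing state j.
B = np.linalg.solve(np.eye(len(transient)) - Q, R)
print(B)
```

Each row of `B` sums to one, because every walk started in a transient state is eventually absorbed.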
absorption_times#
adata#
backward#
coarse_T#
coarse_initial_distribution#
coarse_stationary_distribution#
eigendecomposition#
 GPCCA.eigendecomposition#
Eigendecomposition of transition_matrix.
For non-symmetric real matrices, left and right eigenvectors will in general be different and complex. We compute both left and right eigenvectors.
 Return type:
 Returns:
A dictionary with the following keys:
‘D’ - the eigenvalues.
‘eigengap’ - the eigengap.
‘params’ - parameters used for the computation.
‘V_l’ - left eigenvectors (optional).
‘V_r’ - right eigenvectors (optional).
‘stationary_dist’ - stationary distribution of transition_matrix, if present.
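To show how left and right eigenvectors and the stationary distribution relate, here is a small dense sketch with NumPy on a hypothetical 3-state chain. The library uses a sparse solver where possible; this example only illustrates the quantities named in the dictionary above.

```python
import numpy as np

P = np.array([
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.3, 0.7],
])

# Right eigenvectors satisfy P @ v = lambda * v.
evals, V_r = np.linalg.eig(P)
# Left eigenvectors are right eigenvectors of the transpose: w @ P = lambda * w.
evals_l, V_l = np.linalg.eig(P.T)

# Sort both by largest real part (the 'LR' ordering).
order = np.argsort(-evals.real)
order_l = np.argsort(-evals_l.real)
evals, V_r = evals[order], V_r[:, order]
V_l = V_l[:, order_l]

# The left eigenvector for eigenvalue 1, normalized to sum to one,
# is the stationary distribution of the chain.
pi = np.real(V_l[:, 0])
pi = pi / pi.sum()
print(pi)
```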
kernel#
lineage_drivers#
 GPCCA.lineage_drivers#
Potential lineage drivers.
Computes Pearson correlation of each gene with fate probabilities for every terminal state. High Pearson correlation indicates potential lineage drivers. Also computes p-values and confidence intervals.
 Return type:
 Returns:
Dataframe of shape (n_genes, n_lineages * 5) containing the following columns, one for each lineage:
{lineage}_corr - correlation between the gene expression and absorption probabilities.
{lineage}_pval - calculated p-values for double-sided test.
{lineage}_qval - corrected p-values using Benjamini-Hochberg method at level 0.05.
{lineage}_ci_low - lower bound of the confidence_level correlation confidence interval.
{lineage}_ci_high - upper bound of the confidence_level correlation confidence interval.
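The correlation and confidence-interval columns can be sketched as follows. This is an illustrative computation on synthetic data, assuming the standard Fisher z-transformation with a normal approximation; the exact test statistic used by the library may differ, and the helper name `pearson_with_ci` is hypothetical.

```python
import math
from statistics import NormalDist

import numpy as np

rng = np.random.default_rng(0)
n_cells, n_genes = 200, 3
fate_prob = rng.random(n_cells)             # hypothetical fate probabilities, one lineage
expr = rng.normal(size=(n_cells, n_genes))  # hypothetical expression matrix
expr[:, 0] += 3.0 * fate_prob               # make gene 0 correlate with the lineage


def pearson_with_ci(x, y, confidence_level=0.95):
    """Pearson r between x and y with a Fisher-transformation confidence interval."""
    r = float(np.corrcoef(x, y)[0, 1])
    z = math.atanh(r)                 # Fisher z-transform
    se = 1.0 / math.sqrt(len(x) - 3)  # approximate standard error of z
    z_crit = NormalDist().inv_cdf(0.5 + confidence_level / 2.0)
    ci_low = math.tanh(z - z_crit * se)
    ci_high = math.tanh(z + z_crit * se)
    # Two-sided p-value under the normal approximation for z.
    pval = 2.0 * (1.0 - NormalDist().cdf(abs(z) / se))
    return r, pval, ci_low, ci_high


results = [pearson_with_ci(expr[:, g], fate_prob) for g in range(n_genes)]
for g, (r, pval, lo, hi) in enumerate(results):
    print(f"gene_{g}: corr={r:.3f} pval={pval:.3g} ci=[{lo:.3f}, {hi:.3f}]")
```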
macrostates#
macrostates_memberships#
params#
priming_degree#
 GPCCA.priming_degree#
Priming degree.
Given a cell \(i\) and a set of terminal states, this quantifies how committed vs. naive cell \(i\) is, i.e. its degree of pluripotency. Low values correspond to naive cells (high degree of pluripotency), high values correspond to committed cells (low degree of pluripotency).
schur_matrix#
 GPCCA.schur_matrix#
Schur matrix.
The real Schur decomposition is a generalization of the eigendecomposition and can be computed for any real-valued, square matrix \(A\). It is given by \(A = Q R Q^T\), where \(Q\) contains the real Schur vectors and \(R\) is the Schur matrix. \(Q\) is orthogonal and \(R\) is quasi-upper triangular with 1x1 and 2x2 blocks on the diagonal. If PETSc and SLEPc are installed, only the leading Schur vectors are computed.
schur_vectors#
 GPCCA.schur_vectors#
Real Schur vectors of the transition matrix.
The real Schur decomposition is a generalization of the eigendecomposition and can be computed for any real-valued, square matrix \(A\). It is given by \(A = Q R Q^T\), where \(Q\) contains the real Schur vectors and \(R\) is the Schur matrix. \(Q\) is orthogonal and \(R\) is quasi-upper triangular with 1x1 and 2x2 blocks on the diagonal. If PETSc and SLEPc are installed, only the leading Schur vectors are computed.
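The factorization \(A = Q R Q^T\) can be checked numerically with SciPy's dense routine. This sketch computes a full, unsorted real Schur decomposition of a hypothetical 3-state chain; it does not reproduce the sorted or partial decompositions that the estimator relies on.

```python
import numpy as np
from scipy.linalg import schur

P = np.array([
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.3, 0.7],
])

# R is the (quasi-upper triangular) Schur matrix, Q holds the Schur vectors.
R, Q = schur(P, output="real")

print(np.round(R, 3))
print(np.allclose(Q @ R @ Q.T, P))
```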
shape#
terminal_states#
terminal_states_memberships#
terminal_states_probabilities#
transition_matrix#
Methods#
compute_absorption_probabilities#
 GPCCA.compute_absorption_probabilities(keys=None, solver='gmres', use_petsc=True, time_to_absorption=None, n_jobs=None, backend='loky', show_progress_bar=True, tol=1e-06, preconditioner=None)#
Compute absorption probabilities.
For each cell, this computes the probability of being absorbed in any of the terminal_states. In particular, this corresponds to the probability that a random walk initialized in transient cell \(i\) will reach any cell from a fixed terminal state before reaching a cell from any other terminal state.
 Parameters:
keys (Optional[Sequence[str]]) – Terminal states for which to compute the absorption probabilities. If None, use all states defined in terminal_states.
solver (Union[str, Literal[‘direct’, ‘gmres’, ‘lgmres’, ‘bicgstab’, ‘gcrotmk’]]) – Solver to use for the linear problem. Options are ‘direct’, ‘gmres’, ‘lgmres’, ‘bicgstab’ or ‘gcrotmk’ when use_petsc = False, or one of petsc4py.PETSc.KSP.Type otherwise. Information on the scipy iterative solvers can be found in scipy.sparse.linalg; for the petsc4py solvers, see the petsc4py documentation.
use_petsc (bool) – Whether to use solvers from petsc4py or scipy. Recommended for large problems. If no installation is found, defaults to scipy.sparse.linalg.gmres().
time_to_absorption (Union[Literal[‘all’], Sequence[Union[str, Sequence[str]]], Dict[Union[str, Sequence[str]], Literal[‘mean’, ‘var’]], None]) – Whether to compute mean time to absorption and its variance to specific absorbing states.
If a dict, can be specified as {'Alpha': 'var', ...} to also compute variance. When states are a tuple, time to absorption will be computed to the subset of these states, such as [('Alpha', 'Beta'), ...] or {('Alpha', 'Beta'): 'mean', ...}. Can be specified as 'all' to compute it to any absorbing state in keys, which is more efficient than listing all absorbing states explicitly.
It might be beneficial to disable the progress bar with show_progress_bar = False because of the many solves.
n_jobs (Optional[int]) – Number of parallel jobs to use when using an iterative solver.
backend (Literal[‘loky’, ‘multiprocessing’, ‘threading’]) – Which backend to use for multiprocessing. See joblib.Parallel for valid options.
show_progress_bar (bool) – Whether to show progress bar. Only used when solver != 'direct'.
tol (float) – Convergence tolerance for the iterative solver. The default is fine for most cases; only consider decreasing this for severely ill-conditioned matrices.
preconditioner (Optional[str]) – Preconditioner to use, only available when use_petsc = True. For valid options, see the PETSc documentation. We recommend the ‘ilu’ preconditioner for badly conditioned problems.
 Return type:
 Returns:
Nothing, just updates the following fields:
absorption_probabilities - Absorption probabilities.
absorption_times - Mean and variance of the time until absorption. Only if time_to_absorption is specified.
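The mean and variance of the time until absorption follow from the fundamental matrix of an absorbing chain. The sketch below uses the classical formulas t = N 1 and var = (2N - I) t - t^2 (elementwise square) with N = (I - Q)^{-1}, on the transient block of a hypothetical symmetric random walk with absorbing ends. It uses a dense inverse for clarity rather than the iterative solvers listed above.

```python
import numpy as np

# Transient block of a symmetric random walk on 0..4 with absorbing ends;
# rows and columns correspond to states 1, 2 and 3.
Q = np.array([
    [0.0, 0.5, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 0.5, 0.0],
])
I = np.eye(3)

N = np.linalg.inv(I - Q)                  # fundamental matrix
t_mean = N @ np.ones(3)                   # expected number of steps until absorption
t_var = (2 * N - I) @ t_mean - t_mean**2  # variance of the number of steps

print(t_mean)  # [3. 4. 3.]
print(t_var)   # [8. 8. 8.]
```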
compute_eigendecomposition#
 GPCCA.compute_eigendecomposition(k=20, which='LR', alpha=1.0, only_evals=False, ncv=None)#
Compute eigendecomposition of transition_matrix.
Uses a sparse implementation, if possible, and only computes the top \(k\) eigenvectors to speed up the computation. Computes both left and right eigenvectors.
 Parameters:
k (int) – Number of eigenvectors or eigenvalues to compute.
which (Literal[‘LR’, ‘LM’]) – How to sort the eigenvalues. Valid options are:
‘LR’ - the largest real part.
‘LM’ - the largest magnitude.
alpha (float) – Used to compute the eigengap. alpha is the weight given to the deviation of an eigenvalue from one.
only_evals (bool) – Whether to compute only eigenvalues.
 Return type:
 Returns:
Nothing, just updates the following field:
eigendecomposition - Eigendecomposition of transition_matrix.
compute_lineage_drivers#
 GPCCA.compute_lineage_drivers(lineages=None, method=TestMethod.FISCHER, cluster_key=None, clusters=None, layer=None, use_raw=False, confidence_level=0.95, n_perms=1000, seed=None, **kwargs)#
Compute driver genes per lineage.
Correlates gene expression with lineage probabilities, for a given lineage and set of clusters. Often, it makes sense to restrict this to a set of clusters which are relevant for the specified lineages.
 Parameters:
lineages (Union[str, Sequence, None]) – Lineage names from absorption_probabilities. If None, use all lineages.
method (Literal[‘fischer’, ‘perm_test’]) – Mode to use when calculating p-values and confidence intervals. Valid options are:
‘fischer’ - use Fisher transformation [Fisher, 1921].
‘perm_test’ - use permutation test.
cluster_key (Optional[str]) – Key from anndata.AnnData.obs to obtain cluster annotations. These are considered for clusters.
clusters (Union[str, Sequence, None]) – Restrict the correlations to these clusters.
layer (Optional[str]) – Key from anndata.AnnData.layers from which to get the expression. If None or ‘X’, use anndata.AnnData.X.
use_raw (bool) – Whether or not to use anndata.AnnData.raw to correlate gene expression.
confidence_level (float) – Confidence level for the confidence interval calculation. Must be in interval [0, 1].
n_perms (int) – Number of permutations to use when method = 'perm_test'.
seed (Optional[int]) – Random seed when method = 'perm_test'.
show_progress_bar – Whether to show a progress bar. Disabling it may slightly improve performance.
n_jobs – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.
backend – Which backend to use for parallelization. See joblib.Parallel for valid options.
 Return type:
 Returns:
Dataframe of shape (n_genes, n_lineages * 5) containing the following columns, one for each lineage:
{lineage}_corr - correlation between the gene expression and absorption probabilities.
{lineage}_pval - calculated p-values for double-sided test.
{lineage}_qval - corrected p-values using Benjamini-Hochberg method at level 0.05.
{lineage}_ci_low - lower bound of the confidence_level correlation confidence interval.
{lineage}_ci_high - upper bound of the confidence_level correlation confidence interval.
Also updates the following field:
lineage_drivers - the same pandas.DataFrame as described above.
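The {lineage}_qval column is described as a Benjamini-Hochberg correction. A minimal standalone sketch of that procedure is shown below; the function name `benjamini_hochberg` and the example p-values are hypothetical and this is not the library's own code path.

```python
import numpy as np


def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values (q-values)."""
    pvals = np.asarray(pvals, dtype=float)
    m = pvals.size
    order = np.argsort(pvals)
    # Scale the i-th smallest p-value by m / i.
    ranked = pvals[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, starting from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    qvals = np.empty(m)
    qvals[order] = np.clip(ranked, 0.0, 1.0)
    return qvals


pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
qvals = benjamini_hochberg(pvals)
significant = qvals <= 0.05
print(qvals)
print(significant)
```

With these inputs only the first two genes remain significant at level 0.05, even though four raw p-values are below 0.05.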
compute_lineage_priming#
 GPCCA.compute_lineage_priming(method='kl_divergence', early_cells=None)#
Compute the degree of lineage priming.
It returns a score in [0, 1] where 0 stands for naive and 1 stands for committed.
 Parameters:
method (Literal[‘kl_divergence’, ‘entropy’]) – The method used to compute the degree of lineage priming. Valid options are:
‘kl_divergence’ - as in [Velten et al., 2017], computes KL-divergence between the fate probabilities of a cell and the average fate probabilities. Computation of average fate probabilities can be restricted to a set of user-defined early_cells.
‘entropy’ - as in [Setty et al., 2019], computes entropy over a cell’s fate probabilities.
early_cells (Union[Mapping[str, Sequence[str]], Sequence[str], None]) – Cell IDs or a mask marking early cells. If None, use all cells. Only used when method = 'kl_divergence'. If a dict, the key specifies a cluster key in anndata.AnnData.obs and the values specify cluster labels containing early cells.
 Return type:
 Returns:
The priming degree.
Also updates the following field:
priming_degree - Priming degree.
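The two options can be illustrated with raw entropy and KL-divergence values on a hypothetical fate-probability matrix. This sketch does not reproduce how the library rescales the result to [0, 1]; it only shows the underlying quantities, and the helper names are illustrative.

```python
import numpy as np

# Hypothetical fate probabilities (cells x lineages); rows sum to one.
fate_probs = np.array([
    [0.34, 0.33, 0.33],  # nearly uniform: naive
    [0.98, 0.01, 0.01],  # concentrated: committed
    [0.60, 0.30, 0.10],
])


def entropy(p):
    """Shannon entropy of each row (natural log); high entropy means less committed."""
    return -np.sum(p * np.log(p), axis=1)


def kl_to_reference(p, reference):
    """KL divergence of each row from a reference distribution."""
    return np.sum(p * np.log(p / reference), axis=1)


# Average fate probabilities; this could be restricted to early cells instead.
reference = fate_probs.mean(axis=0)

ent = entropy(fate_probs)
kl = kl_to_reference(fate_probs, reference)
print(ent)
print(kl)
```

Note the opposite orientation of the two raw quantities: entropy is highest for the naive cell, so it has to be inverted before it can serve as a commitment score.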
compute_macrostates#
 GPCCA.compute_macrostates(n_states=None, n_cells=30, cluster_key=None, **kwargs)[source]#
Compute the macrostates.
 Parameters:
n_states (Union[int, Sequence[int], None]) – Number of macrostates. If a typing.Sequence, use the minChi criterion [Reuter et al., 2018]. If None, use the eigengap heuristic.
n_cells (Optional[int]) – Number of most likely cells from each macrostate to select.
cluster_key (Optional[str]) – If a key to cluster labels is given, names and colors of the states will be associated with the clusters.
kwargs (Any) – Keyword arguments for compute_schur().
 Return type:
 Returns:
Nothing, just updates the following fields:
macrostates - Macrostates of the transition matrix.
macrostates_memberships - Macrostate membership matrix.
coarse_T - Coarse-grained transition matrix.
coarse_initial_distribution - Coarse-grained initial distribution.
coarse_stationary_distribution - Coarse-grained stationary distribution.
schur_vectors - Real Schur vectors of the transition matrix.
schur_matrix - Schur matrix.
eigendecomposition - Eigendecomposition of transition_matrix.
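The role of n_cells can be sketched as picking, for each macrostate, the cells with the highest membership. The example below assumes a random soft membership matrix (hypothetical data) and a plain argsort; the library's own selection logic may handle ties and overlaps differently.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical soft memberships for 100 cells and 3 macrostates; rows sum to one.
memberships = rng.dirichlet(alpha=[0.3, 0.3, 0.3], size=100)
n_cells = 5

# For each macrostate (column), take the n_cells cells with the highest membership.
top_cells = {
    state: np.argsort(-memberships[:, state])[:n_cells]
    for state in range(memberships.shape[1])
}

for state, idx in top_cells.items():
    print(state, idx, np.round(memberships[idx, state], 3))
```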
compute_schur#
 GPCCA.compute_schur(n_components=20, initial_distribution=None, method='krylov', which='LR', alpha=1.0)#
Compute Schur decomposition.
 Parameters:
n_components (int) – Number of Schur vectors to compute.
initial_distribution (Optional[ndarray]) – Input distribution over all cells. If None, uniform distribution is used.
method (Literal[‘krylov’, ‘brandts’]) – Method for calculating the Schur vectors. Valid options are:
‘krylov’ - an iterative procedure that computes a partial, sorted Schur decomposition for large, sparse matrices.
‘brandts’ - full sorted Schur decomposition of a dense matrix.
For benefits of each method, see pygpcca.GPCCA.
which (Literal[‘LR’, ‘LM’]) – How to sort the eigenvalues. Valid options are:
‘LR’ - the largest real part.
‘LM’ - the largest magnitude.
alpha (float) – Used to compute the eigengap. alpha is the weight given to the deviation of an eigenvalue from one.
 Returns:
Nothing, just updates the following fields:
schur_vectors - Real Schur vectors of the transition matrix.
schur_matrix - Schur matrix.
eigendecomposition - Eigendecomposition of transition_matrix.
compute_terminal_states#
 GPCCA.compute_terminal_states(*args, **kwargs)#
Compute terminal states of the process.
This is an alias for predict().
 Parameters:
 Return type:
 Returns:
Nothing, just updates the following fields:
terminal_states - Categorical annotation of terminal states.
terminal_states_probabilities - Aggregated probability of cells to be in terminal states.
copy#
fit#
 GPCCA.fit(n_states=None, n_cells=30, cluster_key=None, **kwargs)[source]#
Prepare self for terminal states prediction.
 Parameters:
n_states (Union[int, Sequence[int], None]) – Number of macrostates. If a typing.Sequence, use the minChi criterion [Reuter et al., 2018]. If None, use the eigengap heuristic.
n_cells (Optional[int]) – Number of most likely cells from each macrostate to select.
cluster_key (Optional[str]) – If a key to cluster labels is given, names and colors of the states will be associated with the clusters.
kwargs (Any) – Keyword arguments for compute_schur().
 Return type:
 Returns:
Nothing, just updates the following fields:
macrostates - Macrostates of the transition matrix.
macrostates_memberships - Macrostate membership matrix.
coarse_T - Coarse-grained transition matrix.
coarse_initial_distribution - Coarse-grained initial distribution.
coarse_stationary_distribution - Coarse-grained stationary distribution.
schur_vectors - Real Schur vectors of the transition matrix.
schur_matrix - Schur matrix.
eigendecomposition - Eigendecomposition of transition_matrix.
from_adata#
 classmethod GPCCA.from_adata(adata, obsp_key)#
Deserialize self from anndata.AnnData.
 Parameters:
adata (anndata.AnnData) – Annotated data object.
obsp_key (str) – Key in anndata.AnnData.obsp where the transition matrix is stored.
 Return type:
BaseEstimator
 Returns:
The deserialized object.
plot_absorption_probabilities#
 GPCCA.plot_absorption_probabilities(states=None, color=None, discrete=True, mode=PlotMode.EMBEDDING, time_key='latent_time', same_plot=True, title=None, cmap='viridis', **kwargs)#
Plot continuous or categorical observations in an embedding or along pseudotime.
 Parameters:
color (Optional[str]) – Key in anndata.AnnData.obs.
discrete (bool) – Whether to plot the data as continuous or discrete observations. If the data cannot be plotted as continuous observations, it will be plotted as discrete.
mode (Literal[‘embedding’, ‘time’]) – Valid options are:
‘embedding’ - plot the embedding while coloring in continuous or categorical observations.
‘time’ - plot the pseudotime on x-axis and the probabilities/memberships on y-axis.
time_key (str) – Key in anndata.AnnData.obs where pseudotime is stored. Only used when mode = 'time'.
title (Union[str, Sequence[str], None]) – Title of the plot(s).
same_plot (bool) – Whether to plot the data on the same plot or not. Only used when mode = 'embedding'. If True and discrete = False, color is ignored.
cmap (str) – Colormap for continuous data.
kwargs (Any) – Keyword arguments for scvelo.pl.scatter().
 Return type:
 Returns:
Nothing, just plots the figure. Optionally saves it based on save.
plot_coarse_T#
 GPCCA.plot_coarse_T(show_stationary_dist=True, show_initial_dist=False, order='stability', cmap='viridis', xtick_rotation=45, annotate=True, show_cbar=True, title=None, figsize=(8, 8), dpi=80, save=None, text_kwargs=mappingproxy({}), **kwargs)[source]#
Plot the coarse-grained transition matrix between macrostates.
 Parameters:
show_stationary_dist (bool) – Whether to show coarse_stationary_distribution, if present.
show_initial_dist (bool) – Whether to show coarse_initial_distribution.
order (Optional[Literal[‘stability’, ‘incoming’, ‘outgoing’, ‘stat_dist’]]) – How to order the coarse-grained transition matrix. Valid options are:
‘stability’ - order by the values on the diagonal.
‘incoming’ - order by the incoming mass, excluding the diagonal.
‘outgoing’ - order by the outgoing mass, excluding the diagonal.
‘stat_dist’ - order by coarse stationary distribution. If not present, use ‘stability’.
cmap (Union[str, ListedColormap]) – Colormap to use.
xtick_rotation (float) – Rotation of ticks on the x-axis.
annotate (bool) – Whether to display the text on each cell.
show_cbar (bool) – Whether to show colorbar.
dpi (int) – Dots per inch.
save (Union[str, Path, None]) – Filename where to save the plot.
text_kwargs (Mapping[str, Any]) – Keyword arguments for matplotlib.pyplot.text().
kwargs (Any) – Keyword arguments for matplotlib.pyplot.imshow().
 Return type:
 Returns:
Nothing, just plots the figure. Optionally saves it based on save.
plot_lineage_drivers#
 GPCCA.plot_lineage_drivers(lineage, n_genes=8, use_raw=False, ascending=False, ncols=None, title_fmt='{gene} qval={qval:.4e}', figsize=None, dpi=None, save=None, **kwargs)#
Plot lineage drivers discovered by compute_lineage_drivers().
 Parameters:
lineage (str) – Lineage for which to plot the driver genes.
n_genes (int) – Top most correlated genes to plot.
use_raw (bool) – Whether to access in anndata.AnnData.raw or not.
ascending (bool) – Whether to sort the genes in ascending order.
title_fmt (str) – Title format. Can include {gene}, {pval}, {qval} or {corr}, which will be substituted with the actual values.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
save (Union[str, Path, None]) – Filename where to save the plot.
kwargs (Any) – Keyword arguments for scvelo.pl.scatter().
 Return type:
 Returns:
Nothing, just plots the figure. Optionally saves it based on save.
plot_lineage_drivers_correlation#
 GPCCA.plot_lineage_drivers_correlation(lineage_x, lineage_y, color=None, gene_sets=None, gene_sets_colors=None, use_raw=False, cmap='RdYlBu_r', fontsize=12, adjust_text=False, legend_loc='best', figsize=(4, 4), dpi=None, save=None, show=True, **kwargs)#
Show scatter plot of gene-correlations between two lineages.
Optionally, you can pass a dict of gene names that will be annotated in the plot.
 Parameters:
lineage_x (str) – Name of the lineage on the x-axis.
lineage_y (str) – Name of the lineage on the y-axis.
color (Optional[str]) – Key in anndata.AnnData.var or anndata.AnnData.varm, preferring the former.
gene_sets (Optional[Dict[str, Sequence[str]]]) – Gene sets annotations of the form {‘gene_set_name’: [‘gene_1’, ‘gene_2’], …}.
gene_sets_colors (Optional[Sequence[str]]) – List of colors where each entry corresponds to a gene set from gene_sets. If None and keys in gene_sets correspond to lineage names, use the lineage colors. Otherwise, use default colors.
use_raw (bool) – Whether to access anndata.AnnData.raw or not.
cmap (str) – Colormap to use.
fontsize (int) – Size of the text when plotting gene_sets.
adjust_text (bool) – Whether to automatically adjust text in order to reduce overlap.
legend_loc (Optional[str]) – Position of the legend. If None, don’t show the legend. Only used when gene_sets != None.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
save (Union[str, Path, None]) – Filename where to save the plot.
show (bool) – If False, return matplotlib.pyplot.Axes.
kwargs (Any) – Keyword arguments for scanpy.pl.scatter().
 Return type:
 Returns:
The axes object, if show = False. Otherwise nothing, just plots the figure. Optionally saves it based on save.
Notes
This plot is based on a notebook by Maren Büttner.
plot_macrostate_composition#
 GPCCA.plot_macrostate_composition(key, width=0.8, title=None, labelrot=45, legend_loc='upper right out', figsize=None, dpi=None, save=None, show=True)[source]#
Plot stacked histogram of macrostates over categorical annotations.
 Parameters:
adata (anndata.AnnData) – Annotated data object.
key (str) – Key from anndata.AnnData.obs containing categorical annotations.
width (float) – Bar width in [0, 1].
title (Optional[str]) – Title of the figure. If None, create one automatically.
labelrot (float) – Rotation of labels on x-axis.
legend_loc (Optional[str]) – Position of the legend. If None, don’t show legend.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
save (Union[str, Path, None]) – Filename where to save the plot.
show (bool) – If False, return matplotlib.pyplot.Axes.
 Return type:
 Returns:
The axes object, if show = False. Otherwise nothing, just plots the figure. Optionally saves it based on save.
plot_macrostates#
 GPCCA.plot_macrostates(states=None, color=None, discrete=True, mode=PlotMode.EMBEDDING, time_key='latent_time', same_plot=True, title=None, cmap='viridis', **kwargs)#
Plot continuous or categorical observations in an embedding or along pseudotime.
 Parameters:
color (Optional[str]) – Key in anndata.AnnData.obs.
discrete (bool) – Whether to plot the data as continuous or discrete observations. If the data cannot be plotted as continuous observations, it will be plotted as discrete.
mode (Literal[‘embedding’, ‘time’]) – Valid options are:
‘embedding’ - plot the embedding while coloring in continuous or categorical observations.
‘time’ - plot the pseudotime on x-axis and the probabilities/memberships on y-axis.
time_key (str) – Key in anndata.AnnData.obs where pseudotime is stored. Only used when mode = 'time'.
title (Union[str, Sequence[str], None]) – Title of the plot(s).
same_plot (bool) – Whether to plot the data on the same plot or not. Only used when mode = 'embedding'. If True and discrete = False, color is ignored.
cmap (str) – Colormap for continuous data.
kwargs (Any) – Keyword arguments for scvelo.pl.scatter().
 Return type:
 Returns:
Nothing, just plots the figure. Optionally saves it based on save.
plot_schur_matrix#
 GPCCA.plot_schur_matrix(title='schur matrix', cmap='viridis', figsize=None, dpi=80, save=None, **kwargs)#
Plot the Schur matrix.
 Parameters:
 Returns:
Nothing, just plots the figure. Optionally saves it based on save.
plot_spectrum#
 GPCCA.plot_spectrum(n=None, real_only=None, show_eigengap=True, show_all_xticks=True, legend_loc=None, title=None, marker='.', figsize=(5, 5), dpi=100, save=None, **kwargs)#
Plot the top eigenvalues in real or complex plane.
 Parameters:
n (Optional[int]) – Number of eigenvalues to show. If None, show all that have been computed.
real_only (Optional[bool]) – Whether to plot only the real part of the spectrum. If None, plot real spectrum if no complex eigenvalues are present.
show_eigengap (bool) – When real_only = True, this determines whether to show the inferred eigengap as a dotted line.
show_all_xticks (bool) – When real_only = True, this determines whether to show the indices of all eigenvalues on the x-axis.
legend_loc (Optional[str]) – Location parameter for the legend.
marker (str) – Marker symbol used, valid options can be found in matplotlib.markers.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
dpi (int) – Dots per inch.
save (Union[str, Path, None]) – Filename where to save the plot.
kwargs (Any) – Keyword arguments for matplotlib.pyplot.scatter().
 Return type:
 Returns:
Nothing, just plots the figure. Optionally saves it based on save.
plot_terminal_states#
 GPCCA.plot_terminal_states(states=None, color=None, discrete=True, mode=PlotMode.EMBEDDING, time_key='latent_time', same_plot=True, title=None, cmap='viridis', **kwargs)#
Plot continuous or categorical observations in an embedding or along pseudotime.
 Parameters:
color (Optional[str]) – Key in anndata.AnnData.obs.
discrete (bool) – Whether to plot the data as continuous or discrete observations. If the data cannot be plotted as continuous observations, it will be plotted as discrete.
mode (Literal[‘embedding’, ‘time’]) – Valid options are:
‘embedding’ - plot the embedding while coloring in continuous or categorical observations.
‘time’ - plot the pseudotime on x-axis and the probabilities/memberships on y-axis.
time_key (str) – Key in anndata.AnnData.obs where pseudotime is stored. Only used when mode = 'time'.
title (Union[str, Sequence[str], None]) – Title of the plot(s).
same_plot (bool) – Whether to plot the data on the same plot or not. Only used when mode = 'embedding'. If True and discrete = False, color is ignored.
cmap (str) – Colormap for continuous data.
kwargs (Any) – Keyword arguments for scvelo.pl.scatter().
 Return type:
 Returns:
Nothing, just plots the figure. Optionally saves it based on save.
predict#
 GPCCA.predict(method=TermStatesMethod.STABILITY, n_cells=30, alpha=1, stability_threshold=0.96, n_states=None)[source]#
Automatically select terminal states from macrostates.
 Parameters:
method (Literal[‘stability’, ‘top_n’, ‘eigengap’, ‘eigengap_coarse’]) – How to select the terminal states. Valid options are:
‘eigengap’ - select the number of states based on the eigengap of transition_matrix.
‘eigengap_coarse’ - select the number of states based on the eigengap of the diagonal of coarse_T.
‘top_n’ - select top n_states based on the probability of the diagonal of coarse_T.
‘stability’ - select states which have a stability >= stability_threshold. The stability is given by the diagonal elements of coarse_T.
n_cells (int) – Number of most likely cells from each macrostate to select.
alpha (Optional[float]) – Weight given to the deviation of an eigenvalue from one. Only used when method = 'eigengap' or method = 'eigengap_coarse'.
stability_threshold (float) – Threshold used when method = 'stability'.
n_states (Optional[int]) – Number of states used when method = 'top_n'.
 Return type:
 Returns:
Nothing, just updates the following fields:
terminal_states - Categorical annotation of terminal states.
terminal_states_memberships - Terminal state membership matrix.
terminal_states_probabilities - Aggregated probability of cells to be in terminal states.
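The ‘stability’ and ‘top_n’ rules both read the diagonal of coarse_T. The sketch below applies them to a hypothetical coarse-grained matrix; the macrostate names and values are made up for illustration, and the eigengap-based options are not covered.

```python
import numpy as np

# Hypothetical macrostate names and coarse-grained transition matrix.
names = ["Alpha", "Beta", "Epsilon", "Ductal"]
coarse_T = np.array([
    [0.97, 0.01, 0.01, 0.01],
    [0.02, 0.96, 0.01, 0.01],
    [0.05, 0.05, 0.85, 0.05],
    [0.30, 0.30, 0.20, 0.20],
])
stability = np.diag(coarse_T)  # self-transition probability of each macrostate

# 'stability': keep macrostates whose stability reaches the threshold.
stability_threshold = 0.96
by_threshold = [n for n, s in zip(names, stability) if s >= stability_threshold]

# 'top_n': keep the n_states macrostates with the largest diagonal entries.
n_states = 3
by_top_n = [names[i] for i in np.argsort(-stability)[:n_states]]

print(by_threshold)  # ['Alpha', 'Beta']
print(by_top_n)      # ['Alpha', 'Beta', 'Epsilon']
```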
read#
 static GPCCA.read(fname, adata=None, copy=False)#
Deserialize self from a file.
 Parameters:
fname (Union[str, Path]) – Filename from which to read the object.
adata (Optional[AnnData]) – anndata.AnnData object to assign to the saved object. Only used when the saved object has adata and it was saved without it.
copy (bool) – Whether to copy adata before assigning it or not. If adata is a view, it is always copied.
 Return type:
IOMixin
 Returns:
The deserialized object.
rename_terminal_states#
 GPCCA.rename_terminal_states(new_names)[source]#
Rename categories in terminal_states.
 Parameters:
new_names (Mapping[str, str]) – Mapping where keys correspond to the old names and the values to the new names. The new names must be unique.
 Return type:
 Returns:
Nothing, just updates the names of:
terminal_states - Categorical annotation of terminal states.
terminal_states_memberships - Terminal state membership matrix.
set_terminal_states#
 GPCCA.set_terminal_states(labels, cluster_key=None, add_to_existing=False, **kwargs)#
Manually define terminal states.
 Parameters:
labels (Union[Series, Dict[str, Sequence[Any]]]) – Defines the terminal states. Valid options are:
categorical pandas.Series where each category corresponds to a terminal state. NaN entries denote cells that do not belong to any terminal state, i.e. these are either initial or transient cells.
dict where keys are terminal states and values are lists of cell barcodes corresponding to annotations in anndata.AnnData.obs_names. If only 1 key is provided, values should correspond to terminal state clusters if a categorical pandas.Series can be found in anndata.AnnData.obs.
cluster_key (Optional[str]) – Key in anndata.AnnData.obs in order to associate names and colors with terminal_states. Each terminal state will be given the name and color corresponding to the cluster it mostly overlaps with.
add_to_existing (bool) – Whether the new terminal states should be added to the existing ones. Cells already assigned to a terminal state will be reassigned to the new terminal state if there’s a conflict between old and new annotations. This throws an error if no previous annotations corresponding to terminal states have been found.
 Return type:
 Returns:
Nothing, just updates the following fields:
terminal_states - Categorical annotation of terminal states.
terminal_states_probabilities - Aggregated probability of cells to be in terminal states.
set_terminal_states_from_macrostates#
 GPCCA.set_terminal_states_from_macrostates(names=None, n_cells=30, **kwargs)[source]#
Manually select terminal states from macrostates.
 Parameters:
names (Union[str, Sequence[str], Mapping[str, str], None]) – Names of the macrostates to be marked as terminal. Multiple states can be combined using ‘,’, such as ["Alpha, Beta", "Epsilon"]. If a dict, keys correspond to the names of the macrostates and the values to the new names. If None, select all macrostates.
n_cells (int) – Number of most likely cells from each macrostate to select.
 Return type:
 Returns:
Nothing, just updates the following fields:
terminal_states - Categorical annotation of terminal states.
terminal_states_probabilities - Aggregated probability of cells to be in terminal states.
terminal_states_memberships - Terminal state membership matrix.
to_adata#
 GPCCA.to_adata(keep=('X', 'raw'), *, copy=True)#
Serialize self to anndata.AnnData.
 Parameters:
keep (Union[Literal[‘all’], Sequence[Literal[‘X’, ‘raw’, ‘layers’, ‘obs’, ‘var’, ‘obsm’, ‘varm’, ‘obsp’, ‘varp’, ‘uns’]]]) – Which attributes to keep from the underlying adata. Valid options are:
‘all’ - keep all attributes specified in the signature.
typing.Sequence - keep only subset of these attributes.
dict - the keys correspond to the attribute names and values to a subset of keys which to keep from this attribute. If the values are specified either as True or ‘all’, everything from this attribute will be kept.
copy (Union[bool, Sequence[Literal[‘X’, ‘raw’, ‘layers’, ‘obs’, ‘var’, ‘obsm’, ‘varm’, ‘obsp’, ‘varp’, ‘uns’]]]) – Whether to copy the data. Can be specified on a per-attribute basis. Useful for attributes that store arrays. Attributes not specified here will not be copied.
 Return type:
 Returns:
adata (anndata.AnnData) – Annotated data object.