Developer API
Kernels
 class cellrank.kernels.Kernel(adata, parent=None, **kwargs)
Base kernel class.
 cbc(source, target, cluster_key, rep, graph_key='distances')
Compute cross-boundary correctness score between source and target cluster.
 Returns:
: Cross-boundary correctness score for each observation.
 abstract compute_transition_matrix(*args, **kwargs)
Compute transition matrix.
 Return type:
KernelExpression
 Returns:
: Modifies transition_matrix and returns self.
 copy(*, deep=False)
Return a copy of self.
 Parameters:
deep (bool) – Whether to use deepcopy().
 Returns:
: Copy of self.
 classmethod from_adata(adata, key, copy=False)
Read the kernel saved using write_to_adata().
 Parameters:
adata (AnnData) – Annotated data object.
key (str) – Key in obsp where the transition matrix is stored. The parameters should be stored in adata.uns['{key}_params'].
copy (bool) – Whether to copy the transition matrix.
 Returns:
: The kernel with explicitly initialized properties:
transition_matrix - the transition matrix.
params - parameters used for computation.
 plot_projection(basis='umap', key_added=None, recompute=False, stream=True, connectivities=None, **kwargs)
Plot transition_matrix as a stream or a grid plot.
 Parameters:
key_added (Optional[str]) – If not None, save the result to adata.obsm['{key_added}']. Otherwise, save the result to 'T_fwd_{basis}' or 'T_bwd_{basis}', depending on the direction.
recompute (bool) – Whether to recompute the projection if it already exists.
stream (bool) – If True, use velocity_embedding_stream(). Otherwise, use velocity_embedding_grid().
connectivities (Optional[spmatrix]) – Connectivity matrix to use for projection. If None, use the ones from the underlying kernel, if possible.
kwargs (Any) – Keyword arguments for the above-mentioned plotting function.
 Returns:
: Nothing, just plots and modifies obsm with a key based on key_added.
 plot_random_walks(n_sims=100, max_iter=0.25, seed=None, successive_hits=0, start_ixs=None, stop_ixs=None, basis='umap', cmap='gnuplot', linewidth=1.0, linealpha=0.3, ixs_legend_loc=None, n_jobs=None, backend='loky', show_progress_bar=True, figsize=None, dpi=None, save=None, **kwargs)
Plot random walks in an embedding.
This method simulates random walks on the Markov chain defined through the corresponding transition matrix. The method is intended to give qualitative rather than quantitative insights into the transition matrix. Random walks are simulated by iteratively choosing the next cell based on the current cell’s transition probabilities.
 Parameters:
n_sims (int) – Number of random walks to simulate.
max_iter (Union[int, float]) – Maximum number of steps of a random walk. If a float, it can be specified as a fraction of the number of cells.
successive_hits (int) – Number of successive hits in the stop_ixs required to stop prematurely.
start_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) – Cells from which to sample the starting points. If None, use all cells. Can be specified as:
dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.
For example, {'dpt_pseudotime': [0, 0.1]} means that starting points for random walks will be sampled uniformly from cells whose pseudotime is in \([0, 0.1]\).
stop_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) – Cells which, when hit, terminate the random walk. If None, terminate after max_iter steps. Can be specified as:
dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.
For example, {'clusters': ['Alpha', 'Beta']} and successive_hits = 3 means that the random walk will stop prematurely after cells in the above specified clusters have been visited 3 times in a row.
cmap (Union[str, LinearSegmentedColormap]) – Colormap for the random walk lines.
linewidth (float) – Width of the random walk lines.
linealpha (float) – Alpha value of the random walk lines.
ixs_legend_loc (Optional[str]) – Legend location for the start/stop indices.
show_progress_bar (bool) – Whether to show a progress bar. Disabling it may slightly improve performance.
n_jobs (Optional[int]) – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.
backend (str) – Which backend to use for parallelization. See Parallel for valid options.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
save (Union[Path, str, None]) – Filename where to save the plot.
 Returns:
: Nothing, just plots the figure. Optionally saves it based on save. For each random walk, the first/last cell is marked by the start/end colors of cmap.
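The simulation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not CellRank's implementation: `simulate_random_walk` is a hypothetical helper, the transition matrix is assumed to be a dense row-normalized array, and the walk is assumed to stop once `max(successive_hits, 1)` consecutive steps land in the stop set.

```python
import numpy as np


def simulate_random_walk(T, start, max_iter=0.25, stop_ixs=None, successive_hits=0, seed=None):
    """Simulate one random walk on a dense, row-normalized transition matrix (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n_cells = T.shape[0]
    # A float ``max_iter`` is interpreted as a fraction of the number of cells.
    if isinstance(max_iter, float):
        max_iter = int(np.ceil(max_iter * n_cells))
    stop = set() if stop_ixs is None else set(stop_ixs)
    # Assumption of this sketch: at least one hit is needed to stop early.
    needed = max(successive_hits, 1)

    path, hits = [start], 0
    for _ in range(max_iter):
        # Choose the next cell from the current cell's transition probabilities.
        nxt = int(rng.choice(n_cells, p=T[path[-1]]))
        path.append(nxt)
        # Count consecutive visits to the stop set; a miss resets the counter.
        hits = hits + 1 if nxt in stop else 0
        if stop and hits >= needed:
            break
    return path


# Deterministic chain 0 -> 1 -> 2, where state 2 is absorbing.
T = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
walk = simulate_random_walk(T, start=0, max_iter=10, stop_ixs=[2], successive_hits=3, seed=0)
```

Because the example chain is deterministic, the walk ends after three consecutive visits to state 2 rather than after all 10 steps.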
 plot_single_flow(cluster, cluster_key, time_key, clusters=None, time_points=None, min_flow=0, remove_empty_clusters=True, ascending=False, legend_loc='upper right out', alpha=0.8, xticks_step_size=1, figsize=None, dpi=None, save=None, show=True)
Visualize outgoing flow from a cluster of cells [Mittnenzweig et al., 2021].
 Parameters:
cluster (str) – Cluster for which to visualize outgoing flow.
time_key (str) – Key in obs where experimental time is stored.
clusters (Optional[Sequence[Any]]) – Visualize flow only for these clusters. If None, use all clusters.
time_points (Optional[Sequence[Union[float, int]]]) – Visualize flow only for these time points. If None, use all time points.
min_flow (float) – Only show flow edges with flow greater than this value. Flow values are always in \([0, 1]\).
remove_empty_clusters (bool) – Whether to remove clusters with no incoming flow edges.
ascending (Optional[bool]) – Whether to sort the clusters by ascending or descending incoming flow. If None, use the order defined by clusters.
xticks_step_size (Optional[int]) – Show only every n-th tick on the x-axis. If None, don’t show any ticks.
legend_loc (Optional[str]) – Position of the legend. If None, do not show the legend.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
dpi – Dots per inch.
save (Union[Path, str, None]) – Filename where to save the plot.
 Returns:
: The axes object, if show = False. Otherwise, nothing, just plots the figure. Optionally saves it based on save.
Notes
This function is a Python reimplementation of the original R function with some minor stylistic differences. This function will not recreate the results from [Mittnenzweig et al., 2021], because there, the Metacell model [Baran et al., 2019] was used to compute the flow, whereas here the transition matrix is used.
 static read(fname, adata=None, copy=False)
Deserialize self from a file.
 Parameters:
fname (Union[str, Path]) – Path from which to read the object.
adata (Optional[AnnData]) – AnnData object to assign to the saved object. Only used when the saved object has adata and it was saved without it.
copy (bool) – Whether to copy adata before assigning it. If adata is a view, it is always copied.
 Return type:
IOMixin
 Returns:
: The deserialized object.
 property transition_matrix: ndarray | csr_matrix
Row-normalized transition matrix.
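Row-normalized means that every row sums to 1, so row i can be read as the probability distribution over the next cell when starting from cell i. A minimal sketch of this property, where `row_normalize` is a hypothetical helper and not part of CellRank:

```python
import numpy as np


def row_normalize(W):
    """Scale each row of a non-negative matrix so that it sums to 1 (hypothetical helper)."""
    W = np.asarray(W, dtype=float)
    row_sums = W.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("Every row needs at least one positive entry.")
    return W / row_sums


T_norm = row_normalize([[1, 1, 2], [0, 3, 1], [5, 0, 5]])
```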
 write_to_adata(key=None, copy=False)
Write the transition matrix and parameters used for computation to the underlying adata object.
 Returns:
: Updates the adata with the following fields:
obsp['{key}'] - the transition matrix.
uns['{key}_params'] - parameters used for the calculation.
 class cellrank.kernels.UnidirectionalKernel(adata, parent=None, **kwargs)
 cbc(source, target, cluster_key, rep, graph_key='distances')
Compute cross-boundary correctness score between source and target cluster.
 Returns:
: Cross-boundary correctness score for each observation.
 abstract compute_transition_matrix(*args, **kwargs)
Compute transition matrix.
 Return type:
KernelExpression
 Returns:
: Modifies transition_matrix and returns self.
 copy(*, deep=False)
Return a copy of self.
 Parameters:
deep (bool) – Whether to use deepcopy().
 Returns:
: Copy of self.
 classmethod from_adata(adata, key, copy=False)
Read the kernel saved using write_to_adata().
 Parameters:
adata (AnnData) – Annotated data object.
key (str) – Key in obsp where the transition matrix is stored. The parameters should be stored in adata.uns['{key}_params'].
copy (bool) – Whether to copy the transition matrix.
 Returns:
: The kernel with explicitly initialized properties:
transition_matrix - the transition matrix.
params - parameters used for computation.
 plot_projection(basis='umap', key_added=None, recompute=False, stream=True, connectivities=None, **kwargs)
Plot transition_matrix as a stream or a grid plot.
 Parameters:
key_added (Optional[str]) – If not None, save the result to adata.obsm['{key_added}']. Otherwise, save the result to 'T_fwd_{basis}' or 'T_bwd_{basis}', depending on the direction.
recompute (bool) – Whether to recompute the projection if it already exists.
stream (bool) – If True, use velocity_embedding_stream(). Otherwise, use velocity_embedding_grid().
connectivities (Optional[spmatrix]) – Connectivity matrix to use for projection. If None, use the ones from the underlying kernel, if possible.
kwargs (Any) – Keyword arguments for the above-mentioned plotting function.
 Returns:
: Nothing, just plots and modifies obsm with a key based on key_added.
 plot_random_walks(n_sims=100, max_iter=0.25, seed=None, successive_hits=0, start_ixs=None, stop_ixs=None, basis='umap', cmap='gnuplot', linewidth=1.0, linealpha=0.3, ixs_legend_loc=None, n_jobs=None, backend='loky', show_progress_bar=True, figsize=None, dpi=None, save=None, **kwargs)
Plot random walks in an embedding.
This method simulates random walks on the Markov chain defined through the corresponding transition matrix. The method is intended to give qualitative rather than quantitative insights into the transition matrix. Random walks are simulated by iteratively choosing the next cell based on the current cell’s transition probabilities.
 Parameters:
n_sims (int) – Number of random walks to simulate.
max_iter (Union[int, float]) – Maximum number of steps of a random walk. If a float, it can be specified as a fraction of the number of cells.
successive_hits (int) – Number of successive hits in the stop_ixs required to stop prematurely.
start_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) – Cells from which to sample the starting points. If None, use all cells. Can be specified as:
dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.
For example, {'dpt_pseudotime': [0, 0.1]} means that starting points for random walks will be sampled uniformly from cells whose pseudotime is in \([0, 0.1]\).
stop_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) – Cells which, when hit, terminate the random walk. If None, terminate after max_iter steps. Can be specified as:
dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.
For example, {'clusters': ['Alpha', 'Beta']} and successive_hits = 3 means that the random walk will stop prematurely after cells in the above specified clusters have been visited 3 times in a row.
cmap (Union[str, LinearSegmentedColormap]) – Colormap for the random walk lines.
linewidth (float) – Width of the random walk lines.
linealpha (float) – Alpha value of the random walk lines.
ixs_legend_loc (Optional[str]) – Legend location for the start/stop indices.
show_progress_bar (bool) – Whether to show a progress bar. Disabling it may slightly improve performance.
n_jobs (Optional[int]) – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.
backend (str) – Which backend to use for parallelization. See Parallel for valid options.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
save (Union[Path, str, None]) – Filename where to save the plot.
 Returns:
: Nothing, just plots the figure. Optionally saves it based on save. For each random walk, the first/last cell is marked by the start/end colors of cmap.
 plot_single_flow(cluster, cluster_key, time_key, clusters=None, time_points=None, min_flow=0, remove_empty_clusters=True, ascending=False, legend_loc='upper right out', alpha=0.8, xticks_step_size=1, figsize=None, dpi=None, save=None, show=True)
Visualize outgoing flow from a cluster of cells [Mittnenzweig et al., 2021].
 Parameters:
cluster (str) – Cluster for which to visualize outgoing flow.
time_key (str) – Key in obs where experimental time is stored.
clusters (Optional[Sequence[Any]]) – Visualize flow only for these clusters. If None, use all clusters.
time_points (Optional[Sequence[Union[float, int]]]) – Visualize flow only for these time points. If None, use all time points.
min_flow (float) – Only show flow edges with flow greater than this value. Flow values are always in \([0, 1]\).
remove_empty_clusters (bool) – Whether to remove clusters with no incoming flow edges.
ascending (Optional[bool]) – Whether to sort the clusters by ascending or descending incoming flow. If None, use the order defined by clusters.
xticks_step_size (Optional[int]) – Show only every n-th tick on the x-axis. If None, don’t show any ticks.
legend_loc (Optional[str]) – Position of the legend. If None, do not show the legend.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
dpi – Dots per inch.
save (Union[Path, str, None]) – Filename where to save the plot.
 Returns:
: The axes object, if show = False. Otherwise, nothing, just plots the figure. Optionally saves it based on save.
Notes
This function is a Python reimplementation of the original R function with some minor stylistic differences. This function will not recreate the results from [Mittnenzweig et al., 2021], because there, the Metacell model [Baran et al., 2019] was used to compute the flow, whereas here the transition matrix is used.
 static read(fname, adata=None, copy=False)
Deserialize self from a file.
 Parameters:
fname (Union[str, Path]) – Path from which to read the object.
adata (Optional[AnnData]) – AnnData object to assign to the saved object. Only used when the saved object has adata and it was saved without it.
copy (bool) – Whether to copy adata before assigning it. If adata is a view, it is always copied.
 Return type:
IOMixin
 Returns:
: The deserialized object.
 property transition_matrix: ndarray | csr_matrix
Row-normalized transition matrix.
 write_to_adata(key=None, copy=False)
Write the transition matrix and parameters used for computation to the underlying adata object.
 Returns:
: Updates the adata with the following fields:
obsp['{key}'] - the transition matrix.
uns['{key}_params'] - parameters used for the calculation.
 class cellrank.kernels.BidirectionalKernel(*args, backward=False, **kwargs)
 cbc(source, target, cluster_key, rep, graph_key='distances')
Compute cross-boundary correctness score between source and target cluster.
 Returns:
: Cross-boundary correctness score for each observation.
 abstract compute_transition_matrix(*args, **kwargs)
Compute transition matrix.
 Return type:
KernelExpression
 Returns:
: Modifies transition_matrix and returns self.
 copy(*, deep=False)
Return a copy of self.
 Parameters:
deep (bool) – Whether to use deepcopy().
 Returns:
: Copy of self.
 classmethod from_adata(adata, key, copy=False)
Read the kernel saved using write_to_adata().
 Parameters:
adata (AnnData) – Annotated data object.
key (str) – Key in obsp where the transition matrix is stored. The parameters should be stored in adata.uns['{key}_params'].
copy (bool) – Whether to copy the transition matrix.
 Returns:
: The kernel with explicitly initialized properties:
transition_matrix - the transition matrix.
params - parameters used for computation.
 plot_projection(basis='umap', key_added=None, recompute=False, stream=True, connectivities=None, **kwargs)
Plot transition_matrix as a stream or a grid plot.
 Parameters:
key_added (Optional[str]) – If not None, save the result to adata.obsm['{key_added}']. Otherwise, save the result to 'T_fwd_{basis}' or 'T_bwd_{basis}', depending on the direction.
recompute (bool) – Whether to recompute the projection if it already exists.
stream (bool) – If True, use velocity_embedding_stream(). Otherwise, use velocity_embedding_grid().
connectivities (Optional[spmatrix]) – Connectivity matrix to use for projection. If None, use the ones from the underlying kernel, if possible.
kwargs (Any) – Keyword arguments for the above-mentioned plotting function.
 Returns:
: Nothing, just plots and modifies obsm with a key based on key_added.
 plot_random_walks(n_sims=100, max_iter=0.25, seed=None, successive_hits=0, start_ixs=None, stop_ixs=None, basis='umap', cmap='gnuplot', linewidth=1.0, linealpha=0.3, ixs_legend_loc=None, n_jobs=None, backend='loky', show_progress_bar=True, figsize=None, dpi=None, save=None, **kwargs)
Plot random walks in an embedding.
This method simulates random walks on the Markov chain defined through the corresponding transition matrix. The method is intended to give qualitative rather than quantitative insights into the transition matrix. Random walks are simulated by iteratively choosing the next cell based on the current cell’s transition probabilities.
 Parameters:
n_sims (int) – Number of random walks to simulate.
max_iter (Union[int, float]) – Maximum number of steps of a random walk. If a float, it can be specified as a fraction of the number of cells.
successive_hits (int) – Number of successive hits in the stop_ixs required to stop prematurely.
start_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) – Cells from which to sample the starting points. If None, use all cells. Can be specified as:
dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.
For example, {'dpt_pseudotime': [0, 0.1]} means that starting points for random walks will be sampled uniformly from cells whose pseudotime is in \([0, 0.1]\).
stop_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) – Cells which, when hit, terminate the random walk. If None, terminate after max_iter steps. Can be specified as:
dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.
For example, {'clusters': ['Alpha', 'Beta']} and successive_hits = 3 means that the random walk will stop prematurely after cells in the above specified clusters have been visited 3 times in a row.
cmap (Union[str, LinearSegmentedColormap]) – Colormap for the random walk lines.
linewidth (float) – Width of the random walk lines.
linealpha (float) – Alpha value of the random walk lines.
ixs_legend_loc (Optional[str]) – Legend location for the start/stop indices.
show_progress_bar (bool) – Whether to show a progress bar. Disabling it may slightly improve performance.
n_jobs (Optional[int]) – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.
backend (str) – Which backend to use for parallelization. See Parallel for valid options.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
save (Union[Path, str, None]) – Filename where to save the plot.
 Returns:
: Nothing, just plots the figure. Optionally saves it based on save. For each random walk, the first/last cell is marked by the start/end colors of cmap.
 plot_single_flow(cluster, cluster_key, time_key, clusters=None, time_points=None, min_flow=0, remove_empty_clusters=True, ascending=False, legend_loc='upper right out', alpha=0.8, xticks_step_size=1, figsize=None, dpi=None, save=None, show=True)
Visualize outgoing flow from a cluster of cells [Mittnenzweig et al., 2021].
 Parameters:
cluster (str) – Cluster for which to visualize outgoing flow.
time_key (str) – Key in obs where experimental time is stored.
clusters (Optional[Sequence[Any]]) – Visualize flow only for these clusters. If None, use all clusters.
time_points (Optional[Sequence[Union[float, int]]]) – Visualize flow only for these time points. If None, use all time points.
min_flow (float) – Only show flow edges with flow greater than this value. Flow values are always in \([0, 1]\).
remove_empty_clusters (bool) – Whether to remove clusters with no incoming flow edges.
ascending (Optional[bool]) – Whether to sort the clusters by ascending or descending incoming flow. If None, use the order defined by clusters.
xticks_step_size (Optional[int]) – Show only every n-th tick on the x-axis. If None, don’t show any ticks.
legend_loc (Optional[str]) – Position of the legend. If None, do not show the legend.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
dpi – Dots per inch.
save (Union[Path, str, None]) – Filename where to save the plot.
 Returns:
: The axes object, if show = False. Otherwise, nothing, just plots the figure. Optionally saves it based on save.
Notes
This function is a Python reimplementation of the original R function with some minor stylistic differences. This function will not recreate the results from [Mittnenzweig et al., 2021], because there, the Metacell model [Baran et al., 2019] was used to compute the flow, whereas here the transition matrix is used.
 static read(fname, adata=None, copy=False)
Deserialize self from a file.
 Parameters:
fname (Union[str, Path]) – Path from which to read the object.
adata (Optional[AnnData]) – AnnData object to assign to the saved object. Only used when the saved object has adata and it was saved without it.
copy (bool) – Whether to copy adata before assigning it. If adata is a view, it is always copied.
 Return type:
IOMixin
 Returns:
: The deserialized object.
 property transition_matrix: ndarray | csr_matrix
Row-normalized transition matrix.
 write_to_adata(key=None, copy=False)
Write the transition matrix and parameters used for computation to the underlying adata object.
 Returns:
: Updates the adata with the following fields:
obsp['{key}'] - the transition matrix.
uns['{key}_params'] - parameters used for the calculation.
Similarity
 class cellrank.kernels.utils.SimilarityABC
Base class for all similarity schemes.
 abstract __call__(v, D, softmax_scale=1.0)
Compute transition probability of a cell to its nearest neighbors using RNA velocity.
 Parameters:
v (ndarray) – Array of shape (n_genes,) or (n_neighbors, n_genes) containing the velocity vector(s). The second case is used for the backward process.
D (ndarray) – Array of shape (n_neighbors, n_genes) corresponding to the transcriptomic displacement of the current cell with respect to its nearest neighbors.
softmax_scale (float) – Scaling factor for the softmax function.
 Returns:
: The probability and unscaled logits arrays of shape (n_neighbors,).
 class cellrank.kernels.utils.Cosine
Cosine similarity scheme as defined in eq. (4.7) [Li et al., 2021].
\[v(s_i, s_j) := g(\cos(\delta_{i, j}, v_i))\]
where \(v_i\) is the velocity vector of cell \(i\), \(\delta_{i, j}\) corresponds to the transcriptional displacement between cells \(i\) and \(j\), and \(g\) is a softmax function with some scaling parameter.
 __call__(v, D, softmax_scale=1.0)
Compute transition probability of a cell to its nearest neighbors using RNA velocity.
 Parameters:
v (ndarray) – Array of shape (n_genes,) or (n_neighbors, n_genes) containing the velocity vector(s). The second case is used for the backward process.
D (ndarray) – Array of shape (n_neighbors, n_genes) corresponding to the transcriptomic displacement of the current cell with respect to its nearest neighbors.
softmax_scale (float) – Scaling factor for the softmax function.
 Returns:
: The probability and unscaled logits arrays of shape (n_neighbors,).
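To make the formula concrete, the following is a minimal NumPy sketch of a cosine scheme for the forward case only, where v has shape (n_genes,). The names `softmax` and `cosine_scheme` are hypothetical, and treating zero-length vectors as having zero similarity is an assumption of this sketch rather than documented behavior.

```python
import numpy as np


def softmax(x):
    """Softmax over a 1-D array, shifted by the maximum for numerical stability."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def cosine_scheme(v, D, softmax_scale=1.0):
    """Return (probabilities, unscaled logits), each of shape (n_neighbors,)."""
    denom = np.linalg.norm(v) * np.linalg.norm(D, axis=1)
    # Assumption: a zero-length velocity or displacement yields a logit of 0.
    logits = np.divide(D @ v, denom, out=np.zeros_like(denom), where=denom > 0)
    return softmax(softmax_scale * logits), logits


v = np.array([1.0, 0.0])
D = np.array([[2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]])
probs, logits = cosine_scheme(v, D)
```

The neighbor whose displacement is aligned with the velocity receives the highest probability, the orthogonal one an intermediate value, and the opposing one the lowest.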
 hessian(v, D, softmax_scale=1.0)
Compute the Hessian.
 Returns:
: The full Hessian of shape (n_neighbors, n_genes, n_genes) or only its diagonal of shape (n_neighbors, n_genes).
 class cellrank.kernels.utils.Correlation
Pearson correlation scheme as defined in eq. (4.8) [Li et al., 2021].
\[v(s_i, s_j) := g(\operatorname{corr}(\delta_{i, j}, v_i))\]
where \(v_i\) is the velocity vector of cell \(i\), \(\delta_{i, j}\) corresponds to the transcriptional displacement between cells \(i\) and \(j\), and \(g\) is a softmax function with some scaling parameter.
 __call__(v, D, softmax_scale=1.0)
Compute transition probability of a cell to its nearest neighbors using RNA velocity.
 Parameters:
v (ndarray) – Array of shape (n_genes,) or (n_neighbors, n_genes) containing the velocity vector(s). The second case is used for the backward process.
D (ndarray) – Array of shape (n_neighbors, n_genes) corresponding to the transcriptomic displacement of the current cell with respect to its nearest neighbors.
softmax_scale (float) – Scaling factor for the softmax function.
 Returns:
: The probability and unscaled logits arrays of shape (n_neighbors,).
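Pearson correlation equals the cosine similarity of mean-centered vectors, which the sketch below uses directly. It covers only the forward case with v of shape (n_genes,). `correlation_scheme` is a hypothetical name, and mapping constant (zero-variance) vectors to a logit of 0 is an assumption of this sketch.

```python
import numpy as np


def correlation_scheme(v, D, softmax_scale=1.0):
    """Return (probabilities, unscaled logits), each of shape (n_neighbors,)."""
    # Center along the gene axis; correlation is the cosine of centered vectors.
    v_c = v - v.mean()
    D_c = D - D.mean(axis=1, keepdims=True)
    denom = np.linalg.norm(v_c) * np.linalg.norm(D_c, axis=1)
    # Assumption: constant vectors (zero variance) yield a logit of 0.
    logits = np.divide(D_c @ v_c, denom, out=np.zeros_like(denom), where=denom > 0)
    z = softmax_scale * logits
    e = np.exp(z - z.max())  # shift by the maximum for numerical stability
    return e / e.sum(), logits


v_corr = np.array([1.0, 2.0, 4.0])
D_corr = np.array([[2.0, 4.0, 8.0], [3.0, 2.0, 1.0], [1.0, 5.0, 2.0]])
probs_corr, logits_corr = correlation_scheme(v_corr, D_corr)
```

The logits can be cross-checked against `np.corrcoef`, which computes the same quantity one pair at a time.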
 hessian(v, D, softmax_scale=1.0)
Compute the Hessian.
 Returns:
: The full Hessian of shape (n_neighbors, n_genes, n_genes) or only its diagonal of shape (n_neighbors, n_genes).
 class cellrank.kernels.utils.DotProduct
Dot product scheme as defined in eq. (4.9) [Li et al., 2021].
\[v(s_i, s_j) := g(\delta_{i, j}^T v_i)\]
where \(v_i\) is the velocity vector of cell \(i\), \(\delta_{i, j}\) corresponds to the transcriptional displacement between cells \(i\) and \(j\), and \(g\) is a softmax function with some scaling parameter.
 __call__(v, D, softmax_scale=1.0)
Compute transition probability of a cell to its nearest neighbors using RNA velocity.
 Parameters:
v (ndarray) – Array of shape (n_genes,) or (n_neighbors, n_genes) containing the velocity vector(s). The second case is used for the backward process.
D (ndarray) – Array of shape (n_neighbors, n_genes) corresponding to the transcriptomic displacement of the current cell with respect to its nearest neighbors.
softmax_scale (float) – Scaling factor for the softmax function.
 Returns:
: The probability and unscaled logits arrays of shape (n_neighbors,).
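Unlike the cosine and correlation schemes, the dot product is sensitive to vector magnitudes, so a longer velocity vector sharpens the resulting distribution. A minimal sketch for the forward case, with `dot_product_scheme` as a hypothetical name:

```python
import numpy as np


def dot_product_scheme(v, D, softmax_scale=1.0):
    """Return (probabilities, unscaled logits), each of shape (n_neighbors,)."""
    logits = D @ v
    z = softmax_scale * logits
    e = np.exp(z - z.max())  # shift by the maximum for numerical stability
    return e / e.sum(), logits


v_dot = np.array([1.0, 0.0])
D_dot = np.array([[2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]])
p_small, logits_dot = dot_product_scheme(v_dot, D_dot)
# Scaling the velocity by 10 concentrates more mass on the best-aligned neighbor.
p_large, _ = dot_product_scheme(10 * v_dot, D_dot)
```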
 hessian(v, D, softmax_scale=1.0)
Compute the Hessian.
 Returns:
: The full Hessian of shape (n_neighbors, n_genes, n_genes) or only its diagonal of shape (n_neighbors, n_genes).
Threshold Scheme
 class cellrank.kernels.utils.ThresholdSchemeABC
Base class for all connectivity biasing schemes.
 abstract __call__(cell_pseudotime, neigh_pseudotime, neigh_conn, **kwargs)
Calculate biased connections for a given cell.
 Returns:
: Array of shape (n_neighbors,) containing the biased connectivities.
 bias_knn(conn, pseudotime, n_jobs=None, backend='loky', show_progress_bar=True, **kwargs)
Bias cell-cell connectivities of a KNN graph.
 Parameters:
conn (csr_matrix) – Sparse matrix of shape (n_cells, n_cells) containing the nearest neighbor connectivities.
pseudotime (ndarray) – Pseudotemporal ordering of cells.
show_progress_bar (bool) – Whether to show a progress bar. Disabling it may slightly improve performance.
n_jobs (Optional[int]) – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.
backend (str) – Which backend to use for parallelization. See Parallel for valid options.
kwargs (Any)
 Returns:
: The biased connectivities.
 class cellrank.kernels.utils.HardThresholdScheme
Thresholding scheme inspired by Palantir [Setty et al., 2019].
Note that this won’t exactly reproduce the original Palantir results:
Palantir computes the kNN graph in a scaled space of diffusion components.
Palantir uses its own pseudotime to bias the kNN graph, which is not implemented here.
Palantir uses a slightly different mechanism to ensure the graph remains connected when removing edges that point into the “pseudotime past”.
 __call__(cell_pseudotime, neigh_pseudotime, neigh_conn, frac_to_keep=0.3)
Convert the undirected graph of cell-cell similarities into a directed one by removing “past” edges.
This uses a pseudotemporal measure to remove graph edges that point into the pseudotime past. For each cell, it keeps the closest neighbors, even if they are in the pseudotime past, to make sure the graph remains connected.
 Parameters:
cell_pseudotime (float) – Pseudotime of the current cell.
neigh_pseudotime (ndarray) – Array of shape (n_neighbors,) containing pseudotime of neighbors.
neigh_conn (ndarray) – Array of shape (n_neighbors,) containing connectivities of the current cell and its neighbors.
frac_to_keep (float) – The frac_to_keep * n_neighbors closest neighbors (according to graph connectivities) are kept, no matter whether they lie in the pseudotemporal past or future. Must be in \([0, 1]\).
 Returns:
: Array of shape (n_neighbors,) containing the biased connectivities.
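The hard thresholding described above can be sketched as follows. This is an illustration under stated assumptions, not CellRank's implementation: `hard_threshold` is a hypothetical name, the number of always-kept neighbors is rounded down, and removed edges are set to 0.

```python
import numpy as np


def hard_threshold(cell_pseudotime, neigh_pseudotime, neigh_conn, frac_to_keep=0.3):
    """Zero out connectivities to past neighbors, except for the closest ones."""
    # Assumption: the number of always-kept neighbors is rounded down.
    k = int(np.floor(frac_to_keep * len(neigh_conn)))
    # Indices of the k strongest connections; these survive regardless of pseudotime.
    closest = np.argsort(-neigh_conn)[:k]
    # Neighbors at the same or a later pseudotime are kept.
    keep = neigh_pseudotime >= cell_pseudotime
    keep[closest] = True
    return np.where(keep, neigh_conn, 0.0)


biased_hard = hard_threshold(
    cell_pseudotime=0.5,
    neigh_pseudotime=np.array([0.1, 0.9, 0.2, 0.7]),
    neigh_conn=np.array([0.9, 0.5, 0.1, 0.3]),
    frac_to_keep=0.25,
)
```

In this example the first neighbor lies in the past but is kept because it is the closest one, while the third neighbor, also in the past, is removed.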
 class cellrank.kernels.utils.SoftThresholdScheme
Thresholding scheme inspired by [Stassen et al., 2021].
The idea is to downweight edges that point against the direction of increasing pseudotime. Essentially, the further “behind” a query cell is in pseudotime with respect to the current reference cell, the more its graph connectivity is penalized.
 __call__(cell_pseudotime, neigh_pseudotime, neigh_conn, b=10.0, nu=0.5)
Bias the connectivities by downweighting ones to past cells.
This function uses a generalized logistic function to weight the past connectivities.
 Parameters:
cell_pseudotime (float) – Pseudotime of the current cell.
neigh_pseudotime (ndarray) – Array of shape (n_neighbors,) containing pseudotime of neighbors.
neigh_conn (ndarray) – Array of shape (n_neighbors,) containing connectivities of the current cell and its neighbors.
b (float) – The growth rate of the generalized logistic function.
nu (float) – Affects near which asymptote maximum growth occurs.
 Returns:
: Array of shape (n_neighbors,) containing the biased connectivities.
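A soft scheme of this kind might look like the sketch below. The exact parametrization of the generalized logistic function is an assumption here (the source only names the parameters b and nu), and `soft_threshold` is a hypothetical name.

```python
import numpy as np


def soft_threshold(cell_pseudotime, neigh_pseudotime, neigh_conn, b=10.0, nu=0.5):
    """Downweight connectivities to neighbors that lie in the pseudotime past."""
    dt = cell_pseudotime - neigh_pseudotime  # positive for neighbors in the past
    weights = np.ones_like(neigh_conn, dtype=float)
    past = dt > 0
    # Assumed generalized logistic form: the weight decays toward 0 as dt grows.
    weights[past] = 2.0 / (1.0 + np.exp(b * dt[past])) ** (1.0 / nu)
    return neigh_conn * weights


biased_soft = soft_threshold(
    cell_pseudotime=0.5,
    neigh_pseudotime=np.array([0.6, 0.4, 0.1]),
    neigh_conn=np.array([1.0, 1.0, 1.0]),
)
```

Future neighbors keep their connectivity, while past neighbors are penalized more strongly the further behind they are.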
 class cellrank.kernels.utils.CustomThresholdScheme(callback)[source]#
Class that wraps a user supplied scheme.
 Parameters:
callback (
Callable
[[float
,ndarray
,ndarray
,ndarray
,Any
],ndarray
]) – Function which returns the biased connectivities.
 __call__(cell_pseudotime, neigh_pseudotime, neigh_conn, **kwargs)[source]#
Calculate biased connections for a given cell.
 Parameters:
cell_pseudotime (float) – Pseudotime of the current cell.
neigh_pseudotime (ndarray) – Array of shape (n_neighbors,) containing pseudotime of neighbors.
neigh_conn (ndarray) – Array of shape (n_neighbors,) containing connectivities of the current cell and its neighbors.
kwargs (Any) – Additional keyword arguments.
 Return type:
 Returns:
: Array of shape (n_neighbors,) containing the biased connectivities.
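A user-supplied scheme is a function that follows the `__call__` signature above. The sketch below is a hypothetical callback (`exp_decay` and its `rate` argument are illustrative names, not CellRank API) that applies an exponential penalty to past neighbors. The commented line shows how it would be wrapped, assuming CellRank is installed.

```python
import numpy as np


def exp_decay(cell_pseudotime, neigh_pseudotime, neigh_conn, rate=5.0, **kwargs):
    """Hypothetical callback: exponentially penalize neighbors in the past."""
    # How far each neighbor lies behind the current cell; 0 for equal or future neighbors.
    lag = np.clip(cell_pseudotime - np.asarray(neigh_pseudotime, dtype=float), 0.0, None)
    return np.asarray(neigh_conn, dtype=float) * np.exp(-rate * lag)


# With CellRank available, the callback would be wrapped as:
# scheme = cellrank.kernels.utils.CustomThresholdScheme(exp_decay)
out = exp_decay(1.0, np.array([0.5, 1.0, 1.5]), np.array([0.2, 0.3, 0.5]))
```

Only the first neighbor is in the past here, so only its connectivity is reduced.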
Estimators#
 class cellrank.estimators.BaseEstimator(object, **kwargs)[source]#
Base class for all estimators.
 Parameters:
object (Union[str, bool, ndarray, spmatrix, AnnData, KernelExpression]) – Can be one of the following types:
* AnnData - annotated data object.
* KernelExpression - kernel expression.
* str - key in obsp where the transition matrix is stored and adata must be provided in this case.
* bool - directionality of the transition matrix that will be used to infer its storage location. If None, the directionality will be determined automatically and adata must be provided in this case.
kwargs (Any) – Keyword arguments for the PrecomputedKernel.
 copy(*, deep=False)[source]#
Return a copy of self.
 classmethod from_adata(adata, obsp_key)[source]#
Deserialize self from AnnData.
 Parameters:
 Return type:
 Returns:
: The deserialized object.
 abstract predict(*args, **kwargs)[source]#
Run the prediction.
 Parameters:
 Return type:
 Returns:
: Self.
 to_adata(keep=('X', 'raw'), *, copy=True)[source]#
Serialize self to AnnData.
 Parameters:
keep (Union[Literal['all'], Sequence[Literal['X', 'raw', 'layers', 'obs', 'var', 'obsm', 'varm', 'obsp', 'varp', 'uns']]]) – Which attributes to keep from the underlying adata. Valid options are:
* 'all' - keep all attributes specified in the signature.
* Sequence - keep only a subset of these attributes.
* dict - the keys correspond to the attribute names and the values to a subset of keys to keep from this attribute. If the values are specified either as True or 'all', everything from this attribute will be kept.
copy (Union[bool, Sequence[Literal['X', 'raw', 'layers', 'obs', 'var', 'obsm', 'varm', 'obsp', 'varp', 'uns']]]) – Whether to copy the data. Can be specified on a per-attribute basis. Useful for attributes that are array-like.
 Return type:
 Returns:
: Annotated data object.
 class cellrank.estimators.TermStatesEstimator(object, **kwargs)[source]#
Base class for all estimators predicting the initial and terminal states.
 Parameters:
object (Union[AnnData, ndarray, spmatrix, KernelExpression]) – Can be one of the following types:
* AnnData - annotated data object.
* KernelExpression - kernel expression.
* str - key in obsp where the transition matrix is stored and adata must be provided in this case.
* bool - directionality of the transition matrix that will be used to infer its storage location. If None, the directionality will be determined automatically and adata must be provided in this case.
kwargs (Any) – Keyword arguments for the PrecomputedKernel.
 property initial_states: Series | None#
Categorical annotation of initial states.
By default, all transient cells will be labeled as NaN.
 plot_macrostates(which, states=None, color=None, discrete=True, mode=PlotMode.EMBEDDING, time_key='latent_time', same_plot=True, title=None, cmap='viridis', **kwargs)[source]#
Plot macrostates on an embedding or along pseudotime.
 Parameters:
which (Literal['all', 'initial', 'terminal']) – Which macrostates to plot. Valid options are:
* 'all' - plot all macrostates.
* 'initial' - plot macrostates marked as initial_states.
* 'terminal' - plot macrostates marked as terminal_states.
states (Union[str, Sequence[str], None]) – Subset of the macrostates to show. If None, plot all macrostates.
color (Optional[str]) – Key in obs or var used to color the observations.
discrete (bool) – Whether to plot the data as continuous or discrete observations. If the data cannot be plotted as continuous observations, it will be plotted as discrete.
mode (Literal['embedding', 'time']) – Whether to plot the probabilities in an embedding or along the pseudotime.
time_key (str) – Key in obs where pseudotime is stored. Only used when mode = 'time'.
title (Union[str, Sequence[str], None]) – Title of the plot.
same_plot (bool) – Whether to plot the data on the same plot or not. Only used when mode = 'embedding'. If True and discrete = False, color is ignored.
cmap (str) – Colormap for continuous annotations.
 Return type:
 Returns:
: Nothing, just plots the figure. Optionally saves it based on save.
 rename_initial_states(old_new)[source]#
Rename the initial_states.
 Parameters:
old_new (Dict[str, str]) – Dictionary that maps old names to unique new names.
 Return type:
 Returns:
: Returns self and updates the following fields:
* initial_states - Categorical annotation of initial states.
 rename_terminal_states(old_new)[source]#
Rename the terminal_states.
 Parameters:
old_new (Dict[str, str]) – Dictionary that maps old names to unique new names.
 Return type:
 Returns:
: Returns self and updates the following fields:
* terminal_states - Categorical annotation of terminal states.
 set_initial_states(states, cluster_key=None, allow_overlap=False, **kwargs)[source]#
Set the initial_states.
 Parameters:
states (Union[Series, Dict[str, Sequence[Any]]]) – Which states to select. Valid options are:
* categorical Series where each category corresponds to an individual state. NaN entries denote cells that do not belong to any state, i.e., transient cells.
* dict where keys are states and values are lists of cell barcodes corresponding to annotations in obs_names. If only 1 key is provided, values should correspond to clusters if a categorical Series can be found in obs.
cluster_key (Optional[str]) – Key in obs to associate names and colors with initial_states. Each state will be given the name and color corresponding to the cluster it mostly overlaps with.
allow_overlap (bool) – Whether to allow overlapping names between initial and terminal states.
kwargs (Any) – Additional keyword arguments.
 Return type:
 Returns:
: Returns self and updates the following fields:
* initial_states - Categorical annotation of initial states.
* initial_states_probabilities - Probability to be an initial state.
 set_terminal_states(states, cluster_key=None, allow_overlap=False, **kwargs)[source]#
Set the terminal_states.
 Parameters:
states (Union[Series, Dict[str, Sequence[Any]]]) – States to select. Valid options are:
* categorical Series where each category corresponds to an individual state. NaN entries denote cells that do not belong to any state, i.e., transient cells.
* dict where keys are states and values are lists of cell barcodes corresponding to annotations in obs_names. If only 1 key is provided, values should correspond to clusters if a categorical Series can be found in obs.
cluster_key (Optional[str]) – Key in obs to associate names and colors with terminal_states. Each state will be given the name and color corresponding to the cluster it mostly overlaps with.
allow_overlap (bool) – Whether to allow overlapping names between initial and terminal states.
kwargs (Any) – Additional keyword arguments.
 Return type:
 Returns:
: Returns self and updates the following fields:
* terminal_states - Categorical annotation of terminal states.
* terminal_states_probabilities - Probability to be a terminal state.
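Both setters accept either a categorical Series or a dict that maps state names to cell barcodes. The following is a minimal pandas sketch of turning the dict form into the Series form, with unassigned (transient) cells left as NaN. The barcodes and state names are made up and `states_to_series` is a hypothetical helper, not part of CellRank.

```python
import pandas as pd


def states_to_series(states, obs_names):
    """Build a categorical Series over obs_names; cells in no state stay NaN."""
    # Start with an all-NaN Series indexed by the cell barcodes.
    labels = pd.Series(index=pd.Index(obs_names), dtype=object)
    for name, barcodes in states.items():
        labels.loc[list(barcodes)] = name
    # NaN entries are kept as missing values and do not become a category.
    return labels.astype("category")


obs_names = ["cell_0", "cell_1", "cell_2", "cell_3"]
terminal = states_to_series({"Alpha": ["cell_0"], "Beta": ["cell_2", "cell_3"]}, obs_names)
```

A Series built this way has one category per state, which matches the first option described for the states parameter.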
Models#
 class cellrank.models.BaseModel(adata, model)[source]#
Base class for all model classes.
 Parameters:
 property conf_int: ndarray#
Array of shape (n_samples, 2) containing the lower and upper bound of the confidence interval.
 abstract confidence_interval(x_test=None, **kwargs)[source]#
Calculate the confidence interval.
Use default_confidence_interval() function if underlying model has no method for confidence interval calculation.
 Parameters:
 Return type:
 Returns:
: Returns self and updates the following fields:
* conf_int - Array of shape (n_samples, 2) containing the lower and upper bound of the confidence interval.
 default_confidence_interval(x_test=None, **kwargs)[source]#
Calculate the confidence interval, if the underlying model has no method for it.
This formula is taken from [DeSalvo, 1970], eq. 5.
 Parameters:
 Return type:
 Returns:
: Returns self and updates the following fields:
Also updates the following fields:
 abstract fit(x=None, y=None, w=None, **kwargs)[source]#
Fit the model.
 Parameters:
x (Optional[ndarray]) – Independent variables, array of shape (n_samples, 1). If None, use x.
y (Optional[ndarray]) – Dependent variables, array of shape (n_samples, 1). If None, use y.
w (Optional[ndarray]) – Optional weights of x, array of shape (n_samples,). If None, use w.
kwargs (Any) – Keyword arguments for underlying model’s fitting function.
 Return type:
 Returns:
: Fits the model and returns self.
 plot(figsize=(8, 5), same_plot=False, hide_cells=False, perc=None, fate_prob_cmap=<matplotlib.colors.ListedColormap object>, cell_color=None, lineage_color='black', alpha=0.8, lineage_alpha=0.2, title=None, size=15, lw=2, cbar=True, margins=0.015, xlabel='pseudotime', ylabel='expression', conf_int=True, lineage_probability=False, lineage_probability_conf_int=False, lineage_probability_color=None, obs_legend_loc='best', dpi=None, fig=None, ax=None, return_fig=False, save=None, **kwargs)[source]#
Plot the smoothed gene expression.
 Parameters:
same_plot (bool) – Whether to plot all trends in the same plot.
hide_cells (bool) – Whether to hide the cells.
perc (Optional[Tuple[float, float]]) – Percentile by which to clip the fate probabilities.
fate_prob_cmap (ListedColormap) – Colormap to use when coloring in the fate probabilities.
cell_color (Optional[str]) – Key in obs or var_names used for coloring the cells.
lineage_color (str) – Color for the lineage.
alpha (float) – Alpha value in \([0, 1]\) for the transparency of cells.
lineage_alpha (float) – Alpha value in \([0, 1]\) for the transparency of lineage confidence intervals.
size (int) – Size of the points.
lw (float) – Line width for the smoothed values.
cbar (bool) – Whether to show the colorbar.
margins (float) – Margins around the plot.
xlabel (str) – Label on the x-axis.
ylabel (str) – Label on the y-axis.
conf_int (bool) – Whether to show the confidence interval.
lineage_probability (bool) – Whether to show smoothed lineage probability as a dashed line. Note that this will require 1 additional model fit.
lineage_probability_conf_int (Union[bool, float]) – Whether to compute and show smoothed lineage probability confidence interval.
lineage_probability_color (Optional[str]) – Color to use when plotting the smoothed lineage_probability. If None, it’s the same as lineage_color. Only used when lineage_probability = True.
obs_legend_loc (Optional[str]) – Location of the legend when cell_color corresponds to a categorical variable.
fig (Optional[Figure]) – Figure to use. If None, create a new one.
save (Optional[str]) – Filename where to save the plot. If None, just shows the plots.
 Return type:
 Returns:
: Nothing, just plots the figure. Optionally saves it based on save.
 abstract predict(x_test=None, key_added='_x_test', **kwargs)[source]#
Run the prediction.
 Parameters:
 Return type:
 Returns:
: Returns and updates the following fields:
 prepare(gene, lineage, time_key, backward=False, time_range=None, data_key='X', use_raw=False, threshold=None, weight_threshold=(0.01, 0.01), filter_cells=None, n_test_points=200)[source]#
Prepare the model to be ready for fitting.
 Parameters:
lineage (Optional[str]) – Name of the lineage. If None, all weights will be set to \(1\).
backward (bool) – Direction of the process.
time_range (Union[float, Tuple[float, float], None]) – Specify start and end times:
data_key (Optional[str]) – Key in layers or 'X' for X. If use_raw = True, it’s always set to 'X'.
threshold (Optional[float]) – Consider only cells with weights > threshold when estimating the test endpoint. If None, use the median of the weights.
weight_threshold (Union[float, Tuple[float, float]]) – Set all weights below weight_threshold to weight_threshold if a float, or to the second value, if a tuple.
filter_cells (Optional[float]) – Filter out all cells with expression values lower than this threshold.
n_test_points (int) – Number of test points. If None, use the original points based on threshold.
 Return type:
 Returns:
: Nothing, just updates the following fields:
* x - Filtered independent variables of shape (n_filtered_cells, 1) used for fitting.
* y - Filtered dependent variables of shape (n_filtered_cells, 1) used for fitting.
* w - Filtered weights of shape (n_filtered_cells,) used for fitting.
* x_all - Unfiltered independent variables of shape (n_cells, 1).
* y_all - Unfiltered dependent variables of shape (n_cells, 1).
* w_all - Unfiltered weights of shape (n_cells,).
* x_test - Independent variables of shape (n_samples, 1) used for prediction.
* prepared - Whether the model is prepared for fitting.
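The weight_threshold rule from the parameter list can be sketched in NumPy as follows. `clip_weights` is a hypothetical helper, and reading the tuple as (threshold, replacement value) is an assumption based on the parameter description, not a statement about the library's internals.

```python
import numpy as np


def clip_weights(w, weight_threshold=(0.01, 0.01)):
    """Replace all weights below a threshold with a fixed value (illustrative)."""
    w = np.asarray(w, dtype=float).copy()
    if isinstance(weight_threshold, (int, float)):
        # A single float acts as both the threshold and the replacement value.
        thresh, value = weight_threshold, weight_threshold
    else:
        # Assumed reading of the tuple form: (threshold, replacement value).
        thresh, value = weight_threshold
    w[w < thresh] = value
    return w


w = clip_weights([0.0, 0.005, 0.2, 1.0], weight_threshold=(0.01, 0.05))
```

Here the two weights below 0.01 are raised to 0.05 and the remaining weights are left unchanged.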
 property prepared#
Whether the model is prepared for fitting.
 property x: ndarray#
Filtered independent variables of shape (n_filtered_cells, 1) used for fitting.
 property x_hat: ndarray#
Filtered independent variables used when calculating default confidence interval, usually same as x.
Lineage#
 class cellrank.Lineage(input_array, *, names, colors=None)[source]#
Lightweight ndarray wrapper that adds names and colors.
 Parameters:
 Return type:
 property T#
Transpose of self.
 classmethod from_adata(adata, backward=False, estimator_backward=None, kind=LinKind.FATE_PROBS, copy=False)[source]#
Reconstruct the Lineage object from AnnData object.
 Parameters:
adata (AnnData) – Annotated data object.
backward (bool) – Direction of the process.
estimator_backward (Optional[bool]) – Key which helps to determine whether these states are initial or terminal.
kind (Literal['macrostates', 'term_states', 'fate_probs']) – Which kind of object to reconstruct. Valid options are:
* 'macrostates' - macrostates memberships from cellrank.estimators.GPCCA.
* 'term_states' - terminal states memberships from cellrank.estimators.GPCCA.
* 'fate_probs' - fate probabilities.
copy (bool) – Whether to return a copy of the underlying array.
 Return type:
 Returns:
: The reconstructed lineage object.
 plot_pie(reduction, title=None, legend_loc='on data', legend_kwargs=mappingproxy({}), figsize=None, dpi=None, save=None, **kwargs)[source]#
Plot a pie chart visualizing aggregated lineage probabilities.
 Parameters:
 Return type:
 Returns:
: Nothing, just plots the figure. Optionally saves it based on save.
 priming_degree(method='kl_divergence', early_cells=None)[source]#
Compute the degree of lineage priming.
It returns a score in \([0, 1]\) where \(0\) stands for naive and \(1\) stands for committed.
 Parameters:
method (Literal['kl_divergence', 'entropy']) – The method used to compute the degree of lineage priming. Valid options are:
* 'kl_divergence' - as in [Velten et al., 2017], computes KL divergence between the fate probabilities of a cell and the average fate probabilities. Computation of average fate probabilities can be restricted to a set of user-defined early_cells.
* 'entropy' - as in [Setty et al., 2019], computes entropy over a cell’s fate probabilities.
early_cells (Optional[ndarray]) – Cell IDs or a mask marking early cells. If None, use all cells. Only used when method = 'kl_divergence'.
 Return type:
 Returns:
: The priming degree.
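The two methods can be illustrated with a small NumPy sketch over a matrix of fate probabilities (rows are cells, columns are lineages). `priming_degree_sketch` is a hypothetical stand-alone function, and the min-max scaling used to map scores to \([0, 1]\) is an assumption; the library may normalize differently.

```python
import numpy as np


def priming_degree_sketch(probs, method="kl_divergence", early_cells=None, eps=1e-12):
    """Illustrative priming score: 0 for the most naive cell, 1 for the most committed."""
    # Clip to avoid log(0) for cells with zero probability towards a lineage.
    probs = np.clip(np.asarray(probs, dtype=float), eps, None)

    if method == "kl_divergence":
        # Average fate probabilities, optionally restricted to early cells.
        ref = probs if early_cells is None else probs[early_cells]
        mean = ref.mean(axis=0)
        score = np.sum(probs * np.log(probs / mean), axis=1)
    elif method == "entropy":
        # Negative entropy: low entropy (committed cell) gives a high score.
        score = np.sum(probs * np.log(probs), axis=1)
    else:
        raise ValueError(f"Unknown method: {method!r}")

    # Assumed normalization: min-max scale the raw scores to [0, 1].
    span = score.max() - score.min()
    return (score - score.min()) / span if span > 0 else np.zeros_like(score)


fate = np.array([
    [1 / 3, 1 / 3, 1 / 3],   # undecided cell
    [0.90, 0.05, 0.05],      # committed to the first lineage
    [0.05, 0.90, 0.05],      # committed to the second lineage
])
kl_scores = priming_degree_sketch(fate, method="kl_divergence")
ent_scores = priming_degree_sketch(fate, method="entropy")
```

In this toy input the undecided cell receives the lowest score under both methods, and the two committed cells receive the highest.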
 reduce(*keys, mode=Reduction.DIST, dist_measure=DistanceMeasure.MUTUAL_INFO, normalize_weights=NormWeights.SOFTMAX, softmax_scale=1.0, return_weights=False)[source]#
Subset states and normalize them so that they again sum to \(1\).
 Parameters:
keys (str) – List of keys that define the states, to which this object will be reduced by projecting the values of the other states.
mode (Literal['dist', 'scale']) – Reduction mode to use. Valid options are:
* 'dist' - use a distance measure dist_measure to compute weights.
* 'scale' - just rescale the values.
dist_measure (Literal['cosine_sim', 'wasserstein_dist', 'kl_div', 'js_div', 'mutual_info', 'equal']) – Used to quantify similarity between query and reference states. Valid options are:
* 'cosine_sim' - cosine similarity.
* 'wasserstein_dist' - Wasserstein distance.
* 'kl_div' - Kullback–Leibler divergence.
* 'js_div' - Jensen–Shannon divergence.
* 'mutual_info' - mutual information.
* 'equal' - equally redistribute the mass among the rest.
Only used when mode = 'dist'.
normalize_weights (Literal['scale', 'softmax']) – How to row-normalize the weights. Valid options are:
* 'scale' - divide by the sum.
* 'softmax' - use a softmax.
Only used when mode = 'dist'.
softmax_scale (float) – Scaling factor in the softmax, used for normalizing the weights to sum to \(1\).
return_weights (bool) – If True, a DataFrame of the weights used for the projection is also returned. If mode = 'scale', the weights will be None.
 Return type:
 Returns:
: The lineage object, reduced to the initial or terminal states. The weights used for the projection of shape (n_query, n_reference), if return_weights = True.
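For the simplest case, mode = 'scale', the reduction amounts to keeping the selected columns and rescaling each row so that it sums to \(1\) again. The sketch below shows this on a plain array; `reduce_scale` is a hypothetical helper operating on NumPy data, not the Lineage method itself.

```python
import numpy as np


def reduce_scale(probs, names, keys):
    """Keep the columns named in keys and renormalize each row to sum to 1."""
    idx = [names.index(k) for k in keys]
    sub = np.asarray(probs, dtype=float)[:, idx]
    return sub / sub.sum(axis=1, keepdims=True)


probs = np.array([[0.5, 0.3, 0.2],
                  [0.1, 0.1, 0.8]])
reduced = reduce_scale(probs, ["A", "B", "C"], ["A", "C"])
```

The 'dist' mode differs in that the mass of the dropped states is redistributed using weights derived from dist_measure, rather than being discarded before rescaling.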