Developer API#
Kernels#
- class cellrank.kernels.Kernel(adata, parent=None, **kwargs)[source]#
Base kernel class.
- Parameters
- abstract compute_transition_matrix(*args, **kwargs)#
Compute transition matrix.
- Parameters
- Return type
KernelExpression
- Returns
: Modifies transition_matrix and returns self.
- copy(*, deep=False)[source]#
Return a copy of self.
- Parameters
deep (bool) – Whether to use deepcopy().
- Return type
- Returns
: Copy of self.
- classmethod from_adata(adata, key, copy=False)[source]#
Read the kernel saved using write_to_adata().
- Parameters
adata (AnnData) – Annotated data object.
key (str) – Key in obsp where the transition matrix is stored. The parameters should be stored in adata.uns['{key}_params'].
copy (bool) – Whether to copy the transition matrix.
- Return type
- Returns
: The kernel with explicitly initialized properties:
transition_matrix - the transition matrix.
params - parameters used for computation.
- plot_projection(basis='umap', key_added=None, recompute=False, stream=True, connectivities=None, **kwargs)#
Plot transition_matrix as a stream or a grid plot.
- Parameters
key_added (Optional[str]) – If not None, save the result to adata.obsm['{key_added}']. Otherwise, save the result to 'T_fwd_{basis}' or 'T_bwd_{basis}', depending on the direction.
recompute (bool) – Whether to recompute the projection if it already exists.
stream (bool) – If True, use velocity_embedding_stream(). Otherwise, use velocity_embedding_grid().
connectivities (Optional[spmatrix]) – Connectivity matrix to use for projection. If None, use the ones from the underlying kernel, if possible.
kwargs (Any) – Keyword arguments for the above-mentioned plotting function.
- Return type
- Returns
: Nothing, just plots and modifies obsm with a key based on key_added.
- plot_random_walks(n_sims=100, max_iter=0.25, seed=None, successive_hits=0, start_ixs=None, stop_ixs=None, basis='umap', cmap='gnuplot', linewidth=1.0, linealpha=0.3, ixs_legend_loc=None, n_jobs=None, backend='loky', show_progress_bar=True, figsize=None, dpi=None, save=None, **kwargs)#
Plot random walks in an embedding.
This method simulates random walks on the Markov chain defined through the corresponding transition matrix. The method is intended to give qualitative rather than quantitative insights into the transition matrix. Random walks are simulated by iteratively choosing the next cell based on the current cell’s transition probabilities.
- Parameters
n_sims (int) – Number of random walks to simulate.
max_iter (Union[int, float]) – Maximum number of steps of a random walk. If a float, it can be specified as a fraction of the number of cells.
successive_hits (int) – Number of successive hits in the stop_ixs required to stop prematurely.
start_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) – Cells from which to sample the starting points. If None, use all cells. Can be specified as:
dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.
For example {'dpt_pseudotime': [0, 0.1]} means that starting points for random walks will be sampled uniformly from cells whose pseudotime is in \([0, 0.1]\).
stop_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) – Cells which, when hit, terminate the random walk. If None, terminate after max_iter. Can be specified as:
dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.
For example {'clusters': ['Alpha', 'Beta']} and successive_hits = 3 means that the random walk will stop prematurely after cells in the above specified clusters have been visited successively 3 times in a row.
cmap (Union[str, LinearSegmentedColormap]) – Colormap for the random walk lines.
linewidth (float) – Width of the random walk lines.
linealpha (float) – Alpha value of the random walk lines.
ixs_legend_loc (Optional[str]) – Legend location for the start/stop indices.
show_progress_bar (bool) – Whether to show a progress bar. Disabling it may slightly improve performance.
n_jobs (Optional[int]) – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.
backend (str) – Which backend to use for parallelization. See Parallel for valid options.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
save (Union[Path, str, None]) – Filename where to save the plot.
- Return type
- Returns
: Nothing, just plots the figure. Optionally saves it based on save. For each random walk, the first/last cell is marked by the start/end colors of cmap.
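The random-walk simulation described for plot_random_walks can be sketched in plain Python. This is a minimal illustration, not CellRank's implementation: the function name simulate_random_walk is hypothetical, the transition matrix is a plain list of rows, a float max_iter is assumed to be rounded up to a whole number of steps, and successive_hits=0 is assumed to stop at the first hit.

```python
import math
import random


def simulate_random_walk(T, start, max_iter, stop_ixs=(), successive_hits=0, rng=None):
    """Simulate one random walk on a row-normalized matrix ``T`` (list of rows).

    Hypothetical helper for illustration only; not part of CellRank.
    """
    rng = rng or random.Random()
    n_cells = len(T)
    # Assumption: a float ``max_iter`` is a fraction of the number of cells, rounded up.
    if isinstance(max_iter, float):
        max_iter = int(math.ceil(max_iter * n_cells))
    stop_ixs = set(stop_ixs)
    # Assumption: ``successive_hits=0`` stops the walk at the first hit.
    hits_needed = max(successive_hits, 1)

    path, hits = [start], 0
    for _ in range(max_iter):
        current = path[-1]
        # Choose the next cell based on the current cell's transition probabilities.
        nxt = rng.choices(range(n_cells), weights=T[current], k=1)[0]
        path.append(nxt)
        # Count consecutive visits to the stop set; leaving it resets the counter.
        hits = hits + 1 if nxt in stop_ixs else 0
        if stop_ixs and hits >= hits_needed:
            break
    return path


# Toy chain in which cell 2 is absorbing.
T = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
]
path = simulate_random_walk(T, start=0, max_iter=10, stop_ixs=[2], successive_hits=3)
```

Passing a seeded random.Random instance as rng makes a walk reproducible, which mirrors the purpose of the seed argument in the signature above.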
- plot_single_flow(cluster, cluster_key, time_key, clusters=None, time_points=None, min_flow=0, remove_empty_clusters=True, ascending=False, legend_loc='upper right out', alpha=0.8, xticks_step_size=1, figsize=None, dpi=None, save=None, show=True)#
Visualize outgoing flow from a cluster of cells [Mittnenzweig et al., 2021].
- Parameters
cluster (str) – Cluster for which to visualize outgoing flow.
time_key (str) – Key in obs where experimental time is stored.
clusters (Optional[Sequence[Any]]) – Visualize flow only for these clusters. If None, use all clusters.
time_points (Optional[Sequence[Union[float, int]]]) – Visualize flow only for these time points. If None, use all time points.
min_flow (float) – Only show flow edges with flow greater than this value. Flow values are always in \([0, 1]\).
remove_empty_clusters (bool) – Whether to remove clusters with no incoming flow edges.
ascending (Optional[bool]) – Whether to sort the clusters by ascending or descending incoming flow. If None, use the order defined by clusters.
xticks_step_size (Optional[int]) – Show only every n-th tick on the x-axis. If None, don’t show any ticks.
legend_loc (Optional[str]) – Position of the legend. If None, do not show the legend.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
dpi – Dots per inch.
save (Union[Path, str, None]) – Filename where to save the plot.
- Return type
- Returns
: The axes object, if show = False. Otherwise, nothing, just plots the figure. Optionally saves it based on save.
Notes
This function is a Python re-implementation of the following original R function with some minor stylistic differences. This function will not recreate the results from [Mittnenzweig et al., 2021], because there, the Metacell model [Baran et al., 2019] was used to compute the flow, whereas here the transition matrix is used.
- static read(fname, adata=None, copy=False)#
De-serialize self from a file.
- Parameters
fname (Union[str, Path]) – Path from which to read the object.
adata (Optional[AnnData]) – AnnData object to assign to the saved object. Only used when the saved object has adata and it was saved without it.
copy (bool) – Whether to copy adata before assigning it. If adata is a view, it is always copied.
- Return type
IOMixin
- Returns
: The de-serialized object.
- property transition_matrix: Union[ndarray, csr_matrix]#
Row-normalized transition matrix.
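Row normalization means that every row of the matrix sums to 1, so each row can be read as a probability distribution over target cells. The following sketch shows the operation on a small dense matrix stored as nested lists; the helper name row_normalize is hypothetical and not part of CellRank.

```python
def row_normalize(matrix):
    """Scale each row of a non-negative matrix so that it sums to 1."""
    normalized = []
    for row in matrix:
        total = sum(row)
        if total <= 0:
            raise ValueError("Each row needs at least one positive entry.")
        normalized.append([value / total for value in row])
    return normalized


weights = [
    [2.0, 1.0, 1.0],
    [0.0, 3.0, 1.0],
    [1.0, 0.0, 0.0],
]
T = row_normalize(weights)  # each row of T is now a probability distribution
```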
- write_to_adata(key=None, copy=False)#
Write the transition matrix and parameters used for computation to the underlying adata object.
- Parameters
- Return type
- Returns
: Updates the adata with the following fields:
obsp['{key}'] - the transition matrix.
uns['{key}_params'] - parameters used for the calculation.
- class cellrank.kernels.UnidirectionalKernel(adata, parent=None, **kwargs)[source]#
-
- abstract compute_transition_matrix(*args, **kwargs)#
Compute transition matrix.
- Parameters
- Return type
KernelExpression
- Returns
: Modifies transition_matrix and returns self.
- copy(*, deep=False)#
Return a copy of self.
- Parameters
deep (bool) – Whether to use deepcopy().
- Return type
- Returns
: Copy of self.
- classmethod from_adata(adata, key, copy=False)#
Read the kernel saved using write_to_adata().
- Parameters
adata (AnnData) – Annotated data object.
key (str) – Key in obsp where the transition matrix is stored. The parameters should be stored in adata.uns['{key}_params'].
copy (bool) – Whether to copy the transition matrix.
- Return type
- Returns
: The kernel with explicitly initialized properties:
transition_matrix - the transition matrix.
params - parameters used for computation.
- plot_projection(basis='umap', key_added=None, recompute=False, stream=True, connectivities=None, **kwargs)#
Plot transition_matrix as a stream or a grid plot.
- Parameters
key_added (Optional[str]) – If not None, save the result to adata.obsm['{key_added}']. Otherwise, save the result to 'T_fwd_{basis}' or 'T_bwd_{basis}', depending on the direction.
recompute (bool) – Whether to recompute the projection if it already exists.
stream (bool) – If True, use velocity_embedding_stream(). Otherwise, use velocity_embedding_grid().
connectivities (Optional[spmatrix]) – Connectivity matrix to use for projection. If None, use the ones from the underlying kernel, if possible.
kwargs (Any) – Keyword arguments for the above-mentioned plotting function.
- Return type
- Returns
: Nothing, just plots and modifies obsm with a key based on key_added.
- plot_random_walks(n_sims=100, max_iter=0.25, seed=None, successive_hits=0, start_ixs=None, stop_ixs=None, basis='umap', cmap='gnuplot', linewidth=1.0, linealpha=0.3, ixs_legend_loc=None, n_jobs=None, backend='loky', show_progress_bar=True, figsize=None, dpi=None, save=None, **kwargs)#
Plot random walks in an embedding.
This method simulates random walks on the Markov chain defined through the corresponding transition matrix. The method is intended to give qualitative rather than quantitative insights into the transition matrix. Random walks are simulated by iteratively choosing the next cell based on the current cell’s transition probabilities.
- Parameters
n_sims (int) – Number of random walks to simulate.
max_iter (Union[int, float]) – Maximum number of steps of a random walk. If a float, it can be specified as a fraction of the number of cells.
successive_hits (int) – Number of successive hits in the stop_ixs required to stop prematurely.
start_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) – Cells from which to sample the starting points. If None, use all cells. Can be specified as:
dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.
For example {'dpt_pseudotime': [0, 0.1]} means that starting points for random walks will be sampled uniformly from cells whose pseudotime is in \([0, 0.1]\).
stop_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) – Cells which, when hit, terminate the random walk. If None, terminate after max_iter. Can be specified as:
dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.
For example {'clusters': ['Alpha', 'Beta']} and successive_hits = 3 means that the random walk will stop prematurely after cells in the above specified clusters have been visited successively 3 times in a row.
cmap (Union[str, LinearSegmentedColormap]) – Colormap for the random walk lines.
linewidth (float) – Width of the random walk lines.
linealpha (float) – Alpha value of the random walk lines.
ixs_legend_loc (Optional[str]) – Legend location for the start/stop indices.
show_progress_bar (bool) – Whether to show a progress bar. Disabling it may slightly improve performance.
n_jobs (Optional[int]) – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.
backend (str) – Which backend to use for parallelization. See Parallel for valid options.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
save (Union[Path, str, None]) – Filename where to save the plot.
- Return type
- Returns
: Nothing, just plots the figure. Optionally saves it based on save. For each random walk, the first/last cell is marked by the start/end colors of cmap.
- plot_single_flow(cluster, cluster_key, time_key, clusters=None, time_points=None, min_flow=0, remove_empty_clusters=True, ascending=False, legend_loc='upper right out', alpha=0.8, xticks_step_size=1, figsize=None, dpi=None, save=None, show=True)#
Visualize outgoing flow from a cluster of cells [Mittnenzweig et al., 2021].
- Parameters
cluster (str) – Cluster for which to visualize outgoing flow.
time_key (str) – Key in obs where experimental time is stored.
clusters (Optional[Sequence[Any]]) – Visualize flow only for these clusters. If None, use all clusters.
time_points (Optional[Sequence[Union[float, int]]]) – Visualize flow only for these time points. If None, use all time points.
min_flow (float) – Only show flow edges with flow greater than this value. Flow values are always in \([0, 1]\).
remove_empty_clusters (bool) – Whether to remove clusters with no incoming flow edges.
ascending (Optional[bool]) – Whether to sort the clusters by ascending or descending incoming flow. If None, use the order defined by clusters.
xticks_step_size (Optional[int]) – Show only every n-th tick on the x-axis. If None, don’t show any ticks.
legend_loc (Optional[str]) – Position of the legend. If None, do not show the legend.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
dpi – Dots per inch.
save (Union[Path, str, None]) – Filename where to save the plot.
- Return type
- Returns
: The axes object, if show = False. Otherwise, nothing, just plots the figure. Optionally saves it based on save.
Notes
This function is a Python re-implementation of the following original R function with some minor stylistic differences. This function will not recreate the results from [Mittnenzweig et al., 2021], because there, the Metacell model [Baran et al., 2019] was used to compute the flow, whereas here the transition matrix is used.
- static read(fname, adata=None, copy=False)#
De-serialize self from a file.
- Parameters
fname (Union[str, Path]) – Path from which to read the object.
adata (Optional[AnnData]) – AnnData object to assign to the saved object. Only used when the saved object has adata and it was saved without it.
copy (bool) – Whether to copy adata before assigning it. If adata is a view, it is always copied.
- Return type
IOMixin
- Returns
: The de-serialized object.
- property transition_matrix: Union[ndarray, csr_matrix]#
Row-normalized transition matrix.
- write_to_adata(key=None, copy=False)#
Write the transition matrix and parameters used for computation to the underlying adata object.
- Parameters
- Return type
- Returns
: Updates the adata with the following fields:
obsp['{key}'] - the transition matrix.
uns['{key}_params'] - parameters used for the calculation.
- class cellrank.kernels.BidirectionalKernel(*args, backward=False, **kwargs)[source]#
-
- abstract compute_transition_matrix(*args, **kwargs)#
Compute transition matrix.
- Parameters
- Return type
KernelExpression
- Returns
: Modifies transition_matrix and returns self.
- copy(*, deep=False)#
Return a copy of self.
- Parameters
deep (bool) – Whether to use deepcopy().
- Return type
- Returns
: Copy of self.
- classmethod from_adata(adata, key, copy=False)#
Read the kernel saved using write_to_adata().
- Parameters
adata (AnnData) – Annotated data object.
key (str) – Key in obsp where the transition matrix is stored. The parameters should be stored in adata.uns['{key}_params'].
copy (bool) – Whether to copy the transition matrix.
- Return type
- Returns
: The kernel with explicitly initialized properties:
transition_matrix - the transition matrix.
params - parameters used for computation.
- plot_projection(basis='umap', key_added=None, recompute=False, stream=True, connectivities=None, **kwargs)#
Plot transition_matrix as a stream or a grid plot.
- Parameters
key_added (Optional[str]) – If not None, save the result to adata.obsm['{key_added}']. Otherwise, save the result to 'T_fwd_{basis}' or 'T_bwd_{basis}', depending on the direction.
recompute (bool) – Whether to recompute the projection if it already exists.
stream (bool) – If True, use velocity_embedding_stream(). Otherwise, use velocity_embedding_grid().
connectivities (Optional[spmatrix]) – Connectivity matrix to use for projection. If None, use the ones from the underlying kernel, if possible.
kwargs (Any) – Keyword arguments for the above-mentioned plotting function.
- Return type
- Returns
: Nothing, just plots and modifies obsm with a key based on key_added.
- plot_random_walks(n_sims=100, max_iter=0.25, seed=None, successive_hits=0, start_ixs=None, stop_ixs=None, basis='umap', cmap='gnuplot', linewidth=1.0, linealpha=0.3, ixs_legend_loc=None, n_jobs=None, backend='loky', show_progress_bar=True, figsize=None, dpi=None, save=None, **kwargs)#
Plot random walks in an embedding.
This method simulates random walks on the Markov chain defined through the corresponding transition matrix. The method is intended to give qualitative rather than quantitative insights into the transition matrix. Random walks are simulated by iteratively choosing the next cell based on the current cell’s transition probabilities.
- Parameters
n_sims (int) – Number of random walks to simulate.
max_iter (Union[int, float]) – Maximum number of steps of a random walk. If a float, it can be specified as a fraction of the number of cells.
successive_hits (int) – Number of successive hits in the stop_ixs required to stop prematurely.
start_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) – Cells from which to sample the starting points. If None, use all cells. Can be specified as:
dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.
For example {'dpt_pseudotime': [0, 0.1]} means that starting points for random walks will be sampled uniformly from cells whose pseudotime is in \([0, 0.1]\).
stop_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) – Cells which, when hit, terminate the random walk. If None, terminate after max_iter. Can be specified as:
dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.
For example {'clusters': ['Alpha', 'Beta']} and successive_hits = 3 means that the random walk will stop prematurely after cells in the above specified clusters have been visited successively 3 times in a row.
cmap (Union[str, LinearSegmentedColormap]) – Colormap for the random walk lines.
linewidth (float) – Width of the random walk lines.
linealpha (float) – Alpha value of the random walk lines.
ixs_legend_loc (Optional[str]) – Legend location for the start/stop indices.
show_progress_bar (bool) – Whether to show a progress bar. Disabling it may slightly improve performance.
n_jobs (Optional[int]) – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.
backend (str) – Which backend to use for parallelization. See Parallel for valid options.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
save (Union[Path, str, None]) – Filename where to save the plot.
- Return type
- Returns
: Nothing, just plots the figure. Optionally saves it based on save. For each random walk, the first/last cell is marked by the start/end colors of cmap.
- plot_single_flow(cluster, cluster_key, time_key, clusters=None, time_points=None, min_flow=0, remove_empty_clusters=True, ascending=False, legend_loc='upper right out', alpha=0.8, xticks_step_size=1, figsize=None, dpi=None, save=None, show=True)#
Visualize outgoing flow from a cluster of cells [Mittnenzweig et al., 2021].
- Parameters
cluster (str) – Cluster for which to visualize outgoing flow.
time_key (str) – Key in obs where experimental time is stored.
clusters (Optional[Sequence[Any]]) – Visualize flow only for these clusters. If None, use all clusters.
time_points (Optional[Sequence[Union[float, int]]]) – Visualize flow only for these time points. If None, use all time points.
min_flow (float) – Only show flow edges with flow greater than this value. Flow values are always in \([0, 1]\).
remove_empty_clusters (bool) – Whether to remove clusters with no incoming flow edges.
ascending (Optional[bool]) – Whether to sort the clusters by ascending or descending incoming flow. If None, use the order defined by clusters.
xticks_step_size (Optional[int]) – Show only every n-th tick on the x-axis. If None, don’t show any ticks.
legend_loc (Optional[str]) – Position of the legend. If None, do not show the legend.
figsize (Optional[Tuple[float, float]]) – Size of the figure.
dpi – Dots per inch.
save (Union[Path, str, None]) – Filename where to save the plot.
- Return type
- Returns
: The axes object, if show = False. Otherwise, nothing, just plots the figure. Optionally saves it based on save.
Notes
This function is a Python re-implementation of the following original R function with some minor stylistic differences. This function will not recreate the results from [Mittnenzweig et al., 2021], because there, the Metacell model [Baran et al., 2019] was used to compute the flow, whereas here the transition matrix is used.
- static read(fname, adata=None, copy=False)#
De-serialize self from a file.
- Parameters
fname (Union[str, Path]) – Path from which to read the object.
adata (Optional[AnnData]) – AnnData object to assign to the saved object. Only used when the saved object has adata and it was saved without it.
copy (bool) – Whether to copy adata before assigning it. If adata is a view, it is always copied.
- Return type
IOMixin
- Returns
: The de-serialized object.
- property transition_matrix: Union[ndarray, csr_matrix]#
Row-normalized transition matrix.
- write_to_adata(key=None, copy=False)#
Write the transition matrix and parameters used for computation to the underlying adata object.
- Parameters
- Return type
- Returns
: Updates the adata with the following fields:
obsp['{key}'] - the transition matrix.
uns['{key}_params'] - parameters used for the calculation.
Similarity#
- class cellrank.kernels.utils.SimilarityABC[source]#
Base class for all similarity schemes.
- abstract __call__(v, D, softmax_scale=1.0)[source]#
Compute transition probability of a cell to its nearest neighbors using RNA velocity.
- Parameters
v (ndarray) – Array of shape (n_genes,) or (n_neighbors, n_genes) containing the velocity vector(s). The second case is used for the backward process.
D (ndarray) – Array of shape (n_neighbors, n_genes) corresponding to the transcriptomic displacement of the current cell with respect to its nearest neighbors.
softmax_scale (float) – Scaling factor for the softmax function.
- Return type
- Returns
: The probability and unscaled logits arrays of shape (n_neighbors,).
- class cellrank.kernels.utils.Cosine[source]#
Cosine similarity scheme as defined in eq. (4.7) [Li et al., 2021].
\[v(s_i, s_j) := g(\cos(\delta_{i, j}, v_i))\]
where \(v_i\) is the velocity vector of cell \(i\), \(\delta_{i, j}\) corresponds to the transcriptional displacement between cells \(i\) and \(j\) and \(g\) is a softmax function with some scaling parameter.
- __call__(v, D, softmax_scale=1.0)#
Compute transition probability of a cell to its nearest neighbors using RNA velocity.
- Parameters
v (ndarray) – Array of shape (n_genes,) or (n_neighbors, n_genes) containing the velocity vector(s). The second case is used for the backward process.
D (ndarray) – Array of shape (n_neighbors, n_genes) corresponding to the transcriptomic displacement of the current cell with respect to its nearest neighbors.
softmax_scale (float) – Scaling factor for the softmax function.
- Return type
- Returns
: The probability and unscaled logits arrays of shape (n_neighbors,).
- hessian(v, D, softmax_scale=1.0)#
Compute the Hessian.
- Parameters
- Return type
- Returns
: The full Hessian of shape (n_neighbors, n_genes, n_genes) or only its diagonal of shape (n_neighbors, n_genes).
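The Cosine scheme above can be illustrated with a short, dependency-free sketch for a single cell in the forward case, where v has shape (n_genes,). The names softmax and cosine_scheme are hypothetical, and two details are assumptions rather than documented behavior: the scaling factor multiplies the logits before the softmax, and zero-length vectors receive a logit of 0.

```python
import math


def softmax(logits, scale=1.0):
    """Numerically stable softmax of ``scale * logits`` (assumed scaling convention)."""
    scaled = [scale * x for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_scheme(v, D, softmax_scale=1.0):
    """Return (probabilities, unscaled logits) for one cell and its neighbors.

    ``v`` is the velocity vector, ``D`` holds one displacement vector per neighbor.
    """
    v_norm = math.sqrt(sum(x * x for x in v))
    logits = []
    for d in D:
        d_norm = math.sqrt(sum(x * x for x in d))
        dot = sum(d_k * v_k for d_k, v_k in zip(d, v))
        # Assumption: a zero-length vector gets a logit of 0 instead of NaN.
        logits.append(dot / (d_norm * v_norm) if d_norm and v_norm else 0.0)
    return softmax(logits, softmax_scale), logits


v = [1.0, 0.0]                                 # velocity of the current cell
D = [[2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]]      # displacements to three neighbors
probs, logits = cosine_scheme(v, D, softmax_scale=1.0)
```

Neighbors whose displacement points along the velocity receive the highest probability, and a larger softmax_scale concentrates more of the probability mass on them.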
- class cellrank.kernels.utils.Correlation[source]#
Pearson correlation scheme as defined in eq. (4.8) [Li et al., 2021].
\[v(s_i, s_j) := g(\operatorname{corr}(\delta_{i, j}, v_i))\]
where \(v_i\) is the velocity vector of cell \(i\), \(\delta_{i, j}\) corresponds to the transcriptional displacement between cells \(i\) and \(j\) and \(g\) is a softmax function with some scaling parameter.
- __call__(v, D, softmax_scale=1.0)#
Compute transition probability of a cell to its nearest neighbors using RNA velocity.
- Parameters
v (ndarray) – Array of shape (n_genes,) or (n_neighbors, n_genes) containing the velocity vector(s). The second case is used for the backward process.
D (ndarray) – Array of shape (n_neighbors, n_genes) corresponding to the transcriptomic displacement of the current cell with respect to its nearest neighbors.
softmax_scale (float) – Scaling factor for the softmax function.
- Return type
- Returns
: The probability and unscaled logits arrays of shape (n_neighbors,).
- hessian(v, D, softmax_scale=1.0)#
Compute the Hessian.
- Parameters
- Return type
- Returns
: The full Hessian of shape (n_neighbors, n_genes, n_genes) or only its diagonal of shape (n_neighbors, n_genes).
- class cellrank.kernels.utils.DotProduct[source]#
Dot product scheme as defined in eq. (4.9) [Li et al., 2021].
\[v(s_i, s_j) := g(\delta_{i, j}^T v_i)\]
where \(v_i\) is the velocity vector of cell \(i\), \(\delta_{i, j}\) corresponds to the transcriptional displacement between cells \(i\) and \(j\) and \(g\) is a softmax function with some scaling parameter.
- __call__(v, D, softmax_scale=1.0)#
Compute transition probability of a cell to its nearest neighbors using RNA velocity.
- Parameters
v (ndarray) – Array of shape (n_genes,) or (n_neighbors, n_genes) containing the velocity vector(s). The second case is used for the backward process.
D (ndarray) – Array of shape (n_neighbors, n_genes) corresponding to the transcriptomic displacement of the current cell with respect to its nearest neighbors.
softmax_scale (float) – Scaling factor for the softmax function.
- Return type
- Returns
: The probability and unscaled logits arrays of shape (n_neighbors,).
- hessian(v, D, softmax_scale=1.0)#
Compute the Hessian.
- Parameters
- Return type
- Returns
: The full Hessian of shape (n_neighbors, n_genes, n_genes) or only its diagonal of shape (n_neighbors, n_genes).
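The Correlation and DotProduct schemes differ only in how the logits passed to the softmax are computed. The sketch below contrasts the two on a toy example; the function names are hypothetical, and the handling of constant vectors is an assumption. It illustrates one practical difference: the dot product grows with the length of the displacement vector, whereas the Pearson correlation does not.

```python
import math


def dot_logits(v, D):
    """Dot product of each displacement vector in ``D`` with the velocity ``v``."""
    return [sum(d_k * v_k for d_k, v_k in zip(d, v)) for d in D]


def correlation_logits(v, D):
    """Pearson correlation of each displacement vector in ``D`` with ``v``."""

    def center(x):
        mean = sum(x) / len(x)
        return [x_k - mean for x_k in x]

    def cosine(a, b):
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
        # Assumption: constant vectors (zero norm after centering) get a logit of 0.
        return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

    # Pearson correlation equals the cosine similarity of mean-centered vectors.
    v_centered = center(v)
    return [cosine(center(d), v_centered) for d in D]


v = [1.0, 2.0, 3.0]
D = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]  # second row is 10x the first

dot = dot_logits(v, D)            # changes with the length of the displacement
corr = correlation_logits(v, D)   # identical for both rows
```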
Threshold Scheme#
- class cellrank.kernels.utils.ThresholdSchemeABC[source]#
Base class for all connectivity biasing schemes.
- abstract __call__(cell_pseudotime, neigh_pseudotime, neigh_conn, **kwargs)[source]#
Calculate biased connections for a given cell.
- Parameters
- Return type
- Returns
: Array of shape (n_neighbors,) containing the biased connectivities.
- bias_knn(conn, pseudotime, n_jobs=None, backend='loky', show_progress_bar=True, **kwargs)[source]#
Bias cell-cell connectivities of a KNN graph.
- Parameters
conn (csr_matrix) – Sparse matrix of shape (n_cells, n_cells) containing the nearest neighbor connectivities.
pseudotime (ndarray) – Pseudotemporal ordering of cells.
show_progress_bar (bool) – Whether to show a progress bar. Disabling it may slightly improve performance.
n_jobs (Optional[int]) – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.
backend (str) – Which backend to use for parallelization. See Parallel for valid options.
kwargs (Any) –
- Return type
- Returns
: The biased connectivities.
- class cellrank.kernels.utils.HardThresholdScheme[source]#
Thresholding scheme inspired by Palantir [Setty et al., 2019].
Note that this won’t exactly reproduce the original Palantir results:
Palantir computes the kNN graph in a scaled space of diffusion components.
Palantir uses its own pseudotime to bias the kNN graph which is not implemented here.
Palantir uses a slightly different mechanism to ensure the graph remains connected when removing edges that point into the “pseudotime past”.
- __call__(cell_pseudotime, neigh_pseudotime, neigh_conn, frac_to_keep=0.3)[source]#
Convert the undirected graph of cell-cell similarities into a directed one by removing “past” edges.
This uses a pseudotemporal measure to remove graph-edges that point into the pseudotime-past. For each cell, it keeps the closest neighbors, even if they are in the pseudotime past, to make sure the graph remains connected.
- Parameters
cell_pseudotime (float) – Pseudotime of the current cell.
neigh_pseudotime (ndarray) – Array of shape (n_neighbors,) containing pseudotime of neighbors.
neigh_conn (ndarray) – Array of shape (n_neighbors,) containing connectivities of the current cell and its neighbors.
frac_to_keep (float) – The frac_to_keep * n_neighbors closest neighbors (according to graph connectivities) are kept, no matter whether they lie in the pseudotemporal past or future. Must be in \([0, 1]\).
- Return type
- Returns
: Array of shape (n_neighbors,) containing the biased connectivities.
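The hard thresholding rule described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the function name hard_threshold is hypothetical, the number of unconditionally kept neighbors is assumed to be rounded down, and neighbors with the same pseudotime as the current cell are assumed to count as "not in the past".

```python
import math


def hard_threshold(cell_pseudotime, neigh_pseudotime, neigh_conn, frac_to_keep=0.3):
    """Zero out connectivities to past neighbors, except for the closest ones."""
    n_neighbors = len(neigh_conn)
    # Assumption: the number of always-kept neighbors is rounded down.
    k_keep = int(math.floor(frac_to_keep * n_neighbors))
    # Neighbor indices ordered from strongest to weakest connectivity.
    order = sorted(range(n_neighbors), key=lambda i: neigh_conn[i], reverse=True)
    closest = set(order[:k_keep])

    biased = [0.0] * n_neighbors
    for i in range(n_neighbors):
        # Keep the closest neighbors always; keep the rest only if they are
        # not in the pseudotime past of the current cell.
        if i in closest or neigh_pseudotime[i] >= cell_pseudotime:
            biased[i] = neigh_conn[i]
    return biased


biased = hard_threshold(
    cell_pseudotime=0.5,
    neigh_pseudotime=[0.1, 0.9, 0.2, 0.6],
    neigh_conn=[0.9, 0.2, 0.1, 0.4],
    frac_to_keep=0.25,
)
```

In this toy input, the first neighbor lies in the past but is kept because it is the single closest neighbor, while the third neighbor lies in the past and is removed.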
- class cellrank.kernels.utils.SoftThresholdScheme[source]#
Thresholding scheme inspired by [Stassen et al., 2021].
The idea is to downweight edges that point against the direction of increasing pseudotime. Essentially, the further “behind” a query cell is in pseudotime with respect to the current reference cell, the more its graph connectivity is penalized.
- __call__(cell_pseudotime, neigh_pseudotime, neigh_conn, b=10.0, nu=0.5)[source]#
Bias the connectivities by downweighting ones to past cells.
This function uses a generalized logistic function to weight the past connectivities.
- Parameters
cell_pseudotime (float) – Pseudotime of the current cell.
neigh_pseudotime (ndarray) – Array of shape (n_neighbors,) containing pseudotime of neighbors.
neigh_conn (ndarray) – Array of shape (n_neighbors,) containing connectivities of the current cell and its neighbors.
b (float) – The growth rate of the generalized logistic function.
nu (float) – Affects near which asymptote maximum growth occurs.
- Return type
- Returns
: Array of shape (n_neighbors,) containing the biased connectivities.
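The soft scheme can be sketched with a generalized logistic weight applied to past neighbors only. The exact expression is not given in this reference, so the weight used below, 2 / (1 + exp(b * dt)) ** (1 / nu) with dt the pseudotime lag of the neighbor, is an assumed form chosen for illustration; the function name soft_threshold is likewise hypothetical.

```python
import math


def soft_threshold(cell_pseudotime, neigh_pseudotime, neigh_conn, b=10.0, nu=0.5):
    """Downweight connectivities to neighbors in the pseudotime past."""
    biased = []
    for pseudotime, conn in zip(neigh_pseudotime, neigh_conn):
        dt = cell_pseudotime - pseudotime  # positive when the neighbor is in the past
        if dt > 0:
            # Assumed generalized logistic weight; it decays towards 0 as the
            # neighbor falls further behind the current cell.
            weight = 2.0 / (1.0 + math.exp(b * dt)) ** (1.0 / nu)
        else:
            # Neighbors in the present or future keep their connectivity.
            weight = 1.0
        biased.append(conn * weight)
    return biased


biased = soft_threshold(
    cell_pseudotime=0.5,
    neigh_pseudotime=[0.1, 0.45, 0.8],
    neigh_conn=[1.0, 1.0, 1.0],
)
```

Unlike the hard scheme, no edge is removed outright: a neighbor far in the past keeps a small positive weight, and a larger b makes the penalty steeper.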
- class cellrank.kernels.utils.CustomThresholdScheme(callback)[source]#
Class that wraps a user supplied scheme.
- Parameters
callback (Callable[[float, ndarray, ndarray, ndarray, Any], ndarray]) – Function which returns the biased connectivities.
- __call__(cell_pseudotime, neigh_pseudotime, neigh_conn, **kwargs)[source]#
Calculate biased connections for a given cell.
- Parameters
cell_pseudotime (float) – Pseudotime of the current cell.
neigh_pseudotime (ndarray) – Array of shape (n_neighbors,) containing pseudotime of neighbors.
neigh_conn (ndarray) – Array of shape (n_neighbors,) containing connectivities of the current cell and its neighbors.
kwargs (Any) – Additional keyword arguments.
- Return type
- Returns
: Array of shape (n_neighbors,) containing the biased connectivities.
Estimators#
- class cellrank.estimators.BaseEstimator(object, **kwargs)[source]#
Base class for all estimators.
- Parameters
object (Union[str, bool, ndarray, spmatrix, AnnData, KernelExpression]) – Can be one of the following types:
AnnData - annotated data object.
KernelExpression - kernel expression.
str - key in obsp where the transition matrix is stored and adata must be provided in this case.
bool - directionality of the transition matrix that will be used to infer its storage location. If None, the directionality will be determined automatically and adata must be provided in this case.
kwargs (Any) – Keyword arguments for the PrecomputedKernel.
- copy(*, deep=False)[source]#
Return a copy of self.
- classmethod from_adata(adata, obsp_key)[source]#
De-serialize self from
AnnData
.- Parameters
- Return type
- Returns
: The de-serialized object.
- abstract predict(*args, **kwargs)[source]#
Run the prediction.
- Parameters
- Return type
- Returns
: Self.
- to_adata(keep=('X', 'raw'), *, copy=True)[source]#
Serialize self to
AnnData
.- Parameters
keep (
Union
[Literal
['all'
],Sequence
[Literal
['X'
,'raw'
,'layers'
,'obs'
,'var'
,'obsm'
,'varm'
,'obsp'
,'varp'
,'uns'
]]]) –Which attributes to keep from the underlying
adata
. Valid options are:'all'
- keep all attributes specified in the signature.Sequence
- keep only subset of these attributes.dict
- the keys correspond to the attribute names and the values to the subset of keys to keep from each attribute. If a value is specified either as True
or'all'
, everything from this attribute will be kept.
copy (
Union
[bool
,Sequence
[Literal
['X'
,'raw'
,'layers'
,'obs'
,'var'
,'obsm'
,'varm'
,'obsp'
,'varp'
,'uns'
]]]) – Whether to copy the data. Can be specified on per-attribute basis. Useful for attributes that are array-like.
- Return type
- Returns
: Annotated data object.
- class cellrank.estimators.TermStatesEstimator(object, **kwargs)[source]#
Base class for all estimators predicting the initial and terminal states.
- Parameters
object (
Union
[AnnData
,ndarray
,spmatrix
,KernelExpression
]) –Can be one of the following types:
AnnData
- annotated data object.KernelExpression
- kernel expression.str
- key inobsp
where the transition matrix is stored andadata
must be provided in this case.bool
- directionality of the transition matrix that will be used to infer its storage location. IfNone
, the directionality will be determined automatically andadata
must be provided in this case.
kwargs (
Any
) – Keyword arguments for thePrecomputedKernel
.
- property initial_states: Optional[Series]#
Categorical annotation of initial states.
By default, all transient cells will be labeled as NaN.
- plot_macrostates(which, states=None, color=None, discrete=True, mode=PlotMode.EMBEDDING, time_key='latent_time', same_plot=True, title=None, cmap='viridis', **kwargs)[source]#
Plot macrostates on an embedding or along pseudotime.
- Parameters
which (
Literal
['all'
,'initial'
,'terminal'
]) –Which macrostates to plot. Valid options are:
'all'
- plot all macrostates.'initial'
- plot macrostates marked asinitial_states
.'terminal'
- plot macrostates marked asterminal_states
.
states (
Union
[str
,Sequence
[str
],None
]) – Subset of the macrostates to show. IfNone
, plot all macrostates.color (
Optional
[str
]) – Key inobs
orvar
used to color the observations.discrete (
bool
) – Whether to plot the data as continuous or discrete observations. If the data cannot be plotted as continuous observations, it will be plotted as discrete.mode (
Literal
['embedding'
,'time'
]) – Whether to plot the probabilities in an embedding or along the pseudotime.time_key (
str
) – Key inobs
where pseudotime is stored. Only used whenmode = 'time'
.title (
Union
[str
,Sequence
[str
],None
]) – Title of the plot.same_plot (
bool
) – Whether to plot the data on the same plot or not. Only used when mode = 'embedding'
. If True and discrete = False
,color
is ignored.cmap (
str
) – Colormap for continuous annotations.
- Return type
- Returns
: Nothing, just plots the figure. Optionally saves it based on
save
.
- rename_initial_states(old_new)[source]#
Rename the
initial_states
.- Parameters
old_new (
Dict
[str
,str
]) – Dictionary that maps old names to unique new names.- Return type
- Returns
: Returns self and updates the following fields:
initial_states
- Categorical annotation of initial states.
- rename_terminal_states(old_new)[source]#
Rename the
terminal_states
.- Parameters
old_new (
Dict
[str
,str
]) – Dictionary that maps old names to unique new names.- Return type
- Returns
: Returns self and updates the following fields:
terminal_states
- Categorical annotation of terminal states.
- set_initial_states(states, cluster_key=None, allow_overlap=False, **kwargs)[source]#
Set the
initial_states
.- Parameters
states (
Union
[Series
,Dict
[str
,Sequence
[Any
]]]) –Which states to select. Valid options are:
categorical
Series
where each category corresponds to an individual state. NaN entries denote cells that do not belong to any state, i.e., transient cells.dict
where keys are states and values are lists of cell barcodes corresponding to annotations inobs_names
. If only 1 key is provided, values should correspond to clusters if a categoricalSeries
can be found inobs
.
cluster_key (
Optional
[str
]) – Key inobs
to associate names and colors with initial_states
. Each state will be given the name and color of the cluster it overlaps with the most.allow_overlap (
bool
) – Whether to allow overlapping names between initial and terminal states.kwargs (
Any
) – Additional keyword arguments.
- Return type
- Returns
: Returns self and updates the following fields:
initial_states
- Categorical annotation of initial states.initial_states_probabilities
- Probability to be an initial state.
- set_terminal_states(states, cluster_key=None, allow_overlap=False, **kwargs)[source]#
Set the
terminal_states
.- Parameters
states (
Union
[Series
,Dict
[str
,Sequence
[Any
]]]) –States to select. Valid options are:
categorical
Series
where each category corresponds to an individual state. NaN entries denote cells that do not belong to any state, i.e., transient cells.dict
where keys are states and values are lists of cell barcodes corresponding to annotations inobs_names
. If only 1 key is provided, values should correspond to clusters if a categoricalSeries
can be found inobs
.
cluster_key (
Optional
[str
]) – Key inobs
to associate names and colors withterminal_states
. Each state will be given the name and color of the cluster it overlaps with the most.allow_overlap (
bool
) – Whether to allow overlapping names between initial and terminal states.kwargs (
Any
) – Additional keyword arguments.
- Return type
- Returns
: Returns self and updates the following fields:
terminal_states
- Categorical annotation of terminal states.terminal_states_probabilities
- Probability to be a terminal state.
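The two accepted forms of `states` describe the same information. The sketch below converts the `dict` form into a per-cell annotation, with `None` standing in for the NaN entries of a categorical `Series`. The helper `states_to_labels`, the barcodes, and the state names are hypothetical, and rejecting a cell assigned to two states is an assumption of this sketch rather than documented behavior.

```python
from typing import Dict, List, Optional, Sequence


def states_to_labels(
    obs_names: Sequence[str],
    states: Dict[str, Sequence[str]],
) -> List[Optional[str]]:
    """Map each cell barcode to its state name, or None if it is transient."""
    label_of: Dict[str, str] = {}
    for state, barcodes in states.items():
        for barcode in barcodes:
            if barcode in label_of:
                # Assumption of this sketch: a cell belongs to at most one state.
                raise ValueError(f"Cell {barcode!r} is assigned to more than one state.")
            label_of[barcode] = state
    return [label_of.get(name) for name in obs_names]


obs_names = ["cell_0", "cell_1", "cell_2", "cell_3"]  # made-up barcodes
labels = states_to_labels(obs_names, {"Alpha": ["cell_0", "cell_3"], "Beta": ["cell_1"]})
```

Here `cell_2` belongs to no state and is therefore treated as a transient cell.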
Models#
- class cellrank.models.BaseModel(adata, model)[source]#
Base class for all model classes.
- Parameters
- property conf_int: ndarray#
Array of shape
(n_samples, 2)
containing the lower and upper bound of the confidence interval.
- abstract confidence_interval(x_test=None, **kwargs)[source]#
Calculate the confidence interval.
Use
default_confidence_interval()
function if underlyingmodel
has no method for confidence interval calculation.- Parameters
- Return type
- Returns
: Returns self and updates the following fields:
conf_int
- Array of shape(n_samples, 2)
containing the lower and upper bound of the confidence interval.
- default_confidence_interval(x_test=None, **kwargs)[source]#
Calculate the confidence interval, if the underlying
model
has no method for it.This formula is taken from [DeSalvo, 1970], eq. 5.
- Parameters
- Return type
- Returns
: Returns self and updates the following fields:
Also updates the following fields:
- abstract fit(x=None, y=None, w=None, **kwargs)[source]#
Fit the model.
- Parameters
x (
Optional
[ndarray
]) – Independent variables, array of shape(n_samples, 1)
. IfNone
, usex
.y (
Optional
[ndarray
]) – Dependent variables, array of shape(n_samples, 1)
. IfNone
, usey
.w (
Optional
[ndarray
]) – Optional weights ofx
, array of shape(n_samples,)
. IfNone
, usew
.kwargs (
Any
) – Keyword arguments for underlyingmodel
’s fitting function.
- Return type
- Returns
: Fits the
model
and returns self.
- plot(figsize=(8, 5), same_plot=False, hide_cells=False, perc=None, fate_prob_cmap=<matplotlib.colors.ListedColormap object>, cell_color=None, lineage_color='black', alpha=0.8, lineage_alpha=0.2, title=None, size=15, lw=2, cbar=True, margins=0.015, xlabel='pseudotime', ylabel='expression', conf_int=True, lineage_probability=False, lineage_probability_conf_int=False, lineage_probability_color=None, obs_legend_loc='best', dpi=None, fig=None, ax=None, return_fig=False, save=None, **kwargs)[source]#
Plot the smoothed gene expression.
- Parameters
same_plot (
bool
) – Whether to plot all trends in the same plot.hide_cells (
bool
) – Whether to hide the cells.perc (
Optional
[Tuple
[float
,float
]]) – Percentile by which to clip the fate probabilities.fate_prob_cmap (
ListedColormap
) – Colormap to use when coloring in the fate probabilities.cell_color (
Optional
[str
]) – Key inobs
orvar_names
used for coloring the cells.lineage_color (
str
) – Color for the lineage.alpha (
float
) – Alpha value in \([0, 1]\) for the transparency of cells.lineage_alpha (
float
) – Alpha value in \([0, 1]\) for the transparency lineage confidence intervals.size (
int
) – Size of the points.lw (
float
) – Line width for the smoothed values.cbar (
bool
) – Whether to show the colorbar.margins (
float
) – Margins around the plot.xlabel (
str
) – Label on the x-axis.ylabel (
str
) – Label on the y-axis.conf_int (
bool
) – Whether to show the confidence interval.lineage_probability (
bool
) – Whether to show smoothed lineage probability as a dashed line. Note that this will require 1 additional model fit.lineage_probability_conf_int (
Union
[bool
,float
]) – Whether to compute and show smoothed lineage probability confidence interval.lineage_probability_color (
Optional
[str
]) – Color to use when plotting the smoothedlineage_probability
. IfNone
, it’s the same aslineage_color
. Only used when lineage_probability = True
.obs_legend_loc (
Optional
[str
]) – Location of the legend whencell_color
corresponds to a categorical variable.fig (
Optional
[Figure
]) – Figure to use. IfNone
, create a new one.save (
Optional
[str
]) – Filename where to save the plot. IfNone
, just shows the plots.
- Return type
- Returns
: Nothing, just plots the figure. Optionally saves it based on
save
.
- abstract predict(x_test=None, key_added='_x_test', **kwargs)[source]#
Run the prediction.
- Parameters
- Return type
- Returns
: Returns and updates the following fields:
- prepare(gene, lineage, time_key, backward=False, time_range=None, data_key='X', use_raw=False, threshold=None, weight_threshold=(0.01, 0.01), filter_cells=None, n_test_points=200)[source]#
Prepare the model to be ready for fitting.
- Parameters
lineage (
Optional
[str
]) – Name of the lineage. IfNone
, all weights will be set to \(1\).backward (
bool
) – Direction of the process.time_range (
Union
[float
,Tuple
[float
,float
],None
]) –Specify start and end times:
data_key (
Optional
[str
]) – Key inlayers
or'X'
forX
. Ifuse_raw = True
, it’s always set to'X'
.threshold (
Optional
[float
]) – Consider only cells with weights >threshold
when estimating the test endpoint. IfNone
, use the median of the weights.weight_threshold (
Union
[float
,Tuple
[float
,float
]]) – Set all weights belowweight_threshold
toweight_threshold
if afloat
, or to the second value, if atuple
.filter_cells (
Optional
[float
]) – Filter out all cells with expression values lower than this threshold.n_test_points (
int
) – Number of test points. IfNone
, use the original points based onthreshold
.
- Return type
- Returns
: Nothing, just updates the following fields:
x
- Filtered independent variables of shape(n_filtered_cells, 1)
used for fitting.y
- Filtered dependent variables of shape(n_filtered_cells, 1)
used for fitting.w
- Filtered weights of shape(n_filtered_cells,)
used for fitting.x_all
- Unfiltered independent variables of shape(n_cells, 1)
.y_all
- Unfiltered dependent variables of shape(n_cells, 1)
.w_all
- Unfiltered weights of shape(n_cells,)
.x_test
- Independent variables of shape(n_samples, 1)
used for prediction.prepared
- Whether the model is prepared for fitting.
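The `weight_threshold` rule described in the parameters can be sketched as follows. The helper name `clip_weights` is hypothetical and plain lists stand in for `ndarray`; the logic follows the description above: weights below the threshold are replaced by the threshold itself for a `float`, or by the second value for a `tuple`.

```python
from typing import List, Tuple, Union


def clip_weights(
    weights: List[float],
    weight_threshold: Union[float, Tuple[float, float]] = (0.01, 0.01),
) -> List[float]:
    """Replace weights below the cutoff with the fill value."""
    if isinstance(weight_threshold, tuple):
        cutoff, fill = weight_threshold  # (cutoff, value to assign)
    else:
        cutoff = fill = weight_threshold  # a float acts as both
    return [fill if w < cutoff else w for w in weights]


clipped_tuple = clip_weights([0.001, 0.05, 0.9], weight_threshold=(0.01, 0.0))
clipped_float = clip_weights([0.001, 0.5], weight_threshold=0.01)
```

Passing a tuple such as `(0.01, 0.0)` zeroes out small weights, whereas a single float raises them to a floor.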
- property prepared#
Whether the model is prepared for fitting.
- property x: ndarray#
Filtered independent variables of shape
(n_filtered_cells, 1)
used for fitting.
- property x_hat: ndarray#
Filtered independent variables used when calculating default confidence interval, usually same as
x
.
Lineage#
- class cellrank.Lineage(input_array, *, names, colors=None)[source]#
Lightweight
ndarray
wrapper that adds names and colors.- Parameters
- Return type
- property T#
Transpose of self.
- classmethod from_adata(adata, backward=False, estimator_backward=None, kind=LinKind.FATE_PROBS, copy=False)[source]#
Reconstruct the
Lineage
object fromAnnData
object.- Parameters
adata (
AnnData
) – Annotated data object.backward (
bool
) – Direction of the process.estimator_backward (
Optional
[bool
]) – Key which helps to determine whether these states are initial or terminal.kind (
Literal
['macrostates'
,'term_states'
,'fate_probs'
]) –Which kind of object to reconstruct. Valid options are:
'macrostates'
- macrostates memberships fromcellrank.estimators.GPCCA
.'term_states'
- terminal states memberships fromcellrank.estimators.GPCCA
.'fate_probs'
- fate probabilities.
copy (
bool
) – Whether to return a copy of the underlying array.
- Return type
- Returns
: The reconstructed lineage object.
- plot_pie(reduction, title=None, legend_loc='on data', legend_kwargs=mappingproxy({}), figsize=None, dpi=None, save=None, **kwargs)[source]#
Plot a pie chart visualizing aggregated lineage probabilities.
- Parameters
- Return type
- Returns
: Nothing, just plots the figure. Optionally saves it based on
save
.
- priming_degree(method='kl_divergence', early_cells=None)[source]#
Compute the degree of lineage priming.
It returns a score in \([0, 1]\) where \(0\) stands for naive and \(1\) stands for committed.
- Parameters
method (
Literal
['kl_divergence'
,'entropy'
]) –The method used to compute the degree of lineage priming. Valid options are:
'kl_divergence'
- as in [Velten et al., 2017], computes KL-divergence between the fate probabilities of a cell and the average fate probabilities. Computation of average fate probabilities can be restricted to a set of user-definedearly_cells
.'entropy'
- as in [Setty et al., 2019], computes entropy over a cell’s fate probabilities.
early_cells (
Optional
[ndarray
]) – Cell IDs or a mask marking early cells. IfNone
, use all cells. Only used whenmethod = 'kl_divergence'
.
- Return type
- Returns
: The priming degree.
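As a rough illustration of the `'kl_divergence'` option, the sketch below computes the KL divergence between each cell's fate probabilities and the average over a reference set of cells. The function name `kl_priming` is hypothetical, `early_cells` is treated as a list of row indices only, and the min-max rescaling to \([0, 1]\) is an assumption; this section does not state how the score is normalized.

```python
import math
from typing import List, Optional, Sequence


def kl_priming(
    fate_probs: Sequence[Sequence[float]],
    early_cells: Optional[Sequence[int]] = None,
) -> List[float]:
    """KL divergence of each cell from the average fate, rescaled to [0, 1]."""
    rows = range(len(fate_probs)) if early_cells is None else early_cells
    n_states = len(fate_probs[0])
    # Average fate probabilities over the reference cells.
    reference = [
        sum(fate_probs[i][j] for i in rows) / len(rows) for j in range(n_states)
    ]
    # Terms with p == 0 contribute nothing. The reference must be positive
    # wherever a cell has positive probability, otherwise the division fails.
    kl = [
        sum(p * math.log(p / q) for p, q in zip(cell, reference) if p > 0)
        for cell in fate_probs
    ]
    lo, hi = min(kl), max(kl)
    if hi == lo:
        return [0.0] * len(kl)
    # Assumed normalization: min-max rescaling so that scores lie in [0, 1].
    return [(value - lo) / (hi - lo) for value in kl]


scores = kl_priming([[0.5, 0.5], [0.9, 0.1], [0.1, 0.9]])
```

In this toy input the first cell matches the average fate and scores lowest (naive), while the two strongly biased cells score highest (committed).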
- reduce(*keys, mode=Reduction.DIST, dist_measure=DistanceMeasure.MUTUAL_INFO, normalize_weights=NormWeights.SOFTMAX, softmax_scale=1, return_weights=False)[source]#
Subset states and normalize them so that they again sum to \(1\).
- Parameters
keys (
str
) – List of keys that define the states, to which this object will be reduced by projecting the values of the other states.mode (
Literal
['dist'
,'scale'
]) –Reduction mode to use. Valid options are:
'dist'
- use a distance measuredist_measure
to compute weights.'scale'
- just rescale the values.
dist_measure (
Literal
['cosine_sim'
,'wasserstein_dist'
,'kl_div'
,'js_div'
,'mutual_info'
,'equal'
]) –Used to quantify similarity between query and reference states. Valid options are:
'cosine_sim'
- cosine similarity.'wasserstein_dist'
- Wasserstein distance.'kl_div'
- Kullback–Leibler divergence.'js_div'
- Jensen–Shannon divergence.'mutual_info'
- mutual information.'equal'
- equally redistribute the mass among the rest.
Only used when
mode = 'dist'
.normalize_weights (
Literal
['scale'
,'softmax'
]) –How to row-normalize the weights. Valid options are:
'scale'
- divide by the sum.'softmax'
- use a softmax.
Only used when
mode = 'dist'
.softmax_scale (
float
) – Scaling factor in the softmax, used for normalizing the weights to sum to \(1\).return_weights (
bool
) – If True, aDataFrame
of the weights used for the projection is also returned. Ifmode = 'scale'
, the weights will be None.
- Return type
- Returns
: The lineage object, reduced to the initial or terminal states. The weights used for the projection of shape
(n_query, n_reference)
, ifreturn_weights = True
.