Developer API#

Kernels#

class cellrank.kernels.Kernel(adata, parent=None, **kwargs)[source]#

Base kernel class.

Parameters:
  • adata (AnnData) – Annotated data object.

  • parent (Optional[KernelExpression]) – Parent kernel expression.

  • kwargs (Any) – Keyword arguments for the parent.

property adata: AnnData#

Annotated data object.

abstract property backward: bool | None#

Direction of the process.

cbc(source, target, cluster_key, rep, graph_key='distances')#

Compute cross-boundary correctness score between source and target cluster.

Parameters:
  • source (str) – Name of the source cluster.

  • target (str) – Name of the target cluster.

  • cluster_key (str) – Key in obs to obtain cluster annotations.

  • rep (str) – Key in obsm to use as data representation.

  • graph_key (str) – Name of graph representation to use from obsp.

Return type:

ndarray

Returns:

: Cross-boundary correctness score for each observation.

abstract compute_transition_matrix(*args, **kwargs)#

Compute transition matrix.

Parameters:
  • args (Any) – Positional arguments.

  • kwargs (Any) – Keyword arguments.

Return type:

KernelExpression

Returns:

: Modifies transition_matrix and returns self.

copy(*, deep=False)[source]#

Return a copy of self.

Parameters:

deep (bool) – Whether to use deepcopy().

Return type:

Kernel

Returns:

: Copy of self.

classmethod from_adata(adata, key, copy=False)[source]#

Read the kernel saved using write_to_adata().

Parameters:
  • adata (AnnData) – Annotated data object.

  • key (str) – Key in obsp where the transition matrix is stored. The parameters should be stored in adata.uns['{key}_params'].

  • copy (bool) – Whether to copy the transition matrix.

Return type:

Kernel

Returns:

: The kernel with explicitly initialized properties:

property kernels: Tuple[KernelExpression, ...]#

Underlying base kernels.

property params: Dict[str, Any]#

Parameters which are used to compute the transition matrix.

plot_projection(basis='umap', key_added=None, recompute=False, stream=True, connectivities=None, **kwargs)#

Plot transition_matrix as a stream or a grid plot.

Parameters:
Return type:

None

Returns:

: Nothing, just plots and modifies obsm with a key based on the key_added.

plot_random_walks(n_sims=100, max_iter=0.25, seed=None, successive_hits=0, start_ixs=None, stop_ixs=None, basis='umap', cmap='gnuplot', linewidth=1.0, linealpha=0.3, ixs_legend_loc=None, n_jobs=None, backend='loky', show_progress_bar=True, figsize=None, dpi=None, save=None, **kwargs)#

Plot random walks in an embedding.

This method simulates random walks on the Markov chain defined through the corresponding transition matrix. The method is intended to give qualitative rather than quantitative insights into the transition matrix. Random walks are simulated by iteratively choosing the next cell based on the current cell’s transition probabilities.

Parameters:
  • n_sims (int) – Number of random walks to simulate.

  • max_iter (Union[int, float]) – Maximum number of steps of a random walk. If a float, it is interpreted as a fraction of the number of cells.

  • seed (Optional[int]) – Random seed.

  • successive_hits (int) – Number of successive hits in the stop_ixs required to stop prematurely.

  • start_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) –

    Cells from which to sample the starting points. If None, use all cells. Can be specified as:

    • dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.

    • Sequence - sequence of cell ids in obs_names.

    For example {'dpt_pseudotime': [0, 0.1]} means that starting points for random walks will be sampled uniformly from cells whose pseudotime is in \([0, 0.1]\).

  • stop_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) –

    Cells that terminate the random walk when hit. If None, terminate after max_iter steps. Can be specified as:

    • dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.

    • Sequence - sequence of cell ids in obs_names.

    For example {'clusters': ['Alpha', 'Beta']} and successive_hits = 3 means that the random walk will stop prematurely after cells in the specified clusters have been visited 3 times in a row.

  • basis (str) – Basis in obsm to use as an embedding.

  • cmap (Union[str, LinearSegmentedColormap]) – Colormap for the random walk lines.

  • linewidth (float) – Width of the random walk lines.

  • linealpha (float) – Alpha value of the random walk lines.

  • ixs_legend_loc (Optional[str]) – Legend location for the start/stop indices.

  • show_progress_bar (bool) – Whether to show a progress bar. Disabling it may slightly improve performance.

  • n_jobs (Optional[int]) – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.

  • backend (str) – Which backend to use for parallelization. See Parallel for valid options.

  • figsize (Optional[Tuple[float, float]]) – Size of the figure.

  • dpi (Optional[int]) – Dots per inch.

  • save (Union[Path, str, None]) – Filename where to save the plot.

  • kwargs (Any) – Keyword arguments for scatter().

Return type:

None

Returns:

: Nothing, just plots the figure. Optionally saves it based on save. For each random walk, the first/last cell is marked by the start/end colors of cmap.
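The simulation described above can be sketched without any dependencies. This is a minimal illustration, assuming the transition matrix is given as a list of rows that each sum to 1. The function name `simulate_random_walk` and the rounding of a fractional `max_iter` (ceiling, at least one step) are assumptions of this sketch, not CellRank's implementation.

```python
import math
import random


def simulate_random_walk(transition_matrix, start, max_iter=0.25, seed=None):
    """Simulate one random walk on a row-stochastic matrix given as a list of rows."""
    rng = random.Random(seed)
    n_cells = len(transition_matrix)
    # Assumption of this sketch: a float is a fraction of the number of cells,
    # rounded up, with at least one step.
    if isinstance(max_iter, float):
        n_steps = max(1, math.ceil(max_iter * n_cells))
    else:
        n_steps = max_iter
    path = [start]
    current = start
    for _ in range(n_steps):
        # Choose the next cell using the current cell's transition probabilities.
        current = rng.choices(range(n_cells), weights=transition_matrix[current], k=1)[0]
        path.append(current)
    return path


# Toy chain in which every cell moves to the next one and the last cell is absorbing.
T = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
]
walk = simulate_random_walk(T, start=0, max_iter=3, seed=0)
short_walk = simulate_random_walk(T, start=0, max_iter=0.5, seed=0)
```

Because the toy chain is deterministic, `walk` visits cells 0, 1, 2 and 3 in order. With `max_iter=0.5` and 4 cells, the sketch takes 2 steps.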

plot_single_flow(cluster, cluster_key, time_key, clusters=None, time_points=None, min_flow=0, remove_empty_clusters=True, ascending=False, legend_loc='upper right out', alpha=0.8, xticks_step_size=1, figsize=None, dpi=None, save=None, show=True)#

Visualize outgoing flow from a cluster of cells [Mittnenzweig et al., 2021].

Parameters:
  • cluster (str) – Cluster for which to visualize outgoing flow.

  • cluster_key (str) – Key in obs where clustering is stored.

  • time_key (str) – Key in obs where experimental time is stored.

  • clusters (Optional[Sequence[Any]]) – Visualize flow only for these clusters. If None, use all clusters.

  • time_points (Optional[Sequence[Union[float, int]]]) – Visualize flow only for these time points. If None, use all time points.

  • min_flow (float) – Only show flow edges with flow greater than this value. Flow values are always in \([0, 1]\).

  • remove_empty_clusters (bool) – Whether to remove clusters with no incoming flow edges.

  • ascending (Optional[bool]) – Whether to sort the clusters by ascending or descending incoming flow. If None, use the order defined by clusters.

  • alpha (Optional[float]) – Alpha value for cell proportions.

  • xticks_step_size (Optional[int]) – Show only every n-th tick on the x-axis. If None, don’t show any ticks.

  • legend_loc (Optional[str]) – Position of the legend. If None, do not show the legend.

  • figsize (Optional[Tuple[float, float]]) – Size of the figure.

  • dpi (Optional[int]) – Dots per inch.

  • save (Union[Path, str, None]) – Filename where to save the plot.

  • show (bool) – If False, return Axes.

Return type:

Optional[Axes]

Returns:

: The axes object if show = False; otherwise nothing, just plots the figure. Optionally saves it based on save.

Notes

This function is a Python re-implementation of the original R function with some minor stylistic differences. It will not recreate the results from [Mittnenzweig et al., 2021], because there the Metacell model [Baran et al., 2019] was used to compute the flow, whereas here the transition matrix is used.

static read(fname, adata=None, copy=False)#

De-serialize self from a file.

Parameters:
  • fname (Union[str, Path]) – Path from which to read the object.

  • adata (Optional[AnnData]) – AnnData object to assign to the saved object. Only used when the saved object has adata and it was saved without it.

  • copy (bool) – Whether to copy adata before assigning it. If adata is a view, it is always copied.

Return type:

IOMixin

Returns:

: The de-serialized object.

property shape: Tuple[int, int]#

(n_cells, n_cells).

property transition_matrix: ndarray | csr_matrix#

Row-normalized transition matrix.
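To make "row-normalized" concrete: every row sums to 1, so row i can be read as a probability distribution over the next cell. The following is a minimal sketch with a hypothetical helper `row_normalize` operating on a dense list-of-lists matrix; it is not part of CellRank.

```python
def row_normalize(weights):
    """Scale each row of a non-negative matrix so that it sums to 1."""
    normalized = []
    for row in weights:
        total = sum(row)
        if total <= 0:
            raise ValueError("each row needs at least one positive entry")
        normalized.append([w / total for w in row])
    return normalized


# Unnormalized, non-negative edge weights between 3 cells.
W = [[2.0, 1.0, 1.0], [0.0, 3.0, 1.0], [1.0, 0.0, 0.0]]
T = row_normalize(W)
row_sums = [sum(row) for row in T]
```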

write(fname, write_adata=True)#

Serialize self to a file using pickle.

Parameters:
  • fname (Union[str, Path]) – Path where to save the object.

  • write_adata (bool) – Whether to save adata object.

Return type:

None

Returns:

: Nothing, just writes itself to a file.
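The interplay between `write(..., write_adata=False)` and `read(..., adata=...)` can be illustrated with a pickle round trip. This is a sketch under stated assumptions: the kernel is represented by a plain dictionary, `write_state` and `read_state` are hypothetical helpers, and the string `"placeholder-adata"` stands in for an AnnData object.

```python
import pickle
import tempfile
from pathlib import Path


def write_state(state, fname, write_adata=True):
    """Pickle `state`, optionally leaving out the (potentially large) data object."""
    to_save = dict(state)
    if not write_adata:
        to_save["adata"] = None
    with open(fname, "wb") as fout:
        pickle.dump(to_save, fout)


def read_state(fname, adata=None):
    """Unpickle a state and re-attach `adata` only if it was saved without one."""
    with open(fname, "rb") as fin:
        state = pickle.load(fin)
    if state.get("adata") is None and adata is not None:
        state["adata"] = adata
    return state


state = {
    "transition_matrix": [[0.5, 0.5], [0.0, 1.0]],
    "params": {"softmax_scale": 1.0},
    "adata": "placeholder-adata",
}

with tempfile.TemporaryDirectory() as tmpdir:
    path = Path(tmpdir) / "kernel.pickle"
    write_state(state, path, write_adata=False)
    without_adata = read_state(path)
    restored = read_state(path, adata="placeholder-adata")
```

Saving without the data object keeps the file small, at the cost of having to supply the same object again when reading.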

write_to_adata(key=None, copy=False)#

Write the transition matrix and parameters used for computation to the underlying adata object.

Parameters:
Return type:

None

Returns:

: Updates the adata with the following fields:

class cellrank.kernels.UnidirectionalKernel(adata, parent=None, **kwargs)[source]#

Parameters:
  • adata (AnnData) –

  • parent (KernelExpression | None) –

  • kwargs (Any) –

property adata: AnnData#

Annotated data object.

property backward: None#

None.

cbc(source, target, cluster_key, rep, graph_key='distances')#

Compute cross-boundary correctness score between source and target cluster.

Parameters:
  • source (str) – Name of the source cluster.

  • target (str) – Name of the target cluster.

  • cluster_key (str) – Key in obs to obtain cluster annotations.

  • rep (str) – Key in obsm to use as data representation.

  • graph_key (str) – Name of graph representation to use from obsp.

Return type:

ndarray

Returns:

: Cross-boundary correctness score for each observation.

abstract compute_transition_matrix(*args, **kwargs)#

Compute transition matrix.

Parameters:
  • args (Any) – Positional arguments.

  • kwargs (Any) – Keyword arguments.

Return type:

KernelExpression

Returns:

: Modifies transition_matrix and returns self.

copy(*, deep=False)#

Return a copy of self.

Parameters:

deep (bool) – Whether to use deepcopy().

Return type:

Kernel

Returns:

: Copy of self.

classmethod from_adata(adata, key, copy=False)#

Read the kernel saved using write_to_adata().

Parameters:
  • adata (AnnData) – Annotated data object.

  • key (str) – Key in obsp where the transition matrix is stored. The parameters should be stored in adata.uns['{key}_params'].

  • copy (bool) – Whether to copy the transition matrix.

Return type:

Kernel

Returns:

: The kernel with explicitly initialized properties:

property kernels: Tuple[KernelExpression, ...]#

Underlying base kernels.

property params: Dict[str, Any]#

Parameters which are used to compute the transition matrix.

plot_projection(basis='umap', key_added=None, recompute=False, stream=True, connectivities=None, **kwargs)#

Plot transition_matrix as a stream or a grid plot.

Parameters:
Return type:

None

Returns:

: Nothing, just plots and modifies obsm with a key based on the key_added.

plot_random_walks(n_sims=100, max_iter=0.25, seed=None, successive_hits=0, start_ixs=None, stop_ixs=None, basis='umap', cmap='gnuplot', linewidth=1.0, linealpha=0.3, ixs_legend_loc=None, n_jobs=None, backend='loky', show_progress_bar=True, figsize=None, dpi=None, save=None, **kwargs)#

Plot random walks in an embedding.

This method simulates random walks on the Markov chain defined through the corresponding transition matrix. The method is intended to give qualitative rather than quantitative insights into the transition matrix. Random walks are simulated by iteratively choosing the next cell based on the current cell’s transition probabilities.

Parameters:
  • n_sims (int) – Number of random walks to simulate.

  • max_iter (Union[int, float]) – Maximum number of steps of a random walk. If a float, it is interpreted as a fraction of the number of cells.

  • seed (Optional[int]) – Random seed.

  • successive_hits (int) – Number of successive hits in the stop_ixs required to stop prematurely.

  • start_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) –

    Cells from which to sample the starting points. If None, use all cells. Can be specified as:

    • dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.

    • Sequence - sequence of cell ids in obs_names.

    For example {'dpt_pseudotime': [0, 0.1]} means that starting points for random walks will be sampled uniformly from cells whose pseudotime is in \([0, 0.1]\).

  • stop_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) –

    Cells that terminate the random walk when hit. If None, terminate after max_iter steps. Can be specified as:

    • dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.

    • Sequence - sequence of cell ids in obs_names.

    For example {'clusters': ['Alpha', 'Beta']} and successive_hits = 3 means that the random walk will stop prematurely after cells in the specified clusters have been visited 3 times in a row.

  • basis (str) – Basis in obsm to use as an embedding.

  • cmap (Union[str, LinearSegmentedColormap]) – Colormap for the random walk lines.

  • linewidth (float) – Width of the random walk lines.

  • linealpha (float) – Alpha value of the random walk lines.

  • ixs_legend_loc (Optional[str]) – Legend location for the start/stop indices.

  • show_progress_bar (bool) – Whether to show a progress bar. Disabling it may slightly improve performance.

  • n_jobs (Optional[int]) – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.

  • backend (str) – Which backend to use for parallelization. See Parallel for valid options.

  • figsize (Optional[Tuple[float, float]]) – Size of the figure.

  • dpi (Optional[int]) – Dots per inch.

  • save (Union[Path, str, None]) – Filename where to save the plot.

  • kwargs (Any) – Keyword arguments for scatter().

Return type:

None

Returns:

: Nothing, just plots the figure. Optionally saves it based on save. For each random walk, the first/last cell is marked by the start/end colors of cmap.
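The three accepted forms of `start_ixs` and `stop_ixs` (None, a dictionary with 1 key, or a sequence of cell ids) can be resolved to cell positions roughly as follows. This is a sketch only: `resolve_ixs` is a hypothetical helper, `obs` is modeled as a dictionary of columns rather than a data frame, and the interval is treated as closed, as in the \([min, max]\) notation above.

```python
def resolve_ixs(obs, obs_names, spec):
    """Return positions of the cells selected by `spec`."""
    if spec is None:
        # No restriction: use all cells.
        return list(range(len(obs_names)))
    if isinstance(spec, dict):
        if len(spec) != 1:
            raise ValueError("expected a dictionary with exactly 1 key")
        key, value = next(iter(spec.items()))
        column = obs[key]
        if isinstance(value, str):
            value = [value]
        if all(isinstance(v, str) for v in value):
            # One or more cluster names from a categorical column.
            wanted = set(value)
            return [i for i, v in enumerate(column) if v in wanted]
        # Otherwise a closed [min, max] interval on a numeric column.
        lo, hi = value
        return [i for i, v in enumerate(column) if lo <= v <= hi]
    # A sequence of cell ids.
    wanted = set(spec)
    return [i for i, name in enumerate(obs_names) if name in wanted]


obs_names = ["c0", "c1", "c2", "c3"]
obs = {
    "dpt_pseudotime": [0.05, 0.4, 0.0, 0.9],
    "clusters": ["Alpha", "Beta", "Ductal", "Alpha"],
}
early = resolve_ixs(obs, obs_names, {"dpt_pseudotime": [0, 0.1]})
endocrine = resolve_ixs(obs, obs_names, {"clusters": ["Alpha", "Beta"]})
by_name = resolve_ixs(obs, obs_names, ["c1", "c3"])
```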

plot_single_flow(cluster, cluster_key, time_key, clusters=None, time_points=None, min_flow=0, remove_empty_clusters=True, ascending=False, legend_loc='upper right out', alpha=0.8, xticks_step_size=1, figsize=None, dpi=None, save=None, show=True)#

Visualize outgoing flow from a cluster of cells [Mittnenzweig et al., 2021].

Parameters:
  • cluster (str) – Cluster for which to visualize outgoing flow.

  • cluster_key (str) – Key in obs where clustering is stored.

  • time_key (str) – Key in obs where experimental time is stored.

  • clusters (Optional[Sequence[Any]]) – Visualize flow only for these clusters. If None, use all clusters.

  • time_points (Optional[Sequence[Union[float, int]]]) – Visualize flow only for these time points. If None, use all time points.

  • min_flow (float) – Only show flow edges with flow greater than this value. Flow values are always in \([0, 1]\).

  • remove_empty_clusters (bool) – Whether to remove clusters with no incoming flow edges.

  • ascending (Optional[bool]) – Whether to sort the clusters by ascending or descending incoming flow. If None, use the order defined by clusters.

  • alpha (Optional[float]) – Alpha value for cell proportions.

  • xticks_step_size (Optional[int]) – Show only every n-th tick on the x-axis. If None, don’t show any ticks.

  • legend_loc (Optional[str]) – Position of the legend. If None, do not show the legend.

  • figsize (Optional[Tuple[float, float]]) – Size of the figure.

  • dpi (Optional[int]) – Dots per inch.

  • save (Union[Path, str, None]) – Filename where to save the plot.

  • show (bool) – If False, return Axes.

Return type:

Optional[Axes]

Returns:

: The axes object if show = False; otherwise nothing, just plots the figure. Optionally saves it based on save.

Notes

This function is a Python re-implementation of the original R function with some minor stylistic differences. It will not recreate the results from [Mittnenzweig et al., 2021], because there the Metacell model [Baran et al., 2019] was used to compute the flow, whereas here the transition matrix is used.

static read(fname, adata=None, copy=False)#

De-serialize self from a file.

Parameters:
  • fname (Union[str, Path]) – Path from which to read the object.

  • adata (Optional[AnnData]) – AnnData object to assign to the saved object. Only used when the saved object has adata and it was saved without it.

  • copy (bool) – Whether to copy adata before assigning it. If adata is a view, it is always copied.

Return type:

IOMixin

Returns:

: The de-serialized object.

property shape: Tuple[int, int]#

(n_cells, n_cells).

property transition_matrix: ndarray | csr_matrix#

Row-normalized transition matrix.

write(fname, write_adata=True)#

Serialize self to a file using pickle.

Parameters:
  • fname (Union[str, Path]) – Path where to save the object.

  • write_adata (bool) – Whether to save adata object.

Return type:

None

Returns:

: Nothing, just writes itself to a file.

write_to_adata(key=None, copy=False)#

Write the transition matrix and parameters used for computation to the underlying adata object.

Parameters:
Return type:

None

Returns:

: Updates the adata with the following fields:

class cellrank.kernels.BidirectionalKernel(*args, backward=False, **kwargs)[source]#

Parameters:
  • args (Any) –

  • backward (bool) –

  • kwargs (Any) –

property adata: AnnData#

Annotated data object.

property backward: bool#

Direction of the process.

cbc(source, target, cluster_key, rep, graph_key='distances')#

Compute cross-boundary correctness score between source and target cluster.

Parameters:
  • source (str) – Name of the source cluster.

  • target (str) – Name of the target cluster.

  • cluster_key (str) – Key in obs to obtain cluster annotations.

  • rep (str) – Key in obsm to use as data representation.

  • graph_key (str) – Name of graph representation to use from obsp.

Return type:

ndarray

Returns:

: Cross-boundary correctness score for each observation.

abstract compute_transition_matrix(*args, **kwargs)#

Compute transition matrix.

Parameters:
  • args (Any) – Positional arguments.

  • kwargs (Any) – Keyword arguments.

Return type:

KernelExpression

Returns:

: Modifies transition_matrix and returns self.

copy(*, deep=False)#

Return a copy of self.

Parameters:

deep (bool) – Whether to use deepcopy().

Return type:

Kernel

Returns:

: Copy of self.

classmethod from_adata(adata, key, copy=False)#

Read the kernel saved using write_to_adata().

Parameters:
  • adata (AnnData) – Annotated data object.

  • key (str) – Key in obsp where the transition matrix is stored. The parameters should be stored in adata.uns['{key}_params'].

  • copy (bool) – Whether to copy the transition matrix.

Return type:

Kernel

Returns:

: The kernel with explicitly initialized properties:

property kernels: Tuple[KernelExpression, ...]#

Underlying base kernels.

property params: Dict[str, Any]#

Parameters which are used to compute the transition matrix.

plot_projection(basis='umap', key_added=None, recompute=False, stream=True, connectivities=None, **kwargs)#

Plot transition_matrix as a stream or a grid plot.

Parameters:
Return type:

None

Returns:

: Nothing, just plots and modifies obsm with a key based on the key_added.

plot_random_walks(n_sims=100, max_iter=0.25, seed=None, successive_hits=0, start_ixs=None, stop_ixs=None, basis='umap', cmap='gnuplot', linewidth=1.0, linealpha=0.3, ixs_legend_loc=None, n_jobs=None, backend='loky', show_progress_bar=True, figsize=None, dpi=None, save=None, **kwargs)#

Plot random walks in an embedding.

This method simulates random walks on the Markov chain defined through the corresponding transition matrix. The method is intended to give qualitative rather than quantitative insights into the transition matrix. Random walks are simulated by iteratively choosing the next cell based on the current cell’s transition probabilities.

Parameters:
  • n_sims (int) – Number of random walks to simulate.

  • max_iter (Union[int, float]) – Maximum number of steps of a random walk. If a float, it is interpreted as a fraction of the number of cells.

  • seed (Optional[int]) – Random seed.

  • successive_hits (int) – Number of successive hits in the stop_ixs required to stop prematurely.

  • start_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) –

    Cells from which to sample the starting points. If None, use all cells. Can be specified as:

    • dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.

    • Sequence - sequence of cell ids in obs_names.

    For example {'dpt_pseudotime': [0, 0.1]} means that starting points for random walks will be sampled uniformly from cells whose pseudotime is in \([0, 0.1]\).

  • stop_ixs (Union[Sequence[str], Mapping[str, Union[str, Sequence[str], Tuple[float, float]]], None]) –

    Cells that terminate the random walk when hit. If None, terminate after max_iter steps. Can be specified as:

    • dict - dictionary with 1 key in obs with values corresponding to either 1 or more clusters (if the column is categorical) or a tuple specifying \([min, max]\) interval from which to select the indices.

    • Sequence - sequence of cell ids in obs_names.

    For example {'clusters': ['Alpha', 'Beta']} and successive_hits = 3 means that the random walk will stop prematurely after cells in the specified clusters have been visited 3 times in a row.

  • basis (str) – Basis in obsm to use as an embedding.

  • cmap (Union[str, LinearSegmentedColormap]) – Colormap for the random walk lines.

  • linewidth (float) – Width of the random walk lines.

  • linealpha (float) – Alpha value of the random walk lines.

  • ixs_legend_loc (Optional[str]) – Legend location for the start/stop indices.

  • show_progress_bar (bool) – Whether to show a progress bar. Disabling it may slightly improve performance.

  • n_jobs (Optional[int]) – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.

  • backend (str) – Which backend to use for parallelization. See Parallel for valid options.

  • figsize (Optional[Tuple[float, float]]) – Size of the figure.

  • dpi (Optional[int]) – Dots per inch.

  • save (Union[Path, str, None]) – Filename where to save the plot.

  • kwargs (Any) – Keyword arguments for scatter().

Return type:

None

Returns:

: Nothing, just plots the figure. Optionally saves it based on save. For each random walk, the first/last cell is marked by the start/end colors of cmap.
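The early-stopping rule controlled by `stop_ixs` and `successive_hits` can be sketched as a post-hoc truncation of an already simulated path. The helper `truncate_at_stop` is hypothetical, and two details are assumptions of this sketch rather than documented behavior: a value of 0 is treated like 1 (stop at the first hit), and the starting cell counts as a possible hit.

```python
def truncate_at_stop(path, stop_ixs, successive_hits=0):
    """Cut `path` once enough consecutive cells lie in `stop_ixs`."""
    stop = set(stop_ixs)
    needed = max(1, successive_hits)  # assumption: 0 behaves like 1
    hits = 0
    for step, cell in enumerate(path):
        # A cell outside the stop set resets the streak.
        hits = hits + 1 if cell in stop else 0
        if hits >= needed:
            return path[: step + 1]
    return path


path = [0, 5, 1, 5, 6, 5, 2, 3]
first_hit = truncate_at_stop(path, stop_ixs=[5, 6])
three_in_a_row = truncate_at_stop(path, stop_ixs=[5, 6], successive_hits=3)
never_stops = truncate_at_stop(path, stop_ixs=[9], successive_hits=3)
```

Requiring several hits in a row makes the stopping rule less sensitive to a walk that only briefly passes through the stop set.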

plot_single_flow(cluster, cluster_key, time_key, clusters=None, time_points=None, min_flow=0, remove_empty_clusters=True, ascending=False, legend_loc='upper right out', alpha=0.8, xticks_step_size=1, figsize=None, dpi=None, save=None, show=True)#

Visualize outgoing flow from a cluster of cells [Mittnenzweig et al., 2021].

Parameters:
  • cluster (str) – Cluster for which to visualize outgoing flow.

  • cluster_key (str) – Key in obs where clustering is stored.

  • time_key (str) – Key in obs where experimental time is stored.

  • clusters (Optional[Sequence[Any]]) – Visualize flow only for these clusters. If None, use all clusters.

  • time_points (Optional[Sequence[Union[float, int]]]) – Visualize flow only for these time points. If None, use all time points.

  • min_flow (float) – Only show flow edges with flow greater than this value. Flow values are always in \([0, 1]\).

  • remove_empty_clusters (bool) – Whether to remove clusters with no incoming flow edges.

  • ascending (Optional[bool]) – Whether to sort the clusters by ascending or descending incoming flow. If None, use the order defined by clusters.

  • alpha (Optional[float]) – Alpha value for cell proportions.

  • xticks_step_size (Optional[int]) – Show only every n-th tick on the x-axis. If None, don’t show any ticks.

  • legend_loc (Optional[str]) – Position of the legend. If None, do not show the legend.

  • figsize (Optional[Tuple[float, float]]) – Size of the figure.

  • dpi (Optional[int]) – Dots per inch.

  • save (Union[Path, str, None]) – Filename where to save the plot.

  • show (bool) – If False, return Axes.

Return type:

Optional[Axes]

Returns:

: The axes object if show = False; otherwise nothing, just plots the figure. Optionally saves it based on save.

Notes

This function is a Python re-implementation of the original R function with some minor stylistic differences. It will not recreate the results from [Mittnenzweig et al., 2021], because there the Metacell model [Baran et al., 2019] was used to compute the flow, whereas here the transition matrix is used.

static read(fname, adata=None, copy=False)#

De-serialize self from a file.

Parameters:
  • fname (Union[str, Path]) – Path from which to read the object.

  • adata (Optional[AnnData]) – AnnData object to assign to the saved object. Only used when the saved object has adata and it was saved without it.

  • copy (bool) – Whether to copy adata before assigning it. If adata is a view, it is always copied.

Return type:

IOMixin

Returns:

: The de-serialized object.

property shape: Tuple[int, int]#

(n_cells, n_cells).

property transition_matrix: ndarray | csr_matrix#

Row-normalized transition matrix.

write(fname, write_adata=True)#

Serialize self to a file using pickle.

Parameters:
  • fname (Union[str, Path]) – Path where to save the object.

  • write_adata (bool) – Whether to save adata object.

Return type:

None

Returns:

: Nothing, just writes itself to a file.

write_to_adata(key=None, copy=False)#

Write the transition matrix and parameters used for computation to the underlying adata object.

Parameters:
Return type:

None

Returns:

: Updates the adata with the following fields:

Similarity#

class cellrank.kernels.utils.SimilarityABC[source]#

Base class for all similarity schemes.

abstract __call__(v, D, softmax_scale=1.0)[source]#

Compute transition probability of a cell to its nearest neighbors using RNA velocity.

Parameters:
  • v (ndarray) – Array of shape (n_genes,) or (n_neighbors, n_genes) containing the velocity vector(s). The second case is used for the backward process.

  • D (ndarray) – Array of shape (n_neighbors, n_genes) corresponding to the transcriptomic displacement of the current cell with respect to its nearest neighbors.

  • softmax_scale (float) – Scaling factor for the softmax function.

Return type:

Tuple[ndarray, ndarray]

Returns:

: The probability and unscaled logits arrays of shape (n_neighbors,).

class cellrank.kernels.utils.Cosine[source]#

Cosine similarity scheme as defined in eq. (4.7) [Li et al., 2021].

\[v(s_i, s_j) := g(\cos(\delta_{i, j}, v_i))\]

where \(v_i\) is the velocity vector of cell \(i\), \(\delta_{i, j}\) corresponds to the transcriptional displacement between cells \(i\) and \(j\) and \(g\) is a softmax function with some scaling parameter.

__call__(v, D, softmax_scale=1.0)#

Compute transition probability of a cell to its nearest neighbors using RNA velocity.

Parameters:
  • v (ndarray) – Array of shape (n_genes,) or (n_neighbors, n_genes) containing the velocity vector(s). The second case is used for the backward process.

  • D (ndarray) – Array of shape (n_neighbors, n_genes) corresponding to the transcriptomic displacement of the current cell with respect to its nearest neighbors.

  • softmax_scale (float) – Scaling factor for the softmax function.

Return type:

Tuple[ndarray, ndarray]

Returns:

: The probability and unscaled logits arrays of shape (n_neighbors,).
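As an illustration of the cosine scheme for a single cell, the sketch below computes the cosine between the velocity vector and each displacement, then applies a softmax. It uses plain Python lists instead of arrays. That `softmax_scale` multiplies the logits before the softmax, and that zero vectors get a similarity of 0, are assumptions of this sketch; `cosine_scheme` is a hypothetical name.

```python
import math


def softmax(values):
    """Numerically stable softmax of a list of floats."""
    shift = max(values)
    exps = [math.exp(x - shift) for x in values]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_scheme(v, D, softmax_scale=1.0):
    """Return (probabilities, unscaled logits) for one cell and its neighbors."""
    v_norm = math.sqrt(sum(x * x for x in v))
    logits = []
    for d in D:
        d_norm = math.sqrt(sum(x * x for x in d))
        dot = sum(a * b for a, b in zip(d, v))
        # Sketch choice: a zero vector has no direction, so use similarity 0.
        logits.append(dot / (d_norm * v_norm) if d_norm > 0 and v_norm > 0 else 0.0)
    probs = softmax([softmax_scale * x for x in logits])
    return probs, logits


v = [1.0, 0.0]                             # velocity of the current cell
D = [[2.0, 0.0], [0.0, 3.0], [-1.0, 0.0]]  # displacements to 3 neighbors
probs, logits = cosine_scheme(v, D, softmax_scale=2.0)
```

The neighbor lying in the direction of the velocity receives the highest probability, and the one in the opposite direction the lowest.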

hessian(v, D, softmax_scale=1.0)#

Compute the Hessian.

Parameters:
  • v (ndarray) – Array of shape (n_genes,) containing the velocity vector.

  • D (ndarray) – Array of shape (n_neighbors, n_genes) corresponding to the transcriptomic displacement of the current cell with respect to its nearest neighbors.

  • softmax_scale (float) – Scaling factor for the softmax function.

Return type:

ndarray

Returns:

: The full Hessian of shape (n_neighbors, n_genes, n_genes) or only its diagonal of shape (n_neighbors, n_genes).

class cellrank.kernels.utils.Correlation[source]#

Pearson correlation scheme as defined in eq. (4.8) [Li et al., 2021].

\[v(s_i, s_j) := g(\operatorname{corr}(\delta_{i, j}, v_i))\]

where \(v_i\) is the velocity vector of cell \(i\), \(\delta_{i, j}\) corresponds to the transcriptional displacement between cells \(i\) and \(j\) and \(g\) is a softmax function with some scaling parameter.

__call__(v, D, softmax_scale=1.0)#

Compute transition probability of a cell to its nearest neighbors using RNA velocity.

Parameters:
  • v (ndarray) – Array of shape (n_genes,) or (n_neighbors, n_genes) containing the velocity vector(s). The second case is used for the backward process.

  • D (ndarray) – Array of shape (n_neighbors, n_genes) corresponding to the transcriptomic displacement of the current cell with respect to its nearest neighbors.

  • softmax_scale (float) – Scaling factor for the softmax function.

Return type:

Tuple[ndarray, ndarray]

Returns:

: The probability and unscaled logits arrays of shape (n_neighbors,).
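The Pearson correlation used here equals the cosine of the mean-centered vectors, so adding a constant to every gene of the velocity vector leaves the logits unchanged. A minimal sketch of the unscaled logits follows; `pearson` and `correlation_logits` are hypothetical helpers, and returning 0 for a constant vector is a choice of this sketch.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long lists (0.0 if either is constant)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cx = [a - mean_x for a in x]
    cy = [b - mean_y for b in y]
    denom = math.sqrt(sum(a * a for a in cx)) * math.sqrt(sum(b * b for b in cy))
    return sum(a * b for a, b in zip(cx, cy)) / denom if denom > 0 else 0.0


def correlation_logits(v, D):
    """Unscaled logits: correlation of each displacement with the velocity."""
    return [pearson(d, v) for d in D]


v = [1.0, 2.0, 3.0]
D = [[2.0, 4.0, 6.0], [3.0, 2.0, 1.0], [1.0, 1.0, 1.0]]
logits = correlation_logits(v, D)

# Shifting every entry of v by a constant does not change the correlation.
shifted_logits = correlation_logits([x + 10.0 for x in v], D)
```

The first displacement is perfectly correlated with the velocity, the second is perfectly anti-correlated, and the constant third one carries no information.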

hessian(v, D, softmax_scale=1.0)#

Compute the Hessian.

Parameters:
  • v (ndarray) – Array of shape (n_genes,) containing the velocity vector.

  • D (ndarray) – Array of shape (n_neighbors, n_genes) corresponding to the transcriptomic displacement of the current cell with respect to its nearest neighbors.

  • softmax_scale (float) – Scaling factor for the softmax function.

Return type:

ndarray

Returns:

: The full Hessian of shape (n_neighbors, n_genes, n_genes) or only its diagonal of shape (n_neighbors, n_genes).

class cellrank.kernels.utils.DotProduct[source]#

Dot product scheme as defined in eq. (4.9) [Li et al., 2021].

\[v(s_i, s_j) := g(\delta_{i, j}^T v_i)\]

where \(v_i\) is the velocity vector of cell \(i\), \(\delta_{i, j}\) corresponds to the transcriptional displacement between cells \(i\) and \(j\) and \(g\) is a softmax function with some scaling parameter.

__call__(v, D, softmax_scale=1.0)#

Compute transition probability of a cell to its nearest neighbors using RNA velocity.

Parameters:
  • v (ndarray) – Array of shape (n_genes,) or (n_neighbors, n_genes) containing the velocity vector(s). The second case is used for the backward process.

  • D (ndarray) – Array of shape (n_neighbors, n_genes) corresponding to the transcriptomic displacement of the current cell with respect to its nearest neighbors.

  • softmax_scale (float) – Scaling factor for the softmax function.

Return type:

Tuple[ndarray, ndarray]

Returns:

: The probability and unscaled logits arrays of shape (n_neighbors,).
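Unlike the cosine and correlation schemes, the dot product is not normalized, so the magnitudes of the displacement and the velocity influence the logits. The sketch below also shows how a larger `softmax_scale` concentrates probability on the best-aligned neighbor. It assumes that `softmax_scale` multiplies the logits before the softmax; `dot_product_scheme` is a hypothetical name.

```python
import math


def dot_product_scheme(v, D, softmax_scale=1.0):
    """Return (probabilities, unscaled logits) using the plain dot product."""
    logits = [sum(a * b for a, b in zip(d, v)) for d in D]
    scaled = [softmax_scale * x for x in logits]
    shift = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps], logits


v = [1.0, 1.0]
D = [[1.0, 1.0], [1.0, 0.0], [-1.0, 0.0]]
soft, logits = dot_product_scheme(v, D, softmax_scale=0.5)
sharp, _ = dot_product_scheme(v, D, softmax_scale=5.0)
```

The unscaled logits are identical in both calls; only the probabilities change with the scale.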

hessian(v, D, softmax_scale=1.0)#

Compute the Hessian.

Parameters:
  • v (ndarray) – Array of shape (n_genes,) containing the velocity vector.

  • D (ndarray) – Array of shape (n_neighbors, n_genes) corresponding to the transcriptomic displacement of the current cell with respect to its nearest neighbors.

  • softmax_scale (float) – Scaling factor for the softmax function.

Return type:

ndarray

Returns:

: The full Hessian of shape (n_neighbors, n_genes, n_genes) or only its diagonal of shape (n_neighbors, n_genes).

Threshold Scheme#

class cellrank.kernels.utils.ThresholdSchemeABC[source]#

Base class for all connectivity biasing schemes.

abstract __call__(cell_pseudotime, neigh_pseudotime, neigh_conn, **kwargs)[source]#

Calculate biased connections for a given cell.

Parameters:
  • cell_pseudotime (float) – Pseudotime of the current cell.

  • neigh_pseudotime (ndarray) – Array of shape (n_neighbors,) containing pseudotime of neighbors.

  • neigh_conn (ndarray) – Array of shape (n_neighbors,) containing connectivities of the current cell and its neighbors.

  • kwargs (Any) – Additional keyword arguments.

Return type:

ndarray

Returns:

: Array of shape (n_neighbors,) containing the biased connectivities.

bias_knn(conn, pseudotime, n_jobs=None, backend='loky', show_progress_bar=True, **kwargs)[source]#

Bias cell-cell connectivities of a KNN graph.

Parameters:
  • conn (csr_matrix) – Sparse matrix of shape (n_cells, n_cells) containing the nearest neighbor connectivities.

  • pseudotime (ndarray) – Pseudotemporal ordering of cells.

  • show_progress_bar (bool) – Whether to show a progress bar. Disabling it may slightly improve performance.

  • n_jobs (Optional[int]) – Number of parallel jobs. If -1, use all available cores. If None or 1, the execution is sequential.

  • backend (str) – Which backend to use for parallelization. See Parallel for valid options.

  • kwargs (Any) –

Return type:

csr_matrix

Returns:

: The biased connectivities.

class cellrank.kernels.utils.HardThresholdScheme[source]#

Thresholding scheme inspired by Palantir [Setty et al., 2019].

Note that this won’t exactly reproduce the original Palantir results:

  • Palantir computes the kNN graph in a scaled space of diffusion components.

  • Palantir uses its own pseudotime to bias the kNN graph which is not implemented here.

  • Palantir uses a slightly different mechanism to ensure the graph remains connected when removing edges that point into the “pseudotime past”.

__call__(cell_pseudotime, neigh_pseudotime, neigh_conn, frac_to_keep=0.3)[source]#

Convert the undirected graph of cell-cell similarities into a directed one by removing “past” edges.

This uses a pseudotemporal measure to remove graph-edges that point into the pseudotime-past. For each cell, it keeps the closest neighbors, even if they are in the pseudotime past, to make sure the graph remains connected.

Parameters:
  • cell_pseudotime (float) – Pseudotime of the current cell.

  • neigh_pseudotime (ndarray) – Array of shape (n_neighbors,) containing pseudotime of neighbors.

  • neigh_conn (ndarray) – Array of shape (n_neighbors,) containing connectivities of the current cell and its neighbors.

  • frac_to_keep (float) – The frac_to_keep * n_neighbors closest neighbors (according to graph connectivities) are kept, no matter whether they lie in the pseudotemporal past or future. Must be in \([0, 1]\).

Return type:

ndarray

Returns:

: Array of shape (n_neighbors,) containing the biased connectivities.
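
The behavior described above can be sketched in NumPy as follows. The function name hard_threshold and the use of floor to turn frac_to_keep * n_neighbors into an integer are assumptions made for illustration; the library's implementation may differ in such details.

```python
import numpy as np


def hard_threshold(cell_pseudotime, neigh_pseudotime, neigh_conn, frac_to_keep=0.3):
    """Sketch: drop edges into the pseudotime past, except for the closest neighbors."""
    neigh_pseudotime = np.asarray(neigh_pseudotime, dtype=float)
    neigh_conn = np.asarray(neigh_conn, dtype=float)
    # Assumption: the number of always-kept neighbors is rounded down.
    n_keep = int(np.floor(frac_to_keep * len(neigh_conn)))
    order = np.argsort(neigh_conn)[::-1]  # strongest connectivity first
    close, far = order[:n_keep], order[n_keep:]
    biased = np.zeros_like(neigh_conn)
    biased[close] = neigh_conn[close]     # kept regardless of pseudotime
    future = far[neigh_pseudotime[far] >= cell_pseudotime]
    biased[future] = neigh_conn[future]   # kept only if not in the past
    return biased


biased = hard_threshold(
    cell_pseudotime=0.5,
    neigh_pseudotime=np.array([0.1, 0.9, 0.4, 0.6]),
    neigh_conn=np.array([0.9, 0.2, 0.3, 0.5]),
    frac_to_keep=0.25,
)
```

In this example the first neighbor lies in the past but is kept because it has the strongest connectivity, while the third neighbor (past, weakly connected) is removed.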

class cellrank.kernels.utils.SoftThresholdScheme[source]#

Thresholding scheme inspired by [Stassen et al., 2021].

The idea is to downweight edges that point against the direction of increasing pseudotime. Essentially, the further “behind” a query cell is in pseudotime with respect to the current reference cell, the more its graph connectivity is penalized.

__call__(cell_pseudotime, neigh_pseudotime, neigh_conn, b=10.0, nu=0.5)[source]#

Bias the connectivities by downweighting ones to past cells.

This function uses a generalized logistic function to weight the past connectivities.

Parameters:
  • cell_pseudotime (float) – Pseudotime of the current cell.

  • neigh_pseudotime (ndarray) – Array of shape (n_neighbors,) containing pseudotime of neighbors.

  • neigh_conn (ndarray) – Array of shape (n_neighbors,) containing connectivities of the current cell and its neighbors.

  • b (float) – The growth rate of generalized logistic function.

  • nu (float) – Affects near which asymptote maximum growth occurs.

Return type:

ndarray

Returns:

: Array of shape (n_neighbors,) containing the biased connectivities.
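
As a rough illustration of soft thresholding, the sketch below downweights past neighbors with one member of the generalized logistic family. The exact functional form and its constants (the factor 2 and the 1 / nu exponent) are assumptions chosen for this example, as is the function name soft_threshold; only the roles of b and nu follow the parameter descriptions above.

```python
import numpy as np


def soft_threshold(cell_pseudotime, neigh_pseudotime, neigh_conn, b=10.0, nu=0.5):
    """Sketch: penalize connectivities to neighbors that lie in the pseudotime past."""
    neigh_pseudotime = np.asarray(neigh_pseudotime, dtype=float)
    neigh_conn = np.asarray(neigh_conn, dtype=float)
    weights = np.ones_like(neigh_conn)         # future neighbors are left unchanged
    past = neigh_pseudotime < cell_pseudotime
    dt = cell_pseudotime - neigh_pseudotime[past]  # how far behind each past neighbor is
    # Assumed generalized logistic form; decreases monotonically in dt.
    weights[past] = 2.0 / (1.0 + np.exp(b * dt)) ** (1.0 / nu)
    return neigh_conn * weights


soft = soft_threshold(
    cell_pseudotime=0.5,
    neigh_pseudotime=np.array([0.2, 0.45, 0.8]),
    neigh_conn=np.array([1.0, 1.0, 1.0]),
)
```

Unlike hard thresholding, no edge is removed outright: a neighbor slightly behind the current cell keeps part of its connectivity, and one far behind is suppressed almost entirely.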

class cellrank.kernels.utils.CustomThresholdScheme(callback)[source]#

Class that wraps a user supplied scheme.

Parameters:

callback (Callable[[float, ndarray, ndarray, ndarray, Any], ndarray]) – Function which returns the biased connectivities.

__call__(cell_pseudotime, neigh_pseudotime, neigh_conn, **kwargs)[source]#

Calculate biased connections for a given cell.

Parameters:
  • cell_pseudotime (float) – Pseudotime of the current cell.

  • neigh_pseudotime (ndarray) – Array of shape (n_neighbors,) containing pseudotime of neighbors.

  • neigh_conn (ndarray) – Array of shape (n_neighbors,) containing connectivities of the current cell and its neighbors.

  • kwargs (Any) – Additional keyword arguments.

Return type:

ndarray

Returns:

: Array of shape (n_neighbors,) containing the biased connectivities.
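
A user-supplied callback only needs to map the three per-cell inputs to an array of biased connectivities. The example below is a hypothetical callback, keep_future_only, exercised directly with NumPy; it accepts extra positional and keyword arguments so that it tolerates whatever else the wrapper may pass along.

```python
import numpy as np


def keep_future_only(cell_pseudotime, neigh_pseudotime, neigh_conn, *args, **kwargs):
    """Hypothetical callback: zero out connectivities to neighbors in the pseudotime past."""
    neigh_pseudotime = np.asarray(neigh_pseudotime, dtype=float)
    neigh_conn = np.asarray(neigh_conn, dtype=float)
    return np.where(neigh_pseudotime >= cell_pseudotime, neigh_conn, 0.0)


# Calling the callback directly, outside of CellRank:
out = keep_future_only(0.5, np.array([0.2, 0.7]), np.array([0.4, 0.6]))

# With CellRank installed, the callback would be passed to the constructor,
# i.e. cellrank.kernels.utils.CustomThresholdScheme(keep_future_only).
```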

Estimators#

class cellrank.estimators.BaseEstimator(object, **kwargs)[source]#

Base class for all estimators.

Parameters:
  • object (Union[str, bool, ndarray, spmatrix, AnnData, KernelExpression]) –

    Can be one of the following types:

    • AnnData - annotated data object.

    • spmatrix, ndarray - row-normalized transition matrix.

    • KernelExpression - kernel expression.

    • str - key in obsp where the transition matrix is stored; adata must be provided in this case.

    • bool - directionality of the transition matrix that will be used to infer its storage location. If None, the directionality will be determined automatically and adata must be provided in this case.

  • kwargs (Any) – Keyword arguments for the PrecomputedKernel.

copy(*, deep=False)[source]#

Return a copy of self.

Parameters:

deep (bool) – Whether to return a deep copy or not. If True, this also copies the adata.

Return type:

BaseEstimator

Returns:

: A copy of self.

abstract fit(*args, **kwargs)[source]#

Fit the estimator.

Parameters:
  • args (Any) – Positional arguments.

  • kwargs (Any) – Keyword arguments.

Return type:

BaseEstimator

Returns:

: Self.

classmethod from_adata(adata, obsp_key)[source]#

De-serialize self from AnnData.

Parameters:
  • adata (AnnData) – Annotated data object.

  • obsp_key (str) – Key in obsp where the transition matrix is stored.

Return type:

BaseEstimator

Returns:

: The de-serialized object.

property params: Dict[str, Any]#

Estimator parameters.

abstract predict(*args, **kwargs)[source]#

Run the prediction.

Parameters:
  • args (Any) – Positional arguments.

  • kwargs (Any) – Keyword arguments.

Return type:

BaseEstimator

Returns:

: Self.

to_adata(keep=('X', 'raw'), *, copy=True)[source]#

Serialize self to AnnData.

Parameters:
  • keep (Union[Literal['all'], Sequence[Literal['X', 'raw', 'layers', 'obs', 'var', 'obsm', 'varm', 'obsp', 'varp', 'uns']]]) –

    Which attributes to keep from the underlying adata. Valid options are:

    • 'all' - keep all attributes specified in the signature.

    • Sequence - keep only subset of these attributes.

    • dict - the keys correspond to the attribute names and the values to a subset of keys to keep from this attribute. If the values are specified either as True or 'all', everything from this attribute will be kept.

  • copy (Union[bool, Sequence[Literal['X', 'raw', 'layers', 'obs', 'var', 'obsm', 'varm', 'obsp', 'varp', 'uns']]]) – Whether to copy the data. Can be specified on per-attribute basis. Useful for attributes that are array-like.

Return type:

AnnData

Returns:

: Annotated data object.

class cellrank.estimators.TermStatesEstimator(object, **kwargs)[source]#

Base class for all estimators predicting the initial and terminal states.

Parameters:
  • object (Union[AnnData, ndarray, spmatrix, KernelExpression]) –

    Can be one of the following types:

    • AnnData - annotated data object.

    • spmatrix, ndarray - row-normalized transition matrix.

    • KernelExpression - kernel expression.

    • str - key in obsp where the transition matrix is stored; adata must be provided in this case.

    • bool - directionality of the transition matrix that will be used to infer its storage location. If None, the directionality will be determined automatically and adata must be provided in this case.

  • kwargs (Any) – Keyword arguments for the PrecomputedKernel.

property initial_states: Series | None#

Categorical annotation of initial states.

By default, all transient cells will be labeled as NaN.

property initial_states_probabilities: Series | None#

Probability to be an initial state.

plot_macrostates(which, states=None, color=None, discrete=True, mode=PlotMode.EMBEDDING, time_key='latent_time', same_plot=True, title=None, cmap='viridis', **kwargs)[source]#

Plot macrostates on an embedding or along pseudotime.

Parameters:
  • which (Literal['all', 'initial', 'terminal']) –

    Which macrostates to plot. Valid options are:

  • states (Union[str, Sequence[str], None]) – Subset of the macrostates to show. If None, plot all macrostates.

  • color (Optional[str]) – Key in obs or var used to color the observations.

  • discrete (bool) – Whether to plot the data as continuous or discrete observations. If the data cannot be plotted as continuous observations, it will be plotted as discrete.

  • mode (Literal['embedding', 'time']) – Whether to plot the probabilities in an embedding or along the pseudotime.

  • time_key (str) – Key in obs where pseudotime is stored. Only used when mode = 'time'.

  • title (Union[str, Sequence[str], None]) – Title of the plot.

  • same_plot (bool) – Whether to plot the data on the same plot or not. Only used when mode = 'embedding'. If True and discrete = False, color is ignored.

  • cmap (str) – Colormap for continuous annotations.

  • kwargs (Any) – Keyword arguments for scatter().

Return type:

None

Returns:

: Nothing, just plots the figure. Optionally saves it based on save.

rename_initial_states(old_new)[source]#

Rename the initial_states.

Parameters:

old_new (Dict[str, str]) – Dictionary that maps old names to unique new names.

Return type:

TermStatesEstimator

Returns:

: Returns self and updates the following fields:

rename_terminal_states(old_new)[source]#

Rename the terminal_states.

Parameters:

old_new (Dict[str, str]) – Dictionary that maps old names to unique new names.

Return type:

TermStatesEstimator

Returns:

: Returns self and updates the following fields:

set_initial_states(states, cluster_key=None, allow_overlap=False, **kwargs)[source]#

Set the initial_states.

Parameters:
  • states (Union[Series, Dict[str, Sequence[Any]]]) –

    Which states to select. Valid options are:

    • categorical Series where each category corresponds to an individual state. NaN entries denote cells that do not belong to any state, i.e., transient cells.

    • dict where keys are states and values are lists of cell barcodes corresponding to annotations in obs_names. If only 1 key is provided, values should correspond to clusters if a categorical Series can be found in obs.

  • cluster_key (Optional[str]) – Key in obs to associate names and colors with initial_states. Each state will be given the name and color corresponding to the cluster it mostly overlaps with.

  • allow_overlap (bool) – Whether to allow overlapping names between initial and terminal states.

  • kwargs (Any) – Additional keyword arguments.

Return type:

TermStatesEstimator

Returns:

: Returns self and updates the following fields:

set_terminal_states(states, cluster_key=None, allow_overlap=False, **kwargs)[source]#

Set the terminal_states.

Parameters:
  • states (Union[Series, Dict[str, Sequence[Any]]]) –

    States to select. Valid options are:

    • categorical Series where each category corresponds to an individual state. NaN entries denote cells that do not belong to any state, i.e., transient cells.

    • dict where keys are states and values are lists of cell barcodes corresponding to annotations in obs_names. If only 1 key is provided, values should correspond to clusters if a categorical Series can be found in obs.

  • cluster_key (Optional[str]) – Key in obs to associate names and colors with terminal_states. Each state will be given the name and color corresponding to the cluster it mostly overlaps with.

  • allow_overlap (bool) – Whether to allow overlapping names between initial and terminal states.

  • kwargs (Any) – Additional keyword arguments.

Return type:

TermStatesEstimator

Returns:

: Returns self and updates the following fields:
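
The two accepted forms of states, for both set_initial_states() and set_terminal_states(), can be built with pandas as sketched below. The cell barcodes, the state names and the helper dict_to_series are hypothetical; the helper only shows how the dict form relates to the categorical form and is not part of CellRank.

```python
import numpy as np
import pandas as pd

obs_names = ["cell_0", "cell_1", "cell_2", "cell_3", "cell_4"]  # hypothetical barcodes

# Option 1: categorical Series; NaN marks cells that belong to no state (transient cells).
states_series = pd.Series(
    ["Alpha", np.nan, "Beta", np.nan, "Alpha"],
    index=obs_names,
    dtype="category",
)

# Option 2: dict mapping each state name to the barcodes of its cells.
states_dict = {
    "Alpha": ["cell_0", "cell_4"],
    "Beta": ["cell_2"],
}


def dict_to_series(states, obs_names):
    """Hypothetical helper: expand the dict form into the categorical Series form."""
    s = pd.Series(np.nan, index=obs_names, dtype=object)
    for name, cells in states.items():
        s.loc[cells] = name
    return s.astype("category")


converted = dict_to_series(states_dict, obs_names)
```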

property terminal_states: Series | None#

Categorical annotation of terminal states.

By default, all transient cells will be labeled as NaN.

property terminal_states_probabilities: Series | None#

Probability to be a terminal state.

Models#

class cellrank.models.BaseModel(adata, model)[source]#

Base class for all model classes.

Parameters:
  • adata (Optional[AnnData]) – Annotated data object.

  • model (Any) – The underlying model that is used for fitting and prediction.

property adata: AnnData#

Annotated data object.

property conf_int: ndarray#

Array of shape (n_samples, 2) containing the lower and upper bound of the confidence interval.

abstract confidence_interval(x_test=None, **kwargs)[source]#

Calculate the confidence interval.

Use default_confidence_interval() function if underlying model has no method for confidence interval calculation.

Parameters:
Return type:

ndarray

Returns:

: Returns self and updates the following fields:

  • conf_int - Array of shape (n_samples, 2) containing the lower and upper bound of the confidence interval.

abstract copy()[source]#

Return a copy of self.

Return type:

BaseModel

default_confidence_interval(x_test=None, **kwargs)[source]#

Calculate the confidence interval, if the underlying model has no method for it.

This formula is taken from [DeSalvo, 1970], eq. 5.

Parameters:
Return type:

ndarray

Returns:

: Returns self and updates the following fields:

Also updates the following fields:

  • x_hat - Filtered independent variables used when calculating default confidence interval, usually same as x.

  • y_hat - Filtered dependent variables used when calculating default confidence interval, usually same as y.

abstract fit(x=None, y=None, w=None, **kwargs)[source]#

Fit the model.

Parameters:
  • x (Optional[ndarray]) – Independent variables, array of shape (n_samples, 1). If None, use x.

  • y (Optional[ndarray]) – Dependent variables, array of shape (n_samples, 1). If None, use y.

  • w (Optional[ndarray]) – Optional weights of x, array of shape (n_samples,). If None, use w.

  • kwargs (Any) – Keyword arguments for underlying model’s fitting function.

Return type:

BaseModel

Returns:

: Fits the model and returns self.

property model: Any#

Underlying model.

plot(figsize=(8, 5), same_plot=False, hide_cells=False, perc=None, fate_prob_cmap=<matplotlib.colors.ListedColormap object>, cell_color=None, lineage_color='black', alpha=0.8, lineage_alpha=0.2, title=None, size=15, lw=2, cbar=True, margins=0.015, xlabel='pseudotime', ylabel='expression', conf_int=True, lineage_probability=False, lineage_probability_conf_int=False, lineage_probability_color=None, obs_legend_loc='best', dpi=None, fig=None, ax=None, return_fig=False, save=None, **kwargs)[source]#

Plot the smoothed gene expression.

Parameters:
  • figsize (Tuple[float, float]) – Size of the figure.

  • same_plot (bool) – Whether to plot all trends in the same plot.

  • hide_cells (bool) – Whether to hide the cells.

  • perc (Optional[Tuple[float, float]]) – Percentile by which to clip the fate probabilities.

  • fate_prob_cmap (ListedColormap) – Colormap to use when coloring in the fate probabilities.

  • cell_color (Optional[str]) – Key in obs or var_names used for coloring the cells.

  • lineage_color (str) – Color for the lineage.

  • alpha (float) – Alpha value in \([0, 1]\) for the transparency of cells.

  • lineage_alpha (float) – Alpha value in \([0, 1]\) for the transparency of the lineage confidence intervals.

  • title (Optional[str]) – Title of the plot.

  • size (int) – Size of the points.

  • lw (float) – Line width for the smoothed values.

  • cbar (bool) – Whether to show the colorbar.

  • margins (float) – Margins around the plot.

  • xlabel (str) – Label on the x-axis.

  • ylabel (str) – Label on the y-axis.

  • conf_int (bool) – Whether to show the confidence interval.

  • lineage_probability (bool) – Whether to show smoothed lineage probability as a dashed line. Note that this will require 1 additional model fit.

  • lineage_probability_conf_int (Union[bool, float]) – Whether to compute and show smoothed lineage probability confidence interval.

  • lineage_probability_color (Optional[str]) – Color to use when plotting the smoothed lineage_probability. If None, it’s the same as lineage_color. Only used when lineage_probability = True.

  • obs_legend_loc (Optional[str]) – Location of the legend when cell_color corresponds to a categorical variable.

  • dpi (Optional[int]) – Dots per inch.

  • fig (Optional[Figure]) – Figure to use. If None, create a new one.

  • ax (Optional[Axes]) – Ax to use. If None, create a new one.

  • return_fig (bool) – If True, return the figure object.

  • save (Optional[str]) – Filename where to save the plot. If None, just shows the plots.

  • kwargs (Any) – Keyword arguments for legend().

Return type:

Optional[Figure]

Returns:

: Nothing, just plots the figure. Optionally saves it based on save.

abstract predict(x_test=None, key_added='_x_test', **kwargs)[source]#

Run the prediction.

Parameters:
  • x_test (Optional[ndarray]) – Array of shape (n_samples,) used for prediction. If None, use x_test.

  • key_added (Optional[str]) – Attribute name where to save the x_test for later use. If None, don’t save it.

  • kwargs (Any) – Keyword arguments for underlying model’s prediction method.

Return type:

ndarray

Returns:

: Returns and updates the following fields:

  • y_test - Prediction values of shape (n_samples,) for x_test.

prepare(gene, lineage, time_key, backward=False, time_range=None, data_key='X', use_raw=False, threshold=None, weight_threshold=(0.01, 0.01), filter_cells=None, n_test_points=200)[source]#

Prepare the model to be ready for fitting.

Parameters:
  • gene (str) – Gene in var_names.

  • lineage (Optional[str]) – Name of the lineage. If None, all weights will be set to \(1\).

  • time_key (str) – Key in obs where the pseudotime is stored.

  • backward (bool) – Direction of the process.

  • time_range (Union[float, Tuple[float, float], None]) –

    Specify start and end times:

    • tuple - it specifies the minimum and maximum pseudotime. Both values can be None, in which case the minimum is the earliest pseudotime and the maximum is automatically determined.

    • float - it specifies the maximum pseudotime.

  • data_key (Optional[str]) – Key in layers or 'X' for X. If use_raw = True, it’s always set to 'X'.

  • use_raw (bool) – Whether to access raw.

  • threshold (Optional[float]) – Consider only cells with weights > threshold when estimating the test endpoint. If None, use the median of the weights.

  • weight_threshold (Union[float, Tuple[float, float]]) – Set all weights below weight_threshold to weight_threshold if a float, or to the second value, if a tuple.

  • filter_cells (Optional[float]) – Filter out all cells with expression values lower than this threshold.

  • n_test_points (int) – Number of test points. If None, use the original points based on threshold.

Return type:

BaseModel

Returns:

: Nothing, just updates the following fields:

  • x - Filtered independent variables of shape (n_filtered_cells, 1) used for fitting.

  • y - Filtered dependent variables of shape (n_filtered_cells, 1) used for fitting.

  • w - Filtered weights of shape (n_filtered_cells,) used for fitting.

  • x_all - Unfiltered independent variables of shape (n_cells, 1).

  • y_all - Unfiltered dependent variables of shape (n_cells, 1).

  • w_all - Unfiltered weights of shape (n_cells,).

  • x_test - Independent variables of shape (n_samples, 1) used for prediction.

  • prepared - Whether the model is prepared for fitting.
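
The weight_threshold rule described above amounts to a simple clipping step. The sketch below is one reading of that description, with a hypothetical helper name apply_weight_threshold: a float is used both as the cutoff and as the fill value, while a tuple is read as (cutoff, fill value).

```python
import numpy as np


def apply_weight_threshold(w, weight_threshold=(0.01, 0.01)):
    """Sketch: set weights below the cutoff to the fill value (hypothetical helper)."""
    w = np.asarray(w, dtype=float).copy()  # do not modify the caller's array
    if isinstance(weight_threshold, (int, float)):
        cutoff, fill = weight_threshold, weight_threshold
    else:
        cutoff, fill = weight_threshold
    w[w < cutoff] = fill
    return w


weights = np.array([0.001, 0.5, 0.02])
clipped_float = apply_weight_threshold(weights, 0.01)
clipped_tuple = apply_weight_threshold(weights, (0.05, 0.0))
```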

property prepared#

Whether the model is prepared for fitting.

property shape: Tuple[int]#

Number of cells in adata.

property w: ndarray#

Filtered weights of shape (n_filtered_cells,) used for fitting.

property w_all: ndarray#

Unfiltered weights of shape (n_cells,).

property x: ndarray#

Filtered independent variables of shape (n_filtered_cells, 1) used for fitting.

property x_all: ndarray#

Unfiltered independent variables of shape (n_cells, 1).

property x_hat: ndarray#

Filtered independent variables used when calculating default confidence interval, usually same as x.

property x_test: ndarray#

Independent variables of shape (n_samples, 1) used for prediction.

property y: ndarray#

Filtered dependent variables of shape (n_filtered_cells, 1) used for fitting.

property y_all: ndarray#

Unfiltered dependent variables of shape (n_cells, 1).

property y_hat: ndarray#

Filtered dependent variables used when calculating default confidence interval, usually same as y.

property y_test: ndarray#

Prediction values of shape (n_samples,) for x_test.

Lineage#

class cellrank.Lineage(input_array, *, names, colors=None)[source]#

Lightweight ndarray wrapper that adds names and colors.

Parameters:
  • input_array (ndarray) – Input array containing lineage probabilities stored in columns.

  • names (Iterable[str]) – Lineage names.

  • colors (Iterable[ColorLike] | None) – Lineage colors.

Return type:

Lineage

property T#

Transpose of self.

property X: ndarray#

Convert self to an array.

property colors: ndarray#

Lineage colors.

classmethod from_adata(adata, backward=False, estimator_backward=None, kind=LinKind.FATE_PROBS, copy=False)[source]#

Reconstruct the Lineage object from AnnData object.

Parameters:
  • adata (AnnData) – Annotated data object.

  • backward (bool) – Direction of the process.

  • estimator_backward (Optional[bool]) – Key which helps to determine whether these states are initial or terminal.

  • kind (Literal['macrostates', 'term_states', 'fate_probs']) –

    Which kind of object to reconstruct. Valid options are:

  • copy (bool) – Whether to return a copy of the underlying array.

Return type:

Lineage

Returns:

: The reconstructed lineage object.

property names: ndarray#

Lineage names.

plot_pie(reduction, title=None, legend_loc='on data', legend_kwargs=mappingproxy({}), figsize=None, dpi=None, save=None, **kwargs)[source]#

Plot a pie chart visualizing aggregated lineage probabilities.

Parameters:
Return type:

None

Returns:

: Nothing, just plots the figure. Optionally saves it based on save.

priming_degree(method='kl_divergence', early_cells=None)[source]#

Compute the degree of lineage priming.

It returns a score in \([0, 1]\) where \(0\) stands for naive and \(1\) stands for committed.

Parameters:
  • method (Literal['kl_divergence', 'entropy']) –

    The method used to compute the degree of lineage priming. Valid options are:

    • 'kl_divergence' - as in [Velten et al., 2017], computes KL-divergence between the fate probabilities of a cell and the average fate probabilities. Computation of average fate probabilities can be restricted to a set of user-defined early_cells.

    • 'entropy' - as in [Setty et al., 2019], computes entropy over a cell’s fate probabilities.

  • early_cells (Optional[ndarray]) – Cell IDs or a mask marking early cells. If None, use all cells. Only used when method = 'kl_divergence'.

Return type:

ndarray

Returns:

: The priming degree.
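
The two methods can be sketched in NumPy as shown below. This is an illustration under stated assumptions rather than the library's implementation: the final min-max rescaling to \([0, 1]\), the small eps added inside the logarithms, and the function names are all choices made for this example.

```python
import numpy as np


def _minmax(x):
    """Rescale to [0, 1]; a constant input maps to all zeros."""
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span


def priming_degree(probs, method="kl_divergence", early_cells=None, eps=1e-12):
    """Sketch of a priming score: 0 for naive cells, 1 for committed cells."""
    probs = np.asarray(probs, dtype=float)  # (n_cells, n_lineages), rows sum to 1
    if method == "kl_divergence":
        ref_rows = probs if early_cells is None else probs[early_cells]
        ref = ref_rows.mean(axis=0)         # average fate probabilities
        score = np.sum(probs * np.log((probs + eps) / (ref + eps)), axis=1)
    elif method == "entropy":
        entropy = -np.sum(probs * np.log(probs + eps), axis=1)
        score = -entropy                    # low entropy means committed
    else:
        raise ValueError(f"Unknown method: {method!r}")
    return _minmax(score)                   # assumed normalization


fate_probs = np.array([[0.5, 0.5], [0.9, 0.1], [1.0, 0.0]])
by_entropy = priming_degree(fate_probs, method="entropy")
by_kl = priming_degree(fate_probs, method="kl_divergence", early_cells=[0])
```

In this toy example the first cell is fully undecided and the last one fully committed, so both methods rank the cells in the same order.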

reduce(*keys, mode=Reduction.DIST, dist_measure=DistanceMeasure.MUTUAL_INFO, normalize_weights=NormWeights.SOFTMAX, softmax_scale=1.0, return_weights=False)[source]#

Subset states and normalize them so that they again sum to \(1\).

Parameters:
  • keys (str) – List of keys that define the states, to which this object will be reduced by projecting the values of the other states.

  • mode (Literal['dist', 'scale']) –

    Reduction mode to use. Valid options are:

    • 'dist' - use a distance measure dist_measure to compute weights.

    • 'scale' - just rescale the values.

  • dist_measure (Literal['cosine_sim', 'wasserstein_dist', 'kl_div', 'js_div', 'mutual_info', 'equal']) –

    Used to quantify similarity between query and reference states. Valid options are:

    • 'cosine_sim' - cosine similarity.

    • 'wasserstein_dist' - Wasserstein distance.

    • 'kl_div' - Kullback–Leibler divergence.

    • 'js_div' - Jensen–Shannon divergence.

    • 'mutual_info' - mutual information.

    • 'equal' - equally redistribute the mass among the rest.

    Only used when mode = 'dist'.

  • normalize_weights (Literal['scale', 'softmax']) –

    How to row-normalize the weights. Valid options are:

    • 'scale' - divide by the sum.

    • 'softmax' - use a softmax.

    Only used when mode = 'dist'.

  • softmax_scale (float) – Scaling factor in the softmax, used for normalizing the weights to sum to \(1\).

  • return_weights (bool) – If True, a DataFrame of the weights used for the projection is also returned. If mode = 'scale', the weights will be None.

Return type:

Union[Lineage, Tuple[Lineage, Optional[DataFrame]]]

Returns:

: The lineage object, reduced to the initial or terminal states. The weights used for the projection of shape (n_query, n_reference), if return_weights = True.
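
For mode = 'scale', the reduction boils down to subsetting columns and renormalizing each row. The NumPy sketch below illustrates only that case on a plain array; the helper name reduce_scale is hypothetical, and rows whose selected columns sum to zero are not handled.

```python
import numpy as np


def reduce_scale(probs, names, keys):
    """Sketch of mode='scale': keep the selected states and rescale rows to sum to 1."""
    probs = np.asarray(probs, dtype=float)         # (n_cells, n_states)
    cols = [list(names).index(k) for k in keys]    # column positions of the kept states
    subset = probs[:, cols]
    return subset / subset.sum(axis=1, keepdims=True)


probs = np.array([[0.2, 0.3, 0.5], [0.6, 0.2, 0.2]])
reduced = reduce_scale(probs, names=["A", "B", "C"], keys=["A", "B"])
```

With mode = 'dist', the mass of the dropped states is instead redistributed using weights derived from dist_measure, which this sketch does not cover.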

view(dtype=None, type=None, *_, **__)[source]#

Return a view of self.

Return type:

LineageView