cellrank.models.GAMR

class cellrank.models.GAMR(adata, n_knots=5, distribution='gaussian', basis='cr', knotlocs='auto', offset='default', smoothing_penalty=1.0, **kwargs)

Wrapper around R’s mgcv package for fitting GAMs.

Parameters:
  • adata (AnnData) – Annotated data object.

  • n_knots (int) – Number of knots.

  • distribution (str) – Distribution family in rpy2.robjects.r, such as 'gaussian' or 'nb' for negative binomial. If 'nb', the raw count data in adata.raw is always used.

  • basis (str) – Basis for the smoothing term. See the smooth.terms documentation of mgcv for valid options.

  • knotlocs (Literal['auto', 'density']) –

    Position of the knots. Can be one of the following:

    • 'auto' - let mgcv handle the knot positions.

    • 'density' - position the knots based on the density of the pseudotime.

  • offset (Union[ndarray, Literal['default'], None]) – Offset term for the GAM. Only available when distribution='nb'. If 'default', it is calculated according to [Robinson and Oshlack, 2010]. The values are saved in adata.obs['cellrank_offset']. If None, no offset is used.

  • smoothing_penalty (float) – Penalty for the smoothing term. The larger the value, the smoother the fitted curve.

  • kwargs (Any) – Keyword arguments for mgcv's gam.control.
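As an illustration of what density-based knot placement can mean, the sketch below puts the knots at evenly spaced quantiles of the pseudotime, so that more knots fall where cells are concentrated. This is an assumption made for illustration, not a description of what CellRank or mgcv does internally; the helpers quantile and density_knots are hypothetical.

```python
def quantile(sorted_vals, q):
    """Linearly interpolated quantile of an already sorted list, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def density_knots(pseudotime, n_knots):
    """Hypothetical helper: knots at evenly spaced quantiles (needs n_knots >= 2)."""
    vals = sorted(pseudotime)
    return [quantile(vals, i / (n_knots - 1)) for i in range(n_knots)]


# Most cells sit early in pseudotime, so most knots end up there too.
pseudotime = [0.0, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.6, 1.0]
knots = density_knots(pseudotime, n_knots=5)
print(knots)
```

With evenly spaced knots, only two of the five would fall below 0.3 here; the quantile-based placement puts four there.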

Attributes table

adata

Annotated data object.

conf_int

Array of shape (n_samples, 2) containing the lower and upper bound of the confidence interval.

model

Underlying model.

prepared

Whether the model is prepared for fitting.

shape

Number of cells in adata.

signal

The Signal this model was prepared with, or None.

w

Filtered weights of shape (n_filtered_cells,) used for fitting.

w_all

Unfiltered weights of shape (n_cells,).

x

Filtered independent variables of shape (n_filtered_cells, 1) used for fitting.

x_all

Unfiltered independent variables of shape (n_cells, 1).

x_hat

Filtered independent variables used when calculating default confidence interval, usually same as x.

x_test

Independent variables of shape (n_samples, 1) used for prediction.

y

Filtered dependent variables of shape (n_filtered_cells, 1) used for fitting.

y_all

Unfiltered dependent variables of shape (n_cells, 1).

y_hat

Filtered dependent variables used when calculating default confidence interval, usually same as y.

y_test

Prediction values of shape (n_samples,) for x_test.

Methods table

confidence_interval([x_test, level])

Calculate the confidence interval.

copy()

Return a copy of self.

default_confidence_interval([x_test])

Calculate the confidence interval, if the underlying model has no method for it.

fit([x, y, w])

Fit the model.

plot([figsize, same_plot, hide_cells, perc, ...])

Plot the smoothed gene expression.

predict([x_test, key_added, level])

Run the prediction.

prepare(*args, **kwargs)

Prepare the model to be ready for fitting.

read(fname[, adata, copy])

De-serialize self from a file.

write(fname[, write_adata])

Serialize self to a file using pickle.

Attributes

adata

GAMR.adata

Annotated data object.

conf_int

GAMR.conf_int

Array of shape (n_samples, 2) containing the lower and upper bound of the confidence interval.

model

GAMR.model

Underlying model.

prepared

GAMR.prepared

Whether the model is prepared for fitting.

shape

GAMR.shape

Number of cells in adata.

signal

GAMR.signal

The Signal this model was prepared with, or None.

w

GAMR.w

Filtered weights of shape (n_filtered_cells,) used for fitting.

w_all

GAMR.w_all

Unfiltered weights of shape (n_cells,).

x

GAMR.x

Filtered independent variables of shape (n_filtered_cells, 1) used for fitting.

x_all

GAMR.x_all

Unfiltered independent variables of shape (n_cells, 1).

x_hat

GAMR.x_hat

Filtered independent variables used when calculating default confidence interval, usually same as x.

x_test

GAMR.x_test

Independent variables of shape (n_samples, 1) used for prediction.

y

GAMR.y

Filtered dependent variables of shape (n_filtered_cells, 1) used for fitting.

y_all

GAMR.y_all

Unfiltered dependent variables of shape (n_cells, 1).

y_hat

GAMR.y_hat

Filtered dependent variables used when calculating default confidence interval, usually same as y.

y_test

GAMR.y_test

Prediction values of shape (n_samples,) for x_test.

Methods

confidence_interval

GAMR.confidence_interval(x_test=None, level=0.95, **kwargs)

Calculate the confidence interval.

Internally, this method calls predict() to extract the confidence interval, if needed.

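Parameters:
  • x_test (ndarray | None) – Array of shape (n_samples,) used for confidence interval calculation. If None, use x_test.

  • level (float) – Confidence level for confidence interval calculation. Must be in \([0, 1]\).

  • kwargs (Any)

Return type:

ndarray

Returns:

: Returns the confidence interval and updates the following fields:

  • x_test - Independent variables of shape (n_samples, 1) used for prediction.

  • conf_int - Array of shape (n_samples, 2) containing the lower and upper bound of the confidence interval.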
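For intuition about level: under a normal approximation, a two-sided interval at a given confidence level spans the estimate plus or minus a critical value times the standard error. The sketch below computes that critical value with the standard library. Whether GAMR or mgcv uses exactly this approximation is not stated here, so treat it as an assumption; critical_value is a hypothetical helper.

```python
from statistics import NormalDist


def critical_value(level):
    """Two-sided standard normal critical value for a confidence level in [0, 1)."""
    if not 0 <= level < 1:
        raise ValueError("level must satisfy 0 <= level < 1")
    return NormalDist().inv_cdf(0.5 + level / 2)


z95 = critical_value(0.95)  # default level of confidence_interval()
z99 = critical_value(0.99)  # a higher level gives a wider interval
print(round(z95, 3), round(z99, 3))
```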

copy

GAMR.copy()

Return a copy of self.

Return type:

GAMR

default_confidence_interval

GAMR.default_confidence_interval(x_test=None, **kwargs)

Calculate the confidence interval, if the underlying model has no method for it.

This formula is taken from [DeSalvo, 1970], eq. 5.

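Parameters:
  • x_test (ndarray | None) – Array of shape (n_samples,) used for confidence interval calculation. If None, use x_test.

  • kwargs (Any)

Return type:

ndarray

Returns:

: Returns the confidence interval and updates the following fields:

  • x_test - Independent variables of shape (n_samples, 1) used for prediction.

  • conf_int - Array of shape (n_samples, 2) containing the lower and upper bound of the confidence interval.

Also updates the following fields:

  • x_hat - Filtered independent variables used when calculating default confidence interval, usually same as x.

  • y_hat - Filtered dependent variables used when calculating default confidence interval, usually same as y.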
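The formula itself is not reproduced on this page. As a rough sketch of the kind of quantity involved, the code below computes the textbook prediction band for simple linear regression from the training points x, their observed values y and fitted values y_hat. It is not claimed to match eq. 5 of [DeSalvo, 1970] or CellRank's implementation, and no critical-value multiplier is applied; prediction_band is a hypothetical helper.

```python
import math


def prediction_band(x, y, y_hat, x_test, y_test):
    """Return (lower, upper) pairs around y_test, one pair per test point."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    # Residual standard error with n - 2 degrees of freedom.
    sigma = math.sqrt(sum((yh - yi) ** 2 for yh, yi in zip(y_hat, y)) / (n - 2))
    band = []
    for xt, yt in zip(x_test, y_test):
        half = sigma * math.sqrt(1 + 1 / n + (xt - x_bar) ** 2 / sxx)
        band.append((yt - half, yt + half))
    return band


x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 0.9, 2.1, 2.9, 4.0]
y_hat = [0.0, 1.0, 2.0, 3.0, 4.0]  # fitted values at the training points
band = prediction_band(x, y, y_hat, x_test=[0.0, 2.0, 4.0], y_test=[0.0, 2.0, 4.0])
```

The result has one (lower, upper) pair per test point, mirroring the (n_samples, 2) shape of conf_int, and the band is narrowest near the mean of x.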

fit

GAMR.fit(x=None, y=None, w=None, **kwargs)

Fit the model.

Parameters:
  • x (ndarray | None) – Independent variables, array of shape (n_samples, 1). If None, use x.

  • y (ndarray | None) – Dependent variables, array of shape (n_samples, 1). If None, use y.

  • w (ndarray | None) – Optional weights of x, array of shape (n_samples,). If None, use w.

  • kwargs (Any) – Keyword arguments for underlying model’s fitting function.

Return type:

GAMR

Returns:

: Fits the model and returns self. Updates the following fields by filtering out the entries whose weight in w is \(0\):

  • x - Filtered independent variables of shape (n_filtered_cells, 1) used for fitting.

  • y - Filtered dependent variables of shape (n_filtered_cells, 1) used for fitting.

  • w - Filtered weights of shape (n_filtered_cells,) used for fitting.
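The relationship between the unfiltered and filtered arrays can be sketched with plain lists. The helper drop_zero_weights is hypothetical and only mirrors the description above; it is not CellRank's code.

```python
def drop_zero_weights(x_all, y_all, w_all):
    """Keep only the entries whose weight is non-zero."""
    keep = [i for i, wi in enumerate(w_all) if wi != 0]
    x = [x_all[i] for i in keep]
    y = [y_all[i] for i in keep]
    w = [w_all[i] for i in keep]
    return x, y, w


# Column-shaped inputs, mimicking arrays of shape (n_cells, 1).
x_all = [[0.0], [0.2], [0.5], [0.9]]
y_all = [[1.0], [1.5], [0.0], [3.0]]
w_all = [0.8, 0.0, 0.3, 0.0]

x, y, w = drop_zero_weights(x_all, y_all, w_all)
print(len(x_all), len(x))
```

The unfiltered inputs are left untouched, which matches the split between x_all, y_all, w_all and x, y, w in the attribute list.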

plot

GAMR.plot(figsize=(8, 5), same_plot=False, hide_cells=False, perc=None, fate_prob_cmap=<matplotlib.colors.ListedColormap object>, cell_color=None, lineage_color='black', alpha=0.8, lineage_alpha=0.2, title=None, size=15, lw=2, cbar=True, margins=0.015, xlabel='pseudotime', ylabel='expression', conf_int=True, lineage_probability=False, lineage_probability_conf_int=False, lineage_probability_color=None, obs_legend_loc='best', dpi=None, fig=None, ax=None, return_fig=False, save=None, **kwargs)

Plot the smoothed gene expression.

Parameters:
  • figsize (tuple[float, float]) – Size of the figure.

  • same_plot (bool) – Whether to plot all trends in the same plot.

  • hide_cells (bool) – Whether to hide the cells.

  • perc (tuple[float, float]) – Percentile by which to clip the fate probabilities.

  • fate_prob_cmap (ListedColormap) – Colormap to use when coloring in the fate probabilities.

  • cell_color (str | None) – Key in obs or var_names used for coloring the cells.

  • lineage_color (str) – Color for the lineage.

  • alpha (float) – Alpha value in \([0, 1]\) for the transparency of cells.

  • lineage_alpha (float) – Alpha value in \([0, 1]\) for the transparency of the lineage confidence intervals.

  • title (str | None) – Title of the plot.

  • size (int) – Size of the points.

  • lw (float) – Line width for the smoothed values.

  • cbar (bool) – Whether to show the colorbar.

  • margins (float) – Margins around the plot.

  • xlabel (str) – Label on the x-axis.

  • ylabel (str) – Label on the y-axis.

  • conf_int (bool) – Whether to show the confidence interval.

  • lineage_probability (bool) – Whether to show smoothed lineage probability as a dashed line. Note that this will require 1 additional model fit.

  • lineage_probability_conf_int (bool | float) – Whether to compute and show smoothed lineage probability confidence interval.

  • lineage_probability_color (str | None) – Color to use when plotting the smoothed lineage_probability. If None, it’s the same as lineage_color. Only used when lineage_probability=True.

  • obs_legend_loc (str | None) – Location of the legend when cell_color corresponds to a categorical variable.

  • dpi (int) – Dots per inch.

  • fig (Figure) – Figure to use. If None, create a new one.

  • ax (Axes) – Axes to use. If None, create a new one.

  • return_fig (bool) – If True, return the figure object.

  • save (str | None) – Filename where to save the plot. If None, just shows the plots.

  • kwargs (Any) – Keyword arguments for legend().

Return type:

Figure | None

Returns:

: The figure if return_fig=True, otherwise nothing. The plot is shown and, if save is given, saved to that file.
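To make perc concrete, the sketch below clips a list of fate probabilities to a lower and an upper percentile, as one might do before mapping them to colors so that a few extreme values do not dominate the colormap. The interpolation rule and the helper names are assumptions, not CellRank's implementation.

```python
def percentile(sorted_vals, p):
    """Linearly interpolated percentile of an already sorted list, p in [0, 100]."""
    pos = p / 100 * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def clip_to_percentiles(values, perc):
    """Hypothetical helper: clip values to the (low, high) percentiles in perc."""
    vals = sorted(values)
    low, high = percentile(vals, perc[0]), percentile(vals, perc[1])
    return [min(max(v, low), high) for v in values]


probs = [0.0, 0.01, 0.02, 0.5, 0.6, 0.7, 0.98, 0.99, 1.0]
clipped = clip_to_percentiles(probs, perc=(25, 75))
print(clipped)
```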

predict

GAMR.predict(x_test=None, key_added='_x_test', level=None, **kwargs)

Run the prediction.

This method can also compute the confidence interval.

Parameters:
  • x_test (ndarray | None) – Array of shape (n_samples,) used for prediction. If None, use x_test.

  • key_added (str) – Attribute name where to save the x_test for later use. If None, don’t save it.

  • level (float | None) – Confidence level for confidence interval calculation. Must be in \([0, 1]\). If None, don’t compute the confidence interval.

  • kwargs (Any) – Keyword arguments for underlying model’s prediction method.

Return type:

ndarray

Returns:

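: Returns the prediction and updates the following fields:

  • x_test - Independent variables of shape (n_samples, 1) used for prediction.

  • y_test - Prediction values of shape (n_samples,) for x_test.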

prepare

GAMR.prepare(*args, **kwargs)

Prepare the model to be ready for fitting.

This also removes the zero and negative weights and prepares the design matrix.

Parameters:
  • signal

    The observation-aligned quantity to fit along the trajectory. Either a Signal (Gene, Obs or Obsm) or, as a shorthand, a gene name in var_names (equivalent to Gene).

    Added in version 2.3.

  • lineage – Name of the lineage. If None, all weights will be set to \(1\).

  • time_key – Key in obs where the pseudotime is stored.

  • backward – Direction of the process.

  • time_range

    Specify start and end times:

    • tuple - it specifies the minimum and maximum pseudotime. Both values can be None, in which case the minimum is the earliest pseudotime and the maximum is automatically determined.

    • float - it specifies the maximum pseudotime.

  • data_key

    Deprecated since version 2.4: Pass a Signal via signal instead, e.g. Gene(name, layer=...) or Obs(name).

  • use_raw

    Deprecated since version 2.4: Pass Gene(name, use_raw=True) via signal instead.

  • gene

    Deprecated since version 2.4: Renamed to signal, which also accepts Signal objects.

  • threshold – Consider only cells with weights > threshold when estimating the test endpoint. If None, use the median of the weights.

  • weight_threshold – If a float, set all weights below weight_threshold to weight_threshold. If a tuple, set all weights below the first value to the second value.

  • filter_cells – Filter out all cells with expression values lower than this threshold.

  • n_test_points – Number of test points. If None, use the original points based on threshold.

  • args (Any)

  • kwargs (Any)

Return type:

GAMR

Returns:

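: Returns self and updates the following fields:

  • x - Filtered independent variables of shape (n_filtered_cells, 1) used for fitting.

  • y - Filtered dependent variables of shape (n_filtered_cells, 1) used for fitting.

  • w - Filtered weights of shape (n_filtered_cells,) used for fitting.

  • x_all - Unfiltered independent variables of shape (n_cells, 1).

  • y_all - Unfiltered dependent variables of shape (n_cells, 1).

  • w_all - Unfiltered weights of shape (n_cells,).

  • x_test - Independent variables of shape (n_samples, 1) used for prediction.

  • prepared - Whether the model is prepared for fitting.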
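The interplay of weight_threshold, threshold and n_test_points can be sketched in a few lines of plain Python. The two helpers are hypothetical. The rule used for the test endpoint (the latest pseudotime among cells whose weight exceeds the threshold) is an assumption based on the parameter description, not a statement about CellRank's code.

```python
from statistics import median


def apply_weight_threshold(weights, weight_threshold):
    """A float is both cutoff and fill value; a tuple is (cutoff, fill)."""
    if isinstance(weight_threshold, tuple):
        cutoff, fill = weight_threshold
    else:
        cutoff = fill = weight_threshold
    return [fill if w < cutoff else w for w in weights]


def make_test_points(pseudotime, weights, threshold=None, n_test_points=5):
    """Evenly spaced points from the earliest pseudotime to an estimated endpoint."""
    if threshold is None:
        threshold = median(weights)
    start = min(pseudotime)
    # Assumed rule: endpoint = latest pseudotime among cells above the threshold.
    end = max(t for t, w in zip(pseudotime, weights) if w > threshold)
    step = (end - start) / (n_test_points - 1)
    return [start + i * step for i in range(n_test_points)]


pseudotime = [0.0, 0.2, 0.4, 0.8, 1.0]
weights = [0.9, 0.7, 0.5, 0.005, 0.001]

floored = apply_weight_threshold(weights, 0.01)           # float form
replaced = apply_weight_threshold(weights, (0.01, 0.0))   # tuple form
grid = make_test_points(pseudotime, weights, n_test_points=5)
```

In this toy data the late cells carry almost no weight for the lineage, so the test points stop at pseudotime 0.2 instead of extending to 1.0.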

read

static GAMR.read(fname, adata=None, copy=False)

De-serialize self from a file.

Parameters:
  • fname (str | Path) – Path from which to read the object.

  • adata (AnnData | None) – AnnData object to assign to the saved object. Only used when the saved object has adata and it was saved without it.

  • copy (bool) – Whether to copy adata before assigning it. If adata is a view, it is always copied.

Return type:

IOMixin

Returns:

: The de-serialized object.

write

GAMR.write(fname, write_adata=True)

Serialize self to a file using pickle.

Parameters:
  • fname (str | Path) – Path where to save the object.

  • write_adata (bool) – Whether to save adata object.

Return type:

None

Returns:

: Nothing, just writes itself to a file.
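The read and write pair can be pictured as a pickle round trip in which the data object is optionally left out of the file and re-attached on load. The sketch below uses a hypothetical ToyModel class and a plain dict as a stand-in for an AnnData object; it illustrates the idea behind write_adata and the adata argument of read(), not CellRank's actual serialization code.

```python
import pickle
import tempfile
from pathlib import Path


class ToyModel:
    """Hypothetical stand-in for a model that holds a reference to its data."""

    def __init__(self, adata, n_knots=5):
        self.adata = adata
        self.n_knots = n_knots

    def write(self, fname, write_adata=True):
        state = dict(self.__dict__)
        if not write_adata:
            state["adata"] = None  # leave the data out of the file
        with open(fname, "wb") as fout:
            pickle.dump(state, fout)

    @staticmethod
    def read(fname, adata=None):
        with open(fname, "rb") as fin:
            state = pickle.load(fin)
        # Only re-attach the data if the file was written without it.
        if state["adata"] is None and adata is not None:
            state["adata"] = adata
        model = ToyModel.__new__(ToyModel)
        model.__dict__.update(state)
        return model


adata = {"n_obs": 3}  # stand-in for an AnnData object
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pickle"
    ToyModel(adata, n_knots=7).write(path, write_adata=False)
    restored = ToyModel.read(path, adata=adata)
```

Writing without the data keeps the file small when the same AnnData object is stored elsewhere; the copy option of read() is omitted from this sketch.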