cellrank.models.SKLearnModel#
- class cellrank.models.SKLearnModel(adata, model, weight_name=None, ignore_raise=False)[source]#
Wrapper around sklearn.base.BaseEstimator.
- Parameters:
adata (anndata.AnnData) – Annotated data object.
model (BaseEstimator) – Instance of the underlying sklearn estimator, such as sklearn.svm.SVR.
weight_name (Optional[str]) – Name of the weight argument for model.fit. If None, determine it automatically. If an empty string, no weights will be used.
ignore_raise (bool) – Do not raise an exception if the weight argument is not found in the fitting function of model. This is useful when the weight is passed in **kwargs and cannot be determined from the signature.
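The interplay of weight_name and ignore_raise can be sketched with signature inspection. This is a minimal illustration, not CellRank's implementation: the helper resolve_weight_name, the candidate names in CANDIDATES, and the fallback to "no weights" in automatic mode are all assumptions made for the example.

```python
import inspect
from typing import Callable, Optional

# Assumed candidate names for automatic detection; the real list may differ.
CANDIDATES = ("sample_weight", "weights", "w")


def resolve_weight_name(
    fit: Callable, weight_name: Optional[str] = None, ignore_raise: bool = False
) -> Optional[str]:
    """Return the keyword under which weights are passed to ``fit``, or None for no weights."""
    if weight_name == "":
        return None  # empty string: weights are never passed
    params = inspect.signature(fit).parameters
    if weight_name is None:
        # Automatic mode: take the first candidate present in the signature.
        # Falling back to "no weights" when none matches is an assumption.
        return next((name for name in CANDIDATES if name in params), None)
    if weight_name not in params and not ignore_raise:
        raise ValueError(f"Argument `{weight_name}` not found in the signature of `fit`.")
    return weight_name


def fit_explicit(x, y, sample_weight=None):
    """Weights are visible in the signature."""


def fit_kwargs(x, y, **kwargs):
    """Weights can only arrive through **kwargs."""


print(resolve_weight_name(fit_explicit))
print(resolve_weight_name(fit_kwargs, "sample_weight", ignore_raise=True))
```

With fit_kwargs, the name cannot be confirmed from the signature, so the lookup only succeeds when ignore_raise=True; this is the case the parameter description refers to.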
Attributes table#
adata | Annotated data object.
conf_int | Array of shape (n_samples, 2) containing the lower and upper bounds of the confidence interval.
model | The underlying sklearn.base.BaseEstimator.
prepared | Whether the model is prepared for fitting.
shape | Number of cells in adata.
w | Filtered weights of shape (n_filtered_cells,) used for fitting.
w_all | Unfiltered weights of shape (n_cells,).
x | Filtered independent variables of shape (n_filtered_cells, 1) used for fitting.
x_all | Unfiltered independent variables of shape (n_cells, 1).
x_hat | Filtered independent variables used when calculating default confidence interval, usually same as x.
x_test | Independent variables of shape (n_samples, 1) used for prediction.
y | Filtered dependent variables of shape (n_filtered_cells, 1) used for fitting.
y_all | Unfiltered dependent variables of shape (n_cells, 1).
y_hat | Filtered dependent variables used when calculating default confidence interval, usually same as y.
y_test | Prediction values of shape (n_samples,) for x_test.
Methods table#
confidence_interval | Calculate the confidence interval.
copy | Return a copy of self.
default_confidence_interval | Calculate the confidence interval, if the underlying model has no method for it.
fit | Fit the model.
plot | Plot the smoothed gene expression.
predict | Run the prediction.
prepare | Prepare the model to be ready for fitting.
read | De-serialize self from a file.
write | Serialize self to a file.
Attributes#
adata#
- SKLearnModel.adata#
Annotated data object.
- Returns:
adata (anndata.AnnData) – Annotated data object.
conf_int#
- SKLearnModel.conf_int#
Array of shape (n_samples, 2) containing the lower and upper bounds of the confidence interval.
model#
- SKLearnModel.model#
The underlying sklearn.base.BaseEstimator.
prepared#
- SKLearnModel.prepared#
Whether the model is prepared for fitting.
shape#
w#
- SKLearnModel.w#
Filtered weights of shape (n_filtered_cells,) used for fitting.
w_all#
- SKLearnModel.w_all#
Unfiltered weights of shape (n_cells,).
x#
- SKLearnModel.x#
Filtered independent variables of shape (n_filtered_cells, 1) used for fitting.
x_all#
- SKLearnModel.x_all#
Unfiltered independent variables of shape (n_cells, 1).
x_hat#
x_test#
- SKLearnModel.x_test#
Independent variables of shape (n_samples, 1) used for prediction.
y#
- SKLearnModel.y#
Filtered dependent variables of shape (n_filtered_cells, 1) used for fitting.
y_all#
- SKLearnModel.y_all#
Unfiltered dependent variables of shape (n_cells, 1).
y_hat#
y_test#
Methods#
confidence_interval#
- SKLearnModel.confidence_interval(x_test=None, **kwargs)[source]#
Calculate the confidence interval.
Use the default_confidence_interval() function if the underlying model has no method for confidence interval calculation.
- Parameters:
x_test (Optional[ndarray]) – Array of shape (n_samples,) used for confidence interval calculation. If None, use x_test.
kwargs – Keyword arguments for the underlying model’s confidence method or for default_confidence_interval().
- Return type:
- Returns:
Updates and returns the following field:
conf_int - Array of shape (n_samples, 2) containing the lower and upper bounds of the confidence interval.
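The fallback described above (use the estimator's own interval method when it has one, otherwise the default) can be sketched as a simple attribute lookup. Everything here is hypothetical: the two estimator classes, the method name looked up on the estimator, and the fixed-width placeholder band, which is not the formula from [DeSalvo, 1970]. For brevity the functions take predicted values directly rather than x_test.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]


def default_confidence_interval(y_pred: Sequence[float]) -> List[Interval]:
    # Placeholder band of fixed half-width 2.0; NOT the formula from [DeSalvo, 1970].
    return [(y - 2.0, y + 2.0) for y in y_pred]


class EstimatorWithCI:
    """Hypothetical estimator that ships its own interval method."""

    def confidence_interval(self, y_pred: Sequence[float]) -> List[Interval]:
        return [(y - 0.5, y + 0.5) for y in y_pred]


class EstimatorWithoutCI:
    """Hypothetical estimator with no interval method."""


def confidence_interval(model: object, y_pred: Sequence[float]) -> List[Interval]:
    """Prefer the estimator's own method; otherwise fall back to the default."""
    method = getattr(model, "confidence_interval", None)
    if callable(method):
        return method(y_pred)
    return default_confidence_interval(y_pred)


print(confidence_interval(EstimatorWithCI(), [1.0]))
print(confidence_interval(EstimatorWithoutCI(), [1.0]))
```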
copy#
default_confidence_interval#
- SKLearnModel.default_confidence_interval(x_test=None, **kwargs)#
Calculate the confidence interval if the underlying model has no method for it.
This formula is taken from [DeSalvo, 1970], eq. 5.
- Parameters:
x_test (Optional[ndarray]) – Array of shape (n_samples,) used for confidence interval calculation. If None, use x_test.
kwargs – Keyword arguments for the underlying model’s confidence method or for default_confidence_interval().
- Return type:
- Returns:
Updates and returns the following field:
conf_int - Array of shape (n_samples, 2) containing the lower and upper bounds of the confidence interval.
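Also updates the following fields:
x_hat - Filtered independent variables used when calculating default confidence interval, usually same as x.
y_hat - Filtered dependent variables used when calculating default confidence interval, usually same as y.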
fit#
- SKLearnModel.fit(x=None, y=None, w=None, **kwargs)[source]#
Fit the model.
- Parameters:
x (Optional[ndarray]) – Independent variables, array of shape (n_samples, 1). If None, use x.
y (Optional[ndarray]) – Dependent variables, array of shape (n_samples, 1). If None, use y.
w (Optional[ndarray]) – Optional weights of x, array of shape (n_samples,). If None, use w.
kwargs – Keyword arguments for the underlying model’s fitting function.
- Return type:
- Returns:
Fits the model and returns self.
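A rough sketch of the fitting behaviour described above: arguments left as None fall back to the prepared arrays, and the weights are forwarded to the estimator under the configured weight name. ToyWrapper and WeightedMean are hypothetical stand-ins written for this example, with plain lists in place of arrays; they are not the CellRank classes.

```python
class WeightedMean:
    """Toy estimator with an sklearn-like fit/predict interface (hypothetical)."""

    def fit(self, X, y, sample_weight=None):
        w = [1.0] * len(y) if sample_weight is None else sample_weight
        self.mean_ = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class ToyWrapper:
    """Hypothetical wrapper mirroring the fit(x=None, y=None, w=None, **kwargs) signature."""

    def __init__(self, model, weight_name="sample_weight"):
        self.model = model
        self.weight_name = weight_name
        # Stand-ins for the arrays that prepare() would populate.
        self.x = [[0.0], [1.0], [2.0]]
        self.y = [1.0, 2.0, 6.0]
        self.w = [1.0, 1.0, 2.0]

    def fit(self, x=None, y=None, w=None, **kwargs):
        x = self.x if x is None else x
        y = self.y if y is None else y
        w = self.w if w is None else w
        if self.weight_name:  # an empty name means "fit without weights"
            kwargs[self.weight_name] = w
        self.model.fit(x, y, **kwargs)
        return self


weighted = ToyWrapper(WeightedMean()).fit()
unweighted = ToyWrapper(WeightedMean(), weight_name="").fit()
print(weighted.model.mean_, unweighted.model.mean_)
```

Returning self from fit allows calls to be chained, which matches the "returns self" contract stated above.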
plot#
- SKLearnModel.plot(figsize=(8, 5), same_plot=False, hide_cells=False, perc=None, abs_prob_cmap=<matplotlib.colors.ListedColormap object>, cell_color=None, lineage_color='black', alpha=0.8, lineage_alpha=0.2, title=None, size=15, lw=2, cbar=True, margins=0.015, xlabel='pseudotime', ylabel='expression', conf_int=True, lineage_probability=False, lineage_probability_conf_int=False, lineage_probability_color=None, obs_legend_loc='best', dpi=None, fig=None, ax=None, return_fig=False, save=None, **kwargs)#
Plot the smoothed gene expression.
- Parameters:
same_plot (bool) – Whether to plot all trends in the same plot.
hide_cells (bool) – Whether to hide the cells.
perc (Optional[Tuple[float, float]]) – Percentile by which to clip the absorption probabilities.
abs_prob_cmap (ListedColormap) – Colormap to use when coloring in the absorption probabilities.
cell_color (Optional[str]) – Key in anndata.AnnData.obs or anndata.AnnData.var_names used for coloring the cells.
lineage_color (str) – Color for the lineage.
alpha (float) – Alpha channel for cells.
lineage_alpha (float) – Alpha channel for lineage confidence intervals.
size (int) – Size of the points.
lw (float) – Line width for the smoothed values.
cbar (bool) – Whether to show colorbar.
margins (float) – Margins around the plot.
xlabel (str) – Label on the x-axis.
ylabel (str) – Label on the y-axis.
conf_int (bool) – Whether to show the confidence interval.
lineage_probability (bool) – Whether to show smoothed lineage probability as a dashed line. Note that this will require 1 additional model fit.
lineage_probability_conf_int (Union[bool, float]) – Whether to compute and show smoothed lineage probability confidence interval. If self is cellrank.models.GAMR, it can also specify the confidence level, the default is 0.95. Only used when lineage_probability=True.
lineage_probability_color (Optional[str]) – Color to use when plotting the smoothed lineage_probability. If None, it’s the same as lineage_color. Only used when lineage_probability=True.
obs_legend_loc (Optional[str]) – Location of the legend when cell_color corresponds to a categorical variable.
fig (Optional[Figure]) – Figure to use, if None, create a new one.
ax (matplotlib.axes.Axes) – Ax to use, if None, create a new one.
return_fig (bool) – If True, return the figure object.
save (Optional[str]) – Filename where to save the plot. If None, just shows the plots.
kwargs – Keyword arguments for matplotlib.axes.Axes.legend(), e.g. to disable the legend, specify loc=None. Only available when lineage_probability=True.
- Return type:
- Returns:
Nothing, just plots the figure. Optionally saves it based on save.
predict#
prepare#
- SKLearnModel.prepare(gene, lineage, backward=False, time_range=None, data_key='X', time_key='latent_time', use_raw=False, threshold=None, weight_threshold=(0.01, 0.01), filter_cells=None, n_test_points=200)#
Prepare the model to be ready for fitting.
- Parameters:
gene (str) – Gene in anndata.AnnData.var_names.
lineage (Optional[str]) – Name of the lineage. If None, all weights will be set to 1.
backward (bool) – Direction of the process.
time_range (Union[float, Tuple[float, float], None]) – Specify start and end times.
data_key (Optional[str]) – Key in anndata.AnnData.layers or ‘X’ for anndata.AnnData.X. If use_raw = True, it’s always set to ‘X’.
time_key (str) – Key in anndata.AnnData.obs where the pseudotime is stored.
use_raw (bool) – Whether to access anndata.AnnData.raw.
threshold (Optional[float]) – Consider only cells with weights > threshold when estimating the test endpoint. If None, use the median of the weights.
weight_threshold (Union[float, Tuple[float, float]]) – Set all weights below weight_threshold to weight_threshold if a float, or to the second value, if a tuple.
filter_cells (Optional[float]) – Filter out all cells with expression values lower than this threshold.
n_test_points (int) – Number of test points. If None, use the original points based on threshold.
- Return type:
BaseModel
- Returns:
Nothing, just updates the following fields:
x - Filtered independent variables of shape (n_filtered_cells, 1) used for fitting.
y - Filtered dependent variables of shape (n_filtered_cells, 1) used for fitting.
w - Filtered weights of shape (n_filtered_cells,) used for fitting.
x_all - Unfiltered independent variables of shape (n_cells, 1).
y_all - Unfiltered dependent variables of shape (n_cells, 1).
w_all - Unfiltered weights of shape (n_cells,).
x_test - Independent variables of shape (n_samples, 1) used for prediction.
prepared - Whether the model is prepared for fitting.
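The effect of weight_threshold and filter_cells on the filtered and unfiltered fields can be illustrated with plain lists. This is a sketch under stated assumptions: clip_weights and filter_by_expression are hypothetical helpers, the first element of a weight_threshold tuple is assumed to be the cutoff, and the order of clipping and filtering is chosen for illustration only.

```python
from typing import List, Sequence, Tuple, Union


def clip_weights(
    w: Sequence[float], weight_threshold: Union[float, Tuple[float, float]]
) -> List[float]:
    """Replace weights below the cutoff by the cutoff (float) or by the second value (tuple)."""
    if isinstance(weight_threshold, tuple):
        # Assumption: the first element is the cutoff, the second the replacement.
        threshold, value = weight_threshold
    else:
        threshold = value = weight_threshold
    return [value if wi < threshold else wi for wi in w]


def filter_by_expression(x_all, y_all, w_all, filter_cells):
    """Keep only cells whose expression is not lower than ``filter_cells``."""
    keep = [yi >= filter_cells for yi in y_all]

    def pick(values):
        return [v for v, k in zip(values, keep) if k]

    return pick(x_all), pick(y_all), pick(w_all)


x_all = [0.0, 0.5, 1.0, 1.5]  # pseudotime of every cell
y_all = [0.0, 2.0, 3.0, 0.1]  # expression of the gene in every cell
w_all = clip_weights([0.0, 0.004, 0.6, 0.9], (0.01, 0.01))
x, y, w = filter_by_expression(x_all, y_all, w_all, filter_cells=0.5)
print(w_all, x, y, w)
```

The `_all` lists keep one entry per cell, while x, y and w contain only the cells that pass the expression filter, mirroring the n_cells versus n_filtered_cells shapes listed above.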
read#
- static SKLearnModel.read(fname, adata=None, copy=False)#
De-serialize self from a file.
- Parameters:
fname (Union[str, Path]) – Filename from which to read the object.
adata (Optional[AnnData]) – anndata.AnnData object to assign to the saved object. Only used when the saved object has adata and it was saved without it.
copy (bool) – Whether to copy adata before assigning it or not. If adata is a view, it is always copied.
- Return type:
IOMixin
- Returns:
The de-serialized object.
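The role of the adata and copy arguments can be sketched with a pickle round trip. This is an illustration only: ToyModel, the write_adata flag, and the use of pickle as the storage format are assumptions made for the example, and a plain dict stands in for anndata.AnnData.

```python
import copy as copy_module
import pickle
import tempfile
from pathlib import Path


class ToyModel:
    """Hypothetical stand-in for a model that holds a reference to ``adata``."""

    def __init__(self, adata, params):
        self.adata = adata
        self.params = params

    def write(self, fname, write_adata=False):
        # `write_adata` is a hypothetical flag used only in this sketch.
        state = dict(self.__dict__)
        if not write_adata:
            state["adata"] = None
        with open(fname, "wb") as fout:
            pickle.dump(state, fout)

    @staticmethod
    def read(fname, adata=None, copy=False):
        with open(fname, "rb") as fin:
            state = pickle.load(fin)
        obj = ToyModel.__new__(ToyModel)
        obj.__dict__.update(state)
        # `adata` is only used when the object was saved without it.
        if obj.adata is None and adata is not None:
            obj.adata = copy_module.deepcopy(adata) if copy else adata
        return obj


adata = {"obs": ["cell_0", "cell_1"]}  # plain dict standing in for anndata.AnnData
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pickle"
    ToyModel(adata, params={"C": 1.0}).write(path)
    shared = ToyModel.read(path, adata=adata)
    copied = ToyModel.read(path, adata=adata, copy=True)

print(shared.adata is adata, copied.adata is adata)
```

Saving without the data keeps the file small; passing copy=True on read trades memory for isolation, so later changes to the restored object's data do not leak into the caller's object.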