pmdarima.model_selection.cross_val_predict

pmdarima.model_selection.cross_val_predict(estimator, y, X=None, cv=None, verbose=0, averaging='mean', return_raw_predictions=False, **kwargs)

Generate cross-validated estimates for each input data point.

Parameters:

estimator : estimator

An estimator object that implements the fit method.

y : array-like or iterable, shape=(n_samples,)

The time-series array.

X : array-like, shape=[n_obs, n_vars], optional (default=None)

An optional 2-d array of exogenous variables.

cv : BaseTSCrossValidator or None, optional (default=None)

A cross-validator instance. If None, a RollingForecastCV is used. Note that for cross-validation predictions, the CV step cannot exceed the CV horizon, or there will be a gap between fold predictions.
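To see why a step larger than the horizon leaves gaps, here is a minimal sketch of a simplified rolling-window model. It is not the library's implementation: the helpers `forecasted_steps` and `has_gap`, and the `initial` training-size argument, are hypothetical names used only for illustration.

```python
def forecasted_steps(n_samples, initial, h, step):
    """Return the sorted time indices forecast by at least one fold.

    Simplified model (an assumption): each fold trains on [0, train_end)
    and forecasts the next h samples; train_end then advances by `step`.
    """
    covered = set()
    train_end = initial
    while train_end + h <= n_samples:
        covered.update(range(train_end, train_end + h))
        train_end += step
    return sorted(covered)


def has_gap(indices):
    """True if consecutive forecast indices skip at least one time step."""
    return any(b - a > 1 for a, b in zip(indices, indices[1:]))


# step == h: fold forecasts tile the series with no gaps.
no_gap = forecasted_steps(n_samples=12, initial=4, h=2, step=2)

# step > h: some time steps are never forecast by any fold.
gappy = forecasted_steps(n_samples=12, initial=4, h=2, step=3)
```

Under these assumptions, `has_gap(no_gap)` is False while `has_gap(gappy)` is True, because indices 6 and 9 fall between consecutive test windows.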

verbose : integer, optional

The verbosity level.

averaging : str or callable, one of ["median", "mean"] (default="mean")

Unlike normal CV, time series CV might have different folds (windows) forecasting the same time step. After all forecast windows are made, we build a matrix of shape (n_samples, n_folds), populating each fold’s forecasts like so:

nan nan nan  # training samples
nan nan nan
nan nan nan
nan nan nan
  1 nan nan  # test samples
  4   3 nan
  3 2.5 3.5
nan   6   5
nan nan   4

We then average each time step’s forecasts to end up with our final prediction results.
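The averaging step can be sketched with NumPy's NaN-aware reductions. This is an illustration using the matrix shown above, not necessarily how the library computes it; the variable names are hypothetical, and dropping rows that no fold forecast is an assumption made here to avoid reducing over empty slices.

```python
import numpy as np

nan = np.nan

# Rows are time steps, columns are folds (values copied from the matrix above).
fold_preds = np.array([
    [nan, nan, nan],  # training samples
    [nan, nan, nan],
    [nan, nan, nan],
    [nan, nan, nan],
    [1.0, nan, nan],  # test samples
    [4.0, 3.0, nan],
    [3.0, 2.5, 3.5],
    [nan, 6.0, 5.0],
    [nan, nan, 4.0],
])

# Keep only the time steps that at least one fold forecast.
has_forecast = ~np.isnan(fold_preds).all(axis=1)
forecasted = fold_preds[has_forecast]

# Reduce across folds, ignoring the NaN placeholders.
mean_preds = np.nanmean(forecasted, axis=1)      # analogous to averaging="mean"
median_preds = np.nanmedian(forecasted, axis=1)  # analogous to averaging="median"
```

For this small matrix the mean and the median coincide at every time step; they differ only when three or more folds overlap with asymmetric forecasts.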

return_raw_predictions : bool (default=False)

If True, raw predictions are returned instead of averaged ones. This results in a matrix of shape (n_samples, h). For example, if h=3, and step=1 then:

nan nan nan  # training samples
nan nan nan
nan nan nan
nan nan nan
  1   4   2  # test samples
  2   5   7
  8   9   1
nan nan nan
nan nan nan

The first column contains all one-step-ahead predictions, the second column all two-step-ahead predictions, and so on. Further metrics can then be calculated as desired.
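As one example of such a metric, the sketch below computes a mean absolute error per forecast horizon. It rests on an assumption inferred from the example layout rather than confirmed by this page: that row t holds the forecasts made for times t, t+1, ..., t+h-1, so entry (t, j) is compared against y[t + j]. The observed series `y` and the helper `per_horizon_mae` are hypothetical.

```python
import numpy as np

nan = np.nan

# Raw prediction matrix from the example above (h=3, step=1).
raw = np.array([
    [nan, nan, nan],  # training samples
    [nan, nan, nan],
    [nan, nan, nan],
    [nan, nan, nan],
    [1.0, 4.0, 2.0],  # test samples
    [2.0, 5.0, 7.0],
    [8.0, 9.0, 1.0],
    [nan, nan, nan],
    [nan, nan, nan],
])

# Hypothetical observed series, for illustration only.
y = np.array([5.0, 6.0, 7.0, 8.0, 2.0, 4.0, 6.0, 8.0, 3.0])


def per_horizon_mae(raw, y):
    """Mean absolute error for each forecast horizon (one value per column).

    Assumes raw[t, j] is the (j + 1)-step-ahead forecast of y[t + j].
    """
    n_samples, h = raw.shape
    maes = []
    for j in range(h):
        rows = np.flatnonzero(~np.isnan(raw[:, j]))
        rows = rows[rows + j < n_samples]  # stay within the observed series
        errors = np.abs(raw[rows, j] - y[rows + j])
        maes.append(errors.mean())
    return np.array(maes)


mae_by_horizon = per_horizon_mae(raw, y)
```

If the row alignment in your version of the library differs, only the `y[rows + j]` indexing needs to change.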

Examples

>>> import pmdarima as pm
>>> from pmdarima.model_selection import cross_val_predict, \
...     RollingForecastCV
>>> y = pm.datasets.load_wineind()
>>> cv = RollingForecastCV(h=14, step=12)
>>> preds = cross_val_predict(
...     pm.ARIMA((1, 1, 2), seasonal_order=(0, 1, 1, 12)), y, cv=cv)
>>> preds[:5]
array([30710.45743168, 34902.94929722, 17994.16587163, 22127.71167249,
       25473.60876435])
