pmdarima.pipeline.Pipeline

class pmdarima.pipeline.Pipeline(steps)

A pipeline of transformers with an optional final estimator stage

The pipeline object chains together an arbitrary number of named, ordered transformations, passing the output from one as the input to the next. As the last stage, an ARIMA or AutoARIMA object will be fit. This pipeline is modeled on the scikit-learn sklearn.pipeline.Pipeline object, which behaves similarly but does not share the time-series interface that pmdarima follows.

The purpose of the pipeline is to assemble several steps that can be cross-validated together while setting different parameters. For this, it enables setting parameters of the various steps using their names and the parameter name separated by a ‘__’, as in the example below.

Parameters:

steps : list

List of (name, transform) tuples, given in the order in which they are chained. The transformers implement fit/transform, and the last object must be an ARIMA or AutoARIMA estimator.

Attributes

named_steps Map the steps to a dictionary

Examples

>>> from pmdarima.datasets import load_wineind
>>> from pmdarima.arima import AutoARIMA
>>> from pmdarima.pipeline import Pipeline
>>> from pmdarima.preprocessing import FourierFeaturizer
>>>
>>> wineind = load_wineind()
>>> pipeline = Pipeline([
...     ("fourier", FourierFeaturizer(m=12, k=3)),
...     ("arima", AutoARIMA(seasonal=False, stepwise=True,
...                         suppress_warnings=True,
...                         error_action='ignore'))
... ])
>>> pipeline.fit(wineind)
Pipeline(steps=[('fourier', FourierFeaturizer(k=3, m=12)),
                ('arima', AutoARIMA(D=None, alpha=0.05, callback=None,
                                    d=None, disp=0, error_action='ignore',
                                    information_criterion='aic', m=1,
                                    max_D=1, max_P=2, max_Q=2, max_d=2,
                                    max_order=10, max_p=5, max_q=5,
                                    maxiter=None, method=None,
                                    n_fits=10, n...s_warnings=True,
                                    test='kpss', trace=False,
                                    transparams=True, trend=None,
                                    with_intercept=True))])

Methods

fit(y[, X]) Fit the pipeline of transformers and the ARIMA model
get_params([deep]) Get parameters for this estimator.
predict([n_periods, X, return_conf_int, …]) Forecast future (transformed) values
predict_in_sample([X, start, end, dynamic, …]) Generate in-sample predictions from the fit pipeline.
set_params(**params) Set the parameters of this estimator.
summary() Get a summary of the ARIMA model
transform([n_periods, X]) Get the transformed X array
update(y[, X, maxiter]) Update an ARIMA or auto-ARIMA as well as any necessary transformers

__init__(steps)

Initialize self. See help(type(self)) for accurate signature.

fit(y, X=None, **fit_kwargs)

Fit the pipeline of transformers and the ARIMA model

Chain the time-series and X array through a series of transformations, fitting each stage along the way, finally fitting an ARIMA or AutoARIMA model.

Parameters:

y : array-like or iterable, shape=(n_samples,)

The time-series to which to fit the ARIMA estimator. This may either be a Pandas Series object (statsmodels can internally use the dates in the index), or a numpy array. This should be a one-dimensional array of floats, and should not contain any np.nan or np.inf values.

X : array-like, shape=[n_obs, n_vars], optional (default=None)

An optional 2-d array of exogenous variables. If provided, these variables are used as additional features in the regression operation. This should not include a constant or trend. Note that if an ARIMA is fit on exogenous features, it must be provided exogenous features for making predictions.

**fit_kwargs : keyword args

Extra keyword arguments passed to each stage’s fit method. As with scikit-learn pipeline keyword args, the keys are compound: the stage name and the argument name joined by a “__”. For instance, if fitting an ARIMA in stage “arima”, your kwargs may resemble:

{"arima__maxiter": 10}
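
How compound keys of this form could be routed to their stages can be sketched in plain Python. The helper split_stage_kwargs is hypothetical and is not part of pmdarima; it only illustrates the naming convention and makes no claim about how the library parses the keys internally.

```python
from collections import defaultdict


def split_stage_kwargs(step_names, **kwargs):
    """Group compound ``stage__arg`` keys by stage name (illustrative only)."""
    grouped = defaultdict(dict)
    for key, value in kwargs.items():
        # Split on the first "__" only, so the argument part is kept whole.
        stage, sep, arg = key.partition("__")
        if not sep or stage not in step_names:
            raise ValueError(f"Cannot route keyword argument {key!r}")
        grouped[stage][arg] = value
    return dict(grouped)


routed = split_stage_kwargs(["fourier", "arima"], arima__maxiter=10)
print(routed)  # {'arima': {'maxiter': 10}}
```

Each stage would then receive only the arguments filed under its own name.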

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : mapping of string to any

Parameter names mapped to their values.

named_steps

Map the steps to a dictionary

predict(n_periods=10, X=None, return_conf_int=False, alpha=0.05, inverse_transform=True, **kwargs)

Forecast future (transformed) values

Generate predictions (forecasts) n_periods in the future. Note that if an X array was used in the model fit, it will be expected for the predict procedure and will fail otherwise. Forecasts may be transformed by the endogenous steps along the way and might be on a different scale than raw training/test data.

Parameters:

n_periods : int, optional (default=10)

The number of periods in the future to forecast.

X : array-like, shape=[n_obs, n_vars], optional (default=None)

An optional 2-d array of exogenous variables. If provided, these variables are used as additional features in the regression operation. This should not include a constant or trend. Note that if an ARIMA is fit on exogenous features, it must be provided exogenous features for making predictions.

return_conf_int : bool, optional (default=False)

Whether to get the confidence intervals of the forecasts.

alpha : float, optional (default=0.05)

The confidence intervals for the forecasts are at the (1 - alpha) * 100% level, so alpha=0.05 gives 95% intervals.

inverse_transform : bool, optional (default=True)

Whether to inverse transform predictions, if they are in log or BoxCox scale. Any endog transformer will be inverse-transformed.

**kwargs : keyword args

Extra keyword arguments passed to each stage’s transform method and to the estimator’s predict method. As with scikit-learn pipeline keyword args, the keys are compound: the stage name and the argument name joined by a “__”. For instance, if you have a FourierFeaturizer whose stage is named “fourier”, your transform kwargs could resemble:

{"fourier__n_periods": 50}

Returns:

forecasts : array-like, shape=(n_periods,)

The array of transformed, forecasted values.

conf_int : array-like, shape=(n_periods, 2), optional

The confidence intervals for the forecasts. Only returned if return_conf_int is True.

predict_in_sample(X=None, start=None, end=None, dynamic=False, return_conf_int=False, alpha=0.05, inverse_transform=True, **kwargs)

Generate in-sample predictions from the fit pipeline.

Predicts the original training (in-sample) time series values. This can be useful when wanting to visualize the fit, and qualitatively inspect the efficacy of the model, or when wanting to compute the residuals of the model.

Parameters:

X : array-like, shape=[n_obs, n_vars], optional (default=None)

An optional 2-d array of exogenous variables. If provided, these variables are used as additional features in the regression operation. This should not include a constant or trend. Note that if an ARIMA is fit on exogenous features, it must be provided exogenous features for making predictions.

start : int, optional (default=None)

Zero-indexed observation number at which to start forecasting, i.e., the first forecast is start.

end : int, optional (default=None)

Zero-indexed observation number at which to end forecasting, i.e., the last forecast is end.

dynamic : bool, optional (default=False)

The dynamic keyword affects in-sample prediction. If dynamic is False, then the in-sample lagged values are used for prediction. If dynamic is True, then in-sample forecasts are used in place of lagged dependent variables. The first forecasted value is start.

return_conf_int : bool, optional (default=False)

Whether to get the confidence intervals of the forecasts.

alpha : float, optional (default=0.05)

The confidence intervals for the forecasts are at the (1 - alpha) * 100% level, so alpha=0.05 gives 95% intervals.

inverse_transform : bool, optional (default=True)

Whether to inverse transform predictions, if they are in log or BoxCox scale. Any endog transformer will be inverse-transformed.

**kwargs : keyword args

Extra keyword arguments passed to each stage’s transform method. As with scikit-learn pipeline keyword args, the keys are compound: the stage name and the argument name joined by a “__”. For instance, if you have a FourierFeaturizer whose stage is named “fourier”, your transform kwargs could resemble:

{"fourier__n_periods": 50}

Returns:

preds : array

The predicted values.

conf_int : array-like, shape=(n_periods, 2), optional

The confidence intervals for the predictions. Only returned if return_conf_int is True.
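
The difference between dynamic=False and dynamic=True can be illustrated without pmdarima. The sketch below assumes a toy AR(1) model, y_t = phi * y_{t-1}, with a known coefficient phi; the function ar1_in_sample is hypothetical and does not reflect how the library computes its predictions internally.

```python
def ar1_in_sample(y, phi, start, dynamic):
    """Predict y[start:] under y_t = phi * y_{t-1} (illustrative only).

    ``start`` must be at least 1 so that a lagged value exists.
    """
    preds = []
    for t in range(start, len(y)):
        if dynamic and preds:
            lagged = preds[-1]  # reuse the previous in-sample forecast
        else:
            lagged = y[t - 1]   # use the observed lagged value
        preds.append(phi * lagged)
    return preds


y = [1.0, 2.0, 4.0, 8.0]
static = ar1_in_sample(y, phi=0.5, start=1, dynamic=False)
dyn = ar1_in_sample(y, phi=0.5, start=1, dynamic=True)
print(static)  # [0.5, 1.0, 2.0]
print(dyn)     # [0.5, 0.25, 0.125]
```

Both runs agree on the first prediction at start; after that, the dynamic run feeds its own forecasts forward instead of the observed series.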

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params : dict

Estimator parameters.

Returns:

self : object

Estimator instance.

summary()

Get a summary of the ARIMA model

transform(n_periods=10, X=None, **kwargs)

Get the transformed X array

Generate the X array n_periods in the future. This passes the provided exog variables through all transformation steps and returns the result before it is passed into the model.

Parameters:

n_periods : int, optional (default=10)

The number of periods in the future to forecast.

X : array-like, shape=[n_obs, n_vars], optional (default=None)

An optional 2-d array of exogenous variables. If provided, these variables are used as additional features in the regression operation. This should not include a constant or trend. Note that if an ARIMA is fit on exogenous features, it must be provided exogenous features for making predictions.

**kwargs : keyword args

Extra keyword arguments passed to each stage’s transform method and to the estimator’s predict method. As with scikit-learn pipeline keyword args, the keys are compound: the stage name and the argument name joined by a “__”. For instance, if you have a FourierFeaturizer whose stage is named “fourier”, your transform kwargs could resemble:

{"fourier__n_periods": 50}

Returns:

X_prime : pd.DataFrame

The transformed exog array.

update(y, X=None, maxiter=None, **kwargs)

Update an ARIMA or auto-ARIMA as well as any necessary transformers

Passes the newly observed values through the appropriate endog transformations, and the X array through the exog transformers (updating where necessary) before finally updating the ARIMA model.

Parameters:

y : array-like or iterable, shape=(n_samples,)

The time-series data to add to the endogenous samples on which the ARIMA estimator was previously fit. This may either be a Pandas Series object or a numpy array. This should be a one-dimensional array of finite floats.

X : array-like, shape=[n_obs, n_vars], optional (default=None)

An optional 2-d array of exogenous variables. If the model was fit with an exogenous array of covariates, it will be required for updating the observed values.

maxiter : int, optional (default=None)

The number of iterations to perform when updating the model. If None, will perform max(5, n_samples // 10) iterations.

**kwargs : keyword args

Extra keyword arguments passed to each stage’s update method. As with scikit-learn pipeline keyword args, the keys are compound: the stage name and the argument name joined by a “__”.
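
The default iteration count described for maxiter can be worked through in a few lines. The helper default_update_maxiter is hypothetical, and the sketch assumes that n_samples refers to the number of newly observed values passed to update.

```python
def default_update_maxiter(n_samples, maxiter=None):
    """Return the iteration count implied by the documented default."""
    if maxiter is not None:
        return maxiter
    # Never fewer than 5 iterations; larger updates get n_samples // 10.
    return max(5, n_samples // 10)


print(default_update_maxiter(12))              # 5
print(default_update_maxiter(240))             # 24
print(default_update_maxiter(240, maxiter=3))  # 3
```

Under this assumption, updates of fewer than 60 observations all run 5 iterations unless maxiter is set explicitly.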