What’s new in pmdarima

As new releases of pmdarima are pushed out, the following list (introduced in v0.8.1) will document the latest features.

v1.7.0

Address issue #341, where a combination of large m and D values could difference an array until it was too small for the ADF test of stationarity.
Fix issue #351 where a large value of m could prevent the seasonality test from completing.
Fix issue #354 where models with near non-invertible roots could still be considered as candidate best-fits.
Remove legacy pickling behavior that separates the statsmodels object from the pmdarima object. This breaks backwards compatibility with versions pre-1.2.0.
Change the default with_intercept in pmdarima.arima.auto_arima() from True to 'auto'. This behaves much like before, since a truthiness check on 'auto' still returns True, but it allows the stepwise search to selectively set it to False under certain differencing conditions.
Inverse endog transformation is now supported when return_conf_int=True on pipeline predictions (thanks to skyetim)
Fix a bug where pmdarima.model_selection.SlidingWindowForecastCV could produce too few splits for the given input data.
Permit custom scoring metrics to be passed for out-of-sample scoring, as requested in #368.
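The length arithmetic behind #341 can be sketched in a few lines. Each seasonal difference at lag m removes m observations, so D seasonal differences leave n - m * D values. This is only an illustration: the helper names below are hypothetical and are not part of pmdarima.

```python
def seasonal_diff(y, m):
    """Apply one seasonal difference: y[t] - y[t - m]."""
    return [y[t] - y[t - m] for t in range(m, len(y))]


def length_after_seasonal_diff(n_obs, m, D):
    """Number of observations left after D seasonal differences at lag m."""
    return max(n_obs - m * D, 0)


y = list(range(60))                 # 60 observations
once = seasonal_diff(y, m=12)       # 48 values remain
twice = seasonal_diff(once, m=12)   # 36 values remain
print(len(once), len(twice))

# With a large m and D=2, very little is left for a stationarity test.
print(length_after_seasonal_diff(60, m=28, D=2))
```

With 60 observations, m=28 and D=2, only 4 values remain, which is too few for a meaningful test.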

v1.6.1

Pin Cython to be >=0.29,<0.29.18
Pin statsmodels to be >=0.11

v1.6.0

Support newest versions of matplotlib
Add a new auto_arima error action, “trace”, which warns on errors and dumps the original stacktrace.
New featurizer: pmdarima.preprocessing.DateFeaturizer. This can be used to create dummy and ordinal exogenous features and is useful when modeling pseudo-seasonal trends or time series with holes in them.
Removes first-party conda distributions (see #326)
Raise a ValueError in arima.predict_in_sample when start < d
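To illustrate what a “trace” style error action adds over a plain warning, here is a minimal sketch. It is not pmdarima's implementation: the helper report_fit_error and its error_action argument are hypothetical and only mirror the idea of warning while attaching the original stacktrace.

```python
import traceback
import warnings


def report_fit_error(exc, error_action="warn"):
    """Hypothetical helper: warn about a failed fit, optionally with its traceback."""
    message = f"Model fit failed: {exc!r}"
    if error_action == "trace":
        # Append the original stack trace so the failure can be debugged.
        frames = traceback.format_exception(type(exc), exc, exc.__traceback__)
        message += "\n" + "".join(frames)
    warnings.warn(message)


try:
    raise ValueError("bad starting parameters")
except ValueError as err:
    # Record the warning so it can be inspected instead of printed to stderr.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        report_fit_error(err, error_action="trace")

print(caught[0].message)
```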

v1.5.3

Adds first-party conda distributions as requested in #173
- Due to dependency limitations, we only support 64-bit architectures and Python 3.6 or 3.7
Adds Python 3.8 support as requested in #199
Added pmdarima.datasets.load_gasoline() dataset
Added integer levels of verbosity in the trace argument
Added support for statsmodels 0.11+
Added pmdarima.model_selection.cross_val_predict(), as requested in #291

v1.5.2

Added pmdarima.show_versions as a utility for issue filing
Fixed deprecation for check_is_fitted in newer versions of scikit-learn
Adds the pmdarima.datasets.load_sunspots() method with R’s sunspots dataset
Adds the pmdarima.model_selection.train_test_split() method
Fix bug where 1.5.1 documentation was labeled version “0.0.0”.
Fix bug reported in #271, where the use of threading.local to store stepwise context information may have broken job schedulers.
Fix bug reported in #272, where the new default value of max_order can cause a ValueError even in default cases when stepwise=False.

v1.5.1

No longer use statsmodels’ ARIMA or ARMA class under the hood; only use the SARIMAX model, which cuts back on a lot of errors/warnings we saw in the past. (#211)
Defaults in the ARIMA class that have changed as a result of #211:
- maxiter is now 50 (was None)
- method is now ‘lbfgs’ (was None)
- seasonal_order is now (0, 0, 0, 0) (was None)
- max_order is now 5 (was 10) and is no longer used as a constraint when stepwise=True
Correct bug where aicc always added 1 (for constant) to degrees of freedom, even when df_model accounted for the constant term.
New pmdarima.arima.auto.StepwiseContext feature for more control over fit duration (introduced by @kpsunkara in #221).
Adds the pmdarima.preprocessing.LogEndogTransformer class as discussed in #205
Exogenous arrays are no longer cast to numpy array by default, and will pass pandas frames through to the model. This keeps variable names intact in the summary (#222)
Added the prefix param to exogenous featurizers to allow the addition of meaningful names to engineered features.
Added polyroot test of near non-invertibility when stepwise=True. Models that are near non-invertible will be deprioritized in model selection, as requested in #208.
Removes pmdarima.arima.ARIMA.add_new_samples, which was previously deprecated. Use pmdarima.arima.ARIMA.update() instead.
The following args have been deprecated from the pmdarima.arima.ARIMA class as well as pmdarima.arima.auto_arima() and any other calling methods/classes:
- disp^[1]
- callback^[1]
- transparams
- solver
- typ
[1] These can still be passed to the fit method via **fit_kwargs, but should no longer be passed to the model constructor.
Added diff_inv function that is in parity with R’s implementation, diffinv, as requested in #180.
Added decompose function that is in parity with R’s implementation, decompose, as requested in #190
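As a rough illustration of what inverting a difference means, the sketch below handles only the simplest case (lag 1, a single difference) using a cumulative sum. The function name diff_inv_sketch and its xi argument are assumptions for this example and do not reproduce the library's signature or its handling of other lags and orders.

```python
def diff_inv_sketch(x, xi=0.0):
    """Invert a lag-1, first-order difference by cumulative summation.

    ``xi`` is the starting value; the result has len(x) + 1 elements.
    """
    out = [xi]
    for step in x:
        out.append(out[-1] + step)
    return out


y = [3, 5, 4, 8]
diffs = [b - a for a, b in zip(y, y[1:])]   # [2, -1, 4]

# Re-integrating from the first observation recovers the series.
print(diff_inv_sketch(diffs, xi=y[0]))      # [3, 5, 4, 8]
```

Without a starting value, the series is recovered only up to a constant offset, which is why the initial value is an explicit argument here.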

v1.4.0

Fixes #191, an issue where the OCSB test could raise "ValueError: negative dimensions are not allowed".
Add option to automatically inverse-transform endogenous transformations when predicting from pipelines (#197)
Add predict_in_sample to pipeline (#196)
Parameterize dtype option in datasets module
Adds the model_selection submodule, which defines several different cross-validation classes as well as CV functions.
Adds the pmdarima.datasets.load_taylor() dataset
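For readers unfamiliar with time-series cross-validation, one common scheme is a sliding window: a fixed-size training window moves forward through the series, and the next h points after it form the test fold. The generator below is a minimal sketch of that idea; its name and parameters are hypothetical and it is not the code used by the model_selection classes.

```python
def sliding_window_splits(n_samples, window_size, h=1, step=1):
    """Yield (train_indices, test_indices) pairs for a sliding window.

    Each training window holds ``window_size`` consecutive points and is
    followed immediately by ``h`` test points.
    """
    start = 0
    while start + window_size + h <= n_samples:
        train = list(range(start, start + window_size))
        test = list(range(start + window_size, start + window_size + h))
        yield train, test
        start += step


splits = list(sliding_window_splits(n_samples=10, window_size=5, h=2))
for train, test in splits:
    print(train, test)
```

Unlike shuffled k-fold splitting, every test fold lies strictly after its training window, so no future observations leak into the fit.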

v1.3.0

Adds a new dataset for stock prediction, along with an associated example (load_msft)
Fixes a bug in predict_in_sample, as addressed in #140.
Numpy 1.16+ is now required
Statsmodels 0.10.0+ is now required
Added sarimax_kwargs to ARIMA constructor and auto_arima function. This fixes #146

v1.2.1

Pins scipy at 1.2.0 to avoid a statsmodels bug.

v1.2.0

Adds the OCSBTest of seasonality, as discussed in #88
Default value of seasonal_test changes from “ch” to “ocsb” in auto_arima
Default value of test changes from “ch” to “ocsb” in nsdiffs
Adds benchmarking notebook and capabilities in pytest plugins
Removes the following environment variables, which are now deprecated:
- PMDARIMA_CACHE and PYRAMID_ARIMA_CACHE
- PMDARIMA_CACHE_WARN_SIZE and PYRAMID_ARIMA_CACHE_WARN_SIZE
- PYRAMID_MPL_DEBUG
- PYRAMID_MPL_BACKEND
Deprecates the is_stationary method in tests of stationarity. This will be removed in v1.4.0. Use should_diff instead.
Adds two new datasets: airpassengers & austres
When using out_of_sample, the out-of-sample predictions are now stored under the oob_preds_ attribute.
Adds a number of transformer classes including:
- BoxCoxEndogTransformer
- FourierFeaturizer
Adds a Pipeline class resembling that of scikit-learn’s, which allows the stacking of transformers together.
Adds a class wrapper for auto_arima: AutoARIMA. This allows auto-ARIMA to be used with pipelines.
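The kind of exogenous features a Fourier featurizer produces can be sketched with the standard sine and cosine terms for a seasonal period m. This is an illustration of the general technique only: the function name, its arguments, and the column ordering are assumptions and may differ from FourierFeaturizer's actual output.

```python
import math


def fourier_terms(n_samples, m, k):
    """Return n_samples rows of [sin_1, cos_1, ..., sin_k, cos_k] for period m."""
    rows = []
    for t in range(1, n_samples + 1):
        row = []
        for j in range(1, k + 1):
            angle = 2.0 * math.pi * j * t / m
            row.extend([math.sin(angle), math.cos(angle)])
        rows.append(row)
    return rows


features = fourier_terms(n_samples=24, m=12, k=2)
print(len(features), len(features[0]))  # 24 rows, 4 columns
```

Because every column repeats with period m, a small k can stand in for a long seasonal cycle without a large number of seasonal parameters.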

v1.1.1

v1.1.1 is a patch release in response to #104

Deprecates the ARIMA.add_new_observations method. This method was originally designed to support updating the endogenous/exogenous arrays with new observations without changing the model parameters, but achieving this behavior for each of statsmodels’ ARMA, ARIMA and SARIMAX classes proved nearly impossible, given the extremely complex internals of statsmodels estimators.
Replaces ARIMA.add_new_observations with ARIMA.update. This allows the user to update the model with new observations by taking maxiter new steps from the existing model coefficients and allowing the MLE to converge to an updated set of model parameters.
Changes default maxiter to None, using 50 for seasonal models and 500 for non-seasonal models (as statsmodels does). The default value used to be 50 for all models.
New behavior in ARIMA.fit allows start_params and maxiter to be passed as **fit_args, overriding the use of their corresponding instance attributes.

v1.1.0

Adds ARIMA.plot_diagnostics method, as requested in #49
Adds new arg to ARIMA constructor and auto_arima: with_intercept (default is True).
New default for trend is no longer 'c', it is None.
Adds to_dict method to ARIMA class to address Issue #54
ARIMA serialization no longer stores statsmodels results wrappers in the cache, but bundles them into the pickle file. This solves Issue #48 and only works on statsmodels 0.9.0+ since they’ve fixed a bug on their end.
The 'PMDARIMA_CACHE' and 'PMDARIMA_CACHE_WARN_SIZE' environment variables are now deprecated, since they no longer need to be used.
Added versioned documentation. All releases’ doc (from 0.9.0 onward) is now available at alkaline-ml.com/pmdarima/<version>
Fixes bug in ADFTest where OLS was computed with method="pinv" rather than method="qr". This fix means better parity with R’s results. See #71 for more context.
CHTest now solves linear regression with normalize=True. This solves #74
Python 3.7 is now supported(!!)

v1.0.0

Wheels are no longer built for Python versions < 3.5. You may still be able to build from source, but support for Python 2.x will diminish in future versions.
Migrates namespace from ‘pyramid-arima’ to ‘pmdarima’. This is due to the fact that a growing web-framework (also named Pyramid) is causing namespace collisions when both packages are installed on a machine. See Issue #34 for more detail.
Removes redundant Travis tests
Automates documentation build on Circle CI
Moves lots of the build/test functionality into the Makefile for ease.
Warns of impending environment variable name changes. The following will be completely switched over in version 1.2.0:
- 'PYRAMID_MPL_DEBUG' will become 'PMDARIMA_MPL_DEBUG'
- 'PYRAMID_MPL_BACKEND' will become 'PMDARIMA_MPL_BACKEND'
- 'PYRAMID_ARIMA_CACHE_WARN_SIZE' will become 'PMDARIMA_CACHE_WARN_SIZE'
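A rename like this is often handled by reading the new name first and falling back to the old one with a deprecation warning. The helper below is a hypothetical sketch of that pattern, not the package's own code; the environ argument exists only so the example can run against a plain dict.

```python
import os
import warnings


def read_renamed_env(new_name, old_name, default=None, environ=None):
    """Prefer ``new_name``; fall back to ``old_name`` with a DeprecationWarning."""
    environ = os.environ if environ is None else environ
    if new_name in environ:
        return environ[new_name]
    if old_name in environ:
        warnings.warn(
            f"{old_name} is deprecated; use {new_name} instead.",
            DeprecationWarning,
        )
        return environ[old_name]
    return default


fake_env = {"PYRAMID_MPL_BACKEND": "Agg"}  # only the old name is set
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    backend = read_renamed_env(
        "PMDARIMA_MPL_BACKEND", "PYRAMID_MPL_BACKEND", environ=fake_env
    )

print(backend, len(caught))
```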

v0.9.0

Explicitly catches case in auto_arima where a value of m that is too large may over-estimate D, causing the time series to be differenced down to an empty array. This is now handled by raising a separate error for this case that better explains what happened.
Re-pickling an ARIMA will no longer remove the location on disk of the cached statsmodels ARIMA models. Older versions encountered an issue where an older version of the model would be reinstated and immediately fail due to an OSError since the cached state no longer existed. This means that a user must be very intentional about clearing out the pyramid cache over time.
Adds pyramid cache check on initial import to warn user if the cache size has grown too large.
If d or D are explicitly defined for auto_arima (rather than None), do not raise an error if they exceed max_d or max_D, respectively.
Adds Circle CI for validating PyPy builds (rather than CPython)
Deploys python wheel for version 3.6 on Linux and Windows
Includes warning for upcoming package name change (pmdarima).

v0.8.1

New ARIMA instance attributes
- The pkg_version_ attribute (assigned on model fit) is new as of version 0.8.0. On unpickling, if the current Pyramid version does not match the version under which it was serialized, a UserWarning will be raised.
Addition of the _config.py file at the top-level of the package
- Specifies the location of the ARIMA result pickles (see Serializing your ARIMA models)
- Specifies the ARIMA result pickle name pattern
Fixes bug (Issue #30) in ARIMA where using CV with differencing and no seasonality caused a dim mismatch in the model’s exog array and its endog array
New dataset: Woolyrnq (from R’s forecast package).
Visualization utilities available at the top level of the package:
- plot_acf
- plot_pacf
- autocorr_plot
Updates documentation with significantly more examples and API references.
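The version check described for pkg_version_ can be sketched with Python's __setstate__ hook, which pickle calls when restoring an instance. Everything below is a stand-in for illustration: VersionedModel, INSTALLED_VERSION, and the version argument to fit are hypothetical, and the real class may implement the check differently.

```python
import pickle
import warnings

INSTALLED_VERSION = "0.8.1"  # stand-in for the installed package version


class VersionedModel:
    """Hypothetical estimator that remembers the version it was fit under."""

    def fit(self, version=INSTALLED_VERSION):
        # The version is recorded at fit time (argument exists only for the demo).
        self.pkg_version_ = version
        return self

    def __setstate__(self, state):
        # pickle calls this hook when restoring the instance.
        self.__dict__.update(state)
        saved = state.get("pkg_version_")
        if saved is not None and saved != INSTALLED_VERSION:
            warnings.warn(
                f"Model was serialized under version {saved}, "
                f"but version {INSTALLED_VERSION} is installed.",
                UserWarning,
            )


def roundtrip(model):
    """Serialize and restore a model, triggering the version check."""
    return pickle.loads(pickle.dumps(model))
```

Calling roundtrip on a model fit with a different version string would emit the warning, while a model fit under the installed version restores silently.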

v0.7.0

out_of_sample_size behavior in pmdarima.arima.ARIMA
- In prior versions, the out_of_sample_size (OOSS) parameter misbehaved: the model was fit on the entire sample and then scored on the specified number of samples. This behavior changed in v0.7.0. Going forward, when OOSS is not None, ARIMA models are fit on \(n - OOSS\) samples and scored on the last OOSS samples, after which the held-out samples are added to the model.
Adds add_new_samples method to pmdarima.arima.ARIMA
- This method adds new samples to the model, effectively refreshing the point from which it creates new forecasts without impacting the model parameters.
Adds confidence intervals on predict in pmdarima.arima.ARIMA
- When return_conf_int is true, the confidence intervals will now be returned with the forecasts.
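The corrected out_of_sample_size behavior amounts to a simple hold-out: fit on the first n - OOSS points and score on the last OOSS points. The sketch below shows that split with a training-mean forecast standing in for a fitted model; the helper names and the choice of mean squared error are assumptions made for this example.

```python
def split_for_oob_scoring(y, out_of_sample_size):
    """Return (train, holdout), where holdout is the last OOSS observations."""
    if not 0 < out_of_sample_size < len(y):
        raise ValueError("out_of_sample_size must be between 1 and len(y) - 1")
    cut = len(y) - out_of_sample_size
    return y[:cut], y[cut:]


def mean_squared_error(actual, predicted):
    """Average squared difference between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


y = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0, 15.0, 13.0]
train, holdout = split_for_oob_scoring(y, out_of_sample_size=2)

# Stand-in for a fitted model: forecast the training mean for every step.
forecast = [sum(train) / len(train)] * len(holdout)
oob_score = mean_squared_error(holdout, forecast)
print(train, holdout, oob_score)
```

The key point is that the score is computed on data the fit never saw, which is what the earlier behavior failed to guarantee.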

v0.6.5

pmdarima.arima.CHTest of seasonality
- No longer computes the \(U\) or \(V\) matrix in the SVD computation in the Canova-Hansen test. This makes the test much faster.