pmdarima.utils.acf

pmdarima.utils.acf(x, nlags=None, qstat=False, fft=None, alpha=None, missing='none', adjusted=False)

Calculate the autocorrelation function.

Parameters:

x : array_like

The time series data.

adjusted : bool, default False

If True, then denominators for autocovariance are n-k, otherwise n.
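
As a rough illustration of the two denominators, the sketch below computes a single sample autocovariance in plain Python. The helper name `autocovariance` is hypothetical and is not part of pmdarima; it only mirrors the description above.

```python
def autocovariance(x, k, adjusted=False):
    """Sample autocovariance at lag k for a plain list of floats."""
    n = len(x)
    mean = sum(x) / n
    # Sum of cross-products of deviations that are k steps apart.
    cross = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    # adjusted=True divides by the number of available pairs (n - k);
    # adjusted=False divides by the full sample size n.
    denom = (n - k) if adjusted else n
    return cross / denom


series = [1.0, 2.0, 3.0, 4.0, 5.0]
unadjusted = autocovariance(series, 1)               # denominator n = 5
adjusted = autocovariance(series, 1, adjusted=True)  # denominator n - k = 4
```

The adjusted estimate is larger in magnitude at every lag k > 0, because the same sum of cross-products is divided by a smaller number.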

nlags : int, optional

Number of lags to return autocorrelation for. If not provided, uses min(10 * np.log10(nobs), nobs - 1). The returned value includes lag 0 (i.e., 1), so the size of the acf vector is (nlags + 1,).
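
The default lag count can be reproduced by hand. The helper `default_nlags` below is hypothetical, and truncating the logarithm term to an integer is an assumption made here so that the result can serve as a lag count.

```python
import math


def default_nlags(nobs):
    """Default number of lags from the formula above (integer truncation assumed)."""
    return min(int(10 * math.log10(nobs)), nobs - 1)


n_lags = default_nlags(100)  # 10 * log10(100) = 20, which is below nobs - 1 = 99
acf_length = n_lags + 1      # lag 0 is included in the returned vector
```

For short series the `nobs - 1` bound takes over: with five observations the logarithm term gives 6, but only 4 lags are available.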

qstat : bool, default False

If True, returns the Ljung-Box Q-statistic for each autocorrelation coefficient. See q_stat for more information.
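
For orientation, the standard Ljung-Box statistic at lag h is Q_h = n (n + 2) * sum over k = 1..h of r_k^2 / (n - k). The sketch below applies that textbook formula in plain Python; the helper `ljung_box_q` is hypothetical and is not claimed to be the library's implementation. P-values are omitted because they require a chi-squared distribution function.

```python
def ljung_box_q(acf_values, nobs):
    """Cumulative Ljung-Box Q for lags 1..h, given r_1..r_h (lag 0 excluded)."""
    q_stats = []
    running = 0.0
    for k, r_k in enumerate(acf_values, start=1):
        # Each lag contributes r_k^2 weighted by 1 / (n - k).
        running += r_k ** 2 / (nobs - k)
        q_stats.append(nobs * (nobs + 2) * running)
    return q_stats


q = ljung_box_q([0.5, 0.25], nobs=10)
```

Each entry of `q` accumulates all lags up to and including its own, so the sequence never decreases.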

fft : bool, default True

If True, computes the ACF via FFT.

alpha : scalar, default None

If a number is given, the confidence intervals for the given level are returned. For instance, if alpha=.05, 95% confidence intervals are returned, where the standard deviation is computed according to Bartlett's formula.

bartlett_confint : bool, default True

Confidence intervals for ACF values are generally placed at 2 standard errors around r_k. The formula used for the standard error depends on the situation. If the autocorrelations are being used to test for randomness of residuals as part of the ARIMA routine, the standard errors are determined assuming the residuals are white noise. The approximate formula for any lag is that the standard error of each r_k is 1/sqrt(N). See section 9.4 of [R95] for more details on the 1/sqrt(N) result. For a more elementary discussion, see section 5.3.2 in [R96]. For the ACF of raw data, the standard error at lag k is found as if the right model were an MA(k-1). This allows the possible interpretation that if all autocorrelations past a certain lag are within the limits, the model might be an MA of the order defined by the last significant autocorrelation. In this case, a moving average model is assumed for the data, and the standard errors for the confidence intervals should be generated using Bartlett's formula. For more details on Bartlett's formula, see section 7.2 in [R95].
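
The two standard-error choices can be compared with a short sketch. It assumes the usual statement of Bartlett's formula under an MA(k-1) model, var(r_k) = (1 + 2 * sum over j < k of r_j^2) / N, and uses an exact normal quantile in place of the rule-of-thumb factor 2. The helpers `acf_standard_errors` and `confidence_intervals` are hypothetical names, not library functions.

```python
import math
from statistics import NormalDist


def acf_standard_errors(acf_values, nobs, bartlett=True):
    """Standard error of r_k for k = 1..h, given r_1..r_h (lag 0 excluded)."""
    errors = []
    cumulative = 0.0  # running sum of r_j^2 for j < k
    for r_k in acf_values:
        if bartlett:
            # MA(k-1) assumption: var(r_k) = (1 + 2 * sum_{j<k} r_j^2) / N
            variance = (1.0 + 2.0 * cumulative) / nobs
        else:
            # White-noise assumption: var(r_k) = 1 / N
            variance = 1.0 / nobs
        errors.append(math.sqrt(variance))
        cumulative += r_k ** 2
    return errors


def confidence_intervals(acf_values, nobs, alpha=0.05, bartlett=True):
    """Two-sided (1 - alpha) intervals centered on each r_k."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    errors = acf_standard_errors(acf_values, nobs, bartlett)
    return [(r - z * se, r + z * se) for r, se in zip(acf_values, errors)]


r = [0.5, 0.3]
white = acf_standard_errors(r, nobs=100, bartlett=False)
bart = acf_standard_errors(r, nobs=100, bartlett=True)
intervals = confidence_intervals(r, nobs=100)
```

Under the white-noise assumption every lag gets the same error, 1/sqrt(N). With Bartlett's formula the error grows with the lag whenever earlier autocorrelations are nonzero, so the bands widen.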

missing : str, default “none”

A string in [“none”, “raise”, “conservative”, “drop”] specifying how the NaNs are to be treated. “none” performs no checks. “raise” raises an exception if NaN values are found. “drop” removes the missing observations and then estimates the autocovariances treating the non-missing as contiguous. “conservative” computes the autocovariance using nan-ops so that nans are removed when computing the mean and cross-products that are used to estimate the autocovariance. When using “conservative”, n is set to the number of non-missing observations.
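
The sketch below is one plain-Python reading of these options. Both helpers are hypothetical. The "conservative" branch reflects the description above for the unadjusted case (mean and cross-products over non-missing values, n equal to the number of non-missing observations) and should not be taken as the library's exact code.

```python
import math


def prepare_series(x, missing="none"):
    """Illustrative NaN handling for the 'none', 'raise' and 'drop' modes."""
    has_nan = any(math.isnan(v) for v in x)
    if missing == "raise" and has_nan:
        raise ValueError("NaN values found in x")
    if missing == "drop":
        # Remaining observations are treated as if they were contiguous.
        return [v for v in x if not math.isnan(v)]
    return list(x)  # "none": no checks are performed


def conservative_autocovariance(x, k):
    """Lag-k autocovariance that skips NaNs in the mean and cross-products."""
    valid = [v for v in x if not math.isnan(v)]
    n = len(valid)  # n is the number of non-missing observations
    mean = sum(valid) / n
    # Deviations at missing positions are set to zero, so any
    # cross-product involving a NaN contributes nothing to the sum.
    dev = [0.0 if math.isnan(v) else v - mean for v in x]
    return sum(dev[t] * dev[t + k] for t in range(len(x) - k)) / n


data = [1.0, float("nan"), 3.0, 5.0]
dropped = prepare_series(data, missing="drop")
gamma_1 = conservative_autocovariance(data, 1)
```

The two strategies differ in which pairs count as neighbors. After "drop", the values 1.0 and 3.0 become adjacent. In the conservative sketch they remain two steps apart, and the lag-1 pairs that touch the NaN are skipped.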

Returns:

acf : ndarray

The autocorrelation function for lags 0, 1, …, nlags. Shape (nlags+1,).

confint : ndarray, optional

Confidence intervals for the ACF at lags 0, 1, …, nlags. Shape (nlags + 1, 2). Returned if alpha is not None.

qstat : ndarray, optional

The Ljung-Box Q-statistic for lags 1, 2, …, nlags (excludes lag zero). Returned if qstat is True.

pvalues : ndarray, optional

The p-values associated with the Q-statistics for lags 1, 2, …, nlags (excludes lag zero). Returned if qstat is True.

Notes

The acf at lag 0 (i.e., 1) is returned.

For very long time series, FFT convolution (fft=True) is recommended. When fft is False, a simple, direct estimator of the autocovariances is used that computes only the first nlags + 1 values. This can be much faster when the time series is long and only a small number of autocovariances are needed.

If adjusted is True, the denominator for the autocovariance is adjusted for the loss of data.

References

[R94] Parzen, E., 1963. On spectral analysis with missing observations and amplitude modulation. Sankhya: The Indian Journal of Statistics, Series A, pp. 383-392.
[R95] Brockwell and Davis, 1987. Time Series Theory and Methods.
[R96] Brockwell and Davis, 2010. Introduction to Time Series and Forecasting, 2nd edition.