7. Toy time-series datasets

The datasets submodule provides an interface for loading several built-in toy time-series datasets. Some are commonly used for benchmarking time-series models, and some ship with R.

All datasets share a common interface:

load_<some_dataset>(as_series=False)

Passing as_series=True returns a pandas Series with the appropriate index.
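
As a rough illustration of this calling convention, the sketch below defines a hypothetical stand-in loader, load_toy_example, which is not part of the datasets submodule. It holds the first five Lynx values shown later in this section. It also assumes that the default as_series=False returns a NumPy array, which the text above does not state explicitly.

```python
import numpy as np
import pandas as pd


def load_toy_example(as_series=False):
    """Hypothetical stand-in that mimics the load_<some_dataset> signature."""
    # First five Lynx observations and their year labels, copied from this page
    values = np.array([269, 321, 585, 871, 1475])
    years = [1821, 1822, 1823, 1824, 1825]
    if as_series:
        # Labelled form: a pandas Series carrying the index
        return pd.Series(values, index=years)
    # Assumed default form: a bare NumPy array with no index
    return values


arr = load_toy_example()
ser = load_toy_example(as_series=True)
print(type(arr).__name__)  # ndarray
print(ser.loc[1823])       # 585
```

The Series form is convenient when observations need to be looked up by label (here, by year), while the array form suits code that only needs the raw values.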

7.1. Air Passengers

The classic Box & Jenkins airline data. Monthly totals of international airline passengers, 1949 to 1960.

>>> load_airpassengers(True).head()
0    112.0
1    118.0
2    132.0
3    129.0
4    121.0
dtype: float64
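
The output above carries a plain integer index. If calendar labels are wanted, one option is to attach a monthly index by hand. The sketch below uses only the five values printed above and assumes the series starts in January 1949 (the description gives only the years). The name monthly is illustrative.

```python
import pandas as pd

# First five values from the output above, with the default integer index
y = pd.Series([112.0, 118.0, 132.0, 129.0, 121.0])

# Assumption: the first observation is January 1949; "MS" means month start
monthly = y.copy()
monthly.index = pd.date_range(start="1949-01-01", periods=len(y), freq="MS")

print(monthly.index[-1].strftime("%Y-%m"))  # 1949-05
```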

7.2. Austres

Numbers (in thousands) of Australian residents, measured quarterly from March 1971 to March 1994. The sample consists of 89 records.

>>> load_austres(True).head()
0    13067.3
1    13130.5
2    13198.4
3    13254.2
4    13303.7
dtype: float64

7.3. Heartrate

A sample of heart rate data borrowed from an MIT database. The sample consists of 150 evenly spaced heart rate measurements, taken 0.5 seconds apart.

>>> load_heartrate(True).head()
0    84.2697
1    84.2697
2    84.0619
3    85.6542
4    87.2093
dtype: float64
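
Because the measurements are evenly spaced, a time axis can be reconstructed from the sample count and the spacing alone. This is a minimal sketch that assumes the first measurement is taken at t = 0; the variable names are illustrative.

```python
import numpy as np

n_samples = 150   # number of measurements, per the description above
spacing_s = 0.5   # seconds between consecutive measurements

# Assumption: the first measurement is taken at t = 0
t = np.arange(n_samples) * spacing_s

print(t[-1])  # 74.5
```

Under that assumption, the 150 samples span 74.5 seconds from the first measurement to the last.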

7.4. Lynx

The Lynx dataset records the number of skins of predators (lynx) collected by the Hudson’s Bay Company between 1821 and 1934. It is commonly used for time-series benchmarking (Brockwell and Davis, 1991) and is built into R. The dataset exhibits a clear 10-year cycle.

>>> load_lynx(True).head()
1821     269
1822     321
1823     585
1824     871
1825    1475
dtype: int64

7.5. Wineind

This time-series records total wine sales by Australian wine makers in bottles <= 1 litre from Jan 1980 to Aug 1994. This dataset is found in the R forecast package.

>>> load_wineind(True).head()
Jan 1980    15136
Feb 1980    16733
Mar 1980    20016
Apr 1980    17708
May 1980    18019
dtype: int64

7.6. Woolyrnq

A time-series that records the quarterly production (in tonnes) of woollen yarn in Australia between Mar 1965 and Sep 1994. This dataset is found in the R forecast package.

>>> load_woolyrnq(True).head()
Q1 1965    6172
Q2 1965    6709
Q3 1965    6633
Q4 1965    6660
Q1 1966    6786
dtype: int64