Practical Time Series Forecasting – Some Basics

“The long run is a misleading guide to current affairs. In the long run we are all dead.”
John Maynard Keynes, A Tract on Monetary Reform

Forecasting the future is an exercise in uncertainty. And the further out one looks, the more uncertain the forecast becomes.

Most businesses are keenly focused on the next quarter, the next six months, the next year or, at most, the next few years. Hence, our focus in this series is on time series methods for “short-run” forecasting.

The nature of time series

We are all familiar with charts like this:

Low variation time series

showing a sequence of numbers ordered by time, across equally spaced periods of time. That is, a “time series” (e.g. closing stock price per day, sales per month, GDP per quarter, average global temperature per year).

Some time series exhibit little variability (up/down) from time period to time period (except for an overall trend) like the one above.

Others exhibit considerable variability across time with a much less apparent trend, like this:

High variation time series

A characteristic that often sets time series data apart from other data is that successive values are not independent of each other. Although it may not be apparent from looking at a chart, today’s value is usually related in some way to yesterday’s value, and possibly to the values from several days before that. This dependence makes estimating time series models more complicated than estimating models in other areas.
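One common way to measure this dependence is the sample autocorrelation: the correlation between a series and a lagged copy of itself. The sketch below is only an illustration on simulated data. The function name `autocorrelation`, the seed and the 0.9 carry-over factor are our own assumptions, not values from any real series.

```python
import random
from statistics import fmean


def autocorrelation(series, lag=1):
    """Sample autocorrelation of `series` at the given lag."""
    mean = fmean(series)
    # Total variation around the mean (the lag-0 term).
    denom = sum((x - mean) ** 2 for x in series)
    # Co-movement between each value and the value `lag` periods earlier.
    num = sum(
        (series[t] - mean) * (series[t - lag] - mean)
        for t in range(lag, len(series))
    )
    return num / denom


rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Independent draws: today's value says nothing about tomorrow's.
independent = [rng.gauss(0, 1) for _ in range(500)]

# Dependent series (assumed example): each value carries over 90% of the last.
dependent = [0.0]
for _ in range(499):
    dependent.append(0.9 * dependent[-1] + rng.gauss(0, 1))

print("lag-1 autocorrelation, independent:", round(autocorrelation(independent), 2))
print("lag-1 autocorrelation, dependent:  ", round(autocorrelation(dependent), 2))
```

A value near zero suggests little link between neighboring observations, while a value near one suggests that each observation leans heavily on the one before it.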

A time series chart holds a unique fascination for us. Because we are constantly aware of the progression of time, our natural reaction when we see such charts is, “I wonder what’s going to happen next?”

Components of a time series

A successful forecasting model will account for each of 3 components that may exist in a time series: trend, seasonality and cycles.

Trend

Trend, when present, is often (but not always) visually apparent. For example, US real GDP (below) has exhibited a persistent upward trend since the Great Depression.

Trend is a long-run phenomenon and reflects, in business, “slowly evolving preferences, technologies, institutions and demographics.” (Diebold, Elements of Forecasting)

US Real GDP

Trend comes in two flavors.

If GDP, for example, was knocked off its long-run growth path by a recession but returned to the same path afterwards, then trend is said to be “deterministic.” Adding a TIME dimension to a model can go a long way to capturing such “deterministic” trend.

On the other hand, if GDP started a new growth path after the recession, then trend is said to be “stochastic.”

This distinction (between deterministic and stochastic trend) has important modeling and forecasting consequences which we will address in a later article.
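To make the two flavors concrete, here is a small simulation sketch. It applies a single one-off shock to a series that otherwise grows along a straight line. The function `simulate_path` and all of its parameter values are hypothetical and chosen only for illustration. With `persistence` below 1 the shock fades and the series returns to its old path (deterministic trend); with `persistence` equal to 1 the shock never fades and the series continues along a new, lower path (stochastic trend).

```python
def simulate_path(n, drift, shock_at, shock, persistence):
    """Level = trend line + deviation, where the deviation carries over
    a `persistence` share of its previous value each period."""
    levels = []
    deviation = 0.0
    for t in range(n):
        deviation = persistence * deviation
        if t == shock_at:
            deviation += shock  # a one-off "recession" hit
        levels.append(drift * t + deviation)
    return levels


# Same shock, two different kinds of trend (all numbers are made up).
deterministic = simulate_path(n=60, drift=0.5, shock_at=20, shock=-10.0, persistence=0.5)
stochastic = simulate_path(n=60, drift=0.5, shock_at=20, shock=-10.0, persistence=1.0)

original_path_end = 0.5 * 59  # where the series would be with no shock at all
print("no-shock path ends at:      ", original_path_end)
print("deterministic trend ends at:", round(deterministic[-1], 3))
print("stochastic trend ends at:   ", round(stochastic[-1], 3))
```

The only difference between the two runs is whether the effect of the shock dies out, which is exactly the distinction described above.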

Seasonality

A seasonal pattern repeats with calendar regularity.

The annual uptick in sales that occurs during the November and December holiday season is one example. Higher airline passenger counts during the summer months are another (see below). Adding seasonal indicators (“dummy variables”) to a model can capture such seasonality.

US Enplanements
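As a rough sketch of what seasonal indicators look like, the snippet below builds one 0/1 column per calendar month. It relies on a standard result: when a regression contains a full set of seasonal dummies and no intercept, the least squares coefficient on each dummy is simply the average of the series in that season. The sales figures and function names are invented for illustration.

```python
def seasonal_dummies(month):
    """Indicator row for a calendar month (1-12): 1 in that month's slot, 0 elsewhere."""
    return [1 if month == m else 0 for m in range(1, 13)]


def seasonal_means(months, values):
    """Average value per calendar month. With a full set of monthly dummies
    and no intercept, these averages are the least squares coefficients."""
    totals = [0.0] * 12
    counts = [0] * 12
    for month, value in zip(months, values):
        totals[month - 1] += value
        counts[month - 1] += 1
    return [t / c if c else float("nan") for t, c in zip(totals, counts)]


# Two years of hypothetical monthly sales with a holiday-season uptick.
months = list(range(1, 13)) * 2
sales = [100.0] * 10 + [150.0, 200.0] + [110.0] * 10 + [170.0, 220.0]

print("dummy row for March:", seasonal_dummies(3))
means = seasonal_means(months, sales)
print("January effect: ", means[0])
print("December effect:", means[11])
```

In a fuller model the dummies would sit alongside trend and other terms, and one season is usually dropped as a baseline when an intercept is included.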

Cycles

A cyclic component can also be present. Cycles are much less rigid than seasonal patterns. One example is the business cycle, from a recession low to an expansion high.

A time series can contain one cycle (e.g. the daily cycle of body temperature) or multiple cycles (e.g. bicycle traffic patterns can exhibit daily, weekly and annual cycles). More broadly, a cyclic component is any dynamic not accounted for by trend or seasonality.

Modeling cycles takes us into the world of ARMA and ARIMA models which we’ll cover later.

Methods for forecasting

There are numerous methods for forecasting a time series, ranging from simple to complex.

Simple

The simplest is some type of smoothing routine, like moving averages or exponential smoothing. Moving averages, especially a 200-day moving average, are commonly used in technical analysis of stock price movements.
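Both smoothers are simple enough to write in a few lines. The following is a minimal sketch, not a production routine: the function names, the short window and the smoothing weight `alpha` are our own choices, and the prices are made up.

```python
def moving_average(series, window):
    """Trailing moving average: the mean of the latest `window` values.
    The first result corresponds to the first point with a full window."""
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]


def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: each smoothed value blends the new
    observation (weight alpha) with the previous smoothed value."""
    smoothed = [series[0]]  # start from the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


prices = [10, 12, 11, 15, 14, 18, 17]  # hypothetical daily closing prices
print("3-day moving average: ", moving_average(prices, 3))
print("exponentially smoothed:", exponential_smoothing(prices, 0.5))
```

A 200-day moving average works the same way with `window=200`. A larger window, or a smaller `alpha`, gives a smoother line that reacts more slowly to new data.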

Complex

More complex econometric methods seek to model the relationship between, say, sales over time, and several dimensions that could affect sales, such as advertising spending.

Econometric models can consist of multiple interrelated equations (one for sales, one for ad spending) which would be estimated jointly, typically using a multiple regression methodology. Such models are used to model the US economy and to generate long-run forecasts of macroeconomic variables such as GDP and employment.

Also on the sophisticated end of the spectrum are techniques like spectral analysis, deep learning and neural networks. These methods require an elevated level of expertise on the part of a data scientist to implement and fine-tune the models.

Middle of the road

In between the simpler and more complex forecasting methods is what we refer to as “time series methods.” These methods rely primarily (but not exclusively) on the series’ own historical behavior to inform the future. They are sometimes described as “univariate modeling.”

A distinguishing feature of time series methods is that they explicitly account for the key characteristics of a time series: trend, seasonality and cycles.

The workhorses of time series methods are single-equation least squares regression models and ARIMA models.

Least squares regression models can use a TIME trend, seasonal indicators and either lagged values of the series being modeled or an ARMA representation of the cyclic component to model a time series. They can also include other related lagged variables (e.g., advertising expenditures in a SALES forecasting model) but usually only if the lags are long.
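The sketch below shows one way such a regression could be set up: an intercept, a TIME trend, quarterly seasonal indicators and one lagged value of the series. It is fitted to simulated data using only the Python standard library. Everything here (the helper names, the quarterly period, the seed and the "true" coefficients used to generate the data) is an assumption made for illustration; in practice one would typically use a statistics package.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ols(rows, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def design_row(t, prev_value, period=4):
    """Regressors: intercept, TIME, seasonal dummies (season 0 is the
    baseline and gets no dummy), and the lagged value of the series."""
    dummies = [1.0 if t % period == s else 0.0 for s in range(1, period)]
    return [1.0, float(t)] + dummies + [prev_value]


# Simulated quarterly series (all parameter values are made up).
rng = random.Random(42)
season_effect = [0.0, 3.0, -2.0, 5.0]
y = [20.0]
for t in range(1, 400):
    y.append(5.0 + 0.5 * t + season_effect[t % 4] + 0.6 * y[-1] + rng.gauss(0, 1))

rows = [design_row(t, y[t - 1]) for t in range(1, len(y))]
coefs = ols(rows, y[1:])
lag_coef = coefs[-1]

# One-step-ahead forecast: plug next period's regressors into the fitted equation.
next_t = len(y)
forecast = sum(c * x for c, x in zip(coefs, design_row(next_t, y[-1])))
print("estimated lag coefficient:", round(lag_coef, 3))
print("one-step-ahead forecast:  ", round(forecast, 2))
```

Because the lagged value is itself part of the history of the series, forecasts more than one step ahead are built recursively, feeding each forecast back in as the next period's lag.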

If the trend of the series is “stochastic” (i.e. when the series is bumped off its trend path, it starts a new trend path), then ARIMA models may provide the best forecast.
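ARIMA models deal with stochastic trend by working with the period-to-period changes in the series (differencing) rather than its levels. The simplest member of the family is a random walk with drift, where the forecast starts from the last observed value and adds the average historical change for each step ahead. The sketch below illustrates only that special case, with hypothetical function names and made-up data; it is not a general ARIMA implementation.

```python
def difference(series):
    """First differences: the change from each period to the next."""
    return [b - a for a, b in zip(series, series[1:])]


def drift_forecast(series, horizon):
    """Random-walk-with-drift forecast: start at the last observed value
    and add the average historical change for each step ahead."""
    diffs = difference(series)
    drift = sum(diffs) / len(diffs)
    return [series[-1] + drift * h for h in range(1, horizon + 1)]


levels = [100.0, 103.0, 101.0, 106.0, 108.0]  # hypothetical observations
print("changes: ", difference(levels))
print("forecast:", drift_forecast(levels, 3))
```

Note that the forecast is anchored to the last observation rather than to a fixed trend line, so a shock that moves the series also moves every future forecast.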

Back to the short-run

The time series methods we will cover in this series of articles use the estimated dynamics and trend of the series to forecast a future path over the “forecast horizon.”

But since the forecasts will most likely revert eventually to the underlying trend in the series, the best use of these time series methods is for “short-run” forecasts.
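A small numerical sketch shows why. Suppose, purely as an assumption for illustration, that the series follows a straight trend line and that each period's deviation from the line carries over a share `phi` of the previous deviation. The forecast is then the trend line plus a shrinking fraction of today's deviation. The function name and all numbers below are hypothetical.

```python
def forecast_path(last_t, last_value, intercept, slope, phi, horizon):
    """Forecast = trend line + a decaying share of the latest deviation
    from trend (the deviation shrinks by a factor `phi` each step)."""
    deviation = last_value - (intercept + slope * last_t)
    return [
        intercept + slope * (last_t + h) + (phi ** h) * deviation
        for h in range(1, horizon + 1)
    ]


# The series currently sits 10 units above its trend line (made-up values).
path = forecast_path(last_t=10, last_value=130.0, intercept=100.0,
                     slope=2.0, phi=0.5, horizon=12)

for h, value in enumerate(path, start=1):
    trend = 100.0 + 2.0 * (10 + h)
    print(f"h={h:2d}  forecast={value:8.3f}  gap to trend={value - trend:7.3f}")
```

After a handful of periods the gap to trend is negligible, so the model adds little beyond the trend itself at long horizons. The information in the recent dynamics is most valuable in the first few steps.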

Although there is a more “technical” definition based on the type of model used, we generally define the “short run” as the period of time that matches most businesses’ forecasting needs. So, we are talking about anywhere from the next day to the next few years.
