Practical Time Series Forecasting – To Difference or Not to Difference

By KDD | January 22, 2017

“It is sometimes very difficult to decide whether trend is best modeled as deterministic or stochastic, and the decision is an important part of the science – and art – of building forecasting models.”
― Diebold, Elements of Forecasting, 1998

A time series can have a very strong trend.

Often we can see it visually: gross domestic product (GDP) per person, for example, increases year after year.

When a “shock” occurs to the process generating GDP, due to a recession for example, GDP gets knocked off its long-run growth path.

But can we expect GDP to bounce back and return to its original long-run growth path? Or will it start growing again but along a different path?

If the former, then the trend in GDP is said to be “deterministic.” And adding TIME to a time series forecasting model is one way to capture this trend.

On the other hand, if GDP starts a new trend after a recession, its trend is said to be “stochastic,” driven by random shocks. The standard approach to time series forecast modeling in this case is to “difference” the data before modeling.
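To make the distinction concrete, here is a minimal sketch in Python (standard library only). It is not taken from the article's analysis: the function name simulate_paths and all parameter values are hypothetical, and the deterministic case assumes a simple mean-reverting AR(1) error around a fixed trend line. A single negative shock hits both series at the same date.

```python
def simulate_paths(n=40, slope=1.0, shock_time=10, shock=-5.0, phi=0.5):
    """Return (deterministic, stochastic) paths hit by one shock at shock_time.

    All names and values here are illustrative assumptions.
    """
    deterministic, stochastic = [], []
    u = 0.0      # deviation from the fixed trend line (AR(1) error term)
    level = 0.0  # current level of the stochastic-trend series
    for t in range(n):
        e = shock if t == shock_time else 0.0
        # Deterministic trend: the shock enters a mean-reverting error term,
        # so its effect decays and the series returns to the old trend line.
        u = phi * u + e
        deterministic.append(slope * t + u)
        # Stochastic trend: the shock is accumulated into the level itself,
        # so the series resumes growing from a permanently lower base.
        if t > 0:
            level += slope
        level += e
        stochastic.append(level)
    return deterministic, stochastic


det, sto = simulate_paths()
original_trend_end = 1.0 * 39
print("deterministic gap at end:", det[-1] - original_trend_end)
print("stochastic gap at end:   ", sto[-1] - original_trend_end)
```

In this sketch the deterministic series ends essentially back on its original trend line, while the stochastic series stays below it by the full size of the shock.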

The challenge as a forecaster is that it is not always easy to tell whether the trend in a time series is deterministic or stochastic.

And your answer and the subsequent modeling choice will have important implications for the resulting forecast.

Deterministic vs. stochastic trends

Consider the time series shown below.

Suppose you were tasked with generating a 2-year forecast starting December 2003 (at the end of the shown time series history).

Is there a deterministic trend in this series? That is, do you suspect that the series will bounce back to the trend exhibited before January 2001?

Or has there been a fundamental change to the process generating this series and a new trend will start (i.e. the trend is stochastic)?

Deterministic trend

If you opt for a deterministic trend, then your forecasting model will be in “levels.” That is, you model the value of the series itself: if we are talking about SALES, then it is the value of SALES at each point in time. So, when we have a deterministic trend, we can model SALES as:

SALES_t = b₀ + b₁*TIME + u_t

Of course, we could also account for seasonality by adding seasonal dummy variables, and for any hidden dynamics (cycles) by modeling the error term u_t as an ARMA process. But the key characteristic is the inclusion of a TIME variable (May 1993 = 1, June 1993 = 2, etc.) and possibly TIME² and/or TIME³, depending on the series.
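As a rough illustration of the levels model, the sketch below fits SALES_t = b₀ + b₁*TIME by ordinary least squares and extrapolates the fitted line. It uses only the Python standard library, leaves out seasonality and ARMA errors, and the sales figures and function names (fit_linear_trend, forecast_linear_trend) are made up for the example.

```python
def fit_linear_trend(y):
    """OLS fit of y_t = b0 + b1*TIME, with TIME = 1, 2, ..., len(y)."""
    n = len(y)
    time = range(1, n + 1)
    t_mean = sum(time) / n
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in zip(time, y))
    sxx = sum((t - t_mean) ** 2 for t in time)
    b1 = sxy / sxx              # slope: average change per period
    b0 = y_mean - b1 * t_mean   # intercept
    return b0, b1


def forecast_linear_trend(b0, b1, n_obs, horizon):
    """Extend the fitted line for `horizon` periods past the last observation."""
    return [b0 + b1 * t for t in range(n_obs + 1, n_obs + horizon + 1)]


# Hypothetical monthly SALES values, for illustration only
sales = [102.0, 104.5, 105.9, 108.2, 110.1, 111.8, 114.3, 116.0]
b0, b1 = fit_linear_trend(sales)
print("b0 =", b0, "b1 =", b1)
print("3-step forecast:", forecast_linear_trend(b0, b1, len(sales), 3))
```

Note that the forecast always lies on the fitted line, no matter where the most recent observation sits relative to it. That is the practical meaning of assuming the series reverts to its trend.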

An ARMA process models SALES as depending on past SALES as well as on unobservable shocks. Such models can include two types of components: an autoregressive (AR) component captures the effect of past SALES on current SALES, while a moving average (MA) component captures the lingering effect of past random shocks on current SALES.

Stochastic trend

If you opt for a stochastic trend, then the standard methodology is to difference your data (to remove the trend) and model the differences. This is known as ARIMA modeling. An ARIMA process is like an ARMA process except that it models the dynamics of the differenced series rather than the levels.
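The simplest version of this idea can be sketched without any ARMA terms: difference the series, forecast the differences by their average, and add the forecast differences back onto the last observed level. This is a random walk with drift, used here only as an assumed stand-in for a full ARIMA model; the function names and data are hypothetical.

```python
def difference(y):
    """First differences: y_t - y_(t-1)."""
    return [y[i] - y[i - 1] for i in range(1, len(y))]


def forecast_random_walk_with_drift(y, horizon):
    """Forecast levels by projecting the average difference forward
    from the last observed value."""
    d = difference(y)
    drift = sum(d) / len(d)   # average period-to-period change
    forecasts = []
    level = y[-1]             # forecasts start from the latest level
    for _ in range(horizon):
        level += drift
        forecasts.append(level)
    return forecasts


# Hypothetical series, for illustration only
series = [10.0, 12.0, 11.0, 15.0]
print("differences:", difference(series))
print("2-step forecast:", forecast_random_walk_with_drift(series, 2))
```

Unlike a fitted trend line, this forecast starts from the most recent observation, so any past shock to the level is carried forward permanently.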

Forecast differences

The forecast implications of this choice are shown in the following chart. We estimated a deterministic and a stochastic model and generated a forecast from each starting in December 2003. Specifically,

Deterministic Trend Model: Y_t = b₀ + b₁*TIME + b₂*AR(1) + b₃*AR(2) + b₄*MA(3) + u_t

Stochastic Trend Model: Y_t – Y_{t-1} = b₀ + b₁*AR(1) + b₂*AR(3) + u_t

The forecast based on the deterministic model is shown by the orange line, while the one based on the stochastic model is shown by the gray line. Also shown is what actually happened to the time series.

Hindsight is 20/20. In this case, the stochastic model would have been the better choice.

It does appear that some fundamental change occurred in the time series generation process. That is, the time series did not revert to its pre-2001 historical trend (at least during the forecast horizon).

The stochastic model yields a lower forecast error, measured by mean absolute percentage error (MAPE = 2.0%), than the deterministic model (MAPE = 5.6%) over the forecast horizon.
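For reference, MAPE averages the absolute forecast errors expressed as a percentage of the actual values. A minimal sketch follows; the function name and the numbers are illustrative, not the article's data, and the formula assumes no actual value is zero.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes `actual` contains no zeros and both lists have equal length.
    """
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical actuals and forecasts, for illustration only
actual = [100.0, 200.0]
forecast = [110.0, 180.0]
print("MAPE (%):", mape(actual, forecast))
```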

But at the time we had to make the forecast, all we had available were data through December 2003.

So, how do we pick between a deterministic and a stochastic forecasting model?

Holdout sample

From a practical perspective, unless we have very strong evidence of a stochastic process, the best course of action is to use a holdout sample.

Yes, there are techniques for testing whether a time series is “stationary” (loosely, has no stochastic trend) when it is not visually obvious.

But pragmatically, we are concerned about short-run forecast accuracy. And one way to compare competing models is by their performance in a holdout sample.

As we discussed in an earlier article, hold out a period of time equal to your forecast horizon from the data used to estimate a model. In this case, 2 years (January 2001 – December 2003).

Then build your models on data prior to January 2001 and compare the models’ forecast performance over the holdout sample.
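The holdout comparison can be sketched as follows, again using only the Python standard library. The two candidate models are deliberately stripped down (a linear trend fitted by least squares, and a random walk with drift) rather than the ARMA-based models estimated in this article, and the series, function names, and break point are all made up for illustration.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actuals must be nonzero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def fit_linear_trend(y):
    """OLS fit of y_t = b0 + b1*TIME, with TIME = 1, 2, ..., len(y)."""
    n = len(y)
    time = range(1, n + 1)
    t_mean, y_mean = sum(time) / n, sum(y) / n
    b1 = (sum((t - t_mean) * (v - y_mean) for t, v in zip(time, y))
          / sum((t - t_mean) ** 2 for t in time))
    return y_mean - b1 * t_mean, b1


def compare_on_holdout(y, horizon):
    """Fit both models on all but the last `horizon` points, then score
    their forecasts against the held-out points."""
    train, holdout = y[:-horizon], y[-horizon:]
    n = len(train)

    # Deterministic trend: extrapolate the fitted line.
    b0, b1 = fit_linear_trend(train)
    det = [b0 + b1 * t for t in range(n + 1, n + horizon + 1)]

    # Stochastic trend: last training value plus the average difference.
    diffs = [train[i] - train[i - 1] for i in range(1, n)]
    drift = sum(diffs) / len(diffs)
    sto = [train[-1] + drift * k for k in range(1, horizon + 1)]

    return {"deterministic": mape(holdout, det),
            "stochastic": mape(holdout, sto)}


# Hypothetical series: 30 periods of fast growth, then slower growth
series = [100.0 + 2.0 * t for t in range(30)]
series += [series[-1] + 0.5 * k for k in range(1, 19)]
print(compare_on_holdout(series, horizon=12))
```

The model with the lower holdout MAPE is the candidate to carry forward, with the caveat that one holdout window is a single comparison, not proof.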

In this case, such a holdout sample does not include any data from the strong trend period (pre-May 2001). So, likely a stochastic model would have performed better in the holdout sample as well.

But suppose we do this and have two (or more) models that perform equally well in the holdout sample?

We’ll cover this possibility in a subsequent article.

Deterministic/stochastic trend? holdout sample!

Part 1 – Practical Time Series Forecasting – Introduction

Part 2 – Practical Time Series Forecasting – Some Basics

Part 3 – Practical Time Series Forecasting – Potentially Useful Models

Part 4 – Practical Time Series Forecasting – Data Science Taxonomy

Part 5 – Practical Time Series Forecasting – Know When to Hold ’em

Part 6 – Practical Time Series Forecasting – What Makes a Model Useful?