“All models are wrong, but some are useful.”
― attributed to statistician George Box
This quote pretty well sums up time series forecasting models.
Any given model is unlikely to be spot on. And some can be wildly off.
But through a careful methodical process, we can whittle the pool of candidate models down to a set of useful models, if not a single preferred model.
When all is said and done, though, our guiding principle when building forecasting models is…how well the model predicts!
In practice, what this means for the types of models we consider is that we don’t rule anything out.
Yes, we have specific things we look for in an acceptable model (which we will cover later). But we don’t rule out a simple TIME trend model simply because it is too “simple.”
Our focus is on finding a forecasting model that can yield defensible short-run forecasts in a cost-effective manner.
Potentially useful models
So what kind of models do we typically examine?
As discussed in a previous article, a time series such as monthly sales (SALES) can have 3 components: trend, seasonal and cyclical. So, the type of model we consider depends on the extent to which 1, 2 or all 3 of these dynamics are present.
There are 3 classes of models that we typically consider. We will use a bit of math here to describe these models…think back to the formula of a line you learned in algebra: Y = a + bX.
First are least squares regression models. Using SALES as our example, we could have a TIME trend model with, say, quarterly seasonality if we were examining SALES by quarter:
SALES_t = b0 + b1*TIME + b2*Q1 + b3*Q2 + b4*Q3 + ε_t
Or a lagged least squares model with quarterly seasonality:
SALES_t = b0 + b1*SALES_{t-1} + b2*SALES_{t-2} + b3*Q1 + b4*Q2 + b5*Q3 + ε_t
In these model formulae, b0 is the “intercept.” b1, b2,…etc. indicate the incremental effect (i.e. slope) on sales of a change in the value of a “right hand side” variable. ε_t is “residual” SALES, what is left “unexplained” by the model. And t is the time period, whether it is months, quarters, years, etc.
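To make the TIME trend model with quarterly seasonality concrete, here is a minimal sketch in plain Python. The sales figures are made up, and the helper names (design_row, fit_ols, solve_linear_system) are ours for illustration only. With three quarterly dummies, the fourth quarter is the baseline that the intercept absorbs. In practice we would use a statistics package rather than hand-rolled linear algebra.

```python
# Sketch: fit SALES_t = b0 + b1*TIME + b2*Q1 + b3*Q2 + b4*Q3 by least squares.
# All data and helper names here are hypothetical.

def solve_linear_system(a, y):
    """Solve a @ x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    m = [row[:] + [y[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def design_row(t):
    """Regressors for period t (t = 1, 2, ...): intercept, TIME, Q1, Q2, Q3."""
    quarter = (t - 1) % 4 + 1
    return [1.0, float(t)] + [1.0 if quarter == q else 0.0 for q in (1, 2, 3)]

def fit_ols(rows, y):
    """Least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical quarterly SALES for three years (12 quarters).
sales = [112.0, 98.0, 105.0, 130.0, 120.0, 107.0,
         113.0, 139.0, 129.0, 114.0, 122.0, 147.0]
rows = [design_row(t) for t in range(1, len(sales) + 1)]
coefs = fit_ols(rows, sales)  # [b0, b1, b2, b3, b4]

# Forecast the next quarter (t = 13, which is a first quarter).
forecast = sum(b * x for b, x in zip(coefs, design_row(13)))
print([round(b, 2) for b in coefs], round(forecast, 2))
```

The lagged model works the same way: replace TIME in design_row with the two previous SALES values and drop the first two periods, which have no complete lags.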
The second class consists of ARMA models.
An ARMA process models SALES as being based on past SALES as well as on unobservable shocks to SALES over time. Such models can include two types of components:
An autoregressive (AR) component captures the effect of past SALES on current SALES, while a moving average (MA) component captures the effect of past random shocks on current SALES. These models are typically estimated using a maximum likelihood technique.
We could have a model that is a pure ARMA model, for example:
SALES_t = b0 + b1*AR(1) + b2*AR(2) + b3*MA(1) + ε_t
Or a mixed regression-ARMA model, sometimes called “regression with ARMA errors,” like this:
SALES_t = b0 + b1*TIME + b2*Q1 + b3*Q2 + b4*Q3 + b5*AR(1) + b6*MA(1) + ε_t
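To see how the AR and MA pieces of the pure ARMA example combine, here is a small sketch that simulates such a series and produces a one-step-ahead forecast. It assumes one common reading of the notation: AR(1) and AR(2) stand for SALES lagged one and two periods, and MA(1) stands for last period's shock. The coefficient values and function names are hypothetical, and this is not an estimation routine; in practice the shocks are not observed and are replaced by estimated residuals.

```python
import random

def simulate_arma21(n, b0, b1, b2, b3, sigma=1.0, seed=0):
    """Simulate y_t = b0 + b1*y_{t-1} + b2*y_{t-2} + b3*e_{t-1} + e_t."""
    rng = random.Random(seed)
    mean = b0 / (1.0 - b1 - b2)  # long-run mean, valid for a stationary process
    y = [mean, mean]             # start the series at its long-run mean
    e = [0.0, 0.0]
    for _ in range(n):
        shock = rng.gauss(0.0, sigma)
        y.append(b0 + b1 * y[-1] + b2 * y[-2] + b3 * e[-1] + shock)
        e.append(shock)
    return y[2:], e[2:]

def one_step_forecast(y, e, b0, b1, b2, b3):
    """Next-period forecast: the unknown future shock is set to its mean, zero."""
    return b0 + b1 * y[-1] + b2 * y[-2] + b3 * e[-1]

# Hypothetical coefficients chosen so the process is stationary.
params = dict(b0=10.0, b1=0.5, b2=0.2, b3=0.4)
sales, shocks = simulate_arma21(48, sigma=2.0, **params)
print(round(one_step_forecast(sales, shocks, **params), 2))
```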
A third class of models is related to the ARMA models above: ARIMA. According to standard Box-Jenkins methodology, if you know the underlying trend in SALES is “stochastic” (i.e. random), remove it by differencing SALES. Then model the differenced series as an ARMA process. For example:
SALES_t – SALES_{t-1} = b0 + b1*AR(1) + b2*MA(1) + b3*MA(2) + ε_t
However, “it is sometimes very difficult to decide whether trend is best modeled as deterministic or stochastic, and the decision is an important part of the science – and art – of building forecasting models.” (Diebold, Elements of Forecasting, 1998)
We will revisit this issue in a later article.
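The differencing step in the ARIMA example can be sketched in a few lines. The sales figures and the predicted changes below are made up; the point is only that the model is fit to period-over-period changes, and forecasts of those changes are then added back to the last observed level to get forecasts of SALES itself.

```python
def difference(series):
    """First difference: SALES_t - SALES_{t-1} for each consecutive pair."""
    return [curr - prev for prev, curr in zip(series, series[1:])]

def undifference(last_level, predicted_changes):
    """Turn forecasts of changes back into forecasts of levels."""
    levels = []
    level = last_level
    for change in predicted_changes:
        level += change
        levels.append(level)
    return levels

sales = [100.0, 104.0, 103.0, 109.0, 115.0]  # hypothetical SALES levels
changes = difference(sales)
print(changes)  # [4.0, -1.0, 6.0, 6.0]

# Pretend an ARMA model fit to `changes` predicts these next two changes.
predicted_changes = [3.0, 2.5]
print(undifference(sales[-1], predicted_changes))  # [118.0, 120.5]
```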
In addition to these 3 general classes of models we typically also try these variations:
- ARCH/GARCH models.
These models address heteroscedasticity in the residuals (ε_t). ARCH/GARCH models are used in the financial arena to help model return and risk where market volatility can fluctuate in a predictable manner.
- Inclusion of additional “right hand side variables.”
In the case of least squares and mixed regression-ARMA models, if the data are available, we often consider whether additional variables will improve predictive accuracy. In the case of SALES, for example, we could consider adding lagged values of advertising spending (AD SPEND). But if we are tasked with forecasting out 6 months, then we cannot use lags of AD SPEND shorter than 6 months, because the forecast for month 6 would depend on AD SPEND values not yet observed. To use shorter lags, we would also have to forecast AD SPEND.
- Transformations of SALES.
Using the natural log of SALES, for example, can help model non-linear trends and/or dampen variation in SALES over time, which may help to improve predictive accuracy.
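As a rough illustration of the natural log idea, the sketch below fits a straight-line TIME trend to log(SALES) and converts the forecast back to the original units. The data are hypothetical (a series growing exactly 5% per period) and fit_line is our own helper name. On the log scale the slope is approximately the per-period growth rate, which is how a linear model can capture a non-linear (exponential) trend.

```python
import math

def fit_line(x, y):
    """Simple least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical SALES growing 5% per period: curved in levels, linear in logs.
time = list(range(12))
sales = [100.0 * 1.05 ** t for t in time]

intercept, slope = fit_line(time, [math.log(s) for s in sales])

# exp() undoes the log; any bias adjustment for the back-transform
# is ignored in this sketch.
forecast = math.exp(intercept + slope * 12)
print(round(slope, 4), round(forecast, 2))
```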
We estimate many “specifications,” that is, many potentially useful models.
But not all end up in a final “pool” of candidates for the forecasting model. Each estimated model must pass certain tests to stay in the candidate pool.
In a later article we will cover the tests we use to help whittle down the pool of candidates to a set of truly useful models.