time series Archives - KDD Analytics

Practical Time Series Forecasting – To Difference or Not to Difference

KDD — Mon, 22 Jan 2018 01:22:47 +0000

“It is sometimes very difficult to decide whether trend is best modeled as deterministic or stochastic, and the decision is an important part of the science – and art – of building forecasting models.”
― Diebold, Elements of Forecasting, 1998

A time series can have a very strong trend.

Visually, we can often see it: gross domestic product (GDP) per person, for example, increases year after year.

When a “shock” occurs to the process generating GDP, due to a recession for example, GDP gets knocked off its long-run growth path.

But can we expect GDP to bounce back and return to its original long-run growth path? Or will it start growing again but along a different path?

If the former, then the trend in GDP is said to be “deterministic.” And adding TIME to a time series forecasting model is one way to capture this trend.

On the other hand, if GDP starts a new trend after a recession, its trend is said to be “stochastic,” driven by random shocks. The standard approach to time series forecast modeling in this case is to “difference” the data before modeling.

The challenge as a forecaster is that it is not always easy to tell if the trend in a time series is deterministic or stochastic.

And your answer and the subsequent modeling choice will have important implications for the resulting forecast.

Deterministic vs. stochastic trends

Consider the time series shown below.

Suppose you were tasked with generating a 2-year forecast starting December 2003 (at the end of the shown time series history).

Is there a deterministic trend in this series? That is, do you suspect that the series will bounce back to the trend exhibited before January 2001?

Or has there been a fundamental change to the process generating this series and a new trend will start (i.e. the trend is stochastic)?

Deterministic trend

If you opt for a deterministic trend, then your forecasting model will be in “levels.” If we are talking about SALES, then it is the value of SALES at any given point in time. So, when we have a deterministic trend, we can model SALES as:

SALES_t = b₀ + b₁*TIME + ε_t

Of course, we could also account for seasonality by adding seasonal dummy variables, and for any hidden dynamics (cycles) by modeling the error term ε_t as an ARMA process. But the key characteristic is the inclusion of a TIME variable (May 1993 = 1, June 1993 = 2, etc.) and possibly TIME² and/or TIME³, depending on the series.
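As a minimal sketch of the deterministic trend model, the snippet below fits SALES_t = b₀ + b₁*TIME by least squares and extends the fitted line forward. The function names and the SALES figures are made up for illustration, and seasonality and ARMA errors are deliberately left out.

```python
def fit_linear_trend(values):
    """Least squares fit of values[t-1] = b0 + b1 * t for t = 1..n."""
    n = len(values)
    time = list(range(1, n + 1))
    mean_t = sum(time) / n
    mean_y = sum(values) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(time, values))
    sxx = sum((t - mean_t) ** 2 for t in time)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_t
    return b0, b1


def forecast_trend(b0, b1, n_history, horizon):
    """Extend the fitted line for `horizon` periods past the history."""
    return [b0 + b1 * t for t in range(n_history + 1, n_history + horizon + 1)]


# Hypothetical monthly SALES: exactly 100 + 5 * TIME, so the fit is easy to check.
sales = [100 + 5 * t for t in range(1, 25)]
b0, b1 = fit_linear_trend(sales)
trend_forecast = forecast_trend(b0, b1, len(sales), 3)
```

Because the forecast is just the fitted line continued, a deterministic trend model always pulls the forecast back toward the historical growth path.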

An ARMA process models SALES as being based on past SALES as well as on unobservable shocks. Such models can include two types of components: An autoregressive (AR) component captures the effect of past SALES on current SALES while a moving average (MA) component captures random shocks to the SALES series.

Stochastic trend

If you opt for a stochastic trend, then the standard methodology is to difference your data (to remove the trend) and model the differences. This is known as ARIMA modeling. An ARIMA process is like an ARMA process except that it is the dynamics of the differenced series that are modeled.
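To illustrate differencing, here is a small sketch of the simplest possible case: the differences are modeled as a constant (the average change) with no AR or MA terms, which amounts to a random walk with drift. The function names and the data are hypothetical.

```python
def difference(values):
    """First difference: d[t] = values[t] - values[t-1]."""
    return [b - a for a, b in zip(values, values[1:])]


def forecast_random_walk_with_drift(values, horizon):
    """Forecast levels by adding the mean historical change to the last value."""
    diffs = difference(values)
    drift = sum(diffs) / len(diffs)
    last = values[-1]
    return [last + drift * h for h in range(1, horizon + 1)]


# Hypothetical series that drops to a lower path after period 6.
series = [10, 12, 14, 16, 18, 20, 15, 16, 17, 18]
changes = difference(series)
rw_forecast = forecast_random_walk_with_drift(series, 2)
```

Note that the forecast starts from the last observed value rather than from the old trend line. That is the practical meaning of a stochastic trend: the shock is treated as permanent.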

Forecast differences

The forecast implications of this choice are shown in the following chart. We estimated a deterministic and a stochastic model and generated a forecast from each starting in December 2003. Specifically,

Deterministic Trend Model: Y_t = b₀ + b₁*TIME + b₂*AR(1) + b₃*AR(2) + b₄*MA(3) + ε_t

Stochastic Trend Model: Y_t – Y_t-1 = b₀ + b₁*AR(1) + b₂*AR(3) + ε_t

The forecast based on a deterministic model is shown by the orange line while the one based on the stochastic model is shown by the gray line. Also shown is what actually happened to the time series.

Hindsight is 20/20. In this case, the stochastic model would have been the better choice.

It does appear that some fundamental change occurred in the time series generation process. That is, the time series did not revert to its pre-2001 historical trend (at least during the forecast horizon).

The stochastic model yields a lower forecast error (MAPE = 2.0%) than the deterministic model (MAPE = 5.6%) over the forecast horizon.
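For readers unfamiliar with MAPE (mean absolute percentage error), the following sketch shows how it is computed. The numbers are illustrative only and are not the data behind the figures quoted above.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100 * sum(errors) / len(errors)


# Hypothetical actuals and forecasts: errors of 10%, 5% and 0%.
actual = [100, 200, 400]
forecast = [110, 190, 400]
example_mape = mape(actual, forecast)
```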

But at the time we had to make the forecast, all we had available were data through December 2003.

So, how do we pick between a deterministic and a stochastic forecasting model?

Holdout sample

From a practical perspective, unless we have very strong evidence of a stochastic process, the best course of action is to use a holdout sample.

Yes, there are techniques for testing whether a time series is “stationary” (i.e. has no trend) when visually it is not obvious.

But pragmatically, we are concerned about short-run forecast accuracy. And one way to compare competing models is by their performance in a holdout sample.

As we discussed in an earlier article, hold out a period of time at least equal to your forecast horizon from the data used to estimate a model. In this case, 2 years (January 2001 – December 2003).

Then build your models on data prior to January 2001 and compare the models’ forecast performance over the holdout sample.

In this case, such a holdout sample does not include any data from the strong trend period (pre-May 2001). So, likely a stochastic model would have performed better in the holdout sample as well.
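A holdout comparison can be sketched in a few lines. The example below assumes two deliberately simple candidates (a straight-line trend for the deterministic case, a random walk with drift for the stochastic case) and a made-up series; in practice the candidates would be fuller models like those described above.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def trend_forecast(history, horizon):
    """Deterministic candidate: least squares line in TIME, extended forward."""
    n = len(history)
    times = range(1, n + 1)
    mean_t = (n + 1) / 2
    mean_y = sum(history) / n
    b1 = (sum((t - mean_t) * (y - mean_y) for t, y in zip(times, history))
          / sum((t - mean_t) ** 2 for t in times))
    b0 = mean_y - b1 * mean_t
    return [b0 + b1 * (n + h) for h in range(1, horizon + 1)]


def drift_forecast(history, horizon):
    """Stochastic candidate: last value plus the mean historical change."""
    drift = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + drift * h for h in range(1, horizon + 1)]


def compare_on_holdout(series, holdout_size, models):
    """Fit each model on the modeling sample and score it on the holdout."""
    modeling, holdout = series[:-holdout_size], series[-holdout_size:]
    return {name: mape(holdout, model(modeling, holdout_size))
            for name, model in models.items()}


# Hypothetical data: steady growth, a one-time drop, then growth from the new level.
series = [100 + 2 * t for t in range(20)] + [110 + 2 * t for t in range(10)]
scores = compare_on_holdout(
    series, 6, {"deterministic": trend_forecast, "stochastic": drift_forecast}
)
best = min(scores, key=scores.get)
```

With this particular series the stochastic candidate scores better on the holdout, because the drop in level is never reversed. A series that did revert to its old path would favor the deterministic candidate instead.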

But suppose we do this and have two (or more) models that perform equally well in the holdout sample?

We’ll cover this possibility in a subsequent article.


Part 1 – Practical Time Series Forecasting – Introduction

Part 2 – Practical Time Series Forecasting – Some Basics

Part 3 – Practical Time Series Forecasting – Potentially Useful Models

Part 4 – Practical Time Series Forecasting – Data Science Taxonomy

Part 5 – Practical Time Series Forecasting – Know When to Hold ’em

Part 6 – Practical Time Series Forecasting – What Makes a Model Useful?

The post Practical Time Series Forecasting – To Difference or Not to Difference appeared first on KDD Analytics.

Practical Time Series Forecasting – What Makes a Model Useful?

KDD — Mon, 15 Jan 2018 07:56:19 +0000

“In God we trust. All others must bring data.”
― W. Edwards Deming, statistician

So, you have estimated a bunch of forecasting models and realize (kudos to you!) that they are “all wrong” (à la George Box).

But your forecasting deadline is looming, and you need to find some useful models on which to base a forecast.

How do you decide which models make it to the next round?

Model building process

First, let’s review the forecast model build process:

Step 1: Determine the business need;

Step 2: Collect and examine your data; clean and adjust (e.g. frequency change) as necessary;

Step 3: Determine your forecast horizon (i.e. align with the business need);

Step 4: Determine and set aside your holdout sample;

Step 5: Estimate models using the non-holdout portion of your time series (i.e. the “modeling sample”);

Step 6: Gauge each model’s performance in the holdout sample;

Step 7: Recalibrate each model using the full historical sample;

Step 8: Make your forecast for the forecast horizon.

At the end of this process, you should have a few models that “pass muster,” that are potentially useful models.

But how do you whittle down all the models you tried to this select few?

Guidelines for selecting useful models

Here are some guidelines we follow:

Statistically Significant Parameters – Although one can argue that it is the prediction that matters, we still like to see model coefficients that are statistically significant with signs that can be explained. You may be asked to defend your model.

White Noise Residuals – When you estimate your model using the modeling sample, the residuals (difference between the actual and predicted values in the modeling sample) should have no apparent pattern to them. That is, there is no additional variation in the time series that can be explained by your model. What is left over is random or “white” noise.

Strong Holdout Sample Performance – Your model should produce low forecast error and exhibit low systematic bias in the holdout sample.

Robustness – When you recalibrate your model using the entire historical sample (modeling + holdout sample), your model should retain its statistical properties. That is, parameters are still significant with plausible signs and the residuals are still white noise.

Parsimony – If two models are equal in all performance respects except one is more complex than the other, we generally opt for the simpler model. Experience suggests that simpler models perform better when forecasting over the forecast horizon. And they are easier to interpret and explain to business decision makers.

Forecast Plausibility – The forecast produced by your model over the forecast horizon should be consistent with the available knowledge concerning the relevant business environment. In other words, the forecast needs to make sense. It is possible, following the steps above, to arrive at a high performing model which produces a counter intuitive forecast (e.g. declining SALES when the trend in SALES has been nothing but up).
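Two of these guidelines lend themselves to simple numerical checks. The sketch below assumes the Ljung-Box Q statistic as the white noise check and the mean error as the bias measure; these are common choices, not necessarily the ones we use on every project, and the data are made up.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    num = sum((values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic; large values suggest the residuals are not white noise."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


def mean_error(actual, forecast):
    """Average of (actual - forecast); far from zero indicates systematic bias."""
    return sum(a - f for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical residuals with an obvious alternating pattern (not white noise).
residuals = [1, -1, 1, -1, 1, -1, 1, -1]
r1 = autocorrelation(residuals, 1)
q1 = ljung_box_q(residuals, 1)

# Hypothetical holdout forecasts that are consistently one unit too low.
bias = mean_error([10, 12, 14], [9, 11, 13])
```

Q is compared against a chi-square critical value (about 3.84 at the 5% level with one degree of freedom). The degrees of freedom are normally reduced by the number of estimated ARMA parameters, which this sketch ignores.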

At the end of this model building and testing process, you may have more than 1 model that can be used to generate your forecast. In a later article we will address what you can do in this situation.

The art of forecasting

Our experience is consistent with the opinion of others that there is still quite a bit of “art” to time series forecasting. Especially if you want it to meet a specific business need. Automated forecast routines exist. But we recommend that the process be closely supervised by a human to ensure a reasonable forecast.


Practical Time Series Forecasting – Data Science Taxonomy

KDD — Tue, 02 Jan 2018 12:26:19 +0000

“Big data is not about the data.*”
― Gary King, Harvard University

(*It’s about the analytics.)

Machine Learning. Deep Learning. Data Science. Artificial Intelligence. Big Data.

Not a day goes by without one or all of these buzzwords streaming past in our business news feeds.

Data analytics has become mainstream. And you better jump on board or risk being left at the station!

Just within the last year or so, searches on these topics have taken off. In fact, according to Google, search interest in one of these topics, machine learning, eclipsed that of big data in early 2017.

So, how do time series methods for forecasting fit into the taxonomy that currently defines the data science field?

Data science taxonomy

Key data science terms that are related to time series methods for forecasting are data mining, predictive analytics, machine learning (supervised and unsupervised), regression, structured and unstructured data.

These are not necessarily mutually exclusive. At the risk of incurring the wrath of the data science gods, here is our simplification:

Structured vs. unstructured data

Structured data are organized into “rows and columns” (spreadsheet); unstructured data are not (text in a book).

Time series methods use structured data.

Data mining

Data mining seeks to find patterns in data, whether structured or unstructured.

Time series methods seek to find patterns that repeat over time.

Predictive analytics

Predictive analytics seeks to find a relationship between a variable of interest (e.g. customer churn) and multiple dimensions (e.g. age, length of contract, zip code). These dimensions can be used to predict the likelihood of a customer churning (in our example).

Typically, predictive analytics is not based on time series data but “cross-sectional” data like a customer set. Additionally, time series methods use only a very limited set of dimensions, the primary one being past behavior of the variable being forecasted (e.g. sales).

Time series methods typically use the past behavior of the variable being forecasted as the primary dimension.

Machine learning

Machine learning means that a computer is using a program (algorithm) to “connect the dots” in the data. If you run a regression model in Excel you are engaging in machine learning.

However, supervised machine learning does not mean you are keeping watch over Excel as it does its stuff!

This is NOT what “supervised” machine learning means!

Supervised machine learning means that the computer is seeking to find a relationship between a single variable (e.g. churn) and many dimensional variables (e.g. age, length of contract, zip code).

Unsupervised machine learning means that the computer is seeking to find a relationship between many dimensions (e.g. age, length of contract, zip code) so that customers can, for example, be clustered into a small number of groups or tribes with similar characteristics.

Time series methods are a type of supervised machine learning since they attempt to find a relationship between present and past behavior.
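To make the supervised framing concrete, the sketch below turns a series into input/target pairs in which the inputs are past values and the target is the current value. The helper name and the data are hypothetical.

```python
def make_lagged_pairs(series, n_lags):
    """Return (X, y) where each row of X holds the n_lags values preceding y."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])
        y.append(series[t])
    return X, y


# Hypothetical SALES history; each target is paired with its two previous values.
sales = [10, 12, 13, 15, 18]
X, y = make_lagged_pairs(sales, 2)
```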

Regression

Regression is one way a machine finds relationships between a single variable and a few (or many) dimensional variables or past values of the variable itself. There are several flavors of regression.

Time series models typically use least squares regression or maximum likelihood.

Bottom line

So, when you use time series methods for forecasting you are probably mining structured data using supervised, regression- or maximum likelihood-based, machine learning.


Practical Time Series Forecasting – Potentially Useful Models

KDD — Mon, 18 Dec 2017 08:00:05 +0000

“All models are wrong, but some are useful.”
― attributed to statistician George Box

This quote pretty well sums up time series forecasting models.

Any given model is unlikely to be spot on. And some can be wildly off.

But through a careful methodical process, we can whittle the pool of candidate models down to a set of useful models, if not a single preferred model.

When all is said and done, though, our guiding principle when building forecasting models is…how well the model predicts!

In practice, what this means for the types of models we consider is that we don’t rule anything out.

Yes, we have specific things we look for in an acceptable model (which we will cover later). But we don’t rule out a simple TIME trend model simply because it is too “simple.”

Our focus is on finding a forecasting model that can yield defensible short-run forecasts in a cost-effective manner.

Potentially useful models

So what kind of models do we typically examine?

As discussed in a previous article, a time series such as monthly sales (SALES) can have 3 components: trend, seasonal and cyclical. So, the type of model we consider depends on the extent to which 1, 2 or all 3 of these dynamics are present.

There are 3 classes of models that we typically consider. We will use a bit of math here to describe these models…think back to the formula of a line you learned in algebra: Y = a + bX.

Regression models

First are least squares regression models. Using SALES as our example, we could have a TIME trend model with, say, quarterly seasonality if we were examining SALES by quarter:

SALES_t = b₀ + b₁*TIME + b₂*Q1 + b₃*Q2 + b₄*Q3 + ε_t

Or a lagged least squares model with quarterly seasonality:

SALES_t = b₀ + b₁*SALES_t-1 + b₂*SALES_t-2 + b₃*Q1 + b₄*Q2 + b₅*Q3 + ε_t

In these model formulae, b₀ is the “intercept.” b₁, b₂,…etc. indicate the incremental effect (i.e. slope) on sales of a change in the value of a “right hand side” variable. ε_t is “residual” SALES, what is left “unexplained” by the model. And t is the time period, whether it is months, quarters, years, etc.
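As a sketch of how the TIME trend model with quarterly seasonality can be estimated, the code below builds the regressors and solves the least squares normal equations directly. It assumes Q4 is the omitted base quarter and that the first period falls in Q1; the coefficients and data are invented so the result can be checked. In practice a statistics package would do this step.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(X, y):
    """Least squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def design_row(t):
    """Regressors for period t (1-based): intercept, TIME, and Q1-Q3 dummies."""
    quarter = (t - 1) % 4 + 1
    return [1.0, float(t)] + [1.0 if quarter == q else 0.0 for q in (1, 2, 3)]


# Hypothetical quarterly SALES built from known coefficients, without noise.
true_b = [50.0, 2.0, -5.0, 3.0, 8.0]
X = [design_row(t) for t in range(1, 17)]
sales = [sum(b * x for b, x in zip(true_b, row)) for row in X]
b_hat = ols(X, sales)
```

Only three dummies are included for four quarters: with an intercept in the model, a fourth dummy would be perfectly collinear with the other regressors.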

ARMA models

The second class of models are ARMA models.

An ARMA process models SALES as being based on past SALES as well as on unobservable shocks to SALES over time. Such models can include two types of components:

An autoregressive (AR) component captures the effect of past SALES on current SALES while a moving average (MA) component captures random shocks to the SALES series. These are typically estimated using a maximum likelihood technique.

We could have a model that is a pure ARMA model, for example:

SALES_t = b₀ + b₁*AR(1) + b₂*AR(2) + b₃*MA(1) + ε_t

Or a mixed regression-ARMA model, sometimes called “regression with ARMA errors,” like this:

SALES_t = b₀ + b₁*TIME + b₂*Q1 + b₃*Q2 + b₄*Q3 + b₅*AR(1) + b₆*MA(1) + ε_t
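To make the AR and MA components concrete, here is a sketch of how an ARMA(1,1) model produces forecasts once its coefficients are known. It assumes the textbook form SALES_t = c + φ*SALES_t-1 + θ*ε_t-1 + ε_t, which may be parameterized differently in a given software package. Estimation by maximum likelihood is not shown, and the coefficients and data are invented.

```python
def arma11_residuals(series, const, phi, theta):
    """Recover shocks e[t] = y[t] - const - phi*y[t-1] - theta*e[t-1], with e[0] = 0."""
    shocks = [0.0]
    for t in range(1, len(series)):
        predicted = const + phi * series[t - 1] + theta * shocks[t - 1]
        shocks.append(series[t] - predicted)
    return shocks


def arma11_forecast(series, const, phi, theta, horizon):
    """Multi-step forecast; future shocks are replaced by their expected value, zero."""
    last_value = series[-1]
    last_shock = arma11_residuals(series, const, phi, theta)[-1]
    forecasts = []
    for _ in range(horizon):
        next_value = const + phi * last_value + theta * last_shock
        forecasts.append(next_value)
        last_value, last_shock = next_value, 0.0
    return forecasts


# Hypothetical coefficients and data, chosen only to keep the arithmetic simple.
sales = [20.0, 22.0, 21.0]
shocks = arma11_residuals(sales, const=10.0, phi=0.5, theta=0.5)
forecast = arma11_forecast(sales, const=10.0, phi=0.5, theta=0.5, horizon=3)
```

The MA term only influences the first forecast step, since later shocks are unknown and set to zero. After that the AR term carries the forecast toward the long-run mean c / (1 - φ), which is 20 in this example.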

ARIMA models

A third class of models is related to the ARMA models above: ARIMA. According to standard Box-Jenkins methodology, if you know the underlying trend in SALES is “stochastic” (i.e. random), remove it by differencing SALES. Then model the differenced series as an ARMA process. For example:

SALES_t – SALES_t-1 = b₀ + b₁*AR(1) + b₂*MA(1) + b₃*MA(2) + ε_t

However, “it is sometimes very difficult to decide whether trend is best modeled as deterministic or stochastic, and the decision is an important part of the science – and art – of building forecasting models.” (Diebold, Elements of Forecasting, 1998)

We will revisit this issue in a later article.

Other considerations

In addition to these 3 general classes of models we typically also try these variations:

ARCH/GARCH models.

These models address heteroscedasticity in the residuals (ε_t). ARCH/GARCH models are used in the financial arena to help model return and risk where market volatility can fluctuate in a predictable manner.

Inclusion of additional “right hand side variables.”

In the case of least squares and mixed regression-ARMA models, if the data are available, we often consider whether additional variables will improve predictive accuracy. In the case of SALES, for example, we could consider adding lagged values of advertising spending (AD SPEND). But if we are tasked with forecasting out 6 months, for example, then we cannot use lags of AD SPEND (in this example) shorter than 6 months. Otherwise we would also have to forecast AD SPEND.

Transformations.

For example, using the natural log of SALES can help model non-linear trends and/or dampen variation in SALES over time which may help to improve predictive accuracy.
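The log transformation can be sketched as follows: a series growing at a constant percentage rate is curved in levels but a straight line in logs, so a linear TIME trend fits it exactly. The function name and the 5% growth figure are assumptions for illustration.

```python
import math


def fit_log_linear_trend(values):
    """Fit ln(value) = b0 + b1 * t by least squares for t = 1..n."""
    n = len(values)
    logs = [math.log(v) for v in values]
    mean_t = (n + 1) / 2
    mean_y = sum(logs) / n
    b1 = (sum((t - mean_t) * (y - mean_y) for t, y in enumerate(logs, start=1))
          / sum((t - mean_t) ** 2 for t in range(1, n + 1)))
    return mean_y - b1 * mean_t, b1


# Hypothetical SALES growing 5% per period.
sales = [100 * 1.05 ** t for t in range(1, 13)]
b0, b1 = fit_log_linear_trend(sales)
growth_rate = math.exp(b1) - 1          # implied growth per period
next_sales = math.exp(b0 + b1 * 13)     # forecast for period 13, back in levels
```

With noisy data, simply exponentiating the log forecast tends to understate the average level of SALES, so a correction is sometimes applied when converting back to levels.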

Bottom line

There are many “specifications,” many potentially useful models that we estimate.

But not all end up in a final “pool” of candidates for the forecasting model. Each estimated model must pass certain tests to stay in the candidate pool.

In a later article we will cover the tests we use to help whittle down the pool of candidates to a set of truly useful models.


Practical Time Series Forecasting – Some Basics

KDD — Mon, 11 Dec 2017 02:50:12 +0000

“The long run is a misleading guide to current affairs. In the long run we are all dead.”
― John Maynard Keynes, A Tract on Monetary Reform

Forecasting the future is an exercise in uncertainty. And the further out one looks, the more uncertain the forecast becomes.

Most businesses are keenly focused on the next quarter, 6 months, year or at most next few years. Hence, our focus in this series is on time series methods for “short-run” forecasting.

The nature of time series

We are all familiar with charts that show a sequence of numbers ordered by time, across equally spaced periods of time: that is, a “time series” (e.g. closing stock price per day, sales per month, GDP per quarter, average global temperature per year).

Some time series exhibit little variability (up/down) from time period to time period (except for an overall trend) like the one above.

Others exhibit considerable variability across time with a much less apparent trend, like this:

A distinguishing characteristic of time series data, relative to non-time series data, is that successive values are usually not independent of each other. Although it may not be apparent from looking at a chart, today’s value is usually related in some way to yesterday’s value, and possibly to the values of several days before. This makes time series model estimation more complicated than in other areas.
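One way to see this dependence in numbers is the lag-1 autocorrelation, the correlation between each value and the one before it. The sketch below uses a made-up series; a value well above zero means neighboring observations tend to move together.

```python
def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    num = sum((values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n))
    return num / denom


# Hypothetical daily readings that rise and fall gradually.
daily = [10, 11, 12, 13, 12, 11, 10, 11, 12, 13, 12, 11]
r1 = autocorrelation(daily, 1)
```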

A time series chart holds a unique fascination for us. Because we are constantly aware of the progression of time, our natural reaction when we see such charts is, “I wonder what’s going to happen next?”

Components of a time series

A successful forecasting model will account for each of 3 components that may exist in a time series: trend, seasonality and cycles.

Trend

Trend, when present, can be (but is not always) visually apparent. For example, US real GDP (below) exhibits a persistent upward trend since the Great Depression.

Trend is a long-run phenomenon and reflects, in business, “slowly evolving preferences, technologies, institutions and demographics.” (Diebold, Elements of Forecasting)

Trend comes in two flavors.

If GDP, for example, was knocked off its long-run growth path by a recession but returned to the same path afterwards, then trend is said to be “deterministic.” Adding a TIME dimension to a model can go a long way to capturing such “deterministic” trend.

On the other hand, if GDP started a new growth path after the recession, then trend is said to be “stochastic.”

This distinction (between deterministic and stochastic trend) has important modeling and forecasting consequences which we will address in a later article.

Seasonality

A seasonal pattern repeats with calendar regularity.

The annual uptick in sales that occurs during the November and December holiday season is an example. Higher airline passenger counts during the summer months are another example (see below). Adding seasonal indicators (“dummy variables”) to a model can capture such seasonality.

Cycles

A cyclic component can also be present. Cycles are much less rigid than seasonal patterns. One example is the business cycle, from a recession low to an expansion high.

A time series can contain one cycle (e.g. the daily cycle of body temperature) or multiple cycles (e.g. bicycle traffic patterns can exhibit daily, weekly and annual cycles). More broadly, a cyclic component is any dynamic not accounted for by trend or seasonality.

Modeling cycles takes us into the world of ARMA and ARIMA models which we’ll cover later.

Methods for forecasting

There are numerous methods for forecasting a time series, ranging from simple to complex.

Simple

The simplest is some type of smoothing routine, like moving averages or exponential smoothing. Moving averages, especially a 200-day moving average, are commonly used in technical analysis of stock price movements.
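Both smoothing routines are short enough to sketch here. The example assumes a trailing moving average and simple exponential smoothing initialized at the first observation; the prices and the smoothing weight are invented.

```python
def moving_average(values, window):
    """Trailing moving average; the first window-1 periods have no value."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]


def exponential_smoothing(values, alpha):
    """Simple exponential smoothing: s[t] = alpha*y[t] + (1-alpha)*s[t-1], s[0] = y[0]."""
    smoothed = [values[0]]
    for y in values[1:]:
        smoothed.append(alpha * y + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical closing prices.
prices = [10, 12, 14, 16, 18]
ma3 = moving_average(prices, 3)
ses = exponential_smoothing(prices, 0.5)
```

Both smoothed series lag behind the rising prices, which is why these routines are better at summarizing recent history than at anticipating a trend.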

Complex

More complex econometric methods seek to model the relationship between, say, sales over time, and several dimensions that could affect sales, such as advertising spending.

Econometric models can consist of multiple interrelated equations (one for sales, one for ad spending) which would be estimated jointly, typically using a multiple regression methodology. Such models are used to model the US economy and to generate long-run forecasts of macroeconomic variables such as GDP and employment.

Also on the sophisticated end of the spectrum are techniques like spectral analysis, deep learning and neural networks. These methods require an elevated level of expertise on the part of a data scientist to implement and fine tune the models.

Middle of the road

In between the simpler and more complex forecasting methods is what we refer to as “time series methods.” These methods rely primarily (but not exclusively) on the series’ historical behavior to inform the future. “Univariate modeling” is sometimes used to describe these methods.

A distinguishing feature of time series methods is that they explicitly account for the key characteristics of a time series: trend, seasonality and cycles.

The workhorses of time series methods are single equation, least squares regression and ARIMA models.

Least squares regression models can use a TIME trend, seasonal indicators and either lagged values of the series being modeled or an ARMA representation of the cyclic component to model a time series. They can also include other related lagged variables (e.g., advertising expenditures in a SALES forecasting model) but usually only if the lags are long.

If the trend of the series is “stochastic” (i.e. when the series is bumped off its trend path, it starts a new trend path), then ARIMA models may provide the best forecast.

Back to the short-run

The time series methods we will cover in this series of articles use the estimated dynamics and trend of the series to forecast a future path over the “forecast horizon.”

But since the forecasts will most likely ultimately revert to the underlying trend in the series, the best use of these time series methods is for “short-run” forecasts.

Although there is a more “technical” definition based on the type of model used, we generally define the “short run” as the period of time that matches most businesses’ forecasting needs. So, we are talking about anywhere from the next day to the next few years.


Practical Time Series Forecasting – Introduction

KDD — Mon, 04 Dec 2017 18:21:01 +0000

“The only thing I cannot predict is the future.”
― Amit Trivedi, Riding The Roller Coaster: Lessons from financial market cycles we repeatedly forget

It goes without saying that every business is keenly interested in knowing what the future will bring.

Will sales grow next year? By how much? Will suppliers increase their prices? How fast will a new IoT product be adopted? How much warehouse capacity is needed for the next holiday period? Will some international event plunge the global economy into a recession?

Predicting the future is an exercise in probability rather than certainty. Businesses engage in various levels of sophistication in trying to bound the likelihood of future states to support their business plans.

Some have teams of economists and data scientists tasked with building complex forecasting models.

Many businesses, however, likely rely on less sophisticated means centered on spreadsheet models, trends and moving averages (or even educated guesses).

Time series methodology is a moderately sophisticated yet cost effective way to generate forecasts. It is a statistical approach which bases forecasts on the past behavior of the data series in question (e.g. monthly sales).

And it accounts for other characteristics of a time series which can yield a more accurate forecast than, say, a simple straight-line trend model.

More time, more data

We have all heard the forecasts about data growth, the proverbial “hockey stick.”

By one account, human and machine-generated data is growing at 10x the rate of traditional business data. And machine-generated data is growing at 50x the rate.

A good portion of this machine-generated data has a time dimension. Internet of Things (IoT) devices are proliferating, each of which has a potential to collect data over time.

A washing machine can monitor, collect and post performance data to the cloud. Using these data to forecast product failure can lead to a pro-active maintenance visit by your friendly but lonely Maytag repairman.

Similarly, a household’s electricity usage can be monitored, modeled and forecasted leading to cost-savings suggestions by an energy service provider under a time-of-day pricing scheme.

The electric utility company itself can use the data from all the IoT appliances in households to generate better residential load forecasts and help better manage the electricity grid.

Even for more traditional business sources like sales, inventories, deliveries, workforce utilization, IT usage and the like, advances in data collection, storage and proliferation are making time series data more readily accessible.

Thus, there will be an increased demand for product managers, economists, statisticians and data scientists to make use of these data and tell us what will happen next.

Time series methods

The premise of time series methods (and of most quantitatively-based forecasting methods) is that the future will be much like the past.

If sales have been growing at a consistently healthy rate with strong seasonal variation (e.g. holiday periods) for the last year, then it is likely the next year will be similar, all else constant. If done correctly, the methodology can yield a defensible forecast of likely sales each month during the “forecast horizon.”

But, as with all forecasting methodologies, there are pitfalls of which one should be aware.

Practical time series methods

This is the first of a series of articles on practical time series methods for short-run business forecasting.

There are abundant, excellent resources covering the basics of business forecasting including time series methods, ranging from blog posts to online courses to open-source textbooks.

And time series methods are a mainstay of advanced courses in econometrics and business forecasting (resources we recommend are Elements of Forecasting by Diebold and Econometric Models and Economic Forecasts by Pindyck and Rubinfeld).

Rather than being a treatise on forecasting, this series of articles will present a practical methodology and some of the lessons we have learned performing time series forecasting for clients.

