# Practical Time Series Forecasting – Data Science Taxonomy

“Big data is not about the data.”

― **Gary King, Harvard University**

**Machine Learning**. **Deep Learning**. **Data Science**. **Artificial Intelligence**. **Big Data**.

Not a day goes by without one or all of these buzzwords streaming past in our business news feeds.

**Data analytics has become mainstream**. And you had better jump on board or risk being left at the station!

Just within the last year or so, **searches** for these topics have taken off. In fact, according to Google, in early 2017 search interest in one of these topics, **machine learning, eclipsed that of big data**.

So, how do **time series methods for forecasting** fit into the taxonomy that currently defines the data science field?

### Data science taxonomy

Key data science terms related to time series methods for forecasting are **data mining**, **predictive analytics**, **machine learning** (supervised and unsupervised), **regression**, and **structured** and **unstructured** data.

These are not necessarily mutually exclusive. At the risk of incurring the wrath of the data science gods, **here is our simplification**:

#### Structured vs. unstructured data

Structured data are organized into “rows and columns” (as in a spreadsheet); unstructured data are not (for example, the text of a book).

**Time series methods use structured data**.
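As a small illustration, here is a sketch of what “rows and columns” can look like for a time series. The month labels and sales figures are invented for the example, and the column names `month` and `sales` are our own choice.

```python
import csv
import io

# Made-up monthly sales, laid out as rows and columns (one row per period).
raw = """month,sales
2017-01,120
2017-02,135
2017-03,128
"""

# Each row becomes a dict keyed by column name.
rows = list(csv.DictReader(io.StringIO(raw)))

# Pull out the single column we would want to forecast.
sales = [float(row["sales"]) for row in rows]

for row in rows:
    print(row["month"], row["sales"])
```

The point is only that every observation sits in a predictable place: one row per time period, one column per variable.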

#### Data mining

Data mining seeks to find patterns in data, whether structured or unstructured.

**Time series methods seek to find patterns that repeat over time**.
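One common way to check for a repeating pattern is the sample autocorrelation, which measures how strongly a series resembles a shifted copy of itself. The sketch below is a minimal version under our own assumptions: the function name `autocorrelation` is ours, and the quarterly figures are made up so that the same four-quarter shape repeats exactly.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive `lag`."""
    n = len(series)
    mean = sum(series) / n
    # Total variation around the mean (the normalizing term).
    denom = sum((x - mean) ** 2 for x in series)
    # Co-variation between each value and the value `lag` steps earlier.
    numer = sum(
        (series[t] - mean) * (series[t - lag] - mean)
        for t in range(lag, n)
    )
    return numer / denom


# Made-up quarterly data: one four-quarter pattern repeated three times.
sales = [10, 20, 30, 15] * 3

for lag in (1, 2, 4):
    print(lag, round(autocorrelation(sales, lag), 3))
```

With this toy series the autocorrelation at lag 4 is clearly the largest of the three, which is the numerical signature of a pattern that repeats every four quarters.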

#### Predictive analytics

Predictive analytics seeks to find a relationship between a variable of interest (e.g. customer churn) and multiple dimensions (e.g. age, length of contract, zip code). These dimensions can be used to predict the likelihood of a customer churning (in our example).

Typically, predictive analytics is based not on time series data but on “cross-sectional” data, such as a set of customers observed at one point in time. Additionally, time series methods use only a very limited set of dimensions, the primary one being the past behavior of the variable being forecasted (e.g. sales).

**Time series methods typically use the past behavior of the variable being forecasted as the primary dimension.**

#### Machine learning

Machine learning means that a computer is using a program (algorithm) to “connect the dots” in the data. **If you run a regression model in Excel you are engaging in machine learning.**

However, supervised machine learning does not mean you are keeping watch over Excel as it does its stuff!

**Supervised machine learning means** that the computer is seeking to find a relationship between a single variable (e.g. churn) and many dimensional variables (e.g. age, length of contract, zip code).

**Unsupervised machine learning** **means** that the computer is seeking to find a relationship between many dimensions (e.g. age, length of contract, zip code) so that customers can, for example, be clustered into a small number of groups or tribes with similar characteristics.
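To make the clustering idea concrete, here is a rough sketch that splits customers into two groups using a single dimension, age. It uses a simple k-means-style loop, which is our choice for illustration and only one of many clustering techniques. The function name `two_means` and the ages are made up, and the sketch assumes the values are not all identical.

```python
def two_means(values, iterations=10):
    """Split `values` into two groups around two moving centers."""
    # Start the two centers at the extremes of the data.
    low_center, high_center = min(values), max(values)
    for _ in range(iterations):
        # Assign each value to whichever center is nearer.
        low_group = [v for v in values
                     if abs(v - low_center) <= abs(v - high_center)]
        high_group = [v for v in values
                      if abs(v - low_center) > abs(v - high_center)]
        # Move each center to the average of its group.
        low_center = sum(low_group) / len(low_group)
        high_center = sum(high_group) / len(high_group)
    return low_group, high_group


# Made-up customer ages; note there is no "target" variable here.
ages = [22, 25, 27, 58, 61, 64]
younger, older = two_means(ages)
print(younger, older)
```

Nothing tells the program what the right groups are. It finds them from the structure of the dimensions alone, which is what makes this unsupervised.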

**Time series methods are a type of supervised machine learning since they attempt to find a relationship between present and past behavior**.
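The link to supervised learning is easier to see once a series is rearranged into “inputs” and a “target”. The sketch below does that with a sliding window. The helper name `make_supervised`, the parameter `n_lags`, and the sales numbers are all hypothetical.

```python
def make_supervised(series, n_lags):
    """Turn a series into (features, target) pairs from its own past."""
    pairs = []
    for t in range(n_lags, len(series)):
        features = series[t - n_lags:t]  # past behavior
        target = series[t]               # present value to predict
        pairs.append((features, target))
    return pairs


# Made-up sales figures.
sales = [112, 118, 132, 129, 121, 135]

for features, target in make_supervised(sales, n_lags=2):
    print(features, "->", target)
```

Each pair plays the same role as a customer record in the churn example: the past values are the dimensions, and the current value is the variable of interest.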

#### Regression

Regression is one way a machine finds relationships between a single variable and a few (or many) dimensional variables or past values of the variable itself. There are several flavors of regression.

**Time series models typically use least squares regression or maximum likelihood**.
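As a minimal sketch of the least squares case, the code below regresses each value of a series on the value just before it and uses the fitted line for a one-step forecast. This is a deliberately simple one-lag model chosen for illustration, not a recommended forecasting method. The function names `fit_one_lag` and `forecast_next` are ours, and the series is constructed to follow the rule "next = 2 + 0.5 × previous" exactly so the result is easy to check. Maximum likelihood estimation is not shown.

```python
def fit_one_lag(series):
    """Least squares fit of y[t] = intercept + slope * y[t-1]."""
    x = series[:-1]  # previous values
    y = series[1:]   # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast_next(series):
    """One-step-ahead forecast from the fitted one-lag line."""
    intercept, slope = fit_one_lag(series)
    return intercept + slope * series[-1]


# Constructed series: each value is 2 + 0.5 * the previous value.
series = [10.0, 7.0, 5.5, 4.75, 4.375]

intercept, slope = fit_one_lag(series)
print(round(intercept, 4), round(slope, 4), round(forecast_next(series), 4))
```

Because the toy series follows the rule without noise, the fit recovers an intercept of 2 and a slope of 0.5 (up to rounding). Real data will not be this tidy.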

### Bottom line

So, when you use time series methods for forecasting you are probably **mining structured data using supervised, regression- or maximum-likelihood-based machine learning**.

**Part 1 – Practical Time Series Forecasting – Introduction**

**Part 2 – Practical Time Series Forecasting – Some Basics**

**Part 3 – Practical Time Series Forecasting – Potentially Useful Models**