<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	
	xmlns:georss="http://www.georss.org/georss"
	xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
	>

<channel>
	<title>holdout sample Archives - KDD Analytics</title>
	<atom:link href="https://www.kddanalytics.com/tag/holdout-sample/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.kddanalytics.com/tag/holdout-sample/</link>
	<description>Data to Decisions</description>
	<lastBuildDate>Sat, 24 Mar 2018 02:30:37 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.8.3</generator>

<image>
	<url>https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2016/08/cropped-imageedit_1_7939659602.png?fit=32%2C32&#038;ssl=1</url>
	<title>holdout sample Archives - KDD Analytics</title>
	<link>https://www.kddanalytics.com/tag/holdout-sample/</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">114932494</site>	<item>
		<title>Practical Time Series Forecasting – Know When to Roll ‘em</title>
		<link>https://www.kddanalytics.com/practical-times-series-forecasting-rolling-holdout-sample/</link>
		
		<dc:creator><![CDATA[KDD]]></dc:creator>
		<pubDate>Mon, 29 Jan 2018 01:33:32 +0000</pubDate>
				<category><![CDATA[Data Analytics Methods]]></category>
		<category><![CDATA[Econometrics]]></category>
		<category><![CDATA[Forecasting]]></category>
		<category><![CDATA[Time Series]]></category>
		<category><![CDATA[forecast error]]></category>
		<category><![CDATA[holdout sample]]></category>
		<category><![CDATA[rolling analysis]]></category>
		<category><![CDATA[times series]]></category>
		<guid isPermaLink="false">http://www.kddanalytics.com/?p=1322</guid>

					<description><![CDATA[<p>“Prediction is very difficult, especially if it&#8217;s about the future.” ― Niels Bohr, physicist Holdout samples are a key component to estimating a “useful” forecasting model. Set aside data at least equal in length to your forecast horizon (“holdout sample”). Build your models on the remaining data (“modeling sample”). And compare the candidate models’ forecast&#8230;</p>
<p>The post <a href="https://www.kddanalytics.com/practical-times-series-forecasting-rolling-holdout-sample/">Practical Time Series Forecasting – Know When to Roll ‘em</a> appeared first on <a href="https://www.kddanalytics.com">KDD Analytics</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p><strong>“</strong><em>Prediction is very difficult, especially if it&#8217;s about the future.</em><strong>”<br />
― <a href="https://en.wikipedia.org/wiki/Niels_Bohr" target="_blank" rel="noopener">Niels Bohr</a></strong>, physicist</p>
<p><a href="https://www.kddanalytics.com/practical-time-series-forecasting-holdout-sample/" target="_blank" rel="noopener"><strong>Holdout samples</strong></a> are a key component of estimating a “useful” forecasting model. <strong>Set aside data at least equal in length to your forecast horizon</strong> (“holdout sample”). Build your models on the remaining data (“modeling sample”). And <strong>compare the candidate models’ forecast performance over the holdout sample.</strong></p>
<p>At a minimum, a single holdout sample should be used.</p>
<p>But to get a <strong>better sense of a model’s future performance, consider using multiple holdout samples</strong>.</p>
<p>This <strong>guards against</strong> basing your model on a <strong>holdout sample</strong> that is <strong>unrepresentative</strong> of the overall characteristics of the time series.</p>
<p>One way to achieve this is to use<strong> “rolling” holdout samples</strong>.</p>
<h3>Rolling analysis</h3>
<p>A <a href="https://link.springer.com/chapter/10.1007%2F978-0-387-32348-0_9" target="_blank" rel="noopener"><strong>rolling analysis</strong></a> of a time series is generally used to test a model’s stability. That is, <strong>are a model’s parameters stable across time</strong> or do they change, especially in a systematic way?</p>
<p>This is important for a forecasting model. We <strong>don’t want</strong> a forecasting model whose <strong>parameters</strong> are <strong>changing during the forecast horizon in an unexpected (i.e. unmodeled) manner.</strong></p>
<p>Suppose our forecast horizon is 6 months.</p>
<p><strong> Under a single holdout sample</strong>, we would <strong>set aside the last 6 months of data as the holdout sample</strong>. Then using the remaining data as the modeling sample, estimate models, forecast over the single holdout sample and compare the models’ performance.</p>
<p>This will help narrow down the pool of candidate models.</p>
<h4>Rolling holdout samples</h4>
<p>But under a rolling holdout approach, also called &#8220;<a href="http://otexts.org/fpp2/accuracy.html" target="_blank" rel="noopener"><strong>time series cross-validation</strong></a>,&#8221;  <strong>we would set aside a longer sample of data</strong>, say, the last 12 months. Then:</p>
<p><strong>Step 1:</strong>  Estimate a model and forecast over the <strong>first</strong> 6 months of this 12-month period (&#8220;roll 1&#8221;);</p>
<p><strong>Step 2:</strong>  Then add 1 month to the tail end of the estimation sample, recalibrate the model, and forecast over the subsequent 6 months (“roll 2”);</p>
<p><strong>Step 3:</strong>  Then add another month to the estimation sample, recalibrate and forecast over the subsequent 6 months (“roll 3”);</p>
<p><strong>Step 4:</strong>  Repeat until there are no more 6-month periods (&#8220;rolls&#8221;) remaining in the 12-month period.</p>
<p>So, <strong>in this example</strong>, we would have <strong>estimated our model 7 times</strong> (the initial estimate plus 6 recalibrations, each with a modeling sample that is one month longer than the previous). And we would have <strong>made 7 forecasts over the rolling holdout periods</strong>.</p>
<p>The <strong>last &#8220;roll</strong>,&#8221; it turns out, <strong>is the same 6-month period</strong> we would have used <strong>under a single 6-month holdout sample case</strong>. So, we generate the stats for a standard single holdout sample during the course of this rolling holdout approach.</p>
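<p>The rolls described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name <code>rolling_holdout_splits</code>, its parameters, and the 60-month placeholder series are hypothetical and not part of any particular forecasting package.</p>

```python
def rolling_holdout_splits(n_obs, reserve=12, horizon=6, step=1):
    """Return (estimation_end, holdout_start, holdout_end) index triples.

    Indices are 0-based and end-exclusive, so series[:estimation_end] is the
    modeling sample and series[holdout_start:holdout_end] is that roll's holdout.
    """
    first_origin = n_obs - reserve   # end of the initial estimation sample
    last_origin = n_obs - horizon    # final roll matches the single-holdout case
    return [(origin, origin, origin + horizon)
            for origin in range(first_origin, last_origin + 1, step)]


series = list(range(60))             # placeholder for 60 months of data (assumed)
splits = rolling_holdout_splits(len(series))
for roll, (est_end, start, end) in enumerate(splits, start=1):
    print(f"roll {roll}: estimate on {est_end} months, "
          f"forecast months {start + 1}-{end}")
```

<p>With a 12-month reserve and a 6-month horizon this yields 7 rolls, and the last one covers the final 6 months of the series, the same window a single holdout sample would use.</p>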
<p>If we are examining multiple candidate models, this process can generate a lot of data. Below is an example of the rolling forecasts for one model.</p>
<p><img data-recalc-dims="1" fetchpriority="high" decoding="async" class="size-full wp-image-1325 aligncenter" src="https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2017/12/Rolling-Holdout-Samples.png?resize=561%2C547&#038;ssl=1" alt="Rolling Holdout Samples" width="561" height="547" srcset="https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2017/12/Rolling-Holdout-Samples.png?w=561&amp;ssl=1 561w, https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2017/12/Rolling-Holdout-Samples.png?resize=300%2C293&amp;ssl=1 300w" sizes="(max-width: 561px) 100vw, 561px" /></p>
<h3>Summary roll statistics</h3>
<p>We could generate a similar chart for every model we are testing. But it is <strong>easier to work with measures of forecast accuracy and bias</strong>, such as <a href="https://www.kddanalytics.com/practical-time-series-forecasting-holdout-sample/" target="_blank" rel="noopener"><strong>MAPE</strong></a> and <a href="https://www.kddanalytics.com/practical-time-series-forecasting-holdout-sample/" target="_blank" rel="noopener"><strong>MPE</strong></a>.</p>
<p>For each roll forecast, we can calculate the MAPE and MPE and observe how they change across the rolling forecasts.</p>
<p>Are the MAPE and MPE constant? Fluctuate with no apparent trend? Or exhibit some systematic trend?</p>
<p>Doing this for every candidate model we are testing generates charts like this which can quickly show any areas of concern:</p>
<p><img data-recalc-dims="1" decoding="async" class="size-full wp-image-1326 aligncenter" src="https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2017/12/Rolling-Holdout-Samples-MAPE.png?resize=604%2C370&#038;ssl=1" alt="" width="604" height="370" srcset="https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2017/12/Rolling-Holdout-Samples-MAPE.png?w=604&amp;ssl=1 604w, https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2017/12/Rolling-Holdout-Samples-MAPE.png?resize=300%2C184&amp;ssl=1 300w" sizes="(max-width: 604px) 100vw, 604px" /></p>
<p>In this example, candidate models 18 and 15 may be worth further inspection since their MAPEs are much higher than the rest in a recent roll period (roll 6).</p>
<h3>What else makes a model useful?</h3>
<p>So, with respect to the <strong>guidelines</strong> for whittling down a pool of candidate models we listed in an <strong><a href="https://www.kddanalytics.com/practical-time-series-forecasting-what-makes-a-useful-model/" target="_blank" rel="noopener">earlier article</a></strong>, we can add the following from a rolling holdout analysis:</p>
<p><strong>Stability</strong> – The model’s parameters should retain their statistical significance and not vary too much across the rolling periods; and the model&#8217;s residuals should remain &#8220;<strong>white noise</strong>&#8221; across the rolls;</p>
<p><strong>Consistency of Performance</strong> – The model’s forecast accuracy and bias should not exhibit any strong trends, especially trends in the “wrong” direction (i.e. getting progressively worse) as the more recent time period is approached.</p>
<p><strong>Strong Rolling Holdout Sample Performance</strong> – The model’s forecast accuracy and bias, <strong>averaged across all the rolls</strong>, should be high and low, respectively. That is, <strong>the average MAPE should be low</strong> and <strong>the average MPE should be close to zero</strong>, since MPE can be negative.</p>
<h3>Benefits of Rolling</h3>
<p>The primary benefit of a rolling analysis is that we get to see <strong>how a model performs</strong> forecast-wise <strong>over multiple time spans</strong> equal in length to our forecast horizon, <strong>instead of relying on performance in just one holdout sample</strong>.</p>
<p>A rolling analysis also <strong>addresses the issue of a short holdout sample</strong> (e.g. short forecast horizon) <strong>possibly not being representative of the general character of the time series</strong>.</p>
<p>In addition, a rolling analysis can be used as a check for the “best” model chosen using a single holdout sample. That is, would you pick the same model using the rolling holdout approach? If not, why?</p>
<p>In sum, <strong>a model that is persistently better at holdout sample forecasting over a longer time frame is likely to be more robust.</strong></p>
<p>So, let ‘em roll!</p>
<p><a href="https://www.kddanalytics.com/practical-time-series-forecasting-introduction/" target="_blank" rel="noopener"><strong>Part 1 &#8211; Practical Time Series Forecasting &#8211; Introduction</strong></a></p>
<p><a href="https://www.kddanalytics.com/practical-time-series-forecasting-basics/" target="_blank" rel="noopener"><strong>Part 2 &#8211; Practical Time Series Forecasting &#8211; Some Basics</strong></a></p>
<p><a href="https://www.kddanalytics.com/practical-time-series-forecasting-useful-models/" target="_blank" rel="noopener"><strong>Part 3 &#8211; Practical Time Series Forecasting &#8211; Potentially Useful Models</strong></a></p>
<p><a href="https://www.kddanalytics.com/practical-time-series-forecasting-data-science-taxonomy/" target="_blank" rel="noopener"><strong>Part 4 &#8211; Practical Time Series Forecasting &#8211; Data Science Taxonomy</strong></a></p>
<p><a href="https://www.kddanalytics.com/practical-time-series-forecasting-holdout-sample/" target="_blank" rel="noopener"><strong>Part 5 &#8211; Practical Time Series Forecasting &#8211; Know When to Hold &#8217;em</strong></a></p>
<p><a href="https://www.kddanalytics.com/practical-time-series-forecasting-what-makes-a-useful-model/" target="_blank" rel="noopener"><strong>Part 6 &#8211; Practical Time Series Forecasting &#8211; What Makes a Model Useful?</strong></a></p>
<p><a href="https://www.kddanalytics.com/practical-time-series-forecasting-deterministic-stochastic-trend/" target="_blank" rel="noopener"><strong>Part 7 &#8211; Practical Time Series Forecasting &#8211; To Difference or Not to Difference</strong></a></p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">1322</post-id>	</item>
		<item>
		<title>Practical Time Series Forecasting – Know When to Hold ‘em</title>
		<link>https://www.kddanalytics.com/practical-time-series-forecasting-holdout-sample/</link>
		
		<dc:creator><![CDATA[KDD]]></dc:creator>
		<pubDate>Mon, 08 Jan 2018 01:37:33 +0000</pubDate>
				<category><![CDATA[Data Analytics Methods]]></category>
		<category><![CDATA[Econometrics]]></category>
		<category><![CDATA[Forecasting]]></category>
		<category><![CDATA[Time Series]]></category>
		<category><![CDATA[forecast bias]]></category>
		<category><![CDATA[forecast error]]></category>
		<category><![CDATA[forecasting]]></category>
		<category><![CDATA[holdout sample]]></category>
		<category><![CDATA[methodology]]></category>
		<guid isPermaLink="false">http://www.kddanalytics.com/?p=1263</guid>

					<description><![CDATA[<p>“The only relevant test of the validity of a hypothesis is comparison of prediction with experience.” ― Milton Friedman, economist Holdout samples are a mainstay of predictive analytics. Set aside a portion of your data (say, 30%). Build your candidate models. Then “internally validate” your models using the holdout sample. More sophisticated methods like cross&#8230;</p>
<p>The post <a href="https://www.kddanalytics.com/practical-time-series-forecasting-holdout-sample/">Practical Time Series Forecasting – Know When to Hold ‘em</a> appeared first on <a href="https://www.kddanalytics.com">KDD Analytics</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>“<em>The only relevant test of the validity of a hypothesis is comparison of prediction with experience.</em>”<br />
― <strong>Milton Friedman, economist</strong></p>
<p><strong>Holdout samples</strong> are a mainstay of predictive analytics.</p>
<p>Set aside a portion of your data (say, 30%). Build your <a href="https://www.kddanalytics.com/practical-time-series-forecasting-useful-models/" target="_blank" rel="noopener"><strong>candidate models</strong></a>. Then “<strong>internally validate</strong>” your models using the holdout sample.</p>
<p>More sophisticated methods like <a href="https://en.wikipedia.org/wiki/Cross-validation_(statistics)"><strong>cross validation</strong></a> use multiple holdout samples. But the idea is to <strong>see how well your models predict using data the models have not “seen” before</strong>. Then go back and fine-tune to improve the models&#8217; predictive accuracy.</p>
<h3>Time series holdout samples</h3>
<p>The <strong>truest test of your models</strong> is when they are applied to “new” data. Data from a fresh marketing campaign, a new set of customers, a more recent time period (“<strong>external validation</strong>”).</p>
<p>But you may not have access to such data when building your models. You certainly will not have access to future data.</p>
<p>So, a <strong>holdout sample needs to be crafted from the historical data at your disposal</strong>.</p>
<p>When building predictive models for, say, a marketing campaign or for loan risk scoring, there is usually a large amount of data to work with. So, holding out a sample for testing still leaves lots of data for model building.</p>
<p>However, the situation can be much different when working with time series data.</p>
<p>Depending on the frequency of the series, the <strong>number of data points available to work with can be limited</strong>. 50 years of annual data is just 50 data points. 5 years of monthly data is just 60 data points.</p>
<p>Obviously the greater the frequency of data, the greater the number of data points available to work with…5 years of daily data is 1,825 data points. But these time series sample sizes usually pale against the large customer sets used to fuel marketing campaigns, which can run into the hundreds of thousands.</p>
<p>So, does this mean that holdout samples shouldn’t be used to test time series forecasting models?</p>
<p><strong>Absolutely not!</strong></p>
<p>You still <strong>need a way to</strong> <strong>whittle down your candidate models</strong>. You just need to be careful in how you select and use your holdout sample.</p>
<h3>Holdout sample length</h3>
<p>How much data should you set aside for a holdout sample? The <strong>rule of thumb</strong> we go by is to choose a holdout sample length that is <strong>at least</strong> (a) <strong>equal to the length of your forecast horizon</strong> or (b) <strong>equal to the length of time needed for your business to make a change</strong>.</p>
<p>Suppose you need a 12-month forecast to support a business plan. And you wish to forecast monthly sales for the 12 months starting November 1, 2017.</p>
<p>Then, your holdout sample should be at least the 12 months pertaining to November 2016 through October 2017. And your estimation sample should be all months prior to November 2016.</p>
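<p>As a rough sketch of this split, assume (purely for illustration) that the monthly history starts in January 2012. The helper <code>month_range</code> is a hypothetical name introduced only for this example.</p>

```python
def month_range(start, end):
    """List (year, month) pairs from start to end, inclusive."""
    (start_year, start_month), (end_year, end_month) = start, end
    total = (end_year - start_year) * 12 + (end_month - start_month) + 1
    return [(start_year + (start_month - 1 + k) // 12,
             (start_month - 1 + k) % 12 + 1)
            for k in range(total)]


# Assumed history: January 2012 through October 2017 (70 months).
months = month_range((2012, 1), (2017, 10))
horizon = 12                          # 12-month forecast horizon

estimation = months[:-horizon]        # everything before November 2016
holdout = months[-horizon:]           # November 2016 through October 2017
print(f"estimation ends {estimation[-1]}, holdout runs {holdout[0]} to {holdout[-1]}")
```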
<p><img data-recalc-dims="1" decoding="async" class="size-full wp-image-1267 aligncenter" src="https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2017/12/Example-of-Holdout-Sample-1.png?resize=618%2C385&#038;ssl=1" alt="Using a holdout sample for time series forecasting" width="618" height="385" srcset="https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2017/12/Example-of-Holdout-Sample-1.png?w=618&amp;ssl=1 618w, https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2017/12/Example-of-Holdout-Sample-1.png?resize=300%2C187&amp;ssl=1 300w" sizes="(max-width: 618px) 100vw, 618px" /></p>
<p>Remember, the <strong>time series methods we are addressing are best used for short-run forecasting</strong>. Most business forecasting needs are for short-run forecasts. The next few months or few years. Not the next 5 to 10 years.</p>
<p>Alternatively, suppose your business only needs 8 months to make a change (maybe it is getting more salespeople on line). Then your holdout sample should be at least 8 months.</p>
<h3>Holdout sample performance</h3>
<p>Once you estimate a model, you apply it to the holdout sample to see how well it predicts. There are several <strong>measures</strong> you can use to gauge <strong>how well your model performs</strong>. We focus on measures of <strong>accuracy</strong> and <strong>bias</strong>.</p>
<h4>To measure forecast accuracy:</h4>
<p><strong>If the business cost of a forecast error is high</strong>, then the <a href="https://en.wikipedia.org/wiki/Mean_squared_error"><strong>Mean Square Error</strong></a> (MSE) or <a href="https://en.wikipedia.org/wiki/Root-mean-square_deviation"><strong>Root Mean Square Error</strong></a> (RMSE) will magnify large errors since forecast errors are squared. MSE is the average of (predicted – actual)<sup>2</sup>.</p>
<p><strong>If the business cost of a forecast error is average</strong>, then the <a href="https://en.wikipedia.org/wiki/Mean_absolute_percentage_error"><strong>Mean Absolute Percent Error</strong></a> (MAPE) can be used. MAPE is simply the average of the absolute value of [(predicted – actual)/actual]. However, care should be taken if actual values of “0” are possible, as MAPE would then be undefined.</p>
<p>See <a href="http://otexts.org/fpp2/accuracy.html" target="_blank" rel="noopener"><strong>here</strong></a> for a discussion of forecast accuracy measures.</p>
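<p>The accuracy measures above translate directly into code. The sketch below is one possible implementation; the function name <code>accuracy_measures</code> and the sample numbers are made up for illustration, and MAPE is scaled by 100 so that it reads as a percentage.</p>

```python
import math


def accuracy_measures(predicted, actual):
    """Return MSE, RMSE and MAPE (in percent) for paired forecasts and actuals."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is 0")
    n = len(actual)
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n
    mape = 100 * sum(abs((p - a) / a) for p, a in zip(predicted, actual)) / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAPE": mape}


actual = [100.0, 200.0, 400.0]        # illustrative holdout values
predicted = [110.0, 190.0, 400.0]     # illustrative forecasts
stats = accuracy_measures(predicted, actual)
print({name: round(value, 2) for name, value in stats.items()})
```

<p>Note how the two misses of 10 units each count equally toward MSE, while MAPE weighs them as 10% and 5% errors relative to their actual values.</p>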
<h4>To measure forecast bias:</h4>
<p>The <a href="https://en.wikipedia.org/wiki/Mean_percentage_error"><strong>Mean Percent Error</strong></a> (MPE) will indicate if there is a <strong>systematic bias to the forecast</strong>. If positive, then the model is overpredicting; if negative, it is underpredicting. And the further from 0, the greater the bias. MPE is the average of [(predicted – actual)/actual].</p>
<p>An alternative measure is <strong>Theil’s measure of systematic error</strong>, the “bias-proportion” of Theil’s <a href="http://www.eviews.com/help/helpintro.html#page/content%2FForecast-Forecast_Basics.html%23" target="_blank" rel="noopener"><strong>inequality coefficient</strong></a>. This measures the extent to which the average of the forecasted values deviates from the average of the actual values. The larger the value, the greater the systematic bias.</p>
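<p>Both bias measures can be sketched as follows. The bias proportion is computed here as the squared gap between the mean forecast and the mean actual, divided by the MSE, which is the usual textbook form. The function names and sample numbers are illustrative assumptions.</p>

```python
def mean_percent_error(predicted, actual):
    """MPE in percent: positive means overpredicting on average."""
    return 100 * sum((p - a) / a for p, a in zip(predicted, actual)) / len(actual)


def theil_bias_proportion(predicted, actual):
    """Share of the mean squared error due to a gap between the two means."""
    n = len(actual)
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n
    if mse == 0:
        return 0.0                    # perfect forecast: no error to apportion
    mean_gap = sum(predicted) / n - sum(actual) / n
    return mean_gap ** 2 / mse


actual = [100.0, 200.0, 400.0]        # illustrative holdout values
predicted = [105.0, 210.0, 420.0]     # every forecast is 5% too high
mpe = mean_percent_error(predicted, actual)
bias_share = theil_bias_proportion(predicted, actual)
print(f"MPE: {mpe:.2f}%  bias proportion: {bias_share:.2f}")
```

<p>In this made-up case the MPE is 5% and roughly 78% of the squared error is attributable to the gap between the means, which points to a systematic overprediction rather than random misses.</p>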
<p><strong>In general, in the holdout sample, a good performing model will exhibit low overall error (high accuracy) and low systematic bias</strong>.</p>
<p>The chart below shows an example of such a model using a 5-month holdout sample. On average, the model’s error is between 0.28% and 1.85% while exhibiting a very small positive bias of 0.10%.</p>
<p><img data-recalc-dims="1" loading="lazy" decoding="async" class="size-full wp-image-1268 aligncenter" src="https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2017/12/Example-of-Holdout-Sample-2.png?resize=618%2C385&#038;ssl=1" alt="Example of holdout sample performance" width="618" height="385" srcset="https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2017/12/Example-of-Holdout-Sample-2.png?w=618&amp;ssl=1 618w, https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2017/12/Example-of-Holdout-Sample-2.png?resize=300%2C187&amp;ssl=1 300w" sizes="auto, (max-width: 618px) 100vw, 618px" /></p>
<p>Note that <strong>there is no absolute criterion for what constitutes a “low” error,</strong> for example, MSE.</p>
<p><strong>Measures of forecast error</strong> are to be <strong>judged relative to the context of the forecast</strong> you are making. In some cases, your models may be averaging an error in the 30%’s; in others it could be in the single digits.</p>
<h3>Length of estimation sample</h3>
<p>A related issue is <strong>how much data do you use for model estimation</strong>?</p>
<p><strong>Often, there is not a choice</strong>. After setting aside a holdout sample, there may be just a bare minimum amount of data left for modeling (i.e. you need more data points than the number of model parameters to be estimated).</p>
<p>In general, the <strong>fewer</strong> the <strong>number of model parameters</strong> and the <strong>less &#8220;noisy&#8221;</strong> the data (i.e. less random), the <strong>fewer the number of data points <a href="http://otexts.org/fpp2/short-ts.html" target="_blank" rel="noopener">needed</a></strong>. Typically, though, <strong>we look for at least 40 data points.</strong></p>
<p>If you have a <strong>high frequency time series</strong> (monthly, daily, hourly) you may have room to consider whether the <strong>choice of the estimation sample length can affect model performance</strong>.</p>
<p><strong>One can argue that the modeling sample should be reflective of the characteristics of the forecast horizon</strong>. That is, the next year, say, is more likely to be like the past several years, not like 20 years ago. So, <strong>limit the estimation sample to more recent years</strong>.</p>
<p>Consider the time series shown below. Clearly the time path of this series has not been consistent. Rather than estimating a model using the entire historical sample, maybe limit it to the more recent period.</p>
<p><img data-recalc-dims="1" loading="lazy" decoding="async" class="size-full wp-image-1206 aligncenter" src="https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2017/12/Low-variation-time-series.png?resize=615%2C386&#038;ssl=1" alt="Low variation time series" width="615" height="386" srcset="https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2017/12/Low-variation-time-series.png?w=615&amp;ssl=1 615w, https://i0.wp.com/www.kddanalytics.com/wp-content/uploads/2017/12/Low-variation-time-series.png?resize=300%2C188&amp;ssl=1 300w" sizes="auto, (max-width: 615px) 100vw, 615px" /></p>
<p>The <strong>trade-off</strong> is that there is <strong>less experiential history upon which to base a model</strong>. Maybe the dynamics associated with that turning point in early 2000 and subsequent recovery could prove to be fertile ground for training your model.</p>
<p><strong>But this is a testable proposition!</strong></p>
<p>Because you have already set aside a holdout sample, <strong>you can test whether a model estimated on the full (non-holdout) sample performs better in the holdout sample than one based on a more recent sample.</strong></p>
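<p>Here is a toy version of that test. Everything in it is an assumption made for illustration: the series is synthetic (flat, then trending), the "model" is a simple drift forecast standing in for whatever candidate model you are evaluating, and the 12-period "recent" window is arbitrary.</p>

```python
def drift_forecast(sample, horizon):
    """Forecast by extending the sample's average per-period change."""
    drift = (sample[-1] - sample[0]) / (len(sample) - 1)
    return [sample[-1] + drift * h for h in range(1, horizon + 1)]


def mape(predicted, actual):
    """Mean absolute percent error, in percent."""
    return 100 * sum(abs((p - a) / a) for p, a in zip(predicted, actual)) / len(actual)


# Synthetic series: flat for 20 periods, then a steady upward trend.
series = [50.0] * 20 + [50.0 + 2 * k for k in range(1, 21)]
horizon = 6
estimation, holdout = series[:-horizon], series[-horizon:]

# Same holdout sample, two different estimation samples.
mape_full = mape(drift_forecast(estimation, horizon), holdout)
mape_recent = mape(drift_forecast(estimation[-12:], horizon), holdout)
print(f"full sample MAPE: {mape_full:.2f}%  recent sample MAPE: {mape_recent:.2f}%")
```

<p>On this synthetic series the recent-sample model wins because the early flat period drags down the full-sample drift. With real data the answer could just as easily go the other way, which is exactly why it is worth testing.</p>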
<h3>Data frequency compression</h3>
<p>Another use for a holdout sample is to test whether changes to the frequency of the time series will improve predictive accuracy.</p>
<p><strong>The frequency of the time series could be reduced to help match a desired forecast horizon</strong>. For example, suppose management wants a 3-year forecast. And you are working with monthly SALES. Yes, you could produce a 36 period (month) forecast. But that might be pushing the limits of your methodology, especially if there is not a strong trend.</p>
<p>Alternatively, by converting to a quarterly series, you would lessen the variability in your data and forecast only 12 periods. <strong>This might yield a more accurate forecast</strong>.</p>
<p><strong>But again, this is testable using a holdout sample!</strong></p>
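<p>The conversion itself is simple. A minimal sketch, assuming SALES is a flow that should be summed (not averaged) and that the monthly series starts on a quarter boundary; the function name <code>to_quarterly</code> and the numbers are hypothetical:</p>

```python
def to_quarterly(monthly_sales):
    """Sum consecutive blocks of three months into quarterly totals."""
    if len(monthly_sales) % 3 != 0:
        raise ValueError("expected a whole number of quarters")
    return [sum(monthly_sales[i:i + 3]) for i in range(0, len(monthly_sales), 3)]


monthly = [10, 12, 11, 13, 15, 14, 16, 18, 17, 19, 21, 20]   # one illustrative year
quarterly = to_quarterly(monthly)
print(quarterly)   # [33, 42, 51, 60]
```

<p>To test the idea, you would fit candidate models to both the monthly and the quarterly series, aggregate the monthly forecasts to quarters, and compare the two sets of errors over the same holdout period.</p>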
<h3>Bottom line</h3>
<p><strong>Holdout samples are a critical component</strong> of a time series forecasting methodology.</p>
<p>In a later article we will address using <strong>multiple</strong> holdout samples…to help guard against basing a model on a single, unrepresentative holdout sample (i.e. we found a great model just because we got lucky!).</p>
<p><a href="https://www.kddanalytics.com/practical-time-series-forecasting-introduction/" target="_blank" rel="noopener"><strong>Part 1 &#8211; Practical Time Series Forecasting &#8211; Introduction</strong></a></p>
<p><a href="https://www.kddanalytics.com/practical-time-series-forecasting-basics/" target="_blank" rel="noopener"><strong>Part 2 &#8211; Practical Time Series Forecasting &#8211; Some Basics</strong></a></p>
<p><a href="https://www.kddanalytics.com/practical-time-series-forecasting-useful-models/" target="_blank" rel="noopener"><strong>Part 3 &#8211; Practical Time Series Forecasting &#8211; Potentially Useful Models</strong></a></p>
<p><a href="https://www.kddanalytics.com/practical-time-series-forecasting-data-science-taxonomy/" target="_blank" rel="noopener"><strong>Part 4 &#8211; Practical Time Series Forecasting &#8211; Data Science Taxonomy</strong></a></p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">1263</post-id>	</item>
	</channel>
</rss>
