In Weighted Moving Average, you can give more weight to recent events, but you are limited to the last m observations. Exponential Smoothing improves on Weighted Moving Average by taking all previous observations into account, while still favoring the most recent observations.
Basic Concepts
In Simple (aka Single) Exponential Smoothing, the forecasted value at time i+1 is based on the value at time i, and the forecasted value at time i (and so indirectly on all the previous time values). In particular, for some α where 0 ≤ α ≤ 1, for all i > 1, we define
Note that we don’t include time i = 1 in the calculations of MAE and MSE.
By simple algebra, this iteration can also be expressed as
For example
and similarly for all values of ŷi.
Example using Excel formulas
Example 1: Redo Example 1 of Simple Moving Average using exponential smoothing with α = .4.
The result is shown in Figure 1. The formula in cell C4 is =B4 and the formula in cell C5 is =C4+B$21*(B4-C4).
Figure 1 – Single Exponential Smoothing
The forecast for the next value in the time series is 74.0 (cell C19), by using the formula
=C18+B$21*(B18-C18)
Example using Excel’s data analysis tool
Excel Data Analysis Tool: Excel provides the Exponential Smoothing data analysis tool to simplify the calculations described above.
To use this tool for Example 1, select Data > Analysis|Data Analysis and choose Exponential Smoothing from the menu that appears. A dialog box now appears which is similar to that shown in Figure 2 of Simple Moving Average, except that a Damping Factor field is used in place of the Interval field. If this field is left blank it defaults to .7.
The damping factor is just 1 – α. Thus for Example 1, you need to use .6 as the damping factor.
The output is shown in columns D and E of Figure 2 along with the chart.
Figure 2 – Excel’s Exponential Smoothing data analysis tool
The output takes a form similar to that shown in Figure 3 of Simple Moving Average. The standard error is calculated just as for moving averages with an interval of 3.
Example using Real Statistics’ data analysis tool
Example 2: Find the best fit exponential smoothing approximation to the data Example 1, using the MAE measure of accuracy.
Once again we use Solver. This time we minimize the value of MAE (cell J21 in Figure 3) by changing the value in cell H21 subject to the constraint that H21 <= 1.0.
The result shown in Figure 3 is that α = .656205 with MAE = 10.08.
Figure 3 – Optimizing Exponential Smoothing
Click here for an example of how to obtain the standard errors and confidence intervals for the forecast obtained via the Exponential Smoothing option of the Basic Forecasting data analysis tool.
Examples Workbook
Click here to download the Excel workbook with the examples described on this webpage
References
Hyndman, R. J., and Athanasopoulos, G. (2018) Simple exponential smoothing. Forecasting: principals and practice, 2nd Ed.
https://otexts.com/fpp2/ses.html
Excel Easy (2016) Exponential smoothing
https://www.excel-easy.com/examples/exponential-smoothing.html
Charles,
Amazing work!
Question about solver–since the default settings don’t give optimal solutions for alpha (or beta and gamma in double and triple exponential smoothing), would it be possible to have Real Stats change the options for solver to “multistart” to make this more robust?
Or is the only option for the user to manually re-run solver once Real Stats finishes its output?
Thanks!
Tom
Hello Tom,
Thank you very much for your kind words. I am glad that you like Real Statistics.
1. You can run Real Statistics and then change to the “multistart” option on the Solver dialog box (without rerunning the Real Statistics). But this may not be necessary.
2. A few releases ago, I made a change in the way I implemented Holt’s Trend and Holt-Winter. Before using Solver, I perform a grid search to try to locate values for alpha, beta, and gamma that are close to the optimum location. I then use Solver to fine-tune the result. I decided not to do the same for Simple Exponential Smoothing (optimize alpha only) since the results from Solver only seemed to be sufficient.
3. Please let me know whether you have found cases where this approach is not working very well.
Charles
With the new “LET” “Lambda” “SCAN”… (functions available in Office insider for now), here is how I generate the smoothed values, then the solver can be used to find the optimum value:
=LET(Data;… ‘ the data to smooth
SES_1;INDEX(Data;1;1); …. ‘ the first value of “Data” to begin the calculation…
α; …. ‘the value of Alpha
SES;SCAN(SES_1;Donnees;LAMBDA(a;b;α*b+(1-α)*a)); … it’s all! the calculation of the simple expo Exponential Smooting is done directly.
Note : … Then I can also generate a Double Exponential Smoothing:
DES_1;INDEX(SES;1;1); … ‘ the first value of “Data” to begin the calculation coming from SES above …
DES;SCAN(DES_1;SES;LAMBDA(a;b;α*b+(1-α)*a)); … the data of the calculations of the double Exponential Smooting
The data are generated and the Solver can be used.
(But I dont’ know how to have access directly to the Solver Functions (question already written in this site), “Please Microsoft make an effort *”, but how to join them ?) a function instead of a as something like Solver (Objective, variables …)
* a function cannot modify a worksheet but a subroutine can.
Hi Charles,
In Simple Exponential Smoothing can we take alpha greater than 1 ? If no why not ? I am working on a forecasting model using past data, where in if I take alpha = 1.17 the MSE in at minimum, where as below that or beyond that the MSE increases, which clearly means that 1.17 is the optimum value of alpha. Any comments ?
Hello Prashant,
The predicted value at time t+1 is based on alpha * the observed value at time t + (1-alpha) * the predicted value at time t.
If alpha > 1 then 1-alpha < 0 and so the predicted value at time t has a negative impact on the predicted value at time t+1, which is counter-intuitive. Charles
If I were to use exponential Smoothing with an Alpha equals to zero, will my forecast remain constant from period to period?
Macx,
I suggest that you try repeating the example on this webpage setting alpha = 0 and see what happens yourself.
Charles
Hi Charles,
Thanks for sharing these examples.
If the ask is to forecast one day ahead in the window, say for instance, I am in D1 and need to forecast for D3 based on D1 observed values. I am tried with SES and want to improve the accuracy. Do you suggest anything better?
Norbert,
I am not sure I understand what you are looking for. Do you need to forecast more than one day ahead?
Charles
ANY STANDARD VALUE FOR DAMPING FACTOR
I used .7 as the default, but I don’t know whether this is considered to be the standard value.
Charles
Hi! I sincerely hope I get a response from you. Would it be possible to forecast, for example I only have a data from 2008 to 2017 and I would like to know their values by 2025?
Hello Arista,
For Simple Exponential Smoothing, the forecast for every year after 2018 is the same as the forecast for 2018 (at least in the implementation that I am using. You can see this by using Holt’s Trend and setting the beta parameter to zero and choosing a forecast period > 1.
Charles
Hi Mr. Charles!
How will the forecasting accuracy be performed then (e.g. MAPE, MSE, MAD)? Since there is no actual value from 2018 to 2025, error cannot be determined =(
Hello Arista,
You won’t be able to calculate MAPE, MSE or MAD even when the values after the first forecast value are not constant since there is no observed data to compare the forecast to.
Charles
I am having a problem about the first value and whether it should be included in calculating MSE.Although MSE is zero based on forecasted value for year 1 but including that zero also impacts the average.Like for example ,question says
a. Calculate the single exponential smoothing forecast for these data using an α of 0.3 and an initial forecast (F1) of 28
The actual data for F1 is also 28.But including zero in mse calculation leads to difference in average.Please suggest what is to be done
Hello Nidhi,
I would not include an initial zero (based on a initialization of values) in the calculation of MSE.
Charles
Hi Charles,
Wikipedia gives a slight variation of your equation which is (using your notation): $\hat{y}_{i+1}=\alpha y_{i+1}+(1-\alpha)\hat{y}_i$ (note the $y_{i+1}$ instead of $y_i$. I wonder if your equation does not imply a unwanted shift in the prediction, in particular if \alpha=1, the Wikipedia formula gives the original series (unsmoothed); where your formula gives the original series shifted by one. I noticed a similar “problem” for the Holt’s linear trend; I have not looked at the triple exponential smoothing.
I am not sure if it is intentional or not, could you clarify this?
Hello Olivier,
The formula that I am using can be found in many sources. For example at https://www.itl.nist.gov/div898/handbook/pmc/section4/pmc432.htm.
I haven-t thought about the significance of the difference between the formula that I am using and that found on wikipedia.
Charles
Hello! is it normal to have a negative result in the forecast by using Holts linear method?
It depends on what sort of data values you are using, but, yes, you can have negative forecast values.
Charles
Why do i have to leave a first column blank when I forecast?
I don’t know what you are referring to? I don’t see where the first column is blank. Do you mean first row?
Charles
Sorry for the wrong question that I asked.
Why should i leave a blank in the first row in the column of the forecast?
Hello,
The predicted value at each time depends on the value at the previous time. The first row corresponds to time 1 (the first time period). Since there is no previous time, you can’t calculate a predicted time for the first row.
Charles
Hi Charles,
I’m wondering if those first 2 lines of formulae are supposed to be a little different. You have that y-hat = y and based on the simple algebra referenced before then next formulae, it seems like it should be y -y-hat = e. What am I missing?
NVM. I missed that the index on the first equation was indexed as 1, not i.
Hello, Charles!
Is there a difference if you forecast the number of teachers by quarter and by year using exponential smoothing methods or time series basic tools(Moving average,holt’s winters,holt’s linear,trend adjusted and single exponential smoothing) in excel? I have a data of 7 years of deployment of teachers with their specialization then i want to forecast the number of teachers in the next 5 years.
Hello Rhea,
Yes, there will be a difference. Which tool to use is described on the website.
Charles
How to forecast the number of teachers deployment in the America in the next 5 years using single exponential smoothing,holts method, holt winters methor,and trend adjusted exponential smoothing with a data of 7 years deployment of teachers in the US?
Hello,
How to forecast using each of these methods is explained on the Real Statistics website. This webpage explains single exponential smoothing. To find the others, enter the name of the method in the Search bar.
Charles
Hi Charles,
Your examples are excellent! I have two questions about fitting the SES forecasts:
1) Is MAE usually the best measure of accuracy when optimizing SES for time series forecasting or is there a case for using Mean Absolute Deviation, or a similar robust measure, when there is concern about outliers skewing the fit?
2) The initial value of your forecast is just the first value in the time series but is there an optimal way of selecting the initial value? I have tried using the average of the first half dozen or so time series values and found that often yields a more accurate fit.
Robert
Robert,
1. I believe that MAE is commonly used. If you are concerned about outliers, then Mean Absolute Deviation might be a better choice.
2. I am not surprised that the average of the first several values gives a better result.
Charles
Charles,
could you please explain how to forecast values for eg. next 5 months.
Paul,
It depends on the forecasting technique that you are using. Are you asking about Simple Exponential Smoothing?
Charles
Yes, Simple Exponential Smoothing
Paul,
Since simple exponential smoothing assumes there is no trend in the data, the forecast is flat. This means that once you forecast a value of x for future period 1, the forecast for all periods after 1 is still x.
If you want to forecast assuming that the future is not flat, you need to use a different technique — e.g. Holt Trend.
Charles
Thanks Charles !
Hi Charles,
Three comments:
1). When deriving the example at the top of the page, you drop alpha as a coefficient for the last expression. (ie, y5 = ay4 + a(1-a)y3 + …….. + (1-a)^3y1).
Is this an oversight or is it correct? I think it’s correct but I’d like to make sure.
2). I think the cells you reference in example 1 are incorrect. There are no values in the I and H columns (but they do for example 2, which is where I think you got confused).
3). I’m having a hard time seeing where the values are derived from.
For example, take example 1 for the data point where t = 4.
Based on your derivation I would assume that you get your answer like this:
y4 = ay3 + a(1-a)y2 + (1-a)^2y1
Using the actual data this comes out to:
(0.4)*3.8 + 0.4*(0.6)*3 + 0.36*3
This does not equal 5.8. Am I incorrect in my reasoning or is something amiss with the examples?
Hi Jonathan,
1. It is correct
2. Thanks for catching this error. I have now corrected the formulas on the webpage.
3. I just checked and the expression for y5 (in item #1) does yield the correct value of 11.528. I didn’t try y4, but I am sure that it will be correct as well.
Charles
Charles,
I’m having a difficult time squaring away the equivalence of yi+1 = yi + aei with the longer derivation you put forth after that.
If you use the values from your given example they don’t match up.
Can you explain this in a little more detail?
Jonathan,
Suppose we place the predicted y values in column G.
Place the formula =B4 in cell G4. Next place the formula =$B$21*B4+(1-$B$21)*G4 in cell G5. Now highlight the range G5:G19 and press Ctrl-D. The result is that column G will contain the same values as column B.
Also notice that the formula =B21*B7+B21*(1-B21)*B6+B21*(1-B21)^2*B5+(1-B21)^3*B4 will have the same value as B8 (or G8).
Charles