Simple Exponential Smoothing

In Weighted Moving Average, you can give more weight to recent events, but you are limited to the last m observations. Exponential Smoothing improves on Weighted Moving Average by taking all previous observations into account, while still favoring the most recent observations.

Basic Concepts

In Simple (aka Single) Exponential Smoothing, the forecasted value at time i+1 is based on the value at time i, and the forecasted value at time i (and so indirectly on all the previous time values). In particular, for some α where 0 ≤ α ≤ 1, for all i > 1, we define

image113

Note that we don’t include time i = 1 in the calculations of MAE and MSE.

By simple algebra, this iteration can also be expressed as

image114

For example

image115

image116x

image117x

and similarly for all values of ŷi.

Example using Excel formulas

Example 1: Redo Example 1 of Simple Moving Average using exponential smoothing with α = .4.

The result is shown in Figure 1. The formula in cell C4 is =B4 and the formula in cell C5 is =C4+B$21*(B4-C4).

Single exponential smoothing

Figure 1 – Single Exponential Smoothing

The forecast for the next value in the time series is 74.0 (cell C19), by using the formula

=C18+B$21*(B18-C18)

Example using Excel’s data analysis tool

Excel Data Analysis Tool: Excel provides the Exponential Smoothing data analysis tool to simplify the calculations described above.

To use this tool for Example 1, select Data > Analysis|Data Analysis and choose Exponential Smoothing from the menu that appears. A dialog box now appears which is similar to that shown in Figure 2 of Simple Moving Average, except that a Damping Factor field is used in place of the Interval field. If this field is left blank it defaults to .7.

The damping factor is just 1 – α. Thus for Example 1, you need to use .6 as the damping factor.

The output is shown in columns D and E of Figure 2 along with the chart.

Exponential Smoothing Excel tool

Figure 2 – Excel’s Exponential Smoothing data analysis tool

The output takes a form similar to that shown in Figure 3 of Simple Moving Average. The standard error is calculated just as for moving averages with an interval of 3.

Example using Real Statistics’ data analysis tool

Example 2: Find the best fit exponential smoothing approximation to the data Example 1, using the MAE measure of accuracy.

Once again we use Solver. This time we minimize the value of MAE (cell J21 in Figure 3) by changing the value in cell H21 subject to the constraint that H21 <= 1.0.

The result shown in Figure 3 is that α = .656205 with MAE = 10.08.

Exponential smoothing Solver optimization

Figure 3 – Optimizing Exponential Smoothing

Click here for an example of how to obtain the standard errors and confidence intervals for the forecast obtained via the Exponential Smoothing option of the Basic Forecasting data analysis tool.

Examples Workbook

Click here to download the Excel workbook with the examples described on this webpage

References

Hyndman, R. J., and Athanasopoulos, G. (2018) Simple exponential smoothing. Forecasting: principals and practice, 2nd Ed.
https://otexts.com/fpp2/ses.html

Excel Easy (2016) Exponential smoothing
https://www.excel-easy.com/examples/exponential-smoothing.html

42 thoughts on “Simple Exponential Smoothing”

  1. Charles,

    Amazing work!

    Question about solver–since the default settings don’t give optimal solutions for alpha (or beta and gamma in double and triple exponential smoothing), would it be possible to have Real Stats change the options for solver to “multistart” to make this more robust?

    Or is the only option for the user to manually re-run solver once Real Stats finishes its output?

    Thanks!
    Tom

    Reply
    • Hello Tom,
      Thank you very much for your kind words. I am glad that you like Real Statistics.
      1. You can run Real Statistics and then change to the “multistart” option on the Solver dialog box (without rerunning the Real Statistics). But this may not be necessary.
      2. A few releases ago, I made a change in the way I implemented Holt’s Trend and Holt-Winter. Before using Solver, I perform a grid search to try to locate values for alpha, beta, and gamma that are close to the optimum location. I then use Solver to fine-tune the result. I decided not to do the same for Simple Exponential Smoothing (optimize alpha only) since the results from Solver only seemed to be sufficient.
      3. Please let me know whether you have found cases where this approach is not working very well.
      Charles

      Reply
    • With the new “LET” “Lambda” “SCAN”… (functions available in Office insider for now), here is how I generate the smoothed values, then the solver can be used to find the optimum value:

      =LET(Data;… ‘ the data to smooth
      SES_1;INDEX(Data;1;1); …. ‘ the first value of “Data” to begin the calculation…
      α; …. ‘the value of Alpha
      SES;SCAN(SES_1;Donnees;LAMBDA(a;b;α*b+(1-α)*a)); … it’s all! the calculation of the simple expo Exponential Smooting is done directly.

      Note : … Then I can also generate a Double Exponential Smoothing:
      DES_1;INDEX(SES;1;1); … ‘ the first value of “Data” to begin the calculation coming from SES above …
      DES;SCAN(DES_1;SES;LAMBDA(a;b;α*b+(1-α)*a)); … the data of the calculations of the double Exponential Smooting

      The data are generated and the Solver can be used.
      (But I dont’ know how to have access directly to the Solver Functions (question already written in this site), “Please Microsoft make an effort *”, but how to join them ?) a function instead of a as something like Solver (Objective, variables …)
      * a function cannot modify a worksheet but a subroutine can.

      Reply
  2. Hi Charles,
    In Simple Exponential Smoothing can we take alpha greater than 1 ? If no why not ? I am working on a forecasting model using past data, where in if I take alpha = 1.17 the MSE in at minimum, where as below that or beyond that the MSE increases, which clearly means that 1.17 is the optimum value of alpha. Any comments ?

    Reply
    • Hello Prashant,
      The predicted value at time t+1 is based on alpha * the observed value at time t + (1-alpha) * the predicted value at time t.
      If alpha > 1 then 1-alpha < 0 and so the predicted value at time t has a negative impact on the predicted value at time t+1, which is counter-intuitive. Charles

      Reply
  3. If I were to use exponential Smoothing with an Alpha equals to zero, will my forecast remain constant from period to period?

    Reply
  4. Hi Charles,

    Thanks for sharing these examples.
    If the ask is to forecast one day ahead in the window, say for instance, I am in D1 and need to forecast for D3 based on D1 observed values. I am tried with SES and want to improve the accuracy. Do you suggest anything better?

    Reply
  5. Hi! I sincerely hope I get a response from you. Would it be possible to forecast, for example I only have a data from 2008 to 2017 and I would like to know their values by 2025?

    Reply
    • Hello Arista,
      For Simple Exponential Smoothing, the forecast for every year after 2018 is the same as the forecast for 2018 (at least in the implementation that I am using. You can see this by using Holt’s Trend and setting the beta parameter to zero and choosing a forecast period > 1.
      Charles

      Reply
      • Hi Mr. Charles!

        How will the forecasting accuracy be performed then (e.g. MAPE, MSE, MAD)? Since there is no actual value from 2018 to 2025, error cannot be determined =(

        Reply
        • Hello Arista,
          You won’t be able to calculate MAPE, MSE or MAD even when the values after the first forecast value are not constant since there is no observed data to compare the forecast to.
          Charles

          Reply
  6. I am having a problem about the first value and whether it should be included in calculating MSE.Although MSE is zero based on forecasted value for year 1 but including that zero also impacts the average.Like for example ,question says
    a. Calculate the single exponential smoothing forecast for these data using an α of 0.3 and an initial forecast (F1) of 28
    The actual data for F1 is also 28.But including zero in mse calculation leads to difference in average.Please suggest what is to be done

    Reply
  7. Hi Charles,

    Wikipedia gives a slight variation of your equation which is (using your notation): $\hat{y}_{i+1}=\alpha y_{i+1}+(1-\alpha)\hat{y}_i$ (note the $y_{i+1}$ instead of $y_i$. I wonder if your equation does not imply a unwanted shift in the prediction, in particular if \alpha=1, the Wikipedia formula gives the original series (unsmoothed); where your formula gives the original series shifted by one. I noticed a similar “problem” for the Holt’s linear trend; I have not looked at the triple exponential smoothing.

    I am not sure if it is intentional or not, could you clarify this?

    Reply
          • Sorry for the wrong question that I asked.
            Why should i leave a blank in the first row in the column of the forecast?

          • Hello,
            The predicted value at each time depends on the value at the previous time. The first row corresponds to time 1 (the first time period). Since there is no previous time, you can’t calculate a predicted time for the first row.
            Charles

  8. Hi Charles,

    I’m wondering if those first 2 lines of formulae are supposed to be a little different. You have that y-hat = y and based on the simple algebra referenced before then next formulae, it seems like it should be y -y-hat = e. What am I missing?

    Reply
  9. Hello, Charles!
    Is there a difference if you forecast the number of teachers by quarter and by year using exponential smoothing methods or time series basic tools(Moving average,holt’s winters,holt’s linear,trend adjusted and single exponential smoothing) in excel? I have a data of 7 years of deployment of teachers with their specialization then i want to forecast the number of teachers in the next 5 years.

    Reply
      • How to forecast the number of teachers deployment in the America in the next 5 years using single exponential smoothing,holts method, holt winters methor,and trend adjusted exponential smoothing with a data of 7 years deployment of teachers in the US?

        Reply
        • Hello,
          How to forecast using each of these methods is explained on the Real Statistics website. This webpage explains single exponential smoothing. To find the others, enter the name of the method in the Search bar.
          Charles

          Reply
  10. Hi Charles,

    Your examples are excellent! I have two questions about fitting the SES forecasts:

    1) Is MAE usually the best measure of accuracy when optimizing SES for time series forecasting or is there a case for using Mean Absolute Deviation, or a similar robust measure, when there is concern about outliers skewing the fit?

    2) The initial value of your forecast is just the first value in the time series but is there an optimal way of selecting the initial value? I have tried using the average of the first half dozen or so time series values and found that often yields a more accurate fit.

    Robert

    Reply
    • Robert,
      1. I believe that MAE is commonly used. If you are concerned about outliers, then Mean Absolute Deviation might be a better choice.
      2. I am not surprised that the average of the first several values gives a better result.
      Charles

      Reply
  11. Hi Charles,

    Three comments:

    1). When deriving the example at the top of the page, you drop alpha as a coefficient for the last expression. (ie, y5 = ay4 + a(1-a)y3 + …….. + (1-a)^3y1).

    Is this an oversight or is it correct? I think it’s correct but I’d like to make sure.

    2). I think the cells you reference in example 1 are incorrect. There are no values in the I and H columns (but they do for example 2, which is where I think you got confused).

    3). I’m having a hard time seeing where the values are derived from.

    For example, take example 1 for the data point where t = 4.

    Based on your derivation I would assume that you get your answer like this:

    y4 = ay3 + a(1-a)y2 + (1-a)^2y1

    Using the actual data this comes out to:

    (0.4)*3.8 + 0.4*(0.6)*3 + 0.36*3

    This does not equal 5.8. Am I incorrect in my reasoning or is something amiss with the examples?

    Reply
    • Hi Jonathan,
      1. It is correct
      2. Thanks for catching this error. I have now corrected the formulas on the webpage.
      3. I just checked and the expression for y5 (in item #1) does yield the correct value of 11.528. I didn’t try y4, but I am sure that it will be correct as well.
      Charles

      Reply
      • Charles,

        I’m having a difficult time squaring away the equivalence of yi+1 = yi + aei with the longer derivation you put forth after that.

        If you use the values from your given example they don’t match up.

        Can you explain this in a little more detail?

        Reply
        • Jonathan,
          Suppose we place the predicted y values in column G.
          Place the formula =B4 in cell G4. Next place the formula =$B$21*B4+(1-$B$21)*G4 in cell G5. Now highlight the range G5:G19 and press Ctrl-D. The result is that column G will contain the same values as column B.
          Also notice that the formula =B21*B7+B21*(1-B21)*B6+B21*(1-B21)^2*B5+(1-B21)^3*B4 will have the same value as B8 (or G8).
          Charles

          Reply

Leave a Comment