Time Series Analysis

We explore various methods for forecasting (i.e. predicting) the next value(s) in a time series. A time series is a sequence of observations y_1, …, y_n. We usually think of the subscripts as representing evenly spaced time intervals (seconds, minutes, months, seasons, years, etc.).

References

Greene, W. H. (2002) Econometric analysis. 5th Ed. Prentice-Hall
https://www.scirp.org/(S(351jmbntvnsjt1aadkposzje))/reference/referencespapers.aspx?referenceid=1243286

Gujarati, D. & Porter, D. (2009) Basic econometrics. 5th Ed. McGraw Hill
http://www.uop.edu.pk/ocontents/gujarati_book.pdf

Hamilton, J. D. (1994) Time series analysis. Princeton University Press
https://press.princeton.edu/books/hardcover/9780691042893/time-series-analysis

Wooldridge, J. M. (2009) Introductory econometrics, a modern approach. 5th Ed. South-Western, Cengage Learning
https://cbpbu.ac.in/userfiles/file/2020/STUDY_MAT/ECO/2.pdf

72 thoughts on “Time Series Analysis”

  1. Hi Charles,
    I am trying to perform regression among time-series.
    If I am not mistaken, I *must* difference them or subtract the time trend.
    Is this included in the section?

    I only want to find the coefficients, but I hear that the assumptions of regression
    change a bit. Where (on your site, in a book, or at a link) can one find a worked example that includes the assumption tests?

    Thanks in advance,
    Savvas
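
A minimal sketch of the two options mentioned in the comment above (differencing a series, or subtracting a fitted linear time trend), assuming evenly spaced observations. The function names and the sample series are hypothetical, and the sketch does not run any assumption tests; it only shows the transformations that would be applied to each series before regressing one on the other.

```python
def first_difference(series):
    """Return y[t] - y[t-1] for t = 1..n-1 (one element shorter than the input)."""
    return [b - a for a, b in zip(series, series[1:])]


def detrend_linear(series):
    """Subtract the least-squares line fitted against t = 0, 1, ..., n-1."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [y - (intercept + slope * t) for t, y in enumerate(series)]


# Hypothetical series: a linear trend 10 + 2t plus a small alternating wiggle.
y = [10 + 2 * t + (0.5 if t % 2 else -0.5) for t in range(8)]
dy = first_difference(y)      # differenced series
resid = detrend_linear(y)     # series with the fitted time trend removed
```

Which of the two is appropriate depends on whether the trend is stochastic (differencing) or deterministic (detrending); the sketch does not decide that.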

  2. Hi Charles

    At present, I am working on QSR restaurant forecasting. So basically I have to forecast the quantity of each product required in each store.

    Can you please help me with this?

    Regards,
    chetan

  3. Hello Charles, I stumbled across your website while searching for methodology on time series analysis. I have 3 years’ worth of electric power consumption data that I wish to forecast using statistical models. I have a basic statistics background and no “machine learning” background. Can you recommend how best to work through your material? I would like to try applying the methods and models to the sample data. Thank you very much,
    Anson

    • Hello Anson,
      I suggest that you start by graphing the 3 years’ worth of data. Look for any patterns: seasonality, increasing/decreasing trends, randomness, etc. Based on what you observe, you then need to choose a model (ARIMA, Holt-Winters, etc.).
      Charles

      • Hello Charles, Thank you for the reply. The time series I have is power consumption data from a commercial building, consisting of two measurements: the accumulated energy consumption (watt-hours) and the power consumption (watts) at the timestamp of the reading.
        The sample data looks like this:
        Timestamp | Total Accumulated Energy (Watt Hours)| Total Power Consumed (Watts) |
        Jan 01, 2019 01:02:00 AM | 415,457,280 | 32,683 |
        Jan 02, 2019 01:04:00 AM | 415,629,888 | 25,982 |

        Jan 31, 2019 01:02:00 AM | 424,123,538 | 31,857 |
        Jan 31, 2019 01:04:00 AM | 424,345,242 | 28,735 |
        ====================================================
        When I plot the total accumulated energy over the 3 years, it shows a linear upward trend. When I plot the energy consumed per fixed unit of time (i.e. per day), it shows a weekly seasonality: high consumption Monday through Friday and lower consumption on Saturday and Sunday. Consumption is also low on public holidays. The plot of Total Power Consumed exhibits the same pattern.
        ====================================================
        My questions are:
        1. The readings are taken at uneven intervals (sometimes every 2 minutes, sometimes every 5 minutes, and so on). Should I normalize the dataset to a common unit (i.e. day) if the objective is to forecast the next day’s energy consumption?
        2. Do I need to roll the data up further to months if I want to forecast next month’s energy consumption?
        3. What do I have to do to forecast several days, weeks, or months ahead?
        4. How do I take public holidays, weekdays, and weekends (i.e. low energy consumption) into account? Would the handling differ depending on whether I forecast the next “Day”, “Week”, or “Month”?
        Thank you very much,
        Anson
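
One possible way to put unevenly spaced cumulative meter readings on a common daily unit, as raised in question 1 above, is sketched here. It assumes the accumulated energy counter never resets, and it defines a day's consumption as the last reading of that day minus the last reading of the previous day that has data. The function name and the sample readings are hypothetical, not taken from the data in the comment.

```python
from datetime import datetime


def daily_consumption(readings):
    """readings: iterable of (timestamp, cumulative_wh) pairs at uneven intervals.

    Returns {date: Wh consumed}. If a whole day has no readings, the next
    difference spans more than one day and would need separate handling.
    """
    last_of_day = {}
    for ts, wh in sorted(readings):
        last_of_day[ts.date()] = wh  # later readings overwrite earlier ones
    days = sorted(last_of_day)
    return {day: last_of_day[day] - last_of_day[prev]
            for prev, day in zip(days, days[1:])}


# Hypothetical readings (Wh), irregularly spaced.
readings = [
    (datetime(2019, 1, 1, 1, 2), 100),
    (datetime(2019, 1, 1, 23, 58), 160),
    (datetime(2019, 1, 2, 0, 3), 161),
    (datetime(2019, 1, 2, 23, 55), 230),
    (datetime(2019, 1, 3, 23, 59), 290),
]
daily = daily_consumption(readings)
```

The same idea extends to weeks or months by keying on a coarser period instead of `ts.date()`.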

  4. Hi Charles,

    I have followed your website for a long time; thank you for your work, which I have found very useful. I have a question, though I don’t know whether it counts as time series analysis, because most of the discussion is about finance and forecasting.

    I have random wave data, shown below (sampled every 0.1 s). If I want to calculate the height of the wave on the rise and on the fall, and also the duration between the peaks, what techniques should I use?

    Thanks

    Gunawan

    See the data below
    10.0 -0.031
    10.1 -0.151
    10.2 -0.266
    10.3 -0.371
    10.4 -0.464
    10.5 -0.546
    10.6 -0.620
    10.7 -0.689
    10.8 -0.758
    10.9 -0.832
    11.0 -0.912
    11.1 -0.999
    11.2 -1.091
    11.3 -1.182
    11.4 -1.267
    11.5 -1.338
    11.6 -1.391
    11.7 -1.421
    11.8 -1.425
    11.9 -1.404
    12.0 -1.359
    12.1 -1.296
    12.2 -1.219
    12.3 -1.133
    12.4 -1.042
    12.5 -0.949
    12.6 -0.855
    12.7 -0.761
    12.8 -0.665
    12.9 -0.566
    13.0 -0.464
    13.1 -0.358
    13.2 -0.250
    13.3 -0.140
    13.4 -0.034
    13.5 0.068
    13.6 0.162
    13.7 0.246
    13.8 0.322
    13.9 0.388
    14.0 0.449
    14.1 0.506
    14.2 0.562
    14.3 0.618
    14.4 0.675
    14.5 0.732
    14.6 0.785
    14.7 0.832
    14.8 0.869
    14.9 0.893
    15.0 0.902
    15.1 0.896
    15.2 0.878
    15.3 0.852
    15.4 0.821
    15.5 0.792
    15.6 0.767
    15.7 0.748
    15.8 0.737
    15.9 0.730
    16.0 0.722
    16.1 0.708
    16.2 0.682
    16.3 0.638
    16.4 0.575
    16.5 0.491
    16.6 0.391
    16.7 0.279
    16.8 0.164
    16.9 0.054
    17.0 -0.042
    17.1 -0.117
    17.2 -0.169
    17.3 -0.195
    17.4 -0.198
    17.5 -0.183
    17.6 -0.154
    17.7 -0.120
    17.8 -0.085
    17.9 -0.055
    18.0 -0.031
    18.1 -0.015
    18.2 -0.004
    18.3 0.003
    18.4 0.010
    18.5 0.020
    18.6 0.034
    18.7 0.054
    18.8 0.077
    18.9 0.104
    19.0 0.131
    19.1 0.156
    19.2 0.178
    19.3 0.195
    19.4 0.207
    19.5 0.213
    19.6 0.213
    19.7 0.208
    19.8 0.196
    19.9 0.175
    20.0 0.142
    20.1 0.094
    20.2 0.029
    20.3 -0.055
    20.4 -0.158
    20.5 -0.278
    20.6 -0.411
    20.7 -0.550
    20.8 -0.687
    20.9 -0.816
    21.0 -0.929
    21.1 -1.020
    21.2 -1.088
    21.3 -1.132
    21.4 -1.155
    21.5 -1.164
    21.6 -1.163
    21.7 -1.159
    21.8 -1.158
    21.9 -1.160
    22.0 -1.166
    22.1 -1.172
    22.2 -1.173
    22.3 -1.161
    22.4 -1.129
    22.5 -1.072
    22.6 -0.986
    22.7 -0.870
    22.8 -0.726
    22.9 -0.561
    23.0 -0.383
    23.1 -0.198
    23.2 -0.017
    23.3 0.155
    23.4 0.312
    23.5 0.452
    23.6 0.575
    23.7 0.683
    23.8 0.778
    23.9 0.864
    24.0 0.944
    24.1 1.020
    24.2 1.092
    24.3 1.160
    24.4 1.222
    24.5 1.276
    24.6 1.320
    24.7 1.354
    24.8 1.376
    24.9 1.386
    25.0 1.385
    25.1 1.372
    25.2 1.347
    25.3 1.309
    25.4 1.256
    25.5 1.187
    25.6 1.100
    25.7 0.994
    25.8 0.870
    25.9 0.730
    26.0 0.578
    26.1 0.422
    26.2 0.270
    26.3 0.130
    26.4 0.011
    26.5 -0.083
    26.6 -0.146
    26.7 -0.181
    26.8 -0.190
    26.9 -0.179
    27.0 -0.157
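
A minimal sketch of one way to extract rise heights, fall heights, and peak-to-peak durations: locate strict local maxima and minima, then difference consecutive extrema. To keep the listing short it uses every fifth sample of the data above (0.5 s spacing), so the extrema are only approximations of those in the full 0.1 s series. The function names are hypothetical. Flat tops (two equal neighbouring samples) and noisy data are not handled and would need smoothing or a tolerance first.

```python
def local_extrema(times, values):
    """Return [(time, value, kind)] for interior strict local maxima and minima."""
    found = []
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        if cur > prev and cur > nxt:
            found.append((times[i], cur, "peak"))
        elif cur < prev and cur < nxt:
            found.append((times[i], cur, "trough"))
    return found


def wave_heights(extrema):
    """Rise = trough to next peak; fall = peak to next trough."""
    rises, falls = [], []
    for (_, v0, k0), (_, v1, k1) in zip(extrema, extrema[1:]):
        if k0 == "trough" and k1 == "peak":
            rises.append(v1 - v0)
        elif k0 == "peak" and k1 == "trough":
            falls.append(v0 - v1)
    return rises, falls


# Every fifth sample of the series above: t = 10.0, 10.5, ..., 27.0.
values = [-0.031, -0.546, -0.912, -1.338, -1.359, -0.949, -0.464, 0.068, 0.449,
          0.732, 0.902, 0.792, 0.722, 0.491, -0.042, -0.183, -0.031, 0.020,
          0.131, 0.213, 0.142, -0.278, -0.929, -1.164, -1.166, -1.072, -0.383,
          0.452, 0.944, 1.276, 1.385, 1.187, 0.578, -0.083, -0.157]
times = [10.0 + 0.5 * i for i in range(len(values))]

extrema = local_extrema(times, values)
rises, falls = wave_heights(extrema)
peaks = [t for t, _, kind in extrema if kind == "peak"]
peak_gaps = [b - a for a, b in zip(peaks, peaks[1:])]  # seconds between peaks
```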

      • Thanks Charles. Yes, it is random wave elevation, and the data set is quite massive. I need to get the upward and downward heights; there could be thousands of them in the data, so I thought I could do it with time series analysis. Thanks for your time.

  5. Hi Charles

    I am very new to statistics. I wanted to do a multiple regression analysis to predict what drives crop expansion but I only have data for 12 years which is not a sufficient number of observations. Is there an alternative approach I can take to test the drivers?

    TIA
    Michelle

    • Michelle,
      Perhaps 12 years of observations is not ideal, but if it is all that you have, then I would go with that. You can also look at the prediction interval, which will give you some idea of the accuracy of the forecasts obtained.
      Charles

  6. Hi Charles,

    All your work has been so helpful.

    I am trying to build a Markov regime-switching model of the stock market in Excel.

    Currently, I am using the Korean stock market index and trying to apply EM to estimate the parameters.

    But I am already stuck in there.

    Is there any advice for me?

    I am a beginner and got no clue how to start it.

    I know this is very excuse. But, I got no one to ask.

    Thank you. Have a great day.

  7. Hi Charles,
    I want to work on a time series dataset and, as I am a beginner, I want to follow a step-by-step strategy. I have started with a simple dataset of monthly mean sunspot numbers (from 1749 to 2022) that has only two attributes (date and monthly mean):
    Date Monthly Mean Total Sunspot Number
    1749-01-31 96.7
    1749-02-28 104.3
    1749-03-31 116.7
    1749-04-30 92.8
    1749-05-31 141.7
    1749-06-30 139.2
    1749-07-31 158
    1749-08-31 110.5
    1749-09-30 126.5
    1749-10-31 125.8
    1749-11-30 264.3
    1749-12-31 142
    1750-01-31 122.2
    1750-02-28 126.5
    1750-03-31 148.7
    1750-04-30 147.2
    1750-05-31 150
    :
    :
    :

    Can you please tell me which statistical and forecasting methods are appropriate for this dataset?

    • Anjali,
      There isn’t a simple answer to your question. There are many techniques for creating a forecast.
      I suggest that you start by plotting the data to see whether there is a pattern. The pattern (or lack of a pattern) will suggest the approaches to use (or try).
      Charles

  8. Hi Charles,

    Most forecast models consider trend, seasonality, and level. Are there any other parameters or factors that I should consider if I am building a custom forecasting model?

    • There are many options for creating forecasts. For the models that you are alluding to there is also “damping”.
      There are other models, including ARIMA, SARIMA, etc.
      Charles
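
To illustrate how “damping” sits alongside level and trend, here is a minimal sketch of Holt's linear method with a damped trend (no seasonal component). The smoothing constants, the initialization (level = first value, trend = first difference), the function name, and the sample series are assumptions for illustration, not the settings used by any particular tool.

```python
def damped_holt(y, alpha, beta, phi, horizon):
    """Damped-trend exponential smoothing.

    level_t = alpha*y_t + (1-alpha)*(level_{t-1} + phi*trend_{t-1})
    trend_t = beta*(level_t - level_{t-1}) + (1-beta)*phi*trend_{t-1}
    forecast(h) = level + (phi + phi**2 + ... + phi**h) * trend
    """
    level, trend = y[0], y[1] - y[0]  # simple initialization (an assumption)
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (prev_level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    forecasts, damp_sum = [], 0.0
    for h in range(1, horizon + 1):
        damp_sum += phi ** h
        forecasts.append(level + damp_sum * trend)
    return forecasts


# Hypothetical series with a steady upward trend.
series = [12.0, 13.5, 15.1, 16.2, 18.0, 19.4]
undamped = damped_holt(series, alpha=0.5, beta=0.3, phi=1.0, horizon=4)
damped = damped_holt(series, alpha=0.5, beta=0.3, phi=0.8, horizon=4)
```

With `phi = 1` this reduces to the ordinary Holt trend method; with `phi < 1` the forecast increments shrink at each step, so long-horizon forecasts flatten out.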

  9. Hi Charles,

    Firstly, thanks so much for this resource it is greatly appreciated. I love Excel and it’s fantastic to see what you have been able to make it do.

    I was wondering if you could advise a suitable method for my problem. I have quarterly fuel consumption data for a company, spanning the last 10 years, and want to predict the expected consumption by 2050. For example, the data might be:

    Date, Consumption (kWh)
    2010 Q1, 600000
    2010 Q2, 550000
    …, …
    2020 Q1, 400000

    There will of course be a lot of uncertainty in any prediction, but is there any regression method you recommend I use? I’m familiar with Simple/Multiple but understand that in this context the assumptions one must make are not quite correct. I have been reading thoroughly through your Time-Series Analysis articles, and there seem to be a lot of methods, but I’m struggling to pinpoint the one I might need.

    I really appreciate it.

    • Robert,
      You should create a chart of your data to see whether there is some pattern (trend, seasonality, etc.). What approach to use depends on what you see. You can try Holt-Trend (or Holt-Winters) or ARIMA. All these approaches are described on the Real Statistics website.
      Charles

  10. Mr. Charles,

    I’m fresh out of college and now working for a small business where I am the only data analyst. As a result, I’m stuck on a problem with the data I’m working on right now, which I believe time series analysis can solve. However, the prediction is not accurate. I’m not sure if you can take a look at what I did and guide me. I would much appreciate your help.

    I hope to hear from you soon,
    Julia

  11. Hello Charles, I would like to know which model would be suitable for forecasting air passenger traffic post-pandemic.

    • Hello Nandini,
      I am not able to give a simple answer to this question. The model to use depends on a number of factors and a suitable response would require a lot more information.
      Charles

  12. Hi Charles, I am trying to create a model that forecasts the demand for a commodity, in this case manganese. What model would you suggest for this forecast?

  13. Hello,

    I am analyzing animal data and I’ve never done time series analyses before. I have 12 animals, six in one group and six in another. I measured the time it takes them to get food over the course of six hours. I’m trying to see if Group 1 animals get to the food faster than Group 2 animals. What type of analysis would you recommend?

    -May

    • Hello May,
      If the data in each group are normally distributed, then you should be able to use a two independent sample t-test. No time series analysis is needed.
      Charles
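
A minimal sketch of the two independent sample t-test statistic suggested above, assuming equal variances (pooled version) and one summary value per animal. The data values and the function name are hypothetical. The sketch returns only the statistic and degrees of freedom; the p-value would come from a t distribution table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group1, group2):
    """Return (t statistic, degrees of freedom) for the pooled-variance t-test."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / se, df


# Hypothetical mean time-to-food (minutes), one value per animal.
group1 = [4.1, 3.8, 5.0, 4.4, 3.9, 4.6]
group2 = [5.2, 4.9, 6.1, 5.5, 5.0, 5.8]
t_stat, df = pooled_t(group1, group2)
```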

  14. Hi Charles,
    I am about to start work on estimating wait times in health care services using time series models. Which forecasting model would you advise me to use (and how should I choose it), and do you have an example, please?

    Thank you in advance

  15. Hello Charles:
    I have some raw pressure data which is very choppy. I’m trying to do an interference analysis to look for any offset activity disturbing the current system (which would show as a deflection in the “derivatives? (maybe)” of the original pressure curve being recorded).
    However, with a choppy raw pressure curve, plotting derivatives of that curve is out of the question. I found some methods called LOESS etc. (also listed on Wikipedia’s smoothing functions page) that smooth out the curve but still “maintain” the integrity of the data. I did not find any of those smoothing functions on your pages for Excel. I went through every method on your site, and most methods predict a curve that lies under the original curve (in magnitude), or, if they are in the range of the original curve, the result is still choppy. Any suggestions? Thank you, Charles. Enjoying your website!

  16. Hello Charles,

    I am using your tool to validate an instrument, with the Forecast_Error function. Where can I find the meaning of each of the variables this function computes? Most of them are well known, but I can’t find on your website the meaning of u1 and u2 or the formulas used to compute them.

    Thanks very much,
    Gabriel Delgado

      • Thanks very much Charles,
        I want to compare two devices that measure angular velocity (one is the device I want to validate and the other is the gold standard). They collect 128 samples per second, so I have two almost perfectly synchronized signals. Which measure of validity do you recommend? I am using RMSE…

        Best regards,
        Gabriel

          • Hello again Charles,

            Thanks very much. I have also computed Lin’s CCC and Bland-Altman. One question: in my case, the residuals do not fit a normal distribution, so I think Bland-Altman is not adequate. Can I simply plot the residuals to analyze the bias?

            Thanks very much,

            Gabriel

          • Bland-Altman does require that the residuals be normally distributed, but if the residuals are not very skewed the results should generally be pretty good.
            Charles
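
The three agreement measures discussed in this thread (RMSE, the Bland-Altman bias with limits of agreement, and Lin's concordance correlation coefficient) can be sketched as follows for two synchronized signals of equal length. The function name and argument names are hypothetical. The limits of agreement use the usual bias ± 1.96 standard deviations, which presumes roughly normal differences, and the CCC uses population (1/n) variances and covariance.

```python
from math import sqrt
from statistics import mean, pvariance, stdev


def agreement(reference, device):
    """Agreement measures between a gold-standard signal and a device signal."""
    n = len(reference)
    diffs = [d - r for r, d in zip(reference, device)]
    rmse = sqrt(sum(e * e for e in diffs) / n)
    bias = mean(diffs)                   # Bland-Altman mean difference
    half_width = 1.96 * stdev(diffs)     # assumes roughly normal differences
    mr, md = mean(reference), mean(device)
    cov = sum((r - mr) * (d - md) for r, d in zip(reference, device)) / n
    ccc = 2 * cov / (pvariance(reference) + pvariance(device) + (mr - md) ** 2)
    return {"rmse": rmse, "bias": bias,
            "loa": (bias - half_width, bias + half_width), "ccc": ccc}
```

For example, a device that reads exactly one unit above the reference at every sample has a bias of 1 and an RMSE of 1, and its CCC falls below 1 even though the two signals are perfectly correlated.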

  17. Hi Charles,

    Thank you so much for your website! It is fantastic!
    I’m curious as to why you seem to have skipped over Mincer-Zarnowitz in forecast evaluation? Any particular reason?

    Cheers,

    Reply
    • Hello Dario,
      Glad you like the website.
      I have not included Mincer-Zarnowitz yet since it is not in the textbooks that I have consulted and no one has requested it.
      I am adding capabilities all the time and will include this in the list of potential future enhancements.
      Charles

  18. Thank you for the fantastic work you have done with Time Series. I would really like to benefit from anything you can publish to help me understand the following:

    1. Generalised Method of Moments (GMM), when to use it, etc.
    2. Tips about various methodologies concerning robustness checks in econometrics
    3. Data handling in econometrics
    4. Model transformation in the case of heteroscedasticity. My understanding is that it is the data that is transformed. For instance, suppose you have a cross section with two variables X and Y, Y is regressed on X, and there is heteroscedasticity. If the values of X are (3,5,6,8,9) and the values of Y are (5,9,6,8,4), please explain, using the data, how the model Y = a + bX + e is transformed in this case.

    Again thank you for what you do.

    Regards
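
As one hedged illustration of item 4 above: if we assume (and this is only an assumption, since the comment does not say what form the heteroscedasticity takes) that the error variance is proportional to X², then dividing the model Y = a + bX + e through by X gives Y/X = a(1/X) + b + e/X, whose error term has constant variance. Ordinary least squares on the transformed data then estimates a as the slope and b as the intercept. The function names are hypothetical.

```python
def ols(x, y):
    """Simple least squares; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def wls_by_x(x, y):
    """Estimate a and b in Y = a + bX + e, assuming Var(e_i) is proportional to X_i**2."""
    y_star = [yi / xi for xi, yi in zip(x, y)]   # transformed response Y/X
    z = [1 / xi for xi in x]                     # transformed regressor 1/X
    b, a = ols(z, y_star)  # intercept of the transformed fit is b, slope is a
    return a, b


# Data from the comment above.
X = [3, 5, 6, 8, 9]
Y = [5, 9, 6, 8, 4]
a_hat, b_hat = wls_by_x(X, Y)
```

A different variance assumption (for example, variance proportional to X rather than X²) would call for dividing by a different quantity, such as the square root of X.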

  19. Hi Charles!

    Have a quick question, I have three different matrices that have different time series (1938-1944, 1944-1953 and 1953-1965) and I am trying to do a log-linear analysis on it to make sure the results are comparable. Any advice on how to approach this?

    Cheers,
    Clinton

  20. Hi Charles,

    I am a little confused about plotting rolling statistics. Can you please point me to the relevant topic in this time series analysis section? As you know, this is a way of checking stationarity.
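
A minimal sketch of what rolling statistics compute, assuming a fixed window length: the mean and sample standard deviation of each consecutive window. If either drifts noticeably over time when plotted, that is an informal sign of non-stationarity; it is not a formal test. The function name, window length, and sample series are hypothetical.

```python
from statistics import mean, stdev


def rolling_stats(series, window):
    """Return (rolling means, rolling sample standard deviations)."""
    means, stds = [], []
    for end in range(window, len(series) + 1):
        chunk = series[end - window:end]
        means.append(mean(chunk))
        stds.append(stdev(chunk))
    return means, stds


# Hypothetical trending series: the rolling mean climbs, so the mean is not constant.
trending = [t + (1 if t % 2 else -1) for t in range(12)]
roll_mean, roll_std = rolling_stats(trending, window=4)
```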

  21. Dear Mr Charles,
    Is there any way to forecast cash outflow based on time series data? For example, I’d like to make a projection of cash outflow in 2018 based on the time series of cash disbursements from 2014 to 2017.

    Thanks in advance.

  22. Hi Charles,

    I use your RealStats add-in for Excel. For school we use a time series analysis book by Rob J Hyndman.
    I was comparing the coefficients from RealStats with the ARIMA coefficients from RStudio. For RStudio, I use the ‘fpp2’ package by Rob J Hyndman.
    With the exact same dataset the coefficients are different.
    I was wondering why they are different. Is it because RealStats uses Solver in the background to estimate the coefficients? Or is it because R uses different algorithms?
    Also, with models like ARIMA(1,1,1) the coefficients are almost the same as those in R, but with a model like ARIMA(3,1,3) the coefficients are very different.

    Greetings,
    John

    • John,
      Thanks for identifying this. The Real Statistics add-in uses two approaches for estimating the ARIMA coefficients: one via Solver and another that is iterative. In test examples, the estimates agreed with R.
      Can you send me an Excel file with your data and the results you got from R? I will then try to figure out what is going on.
      Charles

  23. Hi Charles,

    Very nice blog.
    I was wondering whether you could help me understand lag removal in time series analysis. I am dealing with time series data that has multiple parameters. I understand that we need to remove lag before any modeling is performed.

    Thanks
    Adi

    • Mohammed,
      No I haven’t. I expect to publish the first of a series of books shortly. I plan to publish a book on time series analysis as well, but that won’t happen this year.
      Charles

