Author

Dr. Charles Zaiontz has a PhD in mathematics from Purdue University and has taught as an Assistant Professor at the University of South Florida as well as at Cattolica University (Milan and Piacenza) and St. Xavier College (Milan).

Most recently he was Chief Operating Officer and Head of Research at CREATE-NET, a telecommunications research institute in Trento, Italy. He also worked for many years at Bolt Beranek and Newman (BBN), one of the most prestigious research institutes in the US, and is widely credited with implementing the Arpanet and playing a leading role in creating the Internet.

Dr. Zaiontz has held a number of executive management and sales management positions, including President, Genuity Europe, responsible for the European operation of one of the largest global Internet providers and a spinoff from Verizon, with operations in 10 European countries and 1,000 employees.

He grew up in New York City and has lived in Indiana, Florida, Oregon, and finally Boston, before moving to Europe 36 years ago where he has lived in London, England and in northern Italy.

He is married to Prof. Caterina Zaiontz, a clinical psychologist and pet therapist who is an Italian national. In fact, it was his wife who was the inspiration for this website on statistics. A few years ago she was working on a research project and used SPSS to perform the statistical analysis. Dr. Zaiontz decided that he could perform the same analyses using Excel. To accomplish this, however, required that he had to create a number of Excel programs using VBA, which eventually became the Real Statistics Resource Pack that is used in this website.

487 thoughts on “Author”

Masa

September 23, 2020 at 8:40 pm

Dear Charles,

Let me first thank you for your great website and the very useful software, which I have long appreciated very much.

Now I have been experiencing a difficulty in conducting a logistic regression analysis with the Real Statistic software, which is possibly a bug in the program.

When I choose an appropriate input range with summary data and click the “OK” button, a message saying “Input rage must have at least as many data rows as columns” appears.

This is understandable if the data is raw data. However, if it is summary data, columns can surely exceed the rows, for example when your model is interaction model and contains many product (interaction) terms.

In particular, my model was, using Real Statistic Function,
= LogitSelect (R1, “1, 2, 3, 4, 5, 6, 7, 1*7, 2*7, 3*7, 4*7, 5*7, 6*7”, True)
Is there something wrong with this interaction model?

I think this message should appear only when the data was raw data, and should not appear when it was summary data.
But there may be misunderstandings on my part.

In any case, I would appreciate your kind advice.

Best regards,
Masa
Reply
- Charles
  
  September 30, 2020 at 5:17 pm
  
  Hello Masa,
  I have just issued a new release of the Real Statistics software, Rel 7.3.1, that eliminates the error message. You should now be able to use the logistic regression data analysis tool. I am not sure whether the model will converge to a solution, so I would appreciate your letting me know whether you did get a solution.
  Charles
  Reply
Ricardo Jaimes

September 18, 2020 at 8:59 pm

Hello Dr. Zaiontz,

I followed your Mann-Kendall Test instruction page (https://www.real-statistics.com/time-series-analysis/time-series-miscellaneous/mann-kendall-test/) but I am having trouble figuring out the equation you used to calculate the ties corrections. Do you have any references I can look up? I have looked many places but have not found how to calculate the ties corrections. I am curious to know if the correction equation you used is specific to the data set and how I apply it to my own data set.

Thank you,
Reply
- Charles
  
  September 18, 2020 at 10:49 pm
  
  Hello Ricardo,
  Here is a reference:
  Gocic, M. and Trajkovic, S. (2012) Analysis of changes in meteorological variables using Mann-Kendall and Sen’s slope estimator statistical tests in Serbia. Elsevier
  https://www.academia.edu/6955354/Trend_Analysis_MK_Sen_Slope
  Charles
  Reply
  - Abhishek Kabiraj
    
    October 12, 2023 at 10:20 pm
    
    Hi Sir,
    
    While searching “cox ph approach in excel” in the internet, I came across your example which computes the survival probabilities using cox ph partial likelihood method using excel. I have found it intuitive and really very helpful. I also validated it using R code (breslow approach under surv) and it matched.
    
    I have a similar dataset, with one categorical causal variable only, “Product”. I created a dummy variable which takes 1 when product is of a particular type and 0 otherwise. Then I tried to use the similar approach in excel. However I didn’t get correct match while checking the result in R.
    
    I have shared the dataset with you over email. Please could you advise me how could I handle this example.
    
    Thank you
    Reply
Patricio

September 11, 2020 at 6:21 am

Thanks Professor, incredible website!
Reply
Cesar Parra

August 10, 2020 at 4:32 am

Thank you very much Dear Charles, best regards from Chile
Reply
GWANGYONG CHOI

August 6, 2020 at 6:21 am

Instead of clicking CTL+m and putting the information of my data rage in the pop-up window for the time series analyses, I put “SEN_SLOPE(my data range)”in an excel cell below of my data array and “MK_TEST(my data range) in another cell. I have 9,000 time series data and would like to get the Sen’s slope value and its significance(p-value) by dragging these two function columns. This method gave certain values but they seem to not be the correct Sen’s slope values or their p-values either. How can I get correct trend and its significance values for multiple cases in the excel using your software?
Reply
- Charles
  
  August 6, 2020 at 8:44 am
  
  If you email me an Excel file with your data and results, I will check to see whether or not you have correct results.
  Charles
  Reply
  - GWANGYONG CHOI
    
    August 7, 2020 at 2:46 am
    
    I email you with my data file referring to your contact info. in this web site. Once again, i would like to calculate Sen’s slope and its p-value from MK Test for more than 9000 cases of time series data. Thus, if available, I want to drag the one excel cell with the function (Sen_Slope & MK_Test) from your software to apply for all. I will wait for your quick reply to my email.
    
    Many thanks
    Gwangyong Choi
    Reply
Mauro

July 27, 2020 at 2:42 am

Gentile professor Zaiontz, chi le scrive non è in grado di capire una sola parola si statistica.
Succede però che grazie alla sua pagina web, al suo pacchetto software e al suo splendido lavoro, sto realizzando in maniera autonoma l’analisi dei dati della mia tesi di master. Le assicuro che per me è una grande conquista e per averla resa possibile la ringrazio infinitamente.
Tempo fa ho trovato la pagina di Real Statistics per caso e ho realizzato alcune analisi per comprendere come funziona il software. Oggi ho ripreso in mano la tesi dopo quattro mesi di inattivitá e con gran sorpresa ho letto la sua bibliografía che prima avevo ignorato. Sono anch’io di Trento, per ragioni familiari vivo all’altro capo del mondo e mi fa grande piacere scoprire che ho scelto il suo lavoro per compiere il mio.
In questo messaggio non troverà argomenti di statistica ma semplicemente questo piccolo ringraziamento e i miei complimenti.

Mauro Brunelli
Reply
- Charles
  
  July 27, 2020 at 6:44 pm
  
  Ciao Mauro,
  I am very pleased with your comment. I am very happy that I was able to help you.
  Mi fa molto piacere il tuo commento. Sono molto felice di essere stato in grado di aiutarti.
  Charles
  Reply
Jürgen Graf

July 24, 2020 at 2:20 pm

Dear Charles,
I found a little mistake in Figure 2 – REGWQ test. As there is no Response section, I didn’t know where to put it. α(p) is not adjusted for the second stage, meaning V8 and W8. They should be 0.040204.
Jürgen
Reply
- Jürgen Graf
  
  July 24, 2020 at 2:29 pm
  
  same is true for the results table when conducting the REGWQ test in Excel. I just wonder, if the calculation of q(crit) is correct, as it’s using α(p).
  Jürgen
  Reply
- Charles
  
  July 24, 2020 at 3:01 pm
  
  Dear Jürgen,
  The formula in cell V8 is =IF(V6 Reply
  - Jürgen Graf
    
    July 24, 2020 at 4:17 pm
    
    Some parts of your reply are missing.
    Jürgen
    Reply
    - Charles
      
      July 28, 2020 at 9:36 pm
      
      Hello Jürgen,
      What parts are missing? Are you referring to the calculation of q(crit)? My previous response should also cover this.
      Charles
      Reply
      - Jürgen Graf
        
        August 5, 2020 at 4:16 pm
        
        Well I only see this as your reply:
        Dear Jürgen,
        The formula in cell V8 is =IF(V6
      - Charles
        
        August 6, 2020 at 10:03 pm
        
        The formula in cell V8 is
        =IF(V6
CR

July 21, 2020 at 6:20 pm

Nice life story Charles … do you have a Linkedin account?
Reply
- Charles
  
  July 21, 2020 at 8:30 pm
  
  Yes, I have a LinkedIn account
  Charles
  Reply
Jasmine

July 19, 2020 at 3:54 am

Hello, have a great day! I just want to ask about forecasting method. What could be the best method to use in the research paper if you have gathered annual data? Is the Holt’s Winters Method not applicable to it? Why? Hoping for your response. Thank you! 😇
Reply
- Charles
  
  July 19, 2020 at 10:12 am
  
  Hello Jasmine,
  This depends on the details about the annual data. Is there seasonality? Is there an upward or downward trend?
  Charles
  Reply
David

July 3, 2020 at 11:39 am

Dear Charles,
Firstly, thank you for your posts – they provide useful insight to statistics for an amateur such as myself.
I have a question of what test would be best used for my problem:
I have 46 patients and have identified different patient factors (e.g. age, gender, bone density, tissue density, etc) and each patient is administer ultrasound at increasing powers repeatedly with tissue temperatures recorded with each application of ultrasound until the therapeutic effect is achieved.
Unfortunately the ultrasound powers are not exactly the same for each patient, and some patients require more episodes of treatment to achieve the therapeutic effect.
Is there a way to assess which of the patient factors (e.g. age, gender, tissue density) has an effect on the power required to reach the resulting temperature on each application of ultrasound?
I thought a repeated measures ANOVA, but seek your advice to be confident I’m on the right track.
Regards,
David
Reply
- Charles
  
  July 5, 2020 at 10:35 am
  
  Hello David,
  If I understand correctly, age, gender, bone density, tissue density, etc. are the independent variables that you are interested in. The power required to reach the desired temperature appears to be the dependent variable. This looks like an application of regression. If you are also interested in the number of treatments required then you can use Poisson regression for this.
  Charles
  Reply
Cristian

June 20, 2020 at 12:22 am

Just discovered this amazing tool…it’s just awesome what you have created here and all for free….it is soooo helpful!
Reply
Sap hana

May 5, 2020 at 3:37 pm

So interesting
Reply
Jaime Nunez

April 25, 2020 at 8:17 pm

Hi Charles:
Your resource is great, but I am not sure how to carry out an equivalence test.
We are testing whether two dental procedures are equivalent (implants).

Thanks,

Jaime Núñez
Reply
Renee

April 8, 2020 at 4:55 pm

The tolerance calculations were very helpful. How would you perform the calculation if your data isn’t normally distributed.
Reply
- Charles
  
  April 9, 2020 at 9:14 am
  
  Hello Renee,
  Currently, Real Statistics doesn’t support this capability, but I do plan to add it soon. In any case, here is a reference on how to make such calculations:
  https://www.jstatsoft.org/article/view/v036i05/v36i05.pdf
  Charles
  Reply
Andrew Tomaino

April 7, 2020 at 7:50 pm

Hi Charles,

I found your tutorial on how to apply cubic splines using Excel very useful, as it is advantageous to use Excel versus something like MatLab to perform these operations, especially due to accessibility and price.

https://www.real-statistics.com/other-mathematical-topics/spline-fitting-interpolation/

Would you happen to be able to publish an addendum to this tutorial that covers examples and applications of the smoothing cubic spline function that utilizes a weighting parameter? I believe Ridge regression is commonly used as an analogy here.

Thank you!
Reply
- Charles
  
  April 8, 2020 at 9:42 am
  
  Hello Andrew,
  I will look into this. Can you give me a source for this approach?
  Charles
  Reply
  - Andrew Tomaino
    
    April 8, 2020 at 2:16 pm
    
    Outstanding!!
    
    It would be beneficial to see Ridge analysis applied to smoothing the cubic spline method example, such that we could vary the smoothing parameter (penalty value) and see the impact that has on the fit. Furthermore, it would be great to be able to use the Ridge analysis to solve for the optimal smoothing parameter using cross validation.
    
    As an overview, below is a hyperlink to Dr. Liang’s (from Duke University) statistics lecture on the topic.
    
    https://stat.duke.edu/courses/Spring06/sta293.3/topic5/spline.pdf
    Reply
    - Andrew Tomaino
      
      April 8, 2020 at 2:20 pm
      
      Also, here is a hyperlink to a nice lecture on the smoothing spline approach:
      
      https://www.stat.cmu.edu/~ryantibs/advmethods/notes/smoothspline.pdf
      Reply
      - Charles
        
        April 9, 2020 at 7:05 pm
        
        Thanks for sending this link to me.
        Charles
    - Charles
      
      April 9, 2020 at 7:04 pm
      
      Hello Andrew,
      Thanks for sending me these links. I will look into this. Currently, Real Statistics has the following support for Ridge regression and spline interpolation:
      Ridge Regression
      Spline Fitting and Interpolation
      Charles
      Reply
Julia

April 4, 2020 at 3:09 pm

Dear Mr Zaiontz,

many thanks for providing such a great statistical tool! It is a pleasure working with it!

I am trying to run a weighted linear regresssion (the explanatory variables are the dow jones returns (and 2 dummy variables, in oder to capture pre- and post-event returns and the dependent variable are the sugar returns). The weights have been assigned by using the reciprocal of the conditional variances that i have estimated using a GARCH(1,1).

Unfortunately, I constantly get the error message “division by 0” when I am trying to run the regression.

Could you please give an advice on what is possibly going wrong?

Thank you very much in advance! Hope to hear from you soon!

Kind regards,
Julia
Reply
- Charles
  
  April 4, 2020 at 6:27 pm
  
  Hello Julia,
  If you email me an Excel file with your data and results, I will try to figure out what is going wrong.
  Charles
  Reply
Jonah Anton

January 15, 2020 at 12:21 am

Hi Charles,
I have data for trials conducted to evaluate five potato varieties across three sites over two seasons and decided to do pool analysis. I had done the homogeneity test for seasons and had no significant differences and I decided do pooled analysis. Is it right for me to do the pool analysis for the two factors (variety and site) if there is no significant differences in two seasons?
You assistance is very much needed.
Can you send me your email address? My email address is jonahanton986@gmail.com
Regards,
Jonah
Reply
- Charles
  
  January 15, 2020 at 8:28 am
  
  Jonah,
  I would need to know more about what hypothesis you are trying to test in order to answer your question.
  You can find my email address at Contact Us
  Charles
  Reply
  - Jonah Anton
    
    January 15, 2020 at 9:15 am
    
    Charles
    thank you for your respond. I will contact you through the email.
    
    Regards,
    Jonah
    Reply
AZ

January 14, 2020 at 4:00 pm

Thanks for your informative explanations

I have a question

Which statistical test should I use when the independence assumption is violated?

Many thanks
Reply
- Charles
  
  January 14, 2020 at 5:13 pm
  
  It depends on what hypothesis you are trying to test, but generally it is difficult to conduct a valid test if the independence sample assumple is violated.
  Charles
  Reply
  - AZ
    
    January 14, 2020 at 8:14 pm
    
    My objective is to evaluate the significance of differences in robustness measure, which require a statistical test.
    
    The robustness measure used as follows.
    
    I am working with different linear regression models and many datasets.
    
    First, I standardised all the variables (independent/dependent) to zero mean and unit variance.
    
    Suppose I am working with the linear regression model. Then I performed 30-fold split for the dataset. So, I have coefficients for each fold. I calculated the variance for each variable within 30 models. Finally, I sum all the variances.
    
    For example, I have 30 coefficients for a variable (X1), then I calculate the variance for 30 coefficients and the same for all the remaining variables and Finally, I sum all the variance in one total value.
    
    I did this process with different models and datasets. So I end up with a matrix contains the Sum of variances values (its rows refer to linear models used and its columns for datasets used).
    
    I need to use a statistical test to evaluate the significance of differences in robustness (sum of variance value).
    
    Any suggested statistical test?
    
    Your guidance is really appreciated!
    Reply
DB

January 10, 2020 at 3:22 pm

Dr. Z,

Thank you so much for all your work in creating the valuable resource that is this website.

I am trying to convert 3 data points, namely the mode, 5th percentile and 95th percentile, into a Beta distribution. What is the most efficient way in Excel to obtain the Alpha and Beta from those 3 data points? Can it be done without an iterative process?

If you prefer, this question can be moved to one of the pages dealing with Beta distributions.

Many thanks,

DB
Reply
- Charles
  
  January 11, 2020 at 9:28 am
  
  I suggest that you use Solver as follows:
  1. Insert the values for the mode, 5th percentile and 95th percentile in cells A1, A2 and A3.
  2. Insert the initial guesses (say 2 and 2) in cells A4 and A5
  3. Insert the formulas for the mode, 5th percentile and 95th percentile based on the alpha and beta values in cells B1, B2 and B3. Namely, insert the following formulas in these cells: =(A4-1)/(A4+A5-2), =BETA.DIST(0.05,A4,A5,TRUE) and =BETA.DIST(0.95,A4,A5,TRUE)
  4. Insert an error measurement in cell A6, namely the formula =SUMXMY2(A1:A3,B1:B3). This is the sum of the squared errors, the value we want to minimize.
  5. Now select Solver from the Data ribbon. In the dialog box that appears, insert A6 in the Set Objective field, choose Min and insert the range A4:A5 in the By Changing Variable Cells field. After clicking on the Solve button, estimates for alpha and beta in cells A4 and A5 should be obtained.
  Note: The formula in cell B1 for the mode is only applicable when alpha and beta are larger than one. The necessary modifications are not that difficult. Things are easier if you use the mean instead of the mode since the formula in cell B1 becomes =A4/(A4+A5) in all cases.
  Charles
  Reply
  - DB
    
    January 12, 2020 at 5:15 pm
    
    Thank you so much. I’ve used Solver for regressions before but never knew about the SUMXMY2 function which does away with helper columns.
    
    I’m having some difficulty with the solution, and I think the issue lies in the difference between my raw data and the 0 to 1 scale. Maybe we need to solve for [A] and [B] as well?
    
    What is the correct solution where the raw data is as follows:
    mode = 1.00
    5th percentile = 0.96
    95th percentile = 1.08
    
    Thanks again!
    Reply
    - Charles
      
      January 12, 2020 at 9:41 pm
      
      In the version of the beta distribution that I am using, the x values must be between 0 and 1, and so the 95th percentile can’t equal 1.08.
      As explained at https://real-statistics.com/binomial-and-related-distributions/beta-distribution/ there is a 4-parameter version of the beta distribution where x takes values between a and b. In this case, you either need to specify a and b or supply more data so that these values can be estimated.
      Charles
      Reply
Mohammad Al-Sa'di

December 9, 2019 at 4:55 pm

my questions is if i am doing a forecast for daily data and i have actual data for 4 previous years lets say 2019,2018,2017,2016
what is the year that i can start get forecast values for so that i can evaluate the model withe the error measurements ?
and thanks
Reply
- Charles
  
  December 10, 2019 at 8:43 am
  
  I don’t know of a definite answer to your question; this is a judgement call. You can base the model on years 2016, 2017 and 2018 and check its accuracy based on 2019.
  Charles
  Reply
Sylvester Taylor

December 9, 2019 at 1:45 am

I just downloaded this add-in to Excel. I can’t thank you enough for this tool. This a phenomenal resource and you sir are the dude. The Dude abides.
Reply
- Charles
  
  December 9, 2019 at 7:57 am
  
  Hello Sylvester,
  This is the first time anyone has given me the Big Lebowski accolade. Thank you.
  Charles
  Reply
Mohammad Al-Sa'di

December 9, 2019 at 12:06 am

Dear Dr. Charles
i have done all the mathematical equation that is needed to forecast using SARIMA model and all things worked good for me but i need to ask you how i can calculate the mean percentage error “MAPE” for this method as it gives me the forecast for the next period that i don’t know the real “actual” data for it , but to calculate the MAPE to compare these method to other methods i need the forecast for these periods.
Can you help me please ?
Reply
- Charles
  
  December 9, 2019 at 8:07 am
  
  See
  Time Series Forecast Errors
  Charles
  Reply
  - Mohammad Al-Sa'di
    
    December 9, 2019 at 12:36 pm
    
    thank you
    but i already know the equations to calculate the MAPE
    but the proplem is with the error
    there is no forecast data for the period that have actual data
    it just give me a forecast for the net periods that don’t have actual data
    SARIMA method
    Reply
    - Charles
      
      December 10, 2019 at 8:41 am
      
      Mohammad,
      If you don’t have actual data, you won’t be able to calculate errors. Sometimes the model is based on part of the data and then the rest of the data is used to determine the quality of the model since in this case there is some actual data left and so errors can be calculated using the forecasted values vs actual data.
      Charles
      Reply
Blake H.

December 8, 2019 at 4:54 am

Dr. Zaiontz,

I can’t thank you enough for the work you’ve put into this site. I can never know effort you made over the years of learning and dedication it took to become an expert in a topic many, including myself, find extremely challenging, but to do all of it and then have such passion to share what you know and guide people along their own journey with statistics that you made (and maintain!) a resource like this site shows that you truly care about what you do, and I love that! It’s infectious! Thanks for getting me through some of the toughest classes in my undergrad and for giving me a passion for stats!
Reply
- Charles
  
  December 8, 2019 at 8:53 am
  
  Thank you very much, Blake. I am very pleased that I could help you.
  Charles
  Reply
Nizar Shbikat

November 27, 2019 at 6:35 pm

I want to thank all the people behind this website for straightforwardly explaining statistics and providing easy-to-follow examples using Excel.
Reply
- Charles
  
  November 28, 2019 at 6:57 am
  
  Thank you, Nizar.
  Charles
  Reply
  - Richard Gibbons
    
    December 1, 2019 at 8:37 pm
    
    Hello Charles, I just wanted to say thank you for the tremendous website. The amount of analysis and work you have put into this site are amazing. In 1994 when I started an ISP (with 9600 baud modems) the one thing I hoped for most for the burgeoning Internet was that people would begin to communicate and share all manner of information, and they would do it readily and freely. That we could all learn from each other. For a while that idea held promise, but unfortunately not for long.
    Your work and your website are truly examples of that original idea from so long ago. Your willingness to help are what could still form the backbone of the Internet. I was dubious and skeptical at first, but I was surprised and extremely gratified to find your site. I use your site regularly and you give me hope for the Internet. Please don’t stop doing what you are doing. Thank You Very Much, Rich Gibbons
    Reply
    - Charles
      
      December 1, 2019 at 10:51 pm
      
      Hello Richard,
      Thank you very much for very kind remarks.
      I understand very well where you are coming from. I worked with many of the people who were involved in the Internet from very early days and they had very high hopes for this new frontier, some of which were realized and some of which unfortunately were not.
      Charles
      Reply
    - Blake H.
      
      December 8, 2019 at 5:00 am
      
      Rich,
      Though I haven’t started an ISP, I also use this site almost daily. Dr. Zaiontz’ posts, resources, and explanations helped break down the walls I had built around myself that said, “You’re not a numbers person”, “You’re just not good at math”, and “It’s too hard; just quit”, to the point where I went from not having taken a math course since 10th grade in high school to deciding to go for a M.S. in Data Analytics! I’m glad to hear others are getting as much out of it as I do, and I hope Dr. Zaiontz reads this and knows he has changed the course of my life because of his work here (and in all his other contributions, obviously!).
      – Blake
      Reply
Paul G

November 23, 2019 at 8:13 am

Hello Charles,
Thanks for creating and posting all of this information! It was very useful and I’ve recommended the site to my students! People appreciate your work!
Paul
Reply
ProfTB

November 7, 2019 at 8:14 pm

Dr. Zaiontz,

Please help. My dissertation is at a stand-still. I am scheduled to graduate in March 2020.

I intended to use chi-square (Fishers Exact) but was unable to obtain a high enough survey response rate, which yielded a 17% margin of error/confidence interval, at 95%CL. My committee insists I either resurvey or choose a different method due to the CI being so “high”.

I have: 9 IV, 1 DV. Total population:500. Survey sent to 119 based on (SRS) simple random sampling. 30 participants completed the survey(30 observations). Survey completion rate of 25.2%.

Am I able to conduct multiple regression instead with what I have? Do I meet the conditions/assumptions?

And if so, does multiple regression require I choose a CI, as well as a CL?
Thanks!!!!
Reply
- Charles
  
  November 7, 2019 at 11:09 pm
  
  Hello,
  I really don-t have enough information to be able to give any advice.
  Can you explain further what you were testing using Fisher’s exact test and what sort of results you got?
  Charles
  Reply
  - ProfTB
    
    November 8, 2019 at 1:35 am
    
    Hi. Yes.
    
    Testing to see if age, race, gender, experience, political affiliation, and a few other variables have a statistically significant relationship to academic union support.
    
    I had several expected values less than 5, so I used Fishers. 30 total observations.
    
    For example:
    Gender and Union Support: Fishers p value .230, fail to reject null hypothesis.
    
    Is this what you need?
    Reply
    - Charles
      
      November 8, 2019 at 8:40 am
      
      Hi,
      Thanks for the clarification. You can use regression with age, race, gender, experience, political affiliation, etc. as independent variables and academic union support as the dependent variable. If this variable just takes two values, you should explore binary logistic regression. The results of this approach would tell you which of the factors are significant in predicting union support (with p-values and confidence intervals for each).These topics are explained on the Real Statistics website.
      Charles
      Reply
Mike S.

October 20, 2019 at 4:43 am

I can’t thank you enough for creating this set of tools. Do people really believe impoverished students can afford SPSS? You’re a life-saver. Now if I can figure out how to use the discriminant analysis tool…
Reply
Anhan

October 3, 2019 at 2:02 pm

Hi Charles
I have only three column in excel with Frequency, mileage[km] and censor or failure.
mileage[1600, 75, 3500, 5000]
Failure or Censor[F,F,F, C]
Frequency[1,1,1,54]

how can I perform weibull analysis in excel? appreciate that you post it to my email
Reply
- Charles
  
  October 3, 2019 at 8:27 pm
  
  See
  Weibull Distribution
  Survivability using Weibull Distribution
  Charles
  Reply
Yousef

September 29, 2019 at 9:28 am

Logistic Regression not there Please Help
Reply
- Charles
  
  September 29, 2019 at 9:51 pm
  
  You can find the Binary Logistic and Probit Regression data analysis tool on the Reg tab.
  Charles
  Reply
Seth

September 27, 2019 at 6:28 am

I just want to thank you for providing such a powerful and useful addition to Excel. 🙂
Reply
Tsengel

September 25, 2019 at 12:14 pm

Can I use Mann-Kendall test and Sen’s slope estimator to identify long-term (40-70 years) streamflow change trends and variability? Could you refer me some useful links and references on them please.
Reply
- Charles
  
  September 27, 2019 at 8:46 am
  
  I don’t see any reason why you couldn’t use Mann-Kendall or Sen’s slope for a long-term trend.
  I don’t have a specific reference for a long-term trend.
  Charles
  Reply
parminder

August 7, 2019 at 12:03 pm

Hi Charles
you mann kandell see link https://real-statistics.com/time-series-analysis/time-series-miscellaneous/mann-kendall-test/

if you have no values for your data points but counted those as blank how would the formula change. currently we have 12 months aginst those data points. if months 3 and 7 and 9 where 0 values.
1
2
3
4
5
6
7
8
9
10
11
12
Reply
- Charles
  
  August 7, 2019 at 4:59 pm
  
  If the values are truly zero, then you can use the test as described on the website. It the data is missing, then you need to do something special. See
  https://www.researchgate.net/publication/265826436_Mann-Kendall_test_with_missing_data
  https://www.researchgate.net/publication/259183853_Trend_Tests_in_Time_Series_with_Missing_Values_a_Case_Study_with_Imputation
  There are other articles on this subject that you can find via google.
  Charles
  Reply
Parminder

July 31, 2019 at 10:56 pm

Hi Charles. Kandell SE formula re link https://www.real-statistics.com/time-series-analysis/time-series-miscellaneous/mann-kendall-test/

Can you explain why we divide by 18..
Thanks
Reply
parminder

July 31, 2019 at 6:31 pm

Hi Charles.
You’re article “Mann-Kendall Test” which is great, how could you work out the Tau values for that same data set. or do you have a article that explain in the same way, step by step on working out kandell tau values.
Reply
- Charles
  
  July 31, 2019 at 7:00 pm
  
  Glad that you got value from the article.
  Which tau value are you referring to? Augmented Dickey-Fuller? Engle-Granger? Kendall’s tau?
  Charles
  Reply
  - Parminder
    
    July 31, 2019 at 10:57 pm
    
    Kendall tau. Sorry…need some more excellent work showing how this can be calculated.
    Reply
    - Charles
      
      August 1, 2019 at 7:58 am
      
      See Kendall Tau
      Charles
      Reply
jasmine

July 31, 2019 at 11:53 am

Dear Charles Zaiontz,

I need to calculate the sampling variance of Cohen’s d in case of a one sample t test and I found your post “Confidence Interval for one sample Cohen’s d” (link: https://real-statistics.com/students-t-distribution/one-sample-t-test/confidence-interval-one-sample-cohens-d/). In this post you refer to Hedges and Olkin (1985). My question is, did you find the formula of the sampling variance in the book of Hedges and Olkin (1985)? If you did, on what page can I find that formula?

Thank you in advance.

Jasmine
Reply
- Charles
  
  July 31, 2019 at 6:55 pm
  
  Hello Jasmine,
  I don’t know the page number of this reference.
  If I remember correctly, you can also find more details at the Lakens (2013) reference in the Bibliography.
  Charles
  Reply
  - jasmine
    
    August 1, 2019 at 2:36 pm
    
    Dear Charles,
    
    thank you very much for your reply. I found some useful information in that article. There is just one question that I really like to ask:
    so to calculate the sampling variance of Cohen’s d in case of a one sample t test (where I have one group with one measurement on a variable of interest) I can use this formula (1/n)+d2/(2*n), right? Where n represents the sample size and d the Cohen’s d. However, according to Borenstein (2009) this formula can be used to calculate the variance of paired groups (with two measurements of one group). In this case, the original formula is ((1/ni)+di2/(2*ni))*2*(1-r), where r represents the correlation between the two measurements on the variable of interest.
    However, if I just want to calculate the variance of Cohen’s d in case of a one sample t test, then I should assume that the correlation r in that formula is equal to .5 (i.e., 2*(1-.5)=1), right? In this case, I get the first formula. This means that I assume that the correlation between the group and population(?) on the variable of interest is equal to .5? So my question is, is it justified to make such an assumption? And if so, is it perhaps an assumption that is too strong? And can I really use that formula to calculate the sampling variance of Cohen’s d in case of a one sample t test, or are there alternatives?
    
    Thank you in advance.
    
    Jasmine
    Reply
Laco Stehlík

July 7, 2019 at 12:46 pm

Dear Charles,

Thank you very much for this website and for the Real Statistics Package for Excel. It is amazing and very useful. Excel is a powerful tool, but after this add-in is now even more useful and user-friendly for non-mathematics people.
I appreciate your work very much. Very helpful.

Thank you.

Laco
Reply
Jonathan Spaans

May 28, 2019 at 8:44 pm

Dr Zaiontz:
I wanted to see if there was an appropriate citation for N=90 and 5 independ variables for the DW stat. my DW is about 2.2 and I would like to cite a source that would support no autocorrelation or values of the residuals as independent.
Thanks you Sir
Reply
- Charles
  
  May 29, 2019 at 9:08 am
  
  Hello Jonathan,
  You can use the citation described at the following webpage
  Citation
  Also see
  https://web.stanford.edu/~clint/bench/dwcrit.htm
  Charles
  Reply
Elsa

April 18, 2019 at 5:35 am

Good day sir,
How can I used Box plot in R-codes if my table are 3 x 3 contingency tables? Can you give me a data that are example in 3 x 3 contingency tables That are using R-codes in box plot.
Reply
- Charles
  
  April 18, 2019 at 9:41 am
  
  Hi Elsa,
  I don’t use R and so I don’t know what R code to use.
  Charles
  Reply
  - Elsa
    
    April 19, 2019 at 3:40 am
    
    Hello sir, Can you give me a data of 3 x 3 tables or more than 3 x 3 .
    Reply
    - Charles
      
      April 19, 2019 at 8:27 am
      
      Elsa,
      You can put any positive numbers in the table. See
      https://real-statistics.com/chi-square-and-f-distributions/independence-testing/
      Charles
      Reply
      - Elsa Romano
        
        April 19, 2019 at 1:16 pm
        
        Sir How could I know If my data is symmetric?
      - Charles
        
        April 19, 2019 at 4:08 pm
        
        Elsa,
        If skewness = 0 (or is significantly close to zero based on the skewness test).
        Skewness Test
        Charles
  - Elsa
    
    May 1, 2019 at 1:46 pm
    
    Good day sir,
    What is categorical data analysis?
    Reply
    - Charles
      
      May 2, 2019 at 8:37 am
      
      Hi Elsa,
      I believe that you are referring to those statistical analyses that are based on categorical variables (i.e. variables that take values that are not numeric). E.g. the chi-square test for independence is based on counts of categorical variables.
      Charles
      Reply
Elsa

April 9, 2019 at 1:00 pm

Sir do you know about Bowker’s test for symmetry? Such that Bowker’s test is a generalization of McNemar’s test.
Reply
- Charles
  
  April 9, 2019 at 8:10 pm
  
  Elsa,
  Yes, this is a generalization of McNemar’s test. You can find more information about this test at
  https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/PASS/Tests_for_Multiple_Correlated_Proportions-McNemar-Bowker_Test_of_Symmetry.pdf
  I expect to add this test to the website and software shortly, probably in the next release.
  Charles
  Reply
  - Elsa
    
    April 10, 2019 at 4:39 pm
    
    When is the next release of Bowker’s test sir?
    Sir what is the detailed explaination about Technical details and the Test statistics? Why it goes in that Form?
    Reply
    - Charles
      
      April 11, 2019 at 8:48 am
      
      Elsa,
      I don’t have a date for the next release.
      I don’t understand your other questions.
      Charles
      Reply
      - Elsa
        
        April 11, 2019 at 6:28 pm
        
        Sir the McNemar’s test have 3 assumptions then one of them the sample must be a random sample. Then, How to check for randomness?
      - Charles
        
        April 11, 2019 at 6:31 pm
        
        Elsa,
        Generally you dont test or randomness, but instead make sure that your sample is randomly selected. There are many tests for randomness< one which is shown at One sample Runs Test
        Charles
  - Elsa
    
    April 23, 2019 at 6:13 pm
    
    Sir Can you give a data of 3 x 3 matched-pair table.
    Reply
    - Charles
      
      April 23, 2019 at 6:29 pm
      
      Elsa,
      I dont know what a 3×3 matched pairs table is.
      Charles
      Reply
  - Elsa
    
    May 16, 2019 at 3:42 pm
    
    Sir good day! I need your APA Reference for your answer all my questions about what is test for symmetry. Please give me your APA reference sir.
    Reply
    - Charles
      
      May 17, 2019 at 11:09 am
      
      Hello Elsa,
      Which test for symmetry are you referring to?
      I don’t know of any specific APA guidelines for testing symmetry. You can use the general APA guidelines however (as I have shown for a few of the other tests).
      Charles
      Reply
Julio César

March 11, 2019 at 1:50 am

Excellent way to help all of us looking for easier stats. Simple examples and an add-in flawlessly working.
This is just to let you know I am very thankful!
Reply
Karl

March 5, 2019 at 10:36 pm

Hi ,

I discovered this product on Youtube , found it amazing and now wants a piece of it. I am trying to download it but when I click in the download button, nothing happens. I am looking to use Logistic regression that I use very often. I also do not know what package to install. Plus I have excel 2016, is tat Okay?
Could you help, please?
Reply
- Charles
  
  March 6, 2019 at 8:22 am
  
  Hi Karl,
  I don’t know why you were unable to download the software. I suggest that you try again. You can press the button on the following webpage:
  https://real-statistics.com/free-download/real-statistics-resource-pack/
  You can also press the link labelled
  Real Statistics Resource Pack for Excel 2010/2013/2016/2019/365
  on that webpage.
  Charles
  Reply
Jordi

March 2, 2019 at 9:17 pm

Hi Charles,

Just want to thank you for transferring to all of us your knowledge
Reply
Yaseen Al-Hadeethi

January 30, 2019 at 10:20 pm

hello sir,

I hope that you are good please i want to know if you can create a simulation for me on excel for a fee of money please if you can just contact me on my e-mail
Reply
- Charles
  
  January 31, 2019 at 10:04 am
  
  Hello Yaseen,
  What sort of simulation are you looking for? If it is confidential, you can send me the details via email (see Contact Us),
  Charles
  Reply
REWA BOCHARE

January 4, 2019 at 5:24 pm

Dear Sir
Thanks a lot for giving research scholars like me this wonderful software.

Please help me with performing iteratively reweighted least squares regression using this software.
Reply
- Charles
  
  January 5, 2019 at 8:10 am
  
  Dear Rewa,
  See https://real-statistics.com/multiple-regression/lad-regression/lad-regression-irls-method/
  Charles
  Reply
Paramita M.

November 29, 2018 at 6:11 pm

Thank you very much for your great effort, Dr. Charles Zaiontz.
This website is helping me with my theses.
I never imagine before that Excel could these statistic tests.
It is great as a statistic tool learning for me as well as simplified my problem because SPSS program is too heavy for my old laptop.

Regards,
Paramita
Reply
- Joel Weinstein
  
  December 18, 2018 at 8:58 pm
  
  Hello Charles,
  I using real stats for Excel 2013 on Windows and would appreciate if you could help me with the following. I am performing a MANOVA on a data set that is extremely similar to the one you used in the example using four types of soil and measuring yield, water requirement, and fertilizer requirement. You have a total of 32 measurements for each of the three dependent variables (eight for each of the four types of soil). Likewise, I have three independent variables, laser (8 subject), no laser (six subject), and control (21 subjects), for a total of 35 measurements on each of three dependent variables, acuity (A), contrast sensitivity (CS), and retinal thickness (RT). I proceeded by overwriting your example data with my, which simply added three rows. I then proceeded to change the formulae in cells F4 thru L7. That works fine. However I then tried to change the formulae in the SS CP AND GROUP COVARIANCE MATRICES. I received an error message reading “you cannot change part of an array”.
  I MANOVA closer resembles your example and I would like to utilize all of the formatting you have done without completely rewriting all of the formulae. How can I do this?
  Thanks very much.
  Joel joelmweinstein@me.com
  PS I’m not very familiar with your website layout. Where will your reply be posted? Would appreciate it if you could send a copy to my email address.
  Reply
  - Charles
    
    December 23, 2018 at 1:28 pm
    
    Joel,
    You should be able to modify parts of the output produced by Real Statistics. However, if need to change a few of the cells produced by an array formula, then you will need to be a little clever since you cant modify cells with a range output from an array formula. This an Excel restriction. See
    Array Formulas and Functions regarding the error message you are receiving.
    Suppose that the range A1:B5 contains an array formula and you want to modify the output in cell B2. One way to accomplish this is to place the formula =A1 in cell D1, highlight range D1:E5 and then press Ctrl-R and Ctrl-D. Mow range D1:E5 will contain the same results as A1:B5, but whereas you couldnt change cell B2, you can change cell E2.
    Note too you can write your own VBA formulas using calls to the Real Statistics functions, including array functions. This is explained at
    Calling Real Statistics Function in VBA
    Charles
    Reply
Waqar Khalid

November 27, 2018 at 1:20 pm

Dr. Charles Zaiontz,
I want to estimate the translog production function by using the method of ridge regression as my data has a multicollinearity issue. I also tried step by step on the data you have uploaded on the site, but something is going wrong as I did not find the command (i-e. DIAG) in the excel sheet. Now, I can’t go for the remaining work. Therefore, I need your rich and timely assistance in this regard.

Thanks a million.
Reply
- Charles
  
  November 27, 2018 at 6:16 pm
  
  Waqar,
  DIAG is a Real Statistics function. You need to download the Real Statistics software to use it. It is free.
  Charles
  Reply
armel

November 24, 2018 at 1:56 pm

can you tell me if there an excel function witch can regress a polynom
Reply
- Charles
  
  November 24, 2018 at 5:42 pm
  
  See Polynomial Regression
  Charles
  Reply
Orion

November 20, 2018 at 4:46 pm

Hello Dr. Zaiontz,

I stumbled upon your website and saw an example of the Hodges–Lehmann estimator. But it was calculated in the context of a wider problem, and not what I was looking for.
Can one construct a formula in excel, dedicated to output the Hodges–Lehmann estimator for a given series/array of numbers?

Thank you for your advice,
Orion
Reply
- Charles
  
  November 20, 2018 at 6:38 pm
  
  Orion,
  In https://real-statistics.com/non-parametric-tests/wilcoxon-signed-ranks-test/signed-ranks-median-confidence-interval/ I show how to calculate the Hodges-Lehmann estimation for the median. Is this what you are looking for?
  Charles
  Reply
May

November 18, 2018 at 7:25 am

hello, Dr. Zaiontz,
Here I am looking for your help again. I collected some writing samples by 30 Chinese students. To be exact, the 30 students each wrote a dissertation in English and wrote a research article in Chinese. I intend to see if there is any difference between their English dissertations and their corresponding Chinese research articles in terms of hedges. What statistical method should I use for this end? Or is it possible to do any statistical analysis with these data? Thank you so much for your time and help! Best wishes.
Reply
- Charles
  
  November 18, 2018 at 5:35 pm
  
  May,
  A paired t test might work.
  Charles
  Reply
  - May
    
    November 19, 2018 at 7:33 am
    
    Thank you for your prompt reply, Dr. Zaiontz. I’ll try.
    Reply
Fergo Treska

November 14, 2018 at 12:19 am

It’s likely possible to have an overlap classification data at the end when using LDA. Can we set a threshold in discriminant analysis to provide more separation class data points? if so, how?. Regards.
Reply
- Charles
  
  November 16, 2018 at 8:44 pm
  
  Fergo,
  The point of LDA is to determine a specific category. Since the outcome are weights, I guess you can interpret the existence of overlap categories, but I am not surre what purpose this would serve. When you say that you are seeking more separation of the data points, what do you mean?
  Charles
  Reply
  - Fergo Treska
    
    November 19, 2018 at 12:03 pm
    
    i want to test a data vector whether it belongs to category A or B. So, I run several inputs data vector that i know it belongs to category A, but the outputs show some of them are wrongly categorized as B. I meant to seek a way to get a better outputs, maybe by applying a threshold or something so at least i can reduce the errors. Would you like to share some ideas please?
    Reply
    - Charles
      
      November 19, 2018 at 12:06 pm
      
      Fergo,
      This sounds like something you can do with neural networks, training the network based on the data you have. Just an idea.
      
      Charles
      Reply

487 thoughts on “Author”

Leave a Comment Cancel reply