We now define some statistics that are commonly used to characterize data and probability distributions. In particular, we define measures of central tendency (e.g. mean and median), variability (e.g. variance and standard deviation), and shape (e.g. skewness and kurtosis).
In addition, we provide some important ways of graphically describing data and probability distributions, including histograms, box plots, and QQ plots.
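As a rough illustration of the three groups of statistics named above, here is a minimal Python sketch. It is not taken from the Real Statistics software: the sample data and the helper names (`central_moment`, `population_skewness`, `population_excess_kurtosis`) are assumptions made for this example. The shape measures use the simple population (moment) formulas, which can give different values from sample-adjusted versions such as Excel's SKEW and KURT.

```python
from statistics import mean, median, stdev, variance


def central_moment(data, k):
    """k-th central moment, dividing by n (population version)."""
    m = mean(data)
    return sum((x - m) ** k for x in data) / len(data)


def population_skewness(data):
    """Moment-based skewness: m3 / m2^(3/2)."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def population_excess_kurtosis(data):
    """Moment-based excess kurtosis: m4 / m2^2 - 3."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3


# Hypothetical sample, chosen to be symmetric so the skewness is zero.
sample = [1, 2, 3, 4, 5]

summary = {
    "mean": mean(sample),              # central tendency
    "median": median(sample),          # central tendency
    "variance": variance(sample),      # variability (sample, n - 1)
    "std_dev": stdev(sample),          # variability (sample, n - 1)
    "skewness": population_skewness(sample),         # shape
    "kurtosis": population_excess_kurtosis(sample),  # shape
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

For this symmetric sample the mean and median coincide at 3 and the skewness is 0. The negative excess kurtosis (-1.3) reflects tails that are lighter than those of a normal distribution.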
Topics
- Measures of Central Tendency
- Measures of Variability
- Symmetry, Skewness and Kurtosis
- Ranking Functions in Excel
- Descriptive Statistics Tools
- Frequency Tables
- Histograms
- Creating Box Plots
- Box Plots with Outliers
- Dot Plots
- ROC Curve and Classification Table
- Outliers and Robustness
- M-estimators (Tukey’s Biweight and Huber’s estimator)
- Lp estimators and Minkowski distance
- MAD and Related Approaches for Identifying Outliers
- Dealing with Missing Data
- Assumptions for Statistical Tests
- Data Transformations
- Diversity Indices
- Divergence
References
Wikipedia (2012) Descriptive statistics
https://en.wikipedia.org/wiki/Descriptive_statistics
Howell, D. C. (2010) Statistical methods for psychology (7th ed.). Wadsworth, Cengage Learning.
https://labs.la.utexas.edu/gilden/files/2016/05/Statistics-Text.pdf
Zar, J. H. (2010) Biostatistical analysis (5th ed.). Pearson.
https://bayesmath.com/wp-content/uploads/2021/05/Jerrold-H.-Zar-Biostatistical-Analysis-5th-Edition-Prentice-Hall-2009.pdf
Dear Charles,
Do you plan to add pivot table support to your already fine RealStats extension?
Thanks for everything,
Mohamed
Hello Mohamed,
What sort of support for Pivot Tables do you suggest?
Charles
Pivot table support for big data, i.e. long lists with many variables.
Thanks
Mohamed
Hi Mohamed,
I believe that the Real Statistics functions and data analysis tools work properly on data in pivot tables. I suggest that you try it and see if it works. Let me know if there are problems.
Charles
Dear Charles!
Thanks for your quick reply.
Which RealStats functions and/or tools, please?
Mohamed
Dear Mohamed,
Probably almost all of the functions and data analysis tools work with pivot tables.
Charles
I want to extract numbers in Set A and Set B with their corresponding frequencies. For example, Set A contains 53, 67, 78, 80, 70 with respective frequencies 1, 6, 3, 5, 4, and Set B contains 35, 70, 80, 49, 43 with respective frequencies 3, 7, 8, 9, 1. How do I extract the number with frequency 5 in Set A and frequency 8 in Set B (i.e. 80)? Thank you.
John,
Probably the easiest way is to use Excel’s filter capability by choosing Filter from the Data ribbon. Instructions on how to use this capability can be found on the following webpage:
https://support.office.com/en-us/article/Filter-data-in-a-range-or-table-01832226-31b5-4568-8806-38c37dcc180e
Charles
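Outside Excel, the same lookup can be sketched in a few lines of Python. This is only an illustration of the filtering idea, not a Real Statistics or Excel feature. The dictionaries hold the values and frequencies quoted in the question, and the function name `values_with_frequencies` is made up for this example.

```python
# Value -> frequency for each set, taken from the question above.
set_a = {53: 1, 67: 6, 78: 3, 80: 5, 70: 4}
set_b = {35: 3, 70: 7, 80: 8, 49: 9, 43: 1}


def values_with_frequencies(freq_a, freq_b, target_a, target_b):
    """Return values whose frequency is target_a in A and target_b in B."""
    return sorted(
        value
        for value, freq in freq_a.items()
        if freq == target_a and freq_b.get(value) == target_b
    )


matches = values_with_frequencies(set_a, set_b, target_a=5, target_b=8)
print(matches)  # [80]
```

Values that appear in only one set are skipped, because `freq_b.get(value)` returns `None` for them and so never equals the target frequency.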
I installed your stat package in MS Office 2011 for the Macintosh. I tried using the MAD statistic (median absolute deviation) on some financial benchmarking data we are analyzing, as the data is not normally distributed. When I enter the MAD formula I get back the error message “#value”. Does the data have to be unformatted? It is currently formatted as currency. Please advise.
Thank You,
Rod Warnick
UMass
Rod,
I just checked, and for some strange reason the data can’t be in Currency or Accounting format when using the MAD function. It can be formatted as General, Number, Percentage, Fraction or Scientific. The software that calculates MAD uses Excel’s MEDIAN function. On a worksheet, MEDIAN works fine with currency data, but it produces an error when called from inside the software program (VBA).
I have not checked all the various functions to see whether this problem appears for other functions, but it does not for the few that I have checked.
Thanks for identifying this problem. For now it is best not to use MAD with currency formatted data.
Charles
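For readers who want to cross-check a MAD value outside Excel, here is a minimal Python sketch of the unscaled median absolute deviation, i.e. the median of the absolute deviations from the median. It is not the Real Statistics implementation. The function name and the sample figures are assumptions made for this example.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median (unscaled)."""
    data = list(values)
    center = median(data)
    return median([abs(x - center) for x in data])


# Hypothetical benchmarking figures with one extreme value.
amounts = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 30.0]

mad = median_absolute_deviation(amounts)
print(mad)  # 2.0
```

Here the median is 5.0 and the absolute deviations are 3, 1, 1, 0, 2, 4 and 25, whose median is 2.0. The extreme value 30.0 has little effect on the result, which is why MAD is often preferred for data that is not normally distributed.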