For sufficiently large n, we can use the normal distribution approximation to obtain confidence intervals for the proportion parameter. In fact, there are other approaches that generally yield more accurate results, especially for smaller samples.
Normal approximation (Wald interval)
As described in One-sample Proportion Testing, the 1–α confidence interval is given by the following formula where zcrit = NORM.S.INV(1–α).
Wald interval with continuity correction
Adjusted Wald interval
where
Wilson score binomial interval
This version gives good results even for small values of n or when p or 1–p is small.
Clopper-Pearson exact binomial interval
lower = BETA.INV(α/2, x, n-x+1)
upper = BETA.INV(1-α/2, x+1, n-x)
where x = np = the number of successes in n trials
This approach gives good results even when np(1-p) < 5.
Example
Example 1: A new AIDS drug is shown to cure 30% of 50 patients. Find the 95% confidence interval for the cure rate.
Recommendations
For most situations, the Wilson interval is probably best, although for large samples Agresti-Coull might be better. For smaller samples where np(1-p) < 5, Clopper-Pearson is probably a good choice.
Examples Workbook
Click here to download the Excel workbook with the examples described on this webpage.
References
Influential Points (2020) Confidence intervals of proportions and rates
https://influentialpoints.com/Training/confidence_intervals_of_proportions-principles-properties-assumptions.htm
Wikipedia (2020) Binomial proportion confidence interval
https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval
After downloading I find there are incomplete entries in columns F, G, and I.
May I ask what these are please?
Hello Adrian,
Are you referring to the Excel workbook that you can download from this web page?
If so, no entries are missing if you have the Real Statistics Resource Pack installed. You can download this Excel add-in for free from
https://real-statistics.com/free-download/real-statistics-resource-pack/
Charles
Hi Charles,
This doesn’t work.
Hello Adrian,
Can you be more specific as to what doesn’t work so that I can help you?
Charles
Could you email me please?
Hello Adrian,
Are you referring to the workbook obtainable on the following webpage?
https://real-statistics.com/binomial-and-related-distributions/proportion-distribution/one-sample-proportion-testing/
Charles
Hi Charles
I have real statistics resource pack installed but entries in cells G9 and G10, G13 and G14, G17 and G18, G21 and G22 are missing, however there are formulas in cells I9, I13, I17 and I21 for G9, G13, G17 and G21
Hello Adrian,
I assume that you are referring to the Prop 2 tab of the Excel Workbook with the exampless.
When I downloaded the workbook I received a Protected View message from Excel. I then clicked on the Enable Editing button and was able to see all the entries on the workbook.
Charles
Hi Charles
I’m running excel on macbook, editing can’t be enabled as you described.
Hello Adrian,
What sort of editing do you want to make?
Charles
Hi Charles,
So, when I copy the formulas from I9, I13, I17, and I21 into G9, G13, G17, and G21, I get the word “Lower”. However, if I change “TRUE” to “FALSE” in the formula I get the values of the lower CI in B11, B16, B20, and B24, which seems correct.
So, 2 questions if I may; why is the formula written as =PropCI(G5,G3,TRUE,G8,G6) for example, and whats the PropCI formula in column I to calculate the upper CI values into column G please?
Hi Adrian,
Formulas such as =PropCI(G5,G3,TRUE,G8,G6) are what Excel calls array formulas. If you simply enter the formula into a cell and press the Enter key, you will only get the first value in the output, which in this case is “Lower”. In order to get all the output, you must highlight all the cells in the output, which is F9:G10 in this case, enter the formula and then press the following keys at the same time Ctrl-Shft-Enter. This is explained at https://real-statistics.com/excel-environment/array-formulas-functions/
If you are using Excel 365 then the approach is simpler. In this case, you can simply enter the formula in the first cell of the output range and press the Enter key. Excel will then fill in all the necessary cells.
Charles
I really don’t know what planet you’re on Charles.
I understand about arrays.
The download I have shows the formulas in column D that are used to calculate the results in column B. For example, the formula in Column D11 =(B5+B10/2)/(B10+1)-B7/(B10+1)*SQRT(B5*(1-B5)/B3+B10/(4*B3)), is used to calculate the result in B11 (Wilson Score Lower). The formula is also there for Wilson Score Upper and the lower and upper values for the other methods. The formulas are also displayed in the formula bar when I click into cells in column B. So all is well in columns B and D.
Now, in column G, there are no results for the lower CI of any of the methods, but in column I there are formulas, e.g., in I9 the formula is =PropCI(G5,G3,TRUE,G8,G6). Cell G9 was empty, just the same as the cells in column G for the Lower CI for all the other methods. When I write the formula =PropCI(G5,G3,TRUE,G8,G6) directly into cell G9 (Wilson Score Lower), I get the word “lower”. However, if I change the word “TRUE” to “FALSE” in the formula that I directly enter into cell G9, I get the same result as cell B11 (Wilson Score Lower), which is calculated with the formula: =(B5+B10/2)/(B10+1)-B7/(B10+1)*SQRT(B5*(1-B5)/B3+B10/(4*B3)). This is also true for the Lower values for other methods in column G when I use the other formulas in column I – they match with their counterparts in coulmn B.
So, if I may ask. what is the “PropCI” formula to calculate the Upper CI values of all the other methods into column G please? There are no “PropCI” formulas for the Upper values.
Adrian,
=PropCI(G5,G3,TRUE,G8,G6) is used to populate the values in all the cells F9, F10, G9, and G10.
You do this by highlighting range F9:G10, entering the formula =PropCI(G5,G3,TRUE,G8,G6), and then holding the Shift and Ctrl keys down while pressing the Enter key. The values you see in cells F9, F10, G9, and G10 will then appear. Try it, it works.
I wasn’t explaining “arrays”. I was explaining “array formulas“. =PropCI(G5,G3,TRUE,G8,G6) is an array formula.
Charles
Thank you!!!! I kept trying to plug in versions of these equations in excel to no avail. thank you for writing it out.