Basic Concepts
The Kuder and Richardson Formula 20 test checks the internal consistency of measurements with dichotomous choices. It is equivalent to performing the split-half methodology on all combinations of questions and is applicable when each question is either right or wrong. A correct answer scores 1 and an incorrect answer scores 0. The test statistic is
$$\rho_{KR20} = \frac{k}{k-1}\left(1 - \frac{\sum_{j=1}^{k} p_j q_j}{\sigma^2}\right)$$
where
k = number of questions
pj = proportion of people in the sample who answered question j correctly
qj = proportion of people in the sample who didn’t answer question j correctly, i.e. qj = 1 − pj
σ² = variance of the total scores of all the people taking the test = VAR.P(R1) where R1 = array containing the total scores of all the people taking the test.
Values normally range from 0 to 1, although the formula can produce a negative value for unusual data. A high value indicates reliability, while too high a value (in excess of .90) indicates a homogeneous test (which is usually not desirable).
Kuder-Richardson Formula 20 is equivalent to Cronbach’s alpha for dichotomous data.
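The equivalence can be checked numerically. The following Python sketch is not part of the Real Statistics Resource Pack; the function names kr20 and cronbach_alpha and the small 5 × 4 data set are hypothetical and chosen only for illustration. It relies on the fact that the population variance of a 0/1 question equals pq.

```python
from statistics import pvariance

# Hypothetical data: 5 respondents (rows) x 4 questions (columns), 1 = correct.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

def kr20(scores):
    """KR-20 computed from the proportions p and q of each question."""
    n = len(scores)
    k = len(scores[0])
    p = [sum(row[j] for row in scores) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_pq / total_var)

def cronbach_alpha(scores):
    """Cronbach's alpha computed from population variances of each question."""
    k = len(scores[0])
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

print(round(kr20(data), 4), round(cronbach_alpha(data), 4))
```

Both functions return the same value for any 0/1 data set, since the only difference is whether the variance of each question is written as pq or computed directly.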
Example
Example 1: A questionnaire with 11 questions is administered to 12 students. The results are listed in the upper portion of Figure 1. Determine the reliability of the questionnaire using Kuder and Richardson Formula 20.
Figure 1 – Kuder and Richardson Formula 20 for Example 1
The values of p in row 18 are the proportion of students who answered that question correctly, e.g. the formula in cell B18 is =B16/COUNT(B4:B15). Similarly, the values of q in row 19 are the proportion of students who answered that question incorrectly, e.g. the formula in cell B19 is =1-B18. The values of pq are simply the product of the p and q values, with the sum given in cell M20.
We can calculate ρKR20 as described in Figure 2.
Figure 2 – Key formulas for worksheet in Figure 1
The value ρKR20 = 0.738 shows that the test has high reliability.
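For readers working outside Excel, the worksheet steps can be sketched in Python. The data of Figure 1 are not reproduced in the text, so the sketch below uses a small hypothetical data set; the variable names are illustrative and the comments indicate which worksheet quantity each line is meant to mirror.

```python
from statistics import pvariance

# Hypothetical data: one row per student, one column per question (1 = correct).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

n = len(scores)        # number of students
k = len(scores[0])     # number of questions

correct = [sum(col) for col in zip(*scores)]  # number correct per question
p = [c / n for c in correct]                  # proportion correct (the p row)
q = [1 - pj for pj in p]                      # proportion incorrect (the q row)
pq = [pj * qj for pj, qj in zip(p, q)]        # the pq row
sum_pq = sum(pq)                              # sum of the pq row

totals = [sum(row) for row in scores]         # total score per student
variance = pvariance(totals)                  # same role as VAR.P on the totals

rho = (k / (k - 1)) * (1 - sum_pq / variance)

print("p      =", [round(x, 3) for x in p])
print("pq     =", [round(x, 3) for x in pq])
print("sum pq =", round(sum_pq, 3))
print("var    =", variance)
print("KR20   =", round(rho, 3))
```

For this hypothetical data set Σpq = 0.8 and the variance of the totals is 2, so ρKR20 = (4/3)(1 − 0.8/2) = 0.8.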
Worksheet Function
Real Statistics Function: The Real Statistics Resource Pack provides the following function:
KUDER(R1) = KR20 coefficient for the data in range R1.
For Example 1, KUDER(B4:L15) = .738.
KR-21
Where the questions in a test all have approximately the same difficulty (i.e. the mean score of each question is approximately equal to the mean score of all the questions), then a simplified version of Kuder and Richardson Formula 20 is Kuder and Richardson Formula 21, defined as follows:
$$\rho_{KR21} = \frac{k}{k-1}\left(1 - \frac{\mu(k-\mu)}{k\sigma^2}\right)$$
where μ is the population mean of the total scores (approximated by the observed mean score).
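For Example 1, μ = 69/12 = 5.75 and k = 11, and so
$$\rho_{KR21} = \frac{11}{10}\left(1 - \frac{5.75\,(11-5.75)}{11\,\sigma^2}\right)$$
where σ² is the variance of the total scores shown in Figure 1.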
Note that ρKR21 typically underestimates the reliability of a test compared to ρKR20.
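A minimal Python sketch of this comparison follows. The function names kr20 and kr21 and the data set are hypothetical; the data are deliberately chosen so that the questions differ in difficulty, which is the situation in which the two formulas diverge.

```python
from statistics import mean, pvariance

# Hypothetical data: 5 respondents x 4 questions of unequal difficulty.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

def kr20(scores):
    n, k = len(scores), len(scores[0])
    p = [sum(col) / n for col in zip(*scores)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_pq / var)

def kr21(scores):
    # KR-21 needs only the number of questions and the total scores.
    k = len(scores[0])
    totals = [sum(row) for row in scores]
    mu = mean(totals)
    var = pvariance(totals)
    return k / (k - 1) * (1 - mu * (k - mu) / (k * var))

print("KR20 =", round(kr20(data), 4))
print("KR21 =", round(kr21(data), 4))
```

Here μ = 2 and σ² = 2, so ρKR21 = (4/3)(1 − 2·2/(4·2)) ≈ 0.667, below the ρKR20 value of 0.8 for the same data.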
Good day! May I ask what the answer is based on? How can you say whether it is high or not? How does its rating work?
Hello Che,
Kuder and Richardson 20 is a special case of Cronbach’s alpha. You can interpret the rating as described at
https://real-statistics.com/reliability/internal-consistency-reliability/cronbachs-alpha/cronbachs-alpha-basic-concepts/
Charles
Hi Charles,
How does one interpret a negative result of KR20?
To get such an example, I downloaded your example worksheet. Then I put 1 everywhere, except for the diagonal, so that Student 1 makes a mistake only on Q1, Student 2 makes a mistake only on Q2, and so on. Student 12 has everything correct.
Then the spreadsheet computes KR20 to be -11.
The variance in the term 1-Σpq/var is a very small number, while Σpq is relatively large. So the Excel is correct, but the formula for KR clearly allows the result to be negative.
Hi Jerzy,
Yes, the value can be negative. I interpret this in the same way as a zero value — not good.
Charles
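Jerzy's construction can be reproduced outside Excel. The Python sketch below rebuilds the 12 × 11 grid as he describes it (the variable names are illustrative, not taken from the workbook) and evaluates the KR20 formula. Each question has p = 11/12, so Σpq = 121/144, while eleven students score 10 and one scores 11, so the variance of the totals is only 11/144. Because the single mistake falls on a different question for each student, the questions are negatively correlated with one another, which is what drives the coefficient below zero.

```python
from statistics import pvariance

k = 11   # questions
n = 12   # students

# Student i (0-based) misses question i; student 12 has no matching
# question index and therefore gets everything right.
scores = [[0 if j == i else 1 for j in range(k)] for i in range(n)]

p = [sum(col) / n for col in zip(*scores)]
sum_pq = sum(pj * (1 - pj) for pj in p)
variance = pvariance([sum(row) for row in scores])

kr20_value = k / (k - 1) * (1 - sum_pq / variance)

print("sum pq   =", round(sum_pq, 4))
print("variance =", round(variance, 4))
print("KR20     =", round(kr20_value, 4))
```

The ratio Σpq/σ² equals 11, so the result is (11/10)(1 − 11) = −11, in agreement with the spreadsheet.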
Thanks!
Note however, that in the example I have outlined the results are quite homogeneous, so intuitively, one would expect the test to be strange (with extremely high marks) but consistent. The formula says otherwise. Why?
Hi Jerzy,
I don’t know why the test produces such a counter-intuitive result. To me this shows the limit of the test.
Note that if you had used the Guttman test you would get similar results.
Charles
Thank you very much. I really appreciate your answers.
I have also been asking myself this and looked for possible reasons. Is it possible that the formula in cell B33 is missing some grouping symbols, and that this is why it can become negative?
Per the attached link, the cell contains the formula =(B30/(B30-1))*(1-B31/B32)
The first factor has inner grouping symbols around the subtraction before the division, while the second factor does not. Is it safe to use =(B30/(B30-1))*((1-B31)/B32) so that it does not give the counter-intuitive result?
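On the grouping question: Excel evaluates division before subtraction, so 1-B31/B32 already means 1-(B31/B32), which is the grouping the KR20 formula calls for. The Python sketch below compares the two groupings using k = 11, Σpq = 121/144 and variance = 11/144, the values that follow from Jerzy's example above (they are derived here, not read from the workbook). The variable names are illustrative.

```python
k = 11
sum_pq = 121 / 144     # sum of pq over the 11 questions
variance = 11 / 144    # population variance of the total scores

# Grouping used in the worksheet formula: 1 - (sum_pq / variance)
as_published = (k / (k - 1)) * (1 - sum_pq / variance)

# Proposed regrouping: (1 - sum_pq) / variance
regrouped = (k / (k - 1)) * ((1 - sum_pq) / variance)

print("as published:", round(as_published, 4))
print("regrouped   :", round(regrouped, 4))
```

The regrouped expression evaluates to 2.3 for these inputs, which is above 1 and so cannot be read as a reliability coefficient. It is a different formula rather than a correction of KR20.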
Suppose the test were a standardised test with 6,000 participants, and the number who got each item right was between 3,000 and 4,500 per item. Compute the KR reliability coefficient.
Hello Danladi,
Sorry, but I don’t understand your comment. Do you have a question?
Charles
0.625
Assuming you have five test items responded to by ten examinees on a polytonous test item, generate the scores (performance) and determine the reliability coefficient using the Kuder-Richardson formula.
Do you mean “polytomous”?
If so what is the problem with using Kuder-Richardson?
Charles
Please interpret B4:B15
Hello Chioma,
For this example, there are 12 students, each of whom answered question Q1. A correct answer (for any student) is assigned the value 1 and an incorrect answer is assigned the value 0.
Charles
Based on the given information, Charles (or any other student) has either answered the question Q1 correctly or incorrectly. Since a correct answer is assigned the value 1 and an incorrect answer is assigned the value 0, Charles’s answer would either be 1 or 0.
Without knowing the actual answer or Charles’s specific response, I cannot definitively state what Charles’s answer is. However, if we had access to the actual data or Charles’s answer sheet, we could easily determine whether his answer was 1 (correct) or 0 (incorrect).
In summary, Charles’s answer for question Q1 is either 1 (correct) or 0 (incorrect), but without additional information, we cannot determine which one it is.
Serial number
How did you arrive at the answer in cell B18?
Samuel,
As stated on the webpage, B18 contains the formula =B16/COUNT(B4:B15).
Charles
Sorry Dr., I am getting confused about how to calculate the variance in KR20 outside of Excel.
What is confusing you?
Charles
Please sir, what is the sample size?
For dichotomous data, i.e. the type of data supported by Kuder and Richardson Formula 20 (KR20), KR20 is equivalent to Cronbach’s alpha. Thus, you can use the estimates for the sample size for Cronbach’s alpha.
See https://www.real-statistics.com/reliability/internal-consistency-reliability/cronbachs-alpha/cronbachs-alpha-power/
Charles
Dear professor,
First, thanks for the effort you have put into this project. I tried my best to find KR20 or KUDER(R1) in the dialog box, but I didn’t find either. Could you advise me on this matter?
Thanks
Fadzli
Dear Fadzli,
For dichotomous data (i.e. consisting only of 0’s and 1’s), KR20 is equivalent to Cronbach’s alpha. You can therefore use the Cronbach’s alpha dialog box.
Charles
To determine how the answer in cell B18 was obtained (assuming this is a reference to a spreadsheet context), we would need more context about the specific calculations or functions used in that cell. However, I can provide a general overview of how you might calculate a result in a cell based on the information you’ve given.
Given that there are 12 students and each student’s answer to question Q1 is either 1 (correct) or 0 (incorrect), you might want to calculate the total number of correct answers, the percentage of correct answers, or some other statistic related to the answers.
Here are some examples of how you could calculate a result in cell B18 based on the answers in cells B4:B15 (assuming the answers are entered in these cells):
Total Correct Answers:
Use the SUM function to add up all the values in cells B4:B15.
Formula in B18: =SUM(B4:B15)
Percentage of Correct Answers:
First, calculate the total number of correct answers using SUM(B4:B15).
Then, divide this by the total number of students (12) and multiply by 100 to get the percentage.
Formula in B18: =(SUM(B4:B15)/12)*100
Average Answer (which would just be the percentage of correct answers divided by 100):
Use the AVERAGE function to calculate the average value in cells B4:B15.
Formula in B18: =AVERAGE(B4:B15)
Note that this will give you a decimal number between 0 and 1, representing the proportion of correct answers. To convert it to a percentage, multiply by 100.
Other Statistics:
You could also calculate other statistics like the mode (most common answer), standard deviation, etc., using appropriate functions like MODE.SNGL, STDEV.P, etc.
Since I don’t have the actual spreadsheet or the specific calculation used in cell B18, I can only provide these general examples. The actual formula or calculation used would depend on what you’re trying to accomplish.
Why do you use the variance of the population rather than the variance of the sample? Also, I am confused why the p, q values are represented as a ratio (total correct or incorrect responses divided by the total number of possible correct/incorrect responses per question (e.g. 10/12 = 0.8333) but the variance represents the dispersion of the number of questions answered correctly per participant (not a ratio).
Hi Katherine,
You can use either the sample or population version (as long as you don’t mix them). You should get the same result.
I don’t know why p and q are ratios and the others are not. It is just the way the math works out.
Charles
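Both points can be illustrated with a short Python sketch. For a 0/1 question, pq is the population variance of that question's scores, so the numerator Σpq and the denominator σ² are both variances on the same scale. Switching every variance to the sample version multiplies numerator and denominator by the same factor n/(n−1), which cancels in the ratio. The data set and the names alpha, alpha_pop, alpha_samp and mixed below are hypothetical.

```python
from statistics import pvariance, variance

# Hypothetical data: 5 respondents x 4 questions, 1 = correct.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
n = len(scores)
k = len(scores[0])
items = list(zip(*scores))              # one tuple of 0/1 scores per question
totals = [sum(row) for row in scores]   # total score per respondent

def alpha(var_fn):
    """Reliability using the same variance function for questions and totals."""
    return k / (k - 1) * (1 - sum(var_fn(col) for col in items) / var_fn(totals))

alpha_pop = alpha(pvariance)    # population variances throughout (pq and VAR.P)
alpha_samp = alpha(variance)    # sample variances throughout

# Mixing the two: pq (a population variance) with the sample variance of totals.
sum_pq = sum((sum(col) / n) * (1 - sum(col) / n) for col in items)
mixed = k / (k - 1) * (1 - sum_pq / variance(totals))

print("population:", round(alpha_pop, 4))
print("sample    :", round(alpha_samp, 4))
print("mixed     :", round(mixed, 4))
```

The first two values agree, while the mixed version differs, which is why the two kinds of variance should not be combined in one calculation.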