Kuder and Richardson Formula 20

Basic Concepts

The Kuder and Richardson Formula 20 test checks the internal consistency of measurements with dichotomous choices. It is equivalent to performing the split-half methodology on all combinations of questions and is applicable when each question is either right or wrong. A correct answer scores 1 and an incorrect answer scores 0. The test statistic is

$$\rho_{KR20} = \frac{k}{k-1}\left(1 - \frac{\sum_{j=1}^{k} p_j q_j}{\sigma^2}\right)$$

where

k = number of questions

p_j = proportion of people in the sample who answered question j correctly

q_j = 1 − p_j = proportion of people in the sample who didn’t answer question j correctly

σ² = variance of the total scores of all the people taking the test = VAR.P(R1) where R1 = array containing the total scores of all the people taking the test.

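Values usually range from 0 to 1, although a negative value is possible and should be read in the same way as a zero value, i.e. as poor reliability. A high value indicates reliability, while too high a value (in excess of .90) indicates a homogeneous test (which is usually not desirable).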
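The definitions above can be sketched in a few lines of Python. This is only an illustration, not the Real Statistics implementation: the function name kr20 and both data sets are assumptions made for the example, and the variance is the population variance, as in VAR.P.

```python
from statistics import pvariance

def kr20(scores):
    """KR-20 for a list of rows, one row of 0/1 item scores per person."""
    n = len(scores)        # number of people
    k = len(scores[0])     # number of questions
    # p_j = proportion of people who answered question j correctly
    p = [sum(row[j] for row in scores) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    # population variance of the total scores, like VAR.P in Excel
    var_total = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum_pq / var_total)

# Hypothetical data: 5 people, 4 questions (not the data of Figure 1)
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(demo), 3))  # 0.8

# A reader's construction from the comments: 12 students, 11 questions,
# student i misses only question i, and student 12 gets everything right.
k = 11
negative_case = [[0 if j == i else 1 for j in range(k)] for i in range(k)]
negative_case.append([1] * k)
print(round(kr20(negative_case), 6))  # -11.0
```

In the second data set every p_j = 11/12, so Σpq = 11 · (11/144) = 121/144, while the total scores (eleven 10s and one 11) have population variance 11/144. The ratio is 11, which gives (11/10)(1 − 11) = −11. The sketch assumes the total scores are not all identical, since the variance would then be zero.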

Kuder-Richardson Formula 20 is equivalent to Cronbach’s alpha for dichotomous data.
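The equivalence can be checked numerically. The sketch below assumes the usual form of Cronbach’s alpha, k/(k−1) · (1 − Σ item variances / variance of the total scores). The function names and the data are hypothetical. Alpha is computed here with sample variances and KR-20 with population variances, to illustrate that the choice does not matter as long as the two versions are not mixed within one formula.

```python
from statistics import pvariance, variance

def kr20(scores):
    """KR-20 using proportions p_j and the population variance of totals."""
    n, k = len(scores), len(scores[0])
    p = [sum(row[j] for row in scores) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    return k / (k - 1) * (1 - sum_pq / pvariance([sum(r) for r in scores]))

def cronbach_alpha(scores):
    """Cronbach's alpha using sample variances throughout."""
    k = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 0/1 data: 5 people, 4 questions
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(demo), 6), round(cronbach_alpha(demo), 6))
```

For 0/1 data the population variance of an item is exactly p_j q_j, and switching to sample variances multiplies both the numerator and the denominator by n/(n−1), so the ratio is unchanged.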

Example

Example 1: A questionnaire with 11 questions is administered to 12 students. The results are listed in the upper portion of Figure 1. Determine the reliability of the questionnaire using Kuder and Richardson Formula 20.

Figure 1 – Kuder and Richardson Formula 20 for Example 1

The values of p in row 18 are the proportion of students who answered that question correctly, e.g. the formula in cell B18 is =B16/COUNT(B4:B15). Similarly, the values of q in row 19 are the proportion of students who answered that question incorrectly, e.g. the formula in cell B19 is =1-B18. The values of pq are simply the product of the p and q values, with the sum given in cell M20.
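The per-question rows of the worksheet can be mirrored outside Excel. The following is a rough sketch on hypothetical data (the data of Figure 1 is not reproduced here). The helper name item_stats is an assumption, as is the reading that row 16 holds the count of correct answers per question.

```python
def item_stats(scores):
    """Return (correct, p, q, pq) for each question in a 0/1 score table."""
    n = len(scores)          # number of students, like COUNT(B4:B15)
    k = len(scores[0])       # number of questions
    rows = []
    for j in range(k):
        correct = sum(row[j] for row in scores)  # count of correct answers
        p = correct / n                          # like =B16/COUNT(B4:B15)
        q = 1 - p                                # like =1-B18
        rows.append((correct, p, q, p * q))
    return rows

# Hypothetical data: 5 students, 4 questions
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
stats = item_stats(demo)
for j, (correct, p, q, pq) in enumerate(stats, start=1):
    print(f"Q{j}: correct={correct} p={p:.2f} q={q:.2f} pq={pq:.2f}")

# Sum of the pq values, the analogue of cell M20
print(round(sum(pq for _, _, _, pq in stats), 2))
```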

We can calculate ρ_KR20 as described in Figure 2.

Figure 2 – Key formulas for worksheet in Figure 1

The value ρ_KR20 = 0.738 shows that the test has high reliability.

Worksheet Function

Real Statistics Function: The Real Statistics Resource Pack provides the following function:

KUDER(R1) = KR20 coefficient for the data in range R1.

For Example 1, KUDER(B4:L15) = .738.

KR-21

Where the questions in a test all have approximately the same difficulty (i.e. the mean score of each question is approximately equal to the mean score of all the questions), a simplified version of Kuder and Richardson Formula 20 is Kuder and Richardson Formula 21, defined as follows:

$$\rho_{KR21} = \frac{k}{k-1}\left(1 - \frac{\mu(k-\mu)}{k\sigma^2}\right)$$

where μ is the population mean of the total scores (in practice approximated by the observed mean score), and k and σ² are as defined for Formula 20.

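For Example 1, μ = 69/12 = 5.75 and k = 11, and so

$$\rho_{KR21} = \frac{11}{10}\left(1 - \frac{5.75\,(11 - 5.75)}{11\,\sigma^2}\right)$$

where σ² is the variance of the total scores from Figure 1.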

Note that ρ_KR21 typically underestimates the reliability of a test compared to ρ_KR20.
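As a rough illustration of the comparison, the sketch below computes both coefficients on hypothetical data. It assumes the commonly cited form of Formula 21, k/(k−1) · (1 − μ(k−μ)/(kσ²)), with μ the mean and σ² the population variance of the total scores. The function names and the data are made up for the example.

```python
from statistics import mean, pvariance

def kr20(scores):
    """KR-20 from per-question proportions and the variance of totals."""
    n, k = len(scores), len(scores[0])
    p = [sum(row[j] for row in scores) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    return k / (k - 1) * (1 - sum_pq / pvariance([sum(r) for r in scores]))

def kr21(scores):
    """KR-21: needs only k plus the mean and variance of the total scores."""
    k = len(scores[0])
    totals = [sum(row) for row in scores]
    mu = mean(totals)
    var_total = pvariance(totals)
    return k / (k - 1) * (1 - mu * (k - mu) / (k * var_total))

# Hypothetical data: 5 people, 4 questions of clearly unequal difficulty
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(demo), 3))  # 0.8
print(round(kr21(demo), 3))  # 0.667
```

Here the questions differ a lot in difficulty (p ranges from 0.2 to 0.8), so Formula 21 gives a noticeably lower value than Formula 20. When all p_j are equal the two formulas agree.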

Examples Workbook

Click here to download the Excel workbook with the examples described on this webpage.

397 thoughts on “Kuder and Richardson Formula 20”

Mohammadhosein saket

January 30, 2025 at 10:33 pm

Thanks
Che

November 7, 2024 at 10:40 am

Good day! May I ask what the answer is based on? How can you say whether it is high or not? How does the rating work?
- Charles
  
  November 7, 2024 at 2:40 pm
  
  Hello Che,
  Kuder and Richardson 20 is a special case of Cronbach’s alpha. You can interpret the rating as described at
  https://real-statistics.com/reliability/internal-consistency-reliability/cronbachs-alpha/cronbachs-alpha-basic-concepts/
  Charles
Jerzy

June 26, 2024 at 1:41 am

Hi Charles,

How does one interpret a negative result of KR20?

To get such an example, I downloaded your example worksheet. Then I put 1 everywhere except on the diagonal, so that Student 1’s only mistake is on Q1, Student 2’s only mistake is on Q2, and so on. Student 12 has everything correct.
Then the spreadsheet computes KR20 to be -11.
The variance in the term 1-Σpq/var is a very small number, while Σpq is relatively large. So Excel is correct, but the formula for KR20 clearly allows the result to be negative.
- Charles
  
  June 26, 2024 at 9:22 am
  
  Hi Jerzy,
  Yes, the value can be negative. I interpret this in the same way as a zero value — not good.
  Charles
  - Jerzy
    
    June 26, 2024 at 9:56 am
    
    Thanks!
    Note however, that in the example I have outlined the results are quite homogeneous, so intuitively, one would expect the test to be strange (with extremely high marks) but consistent. The formula says otherwise. Why?
    - Charles
      
      June 27, 2024 at 11:39 am
      
      Hi Jerzy,
      I don’t know why the test produces such a counter-intuitive result. To me this shows the limit of the test.
      Note that if you had used the Guttman test you would get similar results.
      Charles
      - Jerzy
        
        June 27, 2024 at 8:12 pm
        
        Thank you very much. I really appreciate your answers.
    - Mike
      
      October 21, 2024 at 2:41 am
      
      I’ve also been asking myself this and tried to look for possible reasons. Is it possible that the formula in cell B33 is missing some grouping symbols, which would keep it from becoming negative?
      
      Per the attached link, the cell contains the formula =(B30/(B30-1))*(1-B31/B32)
      in which the first factor has inner grouping symbols before the division while the second factor does not. Is it safe to use =(B30/(B30-1))*((1-B31)/B32) so that it does not give us the counter-intuitive result?
Danladi

December 30, 2023 at 9:49 am

If the test was to be a standardised test with 6,000 participants, and those that got each item right was between 3,000 and 4,500 per item. Compute the KR reliability coefficient.
- Charles
  
  December 30, 2023 at 10:10 am
  
  Hello Danladi,
  Sorry, but I don’t understand your comment. Do you have a question?
  Charles
- yangzunian
  
  May 21, 2024 at 10:46 am
  
  0.625
Peace Ehikhamenor

September 22, 2023 at 3:37 pm

Assuming you have five test items responded to by ten examinees on a polytonous test item, generate the scores (performance) and determine the reliability coefficient using the Kuder-Richardson formula.
- Charles
  
  September 22, 2023 at 6:13 pm
  
  Do you mean “polytomous”?
  If so what is the problem with using Kuder-Richardson?
  Charles
Chioma

August 8, 2023 at 4:42 pm

Please interpret B4:B15
- Charles
  
  August 8, 2023 at 5:24 pm
  
  Hello Chioma,
  For this example, there are 12 students, each of whom answered question Q1. A correct answer (for any student) is assigned the value 1 and an incorrect answer is assigned the value 0.
  Charles
  - yangzunian
    
    May 21, 2024 at 11:11 am
    
    Based on the given information, Charles (or any other student) has either answered the question Q1 correctly or incorrectly. Since a correct answer is assigned the value 1 and an incorrect answer is assigned the value 0, Charles’s answer would either be 1 or 0.
    
    Without knowing the actual answer or Charles’s specific response, I cannot definitively state what Charles’s answer is. However, if we had access to the actual data or Charles’s answer sheet, we could easily determine whether his answer was 1 (correct) or 0 (incorrect).
    
    In summary, Charles’s answer for question Q1 is either 1 (correct) or 0 (incorrect), but without additional information, we cannot determine which one it is.
- Nnaji, Anayo David
  
  September 7, 2023 at 2:36 am
  
  Serial number
Samuel Progress

July 31, 2023 at 10:20 pm

How did you arrive at the answer in cell B18?
- Charles
  
  August 1, 2023 at 8:46 am
  
  Samuel,
  As stated on the webpage, B18 contains the formula =B16/COUNT(B4:B15).
  Charles
sijjo

February 20, 2023 at 1:20 am

Sorry, Dr., I am confused about how to calculate the variance in KR20 outside of Excel.
- Charles
  
  February 20, 2023 at 9:57 am
  
  What is confusing you?
  Charles
Uzoigwe

January 11, 2023 at 10:06 am

Please sir, what is the sample size?
- Charles
  
  January 13, 2023 at 9:22 am
  
  For dichotomous data, i.e. the type of data supported by Kuder and Richardson Formula 20 (KR20), KR20 is equivalent to Cronbach’s alpha. Thus, you can use the estimates for the sample size for Cronbach’s alpha.
  See https://www.real-statistics.com/reliability/internal-consistency-reliability/cronbachs-alpha/cronbachs-alpha-power/
  Charles
Fadzli

January 1, 2023 at 3:18 pm

Dear professor,

First, thanks for the effort you have put into this project. I tried my best to search for the code for KR20 or KUDER(R1), but I didn’t find any in the dialog box. Could you advise me on this matter?

Thanks

Fadzli
- Charles
  
  January 2, 2023 at 4:01 pm
  
  Dear Fadzli,
  For dichotomous data (i.e. consisting only of 0’s and 1’s), KR20 is equivalent to Cronbach’s alpha. You can therefore use the Cronbach’s alpha dialog box.
  Charles
- yangzunian
  
  May 21, 2024 at 11:15 am
  
  To determine how the answer in cell B18 was resulted (assuming this is a reference to a spreadsheet context), we would need more context about the specific calculations or functions used in that cell. However, I can provide a general overview of how you might calculate a result in a cell based on the information you’ve given.
  
  Given that there are 12 students and each student’s answer to question Q1 is either 1 (correct) or 0 (incorrect), you might want to calculate the total number of correct answers, the percentage of correct answers, or some other statistic related to the answers.
  
  Here are some examples of how you could calculate a result in cell B18 based on the answers in cells B4:B15 (assuming the answers are entered in these cells):
  
  Total Correct Answers:
  
  Use the SUM function to add up all the values in cells B4:B15.
  Formula in B18: =SUM(B4:B15)
  
  Percentage of Correct Answers:
  
  First, calculate the total number of correct answers using SUM(B4:B15).
  Then, divide this by the total number of students (12) and multiply by 100 to get the percentage.
  Formula in B18: =(SUM(B4:B15)/12)*100
  
  Average Answer (which would just be the percentage of correct answers divided by 100):
  
  Use the AVERAGE function to calculate the average value in cells B4:B15.
  Formula in B18: =AVERAGE(B4:B15)
  Note that this will give you a decimal number between 0 and 1, representing the proportion of correct answers. To convert it to a percentage, multiply by 100.
  
  Other Statistics:
  
  You could also calculate other statistics like the mode (most common answer), standard deviation, etc., using appropriate functions like MODE.SNGL, STDEV.P, etc.
  
  Since I don’t have the actual spreadsheet or the specific calculation used in cell B18, I can only provide these general examples. The actual formula or calculation used would depend on what you’re trying to accomplish.
Katherine

December 2, 2022 at 8:29 pm

Why do you use the variance of the population rather than the variance of the sample? Also, I am confused why the p, q values are represented as a ratio (total correct or incorrect responses divided by the total number of possible correct/incorrect responses per question, e.g. 10/12 = 0.8333) but the variance represents the dispersion of the number of questions answered correctly per participant (not a ratio).
- Charles
  
  December 3, 2022 at 11:07 am
  
  Hi Katherine,
  You can use either the sample or population version (as long as you don’t mix them). You should get the same result.
  I don’t know why p and q are ratios and the others are not. It is just the way the math works out.
  Charles
  - James
    
    March 4, 2025 at 6:45 pm
    
    It is because you then SUM the p*q values, so the numerator also reflects the number of items in the scale.
    - Charles
      
      March 5, 2025 at 7:24 pm
      
      Hi James,
      You seem to be responding to a comment I made, but I can’t find the comment. When did I send it?
      Charles