Kuder and Richardson Formula 20

Basic Concepts

The Kuder and Richardson Formula 20 test checks the internal consistency of measurements with dichotomous choices. It is equivalent to performing the split-half methodology on all combinations of questions and is applicable when each question is either right or wrong. A correct answer scores 1 and an incorrect answer scores 0. The test statistic is

$$\rho_{KR20} = \frac{k}{k-1}\left(1 - \frac{\sum_{j=1}^{k} p_j q_j}{\sigma^2}\right)$$

where

k = number of questions

pj = proportion of people in the sample who answered question j correctly

qj = proportion of people in the sample who didn’t answer question j correctly (i.e. qj = 1 - pj)

σ² = variance of the total scores of all the people taking the test = VAR.P(R1) where R1 = array containing the total scores of all the people taking the test.
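The definitions above can be sketched in Python as follows. This is a minimal illustration rather than the Real Statistics implementation: the function name `kr20` and the small data set are made up for the example, and the code assumes every score is 0 or 1 and that the total scores are not all identical (otherwise the variance is zero).

```python
from statistics import pvariance


def kr20(scores):
    """KR-20 for a list of rows (one row per person) of 0/1 item scores."""
    n = len(scores)      # number of people
    k = len(scores[0])   # number of questions
    # p_j = proportion of people who answered question j correctly
    p = [sum(row[j] for row in scores) / n for j in range(k)]
    # q_j = 1 - p_j, so each term of the sum is p_j * q_j
    sum_pq = sum(pj * (1 - pj) for pj in p)
    # population variance of the total scores (VAR.P in Excel)
    var_total = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical data: 4 people (rows) answering 3 questions (columns)
data = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(data))  # 0.75
```

Here the sum of the pq values is 0.625 and the variance of the total scores is 1.25, so the statistic is (3/2)(1 - 0.5) = 0.75.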

Values usually range from 0 to 1, although the formula can produce negative values for unusual response patterns. A high value indicates reliability, while too high a value (in excess of .90) indicates a homogeneous test (which is usually not desirable).

Kuder-Richardson Formula 20 is equivalent to Cronbach’s alpha for dichotomous data.
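The equivalence can be checked numerically. The sketch below assumes Cronbach’s alpha is computed with population variances throughout; for a 0/1 item the population variance is exactly pq, which is why the two statistics agree. The function names and the data set are hypothetical.

```python
import math
from statistics import pvariance


def kr20(scores):
    """KR-20 from item proportions p_j and q_j = 1 - p_j."""
    n, k = len(scores), len(scores[0])
    p = [sum(row[j] for row in scores) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    var_total = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def cronbach_alpha(scores):
    """Cronbach's alpha using population variances for items and totals."""
    k = len(scores[0])
    # variance of each item (column); equals p_j * q_j for 0/1 data
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    var_total = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / var_total)


# Hypothetical data: 4 people (rows) answering 3 questions (columns)
data = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(math.isclose(cronbach_alpha(data), kr20(data)))  # True
```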

Example

Example 1: A questionnaire with 11 questions is administered to 12 students. The results are listed in the upper portion of Figure 1. Determine the reliability of the questionnaire using Kuder and Richardson Formula 20.


Figure 1 – Kuder and Richardson Formula 20 for Example 1

The values of p in row 18 are the proportion of students who answered that question correctly, e.g. the formula in cell B18 is =B16/COUNT(B4:B15). Similarly, the values of q in row 19 are the proportion of students who answered that question incorrectly, e.g. the formula in cell B19 is =1-B18. The values of pq are simply the product of the p and q values, with the sum given in cell M20.

We can calculate ρKR20 as described in Figure 2.


Figure 2 – Key formulas for worksheet in Figure 1

The value ρKR20 = 0.738 shows that the test has high reliability.

Worksheet Function

Real Statistics Function: The Real Statistics Resource Pack provides the following function:

KUDER(R1) = KR20 coefficient for the data in range R1.

For Example 1, KUDER(B4:L15) = .738.

KR-21

When the questions in a test all have approximately the same difficulty (i.e. the mean score of each question is approximately equal to the mean score of all the questions), a simplified version of Kuder and Richardson Formula 20 can be used. This is Kuder and Richardson Formula 21, defined as follows:

$$\rho_{KR21} = \frac{k}{k-1}\left(1 - \frac{\mu(k-\mu)}{k\sigma^2}\right)$$

where μ is the population mean of the total scores (approximated by the observed mean score).

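For Example 1, μ = 69/12 = 5.75, and so

$$\rho_{KR21} = \frac{11}{10}\left(1 - \frac{5.75\,(11 - 5.75)}{11\sigma^2}\right)$$

where σ² is the variance of the total scores shown in Figure 1.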

Note that ρKR21 typically underestimates the reliability of a test compared to ρKR20.
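A minimal Python sketch of the two formulas side by side illustrates the comparison. It assumes the standard KR-21 form, in which only the mean and variance of the total scores are needed; the function names and data are hypothetical, not part of the Real Statistics Resource Pack.

```python
from statistics import mean, pvariance


def kr20(scores):
    """KR-20 from item proportions p_j and q_j = 1 - p_j."""
    n, k = len(scores), len(scores[0])
    p = [sum(row[j] for row in scores) / n for j in range(k)]
    sum_pq = sum(pj * (1 - pj) for pj in p)
    var_total = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum_pq / var_total)


def kr21(scores):
    """KR-21: uses only the mean and variance of the total scores."""
    k = len(scores[0])
    totals = [sum(row) for row in scores]
    mu = mean(totals)               # observed mean total score
    var_total = pvariance(totals)   # population variance (VAR.P in Excel)
    return (k / (k - 1)) * (1 - mu * (k - mu) / (k * var_total))


# Hypothetical data: 4 people (rows) answering 3 questions (columns)
data = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr21(data), kr20(data))  # roughly 0.6 versus 0.75
```

For this data set the questions differ in difficulty (p = 0.75, 0.5, 0.25), so KR-21 comes out lower than KR-20. The two agree only when every question has the same proportion of correct answers.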



393 thoughts on “Kuder and Richardson Formula 20”

  1. Hi Charles,

    How does one interpret a negative result of KR20?

    To get such an example, I downloaded your example worksheet. I then entered 1 everywhere except on the diagonal, so that Student 1’s only mistake is on Q1, Student 2’s only mistake is on Q2, and so on. Student 12 has everything correct.
    The spreadsheet then computes KR20 to be -11.
    The variance in the term 1-Σpq/var is a very small number, while Σpq is relatively large. So the Excel calculation is correct, but the formula for KR20 clearly allows the result to be negative.

      • Thanks!
        Note however, that in the example I have outlined the results are quite homogeneous, so intuitively, one would expect the test to be strange (with extremely high marks) but consistent. The formula says otherwise. Why?

        • Hi Jerzy,
          I don’t know why the test produces such a counter-intuitive result. To me this shows the limit of the test.
          Note that if you had used the Guttman test you would get similar results.
          Charles

        • I’ve also been asking myself this and looked for possible reasons. Is it possible that the formula in cell B33 is missing some grouping symbols, which would keep it from becoming negative?

          Per the attached link, the cell contains the formula =(B30/(B30-1))*(1-B31/B32)
          The first factor has grouping symbols around the subtraction before the division, while the second factor does not. Is it safe to use =(B30/(B30-1))*((1-B31)/B32) so that it does not give the counter-intuitive result?

  2. Suppose the test is a standardised test with 6,000 participants, and the number who got each item right was between 3,000 and 4,500 per item. Compute the KR reliability coefficient.

  3. Assuming you have five test items responded to by ten examinees on a polytomous test, generate the scores (performance) and determine the reliability coefficient using the Kuder-Richardson formula.

    • Hello Chioma,
      For this example, there are 12 students, each of whom answered question Q1. A correct answer (for any student) is assigned the value 1 and an incorrect answer is assigned the value 0.
      Charles

      • Based on the given information, Charles (or any other student) has either answered the question Q1 correctly or incorrectly. Since a correct answer is assigned the value 1 and an incorrect answer is assigned the value 0, Charles’s answer would either be 1 or 0.

        Without knowing the actual answer or Charles’s specific response, I cannot definitively state what Charles’s answer is. However, if we had access to the actual data or Charles’s answer sheet, we could easily determine whether his answer was 1 (correct) or 0 (incorrect).

        In summary, Charles’s answer for question Q1 is either 1 (correct) or 0 (incorrect), but without additional information, we cannot determine which one it is.

  4. Dear professor,

    First, thanks for the effort put into this project. I tried my best to find KR20 or KUDER(R1), but I didn’t find either in the dialog box. Could you advise me on this matter?

    Thanks

    Fadzli

    • Dear Fadzli,
      For dichotomous data (i.e. consisting only of 0’s and 1’s), KR20 is equivalent to Cronbach’s alpha. You can therefore use the Cronbach’s alpha dialog box.
      Charles

    • To determine how the answer in cell B18 was obtained (assuming this is a reference to a spreadsheet context), we would need more context about the specific calculations or functions used in that cell. However, I can provide a general overview of how you might calculate a result in a cell based on the information you’ve given.

      Given that there are 12 students and each student’s answer to question Q1 is either 1 (correct) or 0 (incorrect), you might want to calculate the total number of correct answers, the percentage of correct answers, or some other statistic related to the answers.

      Here are some examples of how you could calculate a result in cell B18 based on the answers in cells B4:B15 (assuming the answers are entered in these cells):

      Total Correct Answers:

      Use the SUM function to add up all the values in cells B4:B15.
      Formula in B18: =SUM(B4:B15)

      Percentage of Correct Answers:

      First, calculate the total number of correct answers using SUM(B4:B15).
      Then, divide this by the total number of students (12) and multiply by 100 to get the percentage.
      Formula in B18: =(SUM(B4:B15)/12)*100

      Average Answer (which would just be the percentage of correct answers divided by 100):

      Use the AVERAGE function to calculate the average value in cells B4:B15.
      Formula in B18: =AVERAGE(B4:B15)
      Note that this will give you a decimal number between 0 and 1, representing the proportion of correct answers. To convert it to a percentage, multiply by 100.

      Other Statistics:

      You could also calculate other statistics like the mode (most common answer), standard deviation, etc., using appropriate functions like MODE.SNGL, STDEV.P, etc.

      Since I don’t have the actual spreadsheet or the specific calculation used in cell B18, I can only provide these general examples. The actual formula or calculation used would depend on what you’re trying to accomplish.

  5. Why do you use the variance of the population rather than the variance of the sample? Also, I am confused about why the p and q values are represented as a ratio (total correct or incorrect responses divided by the total number of possible correct/incorrect responses per question, e.g. 10/12 = 0.8333), but the variance represents the dispersion of the number of questions answered correctly per participant (not a ratio).

    • Hi Katherine,
      You can use either the sample or population version (as long as you don’t mix them). You should get the same result.
      I don’t know why p and q are ratios and the others are not. It is just the way the math works out.
      Charles
