Item Analysis Basic Concepts

Key Concepts

Item analysis is a technique that evaluates the effectiveness of items in tests. Two principal measures used in item analysis are item difficulty and item discrimination.

Item Difficulty: The difficulty of an item (i.e. a question) in a test is the proportion of the sample taking the test that answers that question correctly. This metric takes a value between 0 and 1. High values indicate that the item is easy, while low values indicate that the item is difficult.
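
As a minimal sketch of this definition (in Python rather than Excel; the function name item_difficulty and the response data are hypothetical and are not taken from Figure 1), the difficulty is simply the mean of the 0/1 scores on the item:

```python
def item_difficulty(item_scores):
    """Proportion of subjects who answered the item correctly (1 = correct, 0 = incorrect)."""
    if not item_scores:
        raise ValueError("at least one response is required")
    return sum(item_scores) / len(item_scores)


# Hypothetical responses from 8 subjects to a single question
responses = [1, 0, 1, 1, 0, 1, 1, 1]
difficulty = item_difficulty(responses)
print(difficulty)  # 0.75, i.e. a fairly easy item
```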

Item Discrimination: The discrimination of an item (i.e. a question) is a measure of how well it distinguishes subjects with more skill (based on whatever is being measured by the test) from those with less skill.

Discrimination Index

The principal measure of item discrimination is the discrimination index. To calculate it, select two groups, high-skilled and low-skilled, based on the total test score. E.g. you can define the high-skilled group as those subjects whose score on the entire test is in the top half and the low-skilled group as those in the bottom half. Alternatively, you can define the high-skilled group as those subjects whose total score is in the top 33% and the low-skilled group as those in the bottom 33%. The discrimination index is then the proportion of subjects in the high-skilled group who answered the item correctly minus the proportion in the low-skilled group who answered the item correctly.

The discrimination index takes values between -1 and +1. Values close to +1 indicate that the item does a good job of discriminating between high performers and low performers. Values near zero indicate that the question does a poor job of discriminating between high performers and low performers. Finally, values near -1 indicate that the item tends to be answered correctly by those who perform the worst on the overall test and incorrectly by those who perform the best on the overall test.
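
The calculation can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name discrimination_index, the rounding of the group size, and the data are hypothetical, and ties at the group boundary are simply broken by the original order of the subjects.

```python
def discrimination_index(item_scores, total_scores, fraction=1/3):
    """Proportion correct in the top group minus proportion correct in the bottom group.

    fraction is the share of subjects placed in each group (e.g. 1/3 or 0.5).
    """
    n = len(total_scores)
    # Assumption: group size is n * fraction rounded to a whole subject, at least 1
    k = max(1, round(n * fraction))
    # Subject indices ordered from highest to lowest total score (stable for ties)
    order = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    high = [item_scores[i] for i in order[:k]]
    low = [item_scores[i] for i in order[-k:]]
    return sum(high) / k - sum(low) / k


# Hypothetical data: total test scores and 0/1 scores on one item for 6 subjects
totals = [18, 16, 15, 9, 8, 5]
item = [1, 1, 0, 1, 0, 0]
d_half = discrimination_index(item, totals, fraction=0.5)
print(round(d_half, 3))  # top half 2/3 correct, bottom half 1/3 correct
```

An item answered correctly only by the lowest scorers, e.g. [0, 0, 0, 1, 1, 1] with the same totals, gives an index of -1.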

Point-biserial Correlation

Another measure of item discrimination is the point-biserial correlation between the scores on the entire test and the scores on the single item (where 1 = correct answer and 0 = incorrect answer).
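
The point-biserial correlation is the ordinary Pearson correlation applied to a 0/1 variable and a numeric variable. The sketch below (hypothetical function names and data, not the Real Statistics implementation) computes it both ways: directly as a Pearson correlation, and via the commonly quoted formula (M1 - M0)/s × sqrt(pq), where M1 and M0 are the mean total scores of those who got the item right and wrong, s is the population standard deviation of the total scores, p is the proportion correct, and q = 1 - p.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def point_biserial(item_scores, total_scores):
    """Point-biserial correlation via (M1 - M0) / s * sqrt(p * q)."""
    n = len(item_scores)
    ones = [t for x, t in zip(item_scores, total_scores) if x == 1]
    zeros = [t for x, t in zip(item_scores, total_scores) if x == 0]
    p = len(ones) / n
    mean_all = sum(total_scores) / n
    # Population (divide-by-n) standard deviation of the total scores
    sd_pop = math.sqrt(sum((t - mean_all) ** 2 for t in total_scores) / n)
    m1 = sum(ones) / len(ones)
    m0 = sum(zeros) / len(zeros)
    return (m1 - m0) / sd_pop * math.sqrt(p * (1 - p))


# Hypothetical data for 6 subjects
totals = [18, 16, 15, 9, 8, 5]
item = [1, 1, 0, 1, 0, 0]
r_pearson = pearson(item, totals)
r_formula = point_biserial(item, totals)
print(round(r_pearson, 3), round(r_formula, 3))  # the two values agree
```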

Example

Example 1: A 20-question test is given to 18 students. The table in Figure 1 shows the results for question 1 and for the whole test. Calculate the difficulty Df of question 1, its discrimination index D (using the top third vs. the bottom third), and its point-biserial correlation coefficient p.

Figure 1 – Item Analysis

We calculate the difficulty by Df = SUM(B4:B21)/COUNT(B4:B21) = 11/18 = .611. Each of the top and bottom thirds contains 18/3 = 6 students. Since 5 of the top 6 students answered question Q1 correctly and 2 of the bottom 6 got the question right, the discrimination index for Q1 is D = 5/6 - 2/6 = 3/6 = .5. The point-biserial correlation coefficient is p = CORREL(B4:B21,C4:C21) = .405.
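
The arithmetic for Df and D can be checked with a few lines of Python, using only the counts quoted above (the variable names are hypothetical). The point-biserial correlation is not reproduced here since it requires the full data from Figure 1.

```python
n_students = 18
n_correct = 11                 # students who answered Q1 correctly
group_size = n_students // 3   # top third and bottom third: 6 students each

df = n_correct / n_students
d = 5 / group_size - 2 / group_size  # 5 of the top 6 and 2 of the bottom 6 correct

print(round(df, 3), round(d, 3))  # 0.611 0.5
```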

Observation

In computing the discrimination index, the boundary between the high-skilled, medium-skilled, and low-skilled groups is not always clear. E.g. in Figure 1, the 6th and 7th highest total scores are both 16. So which of these two subjects should be placed in the high-skilled group? In this case, it doesn’t matter since both subjects scored 1 on Q1, but if one of them had a score of 1 and the other a score of 0, then we would have to make a decision. For our purposes, in such a situation we count the score for Q1 as the average of the tied subjects' scores, i.e. 0.5. More detail about this matter can be found in Real Statistics Item Analysis Functions.
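
One way to sketch this averaging rule in Python is shown below. It is an assumption-laden illustration, not the Real Statistics implementation: the function name group_correct_with_ties and the data are hypothetical, and the rule is generalized so that every subject tied at the cutoff score contributes the average item score of the tied subjects for each remaining place in the group.

```python
def group_correct_with_ties(item_scores, total_scores, k, top=True):
    """Number correct in the top (or bottom) k subjects, averaging over ties at the cutoff."""
    pairs = sorted(zip(total_scores, item_scores), key=lambda p: p[0], reverse=top)
    cutoff = pairs[k - 1][0]  # total score of the k-th subject in the ordering
    # Subjects strictly beyond the cutoff are in the group without ambiguity
    inside = [x for t, x in pairs if (t > cutoff if top else t < cutoff)]
    # Subjects tied at the cutoff share the remaining places in the group
    tied = [x for t, x in pairs if t == cutoff]
    places_left = k - len(inside)
    return sum(inside) + places_left * sum(tied) / len(tied)


# Hypothetical data: the 3rd and 4th highest totals are tied at 16,
# and the two tied subjects scored differently on the item
totals = [20, 19, 16, 16, 10, 8]
item = [1, 1, 1, 0, 0, 0]
top_correct = group_correct_with_ties(item, totals, k=3, top=True)
bottom_correct = group_correct_with_ties(item, totals, k=3, top=False)
print(top_correct, bottom_correct)  # 2.5 0.5
```

Dividing each count by the group size (3 here) and subtracting gives the discrimination index with ties averaged.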

21 thoughts on “Item Analysis Basic Concepts”

    • They are different approaches to the analysis of questions in an exam. Item analysis is quite straightforward, while item response theory is more complicated.
      Charles

  1. Hi. I am using the resource tool pax. I can calculate the difficulty index correctly. But when I compare the discrimination index with the formula, I can't get the same answer when the sample size is odd, such as 23. I treat 12 students as the higher group and the other 11 as the lower group. Please help me.

    • Hello Aishah,
      The current implementation of the discrimination index doesn’t permit a 12/11 split. The split depends on a percentage, such as 27%. There may be overlap between the two groups, which will be the case if you use 50% with 23 elements. See the link in the last paragraph of this webpage for more details.
      If you need to use a 12/11 split, you can do this manually, as described on the website.
      Charles

  2. Hello Sir, I would like to know how to find the discrimination index of 5-point Likert scale questionnaire items. Please suggest an appropriate method and how it would be calculated, i.e., in Excel or SPSS.

    • Magaji,
      It depends on what you are referring to. Some things can be done with Excel directly. Others can be done with the Real Statistics software, although you can also do them with Excel directly, but it will take more work.
      Charles

  3. Good afternoon, Doctor. Does item discrimination apply only to dichotomous questions? How would it be applied to questions with more than two possible answers?

  4. How would you do this same process in Excel to calculate difficulty/item discrimination if you’re using a Likert scale? That is, instead of correct/incorrect, you have a range of 1-5 for your responses.

    • Sara,
      I don’t know of a standard way of doing this, but here is an approach that might work for you:
      Item Difficulty: Take the mean of all the scores
      Item Discrimination: Instead of using the point-biserial correlation between the item and the total score, use the ordinary correlation between the item score and the total score (or the average score for the whole questionnaire).
      Whether this approach is suitable depends on why you want to use such measurements for Likert data.
      Charles

  5. If the difficulty value is low, let's say at most .250, can we remove the item, or under what conditions can an item be removed because it is too difficult? Or can we remove an item based on the Df result?

    • Whether or not you remove it is your decision. There could be good reasons for keeping a difficult question in the test. If the question is difficult and has poor discrimination, then I would remove it.
      Charles
