Randomized Complete Block Design

On this webpage, we discuss blocking and the randomized complete block design (RCBD).

Blocking is a technique for dealing with nuisance factors, i.e. variables that are not of interest in themselves but that influence the variables that are of interest. For the design described in CRD & RCBD, the Farm is such a nuisance factor since each farm potentially has different levels of moisture, fertility, etc.

In agriculture, a block consists of contiguous plots of land that share the same characteristics (moisture, fertility, acidity, etc.). If, for example, we want to compare the effects of different fertilizers on crop yield, we can assign the fertilizers (the treatments) at random to different plots within the block, thereby controlling for the nuisance factors.

If the nuisance variable is known and controllable, then we use blocking. If the nuisance variable is known but uncontrollable, we can use ANCOVA, while if the nuisance variable is unknown and/or uncontrollable, then we must rely on randomization to balance out its effect.

We now consider a randomized complete block design (RCBD). Here each block corresponds to one level of the nuisance factor.

The model takes the form:

$$x_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$$

which is equivalent to the two-factor ANOVA model without replication, where the B factor is the nuisance (or blocking) factor. As we can see from the equation, the objective of blocking is to remove the block-to-block variability from the error term, which results in a more sensitive test for differences between the treatments.
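To make the role of the error term concrete, here is a sketch of the usual sum-of-squares decomposition for this model. The notation is an assumption introduced for illustration: $a$ treatments, $b$ blocks, $x_{ij}$ the observation for treatment $i$ in block $j$, $\bar{x}_{i\cdot}$ and $\bar{x}_{\cdot j}$ the treatment and block means, and $\bar{x}$ the grand mean.

```latex
\begin{aligned}
SS_T &= \sum_{i=1}^{a}\sum_{j=1}^{b} (x_{ij}-\bar{x})^2
      = SS_{\text{Treat}} + SS_{\text{Block}} + SS_E, \\
SS_{\text{Treat}} &= b\sum_{i=1}^{a} (\bar{x}_{i\cdot}-\bar{x})^2, \qquad
SS_{\text{Block}} = a\sum_{j=1}^{b} (\bar{x}_{\cdot j}-\bar{x})^2, \\
F &= \frac{SS_{\text{Treat}}/(a-1)}{SS_E/\bigl((a-1)(b-1)\bigr)} .
\end{aligned}
```

If the same data were analyzed as a one-way ANOVA, the block sum of squares would be folded into the error, giving an error sum of squares of $SS_{\text{Block}} + SS_E$ on $a(b-1)$ degrees of freedom. When the blocks differ substantially, this tends to inflate the error mean square and so shrink the F statistic for treatments.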

Note that the one-way ANOVA model corresponds to what is called a completely randomized design (CRD).

In a randomized complete block design, we assign the seeds such that each of the three fields in any farm is assigned a different seed type.
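The difference between the two randomization schemes can be sketched in a few lines of Python. This is only an illustration: the seed labels, the number of farms (4), and the function names are assumptions, not part of the original example.

```python
import random

def crd_layout(treatments, n_blocks, rng):
    """CRD: shuffle all plots together, ignoring farm boundaries."""
    plots = [t for t in treatments for _ in range(n_blocks)]
    rng.shuffle(plots)
    k = len(treatments)
    # Cut the shuffled list into farms of k fields each
    return [plots[i * k:(i + 1) * k] for i in range(n_blocks)]

def rcbd_layout(treatments, n_blocks, rng):
    """RCBD: every farm receives each treatment exactly once, in random order."""
    return [rng.sample(treatments, k=len(treatments)) for _ in range(n_blocks)]

rng = random.Random(2024)   # fixed seed so the layout can be reproduced
seeds = ["A", "B", "C"]     # hypothetical seed-type labels
n_farms = 4                 # hypothetical number of farms (blocks)

rcbd = rcbd_layout(seeds, n_farms, rng)
crd = crd_layout(seeds, n_farms, rng)

for farm, fields in enumerate(rcbd, start=1):
    print(f"Farm {farm} (RCBD): {fields}")
for farm, fields in enumerate(crd, start=1):
    print(f"Farm {farm} (CRD):  {fields}")
```

In the RCBD layout each farm always contains all three seed types, whereas in the CRD layout a farm may contain the same seed type more than once.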

Randomized complete block design

This picture takes the following form when we add the yield:

Randomized complete block design

Actually, the order of the fields within each farm is not important in the analysis, and so we can view the yields per field in the following form:

RCBD Excel format

In fact, we will use the transpose of this picture, so that the columns of the data representation correspond to the treatments and the rows correspond to the blocking factor.
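As a small illustration of this transposition, the sketch below converts a table with one row per treatment into one with one row per block. The yields and labels are made up for the illustration.

```python
# Hypothetical yields: one row per seed type (treatment), one column per farm (block)
yields_by_treatment = [
    [31, 35, 28, 33],   # seed A on farms 1..4
    [27, 30, 25, 29],   # seed B on farms 1..4
    [36, 39, 31, 37],   # seed C on farms 1..4
]

# Transpose: one row per farm (block), one column per seed type (treatment)
yields_by_block = [list(row) for row in zip(*yields_by_treatment)]

for farm, row in enumerate(yields_by_block, start=1):
    print(f"Farm {farm}: {row}")
```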

Example 1: A company that plans to introduce a new type of herbicide wants to determine which dosage produces the best crop yield for cotton. Four fields are available for testing with each field having fairly uniform characteristics (size, moisture, fertility, etc.), although there are some differences between the fields. Each field is divided into 6 equal-sized plots, with dosages of 5, 10, 15, 20, 25 and 30 units of herbicide assigned to the plots at random. The yields are as shown in Figure 1.

RCBD example

Figure 1 – Yield based on herbicide dosage per field

We use a randomized complete block design, which can be implemented using Two Factor ANOVA without Replication. A key assumption for this test is that there is no interaction effect. We test this assumption by creating the chart of the yields by field as shown in Figure 2.

Yield chart (RCBD)

Figure 2 – Chart of the yield

We see that the lines for the four fields are roughly parallel, which indicates that the assumption of no interaction is reasonable.
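Parallel lines mean that the yields are, up to noise, the sum of a field effect and a dosage effect. One informal numeric companion to the chart is to look at the interaction residuals, i.e. what is left of each cell after removing its row mean and column mean. The sketch below uses a made-up table, not the data from Figure 1, and the function name is an assumption.

```python
def interaction_residuals(table):
    """table[i][j]: response for block i (row) and treatment j (column).

    Returns x_ij - row_mean_i - col_mean_j + grand_mean for every cell.
    """
    r, c = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(row[j] for row in table) / r for j in range(c)]
    return [[table[i][j] - row_means[i] - col_means[j] + grand
             for j in range(c)] for i in range(r)]

# Hypothetical, perfectly additive table: field effect + dosage effect
field_effect = [0, 2, -1, 3]
dose_effect = [10, 14, 19, 21, 20, 16]
additive = [[f + d for d in dose_effect] for f in field_effect]

res = interaction_residuals(additive)
max_abs = max(abs(v) for row in res for v in row)
print("largest residual, additive table:", max_abs)

# Disturb a single cell so that one line is no longer parallel to the others
perturbed = [row[:] for row in additive]
perturbed[0][0] += 6
res2 = interaction_residuals(perturbed)
max_abs2 = max(abs(v) for row in res2 for v in row)
print("largest residual, perturbed table:", max_abs2)
```

Residuals that are small relative to the treatment differences are consistent with roughly parallel lines; this is a descriptive check, not a formal test.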

We now run the Real Statistics Two Factor ANOVA data analysis tool using the data in Figure 1 as input, selecting the Excel input format and inserting 1 in the Number of Rows per Sample field. The main part of the output is shown in Figure 3.

RCBD - two-factor ANOVA

Figure 3 – RCBD using ANOVA

The rows correspond to the blocking factor and the columns correspond to the treatments. We are really only interested in the columns factor, and see that there is a significant difference between the dosages (p-value = 1E-08).
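For readers who want to see what the analysis computes, here is a minimal pure-Python sketch of two-factor ANOVA without replication, with rows as blocks and columns as treatments. The function name and the yields are assumptions made for illustration; the numbers are not the data from Figure 1, so the output will not match Figure 3.

```python
def rcbd_anova(table):
    """Two-factor ANOVA without replication.

    table[i][j] is the response for block i (row) and treatment j (column).
    """
    r, c = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(row[j] for row in table) / r for j in range(c)]

    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    ss_rows = c * sum((m - grand) ** 2 for m in row_means)   # blocks
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)   # treatments
    ss_err = ss_total - ss_rows - ss_cols

    df_rows, df_cols = r - 1, c - 1
    df_err = df_rows * df_cols
    ms_err = ss_err / df_err
    return {
        "SS": {"rows": ss_rows, "cols": ss_cols, "error": ss_err, "total": ss_total},
        "df": {"rows": df_rows, "cols": df_cols, "error": df_err},
        "F": {"rows": ss_rows / df_rows / ms_err,
              "cols": ss_cols / df_cols / ms_err},
    }

# Hypothetical yields: 4 fields (rows) x 6 dosages (columns)
yields = [
    [18.1, 20.6, 22.9, 24.2, 23.5, 21.0],
    [19.4, 22.3, 24.0, 26.1, 24.8, 22.7],
    [17.2, 19.9, 21.5, 23.8, 22.1, 20.3],
    [20.5, 22.8, 25.3, 26.9, 25.6, 23.4],
]

result = rcbd_anova(yields)
for key in ("rows", "cols", "error"):
    print(key, "SS =", round(result["SS"][key], 4), "df =", result["df"][key])
print("F (treatments) =", round(result["F"]["cols"], 4))
```

The p-value for the treatments is the right tail of the F distribution with (df cols, df error) degrees of freedom, which in Excel can be obtained with F.DIST.RT.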

Alternatively, we can use the RCBD Anova data analysis tool to get the same result. Here we press Ctrl-m, choose the Analysis of Variance option, and then select the Randomized Complete Block Anova option. We now fill in the dialog box that appears as shown in Figure 4.

RCBD dialog box

Figure 4 – RCBD data analysis tool dialog box

The output shown in Figure 5 is very similar to that shown in Figure 3.

RCBD data analysis

Figure 5 – Randomized Complete Block Anova

54 thoughts on “Randomized Complete Block Design”

  1. thank you Mr. Charles.
    If I have 8 different insecticide treatments and 3 replicates against whitefly, and I need to test the efficacy of the treatments for two seasons (2 years).

  2. Charles,
    What if the data show that the potential nuisance factor turns out to be not a problem, resulting in a large p-value for blocks? Would it be better to use a Completely Randomized Design to evaluate the effect of the Treatment Groups? In Example 1, if 2 is subtracted from all yields in Field 1 and 4 is added to all yields in Field 4, the Blocks are no longer significantly different and the effect of herbicide dosage appears to be better defined with a CRD.
    Thanks in advance for your thoughts on this approach,

    • Dave,
      CRD is the way to go unless you have restrictions. If the approach that you have suggested yields a CRD, that is fine provided:
      1. You can still test the hypothesis that you originally wanted to test, and
      2. The approach that you decided to use is not based on any information that you collected based on another test (RCBD, CRD, etc.) that you performed previously
      Charles

      • Charles,
        Thanks for these insights. I also discovered your page on Relative Efficiency of RCBD and CRD. It turns out that the Example 1 data have a blocking RE of 3.06, whereas the modified data described above gave a RE of only 1.06, which supports using the CRD instead of the RCBD.

  3. Dr., you do an amazing job here. I am trying to compare four okra varieties grown in open field and greenhouse for plant height, length of leaf, length of pod. How do I design this experiment and what statistical tool will be used to analyse the data?

    • Hello Eugene,
      If you are comparing the four varieties based on plant height, leaf length and pod length, then generally you should start by using MANOVA since you have multiple dependent variables.
      The experimental design (completely randomized design, randomized complete block design, etc.) depends on the constraints you have.
      Charles

  4. Hi Charles,

    I have 4 different treatment rooms (4 sections per room, but the sections are not considered replicates in my case because this is an air quality study) and 6 weeks of samples. Since I don’t have any replicates, should I consider week as a block or as a replicate? If I consider week as a block, can I call it an RCBD? Should I use one-way ANOVA or two-way ANOVA?

    • Sorry, but I don’t understand the scenario that you are describing. Why are the sections not replicates? What is the relationship between the 6-week samples and the rooms and sections? What hypothesis are you trying to test?
      Charles

      • Dr. Charles,

        Since I have 4 treatment room, one treatment in each room (one is control). Although each room have 4 section in it, we cannot consider as replicate because we are measuring dust concentration and section are not completely sealed plus dust can move from one section to another. Room number is our limitation.

        Hypothesis is dust level will differs after placing treatment. so, if we will consider room as a replicate or block, I got data significantly different. Can I do that or not?
        Room is treatment
        Week is dust measuring or sample taking

  5. Sir, my experiment is pot experiment at two locations (Factor C) repeated for two years. At each location there are 2 factors. Factor A has two levels and Factor B has 6 levels. Thus, there are 12 treatments. Each treatment has 3 replications (R).

    So there are 2*6*3 = A*B*R= 36 pots (experimental units) at each location.
    I will use Factor A and B as fixed factor and Factor C as random factor.

    Which design I should use to analyze in Excel?

    I want to see the individual as well as the interaction effects of Factor A and Factor B on yield and soil microbial population at two different times. I also want to see whether Factors A and B influence the variables differently at different locations.

    • Hello,
      If I understand correctly, you have 2 levels for Factor A, 6 levels for Factor B and 2 levels for Factor C, for 2x6x2 = 24 combinations, resulting in 24×3 = 72 data elements (ignoring Time for a moment). This can be analyzed using 3 Factor ANOVA, as described at
      Factorial ANOVA
      But you also mentioned that you have data for two years. This introduces more complexity, requiring the use of Three Factor Repeated Measures ANOVA or Three Factor MANOVA.
      Charles

  6. Dear Charles,
    I should analyse a multi-year experiment based on a RCBD:
    I have one experimental factor with 3 levels (3 treatments) and 3 blocks
    the experiment was repeated 3 times (in 3 different years).
    I suppose that the experimental factor is a fixed factor while block and year are random factors.
    I am interested in studying the effect of year and experimental factor on a dependent variable

    I am wondering whether this experiment could be analysed using Excel and how, could you please suggest some ideas?

    thank you very much

    best regards
    Donato

  7. Sir, my experiment is similar with Annes’s. I want to ask the same question.

    Metric will be used is survival rate.

    Hypothesis are first –
    Ho: There is a clear difference on the effectiveness of treatments towards seeds.
    H1: There is no difference on the effectiveness of treatments towards seeds.

    Second –
    Ho: there is a clear difference on the survival rates of the two type of seed.
    H1: there is no difference on the survival rates of the two type of seed.

    • Maddie,
      If you don’t have any constraints then you can use a one-factor ANOVA (or even a two independent sample t-test if there are only two treatments).
      Charles

      • What if I collect binary data (survived & mortal seedlings). Is ANOVA still applicable?
        By the way, I use two type of seeds and 3 type of treatments.

        • Maddie,
          Probably not since the normality assumption won’t be met.
          Which test to use depends on the details, particularly what hypotheses do you wish to test and the type of data that you have. E.g. Cochran’s Q Test is a non-parametric test for ANOVA with repeated measures where the dependent variable is dichotomous.
          Charles

  8. What experimental method I will use?

    Treatments (6) for both new and old seeds. 10 seeds per treatment, how is that?

  9. Please, I conducted an experiment using RCBD with 6 treatments to find out the critical period for weed control in finger in my state. I don’t know how to enter the data in Excel to conduct my analysis.

  10. Hi Charles,
    Say I have 30 progenies, each with 3 replications, and 5 plants /replication could you please outline a workflow to follow to generate an RBD for field experiment, how shall i do this in Excel.
    Thanks

    • I don’t completely understand the experiment that you are describing. What hypothesis are you testing? How does the experiment differ from that in Example 1?
      Charles

  11. I have 100 genotypes in three replications and want to test them for yield and other traits vis a vis check or control variety. Shall I go for CRD or RCBD

    • This depends on the details. What hypotheses are you trying to test? What is the relationship between the genotypes and the check or control variety? Are there any constraints in how you conduct your experiment?
      Charles

  12. Hi Charles,
    So i am conducting an experiment/study about pasta straws. Is this RCBD or CRD? And what type of statistical app is the best for this kind of study? Thank you.

    • Hi Drey,
      It depends on what hypotheses you are trying to test and what constraints you have. In general, the default would be CRD, but constraints on resources might lead you to using RCBD. It depends on the details.
      Charles

  13. Hi Charles,
    I have planted 5 different varieties of the same crop species in RCBD, each crop variety in 3 different treatments (seed sizes) to check the effect of seed size on growth and yield. What statistical analysis can I use for this data?

  14. Hello,
    Thank you so much for sharing information.
    I plant 10 chilli varieties (3 plants per one variety) inside greenhouse. For this type of experiment, how can I consider which experimental design it is?
    Please could you advise me which kinds of data analysis should I use?
    Thank you so much.

  15. Hi Charles,
    Thank you for taking out time to share your knowledge.
    I have data from a RCBD with 5 treatments and 3 replicates. I collected data from four plants per experimental unit. I would like to ask how I can enter this data, get means of each unit and eventually combine these means to analyse the data like you’ve done here.
    Thank you

  16. if i have disc plough, disc harrow, mouldboard plough, use for five treatment, what will the block design be

    • Hello Inusa Adamu,
      The block design will depend on many factors and so I can’t give an answer based on the information that you have supplied.
      Charles

  17. Charles,
    I would like to report errors on Figure 3 of the “RCBD w/ 1 missing data element” section.
    For the adjusted RCBD Anova analysis table, the SStotal should be 636.9843 rather than 654.7848. Therefore the SSe should be corrected accordingly as well.

    One question I have is that when I ran the same data (ie, 1 missing cell value) using the “Regression” option of the RCBD analysis tool, the SSgroup (477.1427) is not the same as the adjusted SSgroup (494.9431) based on the RCBD analysis. Should they be the same?

    -Sun

    • Charles,
      Please disregard the first part of the error report I made. I found out why the SStotal differs between my calculation and Figure 3.

      My SStotal is directly from the original data with the one missing cell based on DEVSQ, while the Figure 3 SStotal is obtained from =DEVSQ(the imputed missing value-SS (block correction) – SS(group correction).

        • Yes, please. I read your statement on the web stating that the one missing cell RCBD adjusted Anova results are not the same as the RCBD regression results. As I learned now that the SStotal and SSerror are differently calculated (ie, SS block and group correction factor), I wonder whether that was the main reason for the difference. Was that what you meant to describe. If so, it would be more informative to state such on the web.

          Thanks,
          -Sun

          • Hello Sun,
            I believe that I stated that the results would be different because when I tried both approaches I got different answers. I suggest that you try both approaches with one missing data element to see where the results differ. (Perhaps you have already done this).
            Charles

  18. Hi charles, could you please help me with this question

    with possible randomizations for a CRD and RCBD. Be sure you assign treatments to the experimental units in such a way that the CRD cannot be mistaken for an RCBD. Treatments = 5, replications = 3. (Code for random numbers: round(runif(25), digits=3))

  19. Dear Sir,
    I Find this your Update on RCB Design interesting and capture my attention as is part of my thesis.
    please can i have your Private Mobile Number

    • Usman,
      I am pleased that you find this webpage interesting. In the next couple of days, I expect to update the information about RCBD to include the cases where some data is missing.
      Sorry, but I don’t give out my private mobile number.
      Charles

  20. Assuming that in the course of the conduct of the study, the entry under dosage 5 units of field 3 was attacked by rodents and no data was obtained for this particular entry, how would I calculate the missing observation ?

    • Maria,
      I have not yet addressed the issue of missing data for a Randomized Complete Block Design. I am currently in the process of expanding the missing data capabilities of the Real Statistics website and software, especially by using the EM Algorithm. I will try to see if I have enough time to add support for the type of missing data that you have requested.
      Charles

      Update: I plan to include this topic in the next release, which should be available this month.

  21. Could you clear the issue of determining the plots in a random pattern after developing random number from excel.
    Scenario: 3 treatments, 3 replicates total of 9 plots

