Jackknifing for Deming Regression

Jackknife Procedure

To estimate the standard error of $\hat{\theta}$, which is an estimate of some parameter θ, we can use a nonparametric procedure called the jackknife method. It works as follows:

  1. For each element $x_i$ in the sample (actually the pair $(x_i, y_i)$ for Deming regression), calculate $\hat{\theta}_i$ in the same way as $\hat{\theta}$ was calculated, except that the element $x_i$ is left out of the calculation.
  2. Calculate the values $\hat{\theta}_i^* = n\hat{\theta} - (n-1)\hat{\theta}_i$
  3. Calculate the mean of these values, called the jackknifed estimate, by using the formula $\hat{\theta}^* = \frac{1}{n}\sum_{i=1}^{n} \hat{\theta}_i^*$
  4. The jackknifed estimates of the variance and standard error of $\hat{\theta}$ are now given by the formulas

$$V = \frac{1}{n-1}\sum_{i=1}^{n} \left(\hat{\theta}_i^* - \hat{\theta}^*\right)^2$$

$$\text{s.e.} = \sqrt{V/n}$$

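Since $\hat{\theta}_i^* - \hat{\theta}^* = -(n-1)(\hat{\theta}_i - \tilde{\theta})$ for the values from steps 2 and 3, the jackknifed estimate of the variance can also be calculated more simply as n – 1 times the sum of squared deviations of the $\hat{\theta}_i$ values. Thus

$$\text{s.e.} = \sqrt{\frac{n-1}{n}\sum_{i=1}^{n} \left(\hat{\theta}_i - \tilde{\theta}\right)^2}$$

where

$$\tilde{\theta} = \frac{1}{n}\sum_{i=1}^{n} \hat{\theta}_i$$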

We can use this technique to find confidence intervals for the regression coefficients b0 and b1 based on the t distribution. Implementations differ on the degrees of freedom: some use n–1, while Example 1 below uses n–2. We can also calculate a confidence interval for the prediction of y for a given value of x in the same manner.
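The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Real Statistics implementation: the function names and the sample data are hypothetical, and the statistic is passed in as an ordinary function. The sketch computes the standard error twice, once from the values in steps 2 to 4 and once from the simpler deviation-based form, so the two can be compared.

```python
import math
from statistics import mean, stdev


def jackknife_se(sample, estimator):
    """Delete-1 jackknife s.e. via (n - 1) * sum of squared deviations."""
    n = len(sample)
    # Leave-one-out estimates: recompute the statistic without element i
    loo = [estimator(sample[:i] + sample[i + 1:]) for i in range(n)]
    theta_tilde = mean(loo)
    devsq = sum((t - theta_tilde) ** 2 for t in loo)
    variance = (n - 1) * devsq           # jackknifed variance V
    return math.sqrt(variance / n)       # s.e. = sqrt(V / n)


def jackknife_se_pseudo(sample, estimator):
    """Same s.e., computed from the values n*theta_hat - (n-1)*theta_hat_i."""
    n = len(sample)
    theta_hat = estimator(sample)
    pseudo = [
        n * theta_hat - (n - 1) * estimator(sample[:i] + sample[i + 1:])
        for i in range(n)
    ]
    theta_star = mean(pseudo)            # jackknifed estimate
    variance = sum((p - theta_star) ** 2 for p in pseudo) / (n - 1)
    return math.sqrt(variance / n)


# Hypothetical sample, used only to exercise the functions
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8]
se_direct = jackknife_se(data, mean)
se_pseudo = jackknife_se_pseudo(data, mean)
se_classic = stdev(data) / math.sqrt(len(data))
```

For the sample mean, the jackknifed standard error coincides with the usual s/√n, which makes a convenient sanity check on the two functions.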

Examples

Example 1: Calculate the standard errors of the coefficients from Example 1 of Deming Regression Basic Concepts.

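First, we define $\bar{x}_i, \bar{y}_i, u_i, v_i, r_i$ to be the same as $\bar{x}, \bar{y}, u, v, r$ from Property 1 of Deming Regression Basic Concepts, except that the sample pair $(x_i, y_i)$ is omitted.

We note that

$$\bar{x}_i = \frac{n\bar{x} - x_i}{n-1} \qquad \bar{y}_i = \frac{n\bar{y} - y_i}{n-1}$$

Since

$$u = \sum_{j=1}^{n} (x_j - \bar{x})^2 = \sum_{j=1}^{n} x_j^2 - n\bar{x}^2$$

it follows that

$$\sum_{j=1}^{n} x_j^2 = u + n\bar{x}^2$$

and so the sum of squared data elements excluding $x_i$ is equal to $u + n\bar{x}^2 - x_i^2$, from which it follows that

$$u_i = u + n\bar{x}^2 - x_i^2 - (n-1)\bar{x}_i^2$$

which can be simplified to

$$u_i = u - \frac{n(\bar{x} - x_i)^2}{n-1}$$

Similarly

$$v_i = v - \frac{n(\bar{y} - y_i)^2}{n-1}$$

Since

$$r = \sum_{j=1}^{n} (x_j - \bar{x})(y_j - \bar{y}) = \sum_{j=1}^{n} x_j y_j - n\bar{x}\bar{y}$$

it follows that

$$\sum_{j=1}^{n} x_j y_j = r + n\bar{x}\bar{y}$$

and so the sum product of data elements excluding $x_i$ and $y_i$ is equal to $r + n\bar{x}\bar{y} - x_i y_i$, from which it follows that

$$r_i = r + n\bar{x}\bar{y} - x_i y_i - (n-1)\bar{x}_i\bar{y}_i$$

which can be simplified to

$$r_i = r - \frac{n(\bar{x} - x_i)(\bar{y} - y_i)}{n-1}$$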
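The shortcut formulas can be checked numerically against a direct recomputation with one pair removed. The following Python sketch does this for a small made-up data set; the function names and the data are hypothetical and serve only to confirm the algebra.

```python
def summary(xs, ys):
    """Return (x_bar, y_bar, u, v, r) for the paired sample."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    u = sum((x - x_bar) ** 2 for x in xs)
    v = sum((y - y_bar) ** 2 for y in ys)
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return x_bar, y_bar, u, v, r


def leave_one_out(xs, ys, i):
    """Shortcut formulas for the statistics with pair i omitted."""
    n = len(xs)
    x_bar, y_bar, u, v, r = summary(xs, ys)
    dx = x_bar - xs[i]
    dy = y_bar - ys[i]
    return (
        (n * x_bar - xs[i]) / (n - 1),   # x_bar_i
        (n * y_bar - ys[i]) / (n - 1),   # y_bar_i
        u - n * dx ** 2 / (n - 1),       # u_i
        v - n * dy ** 2 / (n - 1),       # v_i
        r - n * dx * dy / (n - 1),       # r_i
    )


# Hypothetical data, not the values from Figure 1
xs = [7.0, 8.3, 10.5, 9.0, 5.1, 8.2, 10.2, 10.3, 7.1, 5.9]
ys = [7.9, 8.2, 9.6, 9.0, 6.5, 7.3, 10.2, 10.6, 6.3, 5.2]

# Largest gap between the shortcut and the direct recomputation
max_diff = 0.0
for i in range(len(xs)):
    direct = summary(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
    shortcut = leave_one_out(xs, ys, i)
    for a, b in zip(direct, shortcut):
        max_diff = max(max_diff, abs(a - b))
```

The shortcut needs only the full-sample statistics and the omitted pair, which is why each worksheet row can be filled with a single non-array formula.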

Using these formulas, we can calculate the standard errors of the intercept and slope for Example 1 of Deming Regression Basic Concepts as shown in Figure 1.

Figure 1 – Standard errors of coefficients

Here, the values for x̄, ȳ, u, v, r are shown in range E14:I14, and can be calculated exactly as shown in Figure 1 of Deming Regression Basic Concepts.

Cell E4 contains the formula =($A$13*E$14-B4)/($A$13-1). Note that it could also be calculated by the array formula

=AVERAGE(IF($A$4:$A$13=$A4,"",B$4:B$13))

Similarly, cell F4 contains the formula =($A$13*F$14-C4)/($A$13-1).

Cell G4 can be calculated by the formula

=G$14-$A$13*(E$14-B4)^2/($A$13-1)

or optionally by the array formula

=SUMSQ(IF(A$4:A$13=A4,"",B$4:B$13-E4))

Similarly, cell H4 can be calculated by the formula

=H$14-$A$13*(F$14-C4)^2/($A$13-1)

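Cell I4 can be calculated by the formula =I$14-$A$13*(E$14-B4)*(F$14-C4)/($A$13-1) or optionally by the array formula =SUM(IF(A$4:A$13=A4,"",(B$4:B$13-E4)*(C$4:C$13-F4))). Here the omitted row is identified by its index in column A, as in the other array formulas, so that rows with equal x values are not dropped together.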

Now we apply the jackknifing procedure. Cell J4 contains the formula =F4-K4*E4 and cell K4 contains the formula

=($B$15*H4-G4+SQRT(($B$15*H4-G4)^2+4*$B$15*I4^2))/(2*$B$15*I4)

Finally, cells J16 and K16 contain =DEVSQ(J4:J13)*($A$13-1) and =DEVSQ(K4:K13)*($A$13-1). These are the jackknifed variances, i.e. n – 1 times the sum of squared deviations of the leave-one-out estimates. Cells J17 and K17 contain =SQRT(J16/$A$13) and =SQRT(K16/$A$13), yielding a standard error of 1.3375 for the intercept and 0.2193 for the slope.
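As a cross-check outside Excel, the whole calculation can be sketched in Python. The slope and intercept expressions mirror the formulas in cells K4 and J4; everything else is an assumption made for illustration: the function names, the data (which are made up, not the values in Figure 1), and the value chosen for `lam`, which plays the role of cell B15. The results will therefore not reproduce the figures quoted above.

```python
import math


def deming_fit(xs, ys, lam):
    """Return (b0, b1) using the same expressions as cells J4 and K4."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    u = sum((x - x_bar) ** 2 for x in xs)
    v = sum((y - y_bar) ** 2 for y in ys)
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = (lam * v - u + math.sqrt((lam * v - u) ** 2 + 4 * lam * r ** 2)) / (
        2 * lam * r
    )
    b0 = y_bar - b1 * x_bar
    return b0, b1


def deming_jackknife_se(xs, ys, lam):
    """Delete-1 jackknife standard errors of (b0, b1)."""
    n = len(xs)
    # Refit with each pair (x_i, y_i) omitted in turn
    fits = [
        deming_fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], lam)
        for i in range(n)
    ]
    errors = []
    for j in range(2):                       # j = 0: intercept, j = 1: slope
        values = [fit[j] for fit in fits]
        centre = sum(values) / n
        devsq = sum((t - centre) ** 2 for t in values)
        errors.append(math.sqrt((n - 1) * devsq / n))
    return tuple(errors)


# Hypothetical data and an illustrative lam
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
ys = [2.9, 5.2, 6.8, 9.1, 11.3, 12.7, 15.2, 16.8, 19.1, 21.0]
lam = 2.5

b0, b1 = deming_fit(xs, ys, lam)
se_b0, se_b1 = deming_jackknife_se(xs, ys, lam)
```

Refitting from scratch for each omitted pair is slower than the worksheet's shortcut formulas but easier to verify, and for ten data points the cost is negligible.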

The usual regression coefficient table can now be calculated as shown in Figure 2.

Figure 2 – Regression coefficient table

Note that we use df = n–2 = 10–2 = 8, although some implementations use df = n–1 = 9.

We see from Figure 2 that the slope coefficient is significant.
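For reference, the entries of a table of this kind are usually derived from each coefficient $b_j$ and its jackknifed standard error as follows. This is a sketch of the standard calculation, assuming each coefficient is tested against zero at significance level α; the figure itself may differ in layout.

$$t_j = \frac{b_j}{\text{s.e.}(b_j)} \qquad p_j = 2\,P\left(T_{df} > |t_j|\right) \qquad b_j \pm t_{crit} \cdot \text{s.e.}(b_j)$$

Here $T_{df}$ has a t distribution with df degrees of freedom and $t_{crit}$ is its two-tailed critical value at level α. In Excel, these two quantities can be obtained with T.DIST.2T and T.INV.2T.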

Observation: When the estimate of θ is a smooth function of the sample, the delete-1 jackknifing approach described above does a good job of estimating the standard error. When it is not smooth (i.e. a small change in the sample can result in a large change in the estimate of θ), this approach doesn’t work very well. An example of such an estimate is the median. In such cases, delete-d jackknifing or bootstrapping works better.

Delete-d jackknifing approach

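  1. For each subset T with d elements from the sample, calculate $\hat{\theta}_T$ in the same way as $\hat{\theta}$ was calculated, except that the elements in T are left out of the calculation.
  2. Calculate the mean of the $\hat{\theta}_T$, labeled $\tilde{\theta}$, using the following formula, where n = the sample size and the sum runs over all C(n, d) subsets T: $\tilde{\theta} = \frac{1}{C(n,d)}\sum_{T} \hat{\theta}_T$
  3. The jackknifed estimate of the standard error of $\hat{\theta}$ is now given by the formula

$$\text{s.e.} = \sqrt{\frac{n-d}{d \cdot C(n,d)}\sum_{T} \left(\hat{\theta}_T - \tilde{\theta}\right)^2}$$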

Note that some would replace n – d by n in the above formula.

In practice, it is best to choose a value of d between the square root of n and n.
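A minimal Python sketch of the delete-d approach follows, assuming the standard error takes the form with the factor (n – d)/(d·C(n, d)) in front of the sum of squared deviations. The function name and the data are hypothetical. Because every subset of size d is enumerated, this is only practical for small samples.

```python
import math
from itertools import combinations
from statistics import mean, median, stdev


def delete_d_jackknife_se(sample, estimator, d):
    """Delete-d jackknife s.e., enumerating every subset of d omitted items."""
    n = len(sample)
    estimates = []
    for omitted in combinations(range(n), d):
        skip = set(omitted)
        kept = [x for i, x in enumerate(sample) if i not in skip]
        estimates.append(estimator(kept))
    count = len(estimates)                   # equals C(n, d)
    theta_tilde = sum(estimates) / count
    devsq = sum((t - theta_tilde) ** 2 for t in estimates)
    return math.sqrt((n - d) / (d * count) * devsq)


# Hypothetical sample with n = 10, so d = 4 lies between sqrt(n) and n
data = [4.1, 5.3, 2.2, 7.8, 6.0, 3.9, 5.5, 8.4, 4.7, 6.6]
se_median = delete_d_jackknife_se(data, median, 4)

# Sanity check on a smooth statistic: for the mean, the result matches s/sqrt(n)
se_mean_d4 = delete_d_jackknife_se(data, mean, 4)
se_classic = stdev(data) / math.sqrt(len(data))
```

With d = 1 the function reduces to the delete-1 procedure described earlier. For larger samples, a common workaround is to evaluate only a random selection of subsets instead of all of them.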

Reference

NCSS (2016) Deming regression
https://www.ncss.com/wp-content/themes/ncss/pdf/Procedures/NCSS/Deming_Regression.pdf

17 thoughts on “Jackknifing for Deming Regression”

  1. Hi, the formula you wrote for the V calculation in J16 is =DEVSQ(J4:J13)*($A$13-1), but according to the formula you reported in the upper part of the text (after point 4), it should have been =DEVSQ(J4:J13)/($A$13-1). This difference is not clear to me.

    • Hi Maurizio,
      I have just checked the results obtained by Real Statistics against those obtained by R on the same data and the variance and standard errors agree.
      It does seem that the formula should use “/” instead of “*”, but apparently this is not the case. I still need to track down why.
      Charles

      • Hi, I also checked NCSS and in fact they agree, because the SE formula you reported has SQR((n-1)/n…), which is the same as the one used in R and is what results from your Excel.
        The discrepancy is in text formulas in italic:
        V has 1/n-1
        SE has V/n
        So SE should have 1/((n-1)*n) instead has n/n-1 that seems the correct one

        • Hello Maurizio,
          Sorry for the delayed response.
          I am a bit confused about your email. You state that SE should use 1/((n-1)*n) instead has n/(n-1). But you also state that R uses SQR((n-1)/n…), but doesn’t this contradict your previous statement?
          Charles

  2. Thank you very much for the materials. I have a question. The value of lambda is 2.5 in the table. How was it obtained? Please help me.

    Regards,
    Tapas.

  3. This example is very useful and informative. However, when I compute the variance and SE of the jackknifed estimates (col j & k), I obtain very different values from those in the table.
    var(b0) = .2208
    se(b0) = .1486
    var(b1) = .005938
    se(b1) = .02437

    Shouldn’t cells j16 and k16 be divided by 9, not multiplied by 9?

    Thank you.

    • Hi Tim,
      These cells should be multiplied by 9 as shown in step 2 of the webpage.
      I have checked the results with those generated by the NCSS software package and they agree.
      Charles

  4. This website has been a life saver for me. I’ve been tasked with trying to replicate in Excel the results of a Deming regression performed by a commercial product, and thanks to this website I have successfully calculated the slope and intercept. Is there another method (besides jackknifing) for determining the standard error of the slope and intercept? When I use the jackknife procedure as described, I get the same 95% CI as the Excel add-on “Analyse-it”, but the 95% CI from the commercial statistical software I’m using is much wider for the intercept (it is similar for the slope). I tried digging into the software documentation to see if they give information on how the CI is calculated, but so far no luck.

    Any guidance that anyone has would be greatly appreciated!
