Deming Regression Basic Concepts

Basic Approach

In ordinary linear regression, the y_i values are estimated from the x_i values with error ɛ_i. In Deming regression, it is assumed that also the x_i values are estimated with error, which we will denote δ_i.

We further assume that the ɛ_i and δ_i errors are independent of each other and both are distributed normally with a mean of zero. The variance of the ɛ_i values is denoted σ² and the variance of the δ_i values is τ².

The regression model now takes the form

where β₀ and β₁ are the intercept and slope coefficients and ŷ_i and $\hat x_i$ are the estimates of the true values of y_i and x_i, respectively.

Variances

If the values of σ² and τ² are known then we use σ² and τ² as the variances of the ɛ_iand δ_i errors. If they are not known then we must have multiple estimates of the x and y values. Assuming that for each i we have k_i estimates for the x_i and m_i estimates for the y_i, then for each i, we can estimate x_i as the mean of these k_i estimates x_ij and y_i as the mean of these m_i estimates y_ij, i.e.

where n is the sample size. Note that k_i > 1 and m_i > 1 , but k_i can be different from m_i.

We can also estimate the values of σ² and τ² by

Note if all the k_i are equal (say to k) and all the m_i are equal (say to m), then each estimate of the variance is equal to the average of the row variances, i.e.

Estimated variances alternative

Property

Property 1: If λ = s²/t², then the estimates of the coefficients that minimize the Deming sum of squares

are given by

where x̄ and ȳ are the means of the x_i and y_ivalues respectively, and

Formulas for u v r

Note: In some references, λ is defined as the reciprocal of the value shown in Property 1.

Definitions

Definition 1: The estimated true values for the x_i and y_i are then calculated by

where the e_i are the (raw) residuals

Definition 2: In addition to the raw residuals defined above, there are the following additional types of residuals:

Example (known variances)

Example 1: Find the Deming regression equation for the data in columns A, B, and C of Figure 1. Here the variance of the measurements for the x values is known to be .05 and the variance for the y values is known to be .02.

Figure 1 – Calculation of regression coefficients

Using the formulas described above we see that the regression formula is

y = -.1708 +1.018x

We can characterize the sample data and residuals as described in Figure 2 using the formulas in Definitions 1 and 2.

Figure 2 – Residuals Report

For example, the formula in cell M6 is =F$12+K6*F$13 (referring to Figure 1), the formula in cell P6 is =L6-M6, the formula in cell N6 is =K6+$F$11*$F$13*P6/($F$11*$F$13^2+1), and the formula in cell O6 is =L6-P6/($F$11*$F$13^2+1). The formula shown in cell Q6 is =K6-N6, the formula in cell R6 is =L6-N6, and, finally, the formula in cell S6 is =SIGN(P6)*SQRT(Q6^2+$F$11*R6^2).

Note, further, that the means of these residuals (as shown in row 16) are all close to zero, as expected.

Testing Residuals for Normality

One of the assumptions for Deming regression is that the residuals are normally distributed. We test the optimized residuals (range P6:P15) for normality using a QQ plot and Shapiro-Wilk, as shown in Figure 3. Both tests confirm that the residuals are normality distributed.

Figure 3 – Testing optimized residuals for normality

Example (unknown variances)

Example 2: Find the Deming regression equation for the data in Figure 4.

Figure 4 – Deming Regression Data

This time note that there are 3 measurements for each x value and 2 measurements for each y value. We, therefore need to calculate the x and y variances for each subject in order to calculate lambda. In order to carry out the Deming regression, we also need to take the mean of the x and y measurements for each subject. This is shown on the left side of Figure 5.

Figure 5 – Deming regression

For example, cell H4 contains the formula =DEVSQ(B4:D4), cell I4 contains =DEVSQ(E4:F4), cell L4 contains =AVERAGE(B4:D4), and cell M4 contains the formula =AVERAGE(E4:F4).

Cells L15 and M15 contain the variances for x and y values, as calculated by the worksheet formulas =SUM(H4:H13)/(COUNT(B4:D13)-A13) and =SUM(I4:I13)/(COUNT(E4:F13)-A13). The value of lambda shown in cell P10 is calculated as 47.5/19.7 = 2.411168.

Using the data in columns K, L, and M we can calculate the regression coefficients exactly as we did in Example 1. The regression equation is y = -15.9117 + .772981x.

Worksheet Functions

Real Statistics Functions: For array or range R1 containing X values and R2 containing y values, we have two forms of the following two array functions. The first version of each corresponds to data as in Example 1 where lambda is known and the second where lambda is calculated from the data as in Example 2.

DRegCoeff(R1, R2, λ, lab) = 2 × 2 array consisting of the intercept and slope coefficients and standard errors for Deming regression on the data in R1 and R2 where lambda = λ.

DRegCoeff(R1, R2,, lab) = DRegCoeff(R3, R4, λ, lab) where R3 consists of the averages of the X data measurements in R1, R4 consists of the averages of the Y data measurements in R2 and the λ is calculated as in Example 2.

DRegResiduals(R1, R2, λ, lab) = n × 7 array consisting of pred y, x-hat, y-hat, raw residual, x-residual, y-residual and optimized residual for each pair of data elements in R1 and R2 based on the Deming regression on the data in R1 and R2 where lambda = λ and n = the number of elements in R1 (or R2). It is assumed that R1 and R2 are column arrays.

DRegResiduals(R1, R2,, lab) = DRegResiduals(R3, R4, λ, lab) where R3 consists of the averages of the X data measurements in R1, R4 consists of the averages of the Y data measurements in R2 and the λ is calculated as in Example 2.

If lab = TRUE (default FALSE), then an extra column is appended to the output from DRegCoeff containing the labels “intercept” and “slope”. Similarly, If lab = TRUE (default FALSE), then an extra row is appended to the output from DRegResiduals with the labels shown in range M5:S5 of Figure 2.

Non-array function

In addition, we have the following non-array function:

DRegLambda(R1, R2) = the lambda value calculated from R1 and R2 as described in Example 2.

Note that for Example 1, the array formula =DRegCoeff(B4:B13, C4:C13, 2.5) produces the coefficients shown in F12:F13 of Figure 1. For Example 2, =DRegCoeff(B4:D13, E4:F13) produces the coefficients shown in P11:P12 of Figure 5. Also for Example 2, =DRegLambda(B4:D13, E4:F13) produces the result shown in cell P10 of Figure 5.

For Example 1, =DRegResiduals(B4:B13, C4:C13, 2.5, TRUE) produces the output shown in range M5:S15 of Figure 2.

Conclusion

We have now shown how to calculate the regression coefficients in the case where the measurement variances are known (Example 1) and when they need to be estimated from the data (Example 2). See Jackknifing for how to calculate the standard error of these coefficients by using a technique called jackknifing.

Examples Workbook

Click here to download the Excel workbook with the examples described on this webpage.

References

NCSS (2016) Deming regression
https://www.ncss.com/wp-content/themes/ncss/pdf/Procedures/NCSS/Deming_Regression.pdf

Tellinghuisen, J. (2020) Least squares methods for treating problems with uncertainty in x and y. Anal. Chem. 2020, 92, 16, 10863–10871
https://pubmed.ncbi.nlm.nih.gov/32678579/

Mandel, J. (2018) Fitting Straight Lines When Both Variables are Subject to Error
https://www.tandfonline.com/doi/abs/10.1080/00224065.1984.11978881

43 thoughts on “Deming Regression Basic Concepts”

GERARDO ARDILA

September 25, 2023 at 12:58 am

Dear Charles, I hope you and your family are ok. Please i am doing Demkin Revgression, but the resoult are:

Deming Regression
alpha 0,05
coeff std err df t stat p-value lower upper
#¡VALOR! #¡VALOR! #¡VALOR! 9 #¡VALOR! #¡VALOR! #¡VALOR! #¡VALOR!
#¡VALOR! #¡VALOR! #¡VALOR! 9 #¡VALOR! #¡VALOR! #¡VALOR! #¡VALOR!

Hypothesis Testing
alpha 0,025
test param std err df t stat p-value lower upper
slope = 1 #¡VALOR! #¡VALOR! 9 #¡VALOR! #¡VALOR! #¡VALOR! #¡VALOR!
identity #¡VALOR! #¡VALOR! 9 #¡VALOR! #¡VALOR! #¡VALOR! #¡VALOR!

coeff std err df t stat p-value lower upper
#¡VALOR! #¡VALOR! #¡VALOR! 9 #¡VALOR! #¡VALOR! #¡VALOR! #¡VALOR!
#¡VALOR! #¡VALOR! #¡VALOR! 9 #¡VALOR! #¡VALOR! #¡VALOR! #¡VALOR!
Please help me.
Thanks
Reply
- GERARDO ARDILA
  
  September 25, 2023 at 1:03 am
  
  Excuseme, I have had an error.Thanks
  Reply
  - Charles
    
    September 25, 2023 at 10:50 am
    
    Hello Gerardo,
    We are all doing well. I hope that you and your family are ok as well.
    If you are getting error values, please send me an Excel file with your data so that I can try to figure what is causing the error.
    Charles
    Reply
Blair S

June 29, 2023 at 11:55 pm

I’m trying to understand how the variance is determined for the x and y variables in figure 5. I can’t seem to duplicate the values that are generated in the example.
Reply
- Charles
  
  June 30, 2023 at 7:29 pm
  
  Hello Blair,
  Have you downloaded the Excel spreadsheet containing the example described on the webpage? This spreadsheet should explain the calculations. You can click on the link towards the bottom of the webpage.
  Charles
  Reply
  - Blair S
    
    June 30, 2023 at 10:42 pm
    
    Thanks Charles
    Reply
Tony Kwan

April 16, 2023 at 12:20 pm

Dr. Zaiontz,
thanks for the tutorial and detailed illustration. I am wondering how b0 and b1 can be calculated in Fig 1 without doing minimization (such as using the Solver in excel)?
Reply
- Charles
  
  April 17, 2023 at 7:48 pm
  
  Hi Tony,
  Probably so. I have used Solver for many types of regression, as illustrated elsewhere on this website, but haven’t tried it for Deming Regression.
  Charles
  Reply
  - Tony Kwan
    
    April 18, 2023 at 8:39 pm
    
    Dr. Zaiontz,
    sorry for not being clear in my question, my question was why the fig 1 calculation does not appear to require a minimization step? I don’t mean it has to be Solver. My question was why there was no minimization step in Fig 1 when b0 and b1 are calculated. shouldn’t a regression always look for a pair of b0 and b1 that minimizes the summed deviations?
    thanks
    Reply
abrar wadera

December 30, 2022 at 10:32 pm

Dear Sirs
There is one more small error in line below this

“We can also estimate the values of σ2 and τ2 by” line 25 approximately

you are replacing σ2 by s2 and τ2 by t2, i have also sent a mail for the same.

rgds
abrar
Reply
- Charles
  
  January 2, 2023 at 7:23 pm
  
  Hello Abrar,
  Sorry, but I don’t understand what the error is. Sigma squared is indeed estimated by s squared.
  Charles
  Reply
  - abrar wadera
    
    January 4, 2023 at 5:09 am
    
    ‘We can also estimate the values of σ2 and τ2 by’
    In this line you define σ2 and τ2
    
    below line above you write as s-square s2 intead of σ2, it is there just concentrate a bit.
    
    rgds
    abrar
    Reply
    - Charles
      
      January 5, 2023 at 11:53 am
      
      Sorry, but I don’t understand your comment. sigma-squared is the population parameter, while s-square is the sample statistic used as an estimate of sigma-squared.
      Charles
      Reply
abrar wadera

June 8, 2022 at 11:36 pm

Dear Sirs Thanks for such a real explanation for Demming Regression, however there seem to be some typos as in these two lines ……

Note, further, that the mean of these residuals are all close to zero (see row 30), as expected.
test the optimized residuals (range P20:P29) for normality using a QQ plot and Shapiro-Wilk, as

where is row 30
where is range P20:P29

Kindly clarify
Reply
- Charles
  
  June 9, 2022 at 9:44 am
  
  Thank you for bringing these typos to my attention.
  I have just corrected the webpage
  row 30 => row 16
  range P20:P29 => P4:P15
  I appreciate your help in improving the quality of the Real Statistics website.
  Charles
  Reply
  - abrar wadera
    
    June 10, 2022 at 5:45 am
    
    Thanks a lot, your work is superb, some, i think you should have typed P6-P15 instead of P4:P15, but it is now meaningful, i think you should correct it for new commers leave sites if they see any error.
    Reply
    - Charles
      
      June 10, 2022 at 9:15 am
      
      Yes, it has been corrected to P6:P15.
      Charles
      Reply
Elias Ikonen

March 25, 2022 at 2:24 pm

Dear Charles,

I was struggling to understand how deming regression works, but when I found this page it took about an hour.

Thank you very much for you wonderful webpage!

-Elias
Reply
- Charles
  
  March 26, 2022 at 7:34 am
  
  Hello Elias,
  Glad that I could help.
  Charles
  Reply
Igor Lima

November 2, 2021 at 6:12 am

Dear Charles, good morning!

Have you ever thought about turning your website into a book? I would buy! I learned more about some Statistics topics from you than from other teachers! You are extremely didactic!

A big hug from Brazil, Igor
Reply
- Charles
  
  November 2, 2021 at 5:17 pm
  
  Igor,
  Thank you for your very kind words and for your suggestion.
  I originally wrote the website as a book and planned to publish it, but ultimately I decided against it since I was constantly updating it and adding content, something I couldn’t do with a book.
  Charles
  Reply
Julian G.C.

September 28, 2021 at 6:39 pm

Charles,
Thank you very much for your webpage and your package, I found it very useful!
I have one question, just to be sure. When I use the Deming regression tool in Excel, lambda is defined as the quotient of s2/t2 as in this page?
I ask because in Saylor et al, 2006, they define the quotient as t2/s2.
My results using your package seem more compatible with your definition, but I need to make sure. Which reference do you suggest to check the formulas (in addition of the obscure book by Deming itself)?
Cheers,
Julian

Saylor, R., Edgerton, E., & Hartsell, B. (2006). Linear regression techniques for use in the EC tracer method of secondary organic aerosol estimation. Atmospheric Environment, 40(39), 7546–7556. https://doi.org/10.1016/j.atmosenv.2006.07.018
Reply
- Charles
  
  October 3, 2021 at 9:18 pm
  
  Hello Julian,
  I am using the following reference:
  NCSS (2016) Deming regression
  https://ncss-wpengine.netdna-ssl.com/wp-content/themes/ncss/pdf/Procedures/NCSS/Deming_Regression.pdf
  Charles
  Reply
Andreas

September 6, 2021 at 5:54 pm

Dear Charles,
For comparison’s reason, I’m looking for some kind of “coefficient of determination” to serve as an analogy for the least squares R2.

Which formula can be used to calculate R2 for the Deming regression?
Thank you in advance!

Kind regards,
Andreas
Reply
- Charles
  
  September 7, 2021 at 9:51 am
  
  Dear Andreas,
  See
  https://www.stata.com/statalist/archive/2008-10/msg00284.html
  https://www.graphpad.com/guides/prism/latest/curve-fitting/reg_qa_deming_regression.htm
  Charles
  Reply
  - Andreas
    
    September 7, 2021 at 5:19 pm
    
    Dear Charles,
    Thank you very much! You helped me a lot.
    Kind regards,
    Andreas
    Reply
sanju gautam

November 20, 2020 at 10:09 am

how to calculate variance error if we have dataset like example 1
Reply
- Charles
  
  November 20, 2020 at 7:00 pm
  
  How to do this is described at the end of this webpage or see
  Jackknifing
  Charles
  Reply
Mehmet Aslan

June 18, 2020 at 3:42 pm

Hi there,
In simple Deming regression, how did we find var values as 0,05 and 0,02? Can you describe?
Reply
- Charles
  
  June 19, 2020 at 9:53 am
  
  Mehmet,
  It is not stated where these values come from. You can consider them to be assumptions. In reality, they probably came from previous testing.
  Charles
  Reply
JMS Martins

January 11, 2020 at 6:22 pm

Please correct: The formula shown in cell Q6 is =K20-N6, the formula in cell R6 is =L20-N6

should be:
The formula shown in cell Q6 is =K6-N6, the formula in cell R6 is =L6-O6

Suggestion: Why not shift fig.2 two lines up, so that it would align with fig.1 (i.e., “Subect” would b in line 3 in both figures)?
Reply
- Charles
  
  January 16, 2020 at 10:50 pm
  
  Hello JMS,
  Thanks for identifying these two typos. I have now made the corrections that you suggested.
  I don’t quite understand what shifting Fig 2 up to lines does.
  Charles
  Reply
  - JMS Martins
    
    January 19, 2020 at 9:51 pm
    
    “I don’t quite understand what shifting Fig 2 up to lines does.”
    
    Nothing! Just keeps corresponding things in the same line of the sheet. Ex.: Subject1 is in line 4 of fig.1 and in line 6 of fig, 2. If you put the information os bothe figures side by side in the same sheet, the alignment of both sections will make it more readable.
    Reply
    - Charles
      
      January 19, 2020 at 10:25 pm
      
      Ok. Thanks for the explanation. I will look into this.
      Charles
      Reply
Cyndi W.

July 16, 2019 at 8:51 pm

Should the λ be τ2 in the fourth paragraph?

“If the values of σ2 and λ are known then we use σ2 and τ2 as the variances of ɛi and δi errors. If they are not known then we must have multiple estimates of the x and y values. Assuming that for each i we have ki estimates for the xi and mi estimates for the yi, then for each i, we can estimate xi as the mean of these ki estimates xij and yi as the mean of these mi estimates yij, i.e.”
Reply
- Cyndi W.
  
  July 16, 2019 at 9:51 pm
  
  Nevermind – figured out my error.
  Reply
Berns Buenaobra

March 31, 2019 at 12:10 am

Another of those sophisticated topic well elucidated. Thanks very much.
Reply
Sabina

July 20, 2018 at 9:03 am

Hello Dr. Zaiontz,

I am trying to perform regression analysis for my data, where the x values have a known error that is not constant across the whole range. Can Deming regression still be applied in this case, or if not what analysis is more appropriate?
Thank you.
Reply
- Charles
  
  July 20, 2018 at 5:17 pm
  
  Sabrina,
  Deming regression requires homogeneity of variances for each of the two types of errors.
  I don’t know how to modify Deming regression when this assumption is not met.
  Charles
  Reply
Hazem Haddad

December 18, 2017 at 5:19 pm

Well done Sir.
Have you published something with such detail for a nonlinear model? Implicit model?
Reply
- Charles
  
  December 19, 2017 at 8:06 am
  
  Hazem,
  No, but if you look at the following webpage you can see how to perform nonlinear regression.
  https://real-statistics.com/regression/exponential-regression-models/exponential-regression-using-solver/
  Charles
  Reply