Log-linear Regression

Background

In Linear Regression Models for Comparing Means and ANOVA using Regression we studied regression where some of the independent variables were categorical. In this part of the website, we look at log-linear regression, in which all the variables are categorical. Log-linear regression provides a new way of modeling chi-squared goodness of fit and independence problems (see Independence Testing and Dichotomous Variables and Chi-square Test for Independence).

Log-linear model

We use the following model:

image2235

where all the xij are dummy variables coded to represent categorical variables. In addition, we also consider more complicated models that contain factors consisting of interactions between these variables, as described in the webpages listed below, and the yi are used to express the frequency of outcomes.

We will consider the cases where k = 2 or 3. The case where k = 2 corresponds to the two-way contingency tables studied in Independence Testing and re-examined in Two-way Contingency Tables. The case where k = 3 corresponds to three-way contingency tables, which are examined in Three-way Contingency Tables.

Topics

Reference

Howell, D. C. (2010) Statistical methods for psychology (7th ed.). Wadsworth, Cengage Learning.
https://labs.la.utexas.edu/gilden/files/2016/05/Statistics-Text.pdf

15 thoughts on “Log-linear Regression”

  1. Regarding my previous question about log linear regression, or log linear analysis never mind. I found my answer (it is I cannot use it). Participants can only contribute one observation to the data (i.e. they cannot be their own controls)

    Reply
  2. I am trying to determine whether one screening blood test (measuring hemoglobin A1C) is better than another blood test (blood glucose measurement while fasting, or while not fasting) at identifying people to be referred (according to guidelines) for care who may have either pre-diabetes or diabetes. My problem is similar to the example involving cure, gender and therapy in that the variable Referral is dichotomous and can be positive (i.e. the patient is referred) or negative (the patient is not referred); the variable Disease is dichotomous and either pre-diabetes or diabetes; and the variable Test can be one of three (A1C, Blood glucose fasting, blood glucose not fasting). However, in my problem the subject will serve as their own control. Can i still use log linear regression to determine which test is better?

    Reply
  3. I was told to use a log linear relationship to solve a multiple regression problem involving 10 different y values ranging from 10- 30 million and 10 different x values where x1 ranges from 1-2, x2 ranges from 30-71 and x3 ranges from 444-687. How do I do this with Xcel?

    Reply
  4. Dr.bs tardes, Dr excuseme la molestia, que modelo me sugire si tengo en las variables indepientes una continua y varias dummy, y en la dependiente una variable continua?.

    Dr good evening. Dr. excuseme which multiple model you could suggest me when it have some dummy variables and one continuos varible? The dependent variable is continuous.

    Reply
  5. Hi,

    I’m currently taking a bio statistic course online and of course its very difficult. What makes it difficult is that on some variables data is missing. Also I have to figure out which statistical analysis is best suited. I know I calculate it with the missing data or I can calculate the mean and use that in place of the missing data. I’m just not sure which route to go. I have to select the appropriate statistical test that allows me to conduct an analysis of the factors affecting hospital costs. can I sent the workbook for better clarification?

    Reply
  6. Dear Sir,
    I have learnt a lot from your literature on and examples of using logistic regressions. I am analyzing a community’s satisfaction on the performance of a project (dependent variable), with 5 independent variables that are categorical; in fact, they are rated on a scale of 1 to 4(1 = highly unsatisfactory; 4 = highly satisfactory). Plus, I have 3 dummy variables to throw into model. How will I model this using logistic regression? I have read your notes on log-linear regression, but I still can’t figure out how to apply it.

    Looking forward to your reply.

    Reply
    • Samuel,

      I understand that you are using a 1 to 4 rating for “community’s satisfaction on the performance of a project (dependent variable).” If so then you probably want to use an ordinal logistic regression model.

      You can’t use the binary logistic regression model since you have 4 (and not 2) values for the dependent variable. You could use a multinomial model, but this wouldn’t take the order of the ratings into account.

      You can’t use a log-linear regression model since the dependent variable doesn’t take continuous values. Think of the log-linear regression model as an extension of chi-squares independence testing.

      Charles

      Reply
  7. My name is Mamuye from Ethiopia, I need to evaluate the community’s satisfaction in their irrigation service using Logit Model with five independent variables, I have understood what and how I am going to do, as I saw different literature’s how the model parameters will be estimated ,but it make gap, so how this parameters ( the constant value and coefficients ) will be estimated.

    Reply

Leave a Comment