Logistic Regression

When the dependent variable is categorical, the relationship between the dependent variable and the independent variables can often be represented by a logistic regression model. Using such a model, the value of the dependent variable can be predicted from the values of the independent variables.

Here we review binary logistic regression models, where the dependent variable takes only one of two values. In Multinomial and Ordinal Logistic Regression we look at models where the dependent variable can take two or more values.
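
As a rough illustration of how a binary logistic regression model turns values of the independent variables into a prediction, here is a minimal Python sketch. The function names, the coefficient values, and the 0.5 cutoff are assumptions made up for this example; they do not come from a fitted model or from the Real Statistics add-in.

```python
import math

def logistic_probability(intercept, coefficients, x):
    """Return the modelled probability that the dependent variable equals 1."""
    # Linear predictor: intercept plus the weighted sum of the independent variables
    z = intercept + sum(b * xi for b, xi in zip(coefficients, x))
    # The logistic function maps the linear predictor into the interval (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def predict_class(intercept, coefficients, x, cutoff=0.5):
    """Classify as 1 when the modelled probability reaches the cutoff (assumed 0.5)."""
    return 1 if logistic_probability(intercept, coefficients, x) >= cutoff else 0

# Hypothetical coefficients, chosen only for illustration
b0 = -1.5
b = [0.8, 0.03]

p = logistic_probability(b0, b, [1.0, 20.0])
predicted = predict_class(b0, b, [1.0, 20.0])
print(round(p, 4), predicted)
```

In practice the intercept and coefficients would be estimated from data (typically by maximum likelihood) rather than typed in by hand.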

We also review a model similar to logistic regression called probit regression.
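
To give a sense of how similar the two models are, the sketch below compares the logistic function with the standard normal cumulative distribution function, which probit regression uses in its place. The function names are hypothetical, and the comparison is on the raw linear predictor, without the rescaling that is usually applied when comparing fitted logit and probit coefficients.

```python
import math

def logit_probability(z):
    """Logistic regression: probability from the logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def probit_probability(z):
    """Probit regression: probability from the standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Both curves are S-shaped and pass through 0.5 at z = 0;
# the normal CDF approaches 0 and 1 faster in the tails.
for z in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(z, round(logit_probability(z), 4), round(probit_probability(z), 4))
```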



164 thoughts on “Logistic Regression”

  1. Hello Dr., Congrats on a wonderful app and thanks for your generosity in sharing it. I am teaching myself stats, so I don’t know much. But I am trying to do a logistic regression. I have a Y variable and 6 X variables. All are binary variables (0, 1). When I click OK on the logistic regression function, it gives me a run-time error message. It aborts and says Type mismatch. Can you help me figure out what I am doing wrong?

    • Hi Paul,
      What do you see when you insert the formula =VER() in any cell?
      If you email me a file with your Excel spreadsheet, I will try to figure out what is causing the error.
      Charles

  2. Hi Charles,
    I hope you are well. Thank you for the wonderful excel plug-in that you have designed. I find it really helpful. I was hoping you could help me with a query.
    I’m trying to perform a logistic regression to identify the need for revision surgery in a group of patients that have had one surgery already. I have 4 independent variables of which 2 variables are numerical. The third is a categorical variable with 5 classes from 1 through 5. The fourth variable is a score derived by adding points assigned for various abnormalities identified on an X-ray.
    My questions are as follows:
    1. Is it OK to treat the 4th variable as numerical?
    2. Can I use the categorical variable by giving it the numbers 1 through 5 in the logistic regression equation, or does it have to be binary?

    • Hello Bhushan,
      1. It sounds like the 4th variable is numerical, but I would have to understand better how you add points for the abnormalities.
      2. It sounds like your output is a categorical (dependent) variable that takes the value 1 if the patient needs revision surgery and 0 if not. You can use independent variables that are categorical that are not binary. You can also use independent variables that take Likert values (1 to 5), treating such ordinal variables simply as numerical variables.
      Charles
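
One common way to include a categorical independent variable with more than two classes is dummy coding, where each non-reference class gets its own 0/1 column. The Python sketch below illustrates the idea for a variable with classes 1 through 5; the function name and the choice of the first class as the reference are assumptions for this example, and it is not necessarily how the Real Statistics tool handles such variables internally.

```python
def dummy_code(values, categories=None):
    """Dummy-code a categorical variable, using the first category as the reference."""
    if categories is None:
        categories = sorted(set(values))
    # The reference category is represented by all zeros, so it gets no column
    non_reference = categories[1:]
    return [[1 if v == c else 0 for c in non_reference] for v in values]

# Hypothetical class labels for six patients
classes = [1, 3, 5, 2, 4, 1]
coded = dummy_code(classes, categories=[1, 2, 3, 4, 5])
for original, row in zip(classes, coded):
    print(original, row)
```

A variable with 5 classes yields 4 dummy columns. Treating the 1 to 5 codes as a single numerical variable instead assumes the classes are ordered and roughly evenly spaced.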

  3. Dr Zaiontz,
    First off, I really do appreciate your development and free download on XREALSTATS. It has been wonderful!
    I have a need to perform a pooled logistic regression (i.e., the independent variable values are time dependent). Will XREALSTATS handle this scenario and how would the variables be loaded into Excel?
    Thanks,
    Russ

    • Hello Russ,
      If by pooled, you mean that you can ignore the assumption of independence of the observations (due to the time), then you can use ordinary logistic regression. If, as I expect, you are looking for panel analysis for logistic regression, then currently Real Statistics does not support this scenario.
      Charles

  4. Hi Dr Zaiontz

    I have a question about sample size calculation for a clinical trial. My hypothesis is: “Colchicine lowers the recurrence rate of pericarditis in patients with SLE.”

    The Y-variable is binary. There are multiple independent X-variables like age, bmi, sex, placebo or colchicine…

    I am planning on using logistic regression to analyze, but I’m having issues finding the right formula for the sample size calculation.

    Do you have a recommendation?
    Thank you

  5. Hello, I have a quick question regarding the logistic regression outputs.

    Does the logistic regression automatically standardize my numeric observations, or would I first have to transform my data using the STANDARDIZE function and then run the logistic regression?

    Thanks,

    Wadi Luca Watfa

  6. Hi, Dr Zaiontz

    After running the model on the training data (the data that has values for the dependent variable), how do I use the results to predict the probability for the unlabeled data?

    Thank you in advance

  7. Hello Charles,

    I really need your help. Can you explain when to use “Logistic and Probit Regression” and when to use “Multinomial Logistic Regression”?

    Specifically, I have a dataset with a single quantitative independent variable and one dependent variable (taking the values 0 and 1) for simple logistic regression.

    I also have a dataset with quantitative, binary, nominal, and ordinal independent variables and one dependent variable (taking the values 0 and 1) for multiple logistic regression.

    Can you please let me know what I should use and how to use it in each case? I can’t find documentation for my problem. My English is not very good; I’m sorry about that.

    • Hello Yen Nguyen,
      In either of these cases, since you have one dependent variable that only takes the values 0 or 1, logistic regression or Probit regression could be used.
      Charles

  8. Dear Dr Zaiontz, I hope that you and your family are OK. Excuse me, how can I see the Hosmer statistic in the Logistic Regression tool? Version 7.6 had this statistic, and this one does not.

    Thanks a lot.

    • Hello Gerardo,
      We are all fine. I hope the same is true for you and your family. Covid has caused a lot of disruption and stress, but fortunately we are all fine.
      I removed the Hosmer statistic from the logistic regression tool for two reasons:
      1. The Hosmer statistic is not really that useful
      2. The version of the statistic that was included in previous releases was only correct in very limited cases.
      Charles

  9. Hello Charles,

    I am unable to add the Real Statistics plug-in; I get an error when I try.
    I am using 365. Can you please help?

    Thanks
    Ankit

  10. I wanted to do a binary logistic regression; however, I can only see an option for logistic and probit regression. Can I use this test?

