logistic regression

Understanding Deviance Residuals

If you have ever performed binary logistic regression in R using the glm() function, you may have noticed a summary of “Deviance Residuals” at the top of the summary output. In this article we talk about how these residuals are calculated and what we can use them for. We also talk about other types of […]

Logistic Regression Four Ways with Python

What is Logistic Regression? Logistic regression is a predictive analysis that estimates/models the probability of an event occurring based on a given dataset. This dataset contains both independent variables, or predictors, and their corresponding dependent variable, or response. To model the probability of a particular response variable, logistic regression assumes that the log-odds for the […]

Getting Started with Bootstrap Model Validation

Let’s say we fit a logistic regression model for the purposes of predicting the probability of low infant birth weight, which is an infant weighing less than 2.5 kg. Below we fit such a model using the “birthwt” data set that comes with the MASS package in R. (This is an example model and not […]

ROC Curves and AUC for Models Used for Binary Classification

This article assumes basic familiarity with the use and interpretation of logistic regression, odds and probabilities, and true/false positives/negatives. The examples are coded in R. ROC curves and AUC have important limitations, and I encourage reading through the section at the end of the article to get a sense of when and why the tools […]

A Beginner’s Guide to Marginal Effects

What are average marginal effects? If we unpack the phrase, it looks like we have effects that are marginal to something, all of which we average. So let’s look at each piece of this phrase and see if we can help you get a better handle on this topic. To begin we simulate some toy […]

Getting Started with Binomial Generalized Linear Mixed Models

Binomial Generalized Linear Mixed Models, or binomial GLMMs, are useful for modeling binary outcomes for repeated or clustered measures. For example, let’s say we design a study that tracks what college students eat over the course of 2 weeks, and we’re interested in whether or not they eat vegetables each day. For each student we’ll […]

Simulating a Logistic Regression Model

Logistic regression is a method for modeling binary data as a function of other variables. For example we might want to model the occurrence or non-occurrence of a disease given predictors such as age, race, weight, etc. The result is a model that returns a predicted probability of occurrence (or non-occurrence, depending on how we […]

Visualizing the Effects of Logistic Regression

Logistic regression is a popular and effective way of modeling a binary response. For example, we might wonder what influences a person to volunteer, or not volunteer, for psychological research. Some do, some don’t. Are there independent variables that would help explain or distinguish between those who volunteer and those who don’t? Logistic regression gives […]