Clay Ford

Getting Started with Analysis of Covariance

The Analysis of Covariance, or ANCOVA, is a regression model that includes both categorical and numeric predictors, often just one of each. It is commonly used to analyze a follow-up numeric response after exposure to various treatments, controlling for a baseline measure of that same response. For example, given two subjects with the same baseline […]

Getting Started with Simple Slopes Analysis

A Simple Slopes Analysis is a follow-up procedure to regression modeling that helps us investigate and interpret “significant” interactions. The analysis is often employed for interactions between two numeric predictors, but it can be applied to other types of interactions as well. To motivate why we might be interested in this type of analysis, consider […]

Simulating Multinomial Logistic Regression Data

In this article we demonstrate how to simulate data suitable for a multinomial logistic regression model using R. One reason to do this is to gain a better understanding of how multinomial logistic regression models work. Another is to simulate data for the purposes of estimating power and sample size for a planned experiment that […]

Understanding Precision-Based Sample Size Calculations

When designing an experiment it’s good practice to estimate the number of subjects or observations we’ll need. If we recruit or collect too few, our analysis may be too uncertain or misleading. If we collect too many, we potentially waste time and expense on diminishing returns. The optimal sample size provides enough information to allow […]

Understanding Semivariograms

I’ve heard something frightening from practicing statisticians who frequently use mixed effects models. Sometimes when I ask them whether they produced a [semi]variogram to check the correlation structure they reply “what’s that?” –Frank Harrell

When it comes to statistical modeling, semivariograms help us visualize and assess correlation in residuals. We can use them for two […]

Getting Started with Gamma Regression

In this article we plan to get you up and running with gamma regression. But before we dive into that, let’s review the familiar Normal distribution. This will provide some scaffolding to help us transition to the gamma distribution. As you probably know, a Normal distribution is described by its mean and standard deviation. These […]

Understanding Deviance Residuals

If you have ever performed binary logistic regression in R using the glm() function, you may have noticed a summary of “Deviance Residuals” at the top of the summary output. In this article we talk about how these residuals are calculated and what we can use them for. We also talk about other types of […]

Getting Started with Bootstrap Model Validation

Let’s say we fit a logistic regression model for the purposes of predicting the probability of low infant birth weight, defined as a birth weight below 2.5 kg. Below we fit such a model using the “birthwt” data set that comes with the MASS package in R. (This is an example model and not […]

Mathematical Annotation in R

In this article we demonstrate how to include mathematical symbols and formulas in plots created with R. This can mean adding a formula in the title of the plot, adding symbols to axis labels, annotating a plot with some math, and so on. R provides a LaTeX-like language for defining mathematical expressions. It is documented […]