# Clay Ford

## Getting Started with Bootstrap Model Validation

Let’s say we fit a logistic regression model for the purposes of predicting the probability of low infant birth weight, which is an infant weighing less than 2.5 kg. Below we fit such a model using the “birthwt” data set that comes with the MASS package in R. (This is an example model and not […]

## Mathematical Annotation in R

In this article we demonstrate how to include mathematical symbols and formulas in plots created with R. This can mean adding a formula in the title of the plot, adding symbols to axis labels, annotating a plot with some math, and so on. R provides a LaTeX-like language for defining mathematical expressions. It is documented […]

## Comparing Mixed-Effect Models in R and SPSS

Occasionally we are asked to help students or faculty implement a mixed-effect model in SPSS. Our training and expertise are primarily in R, so it can be challenging to transfer and apply our knowledge to SPSS. In this article we document for posterity how to fit some basic mixed-effect models in R using the lme4 […]

## Comparing the accuracy of two binary diagnostic tests in a paired study design

There are many medical tests for detecting the presence of a disease or condition. Some examples include tests for lesions, cancer, pregnancy, or COVID-19. While these tests are usually accurate, they’re not perfect. In addition, some tests are designed to detect the same condition but use a different method. A recent example is PCR and […]

## Correlation of Fixed Effects in lme4

If you have ever used the R package lme4 to perform mixed-effect modeling you may have noticed the “Correlation of Fixed Effects” section at the bottom of the summary output. This article intends to shed some light on what this section means and how you might interpret it. To begin, let’s simulate some data. Below […]

## A Beginner’s Guide to Marginal Effects

What are average marginal effects? If we unpack the phrase, it looks like we have effects that are marginal to something, all of which we average. So let’s look at each piece of this phrase and see if we can help you get a better handle on this topic. To begin we simulate some toy […]

## Power and Sample Size Analysis using Simulation

The power of a test is the probability of correctly rejecting a null hypothesis. For example, let’s say we suspect a coin is not fair and lands heads 65% of the time. The null hypothesis is that the coin is not biased to land heads. The alternative hypothesis is that the coin is biased to land heads. […]
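As a minimal sketch of the idea in the excerpt above, the following R code estimates power by simulation for the coin example. The number of flips (`n`), the significance level (`alpha`), the number of simulated experiments (`n_sims`), and the choice of a one-sided exact binomial test are all assumptions made for illustration; they are not taken from the article.

```r
# Estimate power by simulation for the coin example.
# Assumed for illustration: 50 flips, alpha = 0.05, 2000 simulated experiments.
set.seed(1)
n      <- 50     # hypothetical number of flips per experiment
alpha  <- 0.05   # hypothetical significance level
n_sims <- 2000   # number of simulated experiments

p_values <- replicate(n_sims, {
  # Simulate flips from a coin that lands heads 65% of the time
  heads <- rbinom(1, size = n, prob = 0.65)
  # One-sided exact test of the null that the coin is not biased toward heads
  binom.test(heads, n, p = 0.5, alternative = "greater")$p.value
})

# Power estimate: proportion of simulated experiments that reject the null
power_est <- mean(p_values < alpha)
power_est
```

Increasing `n` and rerunning the simulation shows how the estimated power grows with sample size, which is one way to choose a sample size under these assumptions.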

## Post Hoc Power Calculations are Not Useful

It is well documented that post hoc power calculations are not useful (Goodman and Berlin 1994, Hoenig and Heisey 2001, Althouse 2020). Also known as observed power or retrospective power, post hoc power purports to estimate the power of a test given an observed effect size. The idea is to show that a “non-significant” hypothesis […]

## Understanding Ordered Factors in a Linear Model

Consider the following data from the text Design and Analysis of Experiments, 7th ed. (Montgomery, Table 3.1). It has two variables: power and rate. Power is a discrete setting on a tool used to etch circuits into a silicon wafer. There are four levels to choose from. Rate is the distance etched measured in Angstroms […]

## Getting Started with Generalized Estimating Equations

Generalized Estimating Equations, or GEE, is a method for modeling longitudinal or clustered data. It is usually used with non-normal data such as binary or count data. The name refers to a set of equations that are solved to obtain parameter estimates (i.e., model coefficients). If interested, see Agresti (2002) for the computational details. In […]