Stata

Understanding Robust Standard Errors

What are robust standard errors? How do we calculate them? Why use them? Why not use them all the time if they’re so robust? Those are the kinds of questions this post intends to address. To begin, let’s start with the relatively easy part: getting robust standard errors for basic linear models in Stata and […]

Stata Basics: foreach and forvalues

There are times we need to do repetitive tasks in the process of data preparation, analysis or presentation: for instance, computing a set of variables in the same manner, renaming or creating a series of variables, or repeatedly recoding the values of a number of variables. In this post, I show a few simple […]

Stata Basics: Reshape Data

In this post, I use a few examples to illustrate the two common data forms, wide form and long form, and how to convert datasets between the two, which we call “reshaping” data here. Reshaping is often needed when you work with datasets that contain variables with some kind of sequence, say, time-series […]

Stata Basics: Combine Data (Append and Merge)

When I first started working with data, which was in a statistics class, we mostly used clean and complete datasets as examples. Later on, I realized that’s not always the case when doing research or data analysis for other purposes; in reality, we often need to put two or more datasets together to be able […]

Stata Basics: Subset Data

Sometimes only parts of a dataset mean something to you. In this post, we show you how to subset a dataset in Stata, by variables or by observations. We use the census.dta dataset installed with Stata as the sample data. Subset by variables * Load the data > sysuse census.dta (1980 Census data by state) […]

Stata Basics: Create, Recode and Label Variables

This post demonstrates how to create new variables, recode existing variables, and label variables and their values. We use variables from the census.dta dataset that comes with Stata as examples. -generate-: create variables Here we use the -generate- command to create a new variable representing population younger than 18 years old. We do so by […]

Stata Basics: Data Import, Use and Export

In Stata, the very first step in analyzing a dataset is opening it, so that Stata knows which file you are going to work with. Yes, you can simply double-click on a Stata data file that ends in .dta to open it, or you can do something fancier to achieve […]

Using and Interpreting Cronbach’s Alpha

I. What is Cronbach’s alpha? Cronbach’s alpha is a measure used to assess the reliability, or internal consistency, of a set of scale or test items. In other words, the reliability of any given measurement refers to the extent to which it is a consistent measure of a concept, and Cronbach’s alpha is one way […]

Getting Started with Quantile Regression

When we think of regression, we usually think of linear regression, the tried and true method for estimating the mean of some variable conditional on the levels or values of independent variables. In other words, we’re pretty sure the mean of our variable of interest differs depending on other variables. For example, the mean weight […]

Stata Tip: Name Your Graphs

An important component of data analysis is graphing. Stata provides excellent graphics facilities for quickly exploring and visualizing your data. For example, let’s load the auto data set that comes with Stata (1978 Automobile Data) and make two scatterplots and then two boxplots: sysuse auto twoway scatter price mpg twoway scatter mpg weight graph box […]