What is Bootstrapping? Bootstrapping is a statistical procedure that resamples a sample (with replacement) to infer properties of a wider population. More often than not, we want to understand the properties of a population, but we only have access to a small sample of that population. Sometimes, we are unable to gather more […]
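As a rough illustration of the resampling idea in the excerpt above, here is a minimal sketch using only the Python standard library. The sample values, the function name `bootstrap_means`, the number of resamples, and the percentile interval are all assumptions made for this example and are not taken from the original post.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Return the mean of each bootstrap resample of `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Draw n values from the sample *with replacement*.
        resample = rng.choices(sample, k=n)
        means.append(statistics.mean(resample))
    return means


# Hypothetical small sample standing in for the data we actually have.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9]

means = sorted(bootstrap_means(sample))

# Simple percentile interval: the middle 95% of the bootstrap means.
lower = means[int(0.025 * len(means))]
upper = means[int(0.975 * len(means)) - 1]

print(f"sample mean: {statistics.mean(sample):.3f}")
print(f"approximate 95% interval for the mean: ({lower:.3f}, {upper:.3f})")
```

The spread of the resampled means gives a sense of how much the sample mean might vary, without collecting any new data.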
python
Logistic Regression Four Ways with Python
What is Logistic Regression? Logistic regression is a predictive analysis that estimates/models the probability of an event occurring based on a given dataset. This dataset contains both independent variables, or predictors, and their corresponding dependent variable, or response. To model the probability of a particular response variable, logistic regression assumes that the log-odds for the […]
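To make the log-odds idea concrete, here is a small sketch of the standard logistic model, in which the log-odds are a linear combination of the predictors. The coefficient values and the helper names `log_odds` and `probability` are hypothetical and chosen only for illustration; they do not come from the original post.

```python
import math


def log_odds(x, intercept, coefs):
    """Linear predictor: intercept + sum of coefficient * predictor."""
    return intercept + sum(b * xi for b, xi in zip(coefs, x))


def probability(z):
    """Convert log-odds z to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical fitted coefficients for two predictors.
intercept = -1.5
coefs = [0.8, 0.3]

x = [2.0, 1.0]  # one observation's predictor values
z = log_odds(x, intercept, coefs)
p = probability(z)

print(f"log-odds: {z:.3f}, probability: {p:.3f}")
```

Taking `math.log(p / (1 - p))` recovers the log-odds `z`, which is the relationship the model is built on.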
List Comprehensions in Python
List comprehensions are a topic a lot of new Python users struggle with. This article seeks to explain, in a digestible manner, the benefits of list comprehensions and how they work. Single for loop list comprehension The following code uses a traditional for loop to change each string in a list from upper […]
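A sketch of the comparison the excerpt describes might look like the following. The list contents and variable names are made up for this example, and it assumes the conversion is from upper case to lower case.

```python
# Hypothetical list of upper-case strings.
words = ["APPLE", "BANANA", "CHERRY"]

# Traditional for loop: build the new list one item at a time.
lowered_loop = []
for word in words:
    lowered_loop.append(word.lower())

# Equivalent single-for-loop list comprehension.
lowered_comp = [word.lower() for word in words]

print(lowered_loop)
print(lowered_comp)
```

Both versions produce the same list; the comprehension just expresses the loop and the append in a single expression.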
Getting Started with the Kruskal-Wallis Test
What is it? One of the most well-known statistical tests to analyze the differences between means of given groups is the ANOVA (analysis of variance) test. While ANOVA is a great tool, it assumes that the data in question follows a normal distribution. What if your data doesn’t follow a normal distribution or if your […]
Getting Started with Web Scraping in Python
“Web scraping” or “data scraping” is simply the process of extracting data from a website. This can, of course, be done manually: you could go to a website, find the relevant data or information, and enter that information into some data file that you have stored locally. But imagine that you want to pull a […]
Getting Started with pandas in Python
The pandas package is an open-source software library written for data analysis in Python. Pandas allows users to import data from various file formats (comma-separated values, JSON, SQL, FITS, etc.) and perform data manipulation operations, including cleaning and reshaping the data, summarizing observations, grouping data, and merging multiple datasets. In this article, we’ll briefly explore […]
A Guide to Python in QGIS
This post is something I’ve been thinking about writing for a while. I was inspired to write it by my own trials and tribulations, which are still ongoing, while working with the QGIS API, trying to programmatically do stuff in QGIS instead of relying on available widgets and plugins. I have spent, and will probably […]
How to Create and Export Print Layouts in Python for QGIS 3
I’ve been struggling off and on for literally months trying to create and export a print layout using Python for QGIS 3, or PyQGIS 3 for short. I have finally figured out many of the ins and outs of the process and hopefully this will serve as a guide to save someone else a lot […]
How to Apply a Graduated Color Symbology to a Layer Using Python for QGIS 3
I was recently working on a project in QGIS 3 with a member of UVA Health’s Oncology department. This person wanted to take a set of patient data (after identifying info had been removed) and after doing some other stuff, apply a graduated color scheme to the results, shading them from light to dark based […]
How to Use the Field Calculator in Python for QGIS 3
Recently, I have taken the dive into Python scripting in QGIS. QGIS is a really nice open-source (and free!) alternative to ESRI’s ArcGIS. While QGIS is a little quirky and generally not quite as user-friendly as ArcGIS, it still provides nearly the same functionality. Personally, I’ve become a fan of it and now […]