Workshops

Spring 2022 Schedule (Part 1)

Our Data Essentials Workshop Series takes place January 13-14.

We offer workshops from our own team, as well as from our colleagues in Research Computing. Workshops are listed below, grouped by series. Click on the date link to register; registration is free. Workshops (except where noted) are being offered synchronously online.

You may find it more convenient to view a list of our workshops in date order.

The Coffee & Coding Python and R sessions are offered once a week on Thursdays from noon to 12:30pm. They are open to all, and no registration is needed; just use the Zoom link to join any of the 13 sessions.

Check back in early February for additional workshops, and also be sure to look at the data workshops being offered by our colleagues in the Health Sciences Library.

High Performance Computing

Workshop Topic (Instructor) Day (Click date to register) Time Location
Parallel/GPU Computing with Matlab (Ed Hall) Thu, 2/03 2:00 – 4:00pm Online
Learn how to submit Matlab parallel jobs that use multiple cores within one compute node as well as multiple cores across multiple compute nodes. Parallelism using GPUs will also be covered. Familiarity with Matlab and a Rivanna account are needed.

 


Converting Jupyter Notebooks to run as a Batch Job (Jacalyn Huband) Thu, 2/17 2:00 – 4:00pm Online
Using Jupyter Notebooks interactively is great until your code takes more than an hour to run. In this workshop, we will look at how to convert a notebook to code and submit the code as a batch job (i.e., one where you don’t have to remain logged in for the job to run). Experience with JupyterLab is assumed.
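As a rough illustration of the "notebook to code" step described above, here is a minimal sketch rather than the workshop's own method. It assumes the notebook is stored in the standard nbformat 4 JSON layout, and the tiny example notebook is hypothetical. In practice, the `jupyter nbconvert --to script` command performs this conversion; the resulting script could then be run from a batch job.

```python
import json


def notebook_to_script(notebook_json: str) -> str:
    """Concatenate the code cells of a notebook (nbformat 4 JSON) into one script."""
    nb = json.loads(notebook_json)
    chunks = []
    for cell in nb.get("cells", []):
        # Skip markdown and raw cells; only code cells belong in the script.
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The "source" field may be a single string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        chunks.append(source.rstrip())
    return "\n\n".join(chunks) + "\n"


# Hypothetical two-cell notebook used only for illustration.
example = json.dumps({
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["x = 1\n", "y = x + 1\n"]},
        {"cell_type": "code", "source": "print(y)"},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
})

script = notebook_to_script(example)
print(script)
```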

 


Optimization Techniques with Matlab (Ed Hall) Thu, 2/24 2:00 – 4:00pm Online
Learn how to code and run optimization problems in Matlab. Prerequisites: Familiarity with Matlab and a Rivanna account.

 


Using Spark on Rivanna (Ruoshi Sun) Mon, 3/14 2:00 – 4:00pm Online
A step-by-step guide on how to run Spark interactively (Jupyter integration, Spark UI) and non-interactively (single-node and multi-node Slurm job) on Rivanna. This is not an introduction to Spark – basic knowledge is assumed.

 


Information and Publishing

Workshop Topic (Instructor) Day (Click date to register) Time Location
Intro to the Command Line (Ricky Patterson) Fri, 1/14 3:00 – 4:00pm Online
This session of the Data Essentials Reboot series provides an introduction to the command-line interface. We’ll learn how to use commands to perform basic operations in the terminal (creating or navigating directories, listing and displaying files, moving or copying files) as well as searching files, managing file permissions, and creating symbolic links. Working in the command line, we can combine existing programs, automate repetitive tasks, and connect to remote resources.

Workshop Slides


Intro to Zotero (Maggie Nunley) Tue, 1/25 4:00 – 5:00pm Online
Intro to Zotero (Maggie Nunley) Wed, 2/09 7:00 – 8:00pm Online
Zotero is a free and open-source reference management tool that allows researchers to maintain a database of the books, articles, and other media used in a project. This workshop will cover all the basics of downloading, setting up, and using Zotero. You don’t need to prep anything before the workshop but you will need a computer in order to participate. We’ll start by setting up accounts and downloading all the required apps and plugins. If you’ve looked at using Zotero before but found it a bit overwhelming, this workshop will help break down each step in the set-up.

 


Getting Oriented with the new SciFinder-n (Jenny Coffman) Tue, 2/01 4:00 – 5:00pm Online
SciFinder is a chemical literature database, provided by the UVA Library to the UVA community for free, which enables you to search by structure, reaction, research topic, author name, and more. As of December 31, 2021, the classic version of SciFinder has been retired and the transition to the new SciFinder-n is complete. In this workshop we will walk through the new interface and features, review how to transition personal settings and alerts from the classic version, and learn some tips and tricks for searching in SciFinder-n. Your SciFinder account should be set up prior to the workshop. No experience is required, though some familiarity with SciFinder will be useful.

 


Intro to Endnote (Jenny Coffman) Thu, 2/10 1:00 – 2:00pm Online
Endnote is a commercial reference manager that allows you to save, cite, store, share, and organize your references through a searchable personal library. This workshop will walk you through the basics of using Endnote and offer tips and tricks for integrating Endnote into your research workflow. At the end of this workshop, you will know how to cite while you write, organize your citation library, set up and manage library sharing across your teams, save and manage PDFs, and more. No experience with Endnote is required; however, having Endnote installed is highly recommended. Currently Endnote costs $116 for students and $250 for a full license.

 


Python

Workshop Topic (Instructor) Day (Click date to register) Time Location
Intro to Python (Erich Purpur) Wed, 1/26 11:00am – 12:30pm Online
This workshop serves as an introduction to the Python programming language and is intended for beginners to programming in general as well as those who may be coming to Python from another language. We will cover basic programming concepts like variables and data types, basic programming logic such as loops and if/else statements, and also how to download and set up your coding environment (IDE). There will be hands-on coding exercises and follow-up materials to help you on your journey.

Workshop Materials: https://github.com/epurpur/python-intro


PrettyPrinter with complex data structures (Will Rosenow) Thu, 1/27 Noon – 12:30pm Online
Join us on Zoom for Coffee&Coding every Thursday afternoon between noon and 12:30 for amazing coding tips that will streamline your research!
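The session's content is not described here, so the following is only a minimal sketch of the topic named in the title, using the standard library's `pprint` module. The `record` dictionary is a hypothetical example.

```python
from pprint import pformat

# Hypothetical nested record; real research data would be larger.
record = {
    "sample": "A1",
    "readings": [1.5, 2.5, 3.5],
    "meta": {"site": "lab", "ok": True},
}

# pformat returns the pretty-printed text instead of printing it directly.
# A narrow width forces the structure onto several lines; keys are sorted by default.
text = pformat(record, width=40)
print(text)
```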


Data Analysis and Visualization in Python with Pandas and Matplotlib (Erich Purpur) Wed, 2/02 11:00am – 12:30pm Online
In this workshop we will explore the Python packages pandas and Matplotlib, two closely linked and widely used data analysis and visualization packages. There will be minimal lecture about the background of these packages before we dive in and write real code with them. We will start from a raw data set, manipulate it with pandas, and then visualize it with Matplotlib. We will also learn about the Jupyter notebook software environment.

Workshop Materials: https://github.com/epurpur/PythonDataViz


f strings in Python (Will Rosenow) Thu, 2/03 Noon – 12:30pm Online
Join us on Zoom for Coffee&Coding every Thursday afternoon between noon and 12:30 for amazing coding tips that will streamline your research!
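The session's content is not described here; as a minimal sketch of the topic named in the title, an f-string embeds expressions and format specifications directly in a string literal. The variable names and values are hypothetical.

```python
# Hypothetical values for illustration.
name = "trial_3"
mean = 12.34567
n = 1250

# ".2f" rounds to two decimal places; "," adds a thousands separator.
summary = f"{name}: mean={mean:.2f}, n={n:,}"
print(summary)  # trial_3: mean=12.35, n=1,250
```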


Scientific Image Processing with Python (Karsten Siller) Wed 2/09 2:00 – 4:00pm Brown 133
In this advanced workshop participants are introduced to scientific image processing with the Python OpenCV package. Topics include splitting and merging of color channels, morphological filters, image thresholding and segmentation. Participants should have some experience in programming with Python. Note: this workshop is scheduled to be in person, in Brown 133.

 


Plotting with Matplotlib (Will Rosenow) Thu, 2/10 Noon – 12:30pm Online
Join us on Zoom for Coffee&Coding every Thursday afternoon between noon and 12:30 for amazing coding tips that will streamline your research!


Python and APIs (Erich Purpur) Wed, 2/16 11:00am – 12:30pm Online
An API is a connection between computers or computer programs, and APIs are used throughout software and the internet today. We will start with some basic concepts and an overview of what APIs are. We will look at how they are used in popular applications that we all use every day. Then we will use Python and the accompanying ‘requests’ library to make some API requests and get data from various places on the internet in order to use it for our own purposes. Some prior experience with a programming language (Python in particular) is recommended but not required.

Workshop Materials: https://github.com/epurpur/pythonAPI


Building lists with .append() (Will Rosenow) Thu, 2/17 Noon – 12:30pm Online
Join us on Zoom for Coffee&Coding every Thursday afternoon between noon and 12:30 for amazing coding tips that will streamline your research!
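The session's content is not described here; as a minimal sketch of the topic named in the title, a list can be built up one element at a time with `.append()`. The even-squares example is hypothetical.

```python
# Start with an empty list and add one item per loop iteration.
squares = []
for value in range(5):
    # Keep only the even values, storing their squares.
    if value % 2 == 0:
        squares.append(value ** 2)

print(squares)  # [0, 4, 16]
```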


Text Parsing/Regular Expressions (Will Rosenow) Thu, 2/24 Noon – 12:30pm Online
Join us on Zoom for Coffee&Coding every Thursday afternoon between noon and 12:30 for amazing coding tips that will streamline your research!
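The session's content is not described here; as a minimal sketch of the topic named in the title, the standard library's `re` module can pull structured pieces out of free text. The input line and both patterns are hypothetical examples.

```python
import re

# Hypothetical line of free-text metadata.
line = "Sample A12 collected 2022-02-24, sample B7 collected 2022-03-03"

# Dates written as four digits, two digits, two digits, separated by hyphens.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", line)

# Sample IDs: one capital letter followed by digits, as a whole word.
ids = re.findall(r"\b[A-Z]\d+\b", line)

print(dates)  # ['2022-02-24', '2022-03-03']
print(ids)    # ['A12', 'B7']
```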


Error Handling with try and except (Will Rosenow) Thu, 3/03 Noon – 12:30pm Online
Join us on Zoom for Coffee&Coding every Thursday afternoon between noon and 12:30 for amazing coding tips that will streamline your research!
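The session's content is not described here; as a minimal sketch of the topic named in the title, `try` and `except` let a script recover from a bad value instead of stopping. The `parse_reading` helper and the input values are hypothetical.

```python
def parse_reading(text):
    """Return the reading as a float, or None if it cannot be parsed."""
    try:
        return float(text)
    except ValueError:
        # float() raises ValueError for strings such as "n/a".
        return None


raw = ["3.2", "n/a", "7"]
values = [parse_reading(item) for item in raw]
print(values)  # [3.2, None, 7.0]
```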


Qualitative Research

Workshop Topic (Instructor) Day (Click date to register) Time Location
Introduction to Qualitative Analysis Using Dedoose (Christine Slaughter) Tue 2/08 11:00 – 12:30pm Online
Dedoose is a Qualitative Data Analysis (QDA) application that facilitates the organization, coding, and interpretation of media, whether textual, video, audio, or image. In this workshop we will cover the basics of using Dedoose to code and analyze such documents and examine how Dedoose’s functionality allows you to surface relationships and insights that may be inchoate in your coding. Along the way we will discuss some of the fundamentals of analyzing qualitative data. We will also briefly discuss and explore Dedoose’s potential for mixed methods analysis, i.e. using Dedoose to integrate qualitative and quantitative data into one project. No prerequisites, but you may find it helpful to download the Dedoose app at www.dedoose.com in order to follow along.

 


R

Workshop Topic (Instructor) Day (Click date to register) Time Location
Introduction to R/RStudio (Jenn Huck) Thu, 1/13 9:00 – 10:30am Online
Join us in this Data Essentials Reboot session for a gentle introduction to R and RStudio. R is a free, open-source software environment and programming language designed specifically for statistical analysis; RStudio is a free, open source integrated development environment (IDE) for R that provides a friendly interface for viewing graphs, data tables, R code, and output all at the same time. In this hands-on session we’ll get started navigating R with RStudio, loading libraries, and importing data. We’ll do some basic data manipulation, exploration, and analysis, and begin creating plots and graphics. And we’ll cover some key practices and shortcuts for using R effectively and helpful resources for learning more.


Data Visualization in R (Clay Ford) Thu, 1/13 10:45am – 12:15pm Online
A Data Essentials Reboot workshop. Exploring our data with graphs allows us to visualize relationships, spot unusual observations, or find unexpected patterns. In this workshop we introduce how to effectively use the ggplot2 package to explore and visualize data in R. With its consistent syntax and layered approach to making graphics, ggplot2 has revolutionized data visualization. What previously would have required tedious programming can now be accomplished in a few lines of ggplot2 code. This session will introduce the logic behind ggplot2, how to use ggplot2 to explore your data, and how to customize and polish ggplot2 graphs.


Data Wrangling in R with dplyr (Jenn Huck) Fri, 1/14 9:00 – 10:30am Online
A Data Essentials Reboot workshop. Data almost always requires processing and manipulation before analysis. This session will explain and illustrate some of the most common data manipulation tasks in R using the dplyr package. We will learn how to select specific columns or rows, create new columns or remove unnecessary ones, combine multiple commands into a single command, and reshape data to enable data analysis. We will work through several examples together with opportunities to practice using these tools on your own. We will utilize R Markdown throughout the session.


Getting Started with R (Jenn Huck) Thu, 1/27 10:00 – 11:30am Online
Designed for the absolute beginner, this workshop provides a gentle introduction to R and RStudio. R is a free, open-source software environment and programming language designed specifically for statistical analysis. RStudio is a free integrated development environment (IDE) that makes using and learning R much easier. In this workshop we’ll get you started using R with RStudio, show you how to import data, do some basic data manipulation, create a few graphics, perform some basic statistical analyses, and point you in the direction to learn more and go further with R!

Getting Started with R (Jenn Huck) instructional materials


Linear Modeling with R (Clay Ford) Fri, 1/14 10:45am – 12:15pm Online
Linear Modeling with R (Clay Ford) Thu, 2/03 10:00 – 11:30am Online
This workshop will cover how to carry out multiple regression, model selection, and diagnostics using the R statistical computing environment. Special emphasis will be placed on visualizing linear models to help communicate results. This workshop is ideal for those familiar with linear modeling in other programs (such as Stata or SPSS) but who want to learn how to do it in R. This can also serve as a refresher for forgotten statistics!

 


Mixed-Effect/Multilevel Modeling with R (Clay Ford) Thu, 2/10 10:00 – 11:30am Online
Mixed-effect models, multilevel models, hierarchical linear models – all refer to a class of statistical models used to analyze correlated data. Such data include repeated measurements, longitudinal measurements and clustered observations. In this workshop we introduce the basics of mixed-effect modeling with an emphasis on implementation and interpretation. Examples will be given in R using the lme4 package. Previous experience with R and linear regression will be helpful but not required.

 


Bayesian Data Analysis, Part I (Clay Ford) Thu, 2/17 10:00 – 11:30am Online
In this first of a two-part series, we learn the basics of running and interpreting a Bayesian data analysis in R. The workshop will center on comparing traditional (Frequentist) statistical analyses to “equivalent” Bayesian approaches. We write equivalent in quotes because, as we’ll see, they’re not really equivalent. For example, a research question that would traditionally call for a t-test requires a different approach using Bayesian statistics. The workshop features very little math and emphasizes application. It is intended for all audiences! However, some basic experience using R and some memory of an introductory stats course will be helpful.

 


Plot annotations with ggplot2 (Marieke Jones) Thu, 3/17 Noon – 12:30pm Online
Join us on Zoom for Coffee&Coding every Thursday afternoon between noon and 12:30 for amazing coding tips that will streamline your research!


Mathematical Annotation in R (Clay Ford) Thu, 3/24 Noon – 12:30pm Online
Join us on Zoom for Coffee&Coding every Thursday afternoon between noon and 12:30 for amazing coding tips that will streamline your research!


Working with Dates in R (Clay Ford) Thu, 3/31 Noon – 12:30pm Online
Join us on Zoom for Coffee&Coding every Thursday afternoon between noon and 12:30 for amazing coding tips that will streamline your research!


Data Cleaning with tidyverse (Will Rosenow) Thu, 4/07 Noon – 12:30pm Online
Join us on Zoom for Coffee&Coding every Thursday afternoon between noon and 12:30 for amazing coding tips that will streamline your research!


across() function (Marieke Jones) Thu, 4/14 Noon – 12:30pm Online
Join us on Zoom for Coffee&Coding every Thursday afternoon between noon and 12:30 for amazing coding tips that will streamline your research!


purrr map() function (Marieke Jones) Thu, 4/21 Noon – 12:30pm Online
Join us on Zoom for Coffee&Coding every Thursday afternoon between noon and 12:30 for amazing coding tips that will streamline your research!


Rolling joins (Clay Ford) Thu, 4/28 Noon – 12:30pm Online
Join us on Zoom for Coffee&Coding every Thursday afternoon between noon and 12:30 for amazing coding tips that will streamline your research!


Reproducibility

Workshop Topic (Instructor) Day (Click date to register) Time Location
Version Control with GitHub (Erich Purpur) Thu, 1/13 3:00 – 4:30pm Online
This Data Essentials Reboot workshop introduces version control using git through GitHub, free and open source platforms for building projects collaboratively. We’ll learn the basics of git repositories and commits including how to fork a repository on GitHub to your account, clone the repository to your local machine, and point it to the source repository. And we’ll practice GitHub workflows like fetching and merging changes from the source, making changes and commits locally, pushing to GitHub, and making pull requests. The use of GitHub requires a user account so please set yours up before the workshop at github.com. Git and GitHub are available to everyone. No prior knowledge is needed for this workshop, but you will need a computer to participate.


Reproducible Analysis and Documentation with RStudio and R Markdown (Jenn Huck) Tue, 2/01 10:00 – 11:30am Online
Make your research life easier with the tools of reproducible research! In “Reproducible Analysis and Documentation with RStudio and R Markdown,” we will focus primarily on two workflow tools. First, we briefly review RStudio Projects to create a project-oriented workflow in your R scripts. The second, more in-depth part of the workshop covers using R Markdown for literate programming. We will review R Markdown documents, which you can use to write complete papers. We will take a look at the new features in Visual R Markdown, such as citations and technical writing.

This is the first of our four-session series “Reproducible Research Practices: Make Your Research Life Easier.” Other sessions include “Organize Your Files and Metadata for Transparent and Reproducible Research,” “Version Control with GitHub,” and “Sharing Your Data for Transparent and Reproducible Research.”

Workshop materials: https://github.com/jennhuck/reproAnalysis


Organize for Transparent and Reproducible Research (Jenn Huck) Tue, 2/15 10:00 – 11:00am Online
In this workshop, participants will learn fundamental approaches to creating a research compendium. This is the foundation of transparent and reproducible research. Participants will learn what kinds of documents and materials they should create and preserve; the information the documents should contain; and how they should be formatted and organized. Topics include raw data, analysis data, scripts, metadata, readme files, project organization, and naming conventions. Examples will be provided in R, but this information can be applied to any quantitative programming environment. There are no prerequisites.

This is part of our four-session series “Reproducible Research Practices: Make Your Research Life Easier.” Other sessions include “Reproducible Analysis and Documentation with R and RStudio,” “Sharing Your Data for Transparent and Reproducible Research,” and “Version Control with Git and GitHub.”

Slides (with speaker notes – look for the 3 dot icon in lower left) are available online


Tableau

Workshop Topic (Instructor) Day (Click date to register) Time Location
Data Visualization in Tableau: Getting Started (Nancy Kechner) Thu 1/13 1:30 – 1:45pm Online
Introduction to Data Visualization: Using Tableau (Nancy Kechner) Tue 1/25 10:00 – 11:30am Online
Tableau is an extremely powerful tool for visualizing massive sets of data very easily. It has an easy-to-use drag-and-drop interface. You can build beautiful visualizations easily and in a short amount of time. This is a hands-on workshop designed to have you up and running in Tableau so that you can create your own visualizations.

 


Data Visualization in Tableau: Intermediate Techniques (Nancy Kechner) Fri, 1/14 1:30 – 2:45pm Online
Exploring Data Visualizations Using Tableau (Nancy Kechner) Mon, 1/31 2:00 – 3:30pm Online
This workshop will continue using Tableau to create data visualizations and will go into data cleaning, building dashboards, and creating calculations.

 



Colleagues across the university offer workshops in data, programming, and more!
All Library Workshops || HSL Data Workshops || Scholars’ Lab Workshops