Raw datasets are rarely pretty, and few -- if any -- are ready to be analyzed right out of the box. Missing data, typos, irregular capitalization, and inconsistent date/time and lat/long formatting can make it impossible to analyze or summarize your data. In this workshop, participants learn how to fix, clean, standardize, and reorganize their raw data until it is ready for analysis.
- Session 1 covers basic principles of data manipulation using subsetting and filtering techniques.
- Session 2 covers text cleaning using the `stringr` package.
- Session 3 covers date/time cleaning using R's `lubridate` package.
- Session 4 covers advanced cleaning techniques, such as conditional statements and `for` loops.
Course Length
- 12 hours in 4 sessions
Prerequisites
- Zero to Hero in R or previous experience with dataframes and the `tidyverse` package in R.