Practical. Insightful. Empowering.

Data Science Training for Modern Research Organizations

We help institutions and organizations equip their teams with tools to understand, process, and communicate data-driven research.

We offer custom-designed, fun, and interactive training in data science and modern research tools, delivered on-site or as live virtual sessions, that inspires and empowers participants.

Custom-designed seminars for your organization

Build your own workshop

Choose from our learning modules below, which can be mixed, matched, and customized to suit your group's learning needs.


A boot camp for complete beginners, designed to fast-track your first steps in data science.


Zero to Hero in R

99% of people who quit trying to learn data science give up in the first few hours; the rest give up in the first few weeks. This module, designed for participants who have never used R or any other programming language, will jump-start their data science careers by guiding them through a series of firsts: their first line of code, first script, first data table, first plot, and first project. Participants leave with the skills and the orientation they need not just to survive in data science, but to thrive.


Everything you need to make your data viz dreams a reality through code.


Data Visualization in Practice

This workshop puts into practice the theory covered in the Principles of Data Visualization workshop. Participants will learn how to write code that produces beautiful, compelling, and impactful visualizations of their data.

This workshop teaches both beginner and advanced ggplot2 techniques.


A code-free deep-dive on turning your raw data into a compelling story: the do's, the don'ts, and the art of visualizing data.


Principles of Data Visualization

Raw data is hard to interpret, and relationships within raw data are difficult to spot directly. These challenges become more acute as the amount of data grows. Substandard, boring graphs do little to communicate knowledge. This code-free module will teach participants powerful design principles for clearly summarizing and communicating the messages within their data.


An exploration (half theory, half coding) of how to collect and organize your data in a way that makes it accessible and usable.


Data Management

Data that is stored haphazardly, organized poorly, and/or formatted unpredictably is hard to analyze, undermines its own value, and wastes people’s limited time. This module teaches modern practices for cleaning, storing, formatting, and accessing data so that its full potential can be realized.

One session is devoted to non-coding principles of keeping data organized, clean, and accessible. The second session is devoted to strategies for bringing your data into R and basic data cleaning.


Strategies for turning messy, inconsistent data into tables that are clean, formatted, and ready for analysis.


Data Cleaning

Raw datasets are rarely pretty, and few, if any, are ready to be analyzed right out of the box. Missing data, typos, irregular capitalization, and inconsistent date/time and lat/long formatting can make it impossible to analyze or summarize your data. In this workshop, participants learn how to fix, clean, standardize, and reorganize their raw data until it is ready for analysis.


Build and share interactive online dashboards that allow users to explore the data on their own.


Web Tools & Dashboards

Traditional data-driven reports are static, difficult to update as additional data is acquired, and challenging to distribute. Though we live in the 21st century, reports and papers often still rest on 20th-century technology. This module will teach participants how to build and deploy interactive dashboards that can be interrogated directly, updated automatically, and shared easily.

In session 1, participants learn how to build a basic data exploration dashboard using R's shiny package. In session 2, participants learn advanced techniques for making their dashboards as flexible and impressive as possible. In session 3, participants learn how to deploy their dashboards to the web.


Learn and put into practice simple, open-source tools for making beautiful maps.


Making Maps

Spatial data can be surprisingly difficult to visualize well. Ready to move away from screenshots of Google Earth? In this workshop, participants learn simple, open-source tools for making beautiful maps in R. Session 1 is devoted to basic principles and simple mapping functions in R, followed by producing elegant, interactive maps using R's leaflet package. Session 2 is devoted to building publication-ready maps using the sf and tmap packages in R.


Learn the techniques and tools to analyze spatial data.


GIS Applications

Geographic variability adds significant complexity to data and requires specialized tools and knowledge. Much commercial GIS software is closed-source, licensed, and expensive. This module will teach participants how to work with, analyze, and visualize geographic data using free, open-source tools.


Catch up on the basics: conducting and interpreting simple statistical tests.


Statistics Refresher

Statistics are everywhere, but they can be confusing to interpret. It doesn’t help that statistical methods are often used incorrectly or misinterpreted by the folks relying on them. This back-to-basics workshop guides participants through the fundamental principles of statistics, such as thinking about distributions and interpreting p-values, and how to decide which statistical test to use given your data and your research question. Participants will practice using the most common and widely applicable techniques, including the t-test, ANOVA, linear regression, and chi-square analysis.


Leverage your data to make informed predictions and weigh strategic decisions.


Statistical Modeling

Justifiable, informed decisions have to be made while facing uncertainty, risk, and changing conditions. This module will teach participants rigorous methods for (a) predicting future outcomes based on previous observations and (b) analyzing how those outcomes depend on interacting variables.

In Session 1, participants will learn to conduct simple linear regressions, build models with multiple predictors, and interpret the results. In Session 2, participants will learn more advanced modeling techniques (glm and gam approaches) and methods for model evaluation (e.g., goodness-of-fit tests) and model comparison. In Session 3, participants will learn how to make predictions based on their models and evaluate the quality and uncertainty of their predictions.


A code-free deep-dive on how to write effective, publication-ready reports and give compelling, persuasive presentations.


Writing & Presenting Research

Datasets are useless without effective communication, but many professionals have never been taught how to write an impactful data-driven report or deliver an engaging and persuasive presentation. This module will teach participants how to write about research – organizing a report, conducting a literature review, designing compelling arguments, crafting sentences that are crystal-clear, etc. – and how to present it memorably to professional and public audiences. Session 1 is devoted to writing, and Session 2 is devoted to speaking.


Generate perfectly formatted bibliographies for research reports using automated open-source tools.


Automated Bibliographies

Bibliographies are the backbone of reproducible research and scientific progress, but they are tedious to write, prone to errors, and a pain to update. And changing between citation styles is an absolute nightmare! But it no longer has to be this way. In this workshop, participants will learn how to use open-source reference management tools to revolutionize the way they manage their libraries and write their articles.

#unboring Expert Team

Meet Our Instructors

Our instructors are recognized experts in their fields who make learning data science fun, interesting, and understandable to people of all skill levels.


Participant Testimonials

“This crash course is ideal for the busy academic or business person who has been longing to learn coding but struggles to set aside time to do this on their own. The intensive nature of the course means one can walk in with literally zero knowledge of R and walk out transforming datasets and making awesome maps. The Datatrain team is extremely generous with their time and the trainer-student ratio allows for keeping the class moving fast, even if someone falls temporarily behind. I should have done this years ago!”


Carlos C., ISGlobal (Barcelona Institute for Global Health)

“Datatrain's Zero to Hero in R course allowed me to improve my working knowledge on basic R programming. I regularly use the skills I learned in my everyday work as a PhD student in the field of biomedical science. The course is heavily focused on practical knowledge, which is excellent if your background is not in computer science or programming.”


Juan Carlos G., PhD Student, University of Navarra

“The DataTrain instructors provided good context as to why learning the basics of data management and data visualization is key for researchers. We learned the most important commands in R by doing, and the practice exercises were very useful. The instructors were available all the time to answer individual questions. I wished we had more time than two mornings - the course was great!”


Paula RC, Staff Scientist, ISGlobal (Barcelona Institute for Global Health)

Ready to hop aboard?
Create a customized training program for your organization today!