Workshop: Data Preparation for Analytics
Course Objectives
“The world’s most valuable resource is no longer oil, but data” – The Economist, May 2017.
Data has the potential to create immense business value, to disrupt existing business models, and to create new ones. But like oil, data needs refining to fulfill this potential. The time required to get data into shape for machine learning and artificial intelligence algorithms or for statistical analysis is often underestimated. Furthermore, introductions to data science typically focus on methods and algorithms and do not adequately cover the required data preparation.
This workshop aims to enable students to go beyond the unrealistically clean datasets provided in data science and machine learning tutorials. Instead, students learn how to handle data as they would encounter it in real-life business situations, where errors, inconsistencies, missing values, duplicates, and many other problems are commonplace. They learn how to combine data from different sources and how to efficiently perform computations, aggregations, and other typical data preparation steps. Finally, students are introduced to the special data preprocessing steps required for machine learning.
Completing this course gives students, especially those aiming for a career in consulting or other areas related to data science and artificial intelligence, an edge in the labor market, where most newcomers have little experience with real-life datasets.
This course is also an ideal complement to the courses “Managing Data Science” and “Visual Data Analysis”.
Course Contents
The course covers the typical data preparation techniques required for analytics:
- Loading and joining data from different types of data sources
- Data types and conversions
- Filtering
- Computations
- Aggregations
- Pivoting / reshaping
- Handling inconsistencies and errors in the data
- Time series operations
- Special preprocessing operations for machine learning
We may emphasize or skip topics based on questions or suggestions during the workshop and based on the pace of the group.
Date | Time |
---|---|
Thursday, 11.01.2024 | 09:45 - 17:00 |
Friday, 02.02.2024 | 09:45 - 17:00 |
Thursday, 08.02.2024 | 09:45 - 17:00 |