Data Science Week - Spring 2023
Contents
- What is the Data Science Week?
- Who can join?
- What can I do to prepare?
- What will we do?
- Schedule
Spring 2023 Data Science Week
Registration for the Spring 2023 Data Science Week, March 6th - 13th, is still open!
What is the Data Science Week?
Over the course of two weeks, BDSi will organize a datathon and a series of workshops and seminars. In the datathon, you will compete with other teams of behavioural data scientists to solve a real data science case. BDSi staff will organize seminars and workshops throughout the first week to introduce the various steps involved in data science. These activities are meant to support the datathon, but are open to all students and staff. Coaches will be available throughout both weeks to guide you when you run into problems.
The goal of the Data Science Week is to introduce interested students and staff to data science in a fun and cooperative way, and to help create a community of data scientists at the University of Twente, the faculty of Behavioural, Management and Social Sciences, and beyond. After one week, the teams with the best solutions and most interesting approaches to the data science problem will present their work and receive a suitable prize.
Who can join?
Staff, students, family, and friends
Everyone related to the University of Twente and their friends and family can join. You can join with friends, colleagues or even family. The event is open to both novices and experts, and everyone in between. You can join the datathon as a team, alone, or skip it altogether and only participate in the workshops. If you do join alone, you can choose to be assigned to a team with other data science enthusiasts.
BDSi primarily supports Data Science for the faculty of Behavioural, Management and Social Sciences (BMS). Other University of Twente students and staff are welcome to attend (with or without their family and friends), as long as spaces are available.
Some experience with R (or another programming language)
You’ll need a basic understanding of R to follow along with the workshops and seminars, as all of our examples will use R and various R packages.
Some programming knowledge is required!
While we will do our best to introduce data science topics in the various workshops, a basic understanding of R will make it much easier to follow along.
Introductory courses and an R helpdesk for staff are provided by the Cognition, Data and Education (CoDE) section.
If you’re confident you can do the datathon in Python (or any other language - we challenge you to try in C, Fortran, Brainf***, or JavaScript), you’re more than welcome to do so. Just be aware that we probably can’t offer support if or when you get stuck.
What can I do to prepare?
Get a team
First off, get a team together. The datathon is meant to be a collaborative experience where you work alongside people with a variety of expertise.
Read the book (or at least pretend to)
The materials we will use are based on the freely available Introduction to Statistical Learning book. If you’re interested in data science, statistical learning or machine learning, this book would be a great place to start. BDSi also organizes a yearly reading club around this book.
Install R, RStudio, and tidyverse
As a faculty, BMS has decided to use R for statistical education. We will follow this example and use R and the tidyverse packages in the workshops and seminars. If you do not already have a preferred programming language, you may want to install R and RStudio. ModernDive has a good primer on installing R and RStudio that also covers the basics of working in R. If you’d like to go further, we recommend the free R for Data Science book by Hadley Wickham - a name you’ll encounter often in the R community.
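Once R and RStudio are installed, adding the tidyverse usually takes a single command in the R console. The snippet below is a minimal sketch, assuming a working internet connection and the default CRAN mirror; it is not an official BDSi setup script.

```r
# Install the tidyverse collection of packages from CRAN (only needed once).
install.packages("tidyverse")

# Load it at the start of each R session; this attaches ggplot2, dplyr,
# tidyr and the other core tidyverse packages.
library(tidyverse)
```

If the `library(tidyverse)` call runs without errors, you should be ready for the workshops.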
What will we do?
Amin Asadi was kind enough to record a video presentation of his submission for the 2021 Data Science Week - have a look!
Amin chose to meet the challenge on his own, but you don't have to. In fact, the datathon is best performed as a group - so that you can learn from each other, and explore together. BDSi staff will be available to give you a helping hand if and when you get stuck.
Schedule
The Data Science Week will start and end with a group session. You will be free to work on the case on your own schedule, and coaches will be available for questions and feedback throughout. The workshops and seminars are scheduled throughout the week, gradually introducing new topics by creating a baseline solution to the datathon.
Kickoff
Monday March 6th, Ravelijn 2501
12:45 - 13:15
After a quick introduction about BDSi, we will introduce the goal of the datathon, and how you can compete. We will also explain how to reach the coaches for help, and give a brief overview of the schedule. Finally we will announce the teams for those who signed up alone and want to join a team.
13:15 - 13:45
Coffee, tea, and cookies while meeting your team and having the opportunity to ask questions to BDSi staff and coaches.
13:45 - 14:30
Quick introduction to resources you can use, followed by a hands-on exploration of the dataset for the datathon. Bring your laptop: you’re expected to get down and dirty with the data!
Workshop Data Wrangling
Tuesday March 7th, Ravelijn 2231
12:45 - 13:30
A 45-minute guided introduction to data wrangling in R, using the ‘tidy’ data principles. Karel Kroeze will show how to prepare a ‘raw’ dataset for analysis by cleaning, reshaping and mutating the data until it gives up all its secrets.
This workshop is also open for those who do not want to participate in the Data Science Week. You can find more information about the workshop here.
13:45 - 14:30
Hands-on data wrangling for the datathon dataset.
Workshop Modelling I
Wednesday March 8th, Oosthorst 111
12:45 - 13:30
A 45-minute guided overview of basic machine learning techniques. Anna Machens will take you through the basics of model fitting, parameter selection and hyperparameter tuning, ending up with a simple but effective predictive model.
This workshop is also open for those who do not want to participate in the Data Science Week. You can find more information about the workshop here.
13:45 - 14:30
Hands-on creation of a basic model for the datathon.
Workshop Modelling II
Thursday March 9th, Carré 2N
12:45 - 13:30
A 45-minute deeper dive into more advanced modelling techniques with Anna Machens.
This workshop is also open for those who do not want to participate in the Data Science Week. You can find more information about the workshop here.
13:45 - 14:30
Hands-on tuning and improvements of a categorization model for the datathon, and plenty of time to ask questions.
Workshop Data Visualization
Friday March 10th, Oosthorst 114
12:45 - 13:30
A 45-minute guided overview of data visualization using ggplot2 and the grammar of graphics. Karel Kroeze will explain the principles of creating and layering visualizations with ggplot in R, and give a quick introduction to interactive visualizations with plotly, shiny and beyond.
This workshop is also open for those who do not want to participate in the Data Science Week. You can find more information about the workshop here.
13:45 - 14:30
Hands-on visualization practical, with a focus on visualizing model results and parameter importance for the datathon.
Submission Deadline
Sunday March 12th
23:59
After spending all weekend with your team fine-tuning your solutions, you will have to submit them before midnight on Sunday. That gives us a bit of time to check your models and pick a winner. In the meantime, you can practice your victory speech - or suddenly have a brilliant idea that it’s too late to implement before submission.
Closing Session
Monday March 13th, Ravelijn 2501
12:45 - 13:15
Debriefing by the BDSi team, and announcement of the winning team(s).
13:15 - 13:45
Presentations by the winning team(s) of their solution(s) and approach. The teams that created the best and most creative solutions will give a short presentation about their approach, and there will be time to ask questions to the winning teams as well as BDSi staff and coaches.
13:45 - 14:30
Coffee, tea, cookies.
References
header image adapted from upklyak