Prof. Andrea Maurino, Dott. Michele Ciavotta
University of Milano-Bicocca, Milano, Italy
In several research fields (including computer science) there is a need to work with large datasets. This requires a good understanding of the data life cycle and knowledge of the main data management issues and existing solutions (both theoretical and practical). In this course we will help PhD students learn how to manage data for their research activities.
Contents:
- Data life cycle
- Data models: beyond the relational model and NoSQL
- Data quality issues in big (and small) data
- Data lakes
- Big data engines: MapReduce, Hadoop, Spark
- Data streaming systems: Apache Storm, Apache Flink, Spark Streaming
- Data management in practice
All lessons start at 14.30 in the seminar room, on the following dates:
- Thursday 1-2-2018
- Friday 9-2-2018
- Thursday 15-2-2018
- Friday 16-2-2018
- Monday 19-2-2018
- Friday 23-2-2018
- Tuesday 27-2-2018
- Friday 2-3-2018