Several research fields, including computer science, require working with large datasets. Doing so calls for a good understanding of the data life cycle, of the main data management issues, and of the existing solutions, both theoretical and practical. This course helps PhD students learn how to manage data for their research activities.
Data life cycle
Data models: beyond the relational model and NoSQL
Data quality issues in big (and small) data
Big data engines: MapReduce, Hadoop, Spark
Data streaming systems: Apache Storm, Apache Flink, Spark Streaming
Data management in practice
All lessons start at 14:30 in the seminar room.