COM6012 Scalable Machine Learning
This module focuses on technologies and algorithms that can be applied to data at a very large scale (e.g. population level). From a theoretical perspective, it covers the parallelization of algorithms and algorithmic approaches such as stochastic gradient descent. There is also a significant practical element covering approaches to deploying scalable ML in practice, such as Spark, programming languages such as Python/Scala, and deployment on high performance computing platforms/clusters.
Coursework and Blackboard quizzes
Dr Mauricio Alvarez & Dr Haiping Lu
Unconfirmed practical marks when available
This unit aims to provide a deeper understanding of the fundamental technologies underlying data analytics at scale. In particular, it will provide an advanced understanding of:
- parallelization of algorithms and algorithmic approaches such as stochastic gradient descent
- practical skills relating to the deployment of scalable ML
By the end of the unit, a student will be able to:
- understand the theoretical issues and wider context relating to ML at scale;
- understand practical parallelization of algorithms and algorithmic approaches, using techniques such as stochastic gradient descent;
- deploy a practical implementation of ML at scale, using Spark and programming languages such as Python/Scala;
- deploy onto high performance computing platforms/clusters.
- Spark overview
- Scala programming
Spark & HPC
- Spark DataFrame/Dataset
- Machine learning pipeline
- High performance computing
Parallelization & optimization in Spark
Scalable matrix factorization for collaborative filtering & applications
Scalable KMeans clustering & applications
Scalable PCA for dimensionality reduction & applications
Scalable decision trees & applications
Scalable logistic regression & applications
Scalable GLM & applications
Scalable neural networks
Lectures, laboratory classes.
Immediate feedback on exercises in laboratory classes. After each coursework stage, feedback through a debriefing lecture and individual marking.