### In-depth tutorials with practical sessions will take place on January 4th, 5th, and 9th

10 to 12 participants per session: registrations after application acceptance.

The sessions will be 3 hours long.

Below is the list of confirmed sessions as of today (click on the session title to see detailed information).

##### ALGORITHMS, THEORY, AND APPLICATIONS OF SUBMODULAR OPTIMIZATION: FROM DISCRETE TO CONTINUOUS AND BACK

**Organizer**: Hamed HASSANI

**Abstract**:

This tutorial will cover recent advances in discrete optimization methods prevalent in large-scale machine learning problems. Traditionally, machine learning has harnessed convex optimization to design fast algorithms with provable guarantees for a broad range of applications. In recent years, however, there has been a surge of interest in applications that involve discrete optimization. For discrete domains, the analog of convexity is submodularity, and the evolving theory of submodular optimization has been a catalyst for progress in extraordinarily varied application areas, including active learning and experimental design, vision, sparse reconstruction, graph inference, video analysis, clustering, document summarization, object detection, information retrieval, network inference, interpreting neural networks, and discrete adversarial attacks.

As applications and techniques of submodular optimization mature, a fundamental gap between theory and application emerges. In the past decade, paradigms such as large-scale learning, distributed systems, and sequential decision making have enabled a quantum leap in the performance of learning methodologies. Incorporating these paradigms in discrete problems has led to fundamentally new frameworks for submodular optimization. The goal of this tutorial is to cover rigorous and scalable foundations for discrete optimization in complex, dynamic environments, addressing challenges of scalability and uncertainty, and facilitating distributed and sequential learning in broader discrete settings.
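The workhorse result in this area is the greedy algorithm of Nemhauser, Wolsey, and Fisher, which attains a (1 − 1/e) approximation for maximizing a monotone submodular function under a cardinality constraint. As a minimal illustration (not drawn from the tutorial materials), here is the greedy rule applied to a coverage function, a canonical submodular objective:

```python
# Greedy maximization of a monotone submodular function under a
# cardinality constraint. Coverage functions are a canonical example:
# f(S) = |union of the sets indexed by S|.

def greedy_coverage(sets, k):
    """Pick up to k sets greedily to maximize the size of their union."""
    chosen, covered = [], set()
    for _ in range(k):
        # Marginal gain of each remaining candidate set.
        gains = {i: len(s - covered) for i, s in enumerate(sets) if i not in chosen}
        best = max(gains, key=gains.get)
        if gains[best] == 0:  # no candidate adds new elements
            break
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 7}]
chosen, covered = greedy_coverage(sets, k=2)  # picks sets 2 and 0, covering all 7 elements
```

The same greedy template (repeatedly add the element with the largest marginal gain) underlies many of the applications listed above; the scalable and continuous variants are what the tutorial develops.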

##### DEEP GENERATIVE MODELS FOR REPRESENTATION LEARNING

**Organizer**: Vincent FORTUIN

**Abstract**:

Like many areas of machine learning, generative modeling has seen great progress through the introduction of deep learning. These deep generative models can not only produce realistic synthetic data, but can also learn useful representations of the data. These representations may then be used for downstream tasks, such as clustering, missing data imputation, semi-supervised learning, anomaly detection or conditional generation. In this tutorial, we will review different model architectures (such as variational autoencoders and normalizing flows) and implement them in PyTorch. We will then try them out in different application scenarios and discuss their strengths and weaknesses.
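At the heart of the variational autoencoder is the reparameterization trick: the latent sample is written as a deterministic function of the encoder's outputs plus independent noise, so gradients can flow through the sampling step. A minimal NumPy sketch of this idea and of the closed-form KL regularizer (the session itself will use PyTorch; the function names here are illustrative):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z ~ N(mu, sigma^2) via z = mu + sigma * eps, eps ~ N(0, I).

    Writing the sample as a deterministic function of (mu, log_var) plus
    independent noise is what lets gradients flow through the encoder.
    """
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, sigma^2) || N(0, I)), the VAE's latent regularizer."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

rng = np.random.default_rng(0)
mu = np.zeros((4, 2))        # batch of 4, latent dimension 2
log_var = np.zeros((4, 2))   # sigma = 1 everywhere
z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)   # zero when q equals the prior
```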

**Requirements**:

Basic knowledge of probabilistic modeling and linear algebra; familiarity with Python, PyTorch, and Jupyter notebooks; a Google account to use Google Colab.

##### INTRODUCTION TO DEEP LEARNING WITH KERAS AND TENSORFLOW

**Organizer**: Olivier GRISEL

**Abstract**:

This session will introduce the main deep learning concepts with worked examples using Keras. In particular, we will cover the following concepts:

- feed-forward fully connected network trained with stochastic gradient descent,
- convolution networks for image classification with transfer learning,
- embeddings (continuous vectors as a representation for symbolic/discrete variables such as words, tags…),
- if time allows: Recurrent Neural Networks for NLP.
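To make the first bullet point concrete, here is what a feed-forward fully connected network trained with stochastic gradient descent boils down to, written from scratch in NumPy (the session itself uses Keras, which hides this machinery; XOR is just an illustrative toy task):

```python
import numpy as np

# Minimal fully connected network (one hidden layer, sigmoid activations)
# trained with plain per-sample stochastic gradient descent on XOR.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

lr = 1.0
for epoch in range(5000):
    for i in rng.permutation(len(X)):       # one sample at a time: SGD
        x, t = X[i:i + 1], y[i:i + 1]
        h = sigmoid(x @ W1 + b1)            # forward pass
        p = sigmoid(h @ W2 + b2)
        # backward pass for the squared error 0.5 * (p - t)^2
        dp = (p - t) * p * (1 - p)
        dh = dp @ W2.T * h * (1 - h)
        W2 -= lr * h.T @ dp; b2 -= lr * dp.ravel()
        W1 -= lr * x.T @ dh; b1 -= lr * dh.ravel()

pred = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
mse = float(np.mean((pred - y) ** 2))
```

In Keras the same model is a few lines (`Sequential`, two `Dense` layers, `compile`, `fit`); the point of starting here is to see what those calls do.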

**Requirements**:

- Working knowledge of Python programming with NumPy
- Basics of linear algebra and statistics
- Environment: Python Jupyter
- Packages: numpy, matplotlib, tensorflow
- Follow the instructions here

##### KERNEL METHODS: FROM BASICS TO MODERN APPLICATIONS

**Organizer**: D.J. SUTHERLAND

**Abstract**: Kernel methods were once the dominant paradigm of machine learning, but have somewhat fallen out of favor with the advent of deep learning. This course will demonstrate how insights from kernel methods can still be beneficial in modern deep learning settings, assuming no prior knowledge of kernels. They seem especially useful for tasks such as meta-learning, distinguishing distributions, and generative modeling. Topics covered will include:

- Basic formulations of kernel spaces and learning with kernels, particularly kernel ridge regression, and the core theoretical results justifying them.
- Distinguishing distributions with the kernel maximum mean discrepancy.
- End-to-end learning of kernels within deep learning frameworks, and why and when we would want to do this.
- If time permits: the use of kernels for theoretical analysis of deep learning.
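Kernel ridge regression, the first topic above, fits entirely in a few lines of NumPy: build the kernel matrix, solve one regularized linear system, and predict by weighting kernel evaluations. A small sketch with a Gaussian kernel (the regularization convention `lam * n` is one common choice, not the only one):

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix K[i, j] = exp(-gamma * ||a_i - b_j||^2)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def kernel_ridge_fit(X, y, lam=1e-3, gamma=1.0):
    """Solve (K + lam * n * I) alpha = y; predictions are K(x, X) @ alpha."""
    n = len(X)
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * n * np.eye(n), y)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(100)

alpha = kernel_ridge_fit(X, y)
X_test = np.linspace(-3, 3, 50)[:, None]
y_pred = rbf_kernel(X_test, X) @ alpha   # smooth fit to the noisy sine
```

The representer theorem (part of the core theory mentioned above) is what guarantees the solution has this form: a weighted sum of kernel evaluations at the training points.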

The coding components of the course will be based on IPython notebooks, which can either be run locally (with the requirements sent beforehand) or in the cloud on Google Colab for free.

**Requirements**:

- General knowledge roughly as assumed for the rest of DS3: basic linear algebra, probability, vector calculus.
- Basic familiarity with Python programming and use of NumPy.
- Bonus: familiarity with PyTorch, though if you haven’t used it before we’ll cover the basics.

##### MODERN BAYESIAN METHODS: PRINCIPLES AND PRACTICE

**Organizer**: Vinayak RAO

**Abstract**: This course focuses on two aspects of Bayesian Statistics: hierarchical modeling and computation. The first part will cover topics from basic hierarchical Bayes to modern nonparametric Bayesian methods like the Dirichlet process. The second part will cover topics such as Markov chain Monte Carlo, Hamiltonian Monte Carlo and optimization-based approaches like variational Bayes. Time permitting, we will also cover hybrid approaches involving deep neural networks, such as variational autoencoders and neural processes. All theory will be accompanied by examples and coding sessions where these ideas will be put into practice on interesting datasets.
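The simplest member of the Markov chain Monte Carlo family covered in the second part is random-walk Metropolis. A minimal sketch (working in log space for numerical stability; the target here is just a standard normal for illustration):

```python
import numpy as np

def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis, a basic MCMC building block.

    Proposes x' = x + step * eps and accepts with probability
    min(1, target(x') / target(x)), computed in log space.
    """
    rng = np.random.default_rng(seed)
    x, log_p = x0, log_target(x0)
    samples = []
    for _ in range(n_samples):
        prop = x + step * rng.standard_normal()
        log_p_prop = log_target(prop)
        if np.log(rng.uniform()) < log_p_prop - log_p:   # accept/reject
            x, log_p = prop, log_p_prop
        samples.append(x)   # on rejection, the current state is repeated
    return np.array(samples)

# Target: standard normal (log density up to an additive constant).
samples = metropolis_hastings(lambda x: -0.5 * x**2, x0=0.0, n_samples=20000)
```

Hamiltonian Monte Carlo and variational Bayes can both be seen as responses to the weaknesses of this random walk: slow exploration and high autocorrelation in high dimensions.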

**Requirements**:

- Fluency with Python or R
- Basics of linear algebra
- Basics of probability, such as Bayes' rule, conditional expectations, and the Gaussian and Dirichlet distributions
- Basics of statistics, such as likelihood, mean, and median
- Basics of calculus and vector calculus (e.g. gradient, Hessian)

##### MODERN RECOMMENDATION FOR ADVANCED PRACTITIONERS

**Organizer**: Flavian VASILE

**Abstract**:

The course contains two parts. In the first part, we will cover current approaches to recommendation based on Empirical Risk Minimization models and their associated shortcomings. In the second part, we will discuss ways to address these shortcomings and introduce policy-based methods as a potential solution. All of the concepts will be followed by examples and coding sessions where we will put the newly introduced conceptual tools into practice.

**Requirements**:

- Working knowledge of **Python** programming with **NumPy**
- Basics of linear algebra
- Notions of vector, matrix, and how to multiply and transpose them
- Basics of machine learning (what is a model? A cost function? What does it mean to train a model? What is the difference between a model parameter and a hyper-parameter? What does “regularizing a model” mean? What is over-fitting? What are the training, validation, and test sets? What is a cross-validation?)
- Basics of recommender systems:
  - Collaborative filtering
  - Matrix factorization
  - Precision@k, Mean Percentile Rank
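The two evaluation metrics named in the last bullet fit in a few lines each. A hedged sketch (exact conventions vary slightly across papers, e.g. whether percentile ranks run over `n` or `n - 1` positions):

```python
import numpy as np

def precision_at_k(ranked_items, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    return len(set(ranked_items[:k]) & set(relevant)) / k

def mean_percentile_rank(rank_positions, n_items):
    """Mean Percentile Rank: average position of the held-out items,
    as a percentile of the catalogue (0% = always ranked first)."""
    return 100.0 * np.mean([r / (n_items - 1) for r in rank_positions])

p = precision_at_k(ranked_items=[3, 1, 4, 1, 5], relevant=[1, 5, 9], k=5)  # 2 of 5 relevant
mpr = mean_percentile_rank(rank_positions=[0, 9], n_items=10)              # 50.0
```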

##### OPTIMIZATION FOR MACHINE LEARNING "HANDS ON"

**Organizers**: Alexandre GRAMFORT, Quentin BERTRAND

**Abstract**:

Modern machine learning relies heavily on optimization tools, typically to minimize so-called loss functions on training sets. The objective of this course is to give an overview of the most commonly employed gradient-based algorithms: proximal/accelerated gradient descent, (proximal) coordinate descent, L-BFGS, and stochastic gradient descent. As the course is meant to be practical, we will see how all these algorithms can be implemented in Python on regression and classification problems. Jupyter notebooks will be used in the programming sessions.
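As a taste of the proximal gradient method mentioned above, here is ISTA applied to the Lasso: a gradient step on the smooth least-squares term, followed by the proximal operator of the L1 penalty (soft-thresholding). A minimal sketch with synthetic data:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(X, y, lam, n_iter=500):
    """Proximal gradient descent (ISTA) for the Lasso:
    min_w  (1 / 2n) * ||y - X w||^2 + lam * ||w||_1
    """
    n = len(y)
    L = np.linalg.norm(X, ord=2) ** 2 / n   # Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n        # gradient of the smooth part
        w = soft_threshold(w - grad / L, lam / L)
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
w_true = np.zeros(50); w_true[:3] = [2.0, -1.0, 1.5]   # sparse ground truth
y = X @ w_true + 0.01 * rng.standard_normal(200)
w_hat = ista(X, y, lam=0.1)   # recovers a sparse, slightly shrunken estimate
```

Swapping soft-thresholding for other proximal operators, and the full gradient for a coordinate or stochastic one, gives most of the other algorithms on the course's list.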

**Requirements**:

Python with numpy, scipy and matplotlib

##### SAFE LEARNING FOR CONTROL

**Organizers**: Melanie ZEILINGER, Lukas HEWLING

**Abstract**:

While reinforcement learning has shown great success in solving complex control tasks, most techniques cannot ensure the safety of the system, in particular during learning iterations. Safety frameworks based on invariance emerged from control theory to address this issue by making use of (approximate) system models. This tutorial gives an introduction to predictive control techniques for such safety frameworks, including a hands-on programming session demonstrating the approach.

Outline:

- Introduction to invariance-based safety frameworks
- Crash course in model predictive control
- Model predictive safety filters
- Hands-on programming session (Python 3, NumPy, CVXPY)
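The idea of a safety filter can be illustrated in one dimension: alter the learned input as little as possible so that the next state respects the constraint. For linear scalar dynamics with a box state constraint this reduces to a closed-form projection (a toy sketch of the concept, not the multi-step predictive formulation the session covers, which requires an MPC solver such as CVXPY):

```python
import numpy as np

def safety_filter(u_learned, x, a, b, x_min, x_max):
    """One-step safety filter for scalar dynamics x+ = a*x + b*u (b > 0).

    Returns the input closest to u_learned such that the next state
    stays in [x_min, x_max]: a projection onto the feasible interval.
    """
    u_lo = (x_min - a * x) / b
    u_hi = (x_max - a * x) / b
    return float(np.clip(u_learned, u_lo, u_hi))

# A learned policy proposes a large input; the filter trims it just
# enough to keep x+ = 1.0 * 0.9 + 1.0 * u within [-1, 1].
u_safe = safety_filter(u_learned=2.0, x=0.9, a=1.0, b=1.0, x_min=-1.0, x_max=1.0)  # 0.1
```

The learning algorithm is left free whenever its input is already safe, and is minimally corrected otherwise; the invariance-based frameworks generalize this to multi-step predictions and richer constraint sets.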

##### STOCHASTIC OPTIMIZATION

**Organizers**: Aymeric DIEULEVEUT, Eric MOULINES

**Abstract**: The purpose of this course is to give an introduction to stochastic convex optimization and its applications in statistical learning. In the first part, we will recall the importance of convex optimization in statistical learning and briefly introduce some useful results from convex analysis. We will then analyze gradient algorithms for strongly convex and then convex smooth functions, and establish complexity lower bounds for such problems. We will see that the gradient algorithm does not achieve the optimal rate, and present an acceleration strategy that does. In the second part, we will study stochastic versions of these algorithms, providing convergence rates and proofs, and also describe variance-reduced algorithms.
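The first convergence result of the course can be checked numerically in a few lines: on a smooth, strongly convex function, gradient descent with step size 1/L contracts the distance to the optimum by a factor of at most (1 − μ/L) per iteration. A minimal sketch on a quadratic:

```python
import numpy as np

# Gradient descent on a strongly convex quadratic f(x) = 0.5 x^T A x - b^T x.
# With step size 1/L, the error contracts geometrically at rate (1 - mu/L).
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 20))
A = M @ M.T + np.eye(20)          # symmetric positive definite
b = rng.standard_normal(20)

eigs = np.linalg.eigvalsh(A)
mu, L = eigs[0], eigs[-1]         # strong convexity / smoothness constants
x_star = np.linalg.solve(A, b)    # the unique minimizer

x = np.zeros(20)
errors = []
for _ in range(200):
    x = x - (A @ x - b) / L       # gradient step with step size 1/L
    errors.append(np.linalg.norm(x - x_star))
```

For a quadratic the contraction is exact linear algebra: the error map is (I − A/L), whose spectral norm is 1 − μ/L. The lower bounds and acceleration in the course show that a better dependence on μ/L is achievable, and how.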

**Requirements**: A good knowledge of the fundamental tools of calculus and linear algebra, as well as the probabilistic tools (random variables, conditional expectation, $\sigma$-algebra).

##### TOPOLOGICAL DATA ANALYSIS: TOPOLOGY-BASED DATA SKELETONIZATION IN PYTHON

**Organizers**: Vitaliy KURLIN, Krasen SAMARDZHIEV

**Abstract**:

We are planning three 50-minute sessions with 10-minute breaks, at most 3 hours in total. Each session involves interactive demonstrations in Python notebooks, which participants can follow in parallel on their laptops. All materials, with installation instructions, will be e-mailed in advance.

Part 1. The data skeletonization problem for unorganized point clouds. The first attempt via a Minimum Spanning Tree.
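The Minimum Spanning Tree of Part 1 can be computed on the complete Euclidean graph of a point cloud with Prim's algorithm; a minimal NumPy sketch (not the tutorial's own notebook code, which will be distributed separately):

```python
import numpy as np

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph of a point cloud.

    Returns the n-1 edges of a Minimum Spanning Tree: a first attempt
    at a one-dimensional 'skeleton' of the data.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool); in_tree[0] = True
    best = dist[0].copy()               # cheapest connection to the tree so far
    parent = np.zeros(n, dtype=int)
    edges = []
    for _ in range(n - 1):
        j = int(np.argmin(np.where(in_tree, np.inf, best)))
        edges.append((int(parent[j]), j))
        in_tree[j] = True
        closer = dist[j] < best         # does the new node offer a cheaper link?
        best[closer] = dist[j][closer]
        parent[closer] = j
    return edges

# A point cloud sampled from a circle: the MST traces out its shape,
# minus one edge (a tree cannot close the loop).
theta = np.linspace(0, 2 * np.pi, 40, endpoint=False)
edges = mst_edges(np.c_[np.cos(theta), np.sin(theta)])
```

The MST's failure to capture cycles is exactly what motivates Parts 2 and 3, where alpha-complexes and persistence recover the missing topology.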

Part 2. A Delaunay triangulation and a filtration of alpha-complexes on a point cloud in the Euclidean space.

Part 3. A Homologically Persistent Skeleton. Its topological and geometric guarantees in a metric space.

##### TOPOLOGICAL DATA ANALYSIS: MAPPER, PERSISTENCE, AND APPLICATIONS

**Organizers**: Pawel DLOTKO, Vincent ROUVREAU

**Abstract**:

Hour 1. Descriptors of point clouds: mapper and ball mapper.

Hour 2. Persistent homology, Betti numbers, and Euler characteristic curves.

Hour 3. Vectorization of persistence diagrams, classification and regression based on them.

If time permits, we will discuss topological descriptors of embedded trees and graphs.
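The Euler characteristic from Hour 2 is the simplest of these invariants: the alternating sum of simplex counts, χ = #vertices − #edges + #triangles − …, which also equals the alternating sum of Betti numbers. A minimal sketch in plain Python (an illustration, not the tutorial's own code, which will use dedicated TDA libraries):

```python
# Euler characteristic of a simplicial complex given as a list of simplices.

def euler_characteristic(simplices):
    """simplices: iterable of vertex tuples, one per simplex (all dimensions)."""
    chi = 0
    for s in simplices:
        chi += (-1) ** (len(s) - 1)   # a k-simplex has k+1 vertices
    return chi

# Boundary of a tetrahedron (a triangulated 2-sphere): chi = 4 - 6 + 4 = 2.
sphere = (
    [(i,) for i in range(4)]                                 # 4 vertices
    + [(i, j) for i in range(4) for j in range(i + 1, 4)]    # 6 edges
    + [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]           # 4 triangles
)
chi = euler_characteristic(sphere)   # → 2
```

Sweeping a filtration parameter and recording χ at each scale gives the Euler characteristic curves of Hour 2.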

**Requirements**:

Basic programming knowledge in Python.