Teaching @ School of Mathematical Sciences, Tel-Aviv University
Spring 2024 - Mathematical foundations of machine learning
Syllabus: In the course we will study Machine Learning (ML) through the lens of geometric approximation theory and modern harmonic analysis. We will review in depth the most successful tools of ML: Support Vector Machines, Random Forest, Gradient Boosting, Deep Learning networks (Multi Layer Perceptron, Convolution, Attention, Transformers). We will discuss related theory and applications in computer vision, natural language, numerical solutions to PDEs, etc.
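As a concrete warm-up for the basic models named in Lesson 1 below, here is a minimal sketch (my own illustration, not course code) of the soft-max model: multinomial logistic regression trained by plain gradient descent on synthetic data. The cluster means, learning rate and iteration count are arbitrary example choices.

```python
# Minimal soft-max (multinomial logistic regression) sketch on synthetic 2D data.
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 300, 2, 3                                   # samples, features, classes
means = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
y = rng.integers(0, k, size=n)                        # class labels
X = means[y] + rng.normal(size=(n, d))                # Gaussian clusters

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)              # numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

W, b = np.zeros((d, k)), np.zeros(k)
Y = np.eye(k)[y]                                      # one-hot targets
for _ in range(500):                                  # gradient descent on cross-entropy
    P = softmax(X @ W + b)
    G = (P - Y) / n                                   # gradient w.r.t. the logits
    W -= 0.5 * X.T @ G
    b -= 0.5 * G.sum(axis=0)

print("training accuracy:", (softmax(X @ W + b).argmax(axis=1) == y).mean())
```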
Lesson 1 - Intro, basic ML models (linear regression, logistic regression, soft-max), ML terminology and notation, function spaces I.
Lesson 2 - Support Vector Machines I, function spaces II
Lesson 3 - Support Vector Machines II, Random Forest I, function spaces III
Lesson 4 - Random Forest II, Wavelet decomposition of RF I, function spaces IV
Lesson 5 - Wavelet decomposition of RF II, Tree-based Classification, function spaces V
Lesson 6 - Feature importance, Boosting, convolutions
Lesson 7 - Deep learning building blocks I
Lesson 8 - Deep learning building blocks II, DL computer vision applications I
Lesson 9 - DL computer vision applications II, Approximation theory of DL I
Lesson 10 - Approximation theory of DL II, Applications of DL in numerical PDEs
Lesson 11 - Transformers, autoregressive image generation
Presentations
- Intro, basic ML models and terminology
- Essentials of function space theory
- Computer vision applications of deep learning
- Applications of DL in numerical PDEs
- Autoregressive image generation using wavelets
Assignments
References
[1] T. Hastie, R. Tibshirani and J. Friedman, Elements of statistical learning, Springer-Verlag 2009.
[2] Y. LeCun, Y. Bengio and G. Hinton, Deep Learning, Nature 521 (2015), 436–444.
[3] R. DeVore & G. Lorentz, Constructive Approximation
[4] R. DeVore, Nonlinear approximation, Acta Numerica 1998, 51-150.
[5] S. Dekel and D. Leviatan, Adaptive multivariate approximation using binary space partitions and geometric wavelets, SIAM Journal on Numerical Analysis 43 (2005), 707-732.
[6] O. Elisha and S. Dekel, Wavelet decomposition of Random Forests - smoothness analysis, sparse approximation and applications, JMLR 17 (2016).
[7] O. Morgan, O. Elisha and S. Dekel, Wavelet decomposition of Gradient Boosting, https://arxiv.org/abs/1805.02642
[8] I. Ben-Shaul, S. Dekel and O. Elisha, Sparse Besov space analysis of deep learning representation layers in high dimensions, Pure and Applied Functional Analysis, to appear.
[9] S. Ruder, An overview of gradient descent optimization algorithms, https://arxiv.org/abs/1609.04747
[10] I. Ben-Shaul and S. Dekel, Nearest class center simplification through intermediate layers, PMLR 196, 2022.
[11] I. Ben-Shaul, T. Galanti and S. Dekel, Exploring the approximation capabilities of multiplicative neural networks for smooth functions, TMLR 2023.
[12] M. Phuong and M. Hutter, Formal algorithms for transformers, DeepMind, 2022.
Fall 2023 - Foundations of approximation theory
Syllabus: Approximation theory is one of the main theoretical pillars of applied mathematics. One of its goals is to characterize the classes of functions that can be approximated by a specified algorithm with the error decaying at a certain qualitative rate. Examples of approximation algorithms are: Fourier series, algebraic polynomials, splines, wavelets, finite elements, etc. To provide the theoretical foundations of signal processing & machine learning, approximation theory applies tools that measure weak-type smoothness of functions, which makes it possible to assess the ‘smoothness’ of functions that are not even continuous. One of the main challenges in the theory is multivariate approximation, where modeling the geometry of the approximated function plays an important role. The syllabus includes: weak-type smoothness, function spaces, trigonometric approximation, local polynomial approximation, splines, multiresolution, non-linear approximation using piecewise polynomials and wavelets, approximation spaces, the machinery of the Jackson-Bernstein theorems for the characterization of approximation spaces, and geometric approximation.
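To make the characterization goal above concrete, the following standard display (notation as in DeVore & Lorentz; one of several common conventions) defines approximation errors and approximation spaces and states the Jackson-Bernstein machinery covered in Lessons 9-10 below.

```latex
% Approximation spaces and the Jackson-Bernstein machinery (standard setup).
\[
E_n(f)_X := \inf_{g \in \Sigma_n} \|f - g\|_X, \qquad
\|f\|_{A^{\alpha}_q(X)} := \|f\|_X
  + \Big( \sum_{n \ge 1} \big[ n^{\alpha} E_n(f)_X \big]^q \, \tfrac{1}{n} \Big)^{1/q}.
\]
% If, under mild conditions on the families \Sigma_n, a smoothness space Y \subset X
% satisfies the Jackson and Bernstein inequalities
\[
E_n(f)_X \le C\, n^{-r} |f|_Y \ \ (f \in Y), \qquad
|g|_Y \le C\, n^{r} \|g\|_X \ \ (g \in \Sigma_n),
\]
% then for 0 < \alpha < r and 0 < q \le \infty the approximation space is the
% real interpolation space
\[
A^{\alpha}_q(X) = (X, Y)_{\alpha/r,\, q}.
\]
```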
Lesson 1 - Introduction, Lp and Hilbert spaces
Lesson 2 - Fourier series, Approximation using trigonometric polynomials I
Lesson 3 - Fourier integral, Approximation with piecewise constants I
Lesson 4 - Review
Lesson 5 - Approximation with piecewise constants II, Modulus of smoothness, K-functional
Lesson 6 - Lipschitz spaces, nonlinear approximation I, Jackson theorem for trigonometric polynomials I, Besov space I
Lesson 7 - Besov space II, Shift invariant spaces I
Lesson 8 - Shift invariant spaces II, wavelets I
Lesson 9 - Jackson theorem for trigonometric polynomials II, Approximation spaces I (Dany Leviatan)
Lesson 10 - Approximation spaces II, Jackson-Bernstein machinery, Bernstein-type theorems (Dany Leviatan)
Lesson 11 - wavelets II
Lesson 12 - Review of theorems
Presentations:
- What is approximation theory?
Assignments:
References:
R. DeVore & G. Lorentz, Constructive Approximation, Springer-Verlag, 1993.
R. DeVore, Nonlinear Approximation, Acta Numerica (1998), 51-150.
S. Brenner and L. Scott, The mathematical theory of finite element methods, Springer 1994.
L. Grafakos, Classical and modern Fourier analysis, Prentice-Hall, 2004.
R. Adams and J. Fournier, Sobolev Spaces (2nd edition).
Spring 2023 - Mathematical foundations of machine learning
With Ido Ben-Shaul and Yuval Zelig
Syllabus: In the course we will approach Machine Learning (ML) from the perspective of geometric approximation theory and modern harmonic analysis. We will review in depth the most successful tools of ML: Support Vector Machines, Random Forest, Gradient Boosting, Deep Learning networks (Multi Layer Perceptron, Convolution, Attention, Transformers). We will discuss related theory and applications in computer vision, natural language, numerical solutions to PDEs, etc.
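As a concrete companion to the Random Forest and feature-importance lessons below, here is a minimal scikit-learn sketch (my own illustration, not the wavelet-based method of [6]) contrasting the two standard importance scores: impurity-based and permutation importance. The dataset and hyper-parameters are arbitrary example choices.

```python
# Random Forest regression with impurity-based vs. permutation feature importance.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=8, n_informative=3,
                       noise=5.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("impurity-based:", np.round(rf.feature_importances_, 3))

perm = permutation_importance(rf, X_te, y_te, n_repeats=10, random_state=0)
print("permutation   :", np.round(perm.importances_mean, 3))
```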
Lesson 1 - Intro, sequential sampling (normal distribution, Beta distribution, Dirichlet distribution), basic ML models (linear regression, logistic regression, soft-max)
Lesson 2 - ML terminology and notation, function spaces I, Random Forest I
Lesson 3 - Function spaces II, Random Forest II
Lesson 4 - Function spaces III, Wavelet decomposition of RF I
Lesson 5 - Function spaces IV, Wavelet decomposition of RF II
Lesson 6 - Function spaces V, feature importance
Lesson 7 - Approximation spaces, Besov smoothness of datasets, Support Vector machines, anisotropic RF using SVM, AdaBoost
Lesson 8 - Gradient Boosting, wavelet-based Gradient Boosting, Deep Learning basics I
Lesson 9 - Deep Learning basics II, Computer vision applications of DL I
Lesson 10 - Computer vision applications of DL II, Mathematical analysis of DL I
Lesson 11 - Mathematical analysis of DL II, Applications of DL in numerical solutions for PDEs
Lesson 12 - NLP, Transformers
Lesson 13 - Applied ML workshop I
Lesson 14 - Applied ML workshop II, Review of Summer projects
Presentations:
- Intro, sequential sampling, basic ML models, ML terminology
- Function spaces, approximation theory
- Decision Trees, Random Forest, Wavelet decomposition of RF
- Deep learning - basic concepts
- Computer vision applications of deep learning
- PDE applications of deep learning
Assignments:
References:
[1] T. Hastie, R. Tibshirani and J. Friedman, Elements of statistical learning, Springer-Verlag 2009.
[2] Y. LeCun, Y. Bengio and G. Hinton, Deep Learning, Nature 521 (2015), 436–444.
[3] R. DeVore & G. Lorentz, Constructive Approximation
[4] R. DeVore, Nonlinear approximation, Acta Numerica 1998, 51-150.
[5] S. Dekel and D. Leviatan, Adaptive multivariate approximation using binary space partitions and geometric wavelets, SIAM Journal on Numerical Analysis 43 (2005), 707-732.
[6] O. Elisha and S. Dekel, Wavelet decomposition of Random Forests - smoothness analysis, sparse approximation and applications, JMLR 17 (2016).
[7] O. Morgan, O. Elisha and S. Dekel, Wavelet decomposition of Gradient Boosting, https://arxiv.org/abs/1805.02642
[8] O. Elisha and S. Dekel, Function space analysis of deep learning representation layers, https://arxiv.org/abs/1710.03263
[9] S. Ruder, An overview of gradient descent optimization algorithms, https://arxiv.org/abs/1609.04747
[10] T. Chen, Introduction to boosted trees, 2014.
[11] I. Ben-Shaul and S. Dekel, Sparsity-Probe: analysis tool for deep learning models, IMVC 2021.
[12] I. Ben-Shaul and S. Dekel, Nearest class center simplification through intermediate layers, PMLR 196, 2022.
[13] I. Ben-Shaul, T. Galanti and S. Dekel, Exploring the approximation capabilities of multiplicative neural networks for smooth functions, submitted.
[14] M. Phuong and M. Hutter, Formal algorithms for transformers, DeepMind, 2022.
[15] ChatGPT - https://openai.com/blog/chatgpt/
Spring 2022 - Mathematical foundations of machine learning
With Ido Ben-Shaul
Syllabus: In the course we will approach Machine Learning (ML) from the perspective of geometric approximation theory and modern harmonic analysis. We will review in depth the most successful tools of ML: Prophet, Gaussian Processes, Support Vector Machines, Random Forest, Gradient Boosting, Deep Learning. We will discuss related theory and applications in computer vision, numerical solutions to PDEs, etc.
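As a companion to the Gaussian Processes material in Lesson 2 below (in the spirit of [11]), here is a minimal scikit-learn sketch, my own illustration rather than course code, of GP regression on a noisy time series: an RBF kernel plus a white-noise term, with hyper-parameters fitted by maximum likelihood.

```python
# Gaussian-process regression for a noisy time series (RBF + white-noise kernel).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 80)[:, None]                   # sample times
y = np.sin(t).ravel() + 0.2 * rng.normal(size=80)     # noisy observations

kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(t, y)

t_new = np.linspace(0, 12, 200)[:, None]              # includes extrapolation region
mean, std = gp.predict(t_new, return_std=True)        # posterior mean and uncertainty
print(gp.kernel_)                                     # fitted hyper-parameters
```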
Lesson 1 - Introduction, adaptive sampling using the Beta and Dirichlet distributions, basic ML models: linear regression, logistic regression, soft-max.
Lesson 2 - Basic definitions of ML, function spaces, Gaussian Processes for noisy time series
Lesson 3 - Function spaces II, Decision Trees, Random Forest
Lesson 4 - Function spaces III, Wavelet decomposition of Random Forest
Lesson 5 - Function spaces IV, feature importance using linear correlation, tree-based feature importance, wavelet-based feature importance.
Lesson 6 - Function spaces V, Besov space smoothness of datasets, Support Vector Machines (SVM), anisotropic RF using SVM.
Lesson 7 - AdaBoost and additive models, Gradient Boosting, wavelet-based Gradient Boosting, Deep Learning I
Lesson 8 - Deep Learning II, Computer vision/imaging applications of DL I
Lesson 9 - Computer vision/imaging applications of DL II, Mathematical analysis of DL I
Lesson 10 - Mathematical analysis of DL II, PDE applications of DL I
Lesson 11 - PDE applications of DL II, Attention models (Transformers)
Lesson 12 - Applied ML workshop (Ido Ben-Shaul)
Lesson 13 - Review of projects
Presentations
- Intro, statistics and basic models
- Wavelet decomposition of Random Forest
- Deep learning building blocks
- Computer vision/imaging applications of DL
- Applied workshop with Ido Ben Shaul
Assignments
Summer project list, summer assignment
References:
[1] T. Hastie, R. Tibshirani and J. Friedman, Elements of statistical learning, Springer-Verlag 2009.
[2] Y. LeCun, Y. Bengio and G. Hinton, Deep Learning, Nature 521 (2015), 436–444.
[3] R. DeVore & G. Lorentz, Constructive Approximation
[4] R. DeVore, Nonlinear approximation, Acta Numerica 1998, 51-150.
[5] S. Dekel and D. Leviatan, Adaptive multivariate approximation using binary space partitions and geometric wavelets, SIAM Journal on Numerical Analysis 43 (2005), 707-732.
[6] O. Elisha and S. Dekel, Wavelet decomposition of Random Forests - smoothness analysis, sparse approximation and applications, JMLR 17 (2016).
[7] O. Morgan, O. Elisha and S. Dekel, Wavelet decomposition of Gradient Boosting, https://arxiv.org/abs/1805.02642
[8] O. Elisha and S. Dekel, Function space analysis of deep learning representation layers, https://arxiv.org/abs/1710.03263
[9] S. Ruder, An overview of gradient descent optimization algorithms, https://arxiv.org/abs/1609.04747
[10] T. Chen, Introduction to boosted trees, 2014.
[11] S. Roberts, M. Osborne, M. Ebden, S. Reece, N. Gibson and S. Aigrain, Gaussian processes for time-series modelling, Philosophical transactions of the royal society 371 (2013)
[12] I. Ben-Shaul and S. Dekel, Sparsity-Probe: analysis tool for deep learning models, IMVC 2021.
[13] I. Ben-Shaul and S. Dekel, Nearest class center simplification through intermediate layers, submitted.
[14] R. DeVore, B. Hanin and G. Petrova, Neural network approximation, Acta Numerica (2021), 327-444.
Spring 2021 - Introduction to function space theory
Syllabus: In the course we will review the range of function spaces that are fundamental to mathematical analysis and their various characterizations through harmonic analysis, atomic representations and approximation spaces: Lp spaces, Hardy spaces, Sobolev spaces, Triebel-Lizorkin spaces, Besov spaces. Time allowing, we will also cover interpolation of function spaces and function spaces over manifolds.
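As a small worked example of the kind of characterization the course develops (Lesson 4 below treats the Fourier representation of W^r_2), here is the standard statement in my notation:

```latex
% Fourier characterization of the Sobolev space W^r_2(R^d) (standard fact).
\[
\|f\|_{W^r_2(\mathbb{R}^d)}^2 = \sum_{|\alpha| \le r} \|D^{\alpha} f\|_{L_2}^2
\;\asymp\; \int_{\mathbb{R}^d} \big(1 + |\xi|^2\big)^{r} |\hat f(\xi)|^2 \, d\xi ,
\]
% so membership in W^r_2 amounts to a weighted L_2 decay condition on the Fourier
% transform, with equivalence constants depending only on r and d.
```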
References
E. Stein, Harmonic analysis: real variable methods, orthogonality and oscillatory integrals
L. Grafakos, Classical and modern Fourier analysis
L. Tartar, An introduction to Sobolev spaces and interpolation spaces
R. Adams and J. Fournier, Sobolev Spaces (2nd edition)
R. DeVore & G. Lorentz, Constructive Approximation
Lesson 1 - Lp spaces, weak Lp spaces.
Lesson 2 - first glimpse into function space interpolation, first glimpse into Hardy spaces. Schwartz class, Distributions, convolutions, Sobolev spaces I.
Lesson 3 - Sobolev spaces II, Fourier transform of Schwartz functions.
Lesson 4 - Fourier transform II, Fourier transform of distributions, Fourier representation of W^r_2.
Lesson 5 - Derivation of Fourier integral from Heat equation, Maximal functions, Hardy spaces I.
Lesson 6 - Hardy spaces II
Lesson 7 - Hardy spaces III, Moduli of smoothness I
Lesson 8 - Moduli of smoothness II, K-functional
Lesson 9 - Generalized Lipschitz spaces, Approximation with piecewise constants, Besov spaces I
Lesson 10 - Besov spaces II
Fall 2020 - Foundations of approximation theory
Syllabus: Approximation theory is one of the main theoretical pillars of applied mathematics. One of its goals is to characterize the classes of functions that can be approximated by a specified algorithm with the error decaying at a certain qualitative rate. Examples of approximation algorithms are: Fourier series, algebraic polynomials, splines, wavelets, finite elements, etc. To provide the theoretical foundations of signal & data analysis, approximation theory applies tools that measure weak-type smoothness of functions, which makes it possible to assess the ‘smoothness’ of functions that are not even continuous. One of the main challenges in the theory is multivariate approximation, where modeling the geometry of the approximated function plays an important role. The syllabus includes: weak-type smoothness, function spaces, trigonometric approximation, local polynomial approximation, splines, multiresolution, non-linear approximation using piecewise polynomials and wavelets, approximation spaces, the machinery of the Jackson-Bernstein theorems for the characterization of approximation spaces, and geometric approximation.
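Lessons 3-4 below introduce the modulus of smoothness and the K-functional; for reference, here are the standard definitions and their classical equivalence (as in DeVore & Lorentz), written in my notation:

```latex
% r-th modulus of smoothness and its equivalence with the K-functional.
\[
\Delta_h^r f(x) := \sum_{k=0}^{r} (-1)^{r-k} \binom{r}{k} f(x + k h), \qquad
\omega_r(f, t)_p := \sup_{0 < |h| \le t} \big\| \Delta_h^r f \big\|_{L_p},
\]
\[
K_r(f, t^r)_p := \inf_{g \in W^r_p} \big\{ \|f - g\|_{L_p} + t^r |g|_{W^r_p} \big\},
\qquad
\omega_r(f, t)_p \asymp K_r(f, t^r)_p, \quad 1 \le p \le \infty .
\]
```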
R. DeVore & G. Lorentz, Constructive Approximation, Springer-Verlag, 1993.
R. DeVore, Nonlinear Approximation, Acta Numerica (1998), 51-150.
S. Brenner and L. Scott, The mathematical theory of finite element methods, Springer 1994.
L. Grafakos, Classical and modern Fourier analysis, Prentice-Hall, 2004.
Lesson 1 - Introduction, Lp spaces, Smoothness spaces I
Lesson 2 - Smoothness spaces II, Trigonometric polynomials and Fourier series approximation, Dirichlet, Fejér, Summability kernels, Fourier integral I
Lesson 3 - Fourier integral II, approximation with piecewise constants, modulus of smoothness
Lesson 4 - K-functional, Lip spaces, first glimpse at nonlinear approximation (free knot piecewise constants), Jackson theorem for trigonometric polynomials
Lesson 5 - Besov spaces, "local" algebraic polynomial approximation I
Lesson 6 - "local" algebraic polynomial approximation II
Lesson 7 - Approximation from shift invariant spaces
Lesson 8 - Approximation from shift invariant spaces II, Wavelets I
Lesson 9 - Wavelets II
Lesson 10 - Wavelets III, Approximation spaces I
Lesson 11 - Approximation spaces II
Lesson 12 - Approximation spaces III
Lesson 13 - Review of theorems for the exam
Lecture notes: notes, local polynomial approximation notes
Spring 2020- Mathematical foundations of machine learning
Syllabus: In the course we will approach Machine Learning (ML) from the perspective of geometric approximation theory and modern harmonic analysis. We will review in depth the most successful tools of ML: Prophet, Gaussian Processes, Support Vector Machines, Random Forest, Gradient Boosting, Deep Learning. We will discuss related theory and applications in computer vision, numerical solutions to PDEs, etc.
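As a companion to the boosting material in Lesson 8 below, here is a minimal sketch of the additive-model view of gradient boosting for squared loss (standard boosting as in [1]; my own illustration, not the wavelet-based variant of [7]): each shallow tree is fit to the current residuals. Hyper-parameters are arbitrary example choices.

```python
# Gradient boosting for squared loss: shallow trees fit to residuals, added with a
# small learning rate.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=400)

lr, trees = 0.1, []
F = np.zeros_like(y)                                  # current ensemble prediction
for _ in range(100):
    tree = DecisionTreeRegressor(max_depth=2).fit(X, y - F)   # fit the residuals
    trees.append(tree)
    F += lr * tree.predict(X)                         # gradient step for squared loss

print("train MSE:", np.mean((y - F) ** 2))
```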
Lesson 1 - Introduction to the course - presentation
Lesson 2 - Linear regression, logistic regression, soft-max, statistical evaluation metrics - notes, Function space theory I [3]- notes
Lesson 3 - Function space theory II [3] - notes, Gaussian Processes + application for noisy time series [12] - notes
Lesson 4 - Function space theory III [3] - notes, Decision trees, Random Forest [6],[7].
Lesson 5 - Besov spaces [3] - notes, Besov smoothness of indicator function - notes, RF classification - mapping to vector regression [6], [8] - notes, standard methods [1] - notes. Scikit Learn RF
Lesson 6 - Kaggle datasets, Wavelet decomposition of RF, Besov index of datasets - theory and applications [6],[8], Linear correlation - notes
Lesson 7 - Feature importance - standard methods & wavelet method [6], Support Vector Machines ([1] Chapter 12), Anisotropic RF using linear SVM, additional notes
Lesson 8 - AdaBoost and additive models [1], Wavelet-based Gradient Boosting [7], Jackson theorem for wavelet decomposition of RF [6] - additional notes, Deep learning building blocks I - Convolutions I
Lesson 9 - Deep learning building blocks II - Convolutions II, non-linearities, pooling methods, Loss functions, Back Propagation [2] - additional notes.
Lesson 10 - Gradient descent [9], applications of DL in CV, additional notes.
Lesson 11 - Applications of DL in CV II, applications of DL in numerical PDEs
Lesson 12 - Deep neural decision forests & wavelets [13], Prophet model for time series forecasting [14]
Lesson 13 - Review of summer assignment & projects
Summer projects, summer assignment - submission on 31st August 2020
References:
[1] T. Hastie, R. Tibshirani and J. Friedman, Elements of statistical learning, Springer-Verlag 2009.
[2] Y. LeCun, Y. Bengio and G. Hinton, Deep Learning, Nature 521 (2015), 436–444.
[3] R. DeVore & G. Lorentz, Constructive Approximation
[4] R. DeVore, Nonlinear approximation, Acta Numerica 1998, 51-150.
[5] S. Dekel and D. Leviatan, Adaptive multivariate approximation using binary space partitions and geometric wavelets, SIAM Journal on Numerical Analysis 43 (2005), 707-732.
[6] O. Elisha and S. Dekel, Wavelet decomposition of Random Forests - smoothness analysis, sparse approximation and applications, JMLR 17 (2016).
[7] O. Morgan, O. Elisha and S. Dekel, Wavelet decomposition of Gradient Boosting, https://arxiv.org/abs/1805.02642
[8] O. Elisha and S. Dekel, Function space analysis of deep learning representation layers, https://arxiv.org/abs/1710.03263
[9] S. Ruder, An overview of gradient descent optimization algorithms, https://arxiv.org/abs/1609.04747
[10] T. Chen, Introduction to boosted trees, 2014.
[11] Theories of Deep Learning lecture notes, STATS 385, Stanford University 2017, 2019.
[12] S. Roberts, M. Osborne, M. Ebden, S. Reece, N. Gibson and S. Aigrain, Gaussian processes for time-series modelling, Philosophical transactions of the royal society 371 (2013)
[13] P. Kontschieder, M. Fiterau, A. Criminisi, S. Rota Bulo, Deep neural decision forests, ICCV 2015.
[14] S. Taylor and B. Letham, Forecasting at scale, 2017.
======================
Spring 2019 - Mathematical foundations of machine learning
Syllabus: In the course we will approach Machine Learning (ML) from the perspective of geometric approximation theory and modern harmonic analysis. We will review in depth the most successful tools of ML: Support Vector Machines, Random Forest, Gradient Boosting and Deep Learning and discuss related theory and applications.
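As a companion to the Support Vector Machine material in Lesson 7 below ([1], Chapter 12), here is a minimal scikit-learn sketch (my own illustration, not course code) of a linear SVM on separable toy data, recovering the hyperplane, the support vectors and the margin width.

```python
# Linear SVM on two Gaussian blobs: separating hyperplane w.x + b = 0, margin 2/||w||.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=-2.0, size=(100, 2)),
               rng.normal(loc=2.0, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

svm = SVC(kernel="linear", C=1.0).fit(X, y)
w, b = svm.coef_[0], svm.intercept_[0]
print("hyperplane w, b:", w, b)
print("support vectors per class:", svm.n_support_)
print("margin width:", 2 / np.linalg.norm(w))
```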
Lesson 1 - Introduction to the course (presentation)
Lesson 2 - Linear regression, logistic regression, soft-max, statistical evaluation metrics, Function space theory I [4]
Lesson 3 - Function space theory II [4], Decision tree I [1]
Lesson 4 - Function space theory III [4], Decision tree II [1]
Lesson 5 - Decision tree III, Wavelet decomposition of a decision tree , Random Forest [7]
Lesson 6 - Wavelet decomposition of RF [7], numeric estimate of Besov smoothness of a data set [9], RF-based feature importance - standard methods [1], wavelet-based [7].
Lesson 7 - Jackson theorem for wavelet decomposition of RF [7], Support Vector Machines ([1] Chapter 12), Anisotropic RF using linear SVM, Tree-based Gradient Boosting [1], [11], Wavelet-based Gradient Boosting [8].
21st April - No lesson (Passover vacation)
Lesson 8 - Deep learning building blocks I - Convolutions, non-linearities, pooling methods, Loss functions I, DL architectures I
Lesson 9 - Deep learning building blocks II - Loss functions II, Back Propagation [3], Gradient descent I [10], DL architectures II.
Lesson 10 - DL computer vision applications, AI based numerical solutions for PDEs
Lesson 11 - Overview of platform for projects I - Azure Cloud, software packages (Oren Elisha, Microsoft) presentation
Lesson 12 - Review of projects.
Lesson on 2nd of June is cancelled
Course mini-conference/project presentations - Sunday, September 22nd 2019, WIX offices, Bitan 26 TLV port schedule
References:
[1] T. Hastie, R. Tibshirani and J. Friedman, Elements of statistical learning, Springer-Verlag 2009.
[2] S. Mallat, Understanding deep convolutional networks, Phil. Trans. R. Soc 374 (2016).
[3] Y. LeCun, Y. Bengio and G. Hinton, Deep Learning, Nature 521 (2015), 436–444.
[4] R. DeVore & G. Lorentz, Constructive Approximation
[5] R. DeVore, Nonlinear approximation, Acta Numerica 1998, 51-150.
[6] S. Dekel and D. Leviatan, Adaptive multivariate approximation using binary space partitions and geometric wavelets, SIAM Journal on Numerical Analysis 43 (2005), 707-732.
[7] O. Elisha and S. Dekel, Wavelet decompositions of Random Forests - smoothness analysis, sparse approximation and applications, JMLR 17 (2016).
[8] O. Morgan, O. Elisha and S. Dekel, Wavelet decomposition of Gradient Boosting, https://arxiv.org/abs/1805.02642
[9] O. Elisha and S. Dekel, Function space analysis of deep learning representation layers, https://arxiv.org/abs/1710.03263
[10] S. Ruder, An overview of gradient descent optimization algorithms, https://arxiv.org/abs/1609.04747
[11] T. Chen, Introduction to boosted trees, 2014.
[12] Theories of Deep Learning lecture notes, Stanford University 2017.
======================
Spring 2018 - Mathematical foundations of machine learning
Syllabus: In the course we will approach Machine Learning (ML) from the perspective of geometric approximation theory and modern harmonic analysis. We will review in depth the most successful tools of ML: Support Vector Machines, Random Forest, Gradient Boosting and Deep Learning and discuss related theory and applications.
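As a companion to the convolution building block of Lessons 8-9 below, here is a minimal sketch (my own illustration, not course code) of a single-channel 2D "valid" convolution followed by a ReLU non-linearity, the basic operation behind convolutional layers.

```python
# Naive single-channel 2D convolution (valid mode) followed by ReLU.
import numpy as np

def conv2d(image, kernel):
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0.0)

image = np.random.default_rng(0).normal(size=(8, 8))
edge_filter = np.array([[1.0, 0.0, -1.0]] * 3)        # simple vertical-edge detector
feature_map = relu(conv2d(image, edge_filter))
print(feature_map.shape)                               # (6, 6)
```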
Lesson 1 - Quick introduction, Statistics - basic definitions
Lesson 2 - Function spaces I ([4], parts of sections 2.1, 2.5, 2.6, 2.7)
Lesson 3 - Function Spaces II ([4], parts of sections 2.7, 2.9, [5] parts of sections 3.1, 3.2), Decision Trees I ([1], [6], [7])
Lesson 4 - Decision Trees II ([1], [6], [7]), Wavelet decomposition of decision trees ([6],[7]), Random Forest ([1], [7]), Examples of RF code in R, Wavelet decomposition of RF ([7])
Lesson 5 - RF-based feature importance - standard methods [1], wavelet-based [7]
Lesson 6 - Function Spaces III ([4] Section 2.10), Definition and numeric estimate of Besov smoothness of data set [9]
Lesson 7 - Jackson theorem for wavelet decomposition of RF [7], Support Vector Machines ([1] Chapter 12), Anisotropic RF using linear SVM,
Lesson 8 - AdaBoost [1], Tree-based Gradient Boosting [1], Wavelet-based GB [8], Deep learning building blocks I - convolutions [3]
Lesson 9 - Deep learning building blocks II - Convolutions II, Logistic Regression, Soft-Max layer [3], Gradient descent I [10], Back Propagation [3].
Lesson 10 - Deep Learning architectures. Computer vision using DL - applications (presentation). With Meir Perez (AI team leader, WIX)
[20th May, Shavuot holiday, no lesson]
Lesson 11 - Applied ML hands-on session I (presentation). With Oren Elisha (Microsoft R&D)
Lesson 12 - Applied ML hands-on session II (presentation). With Oren Elisha (Microsoft R&D)
Lesson 13 - Gradient descent II [10], Projects review, epilogue [11]
Oren Elisha - Further DL implementation tips
Course mini-conference/project presentations - Wednesday, September 5th, WIX offices, Bitan 26 TLV port Schedule
References:
[1] T. Hastie, R. Tibshirani and J. Friedman, Elements of statistical learning, Springer-Verlag 2009.
[2] S. Mallat, Understanding deep convolutional networks, Phil. Trans. R. Soc 374 (2016).
[3] Y. LeCun, Y. Bengio and G. Hinton, Deep Learning, Nature 521 (2015), 436–444.
[4] R. DeVore & G. Lorentz, Constructive Approximation
[5] R. DeVore, Nonlinear approximation, Acta Numerica 1998, 51-150.
[6] S. Dekel and D. Leviatan, Adaptive multivariate approximation using binary space partitions and geometric wavelets, SIAM Journal on Numerical Analysis 43 (2005), 707-732.
[7] O. Elisha and S. Dekel, Wavelet decompositions of Random Forests - smoothness analysis, sparse approximation and applications, JMLR 17 (2016).
[8] O. Morgan, O. Elisha and S. Dekel, Wavelet decomposition of Gradient Boosting, https://arxiv.org/abs/1805.02642
[9] O. Elisha and S. Dekel, Function space analysis of deep learning representation layers, https://arxiv.org/abs/1710.03263
[10] S. Ruder, An overview of gradient descent optimization algorithms, https://arxiv.org/abs/1609.04747
[11] Theories of Deep Learning lecture notes, Stanford University 2017.
Fall 2017 - Introduction to function space theory
Syllabus: In the course we will review the range of function spaces that are fundamental to mathematical analysis and their various characterizations through harmonic analysis, atomic representations and approximation spaces: Lp spaces, Hardy spaces, Sobolev spaces, Triebel-Lizorkin spaces, Besov spaces. Time allowing, we will also cover interpolation of function spaces and function spaces over manifolds.
References
E. Stein, Harmonic analysis: real variable methods, orthogonality and oscillatory integrals
L. Grafakos, Classical and modern Fourier analysis
L. Tartar, An introduction to Sobolev spaces and interpolation spaces
R. Adams and J. Fournier, Sobolev Spaces (2nd edition)
R. DeVore & G. Lorentz, Constructive Approximation
Lesson 1 - Lp spaces, Hilbert spaces, first glimpse into Hardy spaces, weak Lp spaces
Lesson 2 - first glimpse into function space interpolation, Fourier analysis on the Torus, Schwartz functions,
Lesson 3 - Fourier analysis on R^n, distributions, convolutions
Lesson 4 - Real Hardy spaces I
Lesson 5 - Real Hardy spaces II
Lesson 6 - Real Hardy spaces III
Lesson 7 - Sobolev spaces I
Lesson 8 - Sobolev spaces II, Modulus of smoothness I,
Lesson 9 - Modulus of smoothness II, Lipschitz spaces
Lesson 10 - Besov spaces I
Lesson 11 - Besov spaces II, review of theorem list for exam
Spring 2017 - Foundations of approximation theory
Syllabus: Approximation theory is one of the main theoretical pillars of applied mathematics. One of its goals is to characterize the classes of functions that can be approximated by a specified algorithm with the error decaying at a certain qualitative rate. Examples of approximation algorithms are: Fourier series, algebraic polynomials, splines, wavelets, finite elements, etc. To provide the theoretical foundations of signal & data analysis, approximation theory applies tools that measure weak-type smoothness of functions, which makes it possible to assess the ‘smoothness’ of functions that are not even continuous. One of the main challenges in the theory is multivariate approximation, where modeling the geometry of the approximated function plays an important role. The syllabus includes: weak-type smoothness, function spaces, trigonometric approximation, local polynomial approximation, splines, multiresolution, non-linear approximation using wavelets, approximation spaces, the machinery of the Jackson-Bernstein theorems for the characterization of approximation spaces, and geometric approximation.
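To make "non-linear approximation using wavelets" concrete, here is the univariate Jackson-type estimate for N-term wavelet approximation covered in Lessons 10-12 below, in the spirit of DeVore's Acta Numerica survey (my notation, stated under the usual assumptions on the wavelet system):

```latex
% N-term wavelet approximation: Jackson estimate (univariate sketch).
\[
\sigma_N(f)_{L_p} := \inf_{\#\Lambda \le N}
  \Big\| f - \sum_{\lambda \in \Lambda} c_{\lambda} \psi_{\lambda} \Big\|_{L_p}
\;\le\; C\, N^{-\alpha} \, |f|_{B^{\alpha}_{\tau}(L_{\tau})},
\qquad \frac{1}{\tau} = \alpha + \frac{1}{p},
\]
% for a sufficiently smooth, compactly supported wavelet \psi. Together with the
% matching Bernstein inequality, the Jackson-Bernstein machinery characterizes the
% approximation spaces of N-term wavelet approximation in L_p in terms of Besov
% spaces on the line 1/\tau = \alpha + 1/p.
```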
R. DeVore & G. Lorentz, Constructive Approximation, Springer-Verlag, 1993.
R. DeVore, Nonlinear Approximation, Acta Numerica 1998, 51-150.
Lesson 1 - Quick overview, Banach Spaces, Lp spaces, Fourier analysis I
Lesson 2 - Fourier analysis II, Heat Kernel, Fourier transform I
Lesson 3 - Fourier transform II, Smoothness spaces I, Modulus of Smoothness I
Lesson 4 - Modulus of Smoothness II, Smoothness spaces II, Jackson theorem for trigonometric polynomial approximation
Lesson 5 - K-functional, Smoothness spaces III (Lip)
Lesson 6 - Smoothness spaces IV (Besov), Local polynomial approximation
Lesson 7 - Approximation from Shift-Invariant Spaces I
Lesson 8 - Approximation from Shift-Invariant Spaces II, Multiresolution, Free-knot piecewise constant approximation,
Lesson 9 - Wavelets
Lesson 10 - Jackson theorem for Wavelets, Approximation spaces
Lesson 11 - Bernstein theorem for Trigonometric polynomials, Jackson-Bernstein machinery, interpolation spaces
Lesson 12 - Bernstein theorem for N-term wavelets, more examples of characterizations of approximation spaces, review of theorem list for the exam
Fall 2016 - Wavelets
Syllabus: Quick introduction using the Haar system, properties of the Fourier transform, sampling theorems, continuous wavelet transform, frames, singularity analysis, wavelet bases, fast wavelet transform, nonlinear approximation, applications: compressed sensing, image compression, denoising, data science (deep convolution networks, random forests, gradient boosting)
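As a companion to the Haar-system introduction of Lesson 1 below, here is a minimal sketch (my own illustration, not course code) of one level of the orthonormal Haar transform and its perfect reconstruction.

```python
# One level of the orthonormal Haar wavelet transform and its inverse.
import numpy as np

def haar_step(x):
    x = np.asarray(x, dtype=float)
    avg = (x[0::2] + x[1::2]) / np.sqrt(2)    # low-pass: scaled pairwise averages
    det = (x[0::2] - x[1::2]) / np.sqrt(2)    # high-pass: scaled pairwise differences
    return avg, det

def haar_step_inverse(avg, det):
    x = np.empty(2 * len(avg))
    x[0::2] = (avg + det) / np.sqrt(2)
    x[1::2] = (avg - det) / np.sqrt(2)
    return x

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
avg, det = haar_step(x)
print(np.allclose(haar_step_inverse(avg, det), x))    # True: perfect reconstruction
```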
S. Mallat, A Wavelet tour of signal processing, 3rd edition (the sparse way), 2009.
I. Daubechies, Ten lectures on wavelets, 1992.
A. Cohen - Numerical analysis of wavelet methods, 2003
R. DeVore - Nonlinear Approximation, Acta Numerica 1998, 51-150.
O. Christensen - An introduction to frames and Riesz bases, 2003.
Lesson 1 - Quick introduction using the Haar system, review of some applications of wavelets.
Lesson 2 - Fourier analysis I
Lesson 3 - Fourier analysis II, Shift Invariant spaces
Lesson 4 - Shift Invariant spaces II, Sinc and Ideal low pass
Lesson 5 - Multiresolution
Lesson 6 - Wavelets, Wavelet transforms, Cascade algorithm
Lesson 7 - Cascade II, Biorthogonal wavelets
Lesson 8 - Multivariate wavelets, Applications: image compression, inpainting, compressed sensing
Lesson 9 - Wavelet decompositions of Random Forests (Oren Elisha)
Lesson 10 - Continuous wavelet transform
Lesson 11 - Frames, Riesz basis
Lesson 12 - Scattering Networks, Convolution Networks (Leon Gugel)
Lesson 13 - Review of project list
Spring 2016 - Analysis of spaces of homogeneous type
Syllabus: Spaces of homogeneous type are an important modern generalization of the Euclidean space. In the course we shall see that a rather weak assumption, on the relationship between the measure of ‘volume’ and the metric, provides the setup for a wide-ranging theory, as in the Euclidean case. Some examples of spaces of homogeneous type are: manifolds, anisotropic spaces, graphs, various matrix and group structures, etc. The course will introduce function spaces over such geometric structures and harmonic analysis tools that replace the Fourier series or transform. The syllabus includes: Schwartz functions and distributions, Fourier analysis over R^n, function space characterization, homogeneous spaces, spectral representation, Laplace and heat kernel operators on manifolds, functional calculus, spectral analysis on manifolds. We will also discuss applications in data science, for example: deep learning where the data is collected over a manifold.
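For reference, the "rather weak assumption" mentioned above is the doubling condition; in the usual formulation (as used, e.g., in [S] and [KP]):

```latex
% Space of homogeneous type (X, \rho, \mu): a quasi-metric \rho and a Borel measure
% \mu satisfying the doubling condition, for some constant C >= 1,
\[
0 < \mu\big(B(x, 2r)\big) \le C\, \mu\big(B(x, r)\big) < \infty
\qquad \text{for all } x \in X, \ r > 0,
\]
% where B(x, r) = \{ y \in X : \rho(x, y) < r \}.
```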
[S] E. Stein, Harmonic analysis: real variable methods, orthogonality and Oscillatory integrals, 1993.
[Gra] L. Grafakos, Classical and Modern Fourier Analysis, 2004.
[Gri] A. Grigor’yan, Heat kernel and analysis on manifolds, 2009.
[CKP] T. Coulhon, G. Kerkyacharian, P. Petrushev, Heat kernel generated frames in the setting of Dirichlet spaces, Journal of Fourier Analysis and Applications 18 (2012), 995-1066.
[KP] G. Kerkyacharian, P. Petrushev, Heat kernel based decomposition of spaces of distributions in the framework of Dirichlet spaces, Transactions of the American Mathematical Society 367 (2015), 121-189.
[DDP] W. Dahmen, S. Dekel and P. Petrushev, Two-level splits of anisotropic Besov spaces, Constructive Approximation 31 (2010), 149-194.
[DL] R. DeVore & G. Lorentz, Constructive Approximation, Springer-Verlag, 1993.
Lesson 1 - Banach spaces, Fourier series, heat equation on the torus, functional calculus on manifolds (intro)
Lesson 2 - Fourier series (cont), definition & examples of homogeneous spaces ([S] 8-11), ellipsoid covers I ([DDP], Section 2).
Lesson 3 - Ellipsoid covers II, spaces of homogeneous type II ([KP]), Schwartz class ([Gra] Sections 2.2, 2.3).
Lesson 4 - distributions ([Gra] Sections 2.2, 2.3), Schwartz & distributions on manifolds [KP], Fourier transform, Fourier transform of distributions [Gra],
Lesson 5 - Laplace operator on manifolds [Gri, Chapter 3], setup of localized heat kernels [KP].
Lesson 6 - Applications - deep learning on manifolds (Leon Gugel's presentation).
Lesson 7 - Davies-Gaffney estimate, Finite Speed Propagation, Kernel localization [KP].
Lesson 8 - Kernel localization II [KP]
Lesson 9 - Kernel localization III, spaces of bandlimited functions, spectral approximation (Euclidian case)
Lesson 10 - Spectral space on manifolds [KP], Nikolskii inequality, Bernstein inequality (trigonometric polynomials) [DL],
Lesson 11 - Bernstein & Jackson inequalities (spectral spaces) [CKP], Maximal function [S]
Lesson 12 - Maximal function II, Hardy spaces (def), uniformly bounded Hardy norm of atoms [S], review of list of theorems for exam.