Fundamental Math Courses
All topics in one resource
- NYU DS-GA 1002: Statistical and Mathematical Methods - Breadth course covering probability, linear algebra, statistics, and optimization. Recommended: study each topic individually using the resources below.
- Mathematics for Machine Learning by Marc Peter Deisenroth (Assistant Professor at Imperial College London)
Individual Classes/Books per resource
Probability
- Prerequisites: None
- Online Courses
    - Harvard - Statistics 110: Probability
    - edX MIT Course - Introduction to Probability - The Science of Uncertainty - RECOMMENDED
    - MIT OCW Course: Probabilistic Systems Analysis and Applied Probability
    - EXCELLENT THEORETICAL YET PRACTICAL SERIES by Michael Betancourt
 
- Books: 
    - A First Course in Probability by Sheldon Ross
    - Monte Carlo theory, methods and examples (last 2 chapters on Variance reduction and Importance sampling are particularly useful in machine learning)
 
Linear Algebra
- Prerequisites: None
- Online Course: 
    - MIT OCW Course: Linear Algebra (lectures, hws, exams, solutions available) - RECOMMENDED
    - Introduction to Applied Linear Algebra – Vectors, Matrices, and Least Squares - Course by Stephen Boyd
 
- Books/PDFs:
    - Short: 11-page review by NYU computational neuroscientist Eero Simoncelli
    - Medium-Length, gentle introduction: 58-page Appendix on Linear Algebra from the PDP book series, by computer scientist Michael Jordan
    - Gilbert Strang: Introduction to Linear Algebra
    - Sheldon Axler: Linear Algebra Done Right
    - The Matrix Calculus You Need For Deep Learning
 
Statistics
- Prerequisites: Probability, Linear Algebra
- Recommended Book Sequence:
    - An Introduction to Statistical Learning, with Applications in R (more intuition, less math)
    - The Elements of Statistical Learning: Data Mining, Inference, and Prediction (more rigorous)
 
- Online Courses
    - Stanford Statistics 200: Introduction to Statistical Inference (more rigorous, has solutions for psets/exams) - RECOMMENDED
    - MIT OCW Course - Statistics for Applications (more rigorous, lacks solutions for psets/exams) - proof based
        - Con: doesn’t teach intuition for the complex math; requires mathematical maturity.
    - Stanford: Statistical Learning (less rigorous, everything available online) - good beginner course; more intuition, less math.
 
Convex Optimization
- Prerequisites: Probability, Linear Algebra
- Convex Optimization Class, Lectures - RECOMMENDED