Course Syllabus

CS 281B / Stat 241B Spring 2016

Statistical Learning Theory

Lectures: 306 Soda. Tuesday/Thursday 9:30-11:00am.

Instructor: Peter Bartlett.
Office Hours: Wed 9:00-10:00, 399 Evans; Thu 11:00-12:00; 723 SDH.

Alan Malek. Office Hours: Wed 5:00-6:00 and Thu 4:00-5:00; 283H Soda.

This course will provide an introduction to the theoretical analysis of prediction methods, focusing on statistical and computational aspects. It will cover approaches such as neural networks, kernel methods and boosting algorithms, and probabilistic and game theoretic formulations of prediction problems, and it will focus on tools for the theoretical analysis of the performance of learning algorithms and the inherent difficulty of learning problems.

  • Probabilistic formulations of prediction problems
    • Plug-in estimators, empirical risk minimization
    • Linear threshold functions, perceptron algorithm
    • Risk bounds
      • Concentration inequalities
      • Uniform convergence
      • Rademacher averages; combinatorial dimensions
      • Convex surrogate losses for classification
  • Game-theoretic formulations of prediction problems
    • Minimax strategies for log loss, linear loss, and quadratic loss
    • Universal portfolios
    • Online convex optimization
  • Neural networks
    • Stochastic gradient methods
    • Combinatorial dimensions and Rademacher averages
    • Hardness results for learning
    • Efficient learning algorithms
  • Kernel methods
    • Reproducing kernel Hilbert spaces, Mercer's theorem
    • Convex optimization for kernel methods
    • Representer theorem
  • Ensemble methods
    • AdaBoost
    • AdaBoost as I-projection
    • Convergence and consistency of AdaBoost

Prerequisites: Probability, statistics, analysis of algorithms (CS281A/Stat241A, Stat 205A, or Stat 210A, plus mathematical maturity). This is a theory class: although many tools will be reviewed in lectures, a strong mathematical background is necessary.

Materials: Lecture notes will be posted on bCourses, and the Readings page will give links to papers and textbooks.

The grade will be based 40% on homework and 60% on the final project.

  • There will be approximately five homework assignments, approximately one every two weeks. Late homeworks will not be accepted. The homework grade will be that of the best n-1 of n homeworks. You are welcome to discuss homeworks with other students, but please work out and write up the solutions completely on your own, and specify in your solutions who you've discussed which problems with. Some of the problems have appeared in the literature. Please attempt them yourself, rather than searching for someone else's solution. If you happen to have seen a problem before, please write up a solution on your own (and indicate that you've seen it before - it would also be helpful to point out where).
  • There will be a final project. This can be in any area related to the topics of the course. You might extend a theoretical result, develop a new method and investigate its performance, or run experiments on an existing method for a particular application, or do a combination of these. You will need to submit a brief written report and give a presentation in class during the week of April 25. It is OK to work on projects in groups of two (please email me an explanation if there's a good reason to work in a larger group).
    • Project proposal: due on March 11 (one or two plain text paragraphs).
    • Presentations in class: during the week of April 25.
    • Project reports: due on Friday, April 29.


Academic Dishonesty: Copying, in whole or in part, from other students in the class or from any other source without acknowledgment constitutes cheating. Any student found to be cheating risks automatically failing the class and being referred to the Office of Student Conduct. Please see the policy on academic dishonesty

Communication: Please use Piazza: post a public question or note. For questions related to you only, please create a private post on Piazza.

Course Summary:

Date Details Due