Lecture 11

Readings:

Gradient-based Hyperparameter Optimization through Reversible Learning, Dougal Maclaurin, David Duvenaud, Ryan P. Adams, arXiv:1502.03492, 2015 - shows that optimal hyperparameter settings for DNN training are time-varying, so hyperparameter optimization is really a trajectory-optimization problem.

Notes by Biye Jiang
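To make the "gradient through training" idea concrete, here is a minimal sketch, not the paper's reversible-SGD method: it differentiates a final validation loss with respect to the learning rate by finite differences through an unrolled training run on a toy quadratic. All functions and constants (the quadratic `A`, the validation target `0.1`, the meta step size) are illustrative assumptions, not from the paper; Maclaurin et al. instead obtain exact hypergradients by running SGD backwards, which scales to thousands of hyperparameters where finite differences cannot.

```python
import numpy as np

# Toy stand-in for training: T gradient steps on a quadratic "training
# loss" 0.5 w^T A w, then evaluate a different "validation loss".
A = np.diag([1.0, 10.0])                  # curvature of the training loss
w0 = np.array([1.0, 1.0])                 # fixed initialization

def train(lr, steps=50):
    w = w0.copy()
    for _ in range(steps):
        w = w - lr * (A @ w)              # one gradient step
    return w

def val_loss(w):
    # Validation optimum (0.1) deliberately differs from the training
    # optimum (0), so the best learning rate is a genuine trade-off.
    return 0.5 * np.sum((w - 0.1) ** 2)

def hypergrad(lr, eps=1e-5):
    # d(val_loss)/d(lr) by central finite differences through the
    # entire unrolled training trajectory.
    return (val_loss(train(lr + eps)) - val_loss(train(lr - eps))) / (2 * eps)

# Gradient descent on the learning rate itself.
lr = 0.01
for _ in range(200):
    lr -= 1e-3 * hypergrad(lr)
print("tuned learning rate:", lr)
```

The same loop could tune a per-step learning-rate schedule, which is where the "hyperparameters are a trajectory" view comes from.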


Practical Bayesian Optimization of Machine Learning Algorithms, Jasper Snoek, Hugo Larochelle, Ryan P. Adams, arXiv:1206.2944, 2012 - the classic article on Bayesian optimization, which is the current state of the art for static hyperparameter optimization.

Notes by Arturo Fernandez
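The core loop of Bayesian optimization can be sketched in a few lines: fit a Gaussian-process surrogate to the hyperparameter evaluations so far, then evaluate next wherever an acquisition function looks most promising. This is a simplified sketch, not Snoek et al.'s setup: it uses a fixed-length-scale RBF kernel and a lower-confidence-bound acquisition on a 1-D grid (the paper uses expected improvement with marginalized kernel hyperparameters), and `objective` is a made-up stand-in for a real validation loss.

```python
import numpy as np

def rbf(a, b, length=0.25):
    # Squared-exponential kernel matrix between 1-D point arrays.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gp_posterior(xs, ys, xq, noise=1e-6):
    # GP regression: posterior mean and variance at query points xq.
    K = rbf(xs, xs) + noise * np.eye(len(xs))
    Ks = rbf(xs, xq)
    sol = np.linalg.solve(K, Ks)                       # K^{-1} Ks
    mu = sol.T @ ys
    # k(x, x) = 1 for this kernel, hence the leading 1.0.
    var = np.clip(1.0 - np.sum(Ks * sol, axis=0), 1e-12, None)
    return mu, var

def objective(x):
    # Toy "validation loss" as a function of one hyperparameter.
    return np.sin(3 * x) + x ** 2 - 0.7 * x

rng = np.random.default_rng(0)
xs = rng.uniform(-1, 2, size=3)                        # initial random evals
ys = objective(xs)
grid = np.linspace(-1, 2, 200)                         # candidate values

for _ in range(20):
    mu, var = gp_posterior(xs, ys, grid)
    lcb = mu - 2.0 * np.sqrt(var)                      # optimism bonus
    x_next = grid[np.argmin(lcb)]                      # most promising point
    xs = np.append(xs, x_next)
    ys = np.append(ys, objective(x_next))

best = xs[np.argmin(ys)]
print("best hyperparameter found:", best)
```

The acquisition function is what distinguishes this from random search: it spends expensive evaluations where the surrogate is either promising or still uncertain.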


Learning Where to Sample in Structured Prediction, Tianlin Shi, Jacob Steinhardt, Percy Liang, JMLR, 2015 - an interesting application of reinforcement learning to tuning a sampling model.

Notes by Yuansi Chen
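The "learn where to sample" idea reduces, in its simplest form, to a bandit problem: some sites in a structured output repay resampling more than others, and the learner discovers which from noisy rewards. This is a heavily simplified illustration, not the paper's method (which learns a contextual policy over Gibbs updates); the site payoffs and reward noise below are invented for the example.

```python
import numpy as np

# Hypothetical setup: a structured output has K sites, and resampling
# site a yields an (unknown) expected error reduction payoff[a].
rng = np.random.default_rng(1)
K = 10
payoff = rng.uniform(0, 1, size=K)      # true per-site value, hidden from learner

Q = np.zeros(K)                         # running estimate of each site's value
counts = np.zeros(K)
eps = 0.1                               # exploration probability

for t in range(5000):
    if rng.random() < eps:
        a = rng.integers(K)             # explore: resample a random site
    else:
        a = int(np.argmax(Q))           # exploit: resample the best-looking site
    r = payoff[a] + 0.1 * rng.normal()  # noisy reward for this resample
    counts[a] += 1
    Q[a] += (r - Q[a]) / counts[a]      # incremental mean update

print("learned best site:", int(np.argmax(Q)))
```

The epsilon-greedy rule is the simplest possible policy; the paper's contribution is handling the sequential, state-dependent version of this decision.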


See Also:

The Effects of Hyperparameters on SGD Training of Neural Networks, Thomas M. Breuel, arXiv:1508.02788v1, 2015