Lecture 11
Readings:
Gradient-based Hyperparameter Optimization through Reversible Learning
Dougal Maclaurin, David Duvenaud, Ryan P. Adams, arXiv:1502.03492, 2015 - shows that the optimal hyperparameters (e.g., learning-rate schedules) vary over the course of training, so hyperparameter optimization is really trajectory optimization.
Notes by Biye Jiang
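Maclaurin et al. obtain exact gradients of validation loss with respect to hyperparameters by running SGD backwards. A minimal sketch of the underlying idea of a "hypergradient" (not their reversible-memory algorithm): on a 1-D quadratic training loss the SGD trajectory has a closed form, so the derivative of the final loss with respect to the learning rate can be written down by the chain rule and checked against a finite difference. All function names and constants here are illustrative choices.

```python
# Toy hypergradient: d(final loss)/d(learning rate) for T steps of gradient
# descent on L(w) = 0.5 * a * w^2, where each step gives w <- (1 - lr*a) * w,
# so w_T = (1 - lr*a)^T * w0 in closed form.
def final_weight(lr, a=2.0, w0=1.0, T=10):
    return (1.0 - lr * a) ** T * w0

def val_loss(lr, a=2.0, w0=1.0, T=10):
    wT = final_weight(lr, a, w0, T)
    return 0.5 * a * wT * wT

def hypergradient(lr, a=2.0, w0=1.0, T=10):
    # Chain rule through the training trajectory:
    #   dL/dlr = a * w_T * dw_T/dlr,
    #   dw_T/dlr = T * (1 - lr*a)^(T-1) * (-a) * w0
    wT = final_weight(lr, a, w0, T)
    dwT = T * (1.0 - lr * a) ** (T - 1) * (-a) * w0
    return a * wT * dwT

# Sanity check against a central finite difference.
lr, eps = 0.1, 1e-6
fd = (val_loss(lr + eps) - val_loss(lr - eps)) / (2 * eps)
assert abs(hypergradient(lr) - fd) < 1e-6
```

The paper's contribution is making this kind of differentiation tractable for real networks, where the trajectory has no closed form and must be reversed step by step.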
Practical Bayesian Optimization of Machine Learning Algorithms
Jasper Snoek, Hugo Larochelle, Ryan P. Adams, arXiv:1206.2944, 2012 - classic article on Bayesian optimization, which is the current state of the art for static hyperparameter optimization.
Notes by Arturo Fernandez
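Snoek et al. model the validation loss as a Gaussian process and choose the next hyperparameter setting by maximizing expected improvement. A self-contained toy sketch of that loop on a 1-D search space (the fixed RBF length scale, candidate grid, initial design, and quadratic objective are illustrative assumptions, not the paper's setup):

```python
import numpy as np
from math import erf, sqrt

def rbf(X1, X2, ls=0.25):
    # Squared-exponential kernel with fixed length scale ls.
    d = X1[:, None] - X2[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def gp_posterior(Xs, ys, Xq, jitter=1e-6):
    # GP posterior mean/variance at query points Xq, given samples (Xs, ys).
    K = rbf(Xs, Xs) + jitter * np.eye(len(Xs))
    L = np.linalg.cholesky(K)
    Kq = rbf(Xs, Xq)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, ys))
    mu = Kq.T @ alpha
    v = np.linalg.solve(L, Kq)
    var = 1.0 - np.sum(v ** 2, axis=0)  # rbf(x, x) == 1
    return mu, np.maximum(var, 1e-12)

def expected_improvement(mu, var, best):
    # EI acquisition for minimization: high where mean is low or variance high.
    sigma = np.sqrt(var)
    z = (best - mu) / sigma
    Phi = np.array([0.5 * (1.0 + erf(zi / sqrt(2.0))) for zi in z])
    phi = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return (best - mu) * Phi + sigma * phi

def bayes_opt(f, n_iters=6):
    Xs = np.array([0.0, 0.5, 1.0])      # initial design
    ys = np.array([f(x) for x in Xs])
    Xq = np.linspace(0.0, 1.0, 101)     # candidate grid
    for _ in range(n_iters):
        mu, var = gp_posterior(Xs, ys, Xq)
        x_next = Xq[np.argmax(expected_improvement(mu, var, ys.min()))]
        Xs = np.append(Xs, x_next)
        ys = np.append(ys, f(x_next))
    i = np.argmin(ys)
    return Xs[i], ys[i]

# Stand-in "hyperparameter tuning" objective with optimum at x = 0.3.
best_x, best_y = bayes_opt(lambda x: (x - 0.3) ** 2)
```

The paper goes well beyond this sketch: it marginalizes over kernel hyperparameters, handles noisy and parallel evaluations, and models evaluation cost.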
Learning Where to Sample in Structured Prediction
Tianlin Shi, Jacob Steinhardt, Percy Liang, JMLR, 2015 - an interesting application of reinforcement learning to tune a sampling model.
Notes by Yuansi Chen
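Shi et al. treat the choice of where to sample as a learned, reward-driven decision. As a loose, much-simplified illustration of that framing (an epsilon-greedy bandit over two hypothetical "sampling actions", not the authors' method), the sketch below learns from observed reward which action pays off more often:

```python
import random

def epsilon_greedy(arms, n_steps=2000, eps=0.1, seed=0):
    """Toy epsilon-greedy bandit: track a running mean reward per action,
    mostly take the best-looking action, and explore at rate eps."""
    rng = random.Random(seed)
    counts = [0] * len(arms)
    values = [0.0] * len(arms)
    for _ in range(n_steps):
        if rng.random() < eps:
            a = rng.randrange(len(arms))                         # explore
        else:
            a = max(range(len(arms)), key=lambda i: values[i])   # exploit
        r = arms[a]()
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]  # incremental mean update
    return values, counts

# Two hypothetical "sampling actions" with different payoff rates (0.2 vs 0.8);
# the bandit should learn to favor the second.
arm_rng = random.Random(1)
arms = [lambda: 1.0 if arm_rng.random() < 0.2 else 0.0,
        lambda: 1.0 if arm_rng.random() < 0.8 else 0.0]
values, counts = epsilon_greedy(arms)
```

In the paper the "actions" are positions to resample in a structured-prediction model and the policy is learned with a full reinforcement-learning treatment, but the explore/exploit trade-off is the same.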
See Also:
The Effects of Hyperparameters on SGD Training of Neural Networks
Thomas M. Breuel, arXiv:1508.02788, 2015