Tutorial Sessions

Tutorial sessions by Nathan Srebro and Alexander (Sasha) Rakhlin will be presented on Tuesday, October 3, 2017.*

Coordinated Science Laboratory Auditorium, Room B02

Registration Fee Per Session
$40 advance registration fee (before September 18)
$50 regular registration fee (September 19-October 3)


Registration and Breakfast: 8:30-9:30 am in the lower level lobby of Coordinated Science Laboratory

Morning Tutorial 
Session: 9:30am-11:30am
Lunch break: 11:30am-12:30pm (lunch will not be provided)
Session resumes: 12:30pm-1:30pm

Afternoon Tutorial 
Session: 2:00pm-3:30pm
Afternoon break: 3:30pm-4:00pm (snacks and water will be provided)
Session resumes: 4:00pm-5:30pm

The Conference Welcome Reception will follow from 6:00pm to 8:00pm in Room 3002 of the Electrical and Computer Engineering Building.

Nathan Srebro will present his tutorial from 9:30am to 1:30pm (with a lunch break from 11:30am to 12:30pm).

Title:  Learning, Optimization and the Geometry of Parameter Space

Abstract: We will explore how learning and stochastic optimization are one and the same, and examine how optimization algorithms are biased by a choice of “geometry”, which in turn can endow implicit regularization, provide an inductive bias, and aid in generalization.  We will first consider in detail the convex case, in which this connection is well understood, and study (stochastic) mirror descent as a master algorithm linking geometry, optimization and learning.  We will use stability to both motivate and analyze mirror descent, and emphasize the role of strong convexity in ensuring stability.  We will also discuss steepest descent methods and natural gradients.  We will then turn to the non-convex case, and see how implicit regularization plays a crucial, though not yet well understood, role in deep learning.
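As a concrete illustration of how a choice of geometry shapes the algorithm, here is a minimal sketch (not taken from the tutorial materials) of mirror descent on the probability simplex with the negative-entropy mirror map, where the mirror step becomes a multiplicative update; the step size and the toy linear objective are illustrative assumptions.

```python
import numpy as np

def mirror_descent_simplex(grad_fn, d, steps, lr):
    """Mirror descent on the probability simplex with the negative-entropy
    mirror map (a.k.a. exponentiated gradient). The multiplicative update
    is the mirror step induced by this choice of geometry; a (stochastic)
    gradient oracle can be plugged in via grad_fn."""
    x = np.full(d, 1.0 / d)            # start at the uniform distribution
    for _ in range(steps):
        g = grad_fn(x)                 # gradient (or stochastic gradient) at x
        x = x * np.exp(-lr * g)        # multiplicative mirror step
        x /= x.sum()                   # renormalize back onto the simplex
    return x

# Toy example: minimize <c, x> over the simplex; the optimum
# concentrates all mass on the coordinate with the smallest cost.
c = np.array([0.9, 0.1, 0.5])
x = mirror_descent_simplex(lambda x: c, d=3, steps=200, lr=0.5)
```

Swapping the mirror map (e.g. to the squared Euclidean norm) recovers projected gradient descent on the same problem, which is the sense in which the geometry, not the objective, determines the algorithm.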

Biography: Professor Nati Srebro obtained his PhD at the Massachusetts Institute of Technology (MIT) in 2004, held a post-doctoral fellowship with the Machine Learning Group at the University of Toronto, and was a Visiting Scientist at IBM Haifa Research Labs. Since January 2006, he has been on the faculty of the Toyota Technological Institute at Chicago (TTIC) and the University of Chicago, and has also served as
the first Director of Graduate Studies at TTIC. From 2013 to 2014 he was an associate professor at the Technion-Israel Institute of Technology. Prof. Srebro’s research encompasses methodological, statistical and computational aspects of Machine Learning, as well as related problems in Optimization. Some of his significant contributions include work on learning “wider” Markov networks; pioneering work on matrix factorization and collaborative prediction, including introducing the use of the nuclear norm for machine learning and matrix reconstruction; and work on fast optimization techniques for machine learning and on the relationship between learning and optimization.


Alexander (Sasha) Rakhlin will present his tutorial from 2:00pm to 5:30pm (with a 30-minute break from 3:30pm to 4:00pm).

Title: Recent Advances in Online Prediction

Abstract: In a growing number of machine learning applications, such as advertisement placement, movie recommendation, and node or link prediction in evolving networks, one must make online, real-time decisions and continuously improve performance as data arrive sequentially. In addition to their online nature, these applications are characterized by non-i.i.d. data. These characteristics place such problems outside the scope of Statistical Learning, which has long provided guidance on questions of overfitting, model selection, and regularization. Is there an analogous foundation for online non-i.i.d. (universal) prediction?

This tutorial will cover the basics of online methods and their analysis. We will start with an elegant result of T. Cover and apply it to the problem of node classification in a social network. We will develop a general theory of what is achievable in the online framework and complement it with an algorithmic prescription based on approximate dynamic programming. Fundamental connections to Probability (in particular, uniform martingale laws of large numbers), Optimization, and Information Theory will be outlined, as well as directions for further research.
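As a taste of the online framework, here is a minimal sketch (illustrative, not taken from the tutorial) of the exponential-weights forecaster for prediction with expert advice, one of the basic online methods; its regret against the best expert in hindsight grows only logarithmically in the number of experts. The step size eta and the toy loss sequence below are assumptions for the example.

```python
import numpy as np

def hedge(expert_losses, eta):
    """Exponential-weights (Hedge) forecaster. At each round, play the
    current weighted mixture of experts, incur the mixture's loss, then
    exponentially downweight experts in proportion to their losses.
    Returns the algorithm's cumulative loss over all rounds."""
    T, N = expert_losses.shape
    w = np.ones(N)                         # uniform initial weights
    total = 0.0
    for t in range(T):
        p = w / w.sum()                    # play the weighted mixture
        total += p @ expert_losses[t]      # incur the mixture loss
        w *= np.exp(-eta * expert_losses[t])  # exponential reweighting
    return total

# Toy run over 100 rounds: expert 0 always loses 0, expert 1 always loses 1.
losses = np.tile(np.array([[0.0, 1.0]]), (100, 1))
alg = hedge(losses, eta=0.5)
best = losses[:, 0].sum()   # loss of the best expert in hindsight
```

On this sequence the algorithm's cumulative loss stays within a small constant of the best expert's, even though no distributional (i.i.d.) assumption is made on the losses.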

Biography: Alexander Rakhlin is an Associate Professor in the Department of Statistics at the University of Pennsylvania. He received his bachelor’s degrees in mathematics and computer science from Cornell University and his doctoral degree from MIT, and was a postdoc at UC Berkeley EECS before joining UPenn. He is a recipient of the NSF CAREER award, the IBM Research Best Paper award, the Machine Learning Journal award, and the COLT Best Paper Award.

*Please note that lodging will not be available on Monday, October 2 at the Allerton Park and Retreat Center. If you plan to attend these tutorials, please pursue other lodging options listed in the 2017 Recommended Hotels section.

Metered parking is available. Please pay via cell phone or plan accordingly; the organizers will not provide change.