Title: Concentration Bounds for Stochastic Approximation with Applications to Reinforcement Learning
Speaker: Gugan Thoppe (Duke University, Durham, USA)
Details: Thu, 22 Nov 2018, 3:00 PM @ A M Turing Hall
Abstract: Stochastic Approximation (SA) refers to iterative algorithms that can be used to find the optimal points or zeros of a function given only its noisy estimates. In this talk, I will review our recent advances in techniques for analysing SA methods. The talk has four major parts. In the first part, we will see a motivating application of SA to network tomography and, alongside, discuss the convergence of a novel stochastic Kaczmarz method. In the second and major part of the talk, we shall see a novel analysis approach for non-linear SA methods in the neighbourhood of an isolated solution. The main tools here include the Alekseev formula, which allows an exact comparison between the solutions of a non-linear ODE and those of its perturbation, and a novel concentration inequality for a sum of scaled martingale differences. In the third part, we will extend the latter tool to the two-timescale but linear SA setting. Here, I will also present our ongoing work on obtaining tight convergence rates in this setup. In parallel, we will see how these results apply to gradient Temporal Difference (TD) methods such as GTD(0), GTD2, and TDC, which are used in reinforcement learning. For the analyses in the second and third parts to hold, the initial step size must be chosen sufficiently small, depending on unknown problem-dependent parameters; alternatively, one must use projections. In the fourth part, we shall discuss a trick to obviate this in the context of the one-timescale, linear TD(0) method. We strongly believe that this trick is generalizable. We also provide a novel expectation bound here. We shall end with some future directions.

Speaker Bio: Dr Gugan Thoppe is a Post-Doctoral Associate in the Dept. of Statistical Science at Duke University, USA, working with Prof. Sayan Mukherjee. Earlier, he worked with Prof. Robert Adler as an ERC Senior Researcher (postdoc) at the Technion, Israel. He did his PhD and MS in Systems Science with Prof. Vivek Borkar at TIFR, India. His PhD work won the TAA-Sasken best thesis award for 2017. He is also a two-time recipient of the IBM PhD fellowship (2013–14 and 2014–15). His research interests include random topology and stochastic approximation. In random topology, he has studied topological evolutions in dynamic Erdős–Rényi graphs, the connections between persistence diagrams and minimal spanning acycles, and, recently, the behaviour of Betti numbers of Gaussian excursions. Presently, he is interested in exploring the topological behaviour of dynamic random fields, the volatility in the topology of dynamic Erdős–Rényi graphs, and online algorithms for minimal spanning acycles. In stochastic approximation, he has been focussing on building theoretical tools for deriving convergence rates for one- and two-timescale stochastic approximation algorithms in both linear and non-linear settings. He now plans to extend this study to interacting particle systems and random switching.
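To illustrate the SA setting the abstract describes (iteratively finding a zero of a function from noisy evaluations only), here is a minimal Robbins–Monro-style sketch. The target function, noise model, and step-size schedule below are illustrative assumptions of mine, not taken from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_f(x):
    # Noisy estimate of f(x) = 2 - x, whose unique zero is x* = 2.
    # Only these noisy evaluations are available to the algorithm.
    return (2.0 - x) + rng.normal(scale=0.5)

# Robbins-Monro iteration: x_{n+1} = x_n + a_n * (f(x_n) + noise_n),
# with diminishing step sizes a_n = a0 / (n + 1).
x = 0.0   # initial iterate
a0 = 1.0  # initial step size (as the abstract notes, analyses can require this to be small)
for n in range(20000):
    a_n = a0 / (n + 1)
    x = x + a_n * noisy_f(x)

print(x)  # iterate ends up close to the zero x* = 2
```

The step sizes satisfy the classical conditions (they sum to infinity while their squares sum to a finite value), which is what lets the iterates average out the noise while still reaching the zero.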