Karan Singh email: Please enable Javascript to view. I'm a second year Computer Science PhD student at Princeton University, advised by Elad Hazan. My research is focussed on interactive machine learning, and seeks to advance the theoretical understanding of reinforcement learning. I'm also invested in issues concerning privacy and fairness in machine learning. I did my bachelors at IIT Kanpur, where I worked on sketch-based algorithms for machine learning, and space lower bounds in the streaming model. I've also spent time at Microsoft Research in Redmond working on program synthesis.
 News [September 2017] Paper on learning LDS accepted at NIPS 2017 (Spotlight). [May 2017] Two papers accepted at ICML 2017.
 Publications
 Online Learning of Linear Dynamical Systems with Elad Hazan, Cyril Zhang Neural Information Processing Systems (NIPS), 2017 Spotlight Available Soon. We present an efficient and practical algorithm for the online prediction of discrete-time linear dynamical systems. Despite the non-convex optimization problem, our algorithm comes with provable guarantees: it has near-optimal regret bounds compared to the best LDS in hindsight, while overparameterizing the model by a small logarithmic factor. Our analysis brings together ideas from improper learning through convex relaxations, online regret minimization, and the spectral theory of Hankel matrices. The Price of Differential Privacy for Online Learning with Naman Agarwal International Conference on Machine Learning (ICML), 2017 pdf | bibtex | arXiv | slides | poster We design differentially private algorithms for the problem of online linear optimization in the full information and bandit settings with optimal $O(\sqrt{T})$ regret bounds. In the full-information setting, our results demonstrate that $\epsilon$-differential privacy may be ensured for free. In particular, the regret bounds scale as $O\left(\sqrt{T}+\frac{1}{\epsilon}\right)$. For bandit linear optimization, and as a special case, for non-stochastic multi-armed bandits, the proposed algorithm achieves a regret of $O\left(\frac{1}{\epsilon}\sqrt{T}\right)$ while the previous best regret bound was $O\left(\frac{1}{\epsilon}T^{\frac{2}{3}}\right)$. Efficient Regret Minimization in Non-Convex Games with Elad Hazan, Cyril Zhang International Conference on Machine Learning (ICML), 2017 pdf | bibtex | arXiv | poster We consider regret minimization in repeated games with non-convex loss functions. Minimizing the standard notion of regret is computationally intractable. Thus, we define a natural notion of regret which permits efficient optimization and generalizes offline guarantees for convergence to an approximate local optimum. We give gradient-based methods that achieve optimal regret, which in turn guarantee convergence to equilibrium in this framework. Manuscripts Dynamic Task Allocation for Crowdsourcing with Irineo Cabreros, Angela Zhou Prelim at ICML Workshop on Data Efficient Machine Learning, 2016
 Teaching (Assistantships in Instruction)
 COS 402: Machine Learning and Artificial Intelligence Princeton University, Fall 2016 Instructors: Prof. Sanjeev Arora, Prof. Elad Hazan COS 324: Introduction to Machine Learning Princeton University, Fall 2017 Instructors: Prof. Elad Hazan, Prof. Yoram Singer
 A minimal adaptation of this, and hence this and this.