Karan Singh

I'm a final-year PhD candidate in Computer Science at Princeton University, where I'm very fortunate to be advised by Elad Hazan. My research focuses on provably efficient, robust algorithms for control and reinforcement learning, viewed through the lens of dynamical systems and aided by the algorithmic toolkit of optimization and (non-stochastic) online learning. My recent efforts address the challenges that accompany continuous state spaces, partial observability, and model misspecification. I am currently a long-term research intern at Google AI.

I did my bachelor's at IIT Kanpur, where I worked on sketch-based algorithms for machine learning and on space lower bounds in the streaming model. I've also spent time at Microsoft Research in Redmond, working on program synthesis.
News

[December 2019] Best Paper Award at the Optimization Foundations for RL workshop at NeurIPS 2019.
[September 2019] Paper on logarithmic regret for online control accepted at NeurIPS 2019 (Oral Presentation).
[May 2019] Three papers accepted at ICML 2019.
 Publications
Logarithmic Regret for Online Control
with Naman Agarwal, Elad Hazan
Neural Information Processing Systems (NeurIPS), 2019. Oral Presentation.
pdf | bibtex | arXiv
We study optimal regret bounds for control in linear dynamical systems under adversarially changing strongly convex cost functions, given knowledge of the transition dynamics. This setting includes several well-studied and fundamental frameworks, such as the Kalman filter and the linear quadratic regulator. State-of-the-art methods achieve regret that scales as $\sqrt{T}$, where $T$ is the time horizon. We show that the optimal regret in this setting can be significantly smaller, scaling as $\mathrm{polylog}(T)$.

The Nonstochastic Control Problem
with Elad Hazan, Sham Kakade
Algorithmic Learning Theory (ALT), 2020.
pdf | arXiv
We consider the problem of controlling an unknown linear dynamical system with nonstochastic adversarial perturbations and adversarial convex loss functions. The possibility of achieving sublinear regret was previously posed as an open question. We give an efficient algorithm that achieves $T^{2/3}$ regret in this setting. Crucial to our result is a simple system-identification procedure that is effective under adversarial noise.

Online Control with Adversarial Disturbances
with Naman Agarwal, Brian Bullins, Elad Hazan, Sham Kakade
International Conference on Machine Learning (ICML), 2019.
pdf | bibtex | arXiv
We study the control of linear dynamical systems with adversarial disturbances, as opposed to statistical noise. We present an efficient algorithm that achieves nearly tight regret bounds in this setting. Our result generalizes previous work in two main aspects: the algorithm can accommodate adversarial noise in the dynamics, and it can handle general convex costs. (A sketch of the disturbance-action controller class underlying this line of work appears below.)

Provably Efficient Maximum Entropy Exploration
with Elad Hazan, Sham Kakade, Abby Van Soest
International Conference on Machine Learning (ICML), 2019.
pdf | bibtex | arXiv
Suppose an agent is in a (possibly unknown) Markov Decision Process in the absence of a reward signal: what might we hope that an agent can efficiently learn to do? This work studies a broad class of objectives that are defined solely as functions of the state-visitation frequencies induced by how the agent behaves. For example, one natural, intrinsically defined objective is for the agent to learn a policy that induces a distribution over the state space that is as uniform as possible. We provide an efficient algorithm to optimize such intrinsically defined objectives, given access to a black-box planning oracle.

Efficient Full-Matrix Adaptive Regularization
with Naman Agarwal, Brian Bullins, Xinyi Chen, Elad Hazan, Cyril Zhang, Yi Zhang
International Conference on Machine Learning (ICML), 2019.
pdf | bibtex | arXiv
Adaptive regularization methods come in diagonal and full-matrix variants; however, only the former have enjoyed widespread adoption in training large-scale deep models. In this paper, we show how to make full-matrix adaptive regularization practical and useful. At the heart of our algorithm, GGT, is an efficient method for computing the inverse square root of a low-rank matrix. We show that GGT converges to first-order local minima, providing the first rigorous theoretical analysis of adaptive regularization in non-convex optimization.
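To illustrate the linear-algebra step at the heart of GGT, here is a minimal numpy sketch, assuming the $r$ most recent gradients are stacked as the columns of a tall matrix $G \in \mathbb{R}^{d \times r}$ with $r \ll d$. The function name and the toy usage are illustrative, and this is only the preconditioning step, not the full optimizer.

```python
import numpy as np

def precondition(G, g, eps=1e-4):
    """Apply (G G^T + eps I)^{-1/2} to a gradient g.

    G: (d, r) matrix whose columns are the r most recent gradients (r << d).
    Working with the thin SVD of G costs O(d r^2), versus the O(d^3) needed
    to form and root the full d x d matrix G G^T directly.
    """
    U, s, _ = np.linalg.svd(G, full_matrices=False)   # U: (d, r), s: (r,)
    coeffs = U.T @ g                                  # components of g in span(U)
    # On span(U), G G^T + eps I has eigenvalues s^2 + eps; on the orthogonal
    # complement it acts as eps * I. Treat the two pieces of g separately.
    in_span = U @ (coeffs / np.sqrt(s**2 + eps))
    out_of_span = (g - U @ coeffs) / np.sqrt(eps)
    return in_span + out_of_span

# Toy usage: precondition a fresh gradient using a window of 20 past gradients.
rng = np.random.default_rng(0)
d, r = 1000, 20
G = rng.normal(size=(d, r))
g = rng.normal(size=d)
step = precondition(G, g)
```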
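For the control papers above, a useful reference point is the disturbance-action controller class: the policy plays a stabilizing linear controller plus a learned linear function of recent disturbances, which are recovered exactly when the dynamics are known. The sketch below is a minimal illustration of that parameterization under known dynamics $(A, B)$ and a given stabilizing $K$; the class name is illustrative, and the papers' online gradient updates on the $M_i$ matrices are omitted.

```python
import numpy as np

class DisturbanceActionController:
    """Plays u_t = -K x_t + sum_i M_i w_{t-i}.

    Assumes known dynamics x_{t+1} = A x_t + B u_t + w_t, so each disturbance
    w_t can be recovered exactly once x_{t+1} is observed. The matrices
    M_1, ..., M_m are the learnable parameters; updating them with online
    (convex) gradient methods is the part omitted from this sketch.
    """

    def __init__(self, A, B, K, m=10):
        self.A, self.B, self.K = A, B, K
        d, k = A.shape[0], B.shape[1]
        self.M = [np.zeros((k, d)) for _ in range(m)]
        self.w_hist = [np.zeros(d) for _ in range(m)]  # most recent first

    def act(self, x):
        # Stabilizing linear term plus learned response to past disturbances.
        u = -self.K @ x
        for M_i, w_i in zip(self.M, self.w_hist):
            u = u + M_i @ w_i
        return u

    def observe(self, x, u, x_next):
        # Recover the (possibly adversarial) disturbance from the transition.
        w = x_next - self.A @ x - self.B @ u
        self.w_hist = [w] + self.w_hist[:-1]
```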
Spectral Filtering for General Linear Dynamical Systems
with Elad Hazan, Holden Lee, Cyril Zhang, Yi Zhang
Neural Information Processing Systems (NeurIPS), 2018. Oral Presentation.
pdf | bibtex | arXiv
We give a polynomial-time algorithm for learning latent-state linear dynamical systems without system identification, and without assumptions on the spectral radius of the system's transition matrix. The algorithm extends the recently introduced technique of spectral filtering, previously applied only to systems with a symmetric transition matrix, using a novel convex relaxation to allow for the efficient identification of phases.

Towards Provable Control for Unknown Linear Dynamical Systems
with Sanjeev Arora, Elad Hazan, Holden Lee, Cyril Zhang, Yi Zhang
International Conference on Learning Representations (ICLR), 2018. Workshop Track.
pdf | bibtex
We study the control of symmetric linear dynamical systems with unknown dynamics and a hidden state. Using a recent spectral filtering technique, we formulate optimal control in this setting as a convex program. This approach eliminates the need to solve the nonconvex problem of explicitly identifying the system and its latent state, and allows for provable optimality guarantees for the control signal.

Learning Linear Dynamical Systems via Spectral Filtering
with Elad Hazan, Cyril Zhang
Neural Information Processing Systems (NIPS), 2017. Spotlight.
pdf | bibtex | arXiv
We present an efficient and practical algorithm for the online prediction of discrete-time linear dynamical systems. Although the underlying optimization problem is non-convex, our algorithm comes with provable guarantees: it has near-optimal regret bounds compared to the best LDS in hindsight, while overparameterizing the model by only a small logarithmic factor. Our analysis brings together ideas from improper learning through convex relaxations, online regret minimization, and the spectral theory of Hankel matrices. (A sketch of the filter construction appears after this list.)

The Price of Differential Privacy for Online Learning
with Naman Agarwal
International Conference on Machine Learning (ICML), 2017.
pdf | bibtex | arXiv
We design differentially private algorithms for online linear optimization, in both the full-information and bandit settings, with optimal $O(\sqrt{T})$ regret bounds. In the full-information setting, our results demonstrate that $\epsilon$-differential privacy may be ensured for free: the regret bounds scale as $O\left(\sqrt{T}+\epsilon^{-1}\right)$. In the bandit setting, this is the first $O\left(\epsilon^{-1}\sqrt{T}\right)$-regret algorithm.

Efficient Regret Minimization in Non-Convex Games
with Elad Hazan, Cyril Zhang
International Conference on Machine Learning (ICML), 2017.
pdf | bibtex | arXiv
We consider regret minimization in repeated games with non-convex loss functions, where minimizing the standard notion of regret is computationally intractable. We therefore define a natural notion of regret that permits efficient optimization and generalizes offline guarantees for convergence to an approximate local optimum. We give gradient-based methods that achieve optimal regret in this sense, which in turn guarantee convergence to equilibrium in this framework.

Dynamic Task Allocation for Crowdsourcing
with Irineo Cabreros, Angela Zhou
Preliminary version at the ICML Workshop on Data Efficient Machine Learning, 2016.
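The filters in the spectral-filtering papers above are fixed in advance of seeing any data: they are the top eigenvectors of a specific Hankel matrix, whose entries follow the definition in the 2017 paper. Below is a minimal numpy sketch of that construction; the function name and the toy parameters are illustrative.

```python
import numpy as np

def spectral_filters(T, k):
    """Top-k eigenvectors of the T x T Hankel matrix Z_T with entries
    Z[i, j] = 2 / ((i + j)^3 - (i + j)) for 1-indexed i, j.

    The eigenvalues of Z_T decay exponentially, which is why a small
    (logarithmic in T) number of filters suffices for prediction.
    """
    idx = np.arange(1, T + 1)
    s = idx[:, None] + idx[None, :]        # s = i + j, ranging over 2..2T
    Z = 2.0 / (s**3 - s)
    eigvals, eigvecs = np.linalg.eigh(Z)   # ascending eigenvalue order
    return eigvals[-k:][::-1], eigvecs[:, -k:][:, ::-1]

# The features for predicting y_t are inner products of each filter with the
# input history (x_{t-1}, ..., x_{t-T}); a linear predictor over these
# features is then learned with online gradient methods.
vals, filters = spectral_filters(T=100, k=25)
print(vals[:5])  # illustrates the rapid spectral decay
```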
 Teaching (Assistantships in Instruction)
COS 402: Machine Learning and Artificial Intelligence
Princeton University, Fall 2016
Instructors: Prof. Sanjeev Arora, Prof. Elad Hazan

COS 324: Introduction to Machine Learning
Princeton University, Fall 2017
Instructors: Prof. Elad Hazan, Prof. Yoram Singer