I am a Software Engineer at Google, working on reliability, analytics, and performance evaluation for large-scale AI systems. I work across different layers of the stack to detect, localize, and mitigate fail-stop, fail-wrong, and fail-silent issues within Google’s infrastructure, spanning CPUs, TPUs, GPUs, and NICs.

My research interests lie at the intersection of AI and systems, specifically applying AI methods to improve system reliability and performance. I completed my Ph.D. in Computer Science at the University of Illinois at Urbana-Champaign, advised by Prof. Ravishankar K. Iyer. My dissertation research focused on establishing a reinforcement learning framework for the control, management, and optimization of large-scale heterogeneous computer systems.

News [More Entries]

Selected Publications [Full List: Publications, Projects]

2025

2021

2020

2019

2018

  • Last updated 06/16/2026