Aryaman Reddi ආර්යමන් රෙඩ්ඩි 🚀

PhD Student

About Me

I am a PhD student in reinforcement learning in the LiteRL group at the Technical University of Darmstadt, working in partnership with the Intelligent Autonomous Systems lab and hessian.AI, supervised by Professor Carlo D’Eramo 🎓

I am interested in developing sample-efficient techniques in deep multi-agent reinforcement learning using insights from applied mathematics and game theory 🕹️

I am a continuum hypothesis skeptic, a mereological universalist, and a Collatz conjecture supporter.

CV
Interests
  • Reinforcement Learning
  • Game Theory
  • Ethics and Philosophy
Education
  • PhD in Computer Science

    Technical University of Darmstadt, Germany

  • MEng & BA in Information and Computer Engineering

    University of Cambridge, United Kingdom

Experience

  1. Machine Learning Research Engineer

    Arm, Cambridge, United Kingdom
    • Developed an open-source tool (ML Inference Advisor) to optimise neural networks for inference on Arm GPUs using Python (PyTorch, TensorFlow, NumPy, Jupyter, Pandas), C++, Kubernetes, & Docker.
    • Achieved a 20% boost in GPU inference throughput by analysing TensorFlow operator efficiency using deep learning clustering and pruning techniques.
    • Enhanced IoT device performance by 12% by benchmarking over 30 system-on-chip (SoC) devices using Python, Jenkins CI, SQL, & Kubernetes.

Education

  1. PhD in Computer Science

    Technical University of Darmstadt, Germany

    My focus is on developing sample-efficient algorithms for exploration, coordination, and communication in multi-agent reinforcement learning using insights from applied mathematics and game theory.

    I believe bridging the gap between practical deep learning and theoretical models of stochastic optimisation is essential for scaling RL to real-world settings.

    I build algorithms that perform strongly in high-dimensional environments while providing mathematical insight grounded in probability theory, linear algebra, calculus, & functional analysis.

  2. MEng & BA in Information and Computer Engineering

    University of Cambridge, United Kingdom
    • Grade: Distinction (GPA 4.0 Equivalent)
    • Received the David Thompson prize for academic achievement
    Read Thesis
Publications and Collaborations
Can “consciousness” be observed from large language model (LLM) internal states? Dissecting LLM representations obtained from Theory of Mind test with Integrated Information Theory and Span Representation analysis

Robust Adversarial Reinforcement Learning via Bounded Rationality Curricula

Robustness against adversarial attacks and distribution shifts is a long-standing goal of Reinforcement Learning (RL). To this end, Robust Adversarial Reinforcement Learning (RARL) trains a protagonist against destabilizing forces exercised by an adversary in a competitive zero-sum Markov game, whose optimal solution, i.e., rational strategy, corresponds to a Nash equilibrium. However, finding Nash equilibria requires facing complex saddle point optimization problems, which can be prohibitive to solve, especially for high-dimensional control. In this paper, we propose a novel approach for adversarial RL based on entropy regularization to ease the complexity of the saddle point optimization problem. We show that the solution of this entropy-regularized problem corresponds to a Quantal Response Equilibrium (QRE), a generalization of Nash equilibria that accounts for bounded rationality, i.e., agents sometimes play random actions instead of optimal ones. Crucially, the connection between the entropy-regularized objective and QRE enables free modulation of the rationality of the agents by simply tuning the temperature coefficient. We leverage this insight to propose our novel algorithm, Quantal Adversarial RL (QARL), which gradually increases the rationality of the adversary in a curriculum fashion until it is fully rational, easing the complexity of the optimization problem while retaining robustness. We provide extensive evidence of QARL outperforming RARL and recent baselines across several MuJoCo locomotion and navigation problems in overall performance and robustness.
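
A hedged sketch of the construction described above, in LaTeX; the notation (protagonist policy \pi, adversary policy \nu, adversary action b_t, reward r, discount \gamma, temperature \alpha) is assumed for illustration rather than taken from the paper. Standard RARL solves the zero-sum saddle point

\[
\max_{\pi} \min_{\nu} \; \mathbb{E}_{\pi,\nu}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t, b_t) \right],
\]

whose solution is a Nash equilibrium. Entropy-regularising the adversary with temperature \alpha,

\[
\max_{\pi} \min_{\nu} \; \mathbb{E}_{\pi,\nu}\!\left[ \sum_{t=0}^{\infty} \gamma^{t} \Big( r(s_t, a_t, b_t) - \alpha\, \mathcal{H}\big(\nu(\cdot \mid s_t)\big) \Big) \right],
\]

softens the saddle point: the minimiser now trades off destabilising the protagonist against keeping its policy random, and the solution is a Quantal Response Equilibrium. Annealing \alpha toward 0 over a curriculum restores a fully rational adversary, which is the mechanism QARL exploits.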

Contact

✉️ aryaman{}reddi{}tu-darmstadt.de

📍 E327, S2|02 Robert-Piloty-Gebäude, Technical University of Darmstadt, 64289 Darmstadt