About me
I am a research associate and PhD candidate in the XAI group at Fraunhofer HHI, working under the supervision of Sebastian Lapuschkin and Wojciech Samek. I am excited about understanding what happens inside LLMs, with the goal of making them more trustworthy, secure, and safe. More broadly, my research interests revolve around mitigating the risks associated with increasingly powerful ML systems.
Prior to joining Fraunhofer, I completed a research master's at the University of Tübingen. I had the pleasure of working in the STAI group, where I investigated knowledge conflicts in retrieval-augmented LLMs. Before that, I worked as a student research assistant in the Decision Making group, where I explored the Multi-Armed Bandit problem under distribution shifts and its application to recommender systems.
Before coming to Tübingen, I received my Bachelor's degree in Computer Science from Lomonosov Moscow State University in 2020. During and after my studies, I worked for about 2.5 years as a software engineer at Yandex.
I am passionate about using technical expertise to help solve the world’s most pressing problems.
Latest Publications
A Behavioural and Representational Evaluation of Goal-Directedness in Language Model Agents
Paper · Code
Raghu Arghal, Fade Chen, Niall Dalton, Evgenii Kortukov, Calum McNamara, Angelos Nalmpantis, Moksh Nirvaan, Gabriele Sarti, Mario Giulianelli
Strategic Dishonesty Can Undermine AI Safety Evaluations of Frontier LLMs
Paper · Code
Alexander Panfilov, Evgenii Kortukov, Kristina Nikolić, Matthias Bethge, Sebastian Lapuschkin, Wojciech Samek, Ameya Prabhu, Maksym Andriushchenko, Jonas Geiping
ASIDE: Architectural Separation of Instructions and Data in Language Models
Paper · Code
Egor Zverev, Evgenii Kortukov, Alexander Panfilov, Soroush Tabesh, Sebastian Lapuschkin, Wojciech Samek, Christoph H. Lampert
