Research Group Johannes Kinder

Johannes Kinder

Prof. Dr.

Associate

A3 | Computational Models

Programming Languages and Artificial Intelligence

Johannes Kinder holds the Chair of Programming Languages and Artificial Intelligence at LMU Munich.

His research focuses on securing software through automated methods. His team builds systems to analyze software and understand its properties and purpose, and to harden software against malicious attacks.

Team members @MCML

PhD Students

Moritz Dannehl

A3 | Computational Models
→ Group Johannes Kinder

Programming Languages and Artificial Intelligence

Samuel Valenzuela

A3 | Computational Models
→ Group Johannes Kinder

Programming Languages and Artificial Intelligence

Yunru Wang

A3 | Computational Models
→ Group Johannes Kinder

Programming Languages and Artificial Intelligence

Publications @MCML

2025

[3]

T. Benoit, Y. Wang, M. Dannehl and J. Kinder.
BLens: Contrastive Captioning of Binary Functions using Ensemble Embedding.
USENIX 2025 - 34th USENIX Security Symposium. Seattle, WA, USA, Aug 13-15, 2025. To be published. Preprint available.

Abstract

Function names can greatly aid human reverse engineers, which has spurred the development of machine learning-based approaches to predicting function names in stripped binaries. Much current work in this area now uses transformers, applying a metaphor of machine translation from code to function names. Still, function naming models face challenges in generalizing to projects unrelated to the training set. In this paper, we take a completely new approach by transferring advances in automated image captioning to the domain of binary reverse engineering, such that different parts of a binary function can be associated with parts of its name. We propose BLens, which combines multiple binary function embeddings into a new ensemble representation, aligns it with the name representation latent space via a contrastive learning approach, and generates function names with a transformer architecture tailored for function names. Our experiments demonstrate that BLens significantly outperforms the state of the art. In the usual setting of splitting per binary, we achieve an F1 score of 0.79 compared to 0.70. In the cross-project setting, which emphasizes generalizability, we achieve an F1 score of 0.46 compared to 0.29. Finally, in an experimental setting reducing shared components across projects, we achieve an F1 score of 0.32 compared to 0.19.
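The contrastive alignment step mentioned in the abstract can be illustrated with a minimal sketch. It assumes an InfoNCE-style objective over paired function and name embeddings, which is one common form of contrastive learning; the loss actually used by BLens may differ. The function `info_nce`, its `temperature` parameter, and the toy vectors are hypothetical and chosen only for illustration.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def normalize(v: Vector) -> List[float]:
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def info_nce(func_embs: Sequence[Vector],
             name_embs: Sequence[Vector],
             temperature: float = 0.1) -> float:
    """InfoNCE-style loss (an assumed objective, not taken from the paper).

    func_embs[i] and name_embs[i] form a matching pair; every other name
    in the batch serves as a negative example for function i.
    """
    funcs = [normalize(v) for v in func_embs]
    names = [normalize(v) for v in name_embs]
    total = 0.0
    for i, f in enumerate(funcs):
        # Scaled cosine similarity of function i to every name in the batch.
        logits = [dot(f, n) / temperature for n in names]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching name as the correct class.
        total += log_denominator - logits[i]
    return total / len(funcs)


# Toy batch: two functions and two name embeddings (hypothetical values).
functions = [[1.0, 0.0], [0.0, 1.0]]
aligned_names = [[1.0, 0.0], [0.0, 1.0]]
shuffled_names = [[0.0, 1.0], [1.0, 0.0]]

aligned_loss = info_nce(functions, aligned_names)
shuffled_loss = info_nce(functions, shuffled_names)
```

In this toy batch the loss is close to zero when each function embedding points in the same direction as its own name embedding, and large when the pairs are swapped. Minimizing such a loss pulls matching pairs together in the shared latent space.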

MCML Authors

Yunru Wang

Moritz Dannehl

Johannes Kinder

[2]

M. Ahmadpanah, M. Gobbi, D. Hedin, J. Kinder and A. Sabelfeld.
CodeX: Contextual Flow Tracking for Browser Extensions.
CODASPY 2025 - 15th ACM Conference on Data and Application Security and Privacy. Pittsburgh, PA, USA, Jun 04-06, 2025.

Abstract

Browser extensions put millions of users at risk when misusing their elevated privileges. Despite the current practices of semi-automated code vetting, privacy-violating extensions still thrive in the official stores. We propose an approach for tracking contextual flows from browser-specific sensitive sources like cookies, browsing history, bookmarks, and search terms to suspicious network sinks through network requests. We demonstrate the effectiveness of the approach by a prototype called CodeX that leverages the power of CodeQL while breaking away from the conservativeness of bug-finding flavors of the traditional CodeQL taint analysis. Applying CodeX to the extensions published on the Chrome Web Store between March 2021 and March 2024 identified 1,588 extensions with risky flows. Manual verification of 339 of those extensions resulted in flagging 212 as privacy-violating, impacting up to 3.6M users.
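The idea of tracking flows from sensitive sources to network sinks can be sketched as reachability over a data-flow graph. This is a toy illustration only: CodeX itself is built on CodeQL, and the contextual aspects of its analysis are not modeled here. The graph, the intermediate node names, and the function `find_risky_flows` are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set, Tuple


def find_risky_flows(edges: Dict[str, List[str]],
                     sources: Set[str],
                     sinks: Set[str]) -> List[Tuple[str, str]]:
    """Return (source, sink) pairs where data from the source can reach the sink."""
    flows: List[Tuple[str, str]] = []
    for src in sorted(sources):
        # Breadth-first search over everything the source's data can flow into.
        seen = {src}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for successor in edges.get(node, []):
                if successor not in seen:
                    seen.add(successor)
                    queue.append(successor)
        flows.extend((src, sink) for sink in sorted(sinks) if sink in seen)
    return flows


# Hypothetical data-flow graph of a small extension: an edge a -> b means
# that the value produced at a may flow into b.
EDGES = {
    "chrome.cookies.getAll": ["cookieList"],
    "cookieList": ["payload"],
    "payload": ["fetch"],
    "chrome.history.search": ["historyItems"],
    "historyItems": ["renderPopup"],
}
SOURCES = {"chrome.cookies.getAll", "chrome.history.search"}
SINKS = {"fetch", "XMLHttpRequest.send"}

risky = find_risky_flows(EDGES, SOURCES, SINKS)
```

Here the cookie data reaches `fetch` and is reported, while the browsing history only reaches a local rendering function and is not.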

MCML Authors

Johannes Kinder

[1]

M. Dannehl, S. Valenzuela and J. Kinder.
Which Instructions Matter the Most: A Saliency Analysis of Binary Function Embedding Models.
DLSP @SPW 2025 - 8th Deep Learning Security and Privacy Workshop co-located with the 46th IEEE Symposium on Security and Privacy (SPW 2025). San Francisco, CA, May 15, 2025.

Abstract

Current deep learning models for binary code struggle with explainability, since it is often unclear which factors are important for a given output. In this paper, we apply occlusion-based saliency analysis as an explainability method to binary code embedding models. We conduct experiments on two state-of-the-art Transformer-based models that take preprocessed assembly code as input and calculate embedding vectors for each function. We show that, during training, the models learn the importance of different instructions. From the results, we observe that call instructions and the names of external call targets are important. This observation confirms the intuition that function calls significantly impact the semantics of a function and therefore should also have a large impact on its learned embedding. This motivates the need for developing model architectures that integrate stronger analysis into preprocessing to further leverage call relationships.
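Occlusion-based saliency, as described in the abstract, can be sketched as follows: mask one instruction at a time, recompute the embedding, and score the instruction by how far the embedding moves. This is a minimal sketch under stated assumptions, not the authors' implementation. The mask token, the use of cosine distance as the score, and the hand-written `toy_embed` function standing in for a Transformer model are all hypothetical.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def occlusion_saliency(instructions: Sequence[str],
                       embed: Callable[[Sequence[str]], Vector],
                       mask_token: str = "<mask>") -> List[float]:
    """Score each instruction by the embedding change when it is masked."""
    baseline = embed(instructions)
    scores = []
    for i in range(len(instructions)):
        occluded = list(instructions)
        occluded[i] = mask_token  # hide exactly one instruction
        # Cosine distance between the original and the occluded embedding.
        scores.append(1.0 - cosine(baseline, embed(occluded)))
    return scores


def toy_embed(instructions: Sequence[str]) -> Vector:
    """Hypothetical stand-in for a learned model: weights calls heavily."""
    calls = sum(1 for ins in instructions if ins.startswith("call"))
    movs = sum(1 for ins in instructions if ins.startswith("mov"))
    return [3.0 * calls, 1.0 * movs, 1.0]  # last entry is a constant bias


function_body = ["mov eax, 1", "call memcpy", "ret"]
scores = occlusion_saliency(function_body, toy_embed)
```

With this toy model the `call` instruction receives the highest score, because masking it changes the embedding the most. The ranking mirrors the paper's observation only because `toy_embed` was written to weight calls heavily, so it demonstrates the procedure rather than the result.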

MCML Authors

Moritz Dannehl

Samuel Valenzuela

Johannes Kinder