31.07.2025


From Vulnerable to Verified: Exact Certificates Shield Models From Label‑Flipping

MCML Research Insight - With Lukas Gosch, Stephan Günnemann and Debarghya Ghoshdastidar

Machine‑learning models can be undermined before training even starts. By silently altering a small share of training labels - marking “spam” as “not‑spam,” for instance - an attacker can cut accuracy by double‑digit percentages.

The paper “Exact Certification of (Graph) Neural Networks Against Label Poisoning” by MCML Junior Member Lukas Gosch, PIs Stephan Günnemann and Debarghya Ghoshdastidar, and collaborator Mahalakshmi Sabanayagam introduces the first exact guarantees that a neural network's test predictions remain unchanged under a prescribed number of label flips. Although demonstrated on graph neural networks (GNNs), the method applies to any sufficiently wide neural network.


How the certification works


Figure 1: Illustration of the label-flipping certificate

  • Neural‑tangent view. In the wide‑network limit, training behaves like a support‑vector machine using the network’s neural tangent kernel (NTK).
  • Single‑level reformulation. Substituting this NTK model makes it possible to convert the bilevel attacker‑versus‑learner game behind certification into a single optimization problem.
  • Mixed‑integer linear program. That problem is expressed as a mixed‑integer linear program (MILP) whose solution yields (i) sample‑wise certificates for individual test nodes and (ii) collective certificates for the entire test set. A minimal brute‑force sketch of the sample‑wise idea follows this list.
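
To make the sample‑wise idea concrete, here is a minimal sketch under strong simplifying assumptions: a two‑layer ReLU NTK (written up to constant factors, standing in for the architecture‑specific NTKs derived in the paper) feeds a kernel‑SVM surrogate, and a brute‑force enumeration over label flips checks whether one test prediction can be overturned. The paper's MILP avoids exactly this exponential enumeration; all helper names and the toy data are illustrative, not the authors' implementation.

```python
# Illustrative sample-wise certificate via brute force (NOT the paper's MILP):
# exhaustively flip up to `budget` training labels and check whether the
# prediction of the NTK-SVM surrogate for one test point ever changes.
from itertools import combinations

import numpy as np
from sklearn.svm import SVC


def relu_ntk(X, Z):
    """Two-layer ReLU NTK for unit-norm rows of X and Z (up to constant factors)."""
    U = np.clip(X @ Z.T, -1.0, 1.0)                           # cosine similarities
    theta = np.arccos(U)
    k0 = (np.pi - theta) / np.pi                              # arc-cosine kernel, degree 0
    k1 = (U * (np.pi - theta) + np.sqrt(1.0 - U**2)) / np.pi  # arc-cosine kernel, degree 1
    return U * k0 + k1


def svm_predict(K_train, K_test, y_train):
    """Train the kernel-SVM surrogate on a precomputed kernel and predict."""
    svm = SVC(kernel="precomputed", C=1.0).fit(K_train, y_train)
    return svm.predict(K_test)


def samplewise_certificate(K_train, k_test, y_train, budget):
    """True iff no flip of at most `budget` training labels changes the
    prediction for the single test point (exhaustive, exponential in budget)."""
    clean_pred = svm_predict(K_train, k_test, y_train)[0]
    n = len(y_train)
    for b in range(1, budget + 1):
        for idx in combinations(range(n), b):
            y_flipped = y_train.copy()
            y_flipped[list(idx)] *= -1                        # flip binary labels in {-1, +1}
            if svm_predict(K_train, k_test, y_flipped)[0] != clean_pred:
                return False                                  # a successful poisoning exists
    return True                                               # certified robust up to `budget`


# Toy usage: random unit-norm features, binary labels, one held-out test point.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
X /= np.linalg.norm(X, axis=1, keepdims=True)
y = np.where(X[:, 0] > np.median(X[:, 0]), 1, -1)
K_train = relu_ntk(X[1:], X[1:])
k_test = relu_ntk(X[:1], X[1:])
print("certified up to 2 flips:", samplewise_certificate(K_train, k_test, y[1:], budget=2))
```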

What experiments show


Figure 2: Certified ratios (the share of test‑set predictions that the certificate proves cannot be overturned even if an attacker flips up to a given fraction of the training labels) of selected architectures, computed with the sample‑wise and collective certificates on the Cora-MLb dataset.

  • No universal best architecture. The most robust GNN depends on the data set.
  • Design choices matter. Linear activations improve robustness, while deeper architectures often weaken it.
  • A robustness plateau. Collective certificates reveal a flattening of vulnerability at medium attack budgets - an effect not noted before (see Figure 2).
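
The certified ratio plotted in Figure 2 is then just the fraction of test points for which a certificate holds at a given attack budget. Continuing the brute‑force sketch above (again illustrative only, with hypothetical helper names rather than the paper's MILP pipeline), it could be computed as:

```python
def certified_ratio(K_train, K_test, y_train, budget):
    """Fraction of test points whose prediction provably survives any flip
    of at most `budget` training labels (sample-wise, brute-force check)."""
    certified = [
        samplewise_certificate(K_train, K_test[i : i + 1], y_train, budget)
        for i in range(K_test.shape[0])
    ]
    return float(np.mean(certified))


# Example: certified ratio of the toy model above over 5 extra test points.
X_test = rng.normal(size=(5, 5))
X_test /= np.linalg.norm(X_test, axis=1, keepdims=True)
K_test = relu_ntk(X_test, X[1:])
print("certified ratio at budget 2:", certified_ratio(K_train, K_test, y[1:], budget=2))
```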

«Machine learning models are highly vulnerable to label flipping, i.e., the adversarial modification (poisoning) of training labels to compromise performance.»


Lukas Gosch et al.

MCML Junior Members

Practical implications

Because the approach relies only on the NTK, it extends to standard (non‑graph) wide neural networks, giving practitioners the first provable defence against label poisoning in deep learning.
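
As an illustration of that portability, the hedged sketch below swaps in the NTK of a standard wide MLP computed with the open-source neural-tangents library and reuses the samplewise_certificate helper from the earlier sketch; the two-hidden-layer architecture and toy data are assumptions for demonstration, not the authors' setup.

```python
import numpy as np
from neural_tangents import stax

# Infinite-width two-hidden-layer ReLU MLP; kernel_fn evaluates its exact NTK.
_, _, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

rng = np.random.default_rng(1)
X_train = rng.normal(size=(20, 5))
y_train = np.where(X_train[:, 0] > np.median(X_train[:, 0]), 1, -1)
x_test = rng.normal(size=(1, 5))

K_train = np.array(kernel_fn(X_train, X_train, "ntk"))
k_test = np.array(kernel_fn(x_test, X_train, "ntk"))

# Same exhaustive sample-wise check as before, now for a non-graph wide MLP.
print(samplewise_certificate(K_train, k_test, y_train, budget=2))
```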


«There is no silver bullet: robustness hierarchies of GNNs are strongly data dependent.»


Lukas Gosch et al.

MCML Junior Members

Key takeaway

Exact certification shifts robustness from a best‑effort practice to a provable property. For anyone concerned about poisoned training data, this work provides a clear path toward verifiably trustworthy machine‑learning models.


Interested in Exploring Further?

The work was published as a spotlight presentation at the A* conference ICLR 2025. You can explore the full paper, including proofs, algorithmic details, and additional experiments, and find the open-source code on GitHub.

M. Sabanayagam, L. Gosch, S. Günnemann and D. Ghoshdastidar.
Exact Certification of (Graph) Neural Networks Against Label Poisoning.
VerifAI @ICLR 2025 - Workshop AI Verification in the Wild at the 13th International Conference on Learning Representations (ICLR 2025). Singapore, Apr 24-28, 2025. Spotlight Presentation.
Abstract

Machine learning models are highly vulnerable to label flipping, i.e., the adversarial modification (poisoning) of training labels to compromise performance. Thus, deriving robustness certificates is important to guarantee that test predictions remain unaffected and to understand worst-case robustness behavior. However, for Graph Neural Networks (GNNs), the problem of certifying label flipping has so far been unsolved. We change this by introducing an exact certification method, deriving both sample-wise and collective certificates. Our method leverages the Neural Tangent Kernel (NTK) to capture the training dynamics of wide networks enabling us to reformulate the bilevel optimization problem representing label flipping into a Mixed-Integer Linear Program (MILP). We apply our method to certify a broad range of GNN architectures in node classification tasks. Thereby, concerning the worst-case robustness to label flipping: (i) we establish hierarchies of GNNs on different benchmark graphs; (ii) quantify the effect of architectural choices such as activations, depth and skip-connections; and surprisingly, (iii) uncover a novel phenomenon of the robustness plateauing for intermediate perturbation budgets across all investigated datasets and architectures. While we focus on GNNs, our certificates are applicable to sufficiently wide NNs in general through their NTK. Thus, our work presents the first exact certificate to a poisoning attack ever derived for neural networks, which could be of independent interest.

MCML Authors

Lukas Gosch

Data Analytics & Machine Learning


Stephan Günnemann

Prof. Dr.

Data Analytics & Machine Learning


Debarghya Ghoshdastidar

Prof. Dr.

Theoretical Foundations of Artificial Intelligence


