
Position: The Alignment Community Is Unintentionally Building a Censor’s Toolkit

MCML Authors

Sarah Ball

→ Group Frauke Kreuter
Social Data Science and AI

Abstract

This position paper argues that modern AI alignment methods – originally designed to prevent harmful output – are dual-use technologies that may easily be misused by malicious actors for censorship and manipulation. By mapping current alignment techniques to possible and actual cases of misuse, we show that the quest for a 'perfectly aligned' model inadvertently also provides malicious actors with an ever-improving tool for informational dominance. We need to discuss this dual-use potential now, as its risk is exacerbated by rapid user adoption of AI as an information provider, economic power asymmetries, and a political landscape that increasingly shifts towards authoritarianism. We conclude by urging the community to consider the intentional misuse of AI alignment mechanisms and propose mitigation strategies to safeguard against this dual-use potential.

inproceedings BH26

ICML 2026

43rd International Conference on Machine Learning. Seoul, South Korea, Jul 06-11, 2026. Oral Presentation. To be published. Preprint available.

Authors

S. Ball • P. Hackemann

Links

GitHub

Research Area

C4 | Computational Social Sciences

BibTeXKey: BH26
