
31.03.2026


How Can a Machine Understand the World in 3D?

Appointment With... Benjamin Busam

MCML PI Benjamin Busam researches how machines can perceive and understand their environment in three dimensions. At the intersection of computer vision, robotics, and artificial intelligence, he develops methods for 3D reconstruction from image and sensor data and creates digital twins - photorealistic, geometrically precise representations of our world built from automatically analyzed multimodal data.

Busam studied Mathematics and Physics at the Technical University of Munich (TUM), ParisTech in France, and the University of Melbourne. He earned his doctorate in computer science with a focus on 3D computer vision and led research groups in industry, as Head of Research at Framos and as 3D Computer Vision Team Lead at Huawei Research in London. He has held the Professorship of Photogrammetry and Remote Sensing at TUM since September 1, 2025.

ED: How did you become the person you are today?

I’ve always been fascinated by the question of how a machine can perceive the world. We humans have eyes; we move through the world and learn intuitively. Even as a child, I found the idea of applying this principle to machines exciting. I wanted to understand how one could objectively describe an environment.

I also developed a passion for mathematics at an early age. Mathematics provides the vocabulary to describe the world, and physics helps us understand its dynamics. That’s why I studied mathematics and earned my master’s degree with a focus on geometry. Through this, I learned the fundamentals that I use every day.

After that, I wanted to put this knowledge into practice, so I started out in industry. At Framos, a medium-sized company, I equipped machines with “3D eyes” - for example, to measure packages in logistics or to enable precise, minimally invasive surgery in medical technology.

Because I couldn’t stop wondering what machines are capable of understanding, I pursued my doctorate in industry, this time in computer science. Mathematics was my language, real-world problems were my motivation, and computer science was the practical tool - this combination continues to fascinate me to this day.

At Huawei Research London, I had the opportunity to make technology accessible to people. Many people know the results from taking photos on their smartphones: We developed a shooting mode that lets users adjust capture parameters such as focus or exposure through simulation even after the photo has been taken, thereby rewriting the rules of mobile photography - all in real time and across millions of devices.

After this creative work in the consumer sector, I was looking for a new challenge. My path took me from industry back to academia, to the Chair of Computer Aided Medical Procedures at TUM, and from there to the Professorship of Photogrammetry and Remote Sensing.


How big was the leap from 3D applications in medicine to geodesy?

It sounds more complicated than it is. The applications differ, of course, but the underlying methodology is surprisingly similar. Whether a robotic arm is performing surgery or a satellite is mapping Earth’s surface, the goal is always to accurately digitize the real world.

Today, a modern scientist rarely works in isolation. We need teams drawing on physics, computer science, engineering, mathematics, and geodesy. That is how we are building our team at the chair: interdisciplinary, open-minded, and methodologically strong.


What is your first major research project at TUM?

Less of a single project and more of a goal: We want to create a realistic digital representation of our environment. As simple as a photo, but in 3D. This digital twin should be geometrically precise, look photorealistic, and be easy to capture - ideally even with a smartphone.

To achieve this, we combine multimodal data: color images, thermal data, laser scans, drone footage, and camera images from cars or satellites. Our projects cover the entire spectrum, from robots working at close range to precisely place screws, to drone flights for medical transport, to the Bavarian mini-satellite that delivers agricultural data every three days. The goal is always to digitally map the world in such a way that machines can think and act meaningfully within it.

Another goal is democratization through technology. For decades, building these digital twins required many experts - along with specialized hardware and know-how. Now we want to make this equipment and knowledge accessible to the general public, so that anyone can build a digital twin with nothing more than a smartphone.


What changes are you hoping for in the future?

From a technical standpoint, I’d like to see even more interdisciplinarity. Modern AI excels at generalization, but many problems in photogrammetry and remote sensing are highly specialized. We need to incorporate this expert knowledge into AI models. At the same time, we need models that understand their own embodiment - that is, know what they’re capable of. Simply putting a language model on a drone isn’t enough. The system must also know what sensors the drone has and what actions are possible.

On a broader societal level, I hope that the added value and the democratic aspects of geodesy will become more visible. Robotics, AI, environmental and disaster management, autonomous systems, medicine, agriculture, and civil engineering: the field has an incredibly wide range of applications, and I would be delighted if our research helped young people discover this field.

Geodesy provides students with an extremely versatile set of tools. It allows them to grow into almost any future technical field. I hope we can make this diversity more visible. If, ten years from now, I visit a school and someone says, “Geodesy - the cool subject that’s needed everywhere!” - then I will have achieved my goal.

