is Assistant Professor at the Chair of Human-Computer Interaction and Artificial Intelligence at LMU Munich.
His research sits at the intersection between Human-Computer Interaction and Artificial Intelligence, where he focuses on the next generation of computing systems. He uses artificial intelligence to design, build, and evaluate future human-centered interfaces. In particular, he envisions enabling humans to outperform their performance in collaboration with the machine. He focuses on areas such as augmented and virtual reality, mobile scenarios, and robotics.
null
As artificial intelligence becomes increasingly pervasive, it is essential that we understand the implications of bias in machine learning. Many developers rely on crowd workers to generate and annotate datasets for machine learning applications. However, this step risks embedding training data with labeler bias, leading to biased decision-making in systems trained on these datasets. To characterize labeler bias, we created a face dataset and conducted two studies where labelers of different ethnicity and sex completed annotation tasks. In the first study, labelers annotated subjective characteristics of faces. In the second, they annotated images using bounding boxes. Our results demonstrate that labeler demographics significantly impact both subjective and accuracy-based annotations, indicating that collecting a diverse set of labelers may not be enough to solve the problem. We discuss the consequences of these findings for current machine learning practices to create fair and unbiased systems.
Images and videos are widely used to elicit emotions; however, their visual appeal differs from real-world experiences. With virtual reality becoming more realistic, immersive, and interactive, we envision virtual environments to elicit emotions effectively, rapidly, and with high ecological validity. This work presents the first interactive virtual reality dataset to elicit emotions. We created five interactive virtual environments based on corresponding validated 360° videos and validated their effectiveness with 160 participants. Our results show that our virtual environments successfully elicit targeted emotions. Compared with the existing methods using images or videos, our dataset allows virtual reality researchers and practitioners to integrate their designs effectively with emotion elicitation settings in an immersive and interactive way.
null
In a world increasingly reliant on artificial intelligence, it is more important than ever to consider the ethical implications of artificial intelligence. One key under-explored challenge is labeler bias — bias introduced by individuals who label datasets — which can create inherently biased datasets for training and subsequently lead to inaccurate or unfair decisions in healthcare, employment, education, and law enforcement. Hence, we conducted a study (N=98) to investigate and measure the existence of labeler bias using images of people from different ethnicities and sexes in a labeling task. Our results show that participants hold stereotypes that influence their decision-making process and that labeler demographics impact assigned labels. We also discuss how labeler bias influences datasets and, subsequently, the models trained on them. Overall, a high degree of transparency must be maintained throughout the entire artificial intelligence training process to identify and correct biases in the data as early as possible.
Physiological sensing enables us to use advanced adaptive functionalities through physiological data (e.g., eye tracking) to change conditions. In this work, we investigate the impact of infilling methods on LSTM models’ performance in handling missing eye tracking data, specifically during blinks and gaps in recording. We conducted experiments using recommended infilling techniques from previous work on an openly available eye tracking dataset and LSTM model structure. Our findings indicate that the infilling method significantly influences LSTM prediction accuracy. These results underscore the importance of standardized infilling approaches for enhancing the reliability and reproducibility of LSTM-based eye tracking applications on a larger scale. Future work should investigate the impact of these infilling methods in larger datasets to investigate generalizability.
Currently, interactive systems use physiological sensing to enable advanced functionalities. While eye tracking is a promising means to understand the user, eye tracking data inherently suffers from missing data due to blinks, which may result in reduced system performance. We conducted a literature review to understand how researchers deal with this issue. We uncovered that researchers often implemented their use-case-specific pipeline to overcome the issue, ranging from ignoring missing data to artificial interpolation. With these first insights, we run a large-scale analysis on 11 publicly available datasets to understand the impact of the various approaches on data quality and accuracy. By this, we highlight the pitfalls in data processing and which methods work best. Based on our results, we provide guidelines for handling eye tracking data for interactive systems. Further, we propose a standard data processing pipeline that allows researchers and practitioners to pre-process and standardize their data efficiently.
Sedentary behavior is endemic in modern workplaces, contributing to negative physical and mental health outcomes. Although adjustable standing desks are increasing in popularity, people still avoid standing. We developed an open-source plug-and-play system to remotely control standing desks and investigated three system modes with a three-week in-the-wild user study (N=15). Interval mode forces users to stand once per hour, causing frustration. Adaptive mode nudges users to stand every hour unless the user has stood already. Smart mode, which raises the desk during breaks, was the best rated, contributing to increased standing time with the most positive qualitative feedback. However, non-computer activities need to be accounted for in the future. Therefore, our results indicate that a smart standing desk that shifts modes at opportune times has the most potential to reduce sedentary behavior in the workplace. We contribute our open-source system and insights for future intelligent workplace well-being systems.
null
Eye tracking is the basis for many intelligent systems to predict user actions. A core challenge with eye-tracking data is that it inherently suffers from missing data due to blinks. Approaches such as intent prediction and user state recognition process gaze data using neural networks; however, they often have difficulty handling missing information. In an effort to understand how prior work dealt with missing data, we found that researchers often simply ignore missing data or adopt use-case-specific approaches, such as artificially filling in missing data. This inconsistency in handling missing data in eye tracking hinders the development of effective intelligent systems for predicting user actions and limits reproducibility. Furthermore, this can even lead to incorrect results. Thus, this lack of standardization calls for investigating possible solutions to improve the consistency and effectiveness of processing eye-tracking data for user action prediction.
Over the last few years, we have seen many approaches using tangibles to address the limited expressiveness of touchscreens. Mainstream tangible detection uses fiducial markers embedded in the tangibles. However, the coarse sensor size of capacitive touchscreens makes tangibles bulky, limiting their usefulness. We propose a novel deep-learning super-resolution network to facilitate fiducial tangibles on capacitive touchscreens better. In detail, our network super-resolves the markers enabling off-the-shelf detection algorithms to track tangibles reliably. Our network generalizes to unseen marker sets, such as AprilTag, ArUco, and ARToolKit. Therefore, we are not limited to a fixed number of distinguishable objects and do not require data collection and network training for new fiducial markers. With extensive evaluation, including real-world users and five showcases, we demonstrate the applicability of our open-source approach on commodity mobile devices and further highlight the potential of tangibles on capacitive touchscreens.
null
null
null
©all images: LMU | TUM