Home | Research | Groups | Göran Kauermann

Research Group Göran Kauermann

Göran Kauermann

Prof. Dr.

Principal Investigator

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

Göran Kauermann

holds the Chair of Applied Statistics in Social Sciences, Economics and Business at LMU Munich.

He conducts research in advanced regression analysis, focusing on generalized additive models and generalized mixed models. His work aims to refine statistical methods for complex data, enhancing their application in various scientific fields.

Team members @MCML

PhD Students

Jan Anders

A1 | Statistical Foundations & Explainability
→ Group Göran Kauermann

Applied Statistics in Social Sciences, Economics and Business

Publications @MCML

2025

[21]

C. Gruber, H. Alber, B. Bischl, G. Kauermann, B. Plank and M. Aßenmacher.
Revisiting Active Learning under (Human) Label Variation.
Preprint (Jul. 2025). arXiv

Abstract

Access to high-quality labeled data remains a limiting factor in applied supervised learning. While label variation (LV), i.e., differing labels for the same instance, is common, especially in natural language processing, annotation frameworks often still rest on the assumption of a single ground truth. This overlooks human label variation (HLV), the occurrence of plausible differences in annotations, as an informative signal. Similarly, active learning (AL), a popular approach to optimizing the use of limited annotation budgets in training ML models, often relies on at least one of several simplifying assumptions, which rarely hold in practice when acknowledging HLV. In this paper, we examine foundational assumptions about truth and label nature, highlighting the need to decompose observed LV into signal (e.g., HLV) and noise (e.g., annotation error). We survey how the AL and (H)LV communities have addressed – or neglected – these distinctions and propose a conceptual framework for incorporating HLV throughout the AL loop, including instance selection, annotator choice, and label representation. We further discuss the integration of large language models (LLM) as annotators. Our work aims to lay a conceptual foundation for HLV-aware active learning, better reflecting the complexities of real-world annotation.

MCML Authors

Helen Alber

A1 | Statistical Foundations & Explainability
→ Group Bernd Bischl

Statistical Learning and Data Science

Bernd Bischl

Prof. Dr.

A1 | Statistical Foundations & Explainability

Statistical Learning and Data Science

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

Barbara Plank

Prof. Dr.

B2 | Natural Language Processing

AI and Computational Linguistics

Matthias Aßenmacher

Dr.

A1 | Statistical Foundations & Explainability
→ Group Bernd Bischl

Statistical Learning and Data Science

[20]

L. Meynent, I. Melev, K. Schürholt, G. Kauermann and D. Borth.
Structure Is Not Enough: Leveraging Behavior for Neural Network Weight Reconstruction.
Weight Space Learning @ICLR 2025 - Workshop on Weight Space Learning at the 13th International Conference on Learning Representations (ICLR 2025). Singapore, Apr 24-28, 2025. arXiv URL

Abstract

The weights of neural networks (NNs) have recently gained prominence as a new data modality in machine learning, with applications ranging from accuracy and hyperparameter prediction to representation learning or weight generation. One approach to leverage NN weights involves training autoencoders (AEs), using contrastive and reconstruction losses. This allows such models to be applied to a wide variety of downstream tasks, and they demonstrate strong predictive performance and low reconstruction error. However, despite the low reconstruction error, these AEs reconstruct NN models with deteriorated performance compared to the original ones, limiting their usability with regard to model weight generation. In this paper, we identify a limitation of weight-space AEs, specifically highlighting that a structural loss, that uses the Euclidean distance between original and reconstructed weights, fails to capture some features critical for reconstructing high-performing models. We analyze the addition of a behavioral loss for training AEs in weight space, where we compare the output of the reconstructed model with that of the original one, given some common input. We show a strong synergy between structural and behavioral signals, leading to increased performance in all downstream tasks evaluated, in particular NN weights reconstruction and generation.

MCML Authors

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

[19]

M. Schneble and G. Kauermann.
Statistical modelling of on-street parking spot occupancy in smart cities.
Journal of the Royal Statistical Society. Series C (Applied Statistics).qlaf017 (Mar. 2025). DOI

Abstract

Many studies suggest that searching for parking is associated with significant direct and indirect costs. Therefore, it is appealing to reduce the time that car drivers spend on finding an available parking spot, especially in urban areas where the space for all road users is limited. The prediction of on-street parking spot occupancy can provide drivers with guidance on where clear parking spaces are likely to be found. This field of research has gained more and more attention in the last decade through the increasing availability of real-time parking spot occupancy data. In this paper, we pursue a statistical approach for the prediction of parking spot occupancy, where we make use of time-to-event models and semi-Markov process theory. The latter involves the employment of Laplace transformations as well as their inversion, which is an ambitious numerical task. We apply our methodology to data from the City of Melbourne in Australia. Our main result is that the semi-Markov model outperforms a Markov model in terms of both true negative rate and true positive rate while this is essentially achieved by respecting the current duration that a parking space already spends in its initial state.

MCML Authors

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

2023

[18]

C. Koller, G. Kauermann and X. Zhu.
Going Beyond One-Hot Encoding in Classification: Can Human Uncertainty Improve Model Performance in Earth Observation?
IEEE Transactions on Geoscience and Remote Sensing 62 (Dec. 2023). DOI GitHub

Abstract

Technological and computational advances continuously drive forward the field of deep learning in remote sensing. In recent years, the derivation of quantities describing the uncertainty in the prediction—which naturally accompanies the modeling process—has sparked interest in the remote sensing community. Often neglected in the machine learning setting is the human uncertainty that influences numerous labeling processes. As the core of this work, the task of local climate zone (LCZ) classification is studied by means of a dataset that contains multiple label votes by domain experts for each image. The inherent label uncertainty describes the ambiguity among the domain experts and is explicitly embedded into the training process via distributional labels. We show that incorporating the label uncertainty helps the model to generalize better to the test data and increases model performance. Similar to existing calibration methods, the distributional labels lead to better-calibrated probabilities, which in turn yield more certain and trustworthy predictions.

MCML Authors

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

[17]

L. Fahrmeir, G. Kauermann, G. Tutz and M. Windmann.
Spatial smoothing revisited: An application to rental data in Munich.
Statistical Modelling 23.5-6 (Aug. 2023). DOI

Abstract

Spatial smoothing makes use of spatial information to obtain better estimates in regression models. In particular flexible smoothing with B-splines and penalties, which has been propagated by Eilers and Marx (1996), provides strong tools that can be used to include available spatial information. We consider alternative smoothing methods in spatial additive regression and employ them for analysing rental data in Munich. The first method applies tensor product P-splines to the geolocation of apartments, measured on a continuous scale through the centroid of the quarter where an apartment is. The alternative approach exploits the neighbourhood structure of districts on a discrete scale, where districts consist of a set of neighbouring quarters. The discrete modelling approach yields smooth estimates when using ridge-type penalties but can also enforce spatial clustering of districts with a homogeneous structure when using Lasso-type penalties.

MCML Authors

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

[16]

C. Fritz, G. De Nicola, S. Kevork, D. Harhoff and G. Kauermann.
Modelling the large and dynamically growing bipartite network of German patents and inventors.
Journal of the Royal Statistical Society. Series A (Statistics in Society) 186.3 (Jul. 2023). DOI

Abstract

To explore the driving forces behind innovation, we analyse the dynamic bipartite network of all inventors and patents registered within the field of electrical engineering in Germany in the past two decades. To deal with the sheer size of the data, we decompose the network by exploiting the fact that most inventors tend to only stay active for a relatively short period. We thus propose a Temporal Exponential Random Graph Model with time-varying actor set and sufficient statistics mirroring substantial expectations for our analysis. Our results corroborate that inventor characteristics and team formation are essential to the dynamics of invention.

MCML Authors

Cornelius Fritz

Dr.

A1 | Statistical Foundations & Explainability
→ Group Göran Kauermann

* Former Member

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

2022

[15]

C. Fritz, G. De Nicola, F. Günther, D. Rügamer, M. Rave, M. Schneble, A. Bender, M. Weigert, R. Brinks, A. Hoyer, U. Berger, H. Küchenhoff and G. Kauermann.
Challenges in Interpreting Epidemiological Surveillance Data – Experiences from Germany.
Journal of Computational and Graphical Statistics 32.3 (Dec. 2022). DOI

Abstract

As early as March 2020, the authors of this letter started to work on surveillance data to obtain a clearer picture of the pandemic’s dynamic. This letter outlines the lessons learned during this peculiar time, emphasizing the benefits that better data collection, management, and communication processes would bring to the table. We further want to promote nuanced data analyses as a vital element of general political discussion as opposed to drawing conclusions from raw data, which are often flawed in epidemiological surveillance data, and therefore underline the overall need for statistics to play a more central role in public discourse.

MCML Authors

Cornelius Fritz

Dr.

A1 | Statistical Foundations & Explainability
→ Group Göran Kauermann

* Former Member

David Rügamer

Prof. Dr.

A1 | Statistical Foundations & Explainability

Statistics, Data Science and Machine Learning

Andreas Bender

Dr.

A1 | Statistical Foundations & Explainability
→ Group Bernd Bischl

Machine Learning Consulting Unit (MLCU)

Maximilian Weigert

C4 | Computational Social Sciences
→ Group Helmut Küchenhoff

* Former Member

Helmut Küchenhoff

Prof. Dr.

C4 | Computational Social Sciences

Statistical Consulting Unit (StaBLab)

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

[14]

C. Fritz, M. Mehrl, P. W. Thurner and G. Kauermann.
All that Glitters is not Gold: Relational Events Models with Spurious Events.
Network Science 11.2 (Sep. 2022). DOI

Abstract

As relational event models are an increasingly popular model for studying relational structures, the reliability of large-scale event data collection becomes more and more important. Automated or human-coded events often suffer from non-negligible false-discovery rates in event identification. And most sensor data are primarily based on actors’ spatial proximity for predefined time windows; hence, the observed events could relate either to a social relationship or random co-location. Both examples imply spurious events that may bias estimates and inference. We propose the Relational Event Model for Spurious Events (REMSE), an extension to existing approaches for interaction data. The model provides a flexible solution for modeling data while controlling for spurious events. Estimation of our model is carried out in an empirical Bayesian approach via data augmentation. Based on a simulation study, we investigate the properties of the estimation procedure. To demonstrate its usefulness in two distinct applications, we employ this model to combat events from the Syrian civil war and student co-location data. Results from the simulation and the applications identify the REMSE as a suitable approach to modeling relational event data in the presence of spurious events.

MCML Authors

Cornelius Fritz

Dr.

A1 | Statistical Foundations & Explainability
→ Group Göran Kauermann

* Former Member

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

[13]

M. Schneble and G. Kauermann.
Estimation of Latent Network Flows in Bike-Sharing Systems.
Statistical Modelling 22.2 (Aug. 2022). DOI

Abstract

Estimation of latent network flows is a common problem in statistical network analysis. The typical setting is that we know the margins of the network, that is, in- and outdegrees, but the flows are unobserved. In this article, we develop a mixed regression model to estimate network flows in a bike-sharing network if only the hourly differences of in- and outdegrees at bike stations are known. We also include exogenous covariates such as weather conditions. Two different parameterizations of the model are considered to estimate (a) the whole network flow and (b) the network margins only. The estimation of the model parameters is proposed via an iterative penalized maximum likelihood approach. This is exemplified by modelling network flows in the Vienna bike-sharing system. In order to evaluate our modelling approach, we conduct our analyses exploiting different distributional assumptions while we also respect the provider’s interventions appropriately for keeping the estimation error low. Furthermore, a simulation study is conducted to show the performance of the model. For practical purposes, it is crucial to predict when and at which station there is a lack or an excess of bikes. For this application, our model shows to be well suited by providing quite accurate predictions.

MCML Authors

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

[12]

C. Fritz, G. De Nicola, M. Rave, M. Weigert, Y. Khazaei, U. Berger, H. Küchenhoff and G. Kauermann.
Statistical modelling of COVID-19 data: Putting generalized additive models to work.
Statistical Modelling 24.4 (Aug. 2022). DOI

Abstract

Over the course of the COVID-19 pandemic, Generalized Additive Models (GAMs) have been successfully employed on numerous occasions to obtain vital data-driven insights. In this article we further substantiate the success story of GAMs, demonstrating their flexibility by focusing on three relevant pandemic-related issues. First, we examine the interdepency among infections in different age groups, concentrating on school children. In this context, we derive the setting under which parameter estimates are independent of the (unknown) case-detection ratio, which plays an important role in COVID-19 surveillance data. Second, we model the incidence of hospitalizations, for which data is only available with a temporal delay. We illustrate how correcting for this reporting delay through a nowcasting procedure can be naturally incorporated into the GAM framework as an offset term. Third, we propose a multinomial model for the weekly occupancy of intensive care units (ICU), where we distinguish between the number of COVID-19 patients, other patients and vacant beds. With these three examples, we aim to showcase the practical and ‘off-the-shelf’ applicability of GAMs to gain new insights from real-world data.

MCML Authors

Cornelius Fritz

Dr.

A1 | Statistical Foundations & Explainability
→ Group Göran Kauermann

* Former Member

Maximilian Weigert

C4 | Computational Social Sciences
→ Group Helmut Küchenhoff

* Former Member

Helmut Küchenhoff

Prof. Dr.

C4 | Computational Social Sciences

Statistical Consulting Unit (StaBLab)

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

[11]

C. Fritz.
Statistical approaches to dynamic networks in society.
Dissertation 2022. DOI

Abstract

This dissertation focuses on dynamic networks in the Social Sciences, examining methods and applications in network modeling. Part two provides an overview of modeling frameworks for dynamic networks, including applications in studying COVID-19 infections using social connectivity as covariates. In part three, the dissertation introduces a Signed Exponential Random Graph Model (SERGM) for signed networks and a bipartite variant of the Temporal Exponential Random Graph Model (TERGM) to study co-inventorship in patents. Part four concludes with models for event networks, including a Relational Event Model for Spurious Events (REMSE) to manage false-discovery rates in event data. (Shortened).

MCML Authors

Cornelius Fritz

Dr.

A1 | Statistical Foundations & Explainability
→ Group Göran Kauermann

* Former Member

[10]

M. Schneble and G. Kauermann.
Intensity Estimation on Geometric Networks with Penalized Splines.
Annals of Applied Statistics 16.2 (Jun. 2022). DOI

Abstract

In the past decades the growing amount of network data lead to many novel statistical models. In this paper we consider so-called geometric networks. Typical examples are road networks or other infrastructure networks. Nevertheless, the neurons or the blood vessels in a human body can also be interpreted as a geometric network embedded in a three-dimensional space. A network-specific metric, rather than the Euclidean metric, is usually used in all these applications, making the analyses of network data challenging. We consider network-based point processes, and our task is to estimate the intensity (or density) of the process which allows us to detect high- and low-intensity regions of the underlying stochastic processes. Available routines that tackle this problem are commonly based on kernel smoothing methods. This paper uses penalized spline smoothing and extends this toward smooth intensity estimation on geometric networks. Furthermore, our approach easily allows incorporating covariates, enabling us to respect the network geometry in a regression model framework. Several data examples and a simulation study show that penalized spline-based intensity estimation on geometric networks is a numerically stable and efficient tool. Furthermore, it also allows estimating linear and smooth covariate effects, distinguishing our approach from already existing methodologies.

MCML Authors

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

[9]

S. Kevork and G. Kauermann.
Bipartite Exponential Random Graph Models with Nodal Random Effects.
Social Networks 70 (Jun. 2022). DOI

Abstract

We examine the inclusion of specific nodal random effects for first- and second-mode nodes towards an ERGM for bipartite networks. The inclusion of such node-specific random effects in the ERGM accounts for unobserved heterogeneity in the bipartite network and ensures stable estimation results, especially for large-scale bipartite networks. Moreover, The predicted nodal random effects deliver reasonable interpretation to understand the network behavior. The estimation is carried out by an iterative estimation technique, iterating between pseudolikelihood estimation for the nodal random effects and maximum likelihood estimation for the network parameters.

MCML Authors

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

[8]

C. Fritz, E. Dorigatti and D. Rügamer.
Combining Graph Neural Networks and Spatio-temporal Disease Models to Predict COVID-19 Cases in Germany.
Scientific Reports 12.3930 (Mar. 2022). DOI

Abstract

During 2020, the infection rate of COVID-19 has been investigated by many scholars from different research fields. In this context, reliable and interpretable forecasts of disease incidents are a vital tool for policymakers to manage healthcare resources. In this context, several experts have called for the necessity to account for human mobility to explain the spread of COVID-19. Existing approaches often apply standard models of the respective research field, frequently restricting modeling possibilities. For instance, most statistical or epidemiological models cannot directly incorporate unstructured data sources, including relational data that may encode human mobility. In contrast, machine learning approaches may yield better predictions by exploiting these data structures yet lack intuitive interpretability as they are often categorized as black-box models. We propose a combination of both research directions and present a multimodal learning framework that amalgamates statistical regression and machine learning models for predicting local COVID-19 cases in Germany. Results and implications: the novel approach introduced enables the use of a richer collection of data types, including mobility flows and colocation probabilities, and yields the lowest mean squared error scores throughout the observational period in the reported benchmark study. The results corroborate that during most of the observational period more dispersed meeting patterns and a lower percentage of people staying put are associated with higher infection rates. Moreover, the analysis underpins the necessity of including mobility data and showcases the flexibility and interpretability of the proposed approach.

MCML Authors

Cornelius Fritz

Dr.

A1 | Statistical Foundations & Explainability
→ Group Göran Kauermann

* Former Member

Emilio Dorigatti

Dr.

A1 | Statistical Foundations & Explainability
→ Group Bernd Bischl

* Former Member

David Rügamer

Prof. Dr.

A1 | Statistical Foundations & Explainability

Statistics, Data Science and Machine Learning

[7]

G. De Nicola, B. Sischka and G. Kauermann.
Mixture Models and Networks: The Stochastic Block Model.
Statistical Modelling 22.1-2 (Feb. 2022). DOI

Abstract

Mixture models are probabilistic models aimed at uncovering and representing latent subgroups within a population. In the realm of network data analysis, the latent subgroups of nodes are typically identified by their connectivity behaviour, with nodes behaving similarly belonging to the same community. In this context, mixture modelling is pursued through stochastic blockmodelling. We consider stochastic blockmodels and some of their variants and extensions from a mixture modelling perspective. We also explore some of the main classes of estimation methods available and propose an alternative approach based on the reformulation of the blockmodel as a graphon. In addition to the discussion of inferential properties and estimating procedures, we focus on the application of the models to several real-world network datasets, showcasing the advantages and pitfalls of different approaches.

MCML Authors

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

[6]

C. Fritz and G. Kauermann.
On the Interplay of Regional Mobility, Social Connectedness, and the Spread of COVID-19 in Germany.
Journal of the Royal Statistical Society. Series A (Statistics in Society) 185.1 (Jan. 2022). DOI

Abstract

Since the primary mode of respiratory virus transmission is person-to-person interaction, we are required to reconsider physical interaction patterns to mitigate the number of people infected with COVID-19. While research has shown that non-pharmaceutical interventions (NPI) had an evident impact on national mobility patterns, we investigate the relative regional mobility behaviour to assess the effect of human movement on the spread of COVID-19. In particular, we explore the impact of human mobility and social connectivity derived from Facebook activities on the weekly rate of new infections in Germany between 3 March and 22 June 2020. Our results confirm that reduced social activity lowers the infection rate, accounting for regional and temporal patterns. The extent of social distancing, quantified by the percentage of people staying put within a federal administrative district, has an overall negative effect on the incidence of infections. Additionally, our results show spatial infection patterns based on geographical as well as social distances.

MCML Authors

Cornelius Fritz

Dr.

A1 | Statistical Foundations & Explainability
→ Group Göran Kauermann

* Former Member

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

2021

[5]

S. Kevork and G. Kauermann.
Iterative Estimation of Mixed Exponential Random Graph Models with Nodal Random Effects.
Network Science 9.4 (Dec. 2021). DOI

Abstract

The presence of unobserved node-specific heterogeneity in exponential random graph models (ERGM) is a general concern, both with respect to model validity as well as estimation instability. We, therefore, include node-specific random effects in the ERGM that account for unobserved heterogeneity in the network. This leads to a mixed model with parametric as well as random coefficients, labelled as mixed ERGM. Estimation is carried out by iterating between approximate pseudolikelihood estimation for the random effects and maximum likelihood estimation for the remaining parameters in the model. This approach provides a stable algorithm, which allows to fit nodal heterogeneity effects even for large scale networks. We also propose model selection based on the Akaike Information Criterion to check for node-specific heterogeneity.

MCML Authors

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

[4]

C. Fritz, M. Mehrl, P. W. Thurner and G. Kauermann.
The Role of Governmental Weapons Procurements in Forecasting Monthly Fatalities in Intrastate Conflicts: A Semiparametric Hierarchical Hurdle Model.
International Interactions 48.4 (Nov. 2021). DOI

Abstract

Accurate and interpretable forecasting models predicting spatially and temporally fine-grained changes in the numbers of intrastate conflict casualties are of crucial importance for policymakers and international non-governmental organizations (NGOs). Using a count data approach, we propose a hierarchical hurdle regression model to address the corresponding prediction challenge at the monthly PRIO-grid level. More precisely, we model the intensity of local armed conflict at a specific point in time as a three-stage process. Stages one and two of our approach estimate whether we will observe any casualties at the country- and grid-cell-level, respectively, while stage three applies a regression model for truncated data to predict the number of such fatalities conditional upon the previous two stages. Within this modeling framework, we focus on the role of governmental arms imports as a processual factor allowing governments to intensify or deter from fighting. We further argue that a grid cell’s geographic remoteness is bound to moderate the effects of these military buildups. Out-of-sample predictions corroborate the effectiveness of our parsimonious and theory-driven model, which enables full transparency combined with accuracy in the forecasting process.

MCML Authors

Cornelius Fritz

Dr.

A1 | Statistical Foundations & Explainability
→ Group Göran Kauermann

* Former Member

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

[3]

C. Fritz, P. W. Thurner and G. Kauermann.
Separable and Semiparametric Network-based Counting Processes applied to the International Combat Aircraft Trades.
Network Science 9.3 (Sep. 2021). DOI

Abstract

We propose a novel tie-oriented model for longitudinal event network data. The generating mechanism is assumed to be a multivariate Poisson process that governs the onset and repetition of yearly observed events with two separate intensity functions. We apply the model to a network obtained from the yearly dyadic number of international deliveries of combat aircraft trades between 1950 and 2017. Based on the trade gravity approach, we identify economic and political factors impeding or promoting the number of transfers. Extensive dynamics as well as country heterogeneities require the specification of semiparametric time-varying effects as well as random effects. Our findings reveal strong heterogeneous as well as time-varying effects of endogenous and exogenous covariates on the onset and repetition of aircraft trade events.

MCML Authors

Cornelius Fritz

Dr.

A1 | Statistical Foundations & Explainability
→ Group Göran Kauermann

* Former Member

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

2020

[2]

C. Fritz, M. Lebacher and G. Kauermann.
Tempus volat, hora fugit: A survey of tie-oriented dynamic network models in discrete and continuous time.
Statistica Neerlandica 74.3 (Aug. 2020). DOI

Abstract

Given the growing number of available tools for modeling dynamic networks, the choice of a suitable model becomes central. The goal of this survey is to provide an overview of tie-oriented dynamic network models. The survey is focused on introducing binary network models with their corresponding assumptions, advantages, and shortfalls. The models are divided according to generating processes, operating in discrete and continuous time. First, we introduce the temporal exponential random graph model (TERGM) and the separable TERGM (STERGM), both being time-discrete models. These models are then contrasted with continuous process models, focusing on the relational event model (REM). We additionally show how the REM can handle time-clustered observations, that is, continuous-time data observed at discrete time points. Besides the discussion of theoretical properties and fitting procedures, we specifically focus on the application of the models on two networks that represent international arms transfers and email exchange, respectively. The data allow to demonstrate the applicability and interpretation of the network models.

MCML Authors

Cornelius Fritz

Dr.

A1 | Statistical Foundations & Explainability
→ Group Göran Kauermann

* Former Member

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business

[1]

A. Beyer, G. Kauermann and H. Schütze.
Embedding Space Correlation as a Measure of Domain Similarity.
LREC 2020 - 12th International Conference on Language Resources and Evaluation. Marseille, France, May 13-15, 2020. URL

Abstract

Prior work has determined domain similarity using text-based features of a corpus. However, when using pre-trained word embeddings, the underlying text corpus might not be accessible anymore. Therefore, we propose the CCA measure, a new measure of domain similarity based directly on the dimension-wise correlations between corresponding embedding spaces. Our results suggest that an inherent notion of domain can be captured this way, as we are able to reproduce our findings for different domain comparisons for English, German, Spanish and Czech as well as in cross-lingual comparisons. We further find a threshold at which the CCA measure indicates that two corpora come from the same domain in a monolingual setting by applying permutation tests. By evaluating the usability of the CCA measure in a domain adaptation application, we also show that it can be used to determine which corpora are more similar to each other in a cross-domain sentiment detection task.

MCML Authors

Göran Kauermann

Prof. Dr.

A1 | Statistical Foundations & Explainability

Applied Statistics in Social Sciences, Economics and Business