15.03.2021

MCML Researchers With Three Papers at ECIR 2021

43rd European Conference on Information Retrieval (ECIR 2021). Virtual, 28.03.2021–01.04.2021

We are happy to announce that MCML researchers are represented with three papers at ECIR 2021. Congrats to our researchers!

Main Track (3 papers)

M. Berrendorf, E. Faerman and V. Tresp.
Active Learning for Entity Alignment.
ECIR 2021 - 43rd European Conference on Information Retrieval. Virtual, Mar 28-Apr 01, 2021.

Abstract

In this work, we propose a novel framework for labeling entity alignments in knowledge graph datasets. Different strategies to select informative instances for the human labeler form the core of our framework. We illustrate how labeling entity alignments differs from assigning class labels to single instances and how these differences affect labeling efficiency. Based on these considerations, we propose and evaluate different active and passive learning strategies. One of our main findings is that passive learning approaches, which can be efficiently precomputed and deployed more easily, achieve performance comparable to the active learning strategies.

MCML Authors

Max Berrendorf

Dr.

* Former Member

→ Group Volker Tresp
Database Systems, Data Mining and AI

Evgeny Faerman

Dr.

* Former Member

→ Group Matthias Schubert
Spatial Artificial Intelligence

Volker Tresp

Prof. Dr.

Principal Investigator

Database Systems, Data Mining and AI

M. Berrendorf, L. Wacker and E. Faerman.
A Critical Assessment of State-of-the-Art in Entity Alignment.
ECIR 2021 - 43rd European Conference on Information Retrieval. Virtual, Mar 28-Apr 01, 2021.

Abstract

In this work, we perform an extensive investigation of two state-of-the-art (SotA) methods for the task of Entity Alignment in Knowledge Graphs. We first carefully examine the benchmarking process and identify several shortcomings that make the results reported in the original works not always comparable. Furthermore, we suspect that it is common practice in the community to perform hyperparameter optimization directly on the test set, which reduces the informative value of the reported performance. We therefore select a representative sample of benchmarking datasets and describe their properties. We also examine different initializations for entity representations, since they are a decisive factor for model performance. Furthermore, we use a shared train/validation/test split so that all methods are evaluated on all datasets in an appropriate setting. In our evaluation, we make several interesting findings. While we observe that SotA approaches perform better than baselines most of the time, they have difficulties when the dataset contains noise, which is the case in most real-life applications. Moreover, our ablation study shows that the features of the SotA methods that are crucial for good performance are often different from those previously assumed.

MCML Authors

Max Berrendorf

Dr.

* Former Member

→ Group Volker Tresp
Database Systems, Data Mining and AI

Evgeny Faerman

Dr.

* Former Member

→ Group Matthias Schubert
Spatial Artificial Intelligence

M. Fromm, M. Berrendorf, S. Obermeier, T. Seidl and E. Faerman.
Diversity Aware Relevance Learning for Argument Search.
ECIR 2021 - 43rd European Conference on Information Retrieval. Virtual, Mar 28-Apr 01, 2021.

Abstract

In this work, we focus on the problem of retrieving relevant arguments for a query claim covering diverse aspects. State-of-the-art methods rely on explicit mappings between claims and premises and are thus unable to utilize large available collections of premises without laborious and costly manual annotation. Their diversity approach relies on removing duplicates via clustering, which does not directly ensure that the selected premises cover all aspects. This work introduces a new multi-step approach for the argument retrieval problem. Rather than relying on ground-truth assignments, our approach employs a machine learning model to capture semantic relationships between arguments. Beyond that, it aims to cover diverse facets of the query instead of trying to identify duplicates explicitly. Our empirical evaluation demonstrates that our approach leads to a significant improvement in the argument retrieval task even though it requires less data.

MCML Authors

Michael Fromm

Dr.

* Former Member

→ Group Thomas Seidl
Database Systems, Data Mining and AI