LWDA 2021 - Program


Lernen. Wissen. Daten. Analysen. - Learning. Knowledge. Data. Analytics.

LWDA 2021 at MCML, Munich, 01.09–03.09.2021.

Thank you for your patience regarding today's track sessions. After collecting feedback, we expect 30 minutes to be enough to get back to our program. Because of the delay, the revised times are:

  • Starting 16:30: Keynote 2, Prof. Dr. ir. Arjen P. de Vries
  • Starting 17:30: Social Event

Schedule

Time Wednesday (Sep 1) Thursday (Sep 2) Friday (Sep 3)
09:00
10:00 Opening Keynote 3
Prof. Dr. Mykola Pechenizkiy
Keynote 1
Prof. Dr. Michael Leyer
11:00 Track Session Track Session
Joint Session
12:00
Lunch Break Lunch Break Lunch Break
13:00
Track Session Track Session Track Session
14:00
15:00 Closing Session
16:00 Community Meetings
Keynote 2
Prof. Dr. ir. Arjen P. de Vries
17:00
Social Event (Open End)
18:00
19:00

A short break of 5-10 minutes is planned between sessions.


Joint Session

Wednesday, Sep 1st, 11:30-12:30

  • 11:30-12:00
    Bettina Finzel, René Kollmann, Ines Rieger, Jaspar Pahl and Ute Schmid: Deriving Temporal Prototypes from Saliency Map Clusters for the Analysis of Deep-Learning-based Facial Action Unit Classification (Paper from FGKDML)

  • 12:00-12:30
    Michael Steininger, Konstantin Kobs, Padraig Davidson, Anna Krause and Andreas Hotho: Density-based weighting for imbalanced regression (Paper from FGKDML)

Program of each Workshop

FG Datenbanksysteme - Data Engineering for Data Science

Wednesday, Sep 1st, 13:30-16:00

  • 13:30-13:40
    Opening

  • 13:40-14:00
    Peter K. Schwab, Jonas Röckl, Maximilian S. Langohr, Klaus Meyer-Wegener: Performance Evaluation of Policy-Based SQL Query Classification for Data-Privacy Compliance

  • 14:00-14:20
    Ioannis Prapas, Behrouz Derakhshan, Alireza Rezaei Mahdiraji, Volker Markl: Continuous Training and Deployment of Deep Learning Models

  • 14:20-14:40
    Chris-Marian Forke, Marina Tropmann-Frick: Feature Engineering as a Part of Data Processing for Spatio-Temporal Data

  • 14:40-14:55
    Break

  • 14:55-15:15
    Daniyal Kazempour, Johannes Winter, Peer Kröger, Thomas Seidl: On Methods and Measures for Inspection and Evaluation of Arbitrarily Oriented Subspace Clusters

  • 15:15-15:35
    Ulf Leser, Marcus Hilbrich, Claudia Draxl, Peter Eisert, Lars Grunske, Patrick Hostert, Dagmar Kainmüller, Odej Kao, Birte Kehr, Timo Kehrer, Christoph Koch, Volker Markl, Henning Meyerhenke, Tilmann Rabl, Alexander Reinefeld, Knut Reinert, Kerstin Ritter, Björn Scheuermann, Florian Schintke, Nicole Schweikardt, Matthias Weidlich: FONDA – Foundations of Workflows for Large-Scale Analysis of Scientific Data

  • 15:35-15:55
    Lars Kegel, Claudio Hartmann, Maik Thiele, Wolfgang Lehner: Season- and Trend-aware Symbolic Approximation for Accurate and Efficient Time Series Matching

Thursday, Sep 2nd, 11:00-12:30

  • 11:00-11:20
    Thomas Weißgerber, Mehdi Ben Amor, Christofer Fellicious, Michael Granitzer: PyPads - Transparent Machine Learning Experiment Tracking

  • 11:20-11:40
    Alexander Schoenenwald, Simon Kern, Josef Viehhauser, Johannes Schildgen: Collecting and visualizing data lineage of Spark jobs

  • 11:40-12:00
    Meike Klettke, Uta Störl: Four Generations in Data Engineering for Data Science – The Past, Presence and Future of a Field of Science

  • 12:00-12:30
    Jörg Desel, Daniel Krupka, Julia Meisner: GI entwickelt Empfehlungen zur Gestaltung von Data-Science-Masterstudiengängen


FG Knowledge Discovery und Machine Learning

Wednesday, Sep 1st, 11:30-16:00

  • 11:30-12:00
    Bettina Finzel, René Kollmann, Ines Rieger, Jaspar Pahl and Ute Schmid: Deriving Temporal Prototypes from Saliency Map Clusters for the Analysis of Deep-Learning-based Facial Action Unit Classification (Joint Session)

  • 12:00-12:30
    Michael Steininger, Konstantin Kobs, Padraig Davidson, Anna Krause and Andreas Hotho: Density-based weighting for imbalanced regression (Joint Session)

  • 13:30-13:55
    Leonid Schwenke and Martin Atzmueller: Abstracting Local Transformer Attention for Enhancing Interpretability on Time Series Data

  • 13:55-14:20
    Dominik Dürrschnabel, Maren Koyda and Gerd Stumme: Attribute Selection using Contranominal Scales

  • 14:20-14:45
    Pascal Welke, Fouad Alkhoury, Christian Bauckhage and Stefan Wrobel: Decision Snippet Features

  • 14:45-15:10
    Felix Stamm, Martin Becker, Markus Strohmaier and Florian Lemmerich: Redescription Model Mining

  • 15:10-15:35
    Bastian Schäfermeier, Gerd Stumme and Tom Hanika: Topic Space Trajectories

  • 15:35-16:00
    Deniz Neufeld: Visualization Methods for Periodic Time Series Data

Thursday, Sep 2nd, 11:00-12:30

  • 11:00-11:15
    Christopher Hagedorn and Johannes Huegle: Constraint-Based Causal Structure Learning in Multi-GPU Environments

  • 11:15-11:30
    Max Luebbering, Michael Gebauer, Rajkumar Ramamurthy, Maren Pielka, Christian Bauckhage and Rafet Sifa: Utilizing Representation Learning for Robust Text Classification Under Datasetshift

  • 11:30-11:45
    Tobias Rohrer, Ludwig Samuel, Adriatik Gashi, Gunter Grieser and Elke Hergenröther: Foosball table goalkeeper automation using reinforcement learning

  • 11:45-12:00
    Lars Schmarje and Reinhard Koch: Life is not black and white - Combining Semi-Supervised Learning with fuzzy labels

  • 12:00-12:15
    Felix Gonsior, Sascha Mücke and Katharina Morik: Structure Search for Normalizing Flows

  • 12:15-12:30
    Philipp Doebler, Anna Doebler, Philip Buczak and Andreas Groll: Interactions of Scores Derived from Two Groups of Variables: Alternating Lasso Regularization Avoids Overfitting and Finds Interpretable Scores

Thursday, Sep 2nd, 13:30-16:00

  • 13:30-13:55
    Simon Omlor and Alexander Munteanu: Oblivious sketching for logistic regression

  • 13:55-14:20
    Daniel Neider, Jean-Raphaël Gaglione, Ivan Gavran, Ufuk Topcu, Bo Wu and Zhe Xu: AdvisoRL: Advice-Guided Reinforcement Learning in a non-Markovian Environment

  • 14:20-14:45
    Xuan Xie: Property-Directed Verification and Robustness Certification of Recurrent Neural Networks

  • 14:45-15:10
    Mirko Bunse and Katharina Morik: A PAC Learning Theory for Active Class Selection

  • 15:10-15:35
    Erich Schubert: HACAM: Hierarchical Agglomerative Clustering Around Medoids - and its Limitations

Friday, Sep 3rd, 11:00-12:30

  • 11:00-11:15
    Erik Thordsen and Erich Schubert: CANDLE: Classification And Noise Detection With Local Embedding Approximations

  • 11:15-11:30
    Andreas Lohrer, Anna Beer, Maximilian Archimedes Xaver Hünemörder, Jenny Lauterbach, Thomas Seidl and Peer Kröger: AnyCORE - An Anytime Algorithm for Cluster Outlier REmoval

  • 11:30-11:45
    Nil Ayday and Debarghya Ghoshdastidar: Improvement on Incremental Spectral Clustering

  • 11:45-12:10
    Eike Stadtländer, Tamás Horváth and Stefan Wrobel: Learning Weakly Convex Sets in Metric Spaces

Friday, Sep 3rd, 13:30-15:00

  • 13:30-13:55
    Mirko Lenz, Premtim Sahitaj, Sean Kallenberg, Christopher Coors, Lorik Dumani, Ralf Schenkel and Ralph Bergmann: Towards an Argument Mining Pipeline Transforming Texts to Argument Graphs

  • 13:55-14:20
    Philip Hausner and Michael Gertz: News Article Extraction Using Graph Embeddings

  • 14:20-14:45
    Patrick Kolpaczki, Viktor Bengs and Eyke Hüllermeier: Identifying Top-k Players in Cooperative Games via Shapley Bandits


FG Business Intelligence und Analytics

Thursday, Sep 2nd

  • 13:30-13:40
    Henning Baars: Welcome

  • 13:40-14:10
    12 – Malte Constantinescu, Michael Schulz and Kerstin Schneider: Konzeption einer Anwendung zur Ad-hoc-Sternschema-Generierung

  • 14:10-14:40
    36 – Thomas Rupek: Establishing Governance Structures for Analytics-Driven Interorganizational Data Sharing Networks – Designing a Framework Based on a Qualitative Study

  • 15:00-15:30
    48 – Maximilian Werling: Gestaltung eines methodischen Vorgehens bei der Auswahl eines konzeptionellen Ansatzes zur Befähigung kooperativer Wertschöpfung

  • 15:30-16:00
    23 – Sebastian Trinks: Real Time Quality Assurance and Defect Detection in Industry 4.0


FG Knowledge Management

Wednesday, Sep 1st, 13:30-16:00

  • 13:35 Presentation 1:
    Arnab Ghosh Chowdhury, Nils Schut and Martin Atzmueller: A Hybrid Information Extraction Approach using Transfer Learning on Richly-Structured Documents

  • 14:10 Presentation 2:
    Andreas Korger and Joachim Baumeister: Rule-based Relation Extraction in Regulatory Documents

  • 14:45 Presentation 3:
    Pascal Reuss, Wasgen Muradian and Klaus-Dieter Althoff: Towards a domain-specific language for knowledge maintenance of CBR systems

  • 15:20 Presentation 4:
    Jakob Michael Schoenborn and Klaus-Dieter Althoff: Detecting SQL-Injection and Cross-Site Scripting Attacks Using Case-Based Reasoning and SEASALT

Thursday, Sep 2nd, 13:30-16:00

  • 13:35 Presentation 5:
    Pascal Reuss and Klaus-Dieter Althoff: Using gaming environments to teach the idea and application of CBR

  • 14:10 Presentation 6:
    Hannes Reil and Michael Leyer: How smart services affect relevant job characteristics in production environments

  • 14:45 Presentation 7:
    Mahta Bakhshizadeh, Christian Jilek, Heiko Maus and Andreas Dengel: Leveraging context-aware recommender systems for improving personal knowledge assistants by introducing contextual states

(Break / “get-together” until the official community meeting at 16:00)


FG Information Retrieval

Wednesday, Sep 1st, 13:30-16:00

  • 13:30 Introduction and Welcome

  • 13:40
    Martin Potthast, Sebastian Günther, Janek Bevendorff, Jan Philipp Bittner, Alexander Bondarenko, Maik Fröbe, Christian Kahmann, Andreas Niekler, Michael Völske, Benno Stein and Matthias Hagen: The Information Retrieval Anthology

  • 14:05
    Fabian Haak: Emojis in Lexicon-Based Sentiment Analysis: Creating Emoji Sentiment Lexicons from Unlabeled Corpora

  • 14:30
    Khanh Hiep Tran, Azin Ghazimatin and Rishiraj Saha Roy: Counterfactual Explanations for Neural Recommenders

  • 14:55
    Timo Breuer, Nicola Ferro, Norbert Fuhr, Maria Maistro, Tetsuya Sakai, Philipp Schaer and Ian Soboroff: How to Measure the Reproducibility of System-oriented IR Experiments

  • 15:20
    Dirk Lewandowski, Sebastian Sünkler and Nurce Yagci: Der Einfluss der Suchmaschinenoptimierung auf die Ergebnisse von Google: Ein mehrdimensionaler Ansatz zur Erkennung von SEO

  • 15:45 End of Session

Keynote 2

  • 16:00
    Arjen de Vries: You will want to rank your text data with a database too!

Thursday, Sep 2nd, 13:30-16:00

  • 13:30 Introduction and Welcome

  • 13:40
    Edgar Meij: Keynote: Search and Discovery for Finance

  • 14:40
    Jurek Leonhardt, Fabian Beringer and Avishek Anand: Exploiting Sentence-Level Representations for Passage Ranking

  • 15:05
    Magdalena Kaiser, Rishiraj Saha Roy and Gerhard Weikum: Reinforcement Learning from Reformulations in Conversational Question Answering

  • 15:30
    Lorik Dumani and Ralf Schenkel: Quality-Aware Ranking of Arguments

  • 15:55 End of Session

Community Meeting

  • 16:00
    Community Meeting of Special Interest Group IR

FG Grundlagen von Datenbanken

Thursday, Sep 2nd

  • 13:30-13:45
    Opening

  • 13:45-14:15
    Manfred Moosleitner: Co-Rating Attacks on Recommendation Algorithms

  • 14:15-14:45
    Michael Hohenstein: Progressive Indexing for Interactive Analytics

  • 15:00-15:30
    Marcel Weisgut: Experimental Index Evaluation for Partial Indexes in Horizontally Partitioned In-Memory Databases

  • 15:30-16:00
    Andreas Görres: Ausblick auf einen erweiterten CHASE-Algorithmus

Friday, Sep 3rd

  • 11:00-11:30
    Moritz Wilke: Towards Multi-modal Entity Resolution for Product Matching

  • 11:30-12:00
    Johannes Fett: Towards Porting Hardware-Oblivious Vectorized Query Operators to GPUs

  • 12:00-12:30
    Florian Rose: Der BACKCHASE zur Unterstützung von Data Provenance und Schema-Evolution

    Lunch Break

  • 13:30-14:00
    Anh Trang Le: Design Considerations Towards AI-Driven Co-Processor Accelerated Database Management

  • 14:00-14:30
    Philsy Baban: Validation of Data Streams using Time Series Forecasting

  • 14:30-15:00
    Benjamin Murauer: DT-grams: Structured Dependency Grammar Stylometry for Cross-Language Authorship Attribution


LWDA Keynote Speakers

Prof. Dr. Michael Leyer

How our brain reacts to and interacts with data, information and knowledge



Abstract: Individuals decide and act on the basis of perceived data, information and knowledge. What happens in the brain with this input, and which influencing factors lead to which decisions and actions, is studied in different disciplines. In addition, there are algorithms and systems that generate data, information and knowledge, make it available to individuals and with which individuals interact. The complexity is increased with different systems related to artificial intelligence. But even if artefacts are generated that process input without humans, there are many interactions. The keynote gives an overview of different theories that explain cognitive processes of people from different angles. It also takes a closer look at how visualizations of data and information are processed and influence cognitive processes. In addition, the interaction of humans with applications based on artificial intelligence is considered as well as how these are accepted by humans.

Bio: Prof. Dr. Michael Leyer holds the Chair of Service Operations at the University of Rostock and is Adjunct Professor in the School of Management of the Queensland University of Technology in Brisbane, Australia. He conducts research on the effects of new technologies, the design of new forms of work (future of work), the integration of customers into business processes and the design of service networks. A central aspect is the consideration of behavior, cognitive processes and decisions of people in processes. The topics are examined from a theoretical perspective in order to be able to derive well-founded, practical implications. He has published his research results in over 100 scientific publications. In addition to his research activities, he is active in various positions. He is President of the Council of the University of Rostock, sits on the scientific management board of the Center for Entrepreneurship, and is a member of the board of the Information and Communication Network at the University of Rostock. In addition, he is involved in the Association of University Teachers in Business Administration with the establishment of innovative concepts at the central scientific association conference and is a member of the steering committee of the Knowledge Management Group (FGWM) in the Gesellschaft für Informatik.


Prof. Dr. Mykola Pechenizkiy

The origins and future of AI fairness, accountability and transparency



Abstract: Modern machine learning techniques contribute to the massive automation of data-driven decision making and decision support. Multiple examples from different industries, healthcare, education, and government illustrate the challenges of developing and making use of trustworthy and human-centered AI. It becomes better understood and accepted that employed predictive models may need to be audited. Regardless of whether we deal with so-called black-box models (e.g. deep learning) or more interpretable models (e.g. decision trees), answering even basic questions like “why is this model giving these answers?” and “how do particular features affect the model output?” is nontrivial. In reality, auditors need tools not just to explain the decision logic of an algorithm, but also to uncover and characterize undesired or unlawful biases in predictive model performance; for example, by law, hiring decisions cannot be influenced by race or gender. In this talk I will give a brief overview of the different facets of comprehensibility of predictive analytics and reflect on the current state-of-the-art and further research needed for gaining a deeper understanding of what it means for predictive analytics to be truly transparent, fair and accountable. I will also reflect on the necessity to study utility of the methods for interpretable predictive analytics.

Bio: Mykola Pechenizkiy is Professor of Data Mining at the Department of Mathematics and Computer Science, TU Eindhoven. His main expertise and research interests are in predictive analytics and its application to real-world problems in industry, healthcare and education. He leads Trustworthy AI interdisciplinary research studying foundations of robustness, safety, trust, reliability, scalability, interpretability and explainability of AI; developing novel techniques for informed, accountable and transparent predictive and prescriptive analytics; and demonstrating their ecological validity in practice in collaboration with industrial partners. He has co-authored several publications and served on the program committees of the leading data mining and AI conferences, including IJCAI, ECMLPKDD, AAAI, and ICML among others.


Prof. Dr. ir. Arjen P. de Vries

You will want to rank your text data with a database too!



Abstract: My research has always focused on the question of how to integrate databases and information retrieval technology. Why would you even want to do that? Are the join operations necessary to solve information access problems not way too expensive to run ranking queries on a database engine? In this talk I will argue that, yes, it is technically feasible and desirable to bring the benefits of “the database approach” to the field of information retrieval, to enable the field to tackle the challenges posed by the next generation of search systems. Addressing complex information needs that span multiple, heterogeneous information sources and match the relevance criteria to the personal or work context where they arise calls for a higher level of abstraction than the inverted file, and adoption of the separation of concerns and data independence that are the de facto standard for developing business applications. I will discuss the basic building blocks drawn from my prior research and experience with the integration of IR and databases, and conclude with a brief introduction to GeeseDB, a research toolkit being developed in my group to explore the benefits of graphs as a representation and express search solutions in a graph query language.

Bio: Arjen P. de Vries is professor of Information Retrieval and research director of the Institute of Computing and Information Sciences at Radboud University Nijmegen in the Netherlands. His research aims to resolve the question of how users and systems may cooperate to improve information access, with a specific focus on the value of a combination of structured and unstructured information representations. He is a founding member of Spinque, a company that integrates databases and information retrieval to develop search solutions with and for information specialists.


Edgar Meij

Search and Discovery for Finance

Abstract: Finance professionals face a myriad of tasks in their day-to-day workflows, including finding relevant information, generating trade ideas, staying up to date with breaking news or general trends in the world, and coming up with novel ways to generate “alpha”. Historically, this has centered mainly around traditional sources of data such as stock exchange ticks, news stories, company filings, etc. An increasing amount of data is being generated in textual form on social media and elsewhere, however, and given the simultaneous increase in more machine-readable forms of “alternative data” such as web site usage, credit card transactions, and mobile app analytics, we face a unique opportunity to identify, score, rank, suggest, filter, and alert to financially relevant information and events. In this talk, Edgar describes typical information needs of finance professionals and how Bloomberg uses techniques such as search, summarization, entity linking, and natural language understanding & generation to address those needs.

Bio: Edgar Meij is the head of the Artificial Intelligence (AI) Discovery group in Bloomberg’s Engineering department. He leads several teams of researchers and engineers who develop systems that provide question answering capabilities, smart contextual suggestions with severe latency constraints, as well as the Bloomberg Knowledge Graph with its advanced machine learning-based analytics that is used to generate accurate, timely, and contextual financial insights. Edgar holds a PhD in computer science from the University of Amsterdam and has an extensive track record in information retrieval, natural language processing, and machine learning. Before joining Bloomberg, Edgar worked at Yahoo! Labs on all aspects related to entities in the context of web search.