Lernen. Wissen. Daten. Analysen. - Learning. Knowledge. Data. Analytics.
LWDA 2021 hosted by the Munich Center for Machine Learning (MCML), Munich, 01.09–03.09.2021.
Thank you for your patience today regarding the track sessions. Based on the feedback we collected, 30 minutes should be enough to get back to our program. Accordingly, due to our delay:
Time | Wednesday (Sep 1) | Thursday (Sep 2) | Friday (Sep 3) |
---|---|---|---|
09:00 | |||
10:00 | Opening | Keynote 3 Prof. Dr. Mykola Pechenizkiy |
|
Keynote 1 Prof. Dr. Michael Leyer |
|||
11:00 | Track Session | Track Session | |
Joint Session | |||
12:00 | |||
Lunch Break | Lunch Break | Lunch Break | |
13:00 | |||
Track Session | Track Session | Track Session | |
14:00 | |||
15:00 | Closing Session | ||
16:00 | Community Meetings | ||
Keynote 2 Prof. Dr. ir. Arjen P. de Vries |
|||
17:00 | |||
Social Event (Open End) | |||
18:00 | |||
19:00 | |||
A short break of 5-10 minutes is planned between sessions.
Wednesday, Sep 1st, 11:30-12:30
11:30-12:00
Bettina Finzel, René Kollmann, Ines Rieger, Jaspar Pahl and Ute Schmid: Deriving Temporal Prototypes from Saliency Map Clusters for the Analysis of Deep-Learning-based Facial Action Unit Classification (Paper from FGKDML)
12:00-12:30
Michael Steininger, Konstantin Kobs, Padraig Davidson, Anna Krause and Andreas Hotho: Density-based weighting for imbalanced regression (Paper from FGKDML)
Wednesday, Sep 1st, 13:30-16:00
13:30-13:40
Opening
13:40-14:00
Peter K. Schwab, Jonas Röckl, Maximilian S. Langohr, Klaus Meyer-Wegener: Performance Evaluation of Policy-Based SQL Query Classification for Data-Privacy Compliance
14:00-14:20
Ioannis Prapas, Behrouz Derakhshan, Alireza Rezaei Mahdiraji, Volker Markl: Continuous Training and Deployment of Deep Learning Models
14:20-14:40
Chris-Marian Forke, Marina Tropmann-Frick: Feature Engineering as a Part of Data Processing for Spatio-Temporal Data
14:40-14:55
Pause
14:55-15:15
Daniyal Kazempour, Johannes Winter, Peer Kröger, Thomas Seidl: On Methods and Measures for Inspection and Evaluation of Arbitrarily Oriented Subspace Clusters
15:15-15:35
Ulf Leser, Marcus Hilbrich, Claudia Draxl, Peter Eisert, Lars Grunske, Patrick Hostert, Dagmar Kainmüller, Odej Kao, Birte Kehr, Timo Kehrer, Christoph Koch, Volker Markl, Henning Meyerhenke, Tilmann Rabl, Alexander Reinefeld, Knut Reinert, Kerstin Ritter, Björn Scheuermann, Florian Schintke, Nicole Schweikardt, Matthias Weidlich: FONDA – Foundations of Workflows for Large-Scale Analysis of Scientific Data
15:35-15:55
Lars Kegel, Claudio Hartmann, Maik Thiele, Wolfgang Lehner: Season- and Trend-aware Symbolic Approximation for Accurate and Efficient Time Series Matching
Thursday, Sep 2nd, 11:00-12:30
11:00-11:20
Thomas Weißgerber, Mehdi Ben Amor, Christofer Fellicious, Michael Granitzer: PyPads - Transparent Machine Learning Experiment Tracking
11:20-11:40
Alexander Schoenenwald, Simon Kern, Josef Viehhauser, Johannes Schildgen: Collecting and visualizing data lineage of Spark jobs
11:40-12:00
Meike Klettke, Uta Störl: Four Generations in Data Engineering for Data Science – The Past, Presence and Future of a Field of Science
12:00-12:30
Jörg Desel, Daniel Krupka, Julia Meisner: GI entwickelt Empfehlungen zur Gestaltung von Data-Science-Masterstudiengängen
Wednesday 2021-09-01: 11:30-16:00
11:30-12:00
Bettina Finzel, René Kollmann, Ines Rieger, Jaspar Pahl and Ute Schmid: Deriving Temporal Prototypes from Saliency Map Clusters for the Analysis of Deep-Learning-based Facial Action Unit Classification (Joint Session)
12:00-12:30
Michael Steininger, Konstantin Kobs, Padraig Davidson, Anna Krause and Andreas Hotho: Density-based weighting for imbalanced regression (Joint Session)
13:30-13:55
Leonid Schwenke and Martin Atzmueller: Abstracting Local Transformer Attention for Enhancing Interpretability on Time Series Data
13:55-14:20
Dominik Dürrschnabel, Maren Koyda and Gerd Stumme: Attribute Selection using Contranominal Scales
14:20-14:45
Pascal Welke, Fouad Alkhoury, Christian Bauckhage and Stefan Wrobel: Decision Snippet Features
14:45-15:10
Felix Stamm, Martin Becker, Markus Strohmaier and Florian Lemmerich: Redescription Model Mining
15:10-15:35
Bastian Schäfermeier, Gerd Stumme and Tom Hanika: Topic Space Trajectories
15:35-16:00
Deniz Neufeld: Visualization Methods for Periodic Time Series Data
Thursday 2021-09-02: 11:00-12:30
11:00-11:15
Christopher Hagedorn and Johannes Huegle: Constraint-Based Causal Structure Learning in Multi-GPU Environments
11:15-11:30
Max Luebbering, Michael Gebauer, Rajkumar Ramamurthy, Maren Pielka, Christian Bauckhage and Rafet Sifa: Utilizing Representation Learning for Robust Text Classification Under Datasetshift
11:30-11:45
Tobias Rohrer, Ludwig Samuel, Adriatik Gashi, Gunter Grieser and Elke Hergenröther: Foosball table goalkeeper automation using reinforcement learning
11:45-12:00
Lars Schmarje and Reinhard Koch: Life is not black and white - Combining Semi-Supervised Learning with fuzzy labels
12:00-12:15
Felix Gonsior, Sascha Mücke and Katharina Morik: Structure Search for Normalizing Flows
12:15-12:30
Philipp Doebler, Anna Doebler, Philip Buczak and Andreas Groll: Interactions of Scores Derived from Two Groups of Variables: Alternating Lasso Regularization Avoids Overfitting and Finds Interpretable Scores
Thursday 2021-09-02: 13:30-16:00
13:30-13:55
Simon Omlor and Alexander Munteanu: Oblivious sketching for logistic regression
13:55-14:20
Daniel Neider, Jean-Raphaël Gaglione, Ivan Gavran, Ufuk Topcu, Bo Wu and Zhe Xu: AdvisoRL: Advice-Guided Reinforcement Learning in a non-Markovian Environment
14:20-14:45
Xuan Xie: Property-Directed Verification and Robustness Certification of Recurrent Neural Networks
14:45-15:10
Mirko Bunse and Katharina Morik: A PAC Learning Theory for Active Class Selection
15:10-15:35
Erich Schubert: HACAM: Hierarchical Agglomerative Clustering Around Medoids - and its Limitations
Friday 2021-09-03: 11:00-12:30
11:00-11:15
Erik Thordsen and Erich Schubert: CANDLE: Classification And Noise Detection With Local Embedding Approximations
11:15-11:30
Andreas Lohrer, Anna Beer, Maximilian Archimedes Xaver Hünemörder, Jenny Lauterbach, Thomas Seidl and Peer Kröger: AnyCORE - An Anytime Algorithm for Cluster Outlier REmoval
11:30-11:45
Nil Ayday and Debarghya Ghoshdastidar: Improvement on Incremental Spectral Clustering
11:45-12:10
Eike Stadtländer, Tamás Horváth and Stefan Wrobel: Learning Weakly Convex Sets in Metric Spaces
Friday 2021-09-03: 13:30-15:00
13:30-13:55
Mirko Lenz, Premtim Sahitaj, Sean Kallenberg, Christopher Coors, Lorik Dumani, Ralf Schenkel and Ralph Bergmann: Towards an Argument Mining Pipeline Transforming Texts to Argument Graphs
13:55-14:20
Philip Hausner and Michael Gertz: News Article Extraction Using Graph Embeddings
14:20-14:45
Patrick Kolpaczki, Viktor Bengs and Eyke Hüllermeier: Identifying Top-k Players in Cooperative Games via Shapley Bandits
Thursday, Sep 2nd:
13:30-13:40:
Henning Baars: Welcome
13:40-14:10:
12 – Malte Constantinescu, Michael Schulz and Kerstin Schneider: Konzeption einer Anwendung zur Ad-hoc-Sternschema-Generierung
14:10-14:40:
36 – Thomas Rupek: Establishing Governance Structures for Analytics-Driven Interorganizational Data Sharing Networks – Designing a Framework Based on a Qualitative Study
15:00-15:30:
48 – Maximilian Werling: Gestaltung eines methodischen Vorgehens bei der Auswahl eines konzeptionellen Ansatzes zur Befähigung kooperativer Wertschöpfung
15:30-16:00:
23 – Sebastian Trinks: Real Time Quality Assurance and Defect Detection in Industry 4.0
Wednesday, Sep 1st, 13:30-16:00
13.35 Presentation 1:
Arnab Ghosh Chowdhury, Nils Schut and Martin Atzmueller: A Hybrid Information Extraction Approach using Transfer Learning on Richly-Structured Documents
14.10 Presentation 2:
Andreas Korger and Joachim Baumeister: Rule-based Relation Extraction in Regulatory Documents
14.45 Presentation 3:
Pascal Reuss, Wasgen Muradian and Klaus-Dieter Althoff: Towards a domain-specific language for knowledge maintenance of CBR systems
15.20 Presentation 4:
Jakob Michael Schoenborn and Klaus-Dieter Althoff: Detecting SQL-Injection and Cross-Site Scripting Attacks Using Case-Based Reasoning and SEASALT
Thursday, Sep 2nd, 13:30-16:00
13.35 Presentation 5:
Pascal Reuss and Klaus-Dieter Althoff: Using gaming environments to teach the idea and application of CBR
14.10 Presentation 6:
Hannes Reil and Michael Leyer: How smart services affect relevant job characteristics in production environments
14.45 Presentation 7:
Mahta Bakhshizadeh, Christian Jilek, Heiko Maus and Andreas Dengel: Leveraging context-aware recommender systems for improving personal knowledge assistants by introducing contextual states
(Break / “get-together” until official community meeting at 16.00)
Wednesday, Sep 1st, 13:30-16:00
13:30 Introduction and Welcome
13:40
Martin Potthast, Sebastian Günther, Janek Bevendorff, Jan Philipp Bittner, Alexander Bondarenko, Maik Fröbe, Christian Kahmann, Andreas Niekler, Michael Völske, Benno Stein and Matthias Hagen: The Information Retrieval Anthology
14:05
Fabian Haak: Emojis in Lexicon-Based Sentiment Analysis: Creating Emoji Sentiment Lexicons from Unlabeled Corpora
14:30
Khanh Hiep Tran, Azin Ghazimatin and Rishiraj Saha Roy: Counterfactual Explanations for Neural Recommenders
14:55
Timo Breuer, Nicola Ferro, Norbert Fuhr, Maria Maistro, Tetsuya Sakai, Philipp Schaer and Ian Soboroff: How to Measure the Reproducibility of System-oriented IR Experiments
15:20
Dirk Lewandowski, Sebastian Sünkler and Nurce Yagci: Der Einfluss der Suchmaschinenoptimierung auf die Ergebnisse von Google: Ein mehrdimensionaler Ansatz zur Erkennung von SEO
15:45 End of Session
Keynote 2
Thursday, Sep 2nd, 13:30-16:00
13:30 Introduction and Welcome
13:40
Edgar Meij: Keynote: Search and Discovery for Finance
14:40
Jurek Leonhardt, Fabian Beringer and Avishek Anand: Exploiting Sentence-Level Representations for Passage Ranking
15:05
Magdalena Kaiser, Rishiraj Saha Roy and Gerhard Weikum: Reinforcement Learning from Reformulations in Conversational Question Answering
15:30
Lorik Dumani and Ralf Schenkel: Quality-Aware Ranking of Arguments
15:55 End of Session
Community Meeting
Thursday, Sep 2nd
13:30-13:45
Opening
13:45-14:15
Manfred Moosleitner: Co-Rating Attacks on Recommendation Algorithms
14:15-14:45
Michael Hohenstein: Progressive Indexing for Interactive Analytics
15:00-15:30
Marcel Weisgut: Experimental Index Evaluation for Partial Indexes in Horizontally Partitioned In-Memory Databases
15:30-16:00
Andreas Görres: Ausblick auf einen erweiterten CHASE-Algorithmus
Friday, Sep 3rd
11:00-11:30
Moritz Wilke: Towards Multi-modal Entity Resolution for Product Matching
11:30-12:00
Johannes Fett: Towards Porting Hardware-Oblivious Vectorized Query Operators to GPUs
12:00-12:30
Florian Rose: Der BACKCHASE zur Unterstützung von Data Provenance und Schema-Evolution
Lunch Break
13:30-14:00
Anh Trang Le: Design Considerations Towards AI-Driven Co-Processor Accelerated Database Management
14:00-14:30
Philsy Baban: Validation of Data Streams using Time Series Forecasting
14:30-15:00
Benjamin Murauer: DT-grams: Structured Dependency Grammar Stylometry for Cross-Language Authorship Attribution
Abstract: Individuals decide and act on the basis of perceived data, information and knowledge. What happens in the brain with this input, and which influencing factors lead to which decisions and actions, is studied in different disciplines. In addition, there are algorithms and systems that generate data, information and knowledge, make it available to individuals and with which individuals interact. The complexity increases further with different systems related to artificial intelligence. But even if artefacts are generated that process input without humans, there are many interactions. The keynote gives an overview of different theories that explain cognitive processes of people from different angles. It also takes a closer look at how visualizations of data and information are processed and influence cognitive processes. In addition, the interaction of humans with applications based on artificial intelligence is considered, as well as how these are accepted by humans.
Bio: Prof. Dr. Michael Leyer holds the Chair of Service Operations at the University of Rostock and is Adjunct Professor in the School of Management of the Queensland University of Technology in Brisbane, Australia. He conducts research on the effects of new technologies, the design of new forms of work (future of work), the integration of customers into business processes and the design of service networks. A central aspect is the consideration of behavior, cognitive processes and decisions of people in processes. The topics are examined from a theoretical perspective in order to derive well-founded, practical implications. He has published his research results in over 100 scientific publications. In addition to his research activities, he is active in various positions. He is President of the Council of the University of Rostock, serves on the scientific management board of the Center for Entrepreneurship and is a member of the board of the Information and Communication Network at the University of Rostock. In addition, he is involved in the Association of University Teachers in Business Administration with the establishment of innovative concepts at the central scientific association conference and is a member of the steering committee of the Knowledge Management Group (FGWM) in the Gesellschaft für Informatik.
Abstract: Modern machine learning techniques contribute to the massive automation of data-driven decision making and decision support. Multiple examples from different industries, healthcare, education, and government illustrate the challenges of developing and making use of trustworthy and human-centered AI. It is becoming better understood and accepted that employed predictive models may need to be audited. Regardless of whether we deal with so-called black-box models (e.g. deep learning) or more interpretable models (e.g. decision trees), answering even basic questions like “why is this model giving these answers?” and “how do particular features affect the model output?” is nontrivial. In reality, auditors need tools not just to explain the decision logic of an algorithm, but also to uncover and characterize undesired or unlawful biases in predictive model performance; for example, by law, hiring decisions cannot be influenced by race or gender. In this talk I will give a brief overview of the different facets of comprehensibility of predictive analytics and reflect on the current state-of-the-art and further research needed for gaining a deeper understanding of what it means for predictive analytics to be truly transparent, fair and accountable. I will also reflect on the necessity to study the utility of the methods for interpretable predictive analytics.
Bio: Mykola Pechenizkiy is Professor of Data Mining at the Department of Mathematics and Computer Science, TU Eindhoven. His main expertise and research interests are in predictive analytics and its application to real-world problems in industry, healthcare and education. He leads Trustworthy AI interdisciplinary research studying foundations of robustness, safety, trust, reliability, scalability, interpretability and explainability of AI; developing novel techniques for informed, accountable and transparent predictive and prescriptive analytics; and demonstrating their ecological validity in practice in collaboration with industrial partners. He has co-authored several publications and served on the program committees of the leading data mining and AI conferences, including IJCAI, ECMLPKDD, AAAI, and ICML among others.
Abstract: My research has always focused on the question of how to integrate databases and information retrieval technology. Why would you even want to do that? Are the join operations necessary to solve information access problems not way too expensive to run ranking queries on a database engine? In this talk I will argue that, yes, it is technically feasible and desirable to bring the benefits of “the database approach” to the field of information retrieval, to enable the field to tackle the challenges posed by the next generation of search systems. Addressing complex information needs that span multiple, heterogeneous information sources and match the relevance criteria to the personal or work context where they arise calls for a higher level of abstraction than the inverted file, and adoption of the separation of concerns and data independence that are the de facto standard for developing business applications. I will discuss the basic building blocks drawn from my prior research and experience with the integration of IR and databases, and conclude with a brief introduction to GeeseDB, a research toolkit being developed in my group to explore the benefits of graphs as a representation and express search solutions in a graph query language.
Bio: Arjen P. de Vries is professor of Information Retrieval and research director of the Institute of Computing and Information Sciences at Radboud University Nijmegen in the Netherlands. His research aims to resolve the question of how users and systems may cooperate to improve information access, with a specific focus on the value of a combination of structured and unstructured information representations. He is a founding member of Spinque, a company that integrates databases and information retrieval to develop search solutions with and for information specialists.
Abstract: Finance professionals face a myriad of tasks in their day-to-day workflows, including finding relevant information, generating trade ideas, staying up to date with breaking news or general trends in the world, and coming up with novel ways to generate “alpha”. Historically, this has centered mainly around traditional sources of data such as stock exchange ticks, news stories, company filings, etc. An increasing amount of data is being generated in textual form on social media and elsewhere, however, and given the simultaneous increase in more machine-readable forms of “alternative data” such as web site usage, credit card transactions, and mobile app analytics, we face a unique opportunity to identify, score, rank, suggest, filter, and alert to financially relevant information and events. In this talk, Edgar describes typical information needs of finance professionals and how Bloomberg uses techniques such as search, summarization, entity linking, and natural language understanding & generation to address those needs.
Bio: Edgar Meij is the head of the Artificial Intelligence (AI) Discovery group in Bloomberg’s Engineering department. He leads several teams of researchers and engineers who develop systems that provide question answering capabilities, smart contextual suggestions with severe latency constraints, as well as the Bloomberg Knowledge Graph with its advanced machine learning-based analytics that is used to generate accurate, timely, and contextual financial insights. Edgar holds a PhD in computer science from the University of Amsterdam and has an extensive track record in information retrieval, natural language processing, and machine learning. Before joining Bloomberg, Edgar worked at Yahoo! Labs on all aspects related to entities in the context of web search.