19.12.2024

Epistemic Foundations and Limitations of Statistics and Science

Blogpost on the Replication Crisis

The Open Science Initiative in Statistics and the MCML recently hosted a workshop on the epistemic foundations and limitations of statistics and science. The event brought together researchers from diverse fields to discuss one of science’s most pressing challenges: the replication crisis. While the crisis is often attributed to systemic issues like “publish or perish” incentives, the discussions highlighted an often overlooked culprit: a lack of understanding and acknowledgment of the epistemic foundations of statistics. After the workshop, our MCML members Lisa Wimmer, Moritz Herrmann, and Patrick Schenk wrote a blog post with their thoughts on the topic.

Statistics suffers from a replication crisis

«The fact that Statistics, as a field, is undergoing a replication crisis might seem puzzling at first.»


Lisa Wimmer

MCML Junior Member

The fact that Statistics, as a field, is undergoing a replication crisis might seem puzzling at first. More applied disciplines like Psychology have long been known to produce results that don’t replicate (i.e., that fail to prompt the same scientific conclusions when studies are repeated). Much of this has been attributed to researchers’ misconceptions about complex statistical entities, such as the notorious p-value, but surely this can’t be a problem for statisticians themselves? Unfortunately, our field suffers from many of the same issues that have tripped up others. For one, the pressure to publish fosters a tendency to emphasize positive results (in the sense of successful methods), while negative results, which are still valuable to the community, remain in the file drawer. A more worrying aspect is that good scientific practice has proven hard to adhere to even with the best of intentions. As the famous physicist Richard Feynman put it: “The first principle is that you must not fool yourself – and you are the easiest person to fool.” We argue that fields like Statistics and Machine Learning need to revisit epistemic foundations and limitations, educating ourselves and others about the principles of empirical sciences.

Neither big data nor large models are going to solve the crisis

«Neither big data nor large models are going to solve the crisis.»


Lisa Wimmer

MCML Junior Member

The foundations for today’s powerful statistical models were laid in the latter half of the past century. New mathematical insights and a leap in available computing power have brought into existence AI agents that people increasingly look to as companions. Their dazzling capabilities, however, mask the brittleness of their theoretical underpinnings. Anecdotes abound of, e.g., ChatGPT hallucinating dreadfully wrong answers or AI systems turning racist. Such undesirable effects stem from faulty development processes: models overfitting to toy datasets, black-box algorithms picking up spurious patterns and producing surprising outcomes, or a failure to account for all relevant sources of uncertainty, which inevitably enter the data long before they are used to build models. These examples already hint at the complexity of the endeavor: it is simply very easy to miss relevant aspects and make mistakes somewhere along the way.
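To make the overfitting point concrete, here is a minimal sketch (our own illustration, not from the workshop): a flexible model can fit a handful of noisy training points almost perfectly while generalizing worse than a simple one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ten noisy observations of a relation that is, in truth, linear.
x_train = np.linspace(0, 1, 10)
y_train = x_train + rng.normal(0, 0.2, size=10)
x_test = np.linspace(0, 1, 200)
y_test = x_test + rng.normal(0, 0.2, size=200)

# A degree-9 polynomial interpolates the training noise almost exactly,
# yet typically performs far worse than the humble linear fit on fresh data.
for degree in (1, 9):
    poly = np.polynomial.Polynomial.fit(x_train, y_train, degree)
    train_mse = np.mean((poly(x_train) - y_train) ** 2)
    test_mse = np.mean((poly(x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.4f}, test MSE {test_mse:.4f}")
```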

In this conundrum, some turn to Big Data as the savior of us all. Can’t we create an appropriate representation of the world if we just feed our models enough data? Sadly, the answer is no, for at least two reasons. First, a well-established result of learning theory states that there can be no learning without inductive biases, i.e., some assumptions we are willing to make about the nature of the data-generating process (otherwise, we could build one model to rule them all and abolish the field of Statistics altogether). Second, it can be shown that data pooled from multiple sources, as is the case in many instances of Big Data, rarely give rise to a well-defined joint probability distribution. In other words, data cobbled together from different corners of the internet don’t tell a coherent story. This may be exacerbated in the future by the incestuous evolution of training data that is to be expected from addressing the perennial data shortage with AI-generated imitations.
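A classic way to see how pooled data can fail to tell a coherent story is Simpson’s paradox. The following sketch, a hypothetical illustration of our own, simulates two sources in which the association between x and y is strongly negative; naively pooling them reverses the sign.

```python
import numpy as np

rng = np.random.default_rng(42)

# Two sources, each with a clearly negative relation between x and y,
# but with different baselines and different ranges of x.
x_a = rng.normal(0.0, 1.0, size=500)
y_a = 2.0 - x_a + rng.normal(0.0, 0.5, size=500)
x_b = rng.normal(5.0, 1.0, size=500)
y_b = 10.0 - x_b + rng.normal(0.0, 0.5, size=500)

print(np.corrcoef(x_a, y_a)[0, 1])  # ~ -0.9 within source A
print(np.corrcoef(x_b, y_b)[0, 1])  # ~ -0.9 within source B

# Pooling the sources without modeling their provenance flips the sign:
# the "joint distribution" of the pooled data tells an incoherent story.
x = np.concatenate([x_a, x_b])
y = np.concatenate([y_a, y_b])
print(np.corrcoef(x, y)[0, 1])      # positive
```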

With so many unresolved issues, society risks being carried away on an enthusiastic wave of adopting technological progress while the foundations of this progress remain shaky. All this means that our field must continue to strive for excellence in scientific principles; sometimes this includes taking a step back and asking whether we, too, have been swept in the wrong direction. Science is a cumulative endeavor in which researchers ought to be able to rely on previous results. We can only achieve this by holding ourselves to the highest possible standards. Otherwise, we’re building a house of cards.

We need clarity about concepts more urgently than about procedures and formalism

«What we actually need is conceptual clarity.»


Lisa Wimmer

MCML Junior Member

Alas, scientists (and perhaps statisticians in particular) are prone to getting bogged down in discussions about methodological details. What we actually need is conceptual clarity. Take the example of reproducibility. Our field broadly seems to consider computational reproducibility, i.e., the guarantee of producing the exact same numerical results when re-running experiment code, necessary and sufficient to tick off replicability. While computational reproducibility is frequently desirable, making it the sole yardstick falls desperately short of good scientific practice. Program code typically stands at the end of a long succession of design choices. Decisions about research questions (which often conflate exploratory and confirmatory endeavors), model classes, datasets, evaluation criteria, etc. heavily influence the scientific conclusions we can draw. Any two studies of the same research question must be expected to differ, if only because their underlying assumptions are violated to varying degrees.
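The gap between the two notions is easy to demonstrate. In this sketch (our own hypothetical example, using scikit-learn), fixing every seed makes the reported accuracy reproducible to the last digit, yet a single upstream design choice, here the train/test split, moves the result considerably:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# A fixed synthetic dataset: with all seeds pinned, every re-run of this
# script prints exactly the same numbers -- computational reproducibility.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

def accuracy(split_seed):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=split_seed)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return model.score(X_te, y_te)

print(accuracy(split_seed=0))  # identical on every re-run

# ...but the *conclusion* ("our method achieves X% accuracy") hinges on an
# upstream design choice that reproducibility alone never interrogates.
print([round(accuracy(s), 3) for s in range(5)])  # visibly varying results
```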

It doesn’t help that scientific progress is often judged by one-dimensional metrics. “When a measure becomes a target, it ceases to be a good measure” has become known as Goodhart’s law. If p-values below 0.05 or above-baseline values of accuracy signal scientific quality, it’s not surprising that researchers optimize for those indicators rather than for actual knowledge gain. The abstraction provided by quantitative methods is actually a core virtue of Statistics, with numbers serving as a lingua franca for people from any scientific (or social, geographical, temporal) background. We need to make sure, however, that quantification doesn’t lead to oversimplification, decontextualization, and measure hacking.
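How quickly the p < 0.05 target degrades as a measure can be seen in a small simulation (again our own illustration): if a study with no real effect tests twenty outcomes and reports only the most favorable one, the majority of such null studies will “succeed”.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

n_studies, n_outcomes, n_per_group = 1000, 20, 30
lucky_studies = 0
for _ in range(n_studies):
    # Every outcome is pure noise: both groups come from the same distribution.
    p_values = [
        stats.ttest_ind(rng.normal(size=n_per_group),
                        rng.normal(size=n_per_group)).pvalue
        for _ in range(n_outcomes)
    ]
    # "Measure hacking": report only the smallest p-value.
    lucky_studies += min(p_values) < 0.05

# Expected fraction: 1 - 0.95**20, i.e., roughly 64% of null studies "succeed".
print(lucky_studies / n_studies)
```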

«Statistics and Machine Learning need to revisit epistemic foundations and limitations, educating ourselves and others about the principles of empirical sciences.»


Lisa Wimmer

MCML Junior Member

This is all the more important in our current geopolitical landscape. Contrary to what naive realism or positivism would have us believe, Statistics doesn’t operate in a vacuum devoid of social processes. We have a responsibility to take into account the circumstances under which data have been generated, the personal perspectives shaping our scientific work, and, ultimately, the implications of employing our models in decision-making.

From our discussions we infer a number of opportunities to save our field from the looming replication crisis. Rather than measure-hack our way forward, we should distinguish more clearly between exploratory research (in which not every idea can be a winner) and confirmatory research (with proper scientific hypotheses). Better infrastructure can further discourage bad practices: well-maintained software and well-curated, well-understood datasets ensure that promising results don’t depend on lucky experimental settings. Besides getting the incentives right, we all need more education; ignorance can’t be an excuse for questionable science. We hope that initiatives like our workshop are steps in the right direction. So, statisticians, roll up your sleeves, there’s a crisis to be solved (freely adapted from Seibold et al., 2021).


For our article, we drew on discussions and talks from our workshop. We emphasize that the above arguments reflect our own interpretation, not the participants’ opinions.

Our speakers and their insightful topics

  • Rudolf Seising: An Interwoven History of AI and Statistics
  • Uwe Saint-Mont: How Feynman Predicted the Replication Crisis
  • Jürgen Landes: Data Aggregation of Big Data Is Not Enough
  • Walter Radermacher: Epistemology and Sociology of Quantification Based on Convention Theory
  • Sabina Leonelli: What Reproducibility Can’t Solve
  • Moritz Herrmann: When Measures Become Targets
  • Sabine Hoffmann: How Foundational Assumptions about Probability, Uncertainty, and Subjectivity Jeopardize the Replicability of Research Findings
  • Michael Schomaker: Replicability When Considering Unconditional Interpretations and Gradations of Evidence

