
Towards Localization via Data Embedding for TabPFN

MCML Authors

Thomas Nagler

Prof. Dr.

Principal Investigator

Computational Statistics & Data Science

Matthias Feurer

Prof. Dr.

Thomas Bayes Fellow

* Former Thomas Bayes Fellow

Abstract

Prior-data fitted networks (PFNs), especially TabPFN, have shown significant promise in tabular data prediction. However, their scalability is limited by the quadratic complexity of the transformer architecture's attention across training points. In this work, we propose a method to localize TabPFN, which embeds data points into a learned representation and performs nearest neighbor selection in this space. We evaluate it across six datasets, demonstrating its superior performance over standard TabPFN when scaling to larger datasets. We also explore its design choices and analyze the bias-variance trade-off of this localization method, showing that it reduces bias while maintaining manageable variance. This work opens up a pathway for scaling TabPFN to arbitrarily large tabular datasets.
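
The localization idea in the abstract (embed the data, select nearest neighbors in the embedding space, predict from that local context only) can be illustrated with a minimal sketch. This is not the authors' implementation. The linear map `W` is a hypothetical stand-in for the learned embedding, `majority_vote` is a hypothetical placeholder for the in-context predictor (TabPFN in the paper), and all function names are invented for illustration.

```python
import numpy as np


def embed(X, W):
    """Hypothetical stand-in for a learned embedding: a fixed linear map."""
    return X @ W


def select_local_context(Z_train, z_query, k):
    """Return indices of the k training points closest to z_query in embedding space."""
    dists = np.linalg.norm(Z_train - z_query, axis=1)
    k = min(k, len(dists))
    # argpartition finds the k smallest distances without sorting all of them
    return np.argpartition(dists, k - 1)[:k]


def majority_vote(X_ctx, y_ctx, X_q):
    """Placeholder in-context predictor (assumption; the paper uses TabPFN here)."""
    values, counts = np.unique(y_ctx, return_counts=True)
    return np.full(len(X_q), values[np.argmax(counts)])


def predict_local(X_train, y_train, X_query, W, k, fit_predict=majority_vote):
    """Predict each query point from its own k-nearest-neighbor context."""
    Z_train = embed(X_train, W)
    Z_query = embed(X_query, W)
    preds = []
    for x, z in zip(X_query, Z_query):
        idx = select_local_context(Z_train, z, k)
        # The predictor only ever sees k context points, not the full training set
        preds.append(fit_predict(X_train[idx], y_train[idx], x[None, :])[0])
    return np.array(preds)


# Toy usage: two well-separated clusters, identity embedding
X_train = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_query = np.array([[0.05, 0.05], [5.05, 5.05]])
preds = predict_local(X_train, y_train, X_query, W=np.eye(2), k=3)
```

Because each query is predicted from only `k` context points, the attention cost per query depends on `k` rather than on the full training set size, which is the scaling motivation stated in the abstract. The choice of `k` is where the bias-variance trade-off discussed above would enter.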

inproceedings KNF+24