TabPFN for Data-Scarce Industrial Settings

Research output: Chapter in Book/Report/Conference proceeding, Chapter, peer-review

Abstract

Tabular foundation models such as TabPFN v2 perform in-context learning by conditioning on a small labeled support set and a query instance, enabling fast adaptation to heterogeneous tabular regression tasks without per-dataset training. Many industrial applications operate in a tiny-sample regime due to cost and process constraints. We analyze TabPFN under extreme label scarcity for regression, positioning it against established tabular baselines and tracing dataset-size–dependent predictive performance. Our study covers sample sizes starting from 5 labeled points per task and includes an industrial steelmaking regression problem as well as public benchmarks. In steelmaking, in-process target measurements are rarely feasible, so intermediate targets are embedded in delayed end-of-process data. Because data collection is slow and data are scarce, effective use requires integrating heterogeneous datasets across vessels, processes, and plants.

A central finding is that the dependence of TabPFN's performance on support set size varies widely with dataset quality and information content. While most benchmark tasks achieve satisfactory performance beyond support set sizes of 20, the investigated industrial datasets require at least 100 samples to consistently outperform a naive mean baseline. We discuss implications for deploying in-context tabular models in the low-data regime and show dataset size dependencies for various competitive tabular regression methods.
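
The comparison against a naive mean baseline across support set sizes can be illustrated with a minimal sketch. It is not the authors' evaluation code: the data are synthetic, the names `support_size_curve`, `MeanBaseline`, and `OneNearestNeighbor` are hypothetical, and a 1-nearest-neighbor regressor stands in for TabPFN so that the example runs with the Python standard library alone. Any estimator exposing `fit` and `predict` could presumably be passed in instead.

```python
import math
import random
import statistics


class MeanBaseline:
    """Naive baseline: always predicts the mean target of the support set."""

    def fit(self, X, y):
        self.mean_ = statistics.fmean(y)
        return self

    def predict(self, X):
        return [self.mean_] * len(X)


class OneNearestNeighbor:
    """Stand-in model (NOT TabPFN): copies the target of the closest support point."""

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        def nearest(q):
            return min(range(len(self.X_)), key=lambda i: math.dist(self.X_[i], q))
        return [self.y_[nearest(q)] for q in X]


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def support_size_curve(model_factory, X, y, sizes, n_repeats=20, n_query=50, seed=0):
    """Mean ratio of model RMSE to mean-baseline RMSE for each support set size.

    For every size n, a random support set of n points and a disjoint query
    set of n_query points are drawn n_repeats times. Ratios below 1 mean the
    model beats the mean baseline on average.
    """
    if len(X) < max(sizes) + n_query:
        raise ValueError("not enough samples for the largest support size plus query set")
    rng = random.Random(seed)
    indices = list(range(len(X)))
    curve = {}
    for n in sizes:
        ratios = []
        for _ in range(n_repeats):
            rng.shuffle(indices)
            support, query = indices[:n], indices[n:n + n_query]
            Xs, ys = [X[i] for i in support], [y[i] for i in support]
            Xq, yq = [X[i] for i in query], [y[i] for i in query]
            model_rmse = rmse(yq, model_factory().fit(Xs, ys).predict(Xq))
            base_rmse = rmse(yq, MeanBaseline().fit(Xs, ys).predict(Xq))
            ratios.append(model_rmse / base_rmse)
        curve[n] = statistics.fmean(ratios)
    return curve


# Synthetic regression task (an assumption for illustration, not data from the study).
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(400)]
y = [2 * a - b + data_rng.gauss(0, 0.1) for a, b in X]

curve = support_size_curve(OneNearestNeighbor, X, y, sizes=[5, 20, 100])
for n, ratio in curve.items():
    print(f"support size {n:>3}: RMSE ratio vs. mean baseline = {ratio:.2f}")
```

Reporting the ratio to the mean baseline, rather than raw RMSE, makes tasks with different target scales comparable and shows directly at which support set size a model starts to add value.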
Original language: English
Title of host publication: EurIPS, AITD Workshop
Number of pages: 5
Publication status: Published - 2025
Event: EurIPS 2025 Workshop, AITD 2025: AI for Tabular Data - Copenhagen, Denmark
Duration: 6 Dec 2025 to 6 Dec 2025

Conference

Conference: EurIPS 2025 Workshop, AITD 2025
Country/Territory: Denmark
City: Copenhagen
Period: 6/12/25 to 6/12/25

Fields of Expertise

  • Information, Communication & Computing
