westurner 2 years ago

> At the same time, the results show that the explanations obtained from IGNNet are aligned with the true Shapley values of the features without incurring any additional computational overhead
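
For context on what "true Shapley values" means in the quote above, here is a minimal sketch of an exact Shapley computation. It is not IGNNet's method: the additive model, its weights, the input `x`, and the all-zeros baseline are all hypothetical, chosen so the result is easy to check by hand.

```python
from itertools import permutations

def shapley_values(value_fn, n_features):
    """Exact Shapley values: average each feature's marginal
    contribution over every ordering of the features."""
    totals = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        coalition = set()
        for i in order:
            before = value_fn(frozenset(coalition))
            coalition.add(i)
            totals[i] += value_fn(frozenset(coalition)) - before
    return [t / len(orderings) for t in totals]

# Hypothetical additive model f(x) = 2*x0 + 3*x1 - 1*x2.
# Features outside the coalition are set to a baseline of 0 (an assumption).
weights = [2.0, 3.0, -1.0]
x = [1.0, 2.0, 4.0]

def value_fn(coalition):
    return sum(weights[i] * x[i] for i in coalition)

phi = shapley_values(value_fn, len(x))
print(phi)  # one attribution per feature
```

For an additive model each attribution is just that feature's own term, and the attributions sum to the prediction minus the baseline prediction. The loop visits all n! orderings, which is why exact Shapley values get expensive quickly and why explanations that come "without additional computational overhead" are notable.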

TabPFN: https://github.com/automl/TabPFN https://twitter.com/FrankRHutter/status/1583410845307977733

"TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second" (2022) https://arxiv.org/abs/2207.01848

FWIU, TabPFN is trained to approximate Bayesian inference and outperforms XGBoost on non-categorical data

PaulHoule 2 years ago

Right, the significance of the original article and the related field of research is that ChatGPT-like models don't handle tabular data well and there's a lot of need for things that do.

  • westurner 2 years ago

    There are multiple metrics to optimize for.

    FWIU, from the diagram in the photo in the linked tweet, which is similar to a diagram on page 16 of the TabPFN paper [1]: on the OpenML-CC18 benchmark, TabPFN reaches a better ROC AUC (area under the Receiver Operating Characteristic curve) after 1 second than XGBoost, CatBoost, LightGBM, KNN, SAINT, Reg. Cocktail, and AutoGluon do after any amount of time, while Auto-sklearn 2.0 required 5 minutes to reach ~ROC parity with TabPFN.

    1. "TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second" (May 2023) https://arxiv.org/abs/2207.01848

    2. "Interpretable Graph Neural Networks for Tabular Data" (Aug 2023) https://arxiv.org/abs/2308.08945
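
For readers unfamiliar with the ROC metric used in the comparison above, here is a minimal sketch of how the area under the ROC curve can be computed for a binary problem. The labels and the two sets of scores are made up for illustration and are not taken from any benchmark; a multiclass benchmark would need an averaging scheme on top of this, which is not shown.

```python
def roc_auc(y_true, scores):
    """ROC AUC via the pairwise-ranking formulation: the fraction of
    (positive, negative) pairs where the positive is scored higher."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Correctly ordered pairs count 1, tied pairs count 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and scores from two imaginary models.
y_true = [0, 0, 1, 1]
model_a = [0.1, 0.4, 0.35, 0.8]
model_b = [0.2, 0.3, 0.6, 0.9]

print(roc_auc(y_true, model_a))  # 0.75: one of four pairs is misordered
print(roc_auc(y_true, model_b))  # 1.0: every positive outranks every negative
```

Because the metric depends only on how examples are ranked, it says nothing about training time, which is why the comparison above reports it alongside a time budget.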