Ask HN: Why did AlphaProof fail at IMO combinatorial problems?

ckcheng 1 year ago

To try answering myself:

It's because algebraic type problems are typically of the form "Prove algebraic equation E", where E is relatively easy to translate into Lean language.

Whereas combinatoric problem types are of the form "Prove some open-ended expression P" that even if translated into Lean, it's not clear where to start to connect it to axioms or the eventual problem solution.

If that’s an okay way to describe it, then AlphaProof’s language model, likely trained only on Lean statements (?), would have a tough time generating plausible next steps to combinatorial problems for MCTS to explore?

That doesn’t mean the LM + MCTS architecture is bad though. It means the language model needs to be better at generating plausible next steps (sequence of tokens) with maybe a less axiomatic approach to evaluating the fitness of that step (less Lean reliance?).

bbconn 1 year ago

Yes you can get a 7/7 score for a combinatorics problem in IMO with a good amount of handwaving e.g. instead of formally describing a configuration of monsters in P5 of IMO 2024 you can get away with a few sentences that describes what you're doing. There's actually a decent amount of compression being done by using informal language, so I expect formal proofs of combinatorics problems to be significantly longer than formal proofs of algebra/number theory/geometry problems.
I should say that handwaving is also ok in other topics, but formalizing handwaving in other topics is easier.