Math-AI: Dawn Song & Samy Bengio

NeurIPS 2024 Workshop · Read our full coverage

The Math-AI workshop was well-attended, full of optimism, and featured several high-profile speakers from across machine learning and mathematics.

Poster titled "The 4th MATH-AI Workshop", listing the speakers and organisers as well as a timetable for the day.

Schedule for the day

The first invited talk was given by UC Berkeley’s Dawn Song, and was broadly based on an upcoming position paper titled “Formal Mathematical Reasoning: A New Frontier in AI”.

The view taken in the paper and talk is that AI-based formal mathematical reasoning has reached an inflection point, and significant progress is expected in the next few years through a flywheel loop in which “human-written formal math data leads to more capable LLMs, which in turn eases the creation of more data”.

For more background on this area, read our introduction to ML and formal mathematics.

A speaker stands behind a lectern, smiling at the audience.

Dawn Song

Song laid out four main directions for future work: data, algorithms, tools for human mathematicians, and code generation/verification.

Song also emphasised the importance of benchmarks, and the paper proposes a new taxonomy inspired by the autonomy levels for self-driving cars. For theorem proving, the proposed system ranges from level 0, “checking formal proofs”, to level 5, “solving problems and discovering new math beyond the human level”. Level 3 is the highest level covered by existing evaluations.

Similar taxonomies are defined for informal reasoning, autoformalisation, conjecturing, and code generation and verification.
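
To make the structure concrete, here is a minimal, hypothetical Python encoding of the theorem-proving levels described above. Only the endpoints quoted in the talk are reproduced; the intermediate levels are defined in the position paper and not filled in here.

```python
# Hypothetical encoding of the proposed autonomy levels for theorem proving.
# Only the endpoints quoted in the talk are reproduced; levels 1, 2, and 4
# are defined in the position paper and omitted here.
THEOREM_PROVING_LEVELS = {
    0: "checking formal proofs",
    5: "solving problems and discovering new math beyond the human level",
}

# Existing evaluations cover at most level 3.
HIGHEST_EVALUATED_LEVEL = 3
```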

The second invited talk was given by Apple’s Samy Bengio.

Bengio’s talk focused on one question: can transformers learn deductive reasoning?

A speaker with an Apple logo on his t-shirt addresses the audience from behind a lectern.

Samy Bengio

He gave an overview of globality degree, a new measure proposed by his lab in a NeurIPS paper this year, which aims to capture how easily transformers can learn a given task.

This research showed that transformers trained with SGD struggle to learn tasks with a high globality degree, and that naive approaches such as “agnostic scratchpads” (general meta-generation techniques, like chain-of-thought, that are not designed with these tasks in mind) do not overcome this barrier.
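
For intuition, here is a toy check of why a task like parity is "global" (parity is our illustrative example here, not necessarily the paper's formal construction): every strict subset of the input bits is statistically independent of the answer, so there is no weak local signal for SGD to bootstrap from.

```python
import itertools

# Toy illustration of "globality" on the parity task (an assumed example,
# not the paper's formal definition): no strict subset of the input bits
# carries any information about the full parity.
n = 8
inputs = list(itertools.product([0, 1], repeat=n))

def agreement(subset):
    # Fraction of inputs where the parity of `subset` matches the full parity.
    hits = sum((sum(x[i] for i in subset) % 2) == (sum(x) % 2) for x in inputs)
    return hits / len(inputs)

print(agreement(range(4)))  # 0.5: a strict subset is pure noise
print(agreement(range(n)))  # 1.0: only the full input determines the answer
```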

To address this, the paper introduced inductive scratchpads, a new approach that can help transformers learn tasks with a high globality degree.
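
As a loose sketch of the idea (our own hypothetical illustration; the paper's exact format and training setup differ): rather than asking the model to jump straight to the answer, an inductive scratchpad trains it to emit a small recurring state after each input token, turning one global prediction into a sequence of local ones. For parity, a training target might look like:

```python
# A loose, hypothetical sketch of an inductive-scratchpad-style training
# target for parity: the model emits a small state after each token, so
# every step depends only on (previous state, next bit) -- a local update.
def inductive_scratchpad(bits):
    state = 0
    steps = []
    for b in bits:
        state ^= b  # local update: depends only on the state and one new bit
        steps.append(f"state={state}")
    return " ".join(steps) + f" ; answer={state}"

print(inductive_scratchpad([1, 0, 1, 1]))
# -> "state=1 state=1 state=0 state=1 ; answer=1"
```

Each generation step then depends only on the previous state and one new token, so the per-step prediction has low globality, which is what makes the task learnable in the paper's framework.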