Kaggle’s talk included an announcement of an exciting new million-dollar prize for AI software engineering agents.
Kaggle CEO D. Sculley kicked off the session with a primer on empirical rigour, covering what happens when assumptions about data are violated in practice, and how to deal with concerns around leakage and contamination.
After reviewing the pros and cons of static benchmarks and community leaderboards for comparing methods, he explained some of the ways Kaggle has mitigated these issues in its competitions.
One extremely effective mitigation: requiring researchers (or competition participants) to submit their models before the test data is generated. In the CAFA 5 Protein Function Prediction competition, for example, this meant gathering predictions before measurements for the test set proteins were made in the lab.
Sculley was joined on stage first by Carlos Jimenez and John Yang of SWE-bench (a benchmark for automated code-generation systems measuring their ability to resolve GitHub issues), and then by Databricks cofounder Andy Konwinski.
After Jimenez and Yang explained SWE-bench, Konwinski announced the million-dollar challenge live on stage by posting a tweet.
For more details on the prize, see kprize.ai or the Kaggle competition page.