Artificial Intelligence · Bearish · Impact 6/10
ComBench: A Benchmark for Rigorous Proof Reasoning and Constructive Realization in Olympiad-Level Combinatorics
cs.AI updates on arXiv.org
AI Analysis
ComBench is a new benchmark designed to evaluate the combinatorial reasoning abilities of large language models, particularly in Olympiad-level mathematics. Current models show significant gaps in creative mathematical reasoning, with the strongest achieving only moderate performance on the benchmark's problems.
Key Topics
ComBench, GPT-5.5, Kimi-K2.6
Originally reported by cs.AI updates on arXiv.org.