Epoch confirms GPT5.4 Pro solved a frontier math open problem
Recorded: March 25, 2026, 3 a.m.
| Original | Summarized |
A Ramsey-style Problem on Hypergraphs | Epoch AI
Latest Our Work Research & Commentary AI Trends & Statistics Key numbers on the significant growth and impact of artificial intelligence. Papers & Reports Research and analysis on the trajectory of AI development. Newsletter: Gradient Updates Expert commentary on the important questions in AI today. More Data Insights Podcast: Epoch After Hours See the latest Data on AI Datasets Models Database of AI systems tracked by training compute, org, and release date. Frontier Data Centers AI data centers tracked via satellite imagery and permit data. Hardware ML accelerator performance, efficiency, and pricing across chip generations. Companies Data on key organizations and labs driving AI development worldwide. Chip Sales AI chip shipment volumes and revenue tracked across major vendors over time. Polling on Usage Survey data on how individuals and organizations are engaging with AI. See all data on AI We also track AI model performance across 40+ benchmarks. AI Capabilities Benchmarking Benchmarking Data AI Capabilities Track and compare frontier model performance across 40+ evaluations. By Epoch AI FrontierMath A benchmark of extremely challenging math problems. About Contact
LatestOur Work Research & Commentary FrontierMath Open Problems A Ramsey-style Problem on Hypergraphs A Ramsey-style Problem on Hypergraphs Construct hypergraphs as large as possible that do not have a certain easy-to-check, difficult-to-find property. Solved Combinatorics Moderately interesting Contributor: Will Brian Associate Professor, UNC Charlotte Full write-up Contents About the problem Attempts by AI AI Prompts Mathematician survey About the problem About the problem Attempts by AI AI Prompts Mathematician survey Solution Update: This problem has been solved! A solution was first elicited by Kevin Barreto and Liam Price, using GPT-5.4 Pro. This solution was confirmed by problem contributor Will Brian, and will be written up for publication. A full transcript of the original conversation with GPT-5.4 Pro can be found here and GPT-5.4 Pro’s write-up from the end of that transcript can be found here. Copy A hypergraph (V, H) is said to contain a partition of size n if there is some D ⊆ V and P ⊆ H such that |D| = n and every member of D is contained in exactly one member of P. Find a hypergraph (V, H) with no isolated vertices such that |V| ≥ 64, |H| ≤ 20, and (V, H) contains no partitions of size > 20. Output the hypergraph as a string where vertices are labeled, 1, ..., |V|, and edges are denoted with curly braces. Example: {1,2,3},{2,4},{3,4,5},{1,5} Single challenge Copy A hypergraph (V, H) is said to contain a partition of size n if there is some D ⊆ V and P ⊆ H such that |D| = n and every member of D is contained in exactly one member of P. Find a hypergraph (V, H) with no isolated vertices such that |V| ≥ 66, |H| ≤ 20, and (V, H) contains no partitions of size > 20. Output the hypergraph as a string where vertices are labeled, 1, ..., |V|, and edges are denoted with curly braces. Example: {1,2,3},{2,4},{3,4,5},{1,5} Full problem Copy A hypergraph (V, H) is said to contain a partition of size n if there is some D ⊆ V and P ⊆ H such that |D| = n and every member of D is contained in exactly one member of P. Define H(n) to be the largest integer k such that there is a hypergraph (V, H) with |V| = k having no isolated vertices and containing no partitions of size greater than n. It is known that H(n) ≥ k_n, where k_n is defined recursively by the formula k_1 = 1 and k_n = ⌊n/2⌋ + k_⌊n/2⌋ + k_⌊(n+1)/2⌋. Your task is to improve this lower bound by a constant factor, i.e. show that H(n) ≥ c*k_n for some c > 1. It is acceptable if this improvement does not work for small n, but it must already be "in effect" for n=15. You must demonstrate this improvement by providing an algorithm that takes n as input and produces a hypergraph witnessing H(n) ≥ c * k_n. Please provide an algorithm that takes n as input and outputs the witness hypergraph as a string where vertices are labeled, 1, ..., |V|, and edges are denoted with curly braces. Example: {1,2,3},{2,4},{3,4,5},{1,5} Solution format: Number of mathematicians highly familiar with the problem:a majority of those working on a specialized topic (≈10) Number of mathematicians who have made a serious attempt to solve the problem:5–10 Rough guess of how long it would take an expert human to solve the problem:1–3 months Notability of a solution:moderately interesting A solution would be published:in a standard specialty journal Likelihood of a solution generating more interesting math:fairly likely: the problem is rich enough that most solutions should open new avenues Probability that the problem is solvable as stated:95-99% Cite Epoch AI’s work is free to use, distribute, and reproduce provided the source and authors are credited under the Creative Commons Attribution license.CitationEpoch AI (2026), "A Ramsey-style Problem on Hypergraphs". Published online at epoch.ai. Retrieved from 'https://epoch.ai/frontiermath/open-problems/ramsey-hypergraphs' [online resource]. Accessed 25 Mar 2026.BibTeX Citation@misc{epoch2026, FeedbackFeedback Have a question? Noticed something wrong? Let us know.MessageIf you would like a reply, please include your name and email address.NameEmail addressCancelSubmit A Ramsey-style Problem on Hypergraphs Construct hypergraphs as large as possible that do not have a certain easy-to-check, difficult-to-find property. |
This document details a problem presented by Epoch AI, specifically a Ramsey-style problem focused on hypergraphs, designed to challenge and evaluate AI’s capacity for combinatorial reasoning. The core of the problem revolves around constructing hypergraphs—structures composed of vertices and edges—with specific constraints. The goal is to maximize the size of such a hypergraph while simultaneously preventing the existence of partitions of a certain size (n) within the hypergraph. The problem explicitly frames this as a lower bound calculation of H(n), defined as the largest integer k such that a hypergraph with k vertices can exist without containing partitions larger than n. The initial lower bounds for H(n) are deemed suboptimal, prompting an investigation into finding more effective constructions. The problem is structured into three distinct phases: a “warm-up” variant with a known solution for smaller values of ‘n’, a “single challenge” presenting a more complex scenario, and a “full problem” requiring a general algorithm applicable to all values of ‘n’. The AI models evaluated on this problem, including GPT-5.4 Pro and Opus 4.6, successfully solved the “single challenge” and “full problem” variants. A solution was initially elicited by Kevin Barreto and Liam Price using GPT-5.4 Pro, and subsequently confirmed by Will Brian, a contributor to the project and Associate Professor at UNC Charlotte. Brian highlights the interesting aspect of the AI’s approach, noting its efficiency and mirroring of the upper-bound construction technique, improving upon previously known lower bounds. The solution represents a key element of Epoch AI’s "FrontierMath" initiative, which utilizes AI to tackle challenging mathematical problems and explore potential avenues for advancement. Following the initial solution, Epoch AI developed a general scaffold for testing models on FrontierMath Open Problems. Several other models, including Gemini 3.1 Pro and GPT-5.4 (xhigh), were also able to solve the problem within this framework. The problem is formally defined using hypergraph terminology, clearly outlining the construction and constraints involved (vertices, edges, partitions, and isolated vertices). The recursive definition of k_n, a crucial component of the lower bound calculation, is provided. The expected output format is specified as a string representing the hypergraph, using labeled vertices and curly braces to denote edges. A Python script function is required to implement the solution, with a time constraint of 10 minutes for n ≤ 100 on a typical laptop. A mathematician survey provides context regarding the complexity and potential solution time for a human expert. The survey suggests that approximately 10 specialists would be familiar with the problem, 5-10 would attempt it seriously, and an expert would likely require 1-3 months to solve it. The solution is deemed moderately interesting, with a high probability (95-99%) of being solvable as stated, and a fair likelihood of generating further interesting mathematical exploration. The problem's solution will likely be published in a standard specialty journal. The documentation concludes with details on Epoch AI’s licensing terms for the work, utilizing a Creative Commons Attribution license. Finally, the document provides links to Epoch AI's various data resources, including AI models and performance metrics, and highlights their overall mission of investigating the trajectory of AI for societal benefit. |