LmCast :: Stay tuned in

Anthropic's original take-home assignment open-sourced

Recorded: Jan. 21, 2026, 11:03 a.m.

Original Summarized

GitHub - anthropics/original_performance_takehome: Anthropic's original performance take-home, now open for you to try!

anthropics/original_performance_takehome

Repository files: tests/, .gitignore, Readme.md, perf_takehome.py, problem.py, watch_trace.html, watch_trace.py

README: Anthropic's Original Performance Take-Home
This repo contains a version of Anthropic's original performance take-home, before Claude Opus 4.5 started doing better than humans given only 2 hours.
Now you can try to beat Claude Opus 4.5 given unlimited time!
Performance benchmarks, measured in clock cycles from the simulated machine:

2164 cycles: Claude Opus 4 after many hours in the test-time compute harness
1790 cycles: Claude Opus 4.5 in a casual Claude Code session, approximately matching the best human performance in 2 hours
1579 cycles: Claude Opus 4.5 after 2 hours in our test-time compute harness
1548 cycles: Claude Sonnet 4.5 after many more than 2 hours of test-time compute
1487 cycles: Claude Opus 4.5 after 11.5 hours in the harness
1363 cycles: Claude Opus 4.5 in an improved test time compute harness

If you optimize below 1487 cycles, beating Claude Opus 4.5's best performance at launch, email us at performance-recruiting@anthropic.com with your code (and ideally a resume) so we can be appropriately impressed and perhaps discuss interviewing.
Run python tests/submission_tests.py to see which thresholds you pass.
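
From the repository root, that is:

```
# Reports which of the benchmark thresholds above your solution passes.
python tests/submission_tests.py
```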

About

Anthropic's original performance take-home, now open for you to try!

605 stars · 92 forks · 1 watching

Languages: Python 88.7%, HTML 11.3%

This GitHub repository, "original_performance_takehome" from Anthropic, presents a challenge: optimize a program so that a simulated machine completes it in as few clock cycles as possible, ideally beating the scores Claude Opus 4.5 posted at its launch. The assignment is the performance take-home Anthropic used before Claude Opus 4.5 started doing better than human candidates given only two hours. The objective is to minimize the number of clock cycles the simulated machine needs to solve the underlying problem, which the page above does not describe in detail but which is presumably a computationally intensive task.

The repository's README details the performance benchmarks used for comparison; these cycle counts are the bar against which submitted solutions are judged, and they record several Claude models under varying amounts of test-time compute. Claude Opus 4 reached 2164 cycles after many hours in Anthropic's test-time compute harness. Claude Opus 4.5 then hit 1790 cycles in a casual Claude Code session, roughly matching the best human performance within the two-hour timeframe, and 1579 cycles after two hours in the harness. Claude Sonnet 4.5 reached 1548 cycles after many more than two hours of test-time compute, Claude Opus 4.5 reached 1487 cycles after 11.5 hours in the harness, and an improved test-time compute harness brought Claude Opus 4.5 down to 1363 cycles.

The repository is a small, mostly Python codebase: `perf_takehome.py`, `problem.py`, `watch_trace.py`, and a single HTML file, `watch_trace.html`. The script `tests/submission_tests.py` checks whether submitted code clears the published cycle-count thresholds. The `watch_trace.html` and `watch_trace.py` pair suggests a way to visualize and inspect the program's execution trace, which would help locate bottlenecks to optimize, while `problem.py` likely defines the core algorithmic challenge itself.
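
The page above does not show what `tests/submission_tests.py` actually does, but the check described here amounts to comparing a measured cycle count against the published thresholds. Below is a minimal, purely illustrative sketch of that comparison: the threshold numbers come from the README list, the filename `check_thresholds.py` is hypothetical, and the real tests measure the cycle count on the simulated machine rather than taking it as a command-line argument.

```python
# check_thresholds.py -- hypothetical sketch, NOT the repo's actual test code.
# Compares a hand-supplied cycle count against the benchmarks from the README.
import sys

THRESHOLDS = [
    (2164, "Claude Opus 4 after many hours in the test-time compute harness"),
    (1790, "Claude Opus 4.5 in a casual Claude Code session"),
    (1579, "Claude Opus 4.5 after 2 hours in the harness"),
    (1548, "Claude Sonnet 4.5 after many more than 2 hours of test-time compute"),
    (1487, "Claude Opus 4.5 after 11.5 hours in the harness"),
    (1363, "Claude Opus 4.5 in an improved test-time compute harness"),
]

def report(cycles: int) -> None:
    """Print which published benchmark cycle counts the given result beats."""
    for threshold, label in THRESHOLDS:
        verdict = "beats" if cycles < threshold else "does not yet beat"
        print(f"{cycles} cycles {verdict} {threshold} -- {label}")

if __name__ == "__main__":
    # Usage: python check_thresholds.py <measured_cycle_count>
    report(int(sys.argv[1]))
```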

The stated goal of getting below 1487 cycles, Claude Opus 4.5's best performance at launch, is a significant hurdle given how strong the model results above already are. Anthropic's recruiting team offers a direct path for anyone who clears it: email your code (and ideally a resume) to performance-recruiting@anthropic.com so they can be appropriately impressed and perhaps discuss interviewing. The project reads less as a typical development exercise than as a competitive performance evaluation, with defined benchmarks and a clear route for follow-up, and it signals that Anthropic is looking for solutions that demonstrate a sophisticated grasp of computational efficiency and algorithm design as a way to identify exceptional talent in the AI development community.