LmCast :: Stay tuned in

ATLAS: Autoformalized Textbook Library At Scale

Recorded: May 29, 2026, 6:03 p.m.

Original Summarized

GitHub - facebookresearch/atlas-lean: ATLAS Autoformalized Textbook Library At Scale · GitHub

Skip to content

Navigation Menu

Toggle navigation

Sign in

Appearance settings

PlatformAI CODE CREATIONGitHub CopilotWrite better code with AIGitHub SparkBuild and deploy intelligent appsGitHub ModelsManage and compare promptsMCP RegistryNewIntegrate external toolsDEVELOPER WORKFLOWSActionsAutomate any workflowCodespacesInstant dev environmentsIssuesPlan and track workCode ReviewManage code changesAPPLICATION SECURITYGitHub Advanced SecurityFind and fix vulnerabilitiesCode securitySecure your code as you buildSecret protectionStop leaks before they startEXPLOREWhy GitHubDocumentationBlogChangelogMarketplaceView all featuresSolutionsBY COMPANY SIZEEnterprisesSmall and medium teamsStartupsNonprofitsBY USE CASEApp ModernizationDevSecOpsDevOpsCI/CDView all use casesBY INDUSTRYHealthcareFinancial servicesManufacturingGovernmentView all industriesView all solutionsResourcesEXPLORE BY TOPICAISoftware DevelopmentDevOpsSecurityView all topicsEXPLORE BY TYPECustomer storiesEvents & webinarsEbooks & reportsBusiness insightsGitHub SkillsSUPPORT & SERVICESDocumentationCustomer supportCommunity forumTrust centerPartnersView all resourcesOpen SourceCOMMUNITYGitHub SponsorsFund open source developersPROGRAMSSecurity LabMaintainer CommunityAcceleratorGitHub StarsArchive ProgramREPOSITORIESTopicsTrendingCollectionsEnterpriseENTERPRISE SOLUTIONSEnterprise platformAI-powered developer platformAVAILABLE ADD-ONSGitHub Advanced SecurityEnterprise-grade security featuresCopilot for BusinessEnterprise-grade AI featuresPremium SupportEnterprise-grade 24/7 supportPricing

Search or jump to...

Search code, repositories, users, issues, pull requests...

Search

Clear

Search syntax tips

Provide feedback


We read every piece of feedback, and take your input very seriously.

Include my email address so I can be contacted

Cancel

Submit feedback

Saved searches

Use saved searches to filter your results more quickly

Name

Query

To see all available qualifiers, see our documentation.

Cancel

Create saved search

Sign in

Sign up

Appearance settings

Resetting focus

You signed in with another tab or window. Reload to refresh your session.
You signed out in another tab or window. Reload to refresh your session.
You switched accounts on another tab or window. Reload to refresh your session.

Dismiss alert

facebookresearch

/

atlas-lean

Public

Notifications
You must be signed in to change notification settings

Fork
8

Star
87

Code

Issues
0

Pull requests
1

Actions

Projects

Security and quality
0

Insights

Additional navigation options

Code

Issues

Pull requests

Actions

Projects

Security and quality

Insights


facebookresearch/atlas-lean

 mainBranchesTagsGo to fileCodeOpen more actions menuFolders and filesNameNameLast commit messageLast commit dateLatest commit History3 Commits3 CommitsAtlasAtlas  assetsassets  .gitignore.gitignore  Atlas.leanAtlas.lean  CODE_OF_CONDUCT.mdCODE_OF_CONDUCT.md  CONTRIBUTING.mdCONTRIBUTING.md  LICENSELICENSE  README.mdREADME.md  formalizing_mathematics_at_scale.pdfformalizing_mathematics_at_scale.pdf  lake-manifest.jsonlake-manifest.json  lakefile.tomllakefile.toml  lean-toolchainlean-toolchain  View all filesRepository files navigationREADMECode of conductContributingLicenseSecurity

ATLAS - Autoformalized Textbook Library At Scale
ATLAS is a Lean 4 library of textbook mathematics autoformalized by LLMs:
informal statements and proofs translated into Lean code. It draws from
undergraduate and graduate textbooks across analysis, algebra, geometry,
topology, combinatorics, probability, statistics, PDEs, number theory, and
theoretical computer science.
The goal of ATLAS is to provide reusable formal building blocks for future
human- and machine-driven Lean formalization. This is an active effort: we are
continuing to scale to more sources, curate the generated material, improve
coverage and maintainability, and move the library closer to Mathlib
conventions.
ATLAS was generated with
AutoformBot, our
autoformalization pipeline.
Links

Companion paper: Formalizing Mathematics at Scale
Formalization harness: https://github.com/facebookresearch/autoform-bot
Earlier related work: Formalization of Algebraic Combinatorics

Library Data
Each book directory under Atlas/ contains:

Lean source files for the generated definitions, statements, and proofs.
targets.yaml, listing the textbook statements selected for
formalization.
report.json, containing automated evaluation results for matched Lean
declarations, including faithfulness, proof-integrity, and code-quality
scores.

Visualizer
A visualizer is provided at https://rammalahmad.github.io/atlas/. It allows
users to browse ATLAS, compare informal
statements with their Lean formalizations, inspect logical dependency graphs
between results, and extract the Lean code needed to state a selected theorem.

Status and Contributions
ATLAS is an ongoing, machine-generated extension effort rather than a finished
product. We are actively working on scaling the corpus, curating
the generated code, formalizing remaining statements, and improving idiomatic
Mathlib reuse. External contributions are welcome!
To build the full library with the pinned Lean and Mathlib versions, run:
lake build
Statistics (May 2026)
26 books · 630,999 lines of code · 483,917 lines of Lean code (excl. comments/blanks) · 46,203 declarations · 42,837 proved (92.7%) · 2,855 / 4,007 statements formalized (71.3%) · 183,157M tokens

Book
Target Statements
Formalized
% Formalized
Lines of Code
Lines of Lean
Declarations
Proved
% Proved
Tokens (M)

AlgebraNotes
176
151
85.8%
5,037
4,409
274
261
95.3%
1,962.99

AlgebraicCombinatorics
39
37
94.9%
10,695
9,343
737
734
99.6%
1,440.73

AlgebraicGeometryI
186
112
60.2%
40,678
27,393
4,499
4,210
93.6%
7,629.26

AlgebraicTopologyI
171
110
64.3%
29,154
20,142
2,416
2,063
85.4%
10,323.27

AnAlgorithmistsToolkit
158
131
82.9%
9,656
8,234
712
668
93.8%
2,004.00

ArithmeticGeometry
335
266
79.4%
39,257
29,573
3,047
2,861
93.9%
11,100.62

BooleanFunctions
108
44
40.7%
9,516
7,949
667
614
92.1%
2,327.49

Buildings
74
44
59.5%
64,383
48,809
4,345
4,247
97.7%
20,442.93

CombinatorialOptimization
36
22
61.1%
8,908
7,934
428
414
96.7%
2,475.65

ComplexVariables
38
37
97.4%
7,231
6,225
285
280
98.2%
1,250.91

DifferentialAnalysis
113
88
77.9%
31,302
23,713
1,634
1,506
92.2%
11,743.27

DifferentialGeometry
147
112
76.2%
10,592
8,942
888
781
88.0%
1,933.97

EllipticCurves
360
212
58.9%
32,819
22,316
3,483
2,981
85.6%
11,058.00

FourierAnalysis
38
34
89.5%
7,943
6,671
373
359
96.2%
1,185.90

GeometryOfManifolds
72
40
55.6%
22,686
16,408
3,251
3,098
95.3%
6,864.93

HighDimensionalStatistics
73
65
89.0%
39,656
31,715
1,564
1,518
97.1%
975.36

IntroductionToFunctionalAnalysis
72
68
94.4%
2,709
2,006
113
109
96.5%
553.64

IntroductionToPartialDifferentialEquations
105
86
81.9%
27,666
20,740
1,585
1,414
89.2%
2,972.23

LieGroups
185
74
40.0%
60,285
50,594
4,219
3,814
90.4%
45,384.33

NumberTheoryI
576
460
79.9%
64,958
54,760
3,764
3,591
95.4%
15,424.36

ProbabilisticMethodsInCombinatorics
210
109
51.9%
20,555
15,604
1,272
1,089
85.6%
2,720.15

ProjectionTheory
111
73
65.8%
13,357
9,672
979
871
89.0%
2,678.00

RealAnalysis
177
175
98.9%
2,886
2,224
149
147
98.7%
585.64

TensorCategories
229
137
59.8%
42,812
29,729
3,373
3,176
94.2%
11,338.45

TheoryOfComputation
118
84
71.2%
15,094
10,581
1,553
1,482
95.4%
3,580.36

TheoryOfProbability
100
84
84.0%
11,164
8,231
593
549
92.6%
3,200.61

Total
4,007
2,855
71.3%
630,999
483,917
46,203
42,837
92.7%
183,157

Contributors
The initial ATLAS effort was led by Ahmad Rammal, Niket Patel, Fabian Gloeckle, Amaury Hayat, Julia Kempe, Remi Munos, Charles Arnal, and Vivien Cabannes.
Citation
If you find this work useful, please cite our paper:
@misc{rammal2026formalizingmathematicsscale,
title={Formalizing Mathematics at Scale},
author={Ahmad Rammal and Niket Patel and Fabian Gloeckle and Amaury Hayat and Julia Kempe and Remi Munos and Charles Arnal and Vivien Cabannes},
year={2026},
eprint={2605.29955},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2605.29955},
}

About

ATLAS Autoformalized Textbook Library At Scale

Resources

Readme

License

View license

Code of conduct

Code of conduct

Contributing

Contributing

Security policy

Security policy

Uh oh!

There was an error while loading. Please reload this page.


Activity

Custom properties
Stars

87
stars
Watchers

0
watching
Forks

8
forks

Report repository

Releases
No releases published

Packages
0

 

 

 

Uh oh!

There was an error while loading. Please reload this page.


Uh oh!

There was an error while loading. Please reload this page.


Contributors

Uh oh!

There was an error while loading. Please reload this page.


Languages

Lean
100.0%

Footer

© 2026 GitHub, Inc.

Footer navigation

Terms

Privacy

Security

Status

Community

Docs

Contact

Manage cookies

Do not share my personal information

You can’t perform that action at this time.

ATLAS represents a large-scale library of textbook mathematics that has been autoformalized using Large Language Models, built upon the Lean 4 programming language. The project aims to create reusable formal building blocks to facilitate future formalization efforts driven by both human and machine intelligence. This effort draws foundational knowledge from undergraduate and graduate textbooks spanning diverse mathematical disciplines, including analysis, algebra, geometry, topology, combinatorics, probability, statistics, partial differential equations, number theory, and theoretical computer science. The overarching objective of ATLAS is to serve as an active effort to scale the corpus, curate the generated mathematical material, and improve formalization coverage and maintainability, moving the library closer to the conventions established in Mathlib.

The formalization process was executed through an autoformalization pipeline, AutoformBot. The structure of the repository organizes content by book, with each book directory containing the generated Lean source files for definitions, statements, and proofs, along with targets.yaml detailing the specific textbook statements selected for formalization. Furthermore, the project includes report.json files that contain automated evaluation results for the matched Lean declarations, quantifying metrics such as faithfulness, proof integrity, and code quality scores. To enhance user interaction with this complex body of work, a visualizer has been developed, allowing users to browse the library, compare informal statements with their Lean formalizations, inspect logical dependency graphs between results, and extract the necessary Lean code for selected theorems.

The project demonstrates significant scaling, having formalized content across 26 books, accumulating 630,999 lines of code and 483,917 lines of Lean code. Of the formalized material, over 46,203 declarations have been recorded, with a high rate of proof success, achieving 92.7% formalization. Specific explorations into various mathematical areas show successful formalization; for instance, the AlgebraicCombinatorics book achieved a 99.6% proved rate, while RealAnalysis reached 98.9% proof success. The work incorporates a wide range of mathematical topics, including the formalization of Lie Groups, Differential Geometry, Tensor Categories, and Theory of Probability, highlighting a comprehensive approach to scaling mathematical knowledge.

The foundational effort for ATLAS was driven by a collaborative group of contributors, including Ahmad Rammal, Niket Patel, Fabian Gloeckle, Amaury Hayat, Julia Kempe, Remi Munos, Charles Arnal, and Vivien Cabannes. This work is referenced in the companion paper Formalizing Mathematics at Scale. The project is positioned not as a complete, finished product but as an ongoing, machine-generated extension, emphasizing continuous work on expanding the corpus, refining generated code, and improving idiomatic reuse within the Lean ecosystem.