Show HN: Nano PDF – A CLI Tool to Edit PDFs with Gemini's Nano Banana
Recorded: Nov. 30, 2025, 1:05 a.m.
| Original | Summarized |
GitHub - gavrielc/Nano-PDF: Edit PDF files with Nano Banana
gavrielc/Nano-PDF
Repository (main branch, 7 commits): .github, assets, nano_pdf, .gitignore, CHANGELOG.md, CODE_OF_CONDUCT.md, CONTRIBUTING.md, LICENSE, README.md, pyproject.toml
Nano PDF Editor
A CLI tool to edit PDF slides using natural language prompts, powered by Google's Gemini 3 Pro Image ("Nano Banana") model.
Natural Language Editing: "Update the graph to include data from 2025", "Change the chart to a bar graph".
How It Works
Page Rendering: Converts target PDF pages to images using Poppler
The tool processes multiple pages in parallel for speed, with configurable resolution (4K/2K/1K) to balance quality vs. cost.
Get an API key from Google AI Studio
export GEMINI_API_KEY="your_api_key_here"
# Add a slide after page 5
--use-context / --no-use-context: Include the full text of the PDF as context for the model. Disabled by default for edit, enabled by default for add. Use --no-use-context to disable.
Examples
# Disable Google Search if you want the model to only use provided context
Python 3.10+
System Dependencies
Use --resolution "2K" or --resolution "1K" for faster processing
Running from Source
# Install dependencies
# Run the tool
About: Edit PDF files with Nano Banana. MIT license. 107 stars, 1 watching, 5 forks. Contributors: gavrielc, claude. Languages: Python. |
Nano PDF is a CLI tool for editing PDF slides with natural language prompts, powered by Google’s Gemini 3 Pro Image model (nicknamed “Nano Banana”). It works in stages: page rendering via Poppler, optional style reference analysis, AI-driven generation, OCR re-hydration of text, and PDF stitching to preserve document integrity. Everything is driven through command-line arguments, and edits can be applied in batches. Supported operations include updating slide content, adding entirely new slides, and changing visual design, each specified by a natural language prompt. Style reference pages let the model mimic the look of the original slides, and multiple pages are processed concurrently, which reduces editing time. The tool requires a paid Google Gemini API key, so it is usable only by those with access to that service. Setup also requires Poppler, Tesseract, and Python 3.10+; troubleshooting usually means checking system dependencies, for example verifying that they can be found through the PATH environment variable. Options such as `--resolution` (4K, 2K, or 1K), `--use-context`, and `--disable-google-search` let users balance image quality against processing speed and control how much context the model receives. 
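The setup requirements above lend themselves to a quick pre-flight check. The following is a minimal sketch, not part of Nano PDF itself: the function name `missing_prerequisites` is hypothetical, and it assumes that Poppler is detectable through its `pdftoppm` executable and Tesseract through `tesseract`.

```python
import os
import shutil

# Assumption: Poppler is detected via its `pdftoppm` binary and Tesseract via
# `tesseract`. Nano PDF may locate its dependencies differently.
REQUIRED_BINARIES = ("pdftoppm", "tesseract")
API_KEY_VAR = "GEMINI_API_KEY"


def missing_prerequisites(env=None, which=shutil.which):
    """Return a list of setup problems; an empty list means nothing is missing.

    `env` and `which` are injectable so the check can be exercised without
    touching the real environment.
    """
    env = os.environ if env is None else env
    problems = []
    for binary in REQUIRED_BINARIES:
        # shutil.which returns None when the executable is not on PATH.
        if which(binary) is None:
            problems.append(f"{binary} not found on PATH")
    if not env.get(API_KEY_VAR):
        problems.append(f"{API_KEY_VAR} is not set")
    return problems
```

Calling `missing_prerequisites()` with no arguments inspects the current environment and returns one message per missing item.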
The repository also includes a Code of Conduct and contributing guidelines from the developer, gavrielc, aimed at keeping the community positive and collaborative. The tool’s architecture can be viewed as modular, with distinct components handling page rendering, style reference analysis, AI generation, and OCR re-hydration. This separation leaves room for future enhancements, such as better OCR accuracy or more sophisticated style reference matching. Although the documentation describes a streamlined workflow, actual performance depends heavily on the chosen parameters and on the quality of the underlying Google Gemini API. The “Troubleshooting” section covers common issues such as missing dependencies and API key errors, which underscores the need for careful setup and configuration. The developer intends to expand the tool’s capabilities, but for now it reads as a beta or development-stage tool that requires some expertise to operate well. |
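The modular, concurrent design described in the summary can be illustrated with a small sketch. This is not Nano PDF's actual code: every function name below is a hypothetical stand-in, the stages only return placeholder values, and the final PDF stitching step is left out. It shows one way per-page stages could be chained and run in parallel with Python's standard `concurrent.futures` module.

```python
from concurrent.futures import ThreadPoolExecutor

# All functions here are hypothetical stand-ins, not Nano PDF's real code.


def render_page(page_number):
    """Stand-in for rendering one PDF page to an image (the tool uses Poppler)."""
    return f"image-of-page-{page_number}"


def generate_edit(image, prompt):
    """Stand-in for the image-model call that applies the prompt to the page."""
    return f"{image} edited with '{prompt}'"


def rehydrate_text(edited_image):
    """Stand-in for the OCR step that restores a text layer on the edited page."""
    return {"image": edited_image, "text_layer": True}


def edit_page(page_number, prompt):
    """Run the per-page stages in order for a single page."""
    return rehydrate_text(generate_edit(render_page(page_number), prompt))


def edit_pages(edits, max_workers=4):
    """Process a {page_number: prompt} mapping concurrently.

    Returns a dict keyed by page number, so results can be placed back in
    the right position regardless of which page finishes first.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {page: pool.submit(edit_page, page, prompt)
                   for page, prompt in edits.items()}
        return {page: future.result() for page, future in futures.items()}
```

For example, `edit_pages({2: "make the title blue", 5: "use a bar chart"})` returns one result per requested page. Threads suit this shape of work because each page mostly waits on a network call; that is a design assumption of the sketch, not a statement about how the tool is implemented.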