Fara-7B: An efficient agentic model for computer use
Recorded: Nov. 27, 2025, 1:02 a.m.
Original
GitHub repository: microsoft/fara (MIT license)
Repository contents: autogen (submodule @ a9d2927), endpoint_configs, figures, scripts, src/fara, webeval, .gitattributes, .gitignore, .gitmodules, CODE_OF_CONDUCT.md, LICENSE, README.md, SECURITY.md, TRANSPARENCY_NOTE.md, pyproject.toml.

Overview
Fara-7B operates visually: it perceives webpages and takes actions such as scrolling, typing, and clicking on directly predicted coordinates. It is trained using a novel synthetic data generation pipeline built on the Magentic-One multi-agent framework, with 145K trajectories covering diverse websites, task types, and difficulty levels. The model is based on Qwen2.5-VL-7B and trained with supervised fine-tuning. Typical tasks include searching for information and summarizing results.

Performance Highlights
Table: Online agent evaluation results showing success rates (%) across four web benchmarks, averaged over 3 runs. Models compared: SoM agents (GPT-4o-0513, o3-mini, GPT-4o), GLM-4.1V-9B-Thinking, and computer-use models (OpenAI computer-use-preview, UI-TARS-1.5-7B, Fara-7B).
Table: Breakdown of WebTailBench results across all 11 segments, with success rates (%) averaged over 3 independent runs. Segments cover single-site tasks (Shopping, Flights, Hotels, Restaurants, Activities, Ticketing, Real Estate, Jobs/Careers) and multi-step tasks (Shopping List (2 items), Comparison Shopping, Compositional Tasks), plus overall macro and micro averages. Fara-7B achieves the highest performance among computer-use models across all task categories.
Task verification pipeline for LLM-as-a-judge evaluation.

Evaluation Infrastructure
Playwright - a cross-browser automation framework that replicates browser environments.

Note: Fara-7B is an experimental release designed to invite hands-on exploration and feedback from the community. We recommend running it in a sandboxed environment, monitoring its execution, and avoiding sensitive data or high-risk domains.

Installation
Set up the environment and install the package (a hedged setup sketch follows the run example below).

Hosting the Model
Deploy the Fara-7B model on Azure Foundry and obtain your endpoint URL and API key, then edit one of the existing config files under endpoint_configs or create a new one.

Run the Fara agent:
fara-cli --task "how many pages does wikipedia have" --start_page "https://www.bing.com"
Thought #1: To find the current number of Wikipedia pages, I'll search for the latest Wikipedia page count statistics.
Thought #2: Wikipedia currently has 7,095,446 articles.
Final Answer: Wikipedia currently has 7,095,446 articles.
Enter another task (or press Enter to exit):
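A minimal setup sketch for the Installation step above. The virtual-environment and pip commands are assumptions based on the presence of pyproject.toml rather than the repository's verbatim instructions; only the fara-cli invocation is taken from the README.

```bash
# Sketch only: assumes an editable install driven by pyproject.toml.
python -m venv .venv && source .venv/bin/activate
pip install -e .                  # install the fara package and its dependencies
playwright install chromium       # fetch a browser binary for Playwright

# Documented example from the README: run a task starting from Bing.
fara-cli --task "how many pages does wikipedia have" --start_page "https://www.bing.com"
```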
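For the hosting step, the endpoint details would go into a file under endpoint_configs. The file name and field names below are hypothetical placeholders for illustration, not the repository's actual schema.

```bash
# Hypothetical endpoint config (illustrative field names only): point the
# agent at an Azure Foundry deployment of Fara-7B.
cat > endpoint_configs/my_endpoint.json <<'EOF'
{
  "endpoint_url": "https://<your-foundry-endpoint>/v1",
  "api_key": "<your-api-key>",
  "model": "Fara-7B"
}
EOF
```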
Reproducibility
- Removed ~48 impossible tasks from the original WebVoyager benchmark.
- Environment error handling: trajectories are retried up to 5 times when environment errors occur.
- Step budget: runs that exceed the step budget are recorded as <no_answer>.

Running the evaluation requires installing the fara package, the autogen submodule, webeval, and Playwright (a hedged sketch appears at the end of this extract). We use the same LLM-as-a-judge prompts and model (GPT-4o) as WebVoyager, hence the --eval_oai_config argument.

Analyzing Evaluation Results
Results are organized by model name. Example path:
Each evaluation folder contains:
- gpt_eval/ - LLM-as-a-judge evaluation results
- final_answer.json (e.g., Amazon--1_final_answer.json) - <no_answer> indicates the task was aborted or the step budget was exceeded

Running Analysis
The analysis identifies trajectories that were aborted mid-execution and the diagnostic reasons. To re-run failed tasks, execute the evaluation script again with the same run_id and username; it will skip non-aborted tasks.

Example WebVoyager GPT Eval Result

Citation
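A hedged sketch of the evaluation setup implied by the four install steps listed under Reproducibility above; the package paths are assumptions, not the repository's verbatim commands.

```bash
# Install the evaluation stack (paths assumed).
pip install -e .                           # install fara package
git submodule update --init --recursive    # install autogen submodule
pip install -e ./webeval                   # install webeval (path assumed)
playwright install                         # install Playwright browsers
```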
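A quick post-run check based on the folder layout described under Analyzing Evaluation Results; the results directory name is a placeholder, since the example path was not captured.

```bash
# Count tasks whose final answer is <no_answer>, i.e. aborted or over the
# step budget. RUN_DIR is a placeholder for the evaluation output folder.
RUN_DIR="results/my_run"
grep -l '<no_answer>' "$RUN_DIR"/*_final_answer.json | wc -l
```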
Summarized

Fara-7B is Microsoft's first agentic small language model (SLM) designed specifically for computer use. With only 7 billion parameters, it is an ultra-compact Computer Use Agent (CUA) that achieves state-of-the-art performance within its size class and competitively matches larger, more resource-intensive agentic systems. Fara-7B operates computer interfaces (mouse and keyboard) to perform multi-step tasks on behalf of users, mirroring how humans interact with technology. Unlike traditional chat models, it directly perceives webpages and takes actions such as scrolling, typing, and clicking, relying on the same modalities as humans. This design enables on-device deployment, reducing latency and keeping user data local for better privacy. Fara-7B averages only about 16 steps per task, compared with roughly 41 steps for comparable models, demonstrating efficient task completion.

The model is trained with a novel synthetic data generation pipeline built on the Magentic-One multi-agent framework, leveraging 145K trajectories that cover diverse websites, task types, and levels of difficulty. It is based on Qwen2.5-VL-7B and refined through supervised fine-tuning.

Key Capabilities and Performance Highlights: Fara-7B can automate typical everyday web tasks, including searching for information and summarizing results, filling out forms and managing accounts, booking travel, movie tickets, and restaurant reservations, and shopping and comparing prices across retailers. It achieved state-of-the-art results across multiple web agent benchmarks, surpassing comparable-sized models and larger systems: on WebVoyager, Online-Mind2Web, and DeepShop it reached success rates of 73.5%, 34.1%, and 26.2%, respectively, surpassing many larger agents. Fara-7B was also evaluated on WebTailBench, a new benchmark focusing on 11 real-world task types that are underrepresented or missing in existing benchmarks. WebTailBench includes 609 tasks across diverse categories; the first 8 segments test single skills or objectives, and the remaining 3 evaluate more difficult multi-step or cross-site tasks. The success of Fara-7B is largely due to its efficiency and its ability to complete tasks accurately, as demonstrated in the WebTailBench evaluation, where it consistently outperformed other computer-use models, particularly on multi-step tasks, achieving success rates above 30% in complex scenarios.

The evaluation framework, dubbed WebEval, incorporates several elements to ensure reproducible and reliable results: Browserbase integration for robust browser session management, time-sensitive task updates to keep benchmarks achievable, and strict error handling, including retries for environment errors and step budgets that keep trajectories bounded. A comprehensive analysis of results is supported through an analysis notebook.

The research team provides detailed documentation of the deployment methodology, including instructions for running the model locally with vLLM or hosting it on Azure Foundry, setting up environments, and executing test scenarios. The data generation and evaluation processes are thoroughly described, offering clear insight into the development and validation of Fara-7B.
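The summary mentions running the model locally with vLLM; a minimal hosting sketch follows, assuming the weights are published under a Hugging Face id such as microsoft/Fara-7B (take the exact id from the repository docs).

```bash
# Local hosting sketch with vLLM (model id assumed, not verified here).
pip install vllm
vllm serve microsoft/Fara-7B --port 8000
# The OpenAI-compatible endpoint at http://localhost:8000/v1 can then be
# referenced from an endpoint config, as in the Azure Foundry hosting step.
```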