VSORA Jotunn-8 5nm European inference chip
Recorded: Nov. 28, 2025, 1:02 a.m.
Original:
Jotunn 8 - VSORA
Jotunn 8: The Ultimate AI Chip. Where efficiency meets innovation.
This is Jotunn 8: Introducing the World's Most Efficient AI Inference Chip

In modern data centers, success means deploying trained models with blistering speed, minimal cost, and effortless scalability. Designing and operating inference systems requires balancing key factors such as high throughput, low latency, optimized power consumption, and sustainable infrastructure. Achieving optimal performance while maintaining cost and energy efficiency is critical to meeting the growing demand for large-scale, real-time AI services across a variety of applications.

Unlock the full potential of your AI investments with our high-performance inference solutions. Engineered for speed, efficiency, and scalability, our platform ensures your AI models deliver maximum impact, at lower operational costs and with a commitment to sustainability. Whether you're scaling up deployments or optimizing existing infrastructure, we provide the technology and expertise to help you stay competitive and drive business growth. This is not just faster inference; it's a new foundation for AI at scale.
Ultra-low latency: critical for real-time applications like chatbots, fraud detection, and search.
Very high throughput: essential for high-demand services like recommendation engines or LLM APIs.
Cost efficient: AI inference often runs at massive scale, so reducing cost per inference is essential for business viability.
Power efficient: performance per watt matters because power is a major operational expense and carbon-footprint driver.
This is Jotunn 8: AI, Demystified and Delivered

In the world of AI data centers, speed, efficiency, and scale aren't optional; they're everything. Jotunn 8, our ultra-high-performance inference chip, is built to deploy trained models with lightning-fast throughput, minimal cost, and maximum scalability. Designed around what matters most (performance, cost-efficiency, and sustainability), it delivers the power to run AI at scale, without compromise.
Llama3 405B: Jotunn 8 Outperforms the Market. Why it matters: low latency is critical for real-time applications like chatbots, fraud detection, and search.
Different Models, Different Purposes, Same Hardware

Reasoning models, generative AI, and agentic AI are increasingly being combined to build more capable and reliable systems. Generative AI provides flexibility and language fluency, reasoning models provide rigor and correctness, and agentic frameworks provide autonomy and decision-making (see the sketch below). The VSORA architecture enables smooth and easy integration of these algorithms, providing near-theory performance.

| Type | Key Role | Strengths | Weaknesses |
| Reasoning models | Logical inference and problem-solving | Accuracy, consistency | Limited generalization, slow |
| LLMs / generative AI | Natural language generation and understanding | Versatile, broad, creative | Can hallucinate, lacks deep reasoning |
| Agentic AI | Goal-directed, autonomous action | Independence, planning, coordination | Still experimental, hard to align and control |

Cost Efficient: More Speed for the Buck. Why it matters: AI inference is often run at massive scale; reducing cost per inference is essential for business viability.
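As a purely illustrative sketch of that combination pattern (nothing here is a VSORA API; the three callables are hypothetical placeholders), an agentic loop often pairs a generative model's drafting with a reasoning model's verification:

```python
# Illustrative only: an agentic loop that drafts with a generative model
# and checks with a reasoning model. The callables are hypothetical
# placeholders, not a VSORA or vendor API.
from typing import Callable

def agent_loop(goal: str,
               generate: Callable[[str], str],  # generative model: drafts an answer
               verify: Callable[[str], bool],   # reasoning model: checks correctness
               max_steps: int = 3) -> str:
    draft = generate(goal)
    for _ in range(max_steps):
        if verify(draft):  # agentic part: decide whether to stop or retry
            return draft
        draft = generate(f"{goal}\nPrevious draft failed verification:\n{draft}")
    return draft  # best effort after max_steps
```

Each component keeps its role from the table: the generator supplies fluency, the verifier supplies rigor, and the loop itself supplies the autonomous decision-making.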
info@vsora.com
HQ: 13 rue Jeanne Braconnier, Immeuble Le Pasteur, 92360 Meudon-la-Forêt, France. Asia: Taipei, Taiwan; Tokyo, Japan; Seoul, Korea. USA: San Diego, CA, USA.
Copyright © 2025 VSORA | All rights reserved.
Tyr: Unmatched Performance at the Edge with Edge AI.
Tyr
Flexibility: fully programmable; algorithm agnostic; host processor agnostic; RISC-V core to offload the host and run AI completely on-chip.
Memory: HBM capacity 36 GB; HBM throughput 1 TB/s.
Performance, tensor core (dense): Tyr 4: fp8 1600 Tflops, fp16 400 Tflops; Tyr 2: fp8 800 Tflops, fp16 200 Tflops.
Performance, general purpose: Tyr 4: fp8/int8 50 Tflops, fp16/int16 25 Tflops, fp32/int32 12 Tflops; Tyr 2: fp8/int8 25 Tflops, fp16/int16 12 Tflops, fp32/int32 6 Tflops. Close-to-theory efficiency.

Jotunn 8
Flexibility: fully programmable; algorithm agnostic; host processor agnostic; RISC-V cores to offload the host and run AI completely on-chip.
Memory: HBM capacity 288 GB; HBM throughput 8 TB/s.
Performance, tensor core (dense): fp8 3200 Tflops, fp16 800 Tflops.
Performance, general purpose: fp8/int8 100 Tflops, fp16/int16 50 Tflops, fp32/int32 25 Tflops. Close-to-theory efficiency.
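To make these spec-sheet numbers concrete, here is a back-of-envelope roofline sketch. The assumptions are ours, not VSORA's: dense FP8 tensor-core peak, FP8 weights at 1 byte per parameter, every weight streamed from HBM once per generated token, and a 405-billion-parameter model matching the Llama3 405B benchmark above.

```python
# Back-of-envelope roofline estimates from the published spec numbers.
# Assumptions (ours, not VSORA's): dense FP8 tensor-core peak, FP8 weights
# at 1 byte/parameter, and every weight streamed from HBM once per token.

def roofline_crossover_flops_per_byte(peak_tflops: float, hbm_tb_s: float) -> float:
    """Arithmetic intensity above which a kernel becomes compute-bound."""
    return (peak_tflops * 1e12) / (hbm_tb_s * 1e12)

def decode_tokens_per_s_ceiling(hbm_tb_s: float, params_billion: float,
                                bytes_per_param: float = 1.0) -> float:
    """Upper bound on single-stream decode rate when weight reads dominate."""
    return (hbm_tb_s * 1e12) / (params_billion * 1e9 * bytes_per_param)

for name, peak_tflops, hbm_tb_s in [("Jotunn 8", 3200, 8.0), ("Tyr 4", 1600, 1.0)]:
    xover = roofline_crossover_flops_per_byte(peak_tflops, hbm_tb_s)
    tps = decode_tokens_per_s_ceiling(hbm_tb_s, 405)
    print(f"{name}: compute-bound above {xover:.0f} FLOPs/byte; "
          f"~{tps:.1f} tokens/s ceiling for a 405B FP8 model")
```

Single-stream LLM decode has an arithmetic intensity of only a few FLOPs per byte, far below either crossover, which is why the 8 TB/s HBM figure, rather than the peak Tflops, tends to govern per-token latency at low batch sizes.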
Summarized:
The Jotunn 8 chip, developed by VSORA, represents a significant advancement in AI inference technology, specifically targeting the demands of modern data centers. This document details the core capabilities and architectural design of the chip, highlighting its focus on efficiency, scalability, and performance. The core tenet of the Jotunn 8 is to deliver optimal performance for AI models across a range of applications while reducing operational costs and maximizing sustainability.

At the heart of the Jotunn 8's design are ultra-low latency and high throughput, critical for applications like chatbots, fraud detection, and search, where real-time responsiveness is paramount. The architecture is engineered to efficiently handle massive datasets and complex AI algorithms, supporting high-demand services like recommendation engines and Large Language Model (LLM) APIs. Central to its design is a commitment to cost-efficiency, recognizing that AI inference at scale demands a chip that minimizes the cost per inference. The Jotunn 8 also prioritizes power efficiency, reducing operational expenses and the associated carbon footprint.

The Jotunn 8 is built around RISC-V cores capable of offloading the host and executing AI workloads entirely on-chip. This design reduces the bottlenecks associated with transferring data between the processor and external memory, improving performance and lowering latency. Both VSORA chips employ High Bandwidth Memory (HBM): 288 GB on the Jotunn 8 and 36 GB on the edge-oriented Tyr, delivering throughput of 8 TB/s and 1 TB/s, respectively, to maximize the efficiency of data access.

The lineup spans a range of performance levels, catering to diverse AI models and applications. Using a dense tensor-core architecture, the Jotunn 8 reaches 3200 Tflops in FP8 and 800 Tflops in FP16, with general-purpose performance of 100 Tflops FP8/INT8, 50 Tflops FP16/INT16, and 25 Tflops FP32/INT32. The Tyr family offers smaller configurations: the Tyr 4 delivers 1600 Tflops FP8 and 400 Tflops FP16, while the Tyr 2 delivers 800 Tflops FP8 and 200 Tflops FP16.

The Jotunn 8 and its edge counterpart, the Tyr, are designed to accommodate diverse AI models, including reasoning models, generative AI, and agentic AI. These model types bring distinct strengths: accuracy and consistency for reasoning models, versatility and creativity for generative AI, and autonomy and decision-making for agentic AI. The VSORA architecture enables smooth integration of these algorithms, facilitating near-theoretical performance.

The chip's versatility is further enhanced by its fully programmable and algorithm-agnostic design. It is compatible with various host processors and uses RISC-V cores to offload and run AI workloads on-chip. This adaptability means the Jotunn 8 can be deployed across a wide range of applications, irrespective of the specific AI model or framework, from chatbots, fraud detection, and search to recommendation engines and LLM APIs. Ultimately, the Jotunn 8 represents a crucial step forward in AI infrastructure, offering a blend of high performance, efficiency, and adaptability that addresses the evolving needs of modern data centers and AI applications.
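To ground the 288 GB capacity figure against the Llama3 405B model cited above, here is a minimal weight-only sizing sketch. The byte-per-parameter figures are the standard FP8/FP16 sizes (our assumption), and the calculation deliberately ignores KV cache and activation memory.

```python
import math

# Minimal capacity sizing: weights only, no KV cache or activations.
# Byte-per-parameter figures are the standard FP8/FP16 sizes (our assumption).
HBM_PER_JOTUNN8_GB = 288  # from the spec sheet above
PARAMS_BILLION = 405      # Llama3 405B

for fmt, bytes_per_param in [("fp8", 1), ("fp16", 2)]:
    weights_gb = PARAMS_BILLION * bytes_per_param  # 405e9 params * B/param / 1e9
    chips = math.ceil(weights_gb / HBM_PER_JOTUNN8_GB)
    print(f"{fmt}: ~{weights_gb} GB of weights -> at least {chips} Jotunn 8 device(s)")
```

Even under these weight-only assumptions, FP8 already spills past a single device, which is consistent with the emphasis the page places on multi-chip scalability.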