LmCast :: Stay tuned in

Vsora Jotunn-8 5nm European inference chip

Recorded: Nov. 28, 2025, 1:02 a.m.


Jotunn 8 - VSORA


Jotunn 8

The Ultimate AI Chip

Where efficiency meets innovation


This is Jotunn 8

Introducing the World’s Most Efficient AI Inference Chip

In modern data centers, success means deploying trained models with blistering speed, minimal cost, and effortless scalability. Designing and operating inference systems requires balancing key factors such as high throughput, low latency, optimized power consumption, and sustainable infrastructure. Achieving optimal performance while maintaining cost and energy efficiency is critical to meeting the growing demand for large-scale, real-time AI services across a variety of applications.

Unlock the full potential of your AI investments with our high-performance inference solutions. Engineered for speed, efficiency, and scalability, our platform ensures your AI models deliver maximum impact, at lower operational costs and with a commitment to sustainability. Whether you’re scaling up deployments or optimizing existing infrastructure, we provide the technology and expertise to help you stay competitive and drive business growth.

This is not just faster inference. It’s a new foundation for AI at scale.


Ultra-low Latency

Critical for real-time applications like chatbots, fraud detection, and search.

Very High Throughput

Essential for high-demand services like recommendation engines or LLM APIs.

Cost Efficient

AI inference is often run at massive scale—reducing cost per inference is essential for business viability.

Power Efficient

Performance per watt. Power is a major operational expense and carbon footprint driver.



This is Jotunn 8

AI – Demystified and Delivered

In the world of AI data centers, speed, efficiency, and scale aren’t optional; they’re everything. Jotunn 8, our ultra-high-performance inference chip, is built to deploy trained models with lightning-fast throughput, minimal cost, and maximum scalability. Designed around what matters most (performance, cost-efficiency, and sustainability), it delivers the power to run AI at scale, without compromise.


Llama3 405B

Jotunn 8 Outperforms the Market

Why it matters: Critical for real-time applications like chatbots, fraud detection, and search.

Different Models, Different Purposes – Same Hardware

Reasoning models, Generative AI, and Agentic AI are increasingly being combined to build more capable and reliable systems. Generative AI provides flexibility and language fluency. Reasoning models provide rigor and correctness. Agentic frameworks provide autonomy and decision-making. The VSORA architecture enables smooth and easy integration of these algorithms, providing near-theory performance.

Type | Key Role | Strengths | Weaknesses
Reasoning Models | Logical inference and problem-solving | Accuracy, consistency | Limited generalization, slow
LLMs / Generative AI | Natural language generation and understanding | Versatile, broad, creative | Can hallucinate, lacks deep reasoning
Agentic AI | Goal-directed, autonomous action | Independence, planning, coordination | Still experimental, hard to align and control
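The combination described above can be sketched as a toy control loop. Everything here is a hypothetical illustration with stub functions, not VSORA code or a real model API: a generative model drafts an answer, a reasoning model verifies it, and an agentic loop decides whether to accept or retry.

```python
# Toy sketch of combining the three model types (all functions are stubs).

def generate(prompt: str) -> str:
    """Generative step: draft an answer (hypothetical stand-in for an LLM call)."""
    return f"draft answer for: {prompt}"

def verify(answer: str) -> bool:
    """Reasoning step: check the draft for consistency (stub check)."""
    return answer.startswith("draft answer")

def agent(goal: str, max_steps: int = 3) -> str:
    """Agentic loop: generate, verify, retry until a draft is accepted."""
    for _ in range(max_steps):
        draft = generate(goal)
        if verify(draft):          # the reasoning model gates the output
            return draft
    return "no verified answer"

print(agent("summarize recent fraud alerts"))
```

The point is structural: the generator supplies fluency, the verifier supplies rigor, and the loop supplies autonomy, which is why hardware that runs all three efficiently on the same chip matters.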


Cost Efficient

More Speed for the Buck

Why it matters: AI inference is often run at massive scale – reducing cost per inference is essential for business viability.
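As a rough illustration of how cost per inference is reasoned about, here is a back-of-envelope sketch. The page gives no pricing or power figures, so every number below is an assumption for the arithmetic, not a VSORA specification.

```python
# Back-of-envelope energy cost per inference. All figures are assumptions.

power_w = 500            # assumed chip power draw, watts
energy_price = 0.10      # assumed electricity price, $ per kWh
tokens_per_s = 20_000    # assumed aggregate throughput, tokens/s
tokens_per_req = 1_000   # assumed tokens generated per request

kwh_per_s = power_w / 1000 / 3600          # kWh consumed each second
cost_per_s = kwh_per_s * energy_price      # $ of energy burned each second
cost_per_token = cost_per_s / tokens_per_s
cost_per_request = cost_per_token * tokens_per_req

print(f"energy cost per 1M tokens: ${cost_per_token * 1e6:.4f}")
print(f"energy cost per request:   ${cost_per_request:.8f}")
```

At massive scale these tiny per-token costs multiply by billions of requests, which is why both throughput (the divisor) and power draw (the numerator) dominate the economics.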

info@vsora.com

HQ: 13 rue Jeanne Braconnier, Immeuble Le Pasteur, 92360 Meudon-La-Forêt, France

Asia: Taipei, Taiwan

Japan: Tokyo, Japan

Korea: Seoul, Korea

USA: San Diego, CA, USA


Copyright © 2025 VSORA | All rights reserved

Terms of Use   Privacy Policy    Cookie Policy


Tyr

Unmatched Performance at the Edge

Flexibility

Fully programmable
Algorithm agnostic
Host processor agnostic
RISC-V core to offload & run AI completely on-chip


Memory

Capacity

HBM: 36GB

Throughput

HBM: 1 TB/s


Performance

Tensorcore (dense)

Tyr 4: fp8 1600 Tflops, fp16 400 Tflops
Tyr 2: fp8 800 Tflops, fp16 200 Tflops

General Purpose

Tyr 4: fp8/int8 50 Tflops, fp16/int16 25 Tflops, fp32/int32 12 Tflops
Tyr 2: fp8/int8 25 Tflops, fp16/int16 12 Tflops, fp32/int32 6 Tflops

Close-to-theory efficiency


Jotunn 8

Flexibility

Fully programmable
Algorithm agnostic
Host processor agnostic
RISC-V cores to offload the host & run AI completely on-chip


Memory

Capacity

HBM: 288GB

Throughput

HBM: 8 TB/s


Performance

Tensorcore (dense)

fp8: 3200 Tflops
fp16: 800 Tflops

General Purpose

fp8/int8: 100 Tflops
fp16/int16: 50 Tflops
fp32/int32: 25 Tflops

Close-to-theory efficiency
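A quick sanity check on what these numbers imply: single-stream LLM decode is typically memory-bandwidth bound, so the 8 TB/s HBM figure from this page sets an upper limit on tokens per second at batch size 1. The model size below (fp8 Llama3 405B at roughly one byte per parameter, reading the full weights per token) is an assumption for the estimate, not a published benchmark.

```python
# Back-of-envelope, bandwidth-bound decode limit for Llama3 405B at fp8.

hbm_bw_tb_s = 8.0        # Jotunn 8 HBM throughput, TB/s (from the page)
params_b = 405           # Llama3 405B parameter count, billions
bytes_per_param = 1      # fp8 weights: ~1 byte per parameter (assumed)

model_bytes = params_b * 1e9 * bytes_per_param
tokens_per_s = hbm_bw_tb_s * 1e12 / model_bytes   # upper bound, batch 1

print(f"bandwidth-bound decode limit: ~{tokens_per_s:.1f} tokens/s per stream")
```

Real systems batch many streams and cache activations, so delivered throughput differs; the estimate only shows why HBM bandwidth, not peak Tflops, usually bounds large-model decode.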


Summary

The Jotunn 8 chip, developed by VSORA, represents a significant advancement in AI inference technology, specifically targeting the demands of modern data centers. This summary details the core capabilities and architectural design of the chip, highlighting its focus on efficiency, scalability, and performance. The core tenet of the Jotunn 8 is to deliver optimal performance for AI models across a range of applications, reducing operational costs and maximizing sustainability.

At the heart of the Jotunn 8’s design is its ultra-low latency and high throughput, critical for applications like chatbots, fraud detection, and search, where real-time responsiveness is paramount. The architecture is engineered to efficiently handle massive datasets and complex AI algorithms, supporting high demand services like recommendation engines and Large Language Model (LLM) APIs. Central to its design is a commitment to cost-efficiency, recognizing that AI inference at scale demands a chip that minimizes the cost per inference. Furthermore, the Jotunn 8 prioritizes power efficiency, significantly reducing operational expenses and the associated carbon footprint.

The Jotunn 8 is built around RISC-V cores capable of offloading the host and executing AI workloads entirely on-chip. This design reduces the bottlenecks associated with transferring data between the processor and external memory, improving performance and reducing latency. Both chips employ High Bandwidth Memory (HBM): 288GB on the Jotunn 8 and 36GB on the edge-focused Tyr, delivering throughputs of 8 TB/s and 1 TB/s, respectively, further maximizing the efficiency of data access.

The lineup spans a range of performance levels, catering to diverse AI models and applications. With its dense Tensorcore architecture, the Jotunn 8 reaches 3200 Tflops in fp8 and 800 Tflops in fp16, alongside general-purpose performance of 100 Tflops fp8/int8, 50 Tflops fp16/int16, and 25 Tflops fp32/int32. The edge-oriented Tyr 4 delivers 1600 Tflops fp8 and 400 Tflops fp16 on its Tensorcore, while the smaller Tyr 2 offers 800 Tflops fp8 and 200 Tflops fp16.

The Jotunn 8 and its edge counterpart, the Tyr, are designed to accommodate diverse AI models, including Reasoning Models, Generative AI, and Agentic AI. These model types bring distinct strengths: accuracy and consistency for Reasoning Models, versatility and creativity for Generative AI, and autonomy and decision-making for Agentic AI. The VSORA architecture enables smooth and easy integration of these algorithms, facilitating near-theoretical performance.

The chip’s versatility is further enhanced by its fully programmable, algorithm-agnostic design. It is compatible with various host processors and uses RISC-V cores to offload and run AI workloads on-chip. This adaptability ensures the Jotunn 8 can be deployed across a wide range of applications, irrespective of the specific AI model or framework: from latency-sensitive workloads such as chatbots, fraud detection, and search to high-demand services like recommendation engines and LLM APIs.

Ultimately, the Jotunn 8 represents a crucial step forward in AI infrastructure, offering a blend of high performance, efficiency, and adaptability that addresses the evolving needs of modern data centers and AI applications.