bijou64
Tenfold: Celebrating 10 Years of Ink & Switch
bijou64
April 2026
Brooklyn Zelenka
It’s nice when you work on security and accidentally get some performance for free. This is the story of a small encoding called bijou64 — a variable-length integer (varint) encoding that we developed for the Subduction CRDT sync protocol. It was intended to fix a subtle signature-verification bug by making each number only representable a single way. It turned out to also run a few times faster than the more common varint LEB128. That “bijou” is French for “small jewel” is a happy coincidence 💎 We didn’t set out to write a fast varint, but it turns out that our design constraints made for an encoding that has to do less work. The Problem Many binary protocols need a compact way to encode integers that are usually small but occasionally large. Variable-length integer encodings (“varints”) solve this, but most designs treat canonicality as an afterthought — something enforced by a runtime check in the decoder rather than by the structure of the encoding itself. Since it’s the most common varint, we’re going to pick on LEB128 a bit here. I want to emphasize how much LEB128 is a great choice for many projects, and the reasons that it was not a good choice for us also applies to the other formats that we looked at. It just happened to not be a perfect fit for our use case. This is where the “128” in “LEB128” comes from, since 2⁷ = 128 LEB128 encodes a number as a sequence of 7-bit segments, with the high bit of each byte signaling “more bytes follow” (there can be many such continuation segments, though we only show 2 segments below). This lets you avoid always writing 8 bytes (64 bits) even when you’re representing a small number (which would be mostly zeroes). This is like writing 5 instead of 000000005 just to get the correct number of characters. Putting aside that working in 7-bit is odd, this is a practical solution!
LEB128 layout
But there’s a problem: the number 0 can be encoded as the single byte 0x00, but it can also be encoded as 0x80 0x00. Or 0x80 0x80 0x00. Or any longer sequence of 0x80s ending in a zero byte. 0x80 is 1 0000000, so you can have as many of those as you want and still get 0! Most LEB128 decoders will happily accept any of them. This isn’t unique to zero; nearly every number in LEB128 can be represented many ways.
Zero represented two ways in LEB128
This causes problems for signed data if you ever want to do things like compression since you need to know the exact bytes that were signed. Having an extra 0x80 will result in a different signature. If you only have one unique way to represent a number, you can do things like deduplicate runs of numbers when storing without needing to retain the entire original. Canonicalisation One solution is to simply enforce a special “canonical” form. While encoding a varint, you must ensure that you use the canonical encoding (not all libraries will always do this). When decoding, you must validate that it matches your expected format. It would be really nice to not have to do that additional work. So What? You could reasonably ask: who actually shows up with adversarial varints? The answer is “anyone who would benefit from your protocol mistaking two different byte strings for the same value.” For a signed protocol, that could be a lot of people. While not about varints per se, the textbook case is ASN.1 (the abstract syntax notation behind X.509 certificates, LDAP, and many other things that are widely depended on). Canonicalisation attacks have been used against PKCS#1 v1.5, Mozilla NSS, GnuTLS, JWTs and Bitcoin transactions. The pattern in all of these is something like:
The spec says: “the canonical encoding is X; any other encoding MUST be rejected.” The implementation has one or more ifs that enforces this. The check is separately deletable from the rest of the parser. Removing it doesn’t break round-trip tests; it doesn’t break tests using honestly-encoded data; it doesn’t break performance benchmarks. It only breaks under adversarial input, which is rarely in the test suite. The check is forgotten, optimised away, or never ported. The protocol’s security property silently degrades. This is the bug class bijou64 is designed to make impossible. Not by adding more checks, but by removing the one that mattered — and making the format such that, with no canonicality check at all, the only encoding that exists for any given value is the canonical one.
(Almost) Canonical by Construction bijou64 eliminates the possibility of more than one encoding per integer. Just like our normal written number system, there is exactly one way to write each number. Bijou uses two tricks: 1. First Byte Double Duty This “first byte is a tag sometimes” technique is used by a few encodings, but we were introduced to it by VARU64 The first byte represents 0–247 as normal. If you get 0x42, it just decodes to 0x42. 248–255 switch into a different mode: they’re a tag for how many bytes to expect after this first one, which will represent the number. This is really nice for decoding, because we know how much memory to allocate as soon as we read the first byte (O(1)). LEB128, by contrast, has to keep reading bytes until it sees one without the continuation bit set (O(n)).
Byte/tag structure
2. Offsets The tag alone is not enough to guarantee canonicity, but it points the way: instead of repeating 0-247 in the second byte (0xF8 0x00 == 0x00), we offset the next byte by 248 (0xF8). This means that 0xF8 0x00 == 0xF8 == 248, not 0 (since 0 is already represented as 0x00). Here’s a worked example of decoding 1738, which takes the tag plus two bytes:
Worked example
All numbers at this length (3 bytes total AKA tag + 2 data bytes) are offset by 504 (0x1F8). Each successive length increases the offset in a predictable pattern. See if you can spot it:
Total Length Offset
1 0x00
2 0xF8
3 0x01F8
4 0x0101F8
5 0x010101F8
6 0x01010101F8
7 0x0101010101F8
8 0x010101010101F8
9 0x01010101010101F8
This is a lookup table based on the first byte! There is one exception: because the numbers are always offset down, the largest values (9 bytes) need a manual check that they are in bounds. The 9-byte (tag + 8 bytes of data) slot can actually represent numbers larger than 2⁶⁴ due to the offset, but since bijou64 targets u64, we cap it there. This isn’t the canonicity problem from earlier — every in-range number still has exactly one encoding — we’re just clipping extra headroom that we don’t want. So when the tag is 255 (the biggest one), the decoder checks that the value is below the cutoff. Benchmarks You can find the complete shootout here Bijou64 has to do all of this bit shuffling, tag lookups, and whatnot. There needs to be a cost for all of this, and there is over the regular, fixed length 64-bit numbers. Surely it’s slower than much widely used encodings like LEB128 and the very clever vu128, right? We benched it on ARM (Apple M2 Pro) and x86 (AMD Zen 5), and were not expecting what came back. Decoding
Median decode time per batch of 4096 values. Lower is better. Distributions described in the methodology.
bijou64 comes out as pretty fast! It decodes roughly 2-10 times faster than LEB128 even when we don’t take the overhead of canonical checks on LEB128 into account. Small numbers (which encode to a single LEB128 byte) are about twice as fast in these benchmarks. Larger numbers, which force LEB128 to scan continuation bits across many bytes, are around 8–10 times faster. On a uniform full-u64 distribution — about as adversarial as a benchmark gets — bijou64 processes a batch of 4096 values in ~3 µs (≈0.75 ns per value) where LEB128 takes ~30 µs (≈7.3 ns per value). The bar chart only shows medians, though. The CDFs underneath are where the variance story lives:
Decode times per batch of 4096 values, plotted as a CDF for each library × distribution cell. Curves further to the left are faster. Curves that rise more steeply have more consistent performance.
In these benchmarks, bijou64’s CDFs are nearly vertical — every recorded batch timing sits within a tiny band around the median. LEB128’s curves lean over and trail off to the right, because the continuation-bit scan length depends on the value, and the branch predictor never gets a chance to lock in. As you might imagine, the delta is even larger when doing canonical decoding since bijou64 gets this “for free” thanks to its encoding for all but the largest numbers:
Canonical decode time per batch of 4096 values — that is, decode plus the runtime overlong-rejection check for libraries that need one.
In bijou64, canonical decoding is decoding — the canonicality check is the format. In the others, the canonicality check is additional work. Encoding Encoding is also generally faster, with one exception:
Median encode time per batch of 4096 values.
On the “small” distribution (248 – 65,535), LEB128 wins by about 1.24×. Encoded Size bijou64 isn’t the most compact varint for every distribution. Tier-boundary differences aside, bijou64 and LEB128 produce wire-byte counts within a couple of percent of each other across realistic workloads.
Length to represent certain numbers. These were chosen specifically to show where they differ; the vast majority of numbers are identical in length.
Why? In hindsight, this kind of makes sense: Length from the first byte Since there is no continuation-bit scanning. The decoder knows immediately how many bytes to read; the encoder knows immediately how many to write. LEB128’s decoder has to scan the high bit of every byte until it finds the terminator. Branch predictors love the bijou64 pattern; they hate the LEB128 one, especially for large values where the continuation chain is long. Big-endian, contiguous payloads The payload is a single contiguous big-endian integer, not 7-bit chunks with bookkeeping bits sprinkled through. Modern CPUs have byte-swap instructions for exactly this; the compiler turns the read into a single load + bswap. LEB128’s 7-bits-per-byte layout forces the decoder to mask and shift every byte. Predictable branches Tier selection is a small fixed match. For any one workload, branch prediction settles into a stable pattern almost immediately — exactly what those steep CDF curves show. Arithmetic is cheap Adding OFFSET[tier] is a constant load and an add (or sub if you’re encoding). A previous version had an if and some branching, but the arithmetic version actually has fewer instructions in the hot path on most modern CPUs. Should you use bijou64? Maybe! Like most interesting questions, it depends on your goals. This is a brand-new format, and it hasn’t been nearly as battle-tested as LEB128, which has mature, well-exercised implementations in every major language. Our benchmarks are encouraging, but big claims need big evidence — and we’ve only tested on three CPUs so far (M2 Pro and Zen 5 are in the published benchmarks; a Zen 3 we tried looks similar to Zen 5). LEB128 isn’t going away, and it shouldn’t. But if you’re designing a new format and canonicality matters — for signatures, content-addressing, or any kind of “two implementations must agree on the bytes” property — there’s an alternative that’s structurally safer and runs faster on every benchmark we threw at it. All of these algorithms work differently and no benchmark is perfect. We’ve tired our best to make a fair shootout that reflects practical use. We’re all ears if you have a suggestion on how to improve them! The library is published as bijou64 on crates.io, dual MIT / Apache-2.0, and the spec is CC BY-SA 4.0 if you’d like to port it. A Wasm/JavaScript wrapper exists, and a family of width extensions (bijou32, bijou128) is sketched in the spec’s future-extensions section. If you find a bug — or, more interestingly, a workload where bijou64 loses to something we benchmarked against — we’d love to hear about it!
Research Areas
Local-first Software Malleable Software Programmable Ink Universal Version Control
Connect with us
Say hello@inkandswitch.com Subscribe to our RSS feed Follow on Bluesky Code at Github
The Ink & Switch Dispatch Keep up-to-date with the lab's latest findings, appearances, and happenings by subscribing to our newsletter. For a sneak peek, browse the archive.
Email |
The development of bijou64 stems from the need to address subtle signature-verification bugs in binary protocols by ensuring that each number has only a single representation. The authors developed bijou64 as a variable-length integer encoding intended for the Subduction CRDT sync protocol, which incidentally resulted in performance improvements over the commonly used LEB128 encoding. The motivation arose from recognizing that many binary protocols treat canonicality as an afterthought, relying on runtime checks in the decoder rather than enforcing it within the encoding structure itself, which leaves protocol security vulnerable to adversarial inputs, mirroring issues seen in standards like ASN.1 and Bitcoin transactions. The goal of bijou64 is to eliminate this possibility by designing an encoding where the only representation for any given value is the canonical one, thereby making canonicality inherent to the format.
The standard variable-length integer encoding, LEB128, uses 7-bit segments where the most significant bit of each byte signals continuation, allowing for compact representation of small numbers. However, this method suffers from ambiguity, particularly with zero, where multiple byte sequences can represent the same value, which poses problems for signed data where precise byte sequencing is necessary for operations like compression. To solve this, bijou64 employs two primary techniques to ensure unique canonical encoding. First, it utilizes a first byte with "double duty," where the initial byte can either represent the value directly for small numbers or act as a tag indicating how many subsequent bytes represent the actual number, facilitating $\text{O}(1)$ memory allocation for decoding. Second, it uses offsets to point to the subsequent bytes in a structured manner rather than repeating the number itself. For instance, instead of simply repeating a value, offsets are used to point to the next byte, thereby ensuring that the overall encoding uniquely maps to the integer, for example, by offsetting subsequent bytes by a value like 248 when transitioning to a new length.
This structural approach leads to a specific layout where the length of the encoding is encoded in a predictable pattern across the byte sequence. The method results in byte sequences that are contiguous big-endian integers, which modern CPUs can efficiently process using byte-swap instructions, contrasting with LEB128’s interleaved 7-bit chunks. Furthermore, this format promotes predictable branch prediction; the tag selection often results in stable patterns almost immediately, which contributes to superior performance. Arithmetic operations are also made relatively cheap, as adding the offset is a simple constant operation in the hot path of the encoding process.
Benchmarks demonstrated that bijou64 is remarkably fast. In comparisons executed on ARM and x86 architectures, bijou64 decodes roughly two to ten times faster than LEB128, even before accounting for additional canonical checks that some implementations might perform on LEB128. For small numbers, the speed advantage is even more pronounced; for larger numbers that require extensive scanning of continuation bits in LEB128, bijou64 demonstrates substantial speedups, processing large batches of data significantly faster than LEB128. The performance metrics, particularly the distribution curves shown in the benchmarks, indicate that bijou64 provides highly consistent performance, unlike LEB128, whose performance varies based on the magnitude of the number being decoded. Although some encoding methods like bijou64 may not be the most compact in terms of wire-byte size across all distributions, the superior performance, structural safety, and inherent canonicality make it a compelling alternative for protocols where unambiguous byte representations are critical. |