RC RANDOM CHAOS

Bijou64: A varint encoding where canonicality is the format, not a check

via Hacker News

Original source

Bijou64: A variable-length integer encoding

Ink & Switch’s bijou64 is a variable-length integer encoding designed to make the canonicalization bug class structurally impossible. Conventional varints like LEB128 allow multiple byte sequences to decode to the same integer (e.g., 0 can be written as 0x00, as 0x80 0x00, as 0x80 0x80 0x00, and so on), forcing implementations to layer a separate canonicality check on top. Those checks are notorious for being silently dropped, optimized away, or never ported, which makes them a recurring source of signature-bypass and parser-confusion attacks, most famously in ASN.1 and PKCS#1.
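To make the LEB128 problem concrete, here is a minimal sketch of an unsigned LEB128 decoder with no canonicality check, plus the kind of separate check implementations have to remember to add. The function names are illustrative and not taken from any particular library.

```python
def leb128_decode(data: bytes) -> int:
    """Decode one unsigned LEB128 integer, accepting padded encodings."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift   # low 7 bits carry the payload
        if (byte & 0x80) == 0:             # high bit clear: last byte
            return result
        shift += 7
    raise ValueError("truncated LEB128 input")


def is_canonical_leb128(data: bytes) -> bool:
    """The bolt-on check: a multi-byte encoding must not end in a zero byte.

    Assumes `data` holds exactly one encoded integer.
    """
    return len(data) == 1 or data[-1] != 0x00


encodings_of_zero = [
    bytes([0x00]),
    bytes([0x80, 0x00]),
    bytes([0x80, 0x80, 0x00]),
]
decoded = [leb128_decode(e) for e in encodings_of_zero]
canonical = [is_canonical_leb128(e) for e in encodings_of_zero]
```

All three byte strings decode to the same integer, and only the first passes the canonicality check. A decoder that omits `is_canonical_leb128` still works on honest input, which is why the omission tends to go unnoticed.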

Bijou64 collapses the check into the format itself. The first byte encodes values 0–247 directly, while 248–255 serve as length tags indicating how many data bytes (one through eight) follow. Crucially, each subsequent length tier is offset by the cumulative range of shorter encodings, so the one-data-byte tier starts at 248 rather than 0, and every integer has exactly one valid representation by construction. Only the top tier needs a bounds check, to reject encodings whose value would exceed 2^64 − 1.
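The offset-tier idea can be sketched in a few lines. This is an illustration of the scheme as summarized above, not the actual bijou64 wire format: the tag assignment (247 + n for n data bytes), the big-endian byte order, and the names `encode`, `decode`, and `OFFSETS` are all assumptions made for the example.

```python
MAX_U64 = 2**64 - 1
DIRECT_MAX = 247  # first-byte values 0..247 encode themselves

# OFFSETS[n] is the smallest value stored with n data bytes (n = 1..8).
# Each tier starts where the previous one ends, so ranges never overlap.
OFFSETS = [None, DIRECT_MAX + 1]
for n in range(1, 8):
    OFFSETS.append(OFFSETS[n] + 256**n)


def encode(value: int) -> bytes:
    if not 0 <= value <= MAX_U64:
        raise ValueError("value out of u64 range")
    if value <= DIRECT_MAX:
        return bytes([value])
    for n in range(1, 9):
        payload = value - OFFSETS[n]
        if payload < 256**n:  # value falls in the n-data-byte tier
            # Assumed layout: tag byte 247 + n, then big-endian payload.
            return bytes([DIRECT_MAX + n]) + payload.to_bytes(n, "big")
    raise AssertionError("unreachable for values within u64 range")


def decode(data: bytes) -> int:
    if not data:
        raise ValueError("empty input")
    first = data[0]
    n = max(first - DIRECT_MAX, 0)  # data bytes announced by the first byte
    if len(data) != 1 + n:
        raise ValueError("length does not match first byte")
    if n == 0:
        return first
    value = OFFSETS[n] + int.from_bytes(data[1:], "big")
    if value > MAX_U64:  # only reachable in the top (8-byte) tier
        raise ValueError("value exceeds u64 range")
    return value
```

Under these assumptions, 247 encodes as the single byte 0xF7, 248 as 0xF8 0x00, and 504 as 0xF9 0x00 0x00. Because a two-byte payload of zero means 504 rather than 0, there is no padded form for the decoder to reject, and the only validity check left is the overflow test in the top tier.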

The security-driven design also wins on performance. Knowing the full length from the first byte means O(1) allocation and no continuation-bit scanning, which keeps the branch predictor happy. Decoding benchmarks run 2–10× faster than LEB128 with far tighter variance, and encoding is generally faster too (LEB128 only edges ahead on a narrow small-number band). Wire size stays within a few percent of LEB128 on realistic workloads.

This is an AI-generated summary. Read the original for the full story.