How Cubrim compresses, step by step
Cubrim turns a stream of bytes into a compact archive through a fixed pipeline. Watch each stage run on a real input below — drag to rotate, step through the stages, and see exactly what happens to your data.
3D visualisation requires WebGL. Stage descriptions remain available below.
Domainize
Bytes → valuesThe input bytes become a sequence of values. Each position in the file carries one value in the range 0–255. This is the raw material the cube is built from.
Build the cube
φ: index → coordinatesEvery value lands at a coordinate inside an N-dimensional cube. By default N=2 and the edge is 256, so the file becomes a 256×256 grid: position i maps to (i mod 256, i ÷ 256) via the mixed-radix φ function. The value at each cell is its height and colour.
Distance map
Gaps between populated pointsOnly the cells that actually hold data are kept. Along each axis Cubrim records the distance from one populated point to the next — the gaps — instead of absolute coordinates. Sparse data turns into a short list of gaps.
Run-length encode
Collapse repeated gapsThe gap streams are run-length encoded: a run of identical gaps collapses into a single (value, count) pair. Regular, clustered layouts shrink dramatically here.
Code the values
bitpack · RLE-codes · HuffmanThe value stream is packed. The default packs each value into the fewest fixed bits; clustered streams use run-length codes; and the entropy scheme assigns short canonical-Huffman codewords to frequent values and long ones to rare values.
Assemble & safeguard
header + blob, or raw-storeA compact header (the parameters needed to rebuild everything) is prepended and the streams are concatenated into the final archive. If the cube path would not beat storing the input verbatim, Cubrim falls back to raw-store — so output never grows. Decoding reverses every step exactly: lossless, byte for byte.
Byte values 0–255 as coloured cells
Height = value magnitude
Flat 1D ribbon folds into a 2D grid
24×24 cells shown (scaled-down illustration)
Empty cells dimmed; arrows show gaps
Only populated points kept
Repeated gap bars collapse to one block
(value × count) = compact pair
Frequent values → short bit codes
Rare values → long codes (Huffman)
Streams assembled into one archive
Raw-store fallback if cube > input
Lossless, always
Every stage is reversible. Decoding reads the header and walks the pipeline backwards to reconstruct the original bytes exactly — there is no lossy mode.