Files
pkl/docs/modules/bindings-specification/pages/binary-encoding.adoc
Jen Basch 6c036bf82a Implement Pkl binary renderer and parser (#1203)
Implements a binary renderer for Pkl values, which is a lossless capturing of Pkl data.

This follows the pkl binary format that is already used with `pkl server` calls, and is
made available as a Java API and also an in-language API.

Also, introduces a binary parser into the corresponding `PObject` types in Java.
2025-10-20 09:10:22 -07:00

230 lines
4.7 KiB
Plaintext

= Pkl Binary Encoding
include::ROOT:partial$component-attributes.adoc[]
include::partial$component-attributes.adoc[]
Pkl values can be encoded into a binary format.
This format is used for Pkl's non-JVM language bindings, for example, for its Go and Swift bindings.
The binary format is uses link:{uri-messagepack}[MessagePack] encoding.
== Primitives
All Pkl primitives turn into their respective MessagePack primitive.
|===
|Pkl Type|MessagePack format
|link:{uri-stdlib-Int}[Int]
|link:{uri-messagepack-int}[int]
|link:{uri-stdlib-Float}[Float]
|link:{uri-messagepack-float}[float]
|link:{uri-stdlib-String}[String]
|link:{uri-messagepack-str}[string]
|link:{uri-stdlib-Boolean}[Boolean]
|link:{uri-messagepack-bool}[bool]
|link:{uri-stdlib-Null}[Null]
|link:{uri-messagepack-nil}[nil]
|===
NOTE: Pkl integers are encoded into the smallest int type that the number will fit into.
For example, value `8` gets encoded as MessagePack `int8` format.
== Non-primitives
All non-primitive values are encoded as MessagePack arrays.
The first slot of the array designates the value's type.
The remaining slots have fixed meanings depending on the type.
Additional slots may be added to types in future Pkl releases. Decoders *must* be designed to defensively discard values beyond the number of known slots for a type or provide meaningful error messages.
The array's length is the number of slots that are filled. For example, xref:{uri-stdlib-List}[List] is encoded as an MessagePack array with two elements.
|===
|Pkl type |Slot 1 2+|Slot 2 2+|Slot 3 2+|Slot 4
||code |type |description |type |description |type |description
|link:{uri-stdlib-Typed}[Typed], link:{uri-stdlib-Dynamic}[Dynamic]
|`0x01`
|link:{uri-messagepack-str}[str]
|<<type-name-encoding,Class name>>
|link:{uri-messagepack-str}[str]
|Enclosing module URI
|link:{uri-messagepack-array}[array]
|Array of <<object-members,object members>>
|link:{uri-stdlib-Map}[Map]
|`0x02`
|link:{uri-messagepack-map}[map]
|Map of `<value>` to `<value>`
|
|
|
|
|link:{uri-stdlib-Mapping}[Mapping]
|`0x03`
|link:{uri-messagepack-map}[map]
|Map of `<value>` to `<value>`
|
|
|
|
|link:{uri-stdlib-List}[List]
|`0x04`
|link:{uri-messagepack-array}[array]
|Array of `<value>`
|
|
|
|
|link:{uri-stdlib-Listing}[Listing]
|`0x05`
|link:{uri-messagepack-array}[array]
|Array of `<value>`
|
|
|
|
|link:{uri-stdlib-Set}[Set]
|`0x06`
|link:{uri-messagepack-array}[array]
|Array of `<value>`
|
|
|
|
|link:{uri-stdlib-Duration}[Duration]
|`0x07`
|{uri-messagepack-float}[float64]
|Duration value
|link:{uri-messagepack-str}[str]
|link:{uri-stdlib-DurationUnit}[Duration unit] (`"ns"`, `"ms"`, etc.)
|
|
|link:{uri-stdlib-DataSize}[DataSize]
|`0x08`
|link:{uri-messagepack-float}[float64]
|Value (float64)
|link:{uri-messagepack-str}[str]
|link:{uri-stdlib-DataSizeUnit}[DataSize unit] (`"b"`, `"kb"`, etc.)
|
|
|link:{uri-stdlib-Pair}[Pair]
|`0x09`
|`<value>`
|First value
|`<value>`
|Second value
|
|
|link:{uri-stdlib-IntSeq}[IntSeq]
|`0x0A`
|link:{uri-messagepack-int}[int]
|Start
|link:{uri-messagepack-int}[int]
|End
|link:{uri-messagepack-int}[int]
|Step
|link:{uri-stdlib-Regex}[Regex]
|`0x0B`
|link:{uri-messagepack-str}[str]
|Regex string representation
|
|
|
|
|link:{uri-stdlib-Class}[Class]
|`0x0C`
|link:{uri-messagepack-str}[str]
|<<type-name-encoding,Class name>>
|link:{uri-messagepack-str}[str]
|Module URI
|
|
|link:{uri-stdlib-TypeAlias}[TypeAlias]
|`0x0D`
|link:{uri-messagepack-str}[str]
|<<type-name-encoding,TypeAlias name>>
|link:{uri-messagepack-str}[str]
|Module URI
|
|
|link:{uri-stdlib-Function}[Function]
|`0x0E`
|
|
|
|
|
|
|link:{uri-stdlib-Bytes}[Bytes]
|`0x0F`
|link:{uri-messagepack-bin}[bin]
|Binary contents
|
|
|
|
|===
[[type-name-encoding]]
[NOTE]
====
Type names have specific encoding rules:
* When the module URI is `pkl:base`:
** If the type name is `ModuleClass`, this type represents the module class of `pkl:base`.
** Otherwise, the type name corresponds to a type in `pkl:base`.
* For all other module URIs:
** When the type name contains `\#`, the string after the `#` character corresponds to a type in that module. The string before the `#` is the name of the module.
** Otherwise, the type name is the name of the module and represents the class of the module.
====
[[object-members]]
== Object Members
Like non-primitive values, object members are encoded as MessagePack arrays, where the first slot designates the value's type.
|===
|Member type |Slot 1 2+|Slot 2 2+|Slot 3
| |code |type |description |type |description
|Property
|`0x10`
|link:{uri-messagepack-str}[str]
|key
|`<value>`
|property value
|Entry
|`0x11`
|`<value>`
|entry key
|`<value>`
|entry value
|Element
|`0x12`
|link:{uri-messagepack-int}[int]
|index
|`<value>`
|element value
|===