Commit 2017084

JesusMcCloud, nodh, and sandwwraith authored
Add Cbor features for COSE compliance (#2412)
This PR contains all features required to serialize and parse COSE-compliant CBOR (thanks to @nodh). While some canonicalization steps (such as sorting keys) still need to be performed manually, it gets the job done quite well: we have successfully used the features introduced here to create and validate ISO/IEC 18013-5:2021-compliant mobile driving license data.

This PR introduces the following features to the CBOR format:

- Serial Labels
- Tagging of keys and values
- Definite length encoding (this is the largest change, as it effectively makes the CBOR encoder two-pass)
- Option to globally prefer major type 2 for byte array encoding
- Various QoL changes, such as public CborEncoder/CborDecoder interfaces and a separate CborConfiguration class

This PR obsoletes #2371 and #2359, as it contains the features of both PRs and many more.

Fixes #1955
Fixes #1560

Co-authored-by: Christian Kollmann <[email protected]>
Co-authored-by: Leonid Startsev <[email protected]>
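
To make the feature list above concrete, here is a minimal sketch of how the new switches and annotations might be combined. The configuration property names are taken from the `CborBuilder` entries in this commit's API dump; the `ExampleHeader` class, its label numbers, the tag value, and the `roundTrip` helper are hypothetical, and the unsigned literal passed to `@ValueTags` assumes the annotation takes `ULong` tags.

```kotlin
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.SerialName
import kotlinx.serialization.Serializable
import kotlinx.serialization.cbor.Cbor
import kotlinx.serialization.cbor.CborLabel
import kotlinx.serialization.cbor.ValueTags

// Hypothetical payload type; label numbers and the tag value are made up for illustration.
@OptIn(ExperimentalSerializationApi::class)
@Serializable
data class ExampleHeader(
    @CborLabel(1) @SerialName("alg") val alg: Int,
    @CborLabel(4) @SerialName("kid") val kid: String? = null,
    // Assumption: tags are declared as ULong values.
    @ValueTags(1uL) @SerialName("iat") val issuedAt: Long = 0,
)

// Each switch below is listed in CborBuilder in this commit's API dump.
@OptIn(ExperimentalSerializationApi::class)
val coseLikeCbor = Cbor {
    useDefiniteLengthEncoding = true   // prepend element counts instead of a terminating byte
    preferCborLabelsOverNames = true   // use @CborLabel numbers as map keys when present
    encodeKeyTags = true
    encodeValueTags = true
    encodeObjectTags = true
    verifyKeyTags = true
    verifyValueTags = true
    verifyObjectTags = true
    alwaysUseByteString = true         // encode every ByteArray as major type 2
}

// Encodes and decodes with the same instance, so tags written are also verified.
@OptIn(ExperimentalSerializationApi::class)
fun roundTrip(header: ExampleHeader): ExampleHeader {
    val bytes = coseLikeCbor.encodeToByteArray(ExampleHeader.serializer(), header)
    return coseLikeCbor.decodeFromByteArray(ExampleHeader.serializer(), bytes)
}
```

The documentation added in this commit also describes a predefined `Cbor.CoseCompliant` instance with equivalent intent, which is likely the simpler choice when no further customization is needed.
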
1 parent af5095e commit 2017084

31 files changed: +4230 -1472 lines changed

benchmark/build.gradle.kts

Lines changed: 1 addition & 0 deletions
@@ -61,6 +61,7 @@ dependencies {
     implementation(libs.okio)
     implementation(libs.kotlinx.io)
     implementation(project(":kotlinx-serialization-core"))
+    implementation(project(":kotlinx-serialization-cbor"))
     implementation(project(":kotlinx-serialization-json"))
     implementation(project(":kotlinx-serialization-json-okio"))
     implementation(project(":kotlinx-serialization-json-io"))
Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+/*
+ * Copyright 2017-2024 JetBrains s.r.o. Use of this source code is governed by the Apache 2.0 license.
+ */
+
+package kotlinx.benchmarks.cbor
+
+import kotlinx.serialization.Serializable
+import kotlinx.serialization.cbor.*
+import org.openjdk.jmh.annotations.*
+import java.util.concurrent.*
+
+@Serializable
+data class KTestAllTypes(
+    val i32: Int,
+    val i64: Long,
+    val f: Float,
+    val d: Double,
+    val s: String,
+    val b: Boolean = false,
+)
+
+@Serializable
+data class KTestOuterMessage(
+    val a: Int,
+    val b: Double,
+    val inner: KTestAllTypes,
+    val s: String,
+    val ss: List<String>
+)
+
+@Warmup(iterations = 5, time = 1)
+@Measurement(iterations = 10, time = 1)
+@BenchmarkMode(Mode.Throughput)
+@OutputTimeUnit(TimeUnit.MILLISECONDS)
+@State(Scope.Benchmark)
+@Fork(1)
+open class CborBaseline {
+    val baseMessage = KTestOuterMessage(
+        42,
+        256123123412.0,
+        s = "string",
+        ss = listOf("a", "b", "c"),
+        inner = KTestAllTypes(-123124512, 36253671257312, Float.MIN_VALUE, -23e15, "foobarbaz")
+    )
+
+    val cbor = Cbor {
+        encodeDefaults = true
+        encodeKeyTags = false
+        encodeValueTags = false
+        useDefiniteLengthEncoding = false
+        preferCborLabelsOverNames = false
+    }
+
+    val baseBytes = cbor.encodeToByteArray(KTestOuterMessage.serializer(), baseMessage)
+
+    @Benchmark
+    fun toBytes() = cbor.encodeToByteArray(KTestOuterMessage.serializer(), baseMessage)
+
+    @Benchmark
+    fun fromBytes() = cbor.decodeFromByteArray(KTestOuterMessage.serializer(), baseBytes)
+
+}

buildSrc/src/main/kotlin/animalsniffer-conventions.gradle.kts

Lines changed: 1 addition & 0 deletions
@@ -53,6 +53,7 @@ afterEvaluate { // Can be applied only when the project is evaluated
         "kotlinx-serialization-core" -> "kotlinx.serialization.internal.SuppressAnimalSniffer"
         "kotlinx-serialization-hocon" -> "kotlinx.serialization.hocon.internal.SuppressAnimalSniffer"
         "kotlinx-serialization-protobuf" -> "kotlinx.serialization.protobuf.internal.SuppressAnimalSniffer"
+        "kotlinx-serialization-cbor" -> "kotlinx.serialization.cbor.internal.SuppressAnimalSniffer"
         else -> "kotlinx.serialization.json.internal.SuppressAnimalSniffer"
     }
 

docs/formats.md

Lines changed: 90 additions & 0 deletions
@@ -13,6 +13,10 @@ stable, these are currently experimental features of Kotlin Serialization.
 * [CBOR (experimental)](#cbor-experimental)
   * [Ignoring unknown keys](#ignoring-unknown-keys)
   * [Byte arrays and CBOR data types](#byte-arrays-and-cbor-data-types)
+  * [Definite vs. Indefinite Length Encoding](#definite-vs-indefinite-length-encoding)
+  * [Tags and Labels](#tags-and-labels)
+  * [Arrays](#arrays)
+  * [Custom CBOR-specific Serializers](#custom-cbor-specific-serializers)
 * [ProtoBuf (experimental)](#protobuf-experimental)
   * [Field numbers](#field-numbers)
   * [Integer types](#integer-types)
@@ -164,6 +168,8 @@ Per the [RFC 7049 Major Types] section, CBOR supports the following data types:
 
 By default, Kotlin `ByteArray` instances are encoded as **major type 4**.
 When **major type 2** is desired, then the [`@ByteString`][ByteString] annotation can be used.
+Moreover, the `alwaysUseByteString` configuration switch allows for globally preferring **major type 2** without needing
+to annotate every `ByteArray` in a class hierarchy.
 
 <!--- INCLUDE
 import kotlinx.serialization.*
@@ -221,6 +227,90 @@ BF # map(*)
    FF        # primitive(*)
 ```
 
+### Definite vs. Indefinite Length Encoding
+CBOR supports two encodings for maps and arrays: definite and indefinite length encoding. kotlinx.serialization defaults
+to the latter, which means that a map's or array's number of elements is not encoded, but instead a terminating byte is
+appended after the last element.
+Definite length encoding, on the other hand, omits this terminating byte, but instead prepends the number of elements
+to the contents of a map or array. The `useDefiniteLengthEncoding` configuration switch allows for toggling between the
+two modes of encoding.
+
+
+### Tags and Labels
+
+CBOR allows for optionally defining *tags* for properties and their values. These tags are encoded into the resulting
+byte string to transport additional information
+(see [RFC 8949 Tagging of Items](https://datatracker.ietf.org/doc/html/rfc8949#name-tagging-of-items) for more info).
+The [`@KeyTags`](Tags.kt) and [`@ValueTags`](Tags.kt) annotations can be used to define such tags while
+writing and verifying such tags can be toggled using the `encodeKeyTags`, `encodeValueTags`, `verifyKeyTags`, and
+`verifyValueTags` configuration switches respectively.
+In addition, it is possible to directly declare classes to always be tagged.
+This then applies to all instances of such a tagged class, regardless of whether they are used as values in a list
+or when they are used as a property in another class.
+Forcing objects to always be tagged in such a manner is accomplished by the [`@ObjectTags`](Tags.kt) annotation,
+which works just as `ValueTags`, but for class definitions.
+When serializing, `ObjectTags` will always be encoded directly before the data of the tagged object, i.e. a
+value-tagged property of an object-tagged type will have the value tags preceding the object tags.
+Writing and verifying object tags can be toggled using the `encodeObjectTags` and `verifyObjectTags` configuration
+switches. Note that verifying only value tags can result in some data with superfluous tags to still deserialize
+successfully, since in this case - by definition - only a partial validation of tags happens.
+Well-known tags are specified in [`CborTag`](Tags.kt).
+
+In addition, CBOR supports keys of all types which work just as `SerialName`s.
+COSE restricts this again to strings and numbers and calls these restricted map keys *labels*. String labels can be
+assigned by using `@SerialName`, while number labels can be assigned using the [`@CborLabel`](CborLabel.kt) annotation.
+The `preferCborLabelsOverNames` configuration switch can be used to prefer number labels over SerialNames in case both
+are present for a property. This duality allows for compact representation of a type when serialized to CBOR, while
+keeping expressive diagnostic names when serializing to JSON.
+
+A predefined Cbor instance (in addition to the default [`Cbor.Default`](Cbor.kt) one) is available, adhering to COSE
+encoding requirements as [`Cbor.CoseCompliant`](Cbor.kt). This instance uses definite length encoding,
+encodes and verifies all tags and prefers labels to serial names.
+
+### Arrays
+
+Classes may be serialized as a CBOR Array (major type 4) instead of a CBOR Map (major type 5).
+
+Example usage:
+
+```
+@Serializable
+data class DataClass(
+    val alg: Int,
+    val kid: String?
+)
+
+Cbor.encodeToByteArray(DataClass(alg = -7, kid = null))
+```
+
+will normally produce a Cbor map: bytes `0xa263616c6726636b6964f6`, or in diagnostic notation:
+
+```
+A2           # map(2)
+   63        # text(3)
+      616C67 # "alg"
+   26        # negative(6)
+   63        # text(3)
+      6B6964 # "kid"
+   F6        # primitive(22)
+```
+
+When annotated with `@CborArray`, serialization of the same object will produce a Cbor array: bytes `0x8226F6`, or in diagnostic notation:
+
+```
+82    # array(2)
+   26 # negative(6)
+   F6 # primitive(22)
+```
+This may be used to encode COSE structures, see [RFC 9052 2. Basic COSE Structure](https://www.rfc-editor.org/rfc/rfc9052#section-2).
+
+
+### Custom CBOR-specific Serializers
+Cbor encoders and decoders implement the interfaces [CborEncoder](CborEncoder.kt) and [CborDecoder](CborDecoder.kt), respectively.
+These interfaces contain a single property, `cbor`, exposing the current CBOR serialization configuration.
+This enables custom cbor-specific serializers to reuse the current `Cbor` instance to produce embedded byte arrays or
+react to configuration settings such as `preferCborLabelsOverNames` or `useDefiniteLengthEncoding`, for example.
+
 ## ProtoBuf (experimental)
 
 [Protocol Buffers](https://developers.google.com/protocol-buffers) is a language-neutral binary format that normally
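
The byte sequences quoted in the `docs/formats.md` hunk can be reproduced by hand, which may help when comparing definite and indefinite length output. The following is a minimal sketch in plain Kotlin and is not the kotlinx.serialization encoder: all helper names (`head`, `text`, `negative`, `cborNull`, `concat`, `definiteMap`, `indefiniteMap`, `definiteArray`, `hex`, `entries`) are hypothetical, and only arguments in the range 0..23 are supported, since those fit into the initial byte.

```kotlin
// Hand-rolled CBOR sketch for illustration only; all names here are hypothetical.

// Initial byte: major type in the top 3 bits, argument in the low 5 bits.
fun head(majorType: Int, argument: Int): ByteArray {
    require(majorType in 0..7 && argument in 0..23) { "sketch handles arguments 0..23 only" }
    return byteArrayOf(((majorType shl 5) or argument).toByte())
}

// Major type 3: UTF-8 text string, prefixed with its byte length.
fun text(s: String): ByteArray {
    val utf8 = s.encodeToByteArray()
    return head(3, utf8.size) + utf8
}

// Major type 1: a negative integer n is encoded as the argument -1 - n.
fun negative(n: Int): ByteArray = head(1, -1 - n)

// Simple value 22 (null).
val cborNull: ByteArray = byteArrayOf(0xF6.toByte())

fun concat(items: List<ByteArray>): ByteArray =
    items.fold(ByteArray(0)) { acc, item -> acc + item }

// Definite length: the entry count is part of the head, no terminator follows.
fun definiteMap(entries: List<Pair<ByteArray, ByteArray>>): ByteArray =
    head(5, entries.size) + concat(entries.flatMap { listOf(it.first, it.second) })

// Indefinite length: 0xBF opens the map, 0xFF ("break") closes it.
fun indefiniteMap(entries: List<Pair<ByteArray, ByteArray>>): ByteArray =
    byteArrayOf(0xBF.toByte()) +
        concat(entries.flatMap { listOf(it.first, it.second) }) +
        byteArrayOf(0xFF.toByte())

// Major type 4 with a definite element count, as used for @CborArray classes.
fun definiteArray(items: List<ByteArray>): ByteArray = head(4, items.size) + concat(items)

fun ByteArray.hex(): String =
    joinToString("") { (it.toInt() and 0xFF).toString(16).padStart(2, '0') }

// The DataClass(alg = -7, kid = null) example from the documentation.
val entries = listOf(text("alg") to negative(-7), text("kid") to cborNull)
```

Under these assumptions, `definiteMap(entries).hex()` yields `a263616c6726636b6964f6` and `definiteArray(listOf(negative(-7), cborNull)).hex()` yields `8226f6`, matching the two byte strings in the Arrays section. `indefiniteMap(entries).hex()` yields `bf63616c6726636b6964f6ff`. Note that the `A2` head in the documented map is the definite length form.
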

docs/serialization-guide.md

Lines changed: 4 additions & 0 deletions
@@ -147,6 +147,10 @@ Once the project is set up, we can start serializing some classes.
   * <a name='cbor-experimental'></a>[CBOR (experimental)](formats.md#cbor-experimental)
     * <a name='ignoring-unknown-keys'></a>[Ignoring unknown keys](formats.md#ignoring-unknown-keys)
     * <a name='byte-arrays-and-cbor-data-types'></a>[Byte arrays and CBOR data types](formats.md#byte-arrays-and-cbor-data-types)
+    * <a name='definite-vs-indefinite-length-encoding'></a>[Definite vs. Indefinite Length Encoding](formats.md#definite-vs-indefinite-length-encoding)
+    * <a name='tags-and-labels'></a>[Tags and Labels](formats.md#tags-and-labels)
+    * <a name='arrays'></a>[Arrays](formats.md#arrays)
+    * <a name='custom-cbor-specific-serializers'></a>[Custom CBOR-specific Serializers](formats.md#custom-cbor-specific-serializers)
   * <a name='protobuf-experimental'></a>[ProtoBuf (experimental)](formats.md#protobuf-experimental)
     * <a name='field-numbers'></a>[Field numbers](formats.md#field-numbers)
     * <a name='integer-types'></a>[Integer types](formats.md#integer-types)

formats/cbor/api/kotlinx-serialization-cbor.api

Lines changed: 119 additions & 1 deletion
@@ -7,26 +7,144 @@ public synthetic class kotlinx/serialization/cbor/ByteString$Impl : kotlinx/seri
 
 public abstract class kotlinx/serialization/cbor/Cbor : kotlinx/serialization/BinaryFormat {
     public static final field Default Lkotlinx/serialization/cbor/Cbor$Default;
-    public synthetic fun <init> (ZZLkotlinx/serialization/modules/SerializersModule;Lkotlin/jvm/internal/DefaultConstructorMarker;)V
+    public synthetic fun <init> (Lkotlinx/serialization/cbor/CborConfiguration;Lkotlinx/serialization/modules/SerializersModule;Lkotlin/jvm/internal/DefaultConstructorMarker;)V
     public fun decodeFromByteArray (Lkotlinx/serialization/DeserializationStrategy;[B)Ljava/lang/Object;
     public fun encodeToByteArray (Lkotlinx/serialization/SerializationStrategy;Ljava/lang/Object;)[B
+    public final fun getConfiguration ()Lkotlinx/serialization/cbor/CborConfiguration;
     public fun getSerializersModule ()Lkotlinx/serialization/modules/SerializersModule;
 }
 
 public final class kotlinx/serialization/cbor/Cbor$Default : kotlinx/serialization/cbor/Cbor {
+    public final fun getCoseCompliant ()Lkotlinx/serialization/cbor/Cbor;
+}
+
+public abstract interface annotation class kotlinx/serialization/cbor/CborArray : java/lang/annotation/Annotation {
+}
+
+public synthetic class kotlinx/serialization/cbor/CborArray$Impl : kotlinx/serialization/cbor/CborArray {
+    public fun <init> ()V
 }
 
 public final class kotlinx/serialization/cbor/CborBuilder {
+    public final fun getAlwaysUseByteString ()Z
     public final fun getEncodeDefaults ()Z
+    public final fun getEncodeKeyTags ()Z
+    public final fun getEncodeObjectTags ()Z
+    public final fun getEncodeValueTags ()Z
     public final fun getIgnoreUnknownKeys ()Z
+    public final fun getPreferCborLabelsOverNames ()Z
     public final fun getSerializersModule ()Lkotlinx/serialization/modules/SerializersModule;
+    public final fun getUseDefiniteLengthEncoding ()Z
+    public final fun getVerifyKeyTags ()Z
+    public final fun getVerifyObjectTags ()Z
+    public final fun getVerifyValueTags ()Z
+    public final fun setAlwaysUseByteString (Z)V
     public final fun setEncodeDefaults (Z)V
+    public final fun setEncodeKeyTags (Z)V
+    public final fun setEncodeObjectTags (Z)V
+    public final fun setEncodeValueTags (Z)V
     public final fun setIgnoreUnknownKeys (Z)V
+    public final fun setPreferCborLabelsOverNames (Z)V
     public final fun setSerializersModule (Lkotlinx/serialization/modules/SerializersModule;)V
+    public final fun setUseDefiniteLengthEncoding (Z)V
+    public final fun setVerifyKeyTags (Z)V
+    public final fun setVerifyObjectTags (Z)V
+    public final fun setVerifyValueTags (Z)V
+}
+
+public final class kotlinx/serialization/cbor/CborConfiguration {
+    public final fun getAlwaysUseByteString ()Z
+    public final fun getEncodeDefaults ()Z
+    public final fun getEncodeKeyTags ()Z
+    public final fun getEncodeObjectTags ()Z
+    public final fun getEncodeValueTags ()Z
+    public final fun getIgnoreUnknownKeys ()Z
+    public final fun getPreferCborLabelsOverNames ()Z
+    public final fun getUseDefiniteLengthEncoding ()Z
+    public final fun getVerifyKeyTags ()Z
+    public final fun getVerifyObjectTags ()Z
+    public final fun getVerifyValueTags ()Z
+    public fun toString ()Ljava/lang/String;
+}
+
+public abstract interface class kotlinx/serialization/cbor/CborDecoder : kotlinx/serialization/encoding/Decoder {
+    public abstract fun getCbor ()Lkotlinx/serialization/cbor/Cbor;
+}
+
+public final class kotlinx/serialization/cbor/CborDecoder$DefaultImpls {
+    public static fun decodeNullableSerializableValue (Lkotlinx/serialization/cbor/CborDecoder;Lkotlinx/serialization/DeserializationStrategy;)Ljava/lang/Object;
+    public static fun decodeSerializableValue (Lkotlinx/serialization/cbor/CborDecoder;Lkotlinx/serialization/DeserializationStrategy;)Ljava/lang/Object;
+}
+
+public abstract interface class kotlinx/serialization/cbor/CborEncoder : kotlinx/serialization/encoding/Encoder {
+    public abstract fun getCbor ()Lkotlinx/serialization/cbor/Cbor;
+}
+
+public final class kotlinx/serialization/cbor/CborEncoder$DefaultImpls {
+    public static fun beginCollection (Lkotlinx/serialization/cbor/CborEncoder;Lkotlinx/serialization/descriptors/SerialDescriptor;I)Lkotlinx/serialization/encoding/CompositeEncoder;
+    public static fun encodeNotNullMark (Lkotlinx/serialization/cbor/CborEncoder;)V
+    public static fun encodeNullableSerializableValue (Lkotlinx/serialization/cbor/CborEncoder;Lkotlinx/serialization/SerializationStrategy;Ljava/lang/Object;)V
+    public static fun encodeSerializableValue (Lkotlinx/serialization/cbor/CborEncoder;Lkotlinx/serialization/SerializationStrategy;Ljava/lang/Object;)V
 }
 
 public final class kotlinx/serialization/cbor/CborKt {
     public static final fun Cbor (Lkotlinx/serialization/cbor/Cbor;Lkotlin/jvm/functions/Function1;)Lkotlinx/serialization/cbor/Cbor;
     public static synthetic fun Cbor$default (Lkotlinx/serialization/cbor/Cbor;Lkotlin/jvm/functions/Function1;ILjava/lang/Object;)Lkotlinx/serialization/cbor/Cbor;
 }
 
+public abstract interface annotation class kotlinx/serialization/cbor/CborLabel : java/lang/annotation/Annotation {
+    public abstract fun label ()J
+}
+
+public synthetic class kotlinx/serialization/cbor/CborLabel$Impl : kotlinx/serialization/cbor/CborLabel {
+    public fun <init> (J)V
+    public final synthetic fun label ()J
+}
+
+public final class kotlinx/serialization/cbor/CborTag {
+    public static final field BASE16 J
+    public static final field BASE64 J
+    public static final field BASE64_URL J
+    public static final field BIGFLOAT J
+    public static final field BIGNUM_NEGAIVE J
+    public static final field BIGNUM_POSITIVE J
+    public static final field CBOR_ENCODED_DATA J
+    public static final field CBOR_SELF_DESCRIBE J
+    public static final field DATE_TIME_EPOCH J
+    public static final field DATE_TIME_STANDARD J
+    public static final field DECIMAL_FRACTION J
+    public static final field INSTANCE Lkotlinx/serialization/cbor/CborTag;
+    public static final field MIME_MESSAGE J
+    public static final field REGEX J
+    public static final field STRING_BASE64 J
+    public static final field STRING_BASE64_URL J
+    public static final field URI J
+}
+
+public abstract interface annotation class kotlinx/serialization/cbor/KeyTags : java/lang/annotation/Annotation {
+    public abstract fun tags ()[J
+}
+
+public synthetic class kotlinx/serialization/cbor/KeyTags$Impl : kotlinx/serialization/cbor/KeyTags {
+    public synthetic fun <init> ([JLkotlin/jvm/internal/DefaultConstructorMarker;)V
+    public final synthetic fun tags ()[J
+}
+
+public abstract interface annotation class kotlinx/serialization/cbor/ObjectTags : java/lang/annotation/Annotation {
+    public abstract fun tags ()[J
+}
+
+public synthetic class kotlinx/serialization/cbor/ObjectTags$Impl : kotlinx/serialization/cbor/ObjectTags {
+    public synthetic fun <init> ([JLkotlin/jvm/internal/DefaultConstructorMarker;)V
+    public final synthetic fun tags ()[J
+}
+
+public abstract interface annotation class kotlinx/serialization/cbor/ValueTags : java/lang/annotation/Annotation {
+    public abstract fun tags ()[J
+}
+
+public synthetic class kotlinx/serialization/cbor/ValueTags$Impl : kotlinx/serialization/cbor/ValueTags {
+    public synthetic fun <init> ([JLkotlin/jvm/internal/DefaultConstructorMarker;)V
+    public final synthetic fun tags ()[J
+}
+
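
The `CborEncoder` and `CborDecoder` interfaces listed in this API dump each expose a `cbor` property, which the documentation describes as a way for custom serializers to reuse the active `Cbor` instance. A minimal sketch of that idea might look as follows; the `Inner` class and `EmbeddedInnerSerializer` are hypothetical names, and falling back to the default `Cbor` instance for non-CBOR formats is an assumption made for this example, not behavior prescribed by the library.

```kotlin
import kotlinx.serialization.ExperimentalSerializationApi
import kotlinx.serialization.KSerializer
import kotlinx.serialization.Serializable
import kotlinx.serialization.builtins.ByteArraySerializer
import kotlinx.serialization.cbor.Cbor
import kotlinx.serialization.cbor.CborDecoder
import kotlinx.serialization.cbor.CborEncoder
import kotlinx.serialization.descriptors.SerialDescriptor
import kotlinx.serialization.encoding.Decoder
import kotlinx.serialization.encoding.Encoder

// Hypothetical payload type.
@Serializable
data class Inner(val alg: Int)

// Hypothetical serializer that embeds Inner as a nested, already-encoded CBOR byte array.
@OptIn(ExperimentalSerializationApi::class)
object EmbeddedInnerSerializer : KSerializer<Inner> {
    private val bytes = ByteArraySerializer()

    // Reuses the ByteArray descriptor, since the value is written as raw bytes.
    override val descriptor: SerialDescriptor = bytes.descriptor

    override fun serialize(encoder: Encoder, value: Inner) {
        // Reuse the enclosing Cbor instance when available (assumed fallback: the default instance).
        val cbor = (encoder as? CborEncoder)?.cbor ?: Cbor
        val embedded = cbor.encodeToByteArray(Inner.serializer(), value)
        encoder.encodeSerializableValue(bytes, embedded)
    }

    override fun deserialize(decoder: Decoder): Inner {
        val cbor = (decoder as? CborDecoder)?.cbor ?: Cbor
        val embedded = decoder.decodeSerializableValue(bytes)
        return cbor.decodeFromByteArray(Inner.serializer(), embedded)
    }
}
```

Because the nested value is encoded with the same instance, settings such as `useDefiniteLengthEncoding` or `preferCborLabelsOverNames` carry over to the embedded bytes. Whether those bytes end up as major type 2 or major type 4 would still depend on `alwaysUseByteString` or a `@ByteString` annotation on the property.
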

0 commit comments
