Some benchmark cleanup by hassila · Pull Request #3 · mustiikhalil/flatbuffers

hassila · 2023-11-21T11:25:09Z

Added ARC metric, some overall cleanup to use the built-in support for inner loops.

mustiikhalil · 2023-11-21T11:29:17Z

tests/swift/benchmarks/Benchmarks/FlatbuffersBenchmarks/FlatbuffersBenchmarks.swift


-  Benchmark("structs") { benchmark in
-    let structCount = 1_000_000
-
+  Benchmark("Structs", configuration: kiloConfiguration) { benchmark in
+    let structCount = 1_000


Why is structCount 1_000 now?

Yeah, should have commented that - sorry - it would crash due to what looks like OOM with 1M:

* thread #2, queue = 'com.apple.root.default-qos.cooperative', stop reason = Swift runtime failure: Not enough bits to represent the passed value
    frame #0: 0x000000010012a47c FlatbuffersBenchmarks`Int.convertToPowerofTwo.getter [inlined] Swift runtime failure: Not enough bits to represent the passed value at <compiler-generated>:0 [opt]
  * frame #1: 0x000000010012a47c FlatbuffersBenchmarks`Int.convertToPowerofTwo.getter [inlined] generic specialization <Swift.UInt32, Swift.Int> of Swift.UnsignedInteger< where τ_0_0: Swift.FixedWidthInteger>.init<τ_0_0 where τ_1_0: Swift.BinaryInteger>(τ_1_0) -> τ_0_0 at <compiler-generated>:0 [opt]
    frame #2: 0x000000010012a47c FlatbuffersBenchmarks`Int.convertToPowerofTwo.getter(self=4294967296) at Int+extension.swift:31:13 [opt]
    frame #3: 0x00000001001202c8 FlatbuffersBenchmarks`ByteBuffer.Storage.reallocate(size=80, writerSize=2147483576, alignment=8, self=0x00000001005c4b70) at ByteBuffer.swift:88:27 [opt]
    frame #4: 0x0000000100120c98 FlatbuffersBenchmarks`closure #1 (Swift.UnsafeRawBufferPointer) -> () in FlatBuffers.ByteBuffer.push<τ_0_0 where τ_0_0: FlatBuffers.NativeStruct>(elements: Swift.Array<τ_0_0>) -> () at ByteBuffer.swift:371:16
    frame #5: 0x0000000100129720 FlatbuffersBenchmarks`partial apply for closure #1 in ByteBuffer.push<A>(elements:) at <compiler-generated>:0 [opt]
    frame #6: 0x0000000190348b6c libswiftCore.dylib`Swift.Array.withUnsafeBytes<τ_0_0>((Swift.UnsafeRawBufferPointer) throws -> τ_1_0) throws -> τ_1_0 + 352
    frame #7: 0x0000000100126efc FlatbuffersBenchmarks`FlatBufferBuilder.createVector<A>(ofStructs:) [inlined] FlatBuffers.ByteBuffer.push<τ_0_0 where τ_0_0: FlatBuffers.NativeStruct>(elements=<unavailable>, self=FlatBuffers.ByteBuffer @ 0x000000016fe863e0) -> () at ByteBuffer.swift:246:14 [opt]
    frame #8: 0x0000000100126ec0 FlatbuffersBenchmarks`FlatBufferBuilder.createVector<A>(structs=<unavailable>, self=FlatBuffers.FlatBufferBuilder @ 0x000000016fe863d8) at FlatBufferBuilder.swift:626:9 [opt]
    frame #9: 0x0000000100130d80 FlatbuffersBenchmarks`closure #12 in closure #1 in variable initialization expression of benchmarks(benchmark=<unavailable>, array=5 values) at FlatbuffersBenchmarks.swift:153:25 [opt]
    frame #10: 0x00000001000a51d0 FlatbuffersBenchmarks`BenchmarkExecutor.run(_:) at Benchmark.swift:344:13 [opt]
    frame #11: 0x00000001000a51b8 FlatbuffersBenchmarks`BenchmarkExecutor.run(benchmark=0x00000001006b0d80, self=0x0000000100605480) at BenchmarkExecutor.swift:53:23 [opt]
    frame #12: 0x00000001000c6988 FlatbuffersBenchmarks`BenchmarkRunner.run(self=Benchmark.BenchmarkRunner @ 0x0000000100b8c440) at BenchmarkRunner.swift:192:49 [opt]
    frame #13: 0x00000001000c2f84 FlatbuffersBenchmarks`static BenchmarkRunnerHooks.main(self=0x00000001001ad048) at BenchmarkRunner.swift:92 [opt]
    frame #14: 0x000000010015fd58 FlatbuffersBenchmarks`specialized thunk for @escaping @convention(thin) @async () -> () at <compiler-generated>:0 [opt]

(Previously the inner loop was a single iteration; now that we moved up to kiloConfiguration we run 1K inner loops, so it still gets to 1M total. But I guess the question here is really "what do you want to measure?" We are putting 1M vectors into the single fb here, so what is the intended benchmark?)
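The numbers in the trace above look more like an integer conversion trap than a true out-of-memory condition. Frame #3 reports writerSize=2147483576 and size=80, and frame #2 reports self=4294967296 being converted to UInt32. The sketch below reproduces that arithmetic. It assumes the required capacity is writerSize + size rounded up to a power of two; nextPowerOfTwo is a hypothetical helper written for this illustration, not the library's Int.convertToPowerofTwo, and the code assumes a 64-bit Int.

```swift
// Hypothetical helper (not the FlatBuffers implementation):
// smallest power of two that is >= n, for n >= 1.
func nextPowerOfTwo(_ n: Int) -> Int {
  precondition(n >= 1, "n must be positive")
  var p = 1
  while p < n { p <<= 1 }
  return p
}

// Values taken from frame #3 of the trace.
let writerSize = 2_147_483_576
let size = 80

// Assumption: the buffer needs room for both, rounded up to a power of two.
let required = writerSize + size          // just over 2^31
let rounded = nextPowerOfTwo(required)    // 2^32, the value seen in frame #2

// UInt32(rounded) would trap with "Not enough bits to represent the passed value".
// UInt32(exactly:) returns nil instead of trapping.
let fitsInUInt32 = UInt32(exactly: rounded) != nil

print(required, rounded, fitsInUInt32)
// prints "2147483656 4294967296 false"
```

Under that assumption, the buffer had grown to just under 2 GiB, and the next power of two no longer fits in a UInt32.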

Can we keep the million but reduce the iterations? Or clear the buffer after each iteration?

Clearing the buffer gave a runtime of 220 seconds (plus another 220 seconds for the single warmup iteration).

I will return it to how it was, with a single iteration; it still gives 10+ samples and a runtime of ~223 ms.

Structs
╒════════════════════════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╤═════════╕
│ Metric                     │      p0 │     p25 │     p50 │     p75 │     p90 │     p99 │    p100 │ Samples │
╞════════════════════════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╪═════════╡
│ Malloc (total)             │      21 │      21 │      21 │      21 │      21 │      21 │      21 │      13 │
├────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Memory (resident peak) (M) │     155 │     155 │     155 │     155 │     156 │     156 │     156 │      13 │
├────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Releases (K)               │    6000 │    6000 │    6000 │    6000 │    6000 │    6000 │    6000 │      13 │
├────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (total CPU) (ms)      │     223 │     223 │     224 │     225 │     226 │     228 │     228 │      13 │
├────────────────────────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Time (wall clock) (ms)     │     224 │     225 │     225 │     226 │     226 │     229 │     229 │      13 │
╘════════════════════════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╧═════════╛

But in the code above, are we only measuring the time it takes to add 5 structs to the fb? Or are we measuring the total time it takes to add 5 structs a million times into a buffer?

That is 5 structs a million times into a buffer: the loop for _ in benchmark.scaledIterations { runs 1M times.
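To make the counting explicit, here is a minimal stand-alone model of what one sample covers. It assumes a scaling factor of one million, as stated in the reply above, and a 5-element struct array (the trace shows array=5 values). ScalingFactor, Run and totalStructsPushed are hypothetical stand-ins written for this sketch; they are not the benchmark library's types.

```swift
// Hypothetical stand-in for a benchmark scaling factor.
enum ScalingFactor: Int {
  case one = 1
  case kilo = 1_000
  case mega = 1_000_000
}

// Hypothetical stand-in for the benchmark handle passed to the closure.
struct Run {
  let scalingFactor: ScalingFactor
  // One measured sample executes the inner loop this many times.
  var scaledIterations: Range<Int> { 0..<scalingFactor.rawValue }
}

// Counts how many structs a single sample would push into the buffer
// when every inner iteration adds one vector of `structsPerVector` structs.
func totalStructsPushed(run: Run, structsPerVector: Int) -> Int {
  var pushed = 0
  for _ in run.scaledIterations {
    pushed += structsPerVector
  }
  return pushed
}

let run = Run(scalingFactor: .mega)
print(run.scaledIterations.count, totalStructsPushed(run: run, structsPerVector: 5))
// prints "1000000 5000000"
```

In this model the reported time is the total for all one million vector insertions in a sample, not the time for a single 5-struct insertion.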

Okay then, perfect!

Some benchmark cleanup

1f6bcd3

github-actions bot added the swift label Nov 21, 2023

hassila mentioned this pull request Nov 21, 2023

[Swift] Migrating benchmarks to a newer lib. google/flatbuffers#8168

Merged

mustiikhalil approved these changes Nov 21, 2023


hassila added 2 commits November 21, 2023 12:58

Return back to 1M structs

1d826e4

Tweak Structs benchmark

c93b3fc

mustiikhalil merged commit 1310f14 into mustiikhalil:update-struct-pushing-to-buffer Nov 21, 2023

hassila deleted the benchmark-fixes branch November 21, 2023 12:32

mustiikhalil merged 3 commits into mustiikhalil:update-struct-pushing-to-buffer from hassila:benchmark-fixes
