DeeJayLSP
Contributor

@DeeJayLSP DeeJayLSP commented Sep 19, 2025

HashMap has a reserve() function, so I thought: why not expose it to Dictionary? There are a lot of places within the engine where we could use it to reduce bottlenecks.

Additionally, implementing it is quite simple, since all that's needed is to call HashMap's own reserve(), much like how clear() in Dictionary is mostly a wrapper around its internal HashMap's clear().
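
As a rough illustration of the pass-through described above, here is a minimal sketch in standard C++. It is not the engine's code: `SimpleDictionary` and its members are hypothetical names, and `std::unordered_map` only stands in for the internal HashMap, which differs internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a dictionary type that owns a hash map.
// None of these names are Godot's actual declarations.
class SimpleDictionary {
	// Plays the role of the internal HashMap (assumption: std::unordered_map).
	std::unordered_map<std::int64_t, std::int64_t> map;

public:
	// Both calls forward straight to the underlying container.
	void reserve(std::size_t p_capacity) { map.reserve(p_capacity); }
	void clear() { map.clear(); }

	void set(std::int64_t p_key, std::int64_t p_value) { map[p_key] = p_value; }

	// Looks up an existing key; a missing key is treated as a programming error.
	std::int64_t get(std::int64_t p_key) const {
		auto it = map.find(p_key);
		assert(it != map.end());
		return it->second;
	}

	std::size_t size() const { return map.size(); }
	std::size_t bucket_count() const { return map.bucket_count(); }
};
```

The wrapper adds no logic of its own, which is the point being made: the cost of the extra method is one forwarding call.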

I noticed in the GDScript VM that the constructors for both untyped and typed Dictionary use a for loop with a known final size, so that was possibly the best candidate for a stress test. The test consists of declaring a Dictionary with 63 elements mid-function 2²¹ times.
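
To make the "for loop with a known final size" concrete, the sketch below models a dictionary-construction step in standard C++. It assumes the operands arrive as a flat key/value list; `construct_dictionary`, `benchmark_operands`, and `IntDictionary` are hypothetical names, and the real VM code may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

using IntDictionary = std::unordered_map<std::int64_t, std::int64_t>;

// Hypothetical model of a "construct dictionary" instruction: operands are a
// flat key, value, key, value, ... list, so the entry count is known before
// the first insertion.
inline IntDictionary construct_dictionary(const std::vector<std::int64_t> &p_operands) {
	assert(p_operands.size() % 2 == 0); // Operands must come in key/value pairs.
	const std::size_t entry_count = p_operands.size() / 2;

	IntDictionary dict;
	dict.reserve(entry_count); // Size the bucket array once instead of growing it repeatedly.
	for (std::size_t i = 0; i < entry_count; i++) {
		dict[p_operands[i * 2]] = p_operands[i * 2 + 1];
	}
	return dict;
}

// Operand list matching the benchmark literal: keys 1..63, values 2^(key - 1).
inline std::vector<std::int64_t> benchmark_operands() {
	std::vector<std::int64_t> operands;
	operands.reserve(63 * 2);
	for (std::int64_t key = 1; key <= 63; key++) {
		operands.push_back(key);
		operands.push_back(std::int64_t(1) << (key - 1));
	}
	return operands;
}
```

Calling `construct_dictionary(benchmark_operands())` yields a 63-entry map equivalent to the literal in the script.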

Script
extends Node


func _ready() -> void:
	benchmark_untyped()
	benchmark_typed()


func benchmark_untyped() -> void:
	var time_start := Time.get_ticks_usec()
	for i in pow(2, 21):
		var x := {
			1: 1,
			2: 2,
			3: 4,
			4: 8,
			5: 16,
			6: 32,
			7: 64,
			8: 128,
			9: 256,
			10: 512,
			11: 1024,
			12: 2048,
			13: 4096,
			14: 8192,
			15: 16384,
			16: 32768,
			17: 65536,
			18: 131072,
			19: 262144,
			20: 524288,
			21: 1048576,
			22: 2097152,
			23: 4194304,
			24: 8388608,
			25: 16777216,
			26: 33554432,
			27: 67108864,
			28: 134217728,
			29: 268435456,
			30: 536870912,
			31: 1073741824,
			32: 2147483648,
			33: 4294967296,
			34: 8589934592,
			35: 17179869184,
			36: 34359738368,
			37: 68719476736,
			38: 137438953472,
			39: 274877906944,
			40: 549755813888,
			41: 1099511627776,
			42: 2199023255552,
			43: 4398046511104,
			44: 8796093022208,
			45: 17592186044416,
			46: 35184372088832,
			47: 70368744177664,
			48: 140737488355328,
			49: 281474976710656,
			50: 562949953421312,
			51: 1125899906842624,
			52: 2251799813685248,
			53: 4503599627370496,
			54: 9007199254740992,
			55: 18014398509481984,
			56: 36028797018963968,
			57: 72057594037927936,
			58: 144115188075855872,
			59: 288230376151711744,
			60: 576460752303423488,
			61: 1152921504606846976,
			62: 2305843009213693952,
			63: 4611686018427387904,
		}
	var time_end := Time.get_ticks_usec()
	print("time untyped: %dms" % ((time_end - time_start) / 1000))


func benchmark_typed() -> void:
	var time_start := Time.get_ticks_usec()
	for i in pow(2, 21):
		var x: Dictionary[int, int] = {
			1: 1,
			2: 2,
			3: 4,
			4: 8,
			5: 16,
			6: 32,
			7: 64,
			8: 128,
			9: 256,
			10: 512,
			11: 1024,
			12: 2048,
			13: 4096,
			14: 8192,
			15: 16384,
			16: 32768,
			17: 65536,
			18: 131072,
			19: 262144,
			20: 524288,
			21: 1048576,
			22: 2097152,
			23: 4194304,
			24: 8388608,
			25: 16777216,
			26: 33554432,
			27: 67108864,
			28: 134217728,
			29: 268435456,
			30: 536870912,
			31: 1073741824,
			32: 2147483648,
			33: 4294967296,
			34: 8589934592,
			35: 17179869184,
			36: 34359738368,
			37: 68719476736,
			38: 137438953472,
			39: 274877906944,
			40: 549755813888,
			41: 1099511627776,
			42: 2199023255552,
			43: 4398046511104,
			44: 8796093022208,
			45: 17592186044416,
			46: 35184372088832,
			47: 70368744177664,
			48: 140737488355328,
			49: 281474976710656,
			50: 562949953421312,
			51: 1125899906842624,
			52: 2251799813685248,
			53: 4503599627370496,
			54: 9007199254740992,
			55: 18014398509481984,
			56: 36028797018963968,
			57: 72057594037927936,
			58: 144115188075855872,
			59: 288230376151711744,
			60: 576460752303423488,
			61: 1152921504606846976,
			62: 2305843009213693952,
			63: 4611686018427387904,
		}
	var time_end := Time.get_ticks_usec()
	print("time typed: %dms" % ((time_end - time_start) / 1000))

I made two template release builds, with and without this PR, and ran the script.

Before:

time untyped: 7852ms
time typed: 27344ms

After:

time untyped: 7253ms
time typed: 23191ms
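
For reference, the relative gain implied by these timings can be worked out with a small helper. This is only arithmetic on the numbers reported above; `percent_reduction` is a hypothetical name.

```cpp
#include <cassert>

// Relative reduction in wall-clock time, in percent.
inline double percent_reduction(double p_before_ms, double p_after_ms) {
	assert(p_before_ms > 0.0);
	return (p_before_ms - p_after_ms) / p_before_ms * 100.0;
}
```

With the values above, `percent_reduction(7852, 7253)` comes out at roughly 7.6 for the untyped case and `percent_reduction(27344, 23191)` at roughly 15.2 for the typed case.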

There are countless places where reserve() could be used, so this is just the beginning.

@DeeJayLSP DeeJayLSP requested review from a team as code owners September 19, 2025 22:56
Member

@Ivorforce Ivorforce left a comment

Great idea! reserve() is especially useful for hashing types because they need to re-hash on growth, which wastes a lot of CPU cycles.
I just have one interface design comment; otherwise I think we should move forward with this.
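
The re-hash cost can be observed with a small standard C++ experiment. The sketch below uses `std::unordered_map` as a stand-in (an assumption; the engine's HashMap has its own growth policy) and counts how often the bucket array changes size during a fill. `count_rehashes` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Counts how many times the bucket array grows while inserting p_count
// distinct keys. Every growth forces all stored elements to be re-bucketed.
inline int count_rehashes(std::size_t p_count, bool p_reserve_first) {
	std::unordered_map<int, int> map;
	if (p_reserve_first) {
		map.reserve(p_count);
	}

	int rehashes = 0;
	std::size_t buckets = map.bucket_count();
	for (std::size_t i = 0; i < p_count; i++) {
		map.emplace(static_cast<int>(i), 0);
		if (map.bucket_count() != buckets) {
			buckets = map.bucket_count();
			rehashes++;
		}
	}
	assert(map.size() == p_count); // Keys are distinct, so nothing was dropped.
	return rehashes;
}
```

With a reservation, `count_rehashes(1000, true)` returns 0. Without one, the result is greater than zero, and the exact count depends on the standard library's growth policy.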

@clayjohn
Member

so I thought: why not expose it to Dictionary

"Why not?" is never a good reason to add something to Core. While the code is minimal, it has no reason to be exposed to end users in the first place. It just adds another entry to the API that users will never touch. If sizing Dictionaries is a bottleneck, then you are already using the wrong data structure and no amount of calling reserve() is going to save you

This is exactly the sort of improvement contemplated by the first two of our contributor best practices:
https://contributing.godotengine.org/en/latest/engine/guidelines/best_practices.html#to-solve-the-problem-it-has-to-exist-in-the-first-place

@DeeJayLSP
Contributor Author

DeeJayLSP commented Sep 20, 2025

"Why not?" is never a good reason to add something to Core.

You might have understood things wrong. Dictionary is used in countless places within the engine that can benefit from this.

While the code is minimal, it has no reason to be exposed to end users in the first place. It just adds another entry to the API that users will never touch.

Except it isn't? It's only exposed for internal use within the engine where beneficial.

My apologies if using the word "expose" led to wrong conclusions. I did a bit of rewording.

@clayjohn
Member

That's my bad. I didn't notice that there wasn't any binding code touched. That takes away my concern about this being exposed to end users. Adding something like this for internal use is a lot less of a problem.

That being said, if it's truly a benefit, then this PR should be using it somewhere dictionary allocations were a problem. Remember, the problem always needs to come first. We don't add solutions to the engine and then plan on finding problems. We start by identifying a problem, then we make changes to solve that problem.

This may very well be the correct solution to a problem, but in order to merge it, we need to know what that problem is.

It isn't enough to say that some places could benefit from this change. Show me what places do benefit from this change by profiling them before and after the change.

@Shadows-of-Fire
Contributor

If sizing Dictionaries is a bottleneck, then you are already using the wrong data structure and no amount of calling reserve() is going to save you

That seems like a bit of a nonsensical take on this particular feature. reserve() is a standard feature for maps in most languages, and if you know the size AOT you can optimize with a reservation, which can be significant for large dictionaries or for manual copies or transformations of large dictionaries.

I would consider this a core library feature for Dictionary that is currently absent. Given that the cost of adding the API surface is effectively zero (as the API is a simple passthrough to the underlying HashMap), there would not be sufficient justification to deny it.

This may very well be the correct solution to a problem, but in order to merge it, we need to know what that problem is.

You do know what that problem is. Churn in a HashMap is wasteful. You don't need to have strict performance numbers for something that is this straightforward.

@DeeJayLSP
Contributor Author

DeeJayLSP commented Sep 20, 2025

This may very well be the correct solution to a problem, but in order to merge it, we need to know what that problem is.

There are numerous situations within the engine where known final size dictionaries could be built faster for basically free by reserving. Isn't this a problem?

The point is that doing everything inside a single PR would be hard to keep track of.

Take #105928 for example, which does the same thing but for Vector, Array and String. It adds the possibility, and there are certainly more cases for that than this PR, but doesn't apply it anywhere.

The reason I applied it in the GDScript VM is mostly that this case could be reproduced and benchmarked from within GDScript.

@clayjohn
Member

There are numerous situations within the engine where known final size dictionaries could be built faster for basically free by reserving. Isn't this a problem?

Sure, find one and show me. If there are numerous it should not be hard. I'm really not asking for much.

The point is that doing everything inside a single PR would be hard to keep track of.

Sure, don't do everything then. Just solve the low hanging fruit issues. As it stands this PR does nothing, I am asking for it to do something before merging.

@clayjohn
Member

That seems like a bit of a nonsensical take on this particular feature. reserve() is a standard feature for maps in most languages, and if you know the size AOT you can optimize with a reservation, which can be significant for large dictionaries or for manual copies or transformations of large dictionaries.

@Shadows-of-Fire Dictionary isn't our standard map in the engine, HashMap is. By design we don't use Dictionary in any performance-sensitive areas. That is why I said above that if we find Dictionary in a performance-sensitive area, we are likely using the wrong data structure.

As I said above, if it's so obvious that this will fix a problem somewhere, then just find that problem and point it out. I'm not saying we shouldn't merge this, quite the contrary, I'm saying we should merge it after we have identified the problem it is solving and then used it to solve that problem.

If the problem is so obvious, you should have no problem finding it. To be blunt, don't waste your time arguing about whether something is useful in theory. I'm certain that reserve() is useful in theory. Instead, spend your time showing where it is useful in practice.

@DeeJayLSP
Contributor Author

DeeJayLSP commented Sep 20, 2025

A few cases I could find in core/, performance critical or not:

  • Engine::get_version/author/copyright/donor/license_info()
  • ProjectSettings::_add_builtin_input_map()
  • Image::_get_data()
  • marshalls.cpp (decode_variant())
  • PListNode::get_value()
  • ResourceLoaderBinary::parse_variant()
  • ResourceFormatLoader::rename_dependencies()
  • Resource::_duplicate_recursive()
  • AStarGrid2D::get_point_data_in_region()
  • TriangleMesh::intersect_*_scriptwrap()
  • A few functions within Object
  • Script::_get_script_constant_map() and ScriptServer::save_global_classes()
  • OS::get_memory_info()
  • Most functions in Time
  • TranslationPO::_get_messages()
  • Translation::_get_messages()

The one case I applied reserve() in this PR should speed up Dictionary construction from GDScript, as seen in my benchmark results. And no, we can't use HashMap there as it explicitly wants a Dictionary.

As it stands this PR does nothing

...

@Ivorforce
Member

Ivorforce commented Sep 20, 2025

@clayjohn Growth is especially costly for hashing types, so reserve is especially useful. The PR currently optimizes the case of constructing dictionaries in GDScript, which sees an 8-15% gain. This already motivates the PR in its own right.

In general though, while we don't use Dictionary internally, we can expect this function to be useful when we have to construct Dictionary for the user (e.g. for function returns).
The main alternative would be offering copy / move constructors, which is generally better but not always feasible.
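
The two options mentioned here can be contrasted in a short standard C++ sketch. It assumes `std::unordered_map` as the map type, and the names `InfoDictionary`, `dictionary_from_fields`, and `dictionary_from_map` are hypothetical. When the data already lives in the target type, moving it avoids re-insertion entirely; when it lives in some other container, reserving before the fill loop is what remains.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using InfoDictionary = std::unordered_map<std::string, int>;

// Case 1: the data lives in a different container, so entries must be
// inserted one by one. The final size is known, so reserve first.
inline InfoDictionary dictionary_from_fields(const std::vector<std::pair<std::string, int>> &p_fields) {
	InfoDictionary dict;
	dict.reserve(p_fields.size());
	for (const std::pair<std::string, int> &field : p_fields) {
		dict.emplace(field.first, field.second);
	}
	assert(dict.size() <= p_fields.size()); // Duplicate keys collapse into one entry.
	return dict;
}

// Case 2: the data is already in the target type. Moving transfers the
// existing storage, so nothing is re-inserted or re-hashed.
inline InfoDictionary dictionary_from_map(InfoDictionary &&p_source) {
	return InfoDictionary(std::move(p_source));
}
```

The second form is usually preferable when it is available, but it requires the source to already be the right type, which matches the "not always feasible" caveat.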

Member

@Ivorforce Ivorforce left a comment

In my opinion, this is needed. But I'd like to clear up concerns with @clayjohn before merge.

@Mickeon
Member

Mickeon commented Sep 20, 2025

"Expose" is a very strong word in Godot, as you can see. I would suggest renaming the commit and title to something else. Maybe "Allow Dictionary to use reserve() in the GDScript VM"?

@AThousandShips
Member

I'd say "Add reserve and use in GDScriptVM"

@DeeJayLSP DeeJayLSP changed the title from "Expose reserve() to Dictionary, use it on the GDScript VM" to "Add reserve() to Dictionary, apply to constructors on GDScript VM" Sep 20, 2025
@DeeJayLSP
Contributor Author

DeeJayLSP commented Sep 20, 2025

HashMap has a constructor where you can input the initial capacity and it will reserve right away. Would it be wise to add one to Dictionary too?

@AThousandShips
Member

I'd keep that separate. It would only work on the C++ side due to overrides etc., so it's less critical. We don't have it for LocalVector either, and it just saves one line.

@DeeJayLSP
Contributor Author

Rebased on top of #110717 changes.

I also noticed that Dictionary::recursive_duplicate() has a fill loop. I added a reserve there.
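
A fill loop of this kind might look like the following sketch in standard C++. It is not the engine's recursive_duplicate(): `deep_duplicate`, `SharedList`, and `ListDictionary` are hypothetical names, and shared pointers are used only to make the difference between a shallow and a deep copy visible.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

using SharedList = std::shared_ptr<std::vector<int>>;
using ListDictionary = std::unordered_map<int, SharedList>;

// Duplicates the map and every list it points to. A plain copy would share
// the lists, so each value has to be cloned inside a fill loop.
inline ListDictionary deep_duplicate(const ListDictionary &p_source) {
	ListDictionary copy;
	copy.reserve(p_source.size()); // The final size is known up front.
	for (const auto &entry : p_source) {
		SharedList cloned;
		if (entry.second) {
			cloned = std::make_shared<std::vector<int>>(*entry.second);
		}
		copy.emplace(entry.first, std::move(cloned));
	}
	assert(copy.size() == p_source.size());
	return copy;
}
```

Because the source size is known before the loop starts, the reservation costs a single extra line.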

Member

@Ivorforce Ivorforce left a comment

Re-approving after changes.
