
PYTHON-5144 - Add async performance benchmarks #2188


Merged: 8 commits merged into mongodb:master on Mar 13, 2025

Conversation

NoahStapp
Contributor

First draft.

@NoahStapp NoahStapp requested a review from ShaneHarvey March 7, 2025 19:35

# Platform notes
Member

I was going to ask if this was an intentional change, but then I noticed that all these platform notes are long out of date. Let's remove them?


    async def do_task(self):
        command = self.client.perftest.command
        await asyncio.gather(*[command("hello", True) for _ in range(NUM_DOCS)])
Member

For this and the other asyncio.gather calls, can we replace them with the concurrent() helper?

Contributor Author

In what way? Currently each test calls concurrent(self.n_tasks, self.do_task), which results in self.n_tasks calls of self.do_task, each of which has a gather call for NUM_DOCS coroutines.

Member

Oh I see the real problem. We're running NUM_DOCS concurrent tasks here even though this benchmark is supposed to be serial. We need to change this to:

    async def do_task(self):
        command = self.client.perftest.command
        for _ in range(NUM_DOCS):
            await command("hello", True)

Only the concurrent tests should be using asyncio.gather.
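
For context, a minimal self-contained sketch of how this serial do_task fits with the concurrent() helper quoted further down in this PR. It uses asyncio.sleep as a stand-in for the real command call; NUM_DOCS, fake_command, and the main() entry point are placeholders, not code from the PR:

    import asyncio

    NUM_DOCS = 10  # stand-in value; the real benchmarks take this from the perf spec


    async def fake_command():
        # Stand-in for `await command("hello", True)` against a real MongoDB client.
        await asyncio.sleep(0)


    async def concurrent(n_tasks, func):
        # The PR's helper: spawn exactly n_tasks coroutines and await them all.
        await asyncio.gather(*[func() for _ in range(n_tasks)])


    async def do_task():
        # Serial benchmark body: NUM_DOCS operations awaited one after another,
        # with no gather() call inside.
        for _ in range(NUM_DOCS):
            await fake_command()


    async def main():
        await concurrent(1, do_task)  # serial benchmark (e.g. RunCommand)
        await concurrent(8, do_task)  # concurrent benchmark (e.g. RunCommand8Tasks)


    asyncio.run(main())

Under this model the only place tasks run concurrently is inside concurrent(), and the "XXTasks" variants differ from the serial benchmarks only in their n_tasks value.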

Contributor Author
@NoahStapp NoahStapp Mar 11, 2025

Ah, so we need to have separate serial benchmarks and concurrent ones using the same tests? So only the benchmarks with n_tasks > 1 should use gather?

Member

Yes.



# class TestRunCommand800Tasks(TestRunCommand):
#     n_tasks = 800
Member

Any reason these are commented out?

Contributor Author

Whoops, these were benchmarks to measure extremely concurrent workloads; they take too long to run regularly, in my opinion.


async def concurrent(n_tasks, func):
    tasks = [func() for _ in range(n_tasks)]
    await asyncio.gather(*tasks)
Member
@ShaneHarvey ShaneHarvey Mar 10, 2025

In the threaded version of the tests we run X threads each running a loop of operations so we're comparing different things here. Perhaps we should change the sync version to use a ThreadPoolExecutor and farm out the operations one by one. I'll create a ticket for it.
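
That change is not part of this PR, but a rough sketch of the idea, with a hypothetical stand-in for the synchronous driver call, might look like:

    from concurrent.futures import ThreadPoolExecutor

    NUM_OPERATIONS = 80  # stand-in value


    def one_operation():
        # Hypothetical stand-in for a single synchronous driver call,
        # e.g. client.perftest.command("hello", True).
        pass


    def run_with_pool(n_workers, n_operations):
        # Farm the operations out one by one to a fixed-size pool, instead of
        # giving each thread its own loop of operations.
        with ThreadPoolExecutor(max_workers=n_workers) as executor:
            futures = [executor.submit(one_operation) for _ in range(n_operations)]
            for future in futures:
                future.result()  # re-raise any exception from a worker


    run_with_pool(n_workers=8, n_operations=NUM_OPERATIONS)

That would make the sync benchmarks schedule work the same way the async ones hand individual coroutines to asyncio.gather.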


@NoahStapp NoahStapp requested a review from ShaneHarvey March 11, 2025 18:30
@NoahStapp
Contributor Author

The test failure is also present in master, tracked in jira.mongodb.org/browse/PYTHON-5198.

@ShaneHarvey
Member

I did some wrangling and here's the comparison on the last patch (8.0 non-ssl):

test    async/sync ratio    async throughput    sync throughput
FindManyAndEmptyCursor 1.035496424 69.77490806 67.38305075
FindManyAndEmptyCursor8Tasks 1.507496722 71.0279643 47.11649668
FindManyAndEmptyCursor80Tasks #DIV/0! 64.47844669 0
FindOneByID 0.2723894945 1.211740536 4.448558262
FindOneByID8Tasks 0.5625644218 2.341984316 4.163050888
FindOneByID80Tasks #DIV/0! 2.12813611 0
GridFsDownload 1.250465656 794.822496 635.6212123
GridFsUpload 1.134718501 502.2968367 442.6620666
LargeDocBulkInsert 0.9867934465 96.99364676 98.29174191
LargeDocClientBulkInsert 1.001660327 83.80030809 83.66140285
LargeDocInsertOne 1.54110005 153.1988567 99.40876757
RunCommand 0.4323049556 0.0280939154 0.0649863367
RunCommand8Tasks 0.5468943557 0.0270325139 0.0494291331
RunCommand80Tasks #DIV/0! 0.0259197011 0
SmallDocBulkInsert 0.9685460146 35.55207278 36.70664299
SmallDocBulkMixedOps 0.5066807796 0.3225225767 0.6365399867
SmallDocClientBulkInsert 0.944401396 22.55657991 23.88452623
SmallDocClientBulkMixedOps 1.027585166 1.574819428 1.532543948
SmallDocInsertOne 0.5590928165 0.4870381601 0.8711221924

For the "async/sync" column, >1 means async throughput is higher and <1 means sync is higher.

What I did was use the Download CSV link on the EVG perf panel, import both files into Sheets, sort by test name, delete all columns except "patch_result" and "test", and then combine the sheets. Is there an easier way to compare the results?

I'd like to see the same for the 6.0 ssl tasks.


    async def do_task(self):
        command = self.client.perftest.command
        await asyncio.gather(*[command("hello", True) for _ in range(NUM_DOCS)])
Member

We're still accidentally using gather() here and below. concurrent() should be the only place gather() is used.

Contributor Author

Using asyncio.gather() here spawns a new task for each command() call. Are you picturing a different model of concurrency than the one in use here?

Member
@ShaneHarvey ShaneHarvey Mar 12, 2025

Right, but we don't want to spawn concurrent tasks in any benchmark besides the XXTasks ones, and in those we only want to spawn XX concurrent tasks, not NUM_DOCS concurrent tasks.

Contributor Author

But the XXTask benchmarks will then only be testing small concurrent workloads that contain lots of looping. Wouldn't we get a better measurement of concurrent performance with less looping and more concurrent tasks?

Member

Sure, but that's what we want. Currently the benchmark is named "FindOneByID80Tasks", but it's not running 80 tasks; it's running 80*NUM_DOCS tasks. That's a bug.

Contributor Author

Is that what we want? We want some benchmarks that run lots of concurrent tasks, which more accurately represent a highly concurrent workload. If the issue here is just the naming, then I can add those benchmarks separately.

@NoahStapp
Contributor Author

NoahStapp commented Mar 12, 2025

Writing a script to ingest the raw JSON from the EVG perf panel and spit out this comparison should be pretty straightforward. Let me take a stab at it.
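
The gist linked in the next comment is the actual script; a minimal sketch of just the comparison step, assuming the per-benchmark throughputs have already been pulled out of the Evergreen JSON into plain dicts, could look like:

    def compare(async_results: dict[str, float], sync_results: dict[str, float]) -> None:
        # Both dicts map benchmark name -> measured throughput for one patch run.
        print("Async throughput / sync throughput:")
        for name in sorted(async_results):
            if name in sync_results and sync_results[name] != 0:
                print(f"\t{name}: {async_results[name] / sync_results[name]}")
        print("Async-only test throughput:")
        for name in sorted(async_results):
            if name not in sync_results:
                print(f"\t{name}: {async_results[name]}")


    # Example with made-up numbers:
    compare(
        {"RunCommand": 0.028, "RunCommand80Tasks": 0.026},
        {"RunCommand": 0.065},
    )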

@NoahStapp
Contributor Author

NoahStapp commented Mar 12, 2025

Using https://gist.github.com/NoahStapp/16aa44b865d009261c564c841a0685ad, here are the results from a run of the 6.0-ssl tasks:

Async throughput / sync throughput:
	LargeDocBulkInsert: 1.0078337905239618
	GridFsDownload: 1.2246978241796695
	SmallDocBulkInsert: 1.002887208998979
	SmallDocInsertOne: 0.6272763199110158
	SmallDocClientBulkMixedOps: 0.9869360165276234
	LargeDocInsertOne: 1.5416201189677572
	RunCommand: 0.4778627262846021
	SmallDocClientBulkInsert: 0.9455995942235073
	GridFsUpload: 1.2192035545671303
	FindOneByID: 0.3466335507326009
	LargeDocClientBulkInsert: 1.0260300595712932
	FindManyAndEmptyCursor: 1.0977886637249936
	SmallDocBulkMixedOps: 0.529439661767573
Async-only test throughput:
	RunCommand8Tasks: 0.0261686816727816
	FindOneByID80Tasks: 2.1194720587431
	FindManyAndEmptyCursor8Tasks: 69.6872796695631
	RunCommand80Tasks: 0.0245952602969892
	FindOneByID8Tasks: 2.27405421145451
	FindManyAndEmptyCursor80Tasks: 63.8084299953291

The run itself: https://spruce.mongodb.com/version/67d0840be58c830007b75115/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC

@NoahStapp
Contributor Author

NoahStapp commented Mar 13, 2025

Updated 8.0 comparison:

Async throughput / sync throughput:
	FindManyAndEmptyCursor: 1.0713559030068505
	FindManyAndEmptyCursor8Tasks: 1.4795313246609896
	FindOneByID: 0.5400884974214064
	FindOneByID8Tasks: 0.7874637480943002
	GridFsDownload: 1.1928245548194891
	GridFsUpload: 1.1288262436715597
	LargeDocBulkInsert: 1.0018917671180718
	LargeDocClientBulkInsert: 1.0169017022503946
	LargeDocInsertOne: 1.5453087596553046
	RunCommand: 0.43045923951875237
	RunCommand8Tasks: 0.7046663599353192
	SmallDocBulkInsert: 0.9718963479412474
	SmallDocBulkMixedOps: 0.5076476995225617
	SmallDocClientBulkInsert: 0.9594405742454791
	SmallDocClientBulkMixedOps: 1.032939212184362
	SmallDocInsertOne: 0.5446315066024968
Async-only test throughput:
	FindManyAndEmptyCursor80Tasks: 58.7702816020482
	FindManyAndEmptyCursorManyTasks: 53.7951227130617
	FindOneByID80Tasks: 3.12601199692593
	FindOneByIDManyTasks: 2.25178724747275
	RunCommand80Tasks: 0.0307018919795521
	RunCommandManyTasks: 0.0259767188333268

6.0 with ssl:

Async throughput / sync throughput:
	FindManyAndEmptyCursor: 1.1148888328889521
	FindManyAndEmptyCursor8Tasks: 1.2292068139214227
	FindOneByID: 0.5628404916816315
	FindOneByID8Tasks: 0.7250775180506075
	GridFsDownload: 1.2265791168581033
	GridFsUpload: 1.1172493351535546
	LargeDocBulkInsert: 1.0241106971206142
	LargeDocClientBulkInsert: 1.0411601857296626
	LargeDocInsertOne: 1.6042860101449359
	RunCommand: 0.4622670498004722
	RunCommand8Tasks: 0.6483034992665351
	SmallDocBulkInsert: 1.0083055402152894
	SmallDocBulkMixedOps: 0.5265934792474817
	SmallDocClientBulkInsert: 0.9523263557731153
	SmallDocClientBulkMixedOps: 0.9849426645112269
	SmallDocInsertOne: 0.6176642987581807
Async-only test throughput:
	FindManyAndEmptyCursor80Tasks: 63.0472761170005
	FindManyAndEmptyCursorManyTasks: 57.4746242684951
	FindOneByID80Tasks: 3.1210268089663
	FindOneByIDManyTasks: 2.43428121205249
	RunCommand80Tasks: 0.0312356720588811
	RunCommandManyTasks: 0.0279175701126114

Patch: https://spruce.mongodb.com/version/67d309f7d22fb0000771a758/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC

@NoahStapp NoahStapp requested a review from ShaneHarvey March 13, 2025 17:42
class TestSmallDocInsertOne(SmallDocInsertTest, AsyncPyMongoTestCase):
    async def do_task(self):
        insert_one = self.corpus.insert_one
        await asyncio.gather(*[insert_one(doc) for doc in self.documents])
Member

Still need to remove this asyncio.gather

class TestLargeDocInsertOne(LargeDocInsertTest, AsyncPyMongoTestCase):
    async def do_task(self):
        insert_one = self.corpus.insert_one
        await asyncio.gather(*[insert_one(doc) for doc in self.documents])
Member

Still need to remove this asyncio.gather

    n_tasks = 80


class TestRunCommandManyTasks(TestRunCommand):
Member

Should we call these "UnlimitedTasks" rather than "Many"?



class TestFindManyAndEmptyCursorManyTasks(TestFindManyAndEmptyCursor):
    n_tasks = 1000
Member

Should we remove the n_tasks=1000 benchmarks? We already have 80, do we need 1000?

Contributor Author
@NoahStapp NoahStapp Mar 13, 2025

Agreed. This benchmark doesn't meaningfully scale with concurrency.

@NoahStapp NoahStapp requested a review from ShaneHarvey March 13, 2025 18:44
@NoahStapp NoahStapp merged commit e6e8650 into mongodb:master Mar 13, 2025
31 of 33 checks passed