
PYTHON-5144 - Add async performance benchmarks #2188


Merged: 8 commits merged into mongodb:master on Mar 13, 2025

Conversation

NoahStapp
Contributor

First draft.

@NoahStapp NoahStapp requested a review from ShaneHarvey March 7, 2025 19:35

# Platform notes
Member

I was going to ask if this was an intentional change, but then I noticed that all these platform notes are long out of date. Let's remove them?


    async def do_task(self):
        command = self.client.perftest.command
        await asyncio.gather(*[command("hello", True) for _ in range(NUM_DOCS)])
Member

For this and the other asyncio.gather calls, can we replace them with the concurrent() helper?

Contributor Author

In what way? Currently each test calls concurrent(self.n_tasks, self.do_task), which results in self.n_tasks calls of self.do_task, each of which has a gather call for NUM_DOCS coroutines.

Member

Oh I see the real problem. We're running NUM_DOCS concurrent tasks here even though this benchmark is supposed to be serial. We need to change this to:

    async def do_task(self):
        command = self.client.perftest.command
        for _ in range(NUM_DOCS):
            await command("hello", True)

Only the concurrent tests should be using asyncio.gather.
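
For context, a minimal self-contained sketch of how this serial do_task fits with the concurrent() helper quoted further down in this PR. It uses asyncio.sleep as a stand-in for the real command call; NUM_DOCS, fake_command, and the main() entry point are placeholders, not code from the PR:

    import asyncio

    NUM_DOCS = 10  # stand-in value; the real benchmarks take this from the perf spec


    async def fake_command():
        # Stand-in for `await command("hello", True)` against a real MongoDB client.
        await asyncio.sleep(0)


    async def concurrent(n_tasks, func):
        # The PR's helper: spawn exactly n_tasks coroutines and await them all.
        await asyncio.gather(*[func() for _ in range(n_tasks)])


    async def do_task():
        # Serial benchmark body: NUM_DOCS operations awaited one after another,
        # with no gather() call inside.
        for _ in range(NUM_DOCS):
            await fake_command()


    async def main():
        await concurrent(1, do_task)  # serial benchmark (e.g. RunCommand)
        await concurrent(8, do_task)  # concurrent benchmark (e.g. RunCommand8Tasks)


    asyncio.run(main())

Under this model the only place tasks run concurrently is inside concurrent(), and the "XXTasks" variants differ from the serial benchmarks only in their n_tasks value.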

Contributor Author
@NoahStapp NoahStapp Mar 11, 2025

Ah, so we need to have separate serial benchmarks and concurrent ones using the same tests? So only the benchmarks with n_tasks > 1 should use gather?

Member

Yes.



# class TestRunCommand800Tasks(TestRunCommand):
#     n_tasks = 800
Member

Any reason these are commented out?

Contributor Author

Whoops, these were benchmarks to measure extremely concurrent workloads; they take too long to run regularly, in my opinion.


async def concurrent(n_tasks, func):
    tasks = [func() for _ in range(n_tasks)]
    await asyncio.gather(*tasks)
Member
@ShaneHarvey ShaneHarvey Mar 10, 2025

In the threaded version of the tests we run X threads each running a loop of operations so we're comparing different things here. Perhaps we should change the sync version to use a ThreadPoolExecutor and farm out the operations one by one. I'll create a ticket for it.
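
That change is not part of this PR, but a rough sketch of the idea, with a hypothetical stand-in for the synchronous driver call, might look like:

    from concurrent.futures import ThreadPoolExecutor

    NUM_OPERATIONS = 80  # stand-in value


    def one_operation():
        # Hypothetical stand-in for a single synchronous driver call,
        # e.g. client.perftest.command("hello", True).
        pass


    def run_with_pool(n_workers, n_operations):
        # Farm the operations out one by one to a fixed-size pool, instead of
        # giving each thread its own loop of operations.
        with ThreadPoolExecutor(max_workers=n_workers) as executor:
            futures = [executor.submit(one_operation) for _ in range(n_operations)]
            for future in futures:
                future.result()  # re-raise any exception from a worker


    run_with_pool(n_workers=8, n_operations=NUM_OPERATIONS)

That would make the sync benchmarks schedule work the same way the async ones hand individual coroutines to asyncio.gather.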


@NoahStapp NoahStapp requested a review from ShaneHarvey March 11, 2025 18:30
@NoahStapp
Contributor Author

The test failure is also present in master, tracked in jira.mongodb.org/browse/PYTHON-5198.

@ShaneHarvey
Member

I did some wrangling and here's the comparison on the last patch (8.0 non-ssl):

test    async/sync ratio    async throughput    sync throughput
FindManyAndEmptyCursor 1.035496424 69.77490806 67.38305075
FindManyAndEmptyCursor8Tasks 1.507496722 71.0279643 47.11649668
FindManyAndEmptyCursor80Tasks #DIV/0! 64.47844669 0
FindOneByID 0.2723894945 1.211740536 4.448558262
FindOneByID8Tasks 0.5625644218 2.341984316 4.163050888
FindOneByID80Tasks #DIV/0! 2.12813611 0
GridFsDownload 1.250465656 794.822496 635.6212123
GridFsUpload 1.134718501 502.2968367 442.6620666
LargeDocBulkInsert 0.9867934465 96.99364676 98.29174191
LargeDocClientBulkInsert 1.001660327 83.80030809 83.66140285
LargeDocInsertOne 1.54110005 153.1988567 99.40876757
RunCommand 0.4323049556 0.0280939154 0.0649863367
RunCommand8Tasks 0.5468943557 0.0270325139 0.0494291331
RunCommand80Tasks #DIV/0! 0.0259197011 0
SmallDocBulkInsert 0.9685460146 35.55207278 36.70664299
SmallDocBulkMixedOps 0.5066807796 0.3225225767 0.6365399867
SmallDocClientBulkInsert 0.944401396 22.55657991 23.88452623
SmallDocClientBulkMixedOps 1.027585166 1.574819428 1.532543948
SmallDocInsertOne 0.5590928165 0.4870381601 0.8711221924

For the "async/sync" column, >1 means async throughput is higher and <1 means sync is higher.

What I did was use the Download CSV link on the EVG perf panel, import both files into Sheets, sort by test name, delete all columns except "patch_result" and "test", and then combine the sheets. Is there an easier way to compare the results?

I'd like to see the same for the 6.0 ssl tasks.


    async def do_task(self):
        command = self.client.perftest.command
        await asyncio.gather(*[command("hello", True) for _ in range(NUM_DOCS)])
Member

We're still accidentally using gather() here and below. concurrent() should be the only place gather() is used.

Contributor Author

Using asyncio.gather() here spawns a new task for each command() call. Are you picturing a different model of concurrency than the one in use here?

Member
@ShaneHarvey ShaneHarvey Mar 12, 2025

Right, but we don't want to spawn concurrent tasks in any benchmark besides the XXTasks ones, and in those we only want to spawn XX concurrent tasks, not NUM_DOCS concurrent tasks.

Contributor Author

But the XXTask benchmarks will then only be testing small concurrent workloads that contain lots of looping. Wouldn't we get a better measurement of concurrent performance with less looping and more concurrent tasks?

Member

Sure, but that's what we want. Currently the benchmark is named "FindOneByID80Tasks", but it's not running 80 tasks; it's running 80*NUM_DOCS tasks. That's a bug.

Contributor Author

Is that what we want? We want some benchmarks that run lots of concurrent tasks, which more accurately represent a highly concurrent workload. If the issue here is just the naming, then I can add those benchmarks separately.

@NoahStapp
Contributor Author

NoahStapp commented Mar 12, 2025

Writing a script to ingest the raw JSON from the EVG perf panel and spit out this comparison should be pretty straightforward. Let me take a stab at it.
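
The gist linked in the next comment is the actual script; a minimal sketch of just the comparison step, assuming the per-benchmark throughputs have already been pulled out of the Evergreen JSON into plain dicts, could look like:

    def compare(async_results: dict[str, float], sync_results: dict[str, float]) -> None:
        # Both dicts map benchmark name -> measured throughput for one patch run.
        print("Async throughput / sync throughput:")
        for name in sorted(async_results):
            if name in sync_results and sync_results[name] != 0:
                print(f"\t{name}: {async_results[name] / sync_results[name]}")
        print("Async-only test throughput:")
        for name in sorted(async_results):
            if name not in sync_results:
                print(f"\t{name}: {async_results[name]}")


    # Example with made-up numbers:
    compare(
        {"RunCommand": 0.028, "RunCommand80Tasks": 0.026},
        {"RunCommand": 0.065},
    )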

@NoahStapp
Contributor Author

NoahStapp commented Mar 12, 2025

Using https://gist.github.com/NoahStapp/16aa44b865d009261c564c841a0685ad, here are the results from a run of the 6.0-ssl tasks:

Async throughput / sync throughput:
	LargeDocBulkInsert: 1.0078337905239618
	GridFsDownload: 1.2246978241796695
	SmallDocBulkInsert: 1.002887208998979
	SmallDocInsertOne: 0.6272763199110158
	SmallDocClientBulkMixedOps: 0.9869360165276234
	LargeDocInsertOne: 1.5416201189677572
	RunCommand: 0.4778627262846021
	SmallDocClientBulkInsert: 0.9455995942235073
	GridFsUpload: 1.2192035545671303
	FindOneByID: 0.3466335507326009
	LargeDocClientBulkInsert: 1.0260300595712932
	FindManyAndEmptyCursor: 1.0977886637249936
	SmallDocBulkMixedOps: 0.529439661767573
Async-only test throughput:
	RunCommand8Tasks: 0.0261686816727816
	FindOneByID80Tasks: 2.1194720587431
	FindManyAndEmptyCursor8Tasks: 69.6872796695631
	RunCommand80Tasks: 0.0245952602969892
	FindOneByID8Tasks: 2.27405421145451
	FindManyAndEmptyCursor80Tasks: 63.8084299953291

The run itself: https://spruce.mongodb.com/version/67d0840be58c830007b75115/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC

@NoahStapp
Contributor Author

NoahStapp commented Mar 13, 2025

Updated 8.0 comparison:

Async throughput / sync throughput:
	FindManyAndEmptyCursor: 1.0713559030068505
	FindManyAndEmptyCursor8Tasks: 1.4795313246609896
	FindOneByID: 0.5400884974214064
	FindOneByID8Tasks: 0.7874637480943002
	GridFsDownload: 1.1928245548194891
	GridFsUpload: 1.1288262436715597
	LargeDocBulkInsert: 1.0018917671180718
	LargeDocClientBulkInsert: 1.0169017022503946
	LargeDocInsertOne: 1.5453087596553046
	RunCommand: 0.43045923951875237
	RunCommand8Tasks: 0.7046663599353192
	SmallDocBulkInsert: 0.9718963479412474
	SmallDocBulkMixedOps: 0.5076476995225617
	SmallDocClientBulkInsert: 0.9594405742454791
	SmallDocClientBulkMixedOps: 1.032939212184362
	SmallDocInsertOne: 0.5446315066024968
Async-only test throughput:
	FindManyAndEmptyCursor80Tasks: 58.7702816020482
	FindManyAndEmptyCursorManyTasks: 53.7951227130617
	FindOneByID80Tasks: 3.12601199692593
	FindOneByIDManyTasks: 2.25178724747275
	RunCommand80Tasks: 0.0307018919795521
	RunCommandManyTasks: 0.0259767188333268

6.0 with ssl:

Async throughput / sync throughput:
	FindManyAndEmptyCursor: 1.1148888328889521
	FindManyAndEmptyCursor8Tasks: 1.2292068139214227
	FindOneByID: 0.5628404916816315
	FindOneByID8Tasks: 0.7250775180506075
	GridFsDownload: 1.2265791168581033
	GridFsUpload: 1.1172493351535546
	LargeDocBulkInsert: 1.0241106971206142
	LargeDocClientBulkInsert: 1.0411601857296626
	LargeDocInsertOne: 1.6042860101449359
	RunCommand: 0.4622670498004722
	RunCommand8Tasks: 0.6483034992665351
	SmallDocBulkInsert: 1.0083055402152894
	SmallDocBulkMixedOps: 0.5265934792474817
	SmallDocClientBulkInsert: 0.9523263557731153
	SmallDocClientBulkMixedOps: 0.9849426645112269
	SmallDocInsertOne: 0.6176642987581807
Async-only test throughput:
	FindManyAndEmptyCursor80Tasks: 63.0472761170005
	FindManyAndEmptyCursorManyTasks: 57.4746242684951
	FindOneByID80Tasks: 3.1210268089663
	FindOneByIDManyTasks: 2.43428121205249
	RunCommand80Tasks: 0.0312356720588811
	RunCommandManyTasks: 0.0279175701126114

Patch: https://spruce.mongodb.com/version/67d309f7d22fb0000771a758/tasks?sorts=STATUS%3AASC%3BBASE_STATUS%3ADESC

@NoahStapp NoahStapp requested a review from ShaneHarvey March 13, 2025 17:42
class TestSmallDocInsertOne(SmallDocInsertTest, AsyncPyMongoTestCase):
    async def do_task(self):
        insert_one = self.corpus.insert_one
        await asyncio.gather(*[insert_one(doc) for doc in self.documents])
Member

Still need to remove this asyncio.gather

class TestLargeDocInsertOne(LargeDocInsertTest, AsyncPyMongoTestCase):
    async def do_task(self):
        insert_one = self.corpus.insert_one
        await asyncio.gather(*[insert_one(doc) for doc in self.documents])
Member

Still need to remove this asyncio.gather

    n_tasks = 80


class TestRunCommandManyTasks(TestRunCommand):
Member

Should we call these "UnlimitedTasks" rather than "Many"?



class TestFindManyAndEmptyCursorManyTasks(TestFindManyAndEmptyCursor):
    n_tasks = 1000
Member

Should we remove the n_tasks=1000 benchmarks? We already have 80, do we need 1000?

Contributor Author
@NoahStapp NoahStapp Mar 13, 2025

Agreed. This benchmark doesn't meaningfully scale with concurrency.

@NoahStapp NoahStapp requested a review from ShaneHarvey March 13, 2025 18:44
@NoahStapp NoahStapp merged commit e6e8650 into mongodb:master Mar 13, 2025
31 of 33 checks passed