Add matxStreamDestroy function to better handle device_async allocations. #578

nvjonwong · 2024-02-15T18:14:59Z

When making tensors with MATX_ASYNC_DEVICE_MEMORY, the stream is recorded in the memtracker. If the stream is destroyed before the tensor is deallocated, a segfault will occur because the stream is no longer valid.

Since there is no way to check whether a stream is valid after it has been destroyed, the proposed fix is to add a matxStreamDestroy that repoints every recorded reference to the stream being destroyed at the null stream. This keeps the allocations alive and allows the memtracker to properly deallocate device_async allocations even if the original stream is gone.
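The bookkeeping described above can be sketched without a GPU. This is a minimal host-only model, not MatX's implementation: `TrackerModel`, `StreamHandle`, `kNullStream`, and `retarget_to_null` are hypothetical names, and an `int` stands in for `cudaStream_t`.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Stand-in for cudaStream_t (assumption); 0 plays the role of the null stream.
using StreamHandle = int;
constexpr StreamHandle kNullStream = 0;

// Simplified model of a tracker that records the stream each
// async allocation was made on. Not the real memtracker.
class TrackerModel {
 public:
  // Remember which stream an allocation belongs to.
  void record(const void* ptr, StreamHandle stream) { streams_[ptr] = stream; }

  // Hypothetical analogue of the proposed destroy hook: every entry that
  // references the dying stream is repointed at the null stream, so a later
  // deallocation never touches a destroyed handle.
  std::size_t retarget_to_null(StreamHandle dying) {
    std::size_t changed = 0;
    for (auto& entry : streams_) {
      if (entry.second == dying) {
        entry.second = kNullStream;
        ++changed;
      }
    }
    return changed;
  }

  // Stream that would be used when the allocation is eventually freed.
  StreamHandle stream_of(const void* ptr) const { return streams_.at(ptr); }

 private:
  std::unordered_map<const void*, StreamHandle> streams_;
};
```

In this model the caller would invoke `retarget_to_null(s)` immediately before destroying `s`; entries recorded on other streams are left untouched.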

luitjens · 2024-02-15T18:48:31Z

Why would we have a destroy api without a create api? Is this really a matx issue or a calling code issue?

nvjonwong · 2024-02-15T19:01:04Z

> Why would we have a destroy api without a create api? Is this really a matx issue or a calling code issue?

It is not a calling code issue. See unit_test.

The issue is that we remember the stream used to allocate device_async memory. That stream may no longer exist (it may already have been destroyed) when the memory finally gets deallocated (say, by the memtracker at program scope).

luitjens · 2024-02-15T19:21:43Z

But the problem here is that you are destroying the stream before the object goes out of scope. Another possible fix would be to provide a delete_tensor(t) option which frees the memory associated with it. This is one of the reasons I hate refcounting and deleting as the scope exits.
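The ordering concern can be shown with a small host-only sketch. It assumes refcounted storage modeled by `std::shared_ptr`; `Tensor`, `make_tensor`, and this `delete_tensor` are hypothetical stand-ins, and the "stream destroyed" log entry only marks where a real stream destroy would happen.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

using EventLog = std::vector<std::string>;

// Stand-in for a refcounted tensor: the deleter runs when the last owner goes away.
struct Tensor {
  std::shared_ptr<int> storage;
};

Tensor make_tensor(EventLog& log) {
  return Tensor{std::shared_ptr<int>(new int(0), [&log](int* p) {
    log.push_back("free");
    delete p;
  })};
}

// Hypothetical delete_tensor: drop this handle's ownership now instead of at
// scope exit. Memory is only released if this was the last owner.
void delete_tensor(Tensor& t) { t.storage.reset(); }

// Free happens at scope exit, i.e. after the stream is already gone.
EventLog scope_exit_order() {
  EventLog log;
  {
    Tensor t = make_tensor(log);
    log.push_back("stream destroyed");
  }
  return log;
}

// Explicit release lets the caller free before destroying the stream.
EventLog explicit_order() {
  EventLog log;
  {
    Tensor t = make_tensor(log);
    delete_tensor(t);
    log.push_back("stream destroyed");
  }
  return log;
}
```

Note that an explicit release like this only helps for objects the caller can name; it does not reach allocations held internally by the library.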

nvjonwong · 2024-02-15T19:27:00Z

> But the problem here is that you are destroying the stream before the object goes out of scope. Another possible fix would be to provide a delete_tensor(t) option which frees the memory associated with it. This is one of the reasons I hate refcounting and deleting as the scope exits.

To provide more context: this unit_test is a poor analogy for the problem I originally found. When we do matx::fft(...).run(stream), we create a workspace on stream that the user is not privy to. I then destroy the stream I created at the end of my function scope (normal CUDA semantics). Since the fft workspace lives past the function scope, the recorded stream gets carried over to program scope. As far as I know, there is currently no reliable way to check whether the stream recorded in the memtracker is still alive.
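The lifetime mismatch described here can be modeled on the host. This is a sketch under stated assumptions, not MatX code: `RuntimeModel`, `WorkspaceCacheModel`, `run_fft_model`, and `count_dangling` are hypothetical, the cache is assumed to be keyed by transform size, and the model tracks live streams explicitly, which is exactly the information the thread says is unavailable in practice.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <unordered_set>

using StreamHandle = int;  // stand-in for cudaStream_t (assumption)

// Toy runtime that knows which streams are alive. The real runtime offers
// no such query once a stream has been destroyed.
struct RuntimeModel {
  std::unordered_set<StreamHandle> live;
  StreamHandle next = 1;
  StreamHandle create() { live.insert(next); return next++; }
  void destroy(StreamHandle s) { live.erase(s); }
};

// Program-scope cache: remembers the stream each hidden workspace was allocated on.
struct WorkspaceCacheModel {
  std::unordered_map<std::size_t, StreamHandle> stream_for_size;
};

// Model of an fft call that lazily creates a workspace on the caller's stream.
void run_fft_model(WorkspaceCacheModel& cache, std::size_t n, StreamHandle s) {
  cache.stream_for_size.emplace(n, s);  // no-op if a workspace is already cached
}

// Caller follows normal stream semantics: create, use, destroy in one scope.
void user_function(RuntimeModel& rt, WorkspaceCacheModel& cache) {
  StreamHandle s = rt.create();
  run_fft_model(cache, 1024, s);
  rt.destroy(s);
}

// Count cached workspaces whose recorded stream no longer exists.
std::size_t count_dangling(const RuntimeModel& rt, const WorkspaceCacheModel& cache) {
  std::size_t n = 0;
  for (const auto& entry : cache.stream_for_size) {
    if (rt.live.count(entry.second) == 0) ++n;
  }
  return n;
}
```

After `user_function` returns, the cache still holds one workspace whose recorded stream is gone, even though the caller never allocated anything explicitly.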

cliffburdick · 2024-02-15T23:36:05Z

@nvjonwong can we close this in favor of #579?

add StreamDestroy api function and add test in BasicTensorTestsAll

8276791

nvjonwong self-assigned this Feb 15, 2024

nvjonwong requested a review from cliffburdick February 15, 2024 18:15

nvjonwong closed this Feb 16, 2024

cliffburdick deleted the nvjonwong/streamdestroy branch November 19, 2024 03:59
