Add prerun_done_ flag to prevent duplicate PreRun executions in transform operators #1065
Introduced a mutable `bool prerun_done_` flag in the 40 transform operators that allocate temporary tensors in their `PreRun()` method. This prevents duplicate memory allocations and kernel executions when `PreRun()` is called more than once on the same operator instance.

In an FFT convolution example, two more kernels were inadvertently launched than necessary because the assignment inside `fft_impl` triggered a second `PreRun()` call after the initial binary operator.

The flag is checked at the start of `PreRun()` and causes an early return if the method has already run. The flag is set to `true` after the temporary tensors are allocated but before `Exec()` is called.
Modified operators:
Operators not modified (they don't allocate temporaries in PreRun): lu, unique, svd, qr, sparse2sparse, argmax, argmin, argminmax, find, find_idx, find_peaks, einsum, eig, dense2sparse