@cliffburdick (Collaborator) commented Aug 13, 2025

When a statement of the form `op = transform` was executed and `op` was not a tensor, the transform was incorrectly called twice, which hurt performance. This change simplifies the code so that the transform is called only once.
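For readers without the diff at hand, here is a minimal sketch of the kind of double execution described above. It does not use MatX's actual classes: `CountingTransform`, `LhsOp`, `AssignTwice`, `AssignOnce`, and `CountCalls` are hypothetical names, and the cause of the redundant call is invented for illustration, since the description does not show the original code path.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a transform; counts how often it is executed.
struct CountingTransform {
  int* calls;
  std::size_t size;
  std::vector<float> Exec() const {
    ++(*calls);
    return std::vector<float>(size, 1.0f);
  }
};

// Hypothetical stand-in for a left-hand side that is not a plain tensor.
struct LhsOp {
  std::vector<float>* storage;
  std::size_t Size() const { return storage->size(); }
  void Store(const std::vector<float>& v) {
    assert(v.size() == Size());
    *storage = v;
  }
};

// Assumed shape of the bug: the transform runs into a temporary,
// then runs a second time when writing to the left-hand side.
inline void AssignTwice(LhsOp lhs, const CountingTransform& t) {
  std::vector<float> tmp = t.Exec();  // first run
  lhs.Store(tmp);
  lhs.Store(t.Exec());                // redundant second run
}

// Simplified path: the transform runs exactly once.
inline void AssignOnce(LhsOp lhs, const CountingTransform& t) {
  lhs.Store(t.Exec());
}

// Returns how many times the transform ran for one assignment.
inline int CountCalls(bool fixed) {
  int calls = 0;
  std::vector<float> storage(4, 0.0f);
  LhsOp lhs{&storage};
  CountingTransform t{&calls, storage.size()};
  if (fixed) {
    AssignOnce(lhs, t);
  } else {
    AssignTwice(lhs, t);
  }
  return calls;
}
```

Both paths leave the same data in `storage`; only the number of transform executions differs, which is why the problem shows up as lost performance rather than wrong results.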

@copy-pr-bot (bot) commented Aug 13, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@cliffburdick (Collaborator, Author) commented

/build

tbensonatl added a commit that referenced this pull request Aug 14, 2025

@tbensonatl (Collaborator) left a comment

reshape.h needs to be added, but otherwise this change looks good to me.

@cliffburdick (Collaborator, Author) commented

/build

@cliffburdick cliffburdick merged commit a5b571e into main Aug 15, 2025
1 check passed
@cliffburdick cliffburdick deleted the lhs_op_fix branch August 15, 2025 14:05
cliffburdick pushed a commit that referenced this pull request Aug 25, 2025
* Add new zipvec operator

Add a new zipvec operator that zips multiple input operators into
a vectorized operation (e.g., an operator with type float3 when
zipping x, y, and z coordinates).

Signed-off-by: Thomas Benson <[email protected]>

* Update zipvec documentation

Signed-off-by: Thomas Benson <[email protected]>

* Address part of review comments

Use cuda::std namespace for template helpers, update copyright
data and static assertion comment

Signed-off-by: Thomas Benson <[email protected]>

* Remove is_narrowing_conversion helpers and add get_impl
helper for zipvec operator() methods.

* Remove support for half types

* Use scalar loads for vectorized types

* Handle the sizeof(T) != alignment_by_type<T>() only in load()

* Special-case alignment checks for sizeof(T) != alignment_by_type<T>()

* Address remaining review feedback

Signed-off-by: Thomas Benson <[email protected]>

* Remove use of mtie in zipvec (see PR #1037)

Signed-off-by: Thomas Benson <[email protected]>

* Add back assignment operator with self_type

Signed-off-by: Thomas Benson <[email protected]>

* Update ZipVecOp class documentation block

Signed-off-by: Thomas Benson <[email protected]>

---------

Signed-off-by: Thomas Benson <[email protected]>
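
The zipping idea in the commit message above can be illustrated with a small host-side sketch. This is not the MatX `zipvec` implementation or its API: `Vec3f` stands in for CUDA's `float3`, and `ZipVec3`, `MakeZipVec3`, and `ZipExample` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for CUDA's float3 so the sketch builds with a plain C++ compiler.
struct Vec3f {
  float x, y, z;
};

// Hypothetical operator that zips three scalar operators into one
// operator whose element type is a 3-component vector.
template <typename OpX, typename OpY, typename OpZ>
struct ZipVec3 {
  OpX x;
  OpY y;
  OpZ z;
  Vec3f operator()(std::size_t i) const {
    return Vec3f{x(i), y(i), z(i)};
  }
};

template <typename OpX, typename OpY, typename OpZ>
ZipVec3<OpX, OpY, OpZ> MakeZipVec3(OpX x, OpY y, OpZ z) {
  return {x, y, z};
}

// Zips separate x, y, and z coordinate arrays and reads element i.
inline Vec3f ZipExample(std::size_t i) {
  std::vector<float> xs{1.0f, 2.0f};
  std::vector<float> ys{10.0f, 20.0f};
  std::vector<float> zs{100.0f, 200.0f};
  assert(i < xs.size());
  auto zipped = MakeZipVec3(
      [&](std::size_t k) { return xs[k]; },
      [&](std::size_t k) { return ys[k]; },
      [&](std::size_t k) { return zs[k]; });
  return zipped(i);
}
```

Calling `ZipExample(1)` yields a `Vec3f` holding the second element of each input array, which is the per-element behavior the commit message describes for x, y, and z coordinates.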