Add triangular solve function for sparse CSR tensors in XPU #3261
tszulist-hbn wants to merge 1 commit into intel:main
Pull request overview
Adds SparseCsrXPU support for triangular_solve by wiring a SparseCsrXPU dispatch and implementing an XPU sparse CSR fallback that densifies A and calls the existing dense XPU solve.
Changes:
- Register `SparseCsrXPU` dispatch for `triangular_solve.X` (and delegate `triangular_solve` to it).
- Implement `triangular_solve_out_sparse_csr_xpu` in the XPU sparse CSR math file using `A.to_dense()` + `at::triangular_solve`.
- Add an `_nnz()==0` fast-path intended to match CUDA behavior (fills `X` with NaNs).
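The densify-then-solve fallback summarized above can be modeled in a few lines of plain Python. This is only a sketch of the idea, not the ATen code from the PR: the helper names (`csr_to_dense`, `dense_triangular_solve`, `sparse_csr_triangular_solve`) are hypothetical, it handles a single right-hand side, and it ignores the `transpose` and `unitriangular` options.

```python
# Illustrative only: pure-Python model of "densify the CSR matrix, then run a
# dense triangular solve". All names here are hypothetical, not ATen symbols.

def csr_to_dense(crow_indices, col_indices, values, n_rows, n_cols):
    """Expand a CSR matrix into a list-of-lists dense matrix."""
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for row in range(n_rows):
        # Entries of `row` live in values[crow_indices[row]:crow_indices[row + 1]].
        for k in range(crow_indices[row], crow_indices[row + 1]):
            dense[row][col_indices[k]] = values[k]
    return dense


def dense_triangular_solve(A, b, upper=True):
    """Solve A x = b for one right-hand side by back/forward substitution."""
    n = len(A)
    x = [0.0] * n
    # Back substitution walks rows bottom-up; forward substitution top-down.
    order = range(n - 1, -1, -1) if upper else range(n)
    for i in order:
        # Only the triangle selected by `upper` is read.
        cols = range(i + 1, n) if upper else range(i)
        acc = sum(A[i][j] * x[j] for j in cols)
        x[i] = (b[i] - acc) / A[i][i]
    return x


def sparse_csr_triangular_solve(crow, col, vals, n, b, upper=True):
    """Fallback path: densify first, then reuse the dense solver."""
    return dense_triangular_solve(csr_to_dense(crow, col, vals, n, n), b, upper)


# Upper-triangular A = [[2, 1], [0, 4]] in CSR form, b = [4, 8].
x = sparse_csr_triangular_solve([0, 2, 3], [0, 1, 1], [2.0, 1.0, 4.0], 2, [4.0, 8.0])
print(x)
```

The trade-off is the one the review summary implies: densifying costs memory proportional to the full matrix rather than to its nonzero count, in exchange for reusing an existing dense solver.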
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| yaml/native/native_functions.yaml | Adds SparseCsrXPU dispatch registration for triangular_solve.X and structured delegate entry for triangular_solve. |
| src/ATen/native/sparse/xpu/SparseCsrTensorMath.cpp | Implements SparseCsrXPU out-kernel that densifies A and delegates to dense at::triangular_solve, with an _nnz()==0 special-case. |
Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
Overall LGTM. Could you please add a test case? Maybe test/regressions is a good place.
Defer to @CuiYifeng
Resolves: #3167
This PR adds the SparseCsrXPU dispatch for triangular_solve by converting the sparse matrix to dense and delegating to the existing dense XPU triangular_solve implementation. This follows the same pattern used by other XPU sparse ops (addmm, baddbmm, etc.).
Changes:
- yaml/native/native_functions.yaml: Register the SparseCsrXPU: triangular_solve_out_sparse_csr_xpu dispatch for both triangular_solve.X (structured out variant) and triangular_solve (structured delegate).
- src/ATen/native/sparse/xpu/SparseCsrTensorMath.cpp: Implement triangular_solve_out_sparse_csr_xpu(), which handles the zero-nnz edge case (fills X with NaN, matching CUDA behavior) and otherwise converts sparse A to dense before calling at::triangular_solve.
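The zero-nnz edge case can be pictured as a guard in front of the solver. The following is a minimal sketch, assuming the behavior stated in the description (fill X with NaN when A stores no entries); `solve_with_empty_guard` and its `solve` callback are hypothetical names used only for illustration.

```python
import math


def solve_with_empty_guard(nnz, b, solve):
    """Return a NaN-filled X when A stores no entries; otherwise call `solve`."""
    if nnz == 0:
        # A with no stored entries is all zeros, hence singular: there is no
        # finite solution, so every element of X is reported as NaN.
        return [math.nan] * len(b)
    return solve(b)


# Hypothetical stand-in for the dense solver: an identity matrix returns b unchanged.
x_empty = solve_with_empty_guard(0, [1.0, 2.0], solve=lambda rhs: list(rhs))
x_identity = solve_with_empty_guard(2, [1.0, 2.0], solve=lambda rhs: list(rhs))
print(x_empty, x_identity)
```

Checking the guard before densifying also avoids allocating a dense all-zero matrix only to fail on its zero diagonal.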
Tests verified:
- test_block_triangular_solve: 64/64 variants pass (block_size 2/3, int32/int64, contiguous/noncontiguous, float32/float64/complex64/complex128)
- test_sparse_triangular_solve_xpu: 4/4 variants pass (float32/float64/complex64/complex128)