
[PyTorch] Add RFC for lightweight dispatch #40


Merged: 7 commits into gh/larryliu0820/1/base on Mar 2, 2022

Conversation

@larryliu0820 (Contributor) commented Feb 16, 2022

larryliu0820 added a commit that referenced this pull request Feb 16, 2022
ghstack-source-id: 03ff616
Pull Request resolved: #40
larryliu0820 added a commit that referenced this pull request Feb 16, 2022
ghstack-source-id: d88f7a2
Pull Request resolved: #40
@raziel left a comment

Thanks!

@@ -0,0 +1,291 @@
# Context
With recent developments in PyTorch Edge applications on embedded systems and resource-limited devices, the runtime overhead and build-time complexity of op registration and dispatching have come to the fore.

"PyTorch Edge" is not public yet, so I'd change to Mobile.

Or just rephrase it like:

"As PyTorch aims to support a wider set of devices and use cases, we see the need for a lighter version of our operator dispatching mechanism".


# Motivation
* **Performance**
* For recent use cases of the Edge interpreter, we need to satisfy increasingly strict initialization latency requirements, and analysis shows that op registration contributes a large portion of that latency.

Rename edge>mobile

* Also with static dispatch, we don’t have to register all of the ops into the JIT op registry, which saves runtime memory usage and further reduces static initialization time.
* It is possible to avoid dispatching at runtime.
* **Modularity and binary size**
* Currently the mobile runtime consists of both JIT op registry and c10 dispatcher. This project will make it possible to not depend on the c10 dispatcher, delivering a cleaner runtime library.

Suggested change
* Currently the mobile runtime consists of both JIT op registry and c10 dispatcher. This project will make it possible to not depend on the c10 dispatcher, delivering a cleaner runtime library.
* Currently the mobile runtime consists of both JIT op registry and c10 dispatcher. This project will make it possible to not depend on the c10 dispatcher (opt-in), delivering a cleaner runtime library.

* Currently the mobile runtime consists of both JIT op registry and c10 dispatcher. This project will make it possible to not depend on the c10 dispatcher, delivering a cleaner runtime library.
* This project creates an opportunity to reduce binary size by getting rid of the dispatcher and enables further size optimization on unboxing wrappers.
* **Ability to incorporate custom implementation of ATen ops**
* For some of the edge use cases, we need to support custom implementations of ATen ops. With an extra op registration path such as codegen unboxing it is easier to hookup ops with custom native functions.

Suggested change
* For some of the edge use cases, we need to support custom implementations of ATen ops. With an extra op registration path such as codegen unboxing it is easier to hookup ops with custom native functions.
* For some of the mobile use cases, we need to support custom implementations of ATen ops. With an extra op registration path such as codegen unboxing, it is easier to hookup ops with custom native functions.

![codegen drawio](https://user-images.githubusercontent.com/8188269/154173938-baad9ee6-0e3c-40bb-a9d6-649137e3f3f9.png)


Currently the lite interpreter (or Edge runtime) registers all ATen ops into the dispatcher and some other ops into the JIT op registry. At model inference time the interpreter will look for the operator name in the JIT op registry first, if not found then it will look into the dispatcher. This proposal **adds a build flavor that moves these ATen ops from dispatcher to JIT op registry** so that it’s easier to optimize (e.g., avoid schema parsing) and can also reduce dependencies.

Suggested change
Currently the lite interpreter (or Edge runtime) registers all ATen ops into the dispatcher and some other ops into the JIT op registry. At model inference time the interpreter will look for the operator name in the JIT op registry first, if not found then it will look into the dispatcher. This proposal **adds a build flavor that moves these ATen ops from dispatcher to JIT op registry** so that it’s easier to optimize (e.g., avoid schema parsing) and can also reduce dependencies.
Currently the mobile interpreter registers all ATen ops into the dispatcher and some other ops into the JIT op registry. At model inference time, the interpreter will look for the operator name in the JIT op registry first, if not found then it will look into the dispatcher. This proposal **adds a build flavor that moves these ATen ops from dispatcher to JIT op registry** so that it’s easier to optimize (e.g. avoid schema parsing) and reduce dependencies.
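For intuition, here is a minimal, self-contained sketch of the lookup order described above. The registry types are hypothetical stand-ins, not the real torch::jit data structures:

```
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the two registries; only the lookup order
// mirrors the behavior described in this proposal.
using OpFn = std::function<void()>;
std::unordered_map<std::string, OpFn> jit_op_registry;
std::unordered_map<std::string, OpFn> c10_dispatcher_table;

std::optional<OpFn> resolveOperator(const std::string& name) {
  // The interpreter consults the JIT op registry first...
  auto jit_it = jit_op_registry.find(name);
  if (jit_it != jit_op_registry.end()) return jit_it->second;
  // ...and falls back to the c10 dispatcher only if the op is not found.
  auto c10_it = c10_dispatcher_table.find(name);
  if (c10_it != c10_dispatcher_table.end()) return c10_it->second;
  return std::nullopt;
}
```

Moving ATen ops into the JIT op registry makes the first lookup hit for them, so the dispatcher fallback (and its schema parsing at registration time) can be dropped from the build.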

* **JIT type -> C++ type**. This is necessary for some of the optional C++ types, e.g., we need to map `int` to `int64_t` for the last argument in the example.
* This is already done in [types.py](https://github.com/pytorch/pytorch/blob/master/tools/codegen/api/types.py), and we need to integrate it into our new codegen.
* **JIT type -> IValue to basic type conversion C++ code.** E.g., the first argument of this operator: `Tensor(a) self` needs to be translated to: `(std::move(peek(stack, 0, 4))).toTensor()`
* IValue provides APIs to directly convert an ivalue to these basic types. See [ivalue_inl.h](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/ivalue_inl.h#L1453-L1493)

Suggested change
* IValue provides APIs to directly convert an ivalue to these basic types. See [ivalue_inl.h](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/ivalue_inl.h#L1453-L1493)
* IValue provides APIs to directly convert an IValue to these basic types. See [ivalue_inl.h](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/ivalue_inl.h#L1453-L1493)
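To make the conversion concrete, here is a hedged sketch of what a generated unboxing wrapper could look like for a three-argument op. `peek`, `drop`, and `pack` follow the real stack helpers in `aten/src/ATen/core/stack.h`, but the wrapper itself is illustrative, not actual generated output:

```
// Illustrative generated wrapper for
// `aten::transpose.int(Tensor(a) self, int dim0, int dim1) -> Tensor(a)`.
void transpose_int(Stack& stack) {
  // Convert each IValue on the stack to its C++ type.
  auto self = (std::move(peek(stack, 0, 3))).toTensor();  // Tensor(a) self
  auto dim0 = (std::move(peek(stack, 1, 3))).toInt();     // JIT int -> int64_t
  auto dim1 = (std::move(peek(stack, 2, 3))).toInt();
  auto result = at::transpose(self, dim0, dim1);
  // Pop the consumed arguments and push the result back.
  drop(stack, 3);
  pack(stack, std::move(result));
}
```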

* This is already done in [types.py](https://github.com/pytorch/pytorch/blob/master/tools/codegen/api/types.py), and we need to integrate it into our new codegen.
* **JIT type -> IValue to basic type conversion C++ code.** E.g., the first argument of this operator: `Tensor(a) self` needs to be translated to: `(std::move(peek(stack, 0, 4))).toTensor()`
* IValue provides APIs to directly convert an ivalue to these basic types. See [ivalue_inl.h](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/ivalue_inl.h#L1453-L1493)
* Here’s a [list](#bookmark=id.deyvpbsb5yel) of all the JIT types appearing in native_functions.yaml, most of them can be converted using ivalue’s API.

Suggested change
* Here’s a [list](#bookmark=id.deyvpbsb5yel) of all the JIT types appearing in native_functions.yaml, most of them can be converted using ivalue’s API.
* Here’s a [list](#bookmark=id.deyvpbsb5yel) of all the JIT types appearing in native_functions.yaml, most of them can be converted using IValue’s API.

* For the scenario of adding a new operator (one not added to `native_functions.yaml`), we need to provide clear guidance on adding it to the JIT op registry as well; otherwise JIT execution will break.
* We can add tests on the mobile build for the sake of coverage.

For OSS mobile integration, we will need a new build flavor to switch between the c10 dispatcher and the JIT op registry. This new flavor will include codegen source files (`CodegenFunctions.h`, `CodegenFunctions.cpp`, `CodegenUnboxingWrappers.cpp`) instead of the existing dispatcher-related source files (`Operators.cpp`, `RegisterSchema.cpp`, etc.), similar to the internal build configuration.

let's clarify here again that this new path will be opt-in
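A minimal sketch of how the opt-in flavor could be selected at compile time; `USE_LIGHTWEIGHT_DISPATCH` is the flag named elsewhere in this thread, while the include paths here are assumptions for illustration only:

```
// Opt-in selection between the two registration paths (illustrative sketch).
#ifdef USE_LIGHTWEIGHT_DISPATCH
// Codegen unboxing path: ops live in the JIT op registry, no c10 dispatcher.
#include <ATen/CodegenFunctions.h>  // hypothetical include path
#else
// Default path: ops are registered into the c10 dispatcher.
#include <ATen/Operators.h>  // hypothetical include path
#endif
```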

* This is already done in [types.py](https://github.com/pytorch/pytorch/blob/master/tools/codegen/api/types.py), and we need to integrate it into our new codegen.
* **JIT type -> IValue to basic type conversion C++ code.** E.g., the first argument of this operator: `Tensor(a) self` needs to be translated to: `(std::move(peek(stack, 0, 4))).toTensor()`
* IValue provides APIs to directly convert an IValue to these basic types. See [ivalue_inl.h](https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/core/ivalue_inl.h#L1453-L1493)
* Here’s a [list](#bookmark=id.deyvpbsb5yel) of all the JIT types appearing in native_functions.yaml, most of them can be converted using IValue’s API.
Contributor:

Is the [list](#bookmark=id.deyvpbsb5yel) valid?

Contributor Author:

Let me attach the list as an appendix.


#### Codegen source file details

With the logic from the previous section, we should be able to wrap the code into a function pointer and register it into [torch::jit::OperatorRegistry](https://github.com/pytorch/pytorch/blob/master/torch/csrc/jit/runtime/operator.cpp#L19).
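A minimal sketch of such a registration, assuming the `OperatorGenerator` pattern that appears in the generated code quoted later in this thread (`at::unboxing::add_Tensor` stands for the codegen'd wrapper):

```
// Registers one codegen'd unboxing wrapper into the JIT op registry at
// static initialization time (sketch, not the actual generated file).
static const torch::jit::RegisterOperators reg({
    torch::jit::OperatorGenerator(
        TORCH_SELECTIVE_SCHEMA(
            "aten::add.Tensor(Tensor self, Tensor other, *, Scalar alpha=1) -> Tensor"),
        [](torch::jit::Stack& stack) {
          at::unboxing::add_Tensor(stack);  // codegen'd unboxing wrapper
        },
        torch::jit::aliasAnalysisFromSchema()),
});
```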
Contributor:

The size overhead of operator.cpp is around 10 KB. It's much smaller than the torch library, but we still need to watch the size for embedded use cases.

Contributor Author:

I'm surprised it's that small. The file looks huge to me.

structured: True
structured_inherits: TensorIteratorBase
dispatch:
CPU, CUDA: acosh_out
Contributor:

QuantizedCPU is not listed as a dispatch key here (it's listed for relu_). Would codegen consider the dispatch keys listed here? If so, we don't need to generate QuantizedCPU for this op.

Contributor Author:

Right, this may be a bad example, but at a high level we will look at these dispatch keys and generate switch cases.

DispatchKeySet _dk_set = c10::detail::multi_dispatch_key_set(tensor);
DispatchKey _dk = _dk_set.highestPriorityBackendTypeId();
switch (_dk) {
case DispatchKey::CPU:
Contributor:

If it's runtime control flow, does it mean we need to build both kernels, even if we use just one? We could use tracing-based selective build, but tracing would still be required.

Contributor Author:

We should trace the backend ahead of time (AOT) to avoid building both kernels. I think Dhruv has a proposal for doing so.
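For illustration, here is a hedged completion of the switch sketch quoted above, showing how codegen'd static dispatch could branch on the highest-priority backend key. The kernel functions in the cases are placeholders, not real kernel names:

```
// Sketch of codegen'd per-op static dispatch (illustrative only).
at::Tensor& acosh_out(const at::Tensor& self, at::Tensor& out) {
  c10::DispatchKeySet _dk_set = c10::detail::multi_dispatch_key_set(self, out);
  c10::DispatchKey _dk = _dk_set.highestPriorityBackendTypeId();
  switch (_dk) {
    case c10::DispatchKey::CPU:
      return cpu_acosh_out(self, out);   // placeholder for the CPU kernel
    case c10::DispatchKey::CUDA:
      return cuda_acosh_out(self, out);  // placeholder for the CUDA kernel
    default:
      TORCH_CHECK(false, "acosh.out: unsupported backend ", _dk);
  }
}
```

Only the kernels reachable from the generated cases need to be built, which is where AOT backend tracing comes in.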

facebook-github-bot pushed a commit to pytorch/pytorch that referenced this pull request Mar 1, 2022
Summary:
RFC: pytorch/rfcs#40

This PR (re)introduces Python codegen for unboxing wrappers. Given an entry in `native_functions.yaml`, the codegen should be able to generate the corresponding C++ code to convert IValues from the stack to their proper types. To trigger the codegen, run
```
tools/jit/gen_unboxing.py -d cg/torch/share/ATen
```

Merged changes on CI test. In #71782 I added an e2e test for static dispatch + codegen unboxing. The test exports a MobileNetV2 mobile model, then loads and runs it on a new lite interpreter binary: `test/mobile/custom_build/lite_predictor.cpp`.

## Lite predictor build specifics

1. Codegen: `gen.py` generates `RegisterCPU.cpp` and `RegisterSchema.cpp`. Now with this PR, once `static_dispatch` mode is enabled, `gen.py` will not generate `TORCH_LIBRARY` API calls in those cpp files, thus avoiding interaction with the dispatcher. Once `USE_LIGHTWEIGHT_DISPATCH` is turned on, `cmake/Codegen.cmake` calls `gen_unboxing.py`, which generates `UnboxingFunctions.h`, `UnboxingFunctions_[0-4].cpp` and `RegisterCodegenUnboxedKernels_[0-4].cpp`.
2. Build: `USE_LIGHTWEIGHT_DISPATCH` adds the generated sources into `all_cpu_cpp` in `aten/src/ATen/CMakeLists.txt`. All other files remain unchanged. In reality the `Operators_[0-4].cpp` files are not necessary, but we can rely on the linker to strip them out.

## Current CI job test coverage update

Created a new CI job `linux-xenial-py3-clang5-mobile-lightweight-dispatch-build` that enables the following build options:
* `USE_LIGHTWEIGHT_DISPATCH=1`
* `BUILD_LITE_INTERPRETER=1`
* `STATIC_DISPATCH_BACKEND=CPU`

This job triggers `test/mobile/lightweight_dispatch/build.sh` and builds `libtorch`. Then the script runs the C++ tests written in `test_lightweight_dispatch.cpp` and `test_codegen_unboxing.cpp`. Recent commits added tests to cover as many C++ argument types as possible: in `build.sh` we install the PyTorch Python API so that we can export test models in `tests_setup.py`, then run the C++ test binary to execute these models on the lightweight-dispatch-enabled runtime.

Pull Request resolved: #69881

Reviewed By: iseeyuan

Differential Revision: D33692299

Pulled By: larryliu0820

fbshipit-source-id: 211e59f2364100703359b4a3d2ab48ca5155a023
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Mar 1, 2022
Same summary as above.

(cherry picked from commit 58e1c9a)
@raziel left a comment

lgtm I think there's a couple of things to follow up on, but maybe better to just get this version in.

@larryliu0820 merged commit 2f8e3ca into gh/larryliu0820/1/base Mar 2, 2022
@larryliu0820 added the "accepted (RFC proposal is accepted in principle)" label Mar 2, 2022
cyyever pushed a commit to cyyever/pytorch_private that referenced this pull request Mar 3, 2022
Same summary as above.

(cherry picked from commit 58e1c9a25e3d1b5b656282cf3ac2f548d98d530b)
cyyever pushed a commit to cyyever/pytorch_private that referenced this pull request Mar 3, 2022
Same summary as above.

(cherry picked from commit 58e1c9a25e3d1b5b656282cf3ac2f548d98d530b)
larryliu0820 added a commit to larryliu0820/pytorch that referenced this pull request Mar 14, 2022
Summary:
Pull Request resolved: pytorch#74069

RFC: pytorch/rfcs#40
In pytorch#69881 we added the ability to generate codegen unboxing source files. Notice that the generated code to register an operator looks like this:
```
    // aten::add.Tensor(Tensor self, Tensor other, *, Scalar alpha=1) -> Tensor
    OperatorGenerator(
        TORCH_SELECTIVE_SCHEMA("aten::add.Tensor(Tensor self, Tensor other, *, Scalar alpha=1) -> Tensor"),
        [](Stack & stack) {
            RECORD_FUNCTION("add", std::vector<c10::IValue>());
            at::unboxing::add_Tensor(stack);
        },
        aliasAnalysisFromSchema()
    ),
```
However, this means we have to parse the schema and recover arguments with default values at static init time. As written in the RFC, there's a more performant option: provide these arguments with default values via codegen, so we don't have to do expensive regex pattern matching during parsing. Here's how it looks:
```
    // aten::add.Tensor(Tensor self, Tensor other, *, Scalar alpha=1) -> Tensor
    OperatorGenerator(
        "aten::add",
        "Tensor",
        {
            c10::Argument("self", nullptr, c10::nullopt, c10::IValue(c10::nullopt)),
    	    c10::Argument("other", nullptr, c10::nullopt, c10::IValue(c10::nullopt)),
    	    c10::Argument("alpha", nullptr, c10::nullopt, c10::IValue(1))
        },
        {
            c10::Argument("")
        },
        [](Stack & stack) {
            RECORD_FUNCTION("add", std::vector<c10::IValue>());
            at::unboxing::add_Tensor(stack);
        },
        aliasAnalysisFromSchema()
    ),
```

We also added corresponding APIs in `operator.h` to take in the arguments.

Test Plan: Rely on CI

Reviewed By: kimishpatel

Differential Revision: D33077733

fbshipit-source-id: db6fcb3066ba352cb338a018324c02c32d67b941
facebook-github-bot pushed a commit to pytorch/pytorch that referenced this pull request Mar 15, 2022
Same summary as above, with fbshipit-source-id: e7f13a2f162c70d4e506b4f64cdbb7afec39f4e6.
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Mar 15, 2022
Same summary as above.

(cherry picked from commit 08a0769)
@facebook-github-bot deleted the gh/larryliu0820/1/head branch April 2, 2022 14:17
Labels: accepted (RFC proposal is accepted in principle), cla signed, commenting (RFC is open for discussions)