
allow self influence iteration options #1002


Closed
wants to merge 2 commits

Conversation

99warriors
Contributor

Summary:

  • For self influence computation, there needs to be an iteration over both checkpoints and batches. This diff adds a by_checkpoints option: if true, the outer iteration is over checkpoints; if false, the outer iteration is over batches. Because self influence computation can be called through the influence and self_influence methods, this option is added to both methods. Because only TracInCP and TracInCPFast should be used for self influence computation, only those classes are changed.
  • To implement this option, the old self_influence method, whose outer iteration was over checkpoints, is renamed to a private _self_influence_by_checkpoints method. A new _self_influence_by_batches method is added, whose outer iteration is over batches; it re-uses the _self_influence_by_checkpoints method to compute self influence scores for a single batch (that method can accept both a single batch and a dataloader yielding batches). Because the logic of this method is the same for all classes, a helper method, _self_influence_by_batches_helper, is added to captum.influence._utils.common. Finally, the new self_influence method simply chooses whether to call _self_influence_by_checkpoints or _self_influence_by_batches.
  • Documentation describing the two options for by_checkpoints is added to the self_influence and influence methods.
  • test_tracin_show_progress now differentiates between 2 modes: "self influence by checkpoints" (the original test for progress bar when calculating self influence scores, which checks whether the outer progress bar over checkpoints and inner progress bars over batches both reach 100%), and the newly added mode "self influence by batches", which checks whether the progress bar over batches reaches 100%.
  • test_tracin_self_influence now also checks that computing self influence scores gives the same result regardless of whether by_checkpoints is True or False.
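The control flow described above can be sketched as follows. This is a minimal mock, not captum's actual TracInCP implementation: the method names mirror the ones in this diff, but the scoring rule (checkpoint weight times squared example value) is a stand-in, and "checkpoints" here are just floats standing in for saved model states.

```python
class TracInSketch:
    """Toy model of the by_checkpoints dispatch. A batch is a list of
    floats; a 'dataloader' is a list of batches."""

    def __init__(self, checkpoints):
        self.checkpoints = checkpoints  # stand-in for saved model states

    def _self_influence_by_checkpoints(self, inputs):
        # Accepts a single batch or a list of batches. The outer loop is
        # over checkpoints, so each checkpoint is "loaded" only once.
        batches = inputs if isinstance(inputs[0], list) else [inputs]
        scores = [0.0] * sum(len(b) for b in batches)
        for w in self.checkpoints:
            i = 0
            for batch in batches:
                for x in batch:
                    scores[i] += w * x * x  # stand-in influence term
                    i += 1
        return scores

    def _self_influence_by_batches(self, inputs):
        # Outer loop over batches; each batch's scores are computed by
        # reusing the by-checkpoints method on that single batch.
        batches = inputs if isinstance(inputs[0], list) else [inputs]
        scores = []
        for batch in batches:
            scores.extend(self._self_influence_by_checkpoints(batch))
        return scores

    def self_influence(self, inputs, by_checkpoints=True):
        # The public method only chooses the iteration order.
        if by_checkpoints:
            return self._self_influence_by_checkpoints(inputs)
        return self._self_influence_by_batches(inputs)
```

Both orderings produce the same scores, which is exactly what the new check in test_tracin_self_influence asserts; they differ only in how often checkpoints are (re)loaded.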

Reviewed By: NarineK

Differential Revision: D37743920

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D37743920


99warriors added a commit to 99warriors/captum that referenced this pull request Jul 31, 2022
Summary:
Pull Request resolved: pytorch#1002

- For self influence computation, there needs to be an iteration over both checkpoints and batches. This diff adds a `by_checkpoints` option: if true, the outer iteration is over checkpoints; if false, the outer iteration is over batches. Because self influence computation can be called through the `influence` and `self_influence` methods, this option is added to both methods. Because only `TracInCP` and `TracInCPFast` should be used for self influence computation, only those classes are changed.
- To implement this option, the old `self_influence` method, whose outer iteration was over checkpoints, is renamed to a private `_self_influence_by_checkpoints` method. A new `_self_influence_by_batches` method is added, whose outer iteration is over batches; it re-uses the `_self_influence_by_checkpoints` method to compute self influence scores for a single batch (that method can accept both a single batch and a dataloader yielding batches). Because the logic of this method is the same for all classes, a helper method, `_self_influence_by_batches_helper`, is added to `captum.influence._utils.common`. Finally, the new `self_influence` method simply chooses whether to call `_self_influence_by_checkpoints` or `_self_influence_by_batches`.
- Documentation describing the two options for `by_checkpoints` is added to the `self_influence` and `influence` methods.
- `test_tracin_show_progress` now differentiates between 2 modes: "self influence by checkpoints" (the original test for progress bar when calculating self influence scores, which checks whether the outer progress bar over checkpoints and inner progress bars over batches both reach 100%), and the newly added mode "self influence by batches", which checks whether the progress bar over batches reaches 100%.
- `test_tracin_self_influence` now also checks whether computing self influence scores gives the same result regardless of whether `by_checkpoints` is True or False

Reviewed By: NarineK

Differential Revision: D37743920

fbshipit-source-id: e7fe669d3fdbc2d2b3c4c16ed3eb56651b0bd8fa
Summary:
Pull Request resolved: pytorch#994

change `TracInCP._self_influence_batch_tracincp` and `TracInCP._self_influence_batches_tracincp_fast` to be named `self_influence`, which is now public, and to accept a DataLoader yielding batches (as well as a single batch, as before).  The modified helper function can be called by external functions to compute self influence.

The helper itself is also changed to improve efficiency, by reducing the number of times checkpoints are loaded.  The modified helper, despite being able to compute self influence scores for a dataloader yielding batches, still only loads each checkpoint once, per call.  This is because the modified helper now has an outer iteration over checkpoints, and an inner iteration over batches (the order of iteration is reversed compared to before). This helper is called by `influence` when running it in self influence mode.
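The efficiency argument above can be made concrete with a toy load counter. These are hypothetical functions (the arithmetic is a stand-in, and `loads` is just a list recording simulated checkpoint loads, not captum's API); the point is only the iteration order.

```python
def scores_checkpoints_outer(checkpoints, batches, loads):
    # Outer loop over checkpoints: each checkpoint is "loaded" exactly
    # once per call, no matter how many batches there are.
    scores = [0.0] * sum(len(b) for b in batches)
    for ckpt in checkpoints:
        loads.append(ckpt)  # one load per checkpoint
        i = 0
        for batch in batches:
            for x in batch:
                scores[i] += ckpt * x * x  # stand-in influence term
                i += 1
    return scores

def scores_batches_outer(checkpoints, batches, loads):
    # Reversed order: every checkpoint is re-loaded for every batch.
    scores = []
    for batch in batches:
        batch_scores = [0.0] * len(batch)
        for ckpt in checkpoints:
            loads.append(ckpt)  # one load per checkpoint PER batch
            for i, x in enumerate(batch):
                batch_scores[i] += ckpt * x * x
        scores.extend(batch_scores)
    return scores
```

With C checkpoints and B batches, the first ordering performs C loads and the second performs C × B loads, while both return identical scores.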

The reason we cannot simply increase the batch size to reduce the number of checkpoint loads is that for large models (precisely those for which loading checkpoints is expensive), the model itself takes up too much memory, so the batch size cannot be made very large.

Minor change: for the `influence_src_dataset` argument of all `__init__`s, add a description of the assumptions we make about the batches yielded by the dataloader.

Differential Revision: D35603078

fbshipit-source-id: 87063052e68441b82514489f4d9f9ad29b396da4
Summary:
Pull Request resolved: pytorch#1002

- For self influence computation, there needs to be an iteration over both checkpoints and batches. This diff adds a `by_checkpoints` option: if true, the outer iteration is over checkpoints; if false, the outer iteration is over batches. Because self influence computation can be called through the `influence` and `self_influence` methods, this option is added to both methods. Because only `TracInCP` and `TracInCPFast` should be used for self influence computation, only those classes are changed.
- To implement this option, the old `self_influence` method, whose outer iteration was over checkpoints, is renamed to a private `_self_influence_by_checkpoints` method. A new `_self_influence_by_batches` method is added, whose outer iteration is over batches; it re-uses the `_self_influence_by_checkpoints` method to compute self influence scores for a single batch (that method can accept both a single batch and a dataloader yielding batches). Because the logic of this method is the same for all classes, a helper method, `_self_influence_by_batches_helper`, is added to `captum.influence._utils.common`. Finally, the new `self_influence` method simply chooses whether to call `_self_influence_by_checkpoints` or `_self_influence_by_batches`.
- Documentation describing the two options for `by_checkpoints` is added to the `self_influence` and `influence` methods.
- `test_tracin_show_progress` now differentiates between 2 modes: "self influence by checkpoints" (the original test for progress bar when calculating self influence scores, which checks whether the outer progress bar over checkpoints and inner progress bars over batches both reach 100%), and the newly added mode "self influence by batches", which checks whether the progress bar over batches reaches 100%.
- `test_tracin_self_influence` now also checks whether computing self influence scores gives the same result regardless of whether `by_checkpoints` is True or False

Reviewed By: NarineK

Differential Revision: D37743920

fbshipit-source-id: a4e0c44299b31bf50fe2b5b4cb4d2e62c669208a


2 participants