
DO NOT MERGE - Optimization module for Feature Visualization #412


Closed · wants to merge 1 commit

Conversation


@ludwigschubert ludwigschubert commented Jun 23, 2020

This is an early-stage PR for a WIP PyTorch implementation of feature visualization and other techniques from papers I co-authored. The current code was created together with Gabriel Goh and is based on earlier TensorFlow implementations of these techniques by Chris Olah, Alex Mordvintsev, and me. While the commit structure doesn't reflect this, one aspect we should consider is some form of attribution to those original authors.

I'm posting it so SK and maybe others can collaborate on it in the coming weeks! :-) It is not yet ready for merging, and there may still be major module reorganizations.

Roadmap

  • Attribution to original DeepDream authors + Gabe re: FFT
  • Integrate existing tests into Captum's test suite
  • Add more tests! \o/ I'd love to get 100% line coverage on the core modules
  • Unify the API with the Captum API: a single class that's callable per "technique" (check details before implementing?)
  • Consider whether we need an abstraction around "an optimization process" (in terms of stopping criteria, reporting losses, etc.) or whether there are sufficiently strong conventions in PyTorch land for such tasks
  • Check if we can integrate Eli's FFT param changes (mostly for simplification)
  • Make a table of other PyTorch interpretability tools for the readme?
  • Do we need image viewing helpers and io helpers, or should we throw those out?
  • Can we integrate paper references more closely with the code?
  • Future: can we use attribution from the methods in the attribution package for our visualization tasks that require it?
  • Write up how core issues are solved in various frameworks: getting gradients from and wrt arbitrary tensors in the graph (in Pytorch: both for modules and for jitted graphs), overriding gradients for ops, extracting activations (in PyTorch: forward hooks), etc.
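
The last roadmap item mentions extracting activations via forward hooks. As a concrete illustration of that mechanism (all names here are hypothetical, not the PR's API), a minimal sketch might look like:

```python
import torch
import torch.nn as nn

# Hypothetical sketch: capture a module's activations with a forward
# hook (the standard PyTorch mechanism) so a loss can be built on them.
class ActivationCatcher:
    def __init__(self, module: nn.Module):
        self.activation = None
        self.handle = module.register_forward_hook(self._hook)

    def _hook(self, module, inputs, output):
        # keep the live tensor (no .detach()) so gradients can flow back
        self.activation = output

    def remove(self):
        self.handle.remove()

net = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Conv2d(8, 4, 3))
catcher = ActivationCatcher(net[1])  # hook the ReLU's output

x = torch.randn(1, 3, 16, 16, requires_grad=True)
net(x)
loss = catcher.activation.mean()
loss.backward()  # gradient w.r.t. the input image, through the hook
catcher.remove()
```

Because the hook stores the live output tensor, gradients with respect to arbitrary intermediate tensors fall out of the usual autograd machinery.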

WIP optim module
@greentfrapp

Hey @ludwig, happy to be able to help!

Just looked through the code and here are some comments, feel free to disagree!

  • Loving the nn.Module for everything! Really torch-y! (Embarrassingly, I've never actually tried using the buffers before so TIL)

  • Is the flow gonna be something like:

```python
net = googlenet(pretrained=True)
image = images.NaturalImage()
target = net.mixed3b._pool_reduce[1]
loss_fn = objectives.channel_activation(target, 137)
obj = objectives.InputOptimization(net, image, transforms, [target], loss_fn)
optimize(obj)
```

I think this is fine. I was thinking we might wanna:

  • Move InputOptimization to optimize.py
  • Have optimize as a method for InputOptimization

I was also wondering if there is a better way to identify targets. I imagine it will be tedious for larger networks with many more nested modules.
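
For instance (a hypothetical helper, not anything in the PR), targets could be looked up by their dotted path from `named_modules()` rather than by chaining attribute accesses:

```python
import torch.nn as nn

# Hypothetical helper: look up a target layer by its dotted name from
# named_modules(), instead of chaining attributes by hand.
def get_target(model: nn.Module, name: str) -> nn.Module:
    modules = dict(model.named_modules())  # keys like "1.1" or "mixed3b.conv"
    if name not in modules:
        raise KeyError(f"{name!r} not found in model")
    return modules[name]

net = nn.Sequential(
    nn.Conv2d(3, 8, 3),
    nn.Sequential(nn.ReLU(), nn.Conv2d(8, 4, 3)),
)
target = get_target(net, "1.1")  # the inner Conv2d
```

Recent PyTorch versions also provide `Module.get_submodule` for the same dotted-name lookup.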

  • ImageTensor sounds great

    • We can have methods like load (image files), show and save (then maybe we don't need separate viewing and io helpers)
    • Attributes height, width, channels, (parametrization?)
    • Maybe optionally apply normalization from inside the ImageTensor when passing into a model, and un-normalize when visualizing or exporting
  • I assume with suppress(AbortForwardException) is meant for the RuntimeError that comes from passing in inputs that are smaller than expected?

  • Just dropping a pin here about the dead ReLU issue - not super sure what's a good torch-y solution

  • Re. an abstraction around "an optimization process": I think in general for optimization-based interpretability in PyTorch these are some of the more cumbersome parts (there may be more but these come to mind for now):

    1. Having to hook the modules
    2. Fixing the dead ReLU problem
    3. Suppressing RuntimeError for too-small inputs because net(image) runs the entire model

The first 2 issues are probably separate from the actual optimization step. For that last thing, I'm not sure how I feel about writing suppress every time. That may be a reason for abstracting optimize.

  • On that note, if we are not abstracting optimize then we may want to change this (-1 * loss_value.mean()), i.e. put the -1 into the objective functions so that people don't forget about it

  • Lastly, haha I suppose clarity.pytorch is an internal library used by the Clarity team?
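
The `suppress`/early-exit pattern from the points above could look like the following sketch. The exception name mirrors the PR's `AbortForwardException`, but this is an illustration, not the PR's actual implementation: a hook on the target layer saves its output and aborts the forward pass, so later layers (which might raise RuntimeError on too-small inputs) never run.

```python
import torch
import torch.nn as nn
from contextlib import suppress

class AbortForwardException(Exception):
    pass

class StopAtTarget:
    def __init__(self, target: nn.Module):
        self.activation = None
        target.register_forward_hook(self._hook)

    def _hook(self, module, inputs, output):
        self.activation = output
        raise AbortForwardException  # skip the rest of the network

# the Linear layer would fail on this conv output, but is never reached
net = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Linear(10, 10))
stop = StopAtTarget(net[1])

with suppress(AbortForwardException):
    net(torch.randn(1, 3, 8, 8))

loss = -1 * stop.activation.mean()  # sign flip: minimizing maximizes activation
```

Wrapping the `with suppress(...)` call inside an `optimize` abstraction would spare users from writing it at every call site.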

@NarineK
Contributor

NarineK commented Jul 4, 2020

Thank you again for the PR @ludwigschubert and thank you for the feedback comments @greentfrapp.

I looked into the packaging structure at a high level and here are a couple of ideas. I'd be happy to discuss them and hear your thoughts as well.

captum.optim.tech package -> It looks like here we have specific examples, visualizations, and dependencies on kornia, clarity.pytorch, and lucid. I'd suggest moving this to a tutorial notebook. Some of the functionality can be moved to the optim.utils package, for example. We can make it not depend on clarity.pytorch or lucid and be more self-contained.

captum.optim.param.transform -> This is great for supporting different types of input transformations. In order not to depend on the specific library 'kornia', we can make the rotate, scale, shear, and translate functions configurable from outside. In the tutorials we can use 'kornia'.
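
A hypothetical sketch of that injection idea (names are illustrative, not the PR's code): the transform pipeline takes plain callables, so kornia or torchvision functions can be plugged in from outside without a hard dependency in the library itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from typing import Callable, Sequence

# The pipeline only depends on torch; the transforms are injected.
class TransformPipeline(nn.Module):
    def __init__(self, transforms: Sequence[Callable[[torch.Tensor], torch.Tensor]]):
        super().__init__()
        self.transforms = list(transforms)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for t in self.transforms:
            x = t(x)
        return x

# dependency-free stand-ins; in a tutorial these could be replaced with
# kornia rotate/scale/shear/translate callables
pipeline = TransformPipeline([
    lambda x: x + 0.01 * torch.randn_like(x),          # random jitter
    lambda x: F.pad(x, (2, 2, 2, 2), mode="reflect"),  # reflective padding
])
out = pipeline(torch.randn(1, 3, 16, 16))  # -> shape (1, 3, 20, 20)
```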

captum.optim.optim.objectives and captum.optim.optimize -> In Captum we put core logic under the _core package. Since this can be considered core, we can move objectives.py together with optimize.py and output_hook.py into the captum.optim._core package.
I agree with @greentfrapp, it would make sense to move InputOptimization to optimize.py and have optimize as a method of InputOptimization.
Also, I noticed that there are both a single_target_objective function and a SingleTargetObjective class. It would be good to be consistent and extend from the Objective abstract class. It also looks like InputOptimization extends from Objective while at the same time we pass neuron_activation as loss_function. I think it would make sense either to keep loss_function or to use the Objective abstraction. I guess we need to play with it a bit and see whether we want to keep the Objective abstraction.
I agree, it would be great to abstract out -1 * loss_value.mean(). In some OSS implementations I have also noticed that people preferred to use loss_value.sum() instead of mean, for example. Making it customizable would be great.
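
As an illustrative sketch only (not the PR's code), an Objective base class could own both the reduction (mean vs. sum, as noted above) and the sign flip, so callers never write -1 * loss_value.mean() themselves:

```python
import torch
from abc import ABC, abstractmethod

class Objective(ABC):
    def __init__(self, reduction: str = "mean"):
        # customizable reduction: mean or sum
        self.reduce = torch.mean if reduction == "mean" else torch.sum

    @abstractmethod
    def activations_to_loss(self, activations: torch.Tensor) -> torch.Tensor:
        ...

    def __call__(self, activations: torch.Tensor) -> torch.Tensor:
        # negate here so minimizing the returned value maximizes the objective
        return -self.reduce(self.activations_to_loss(activations))

class ChannelActivation(Objective):
    def __init__(self, channel: int, reduction: str = "mean"):
        super().__init__(reduction)
        self.channel = channel

    def activations_to_loss(self, activations):
        return activations[:, self.channel]

obj = ChannelActivation(channel=2)
loss = obj(torch.ones(1, 4, 8, 8))  # mean of ones, negated -> -1.0
```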

captum.optim.models package -> It looks like here we have some custom code related to the inception model and a reference to tensorflow. We might think of using torchvision if possible; if not, we can see how to reduce the dependency on external packages such as tensorflow. Since some of it is specifically related to the inception model, we can also think of moving some of it to the tutorials section.

captum.optim.io package -> It looks like here as well we have a reference to lucid and image-specific implementations. I agree with @greentfrapp. If we can move some of it into ImageTensor - that would be great.

captum.optim.optim/output_hooks.py -> As mentioned above, it would be great to move this to the _core package. Hooks have some limitations; we can discuss them more. ModuleReuseException looks great - we can make use of it in the hook.
I'd also love to discuss the dead ReLU more.
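
On the dead-ReLU point, one known workaround (in the spirit of lucid's redirected ReLU gradient; the sign convention below is an assumption for a gradient-descent-on-negated-loss setup, not the PR's implementation) is to override the backward pass so a dead unit can still receive gradient that would revive it:

```python
import torch

# "Redirected" ReLU sketch: forward is a normal ReLU, but where a unit
# is dead (pre-activation <= 0) the backward pass only blocks gradient
# that would push the unit further negative, letting helpful gradient
# through so the unit can come back to life during optimization.
class RedirectedReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        grad_in = grad_out.clone()
        # dead unit with grad_out > 0: a descent step (x -= lr * grad)
        # would push x further down, so block it; otherwise pass through
        grad_in[(x <= 0) & (grad_out > 0)] = 0
        return grad_in

x = torch.tensor([-1.0, 2.0], requires_grad=True)
(-RedirectedReLU.apply(x).sum()).backward()
# a standard ReLU would give x.grad == [0., -1.]; here the dead unit
# also receives the helpful gradient, giving [-1., -1.]
```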

Test cases: To be consistent with Captum tests we can move the test cases to the optim package under the test folder: https://github.com/pytorch/captum/tree/master/tests
Agreed, let's increase test coverage for the sub-packages and functionality under captum.optim.

Thank you everyone for contributing :)
It might be good to set up a meeting to discuss how we want to proceed further. I have also created a separate branch called optim-wip; we can open this PR against that branch, merge it, and make further improvements there until it is ready to be merged into master.

Let me know what you think.
Thank you :)

@greentfrapp

Hey @NarineK I'm reviving this thread since we are kickstarting this again. Also to refresh myself on this PR.

Re. your comments:

captum.optim.techs -> Yup I think moving these to a tutorial notebook sounds good, probably to https://github.com/pytorch/captum/tree/master/tutorials? We can make it more self-contained, like my previous work on Lucent.

captum.optim.param.transform -> I've received some related feedback from @ProGamerGov that helps to remove dependencies on Kornia for differentiable transformations, if that's what we are looking for.

captum.optim.optim.(objectives/optimize/output_hook) -> I don't mind moving these to _core. Although I think we should have an early discussion about what gets moved to _core and what is kept within optim. This would be helpful for structuring this entire PR. Agree with your other comments as well!

captum.optim.models -> In our previous work on Lucent, we used a PyTorch version of the GoogLeNet/InceptionV1 model (with the same weights) that was converted by @ProGamerGov. So Lucent does not require any tf imports to load the model. We can probably do the same here. Alternatively, we can use a torchvision-supported model for tests, demos and tutorials instead.

captum.optim.io -> Agree re. implementing an ImageTensor class that includes io functions. We can definitely work on this.

Tests -> Yup! Moving to the test folder sounds good.

Summary/Discussion/Next Steps:

  • I suggest we start by thinking about the overall structure i.e. what to keep in optim and what to move to existing Captum folders e.g.
    • _core -> objectives, optimize, output_hook
    • tests -> tests
    • tutorials -> tutorials
    • optim -> current optim.param (transforms and input image parameterizations), ImageTensor
  • Related to above, draft out main components - currently there seems to be
    • InputOptimization
    • Different objectives
    • Different transforms
    • Different input image parameterizations
    • ImageTensor

Happy to have a meeting about this!

@ProGamerGov
Contributor

captum.optim.models -> If the default model that @greentfrapp is using is a bit too confusing in terms of layer names, I also have a converted version of the model and others that more closely resemble PyTorch's GoogleNet model class format. This version also uses adaptive pooling so that any size input is supported. This may make it easier for PyTorch users to use and understand the model.
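
As an illustration of the adaptive-pooling point (a toy model, not the converted GoogleNet): nn.AdaptiveAvgPool2d always reduces to a fixed spatial size, so the same classifier head accepts any input size.

```python
import torch
import torch.nn as nn

head = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d((1, 1)),  # 1x1 output regardless of input size
    nn.Flatten(),
    nn.Linear(8, 10),
)

small = head(torch.randn(1, 3, 32, 32))
large = head(torch.randn(1, 3, 224, 224))  # both -> shape (1, 10)
```

A fixed pooling layer would instead bake one input resolution into the Linear layer's input dimension, which matters for feature visualization at varying image sizes.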

@NarineK
Contributor

NarineK commented Nov 30, 2020

Thank you for the collaboration on this PR: @ProGamerGov, @greentfrapp and @ludwigschubert! Closing this PR since we have merged it into the optim-wip branch, and we currently have PRs open against that branch.
