Can we get rid of `Thunk`?

@jrevels: @willtebbutt and I were going through the Differentials to make sure we actually know what they are for,
and started to wonder whether we need all of them.

Each one we get rid of simplifies things a lot,
especially when it comes to https://github.com/JuliaDiff/ChainRulesCore.jl/issues/16

I think we might be able to just have 
`Wirtinger`, `One`, `Zero`, and `DNE`.

### Wirtinger

I have only the barest understanding of what this is.
It effectively seems like a particularly convenient way to deal with
derivatives with respect to complex numbers (in contrast to handling them as structs, as in #4).
Probably useful.

### DNE

Does not exist. Obviously useful.

### One, Zero
Useful identities that are evaluated lazily, and can thus be removed from the chain efficiently.

### Casted

It is kind of the generalization of `One` and `Zero`
(in that `One()` could also be written `Casted(true)`, etc.).
It lets us delay computing a broadcast,
so that it can be fused later.
But I think in the short term we can simplify the code
by replacing, say,
`Rule((Δx, Δy) -> sum(Δx * cast(y)) + sum(cast(x) * Δy))`
with
`Rule((Δx, Δy) -> sum(Δx .* y) + sum(x .* Δy))`
(from [here](https://github.com/JuliaDiff/ChainRules.jl/blob/e24c190e2a82c4651266ba994df7c124cf44cf33/src/rulesets/LinearAlgebra/dense.jl#L12)),
which for that particular case would even be identical in performance, I think,
since it does not end up returning any kind of lazy computation.
Later we can try getting back the lazy computation and broadcast fusing by returning `broadcasted`.
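
For reference, here is a rough sketch of what "returning `broadcasted`" could look like, using only `Base.Broadcast`. The variable names and values are made up for illustration, and this is not how ChainRules currently does it.

```julia
using Base.Broadcast: broadcasted, materialize, Broadcasted

# Illustrative inputs (made up for this sketch).
x  = [1.0, 2.0, 3.0]
y  = [4.0, 5.0, 6.0]
Δy = [0.1, 0.2, 0.3]

# `broadcasted` builds a lazy description of `x .* Δy`; nothing is computed yet.
lazy = broadcasted(*, x, Δy)

# A later step can wrap it in another lazy broadcast, so both operations fuse.
fused = broadcasted(+, lazy, y)

# Only `materialize` runs the (single, fused) loop and allocates the result.
result = materialize(fused)
```

Here `result` is the same as the eager `x .* Δy .+ y`, but the intermediate `x .* Δy` array is never allocated.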

Getting rid of `Casted` would solve #10.

### Thunk

Thunk seemed really useful at first,
but I am not sure anymore that it actually does anything.

A thunk basically wraps a computation `f(v)` that returns a differential in a closure `() -> f(v)`,
so as not to have to compute it yet.
But any time you interact with it (e.g. via `add` or `mul`) it gets `extern`ed,
because if you don't do that you can get huge chains of thunks that call thunks,
and also because at the time `add` (for example) is called you probably do actually want the value: you're not going to skip it and only use the other part.

And using it inside a rule isn't actually deferring anything extra until the backward pass, since rules themselves are deferred until the backward pass.

E.g. looking at [this rule](https://github.com/JuliaDiff/ChainRules.jl/blob/e24c190e2a82c4651266ba994df7c124cf44cf33/src/rulesets/LinearAlgebra/dense.jl#L29-L33),
rather than
```
function rrule(::typeof(inv), x::AbstractArray)
    Ω = inv(x)
    m = @thunk(-Ω')
    return Ω, Rule(ΔΩ -> m * ΔΩ * Ω')
end
```
we could just do
```
function rrule(::typeof(inv), x::AbstractArray)
    Ω = inv(x)
    return Ω, Rule(ΔΩ -> -Ω' * ΔΩ * Ω')
end
```
Which boils down to the same thing, since when the rule is invoked the thunk gets `extern`ed anyway, by `*` becoming `mul`.
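
To sanity-check that claim, here is a small self-contained sketch. `LazyValue` and `unwrap` are hypothetical stand-ins for `Thunk` and `extern`, and plain closures stand in for `Rule`, so this only mimics the shape of the real code.

```julia
using LinearAlgebra

# Hypothetical stand-in for `Thunk`: just a wrapped zero-argument closure.
struct LazyValue{F}
    f::F
end

# Hypothetical stand-in for `extern`: run the wrapped closure.
unwrap(t::LazyValue) = t.f()

# Version with the deferred `-Ω'`, forced as soon as it is multiplied.
function inv_rule_with_thunk(x)
    Ω = inv(x)
    m = LazyValue(() -> -Ω')
    return Ω, ΔΩ -> unwrap(m) * ΔΩ * Ω'
end

# Version without any thunk.
function inv_rule_plain(x)
    Ω = inv(x)
    return Ω, ΔΩ -> -Ω' * ΔΩ * Ω'
end

x  = [2.0 1.0; 1.0 3.0]
ΔΩ = [1.0 0.0; 0.0 1.0]

Ω1, pullback1 = inv_rule_with_thunk(x)
Ω2, pullback2 = inv_rule_plain(x)

# Both pullbacks do the same work at call time and give the same answer.
same = pullback1(ΔΩ) ≈ pullback2(ΔΩ)
```

In both versions `-Ω'` is only computed when the pullback is called, so the thunk does not defer anything further.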

Even in the case of derivatives with respect to multiple things, where you would have multiple rules referencing the same thunk, it still doesn't change anything, since thunks don't cache
(#7).
I recall @jrevels saying that they used to cache, so maybe still having them is a legacy of that time and we just didn't notice that they don't do anything anymore.
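
A toy illustration of the no-caching point, assuming a thunk is nothing more than a wrapped closure. `UncachedThunk`, `force`, `rule_a`, and `rule_b` are names invented for this sketch, not the package's API.

```julia
using LinearAlgebra

# Count how often the "expensive" computation actually runs.
const calls = Ref(0)
function expensive()
    calls[] += 1
    return [1.0 2.0; 3.0 4.0]
end

# Hypothetical thunk without caching: forcing it reruns the closure every time.
struct UncachedThunk{F}
    f::F
end
force(t::UncachedThunk) = t.f()

m = UncachedThunk(expensive)

# Two rules that share the same thunk.
rule_a(Δ) = force(m) * Δ
rule_b(Δ) = Δ * force(m)

Δ = [1.0 0.0; 0.0 1.0]
rule_a(Δ)
rule_b(Δ)

# Each rule paid for its own evaluation of the shared thunk.
n_evaluations = calls[]
```

Without caching, sharing the thunk between rules saves nothing: the wrapped computation runs once per rule that touches it.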
