Separate data and instruction #128


Merged: 7 commits into master on Mar 3, 2022

Conversation

@KDr2 (Member) commented Mar 3, 2022

Implement #127 and #97.

The type info task will be in another PR.

@KDr2 (Member, Author) commented Mar 3, 2022

This PR reduces the Integration Test time from 140–160 minutes to about 100 minutes.

@rikhuijzer (Contributor) commented Mar 3, 2022

> This PR reduces the Integration Test time from 140–160 minutes to about 100 minutes.

Tested locally or on the runners?

EDIT: Benchmark below: #128 (comment)

@rikhuijzer (Contributor) left a comment

I don't understand most of the code, but I like that the struct fields have clearer types.

Comment on lines +90 to +94
haskey(tf.bindings, :_1) && (tf.bindings[:_1].val = tf.func)
for i in 1:length(args)
    slot = Symbol("_", i + 1)
    haskey(tf.bindings, slot) && (tf.bindings[slot].val = args[i])
end
Reviewer (Member):

It would be faster if one could avoid looking up every key twice, once in haskey and once in setindex!.

One alternative would be to use get with a default value such as a NoKeyFound() singleton instead of haskey (nothing seems ambiguous for the arguments, unfortunately). Another alternative would be to iterate over the keys, but this would only work, or at least be efficient, if most keys are :_i with i <= length(args) + 1.
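
A minimal sketch of the single-lookup idea, assuming a mutable binding cell with a val field; Binding, NoKeyFound, and bind_args! are hypothetical names for illustration, not Libtask's actual API:

# Hypothetical sketch: replace the haskey-then-setindex! pair with a single
# lookup via get and a sentinel default value. All names here are illustrative.
struct NoKeyFound end

mutable struct Binding
    val::Any
end

function bind_args!(bindings::Dict{Symbol,Binding}, func, args...)
    b = get(bindings, :_1, NoKeyFound())
    b isa NoKeyFound || (b.val = func)
    for i in 1:length(args)
        b = get(bindings, Symbol("_", i + 1), NoKeyFound())
        b isa NoKeyFound || (b.val = args[i])
    end
    return bindings
end

With this shape each slot symbol is hashed once per assignment instead of twice.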

@KDr2 (Member, Author):

We need an Option or Maybe type in Julia 😄

Here most keys in the Dict will not be :_i.

Reviewer (Member):

The plan is to replace tf.bindings::Dict with a NamedTuple in the next PR, so it probably won’t matter.
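
A rough illustration of that direction, assuming the bindings become a NamedTuple of Ref cells keyed by the slot symbols; the layout and the set_slot! helper are assumptions for illustration, not the actual plan:

# Illustrative only: slot access on a NamedTuple avoids Dict hashing, so there
# is no haskey/setindex! double lookup; Ref cells keep the stored values mutable.
bindings = (_1 = Ref{Any}(nothing), _2 = Ref{Any}(nothing))

function set_slot!(bindings::NamedTuple, slot::Symbol, value)
    haskey(bindings, slot) || return bindings
    bindings[slot][] = value
    return bindings
end

set_slot!(bindings, :_2, 42)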

@KDr2 (Member, Author) commented Mar 3, 2022

> > This PR reduces the Integration Test time from 140–160 minutes to about 100 minutes.
>
> Tested locally or on the runners?

I simply read the times from the GitHub Actions runs, but they may not be very accurate due to job concurrency and queueing.

@KDr2 (Member, Author) commented Mar 3, 2022

You can see the time the Integration Test consumed here: https://github.com/TuringLang/Libtask.jl/actions/workflows/IntegrationTest.yml

@rikhuijzer (Contributor) commented Mar 3, 2022

> This PR reduces the Integration Test time from 140–160 minutes to about 100 minutes.

Nice. I can confirm that this PR is good for performance. I ran the following subset of the Turing tests (Turing.jl/test/runtests.jl) on my system:

[...]
include(pkgdir(Turing)*"/test/test_utils/AllUtils.jl")

using Logging
Logging.disable_logging(Logging.Info)

@testset "Turing" begin
    @time include("inference/AdvancedSMC.jl")
    @time include("inference/gibbs.jl")
    @time include("contrib/inference/sghmc.jl")
    @time include("stdlib/RandomMeasures.jl")
end

master branch

 43.139966 seconds (87.17 M allocations: 5.339 GiB, 3.38% gc time, 94.57% compilation time)
443.315047 seconds (1.18 G allocations: 77.239 GiB, 3.21% gc time, 18.66% compilation time)
 13.570206 seconds (39.32 M allocations: 2.376 GiB, 4.77% gc time, 98.35% compilation time)
  9.087712 seconds (17.42 M allocations: 1.047 GiB, 4.29% gc time, 73.16% compilation time)

sep-data branch (this PR)

 41.664786 seconds (84.94 M allocations: 5.297 GiB, 3.60% gc time, 94.48% compilation time)
390.978920 seconds (842.42 M allocations: 69.394 GiB, 3.04% gc time, 20.74% compilation time)
 13.259697 seconds (39.32 M allocations: 2.376 GiB, 4.87% gc time, 98.34% compilation time)
  8.926231 seconds (17.57 M allocations: 1.072 GiB, 4.81% gc time, 67.26% compilation time)

The number of allocations either decreased or stayed roughly the same.

@rikhuijzer (Contributor):
> > > This PR reduces the Integration Test time from 140–160 minutes to about 100 minutes.
> >
> > Tested locally or on the runners?
>
> I simply read the times from the GitHub Actions runs, but they may not be very accurate due to job concurrency and queueing.

Mostly due to different CPUs: not all runners use the same CPU, and the difference in running time can be as big as 40%.

@KDr2 (Member, Author) commented Mar 3, 2022

> Nice. I can confirm that this PR is good for performance.

I think the main reason is that we now only need to copy the data (the variables), and no longer need to copy the instructions.
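
A hypothetical sketch of that idea, where the instruction list is shared and only the per-task bindings are duplicated; TapeSketch, InstructionSketch, and fork are illustrative names, not Libtask's actual types:

# Hypothetical sketch, not Libtask's API: keep the instruction list (the code)
# separate from the bindings (the data), so copying a task only duplicates the
# bindings while the instructions are shared by reference.
struct InstructionSketch
    f::Function
    inputs::Vector{Symbol}
    output::Symbol
end

struct TapeSketch
    instructions::Vector{InstructionSketch}  # shared, never copied
    bindings::Dict{Symbol,Any}               # per-task state, copied on fork
end

fork(t::TapeSketch) = TapeSketch(t.instructions, copy(t.bindings))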

@yebai (Member) left a comment

Thanks @KDr2 - nice improvements!


abstract type AbstractInstruction end
abstract type Taped end
Reviewer (Member):

Maybe keep this type for now - it’ll become useful again in future work.

@KDr2 (Member, Author):

At the moment it only has one subtype, TapedFunction, so I removed it. I think it won't be hard to add it back when we need it.

But I'm OK with bringing it back now if you insist 😄

@yebai (Member) commented Mar 3, 2022

I think it’s ready to merge when the CI passes.

@yebai merged commit f4e64d5 into master on Mar 3, 2022
The delete-merged-branch bot deleted the sep-data branch on March 3, 2022 at 15:44.

Successfully merging this pull request may close these issues.

Remove the dependency on MacroTools
Separate code and data in instructions