Skip to content

Conversation

DhairyaLGandhi
Copy link
Member

No description provided.

@DhairyaLGandhi
Copy link
Member Author

CI failure seems unrelated

@vchuravy
Copy link

bors try

@bors
Copy link
Contributor

bors bot commented Sep 12, 2019

🔒 Permission denied

Existing reviewers: click here to make vchuravy a reviewer

@DhairyaLGandhi
Copy link
Member Author

bors try

bors bot added a commit that referenced this pull request Sep 12, 2019
@bors
Copy link
Contributor

bors bot commented Sep 12, 2019

try

Build failed

@maleadt
Copy link
Collaborator

maleadt commented Sep 18, 2019

I'd drop the branch restrictions, that way you get automatic testing for people who push to branches on Flux.jl (and only need Bors for external contributors).

@DhairyaLGandhi
Copy link
Member Author

Thanks @maleadt!

Does this look better?

@maleadt
Copy link
Collaborator

maleadt commented Sep 18, 2019

Yes. Any particular reason you define your own flux target and don't use the templates (e.g. https://raw.githubusercontent.com/JuliaGPU/gitlab-ci/master/templates/v4/test_v1.2.yml)?

@DhairyaLGandhi
Copy link
Member Author

We used to need it since we had to add CuArrays as a dependency for testing on the GPUs. But since we don't need to do that anymore, we can safely get rid of it.

@MikeInnes
Copy link
Member

Is this ready to go?

@DhairyaLGandhi
Copy link
Member Author

I think the code is there, but I am seeing access to undefined reference on Julia v1.1 and v1.2

@maleadt
Copy link
Collaborator

maleadt commented Sep 19, 2019

@MikeInnes reported something similar yesterday, I'll have a closer look.
EDIT: I misremembered, JuliaGPU/CuArrays.jl#404 (comment) is different.

@DhairyaLGandhi
Copy link
Member Author

Am I doing something incorrect here with regard to the error from the bors job above?

@maleadt
Copy link
Collaborator

maleadt commented Sep 19, 2019

No, bors just reported that the build failed.

@maleadt
Copy link
Collaborator

maleadt commented Sep 20, 2019

Interesting, this is a legitimate issue now that we depend on CuArrays from Flux and as a result the CuArrays __init__ functions might not have run before Flux' global code (which performs an allocation in global scope). I'll push a quick fix on a stable branch of CuArrays, but I've fixed the issue in the PR that moves over Flux DNN functionality (including that global allocation) to CuArrays. JuliaGPU/CuArrays.jl#404 (comment) was related after all 😄

@dhairyagandhi96 could you try the CuArrays stable branch? https://github.com/JuliaGPU/CuArrays.jl/tree/stable

@DhairyaLGandhi
Copy link
Member Author

bors try

Thanks!

bors bot added a commit that referenced this pull request Sep 20, 2019
@DhairyaLGandhi
Copy link
Member Author

ERROR: LoadError: LoadError: LoadError: CUDA error: invalid device context (code #201, ERROR_INVALID_CONTEXT)

Seems related to the initialisation

@maleadt
Copy link
Collaborator

maleadt commented Sep 20, 2019

Maybe it's easier to include the fix here instead: JuliaGPU/CuArrays.jl@5048e89

@bors
Copy link
Contributor

bors bot commented Sep 20, 2019

try

Build failed

@DhairyaLGandhi
Copy link
Member Author

bors try

bors bot added a commit that referenced this pull request Sep 22, 2019
@bors
Copy link
Contributor

bors bot commented Sep 22, 2019

try

Build succeeded

@bors
Copy link
Contributor

bors bot commented Sep 23, 2019

try

Build failed

@DhairyaLGandhi
Copy link
Member Author

bors try

bors bot added a commit that referenced this pull request Sep 23, 2019
@bors
Copy link
Contributor

bors bot commented Sep 23, 2019

try

Build failed

@DhairyaLGandhi
Copy link
Member Author

bors try

bors bot added a commit that referenced this pull request Sep 24, 2019
@bors
Copy link
Contributor

bors bot commented Sep 24, 2019

try

Build failed

@DhairyaLGandhi
Copy link
Member Author

bors try

bors bot added a commit that referenced this pull request Sep 24, 2019
@maleadt
Copy link
Collaborator

maleadt commented Sep 24, 2019

Why do you need to extend manually? The failures before didn't seem CI infrastructure-related.

@DhairyaLGandhi
Copy link
Member Author

In the default the template does a dev of the project per
https://github.com/JuliaGPU/gitlab-ci/blob/52d59dadbc16b7b5cbb87f6a276cb50f68ecf554/templates/v4/common.yml#L25

With having dependencies on the master in a couple places, it is possible that the exact commit is not picked up (or rather the tag is), hence returning the pullback not defined

Trying this out to see if this if this way allows for better reproducibility.

@maleadt
Copy link
Collaborator

maleadt commented Sep 24, 2019

With having dependencies on the master in a couple places, it is possible that the exact commit is not picked up (or rather the tag is), hence returning the pullback not defined

There shouldn't be a difference between develop $PATH and --project $PATH instantiate, right? Ah wait, I guess the former ignores any Manifest. But isn't that risky for CI, where regular package installation ignores the Manifest anyhow?

@MikeInnes
Copy link
Member

The benefits are that (a) the CI builds are completely reproducible and (b) it makes it substantially easier to do WIP on multiple packages at once (e.g. we can update Flux for Zygote changes and make patches as needed, rather than needing to tag a Zygote release, then doing several patch releases for issues Flux finds).

Of course at release time we need to make sure we're using release versions; the manifest then serves as a clear log of what needs tags and/or compat updates, without us having to keep track of that throughout the release cycle.

@DhairyaLGandhi
Copy link
Member Author

Right. It does mean that any breaking changes in the dependencies can break CI. This comes against the reproducibility of the environment (since the Manifest is ignored)

@DhairyaLGandhi
Copy link
Member Author

bors try

@bors
Copy link
Contributor

bors bot commented Sep 24, 2019

try

Already running a review

@maleadt
Copy link
Collaborator

maleadt commented Sep 24, 2019

That sounds like a good fit for CUDAnative/CuArrays/GPUArrays too. @dhairyagandhi96 the v3 templates used instantiate instead of develop, I can update those so that we can continue using them.

@DhairyaLGandhi
Copy link
Member Author

Okay, sounds good! We can then just use the corresponding targets without rewriting them here.

@bors
Copy link
Contributor

bors bot commented Sep 24, 2019

try

Build succeeded

@DhairyaLGandhi
Copy link
Member Author

bors r+

bors bot added a commit that referenced this pull request Sep 24, 2019
861: GPU CI maintainance  r=dhairyagandhi96 a=dhairyagandhi96



Co-authored-by: Dhairya Gandhi <[email protected]>
@bors
Copy link
Contributor

bors bot commented Sep 24, 2019

Build succeeded

@bors bors bot merged commit ce910da into FluxML:master Sep 24, 2019
@maleadt maleadt added the cuda label Nov 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants