
Running out of memory with julia v0.6 but not v0.5 #151

Closed
@raminammour


Hello,

The code below is a minimal reproducible example of the behavior (the original code came up in an application I was writing). Sorry that it is a bit convoluted; on my machine it had to be this involved to reproduce the error.

On Julia 0.5 the code runs to completion, but on 0.6 it runs out of memory.

```julia
addprocs(16)
@everywhere using DistributedArrays

function test_gc(dA, dlA)
    @sync for ip in procs(dA)
        @spawnat ip begin
            @time begin
                localpart(dlA)[:,:,ip] += 2localpart(dA)[:,:,1] + svd(localpart(dA)[:,:,2])[1][1] + 2convert(Array, dA[1:size(localpart(dA),1), 1:size(localpart(dA),2), 3])
                localpart(dlA)[:,:,ip+1] += 3localpart(dA)[:,:,1] + svd(localpart(dA)[:,:,5])[1][1] + 4convert(Array, dA[1:size(localpart(dA),1), 1:size(localpart(dA),2), 3])
                localpart(dlA)[:,:,ip+2] += 4localpart(dA)[:,:,1] + (localpart(dA)[:,:,7]) + 5convert(Array, dA[1:size(localpart(dA),1), 1:size(localpart(dA),2), 3])
            end
        end
    end
end

n1, n2, n3 = 2001, 2001, 701
for i = 1:16
    dA = drand((n1, n2, n3), workers()[1:i]); dlA = similar(dA)

    println(i)
    @time test_gc(dA, dlA)

    d_closeall()
    @everywhere gc()
end
```

I monitored the memory usage with top, and this is what happens:
1- The total memory should be the same for every i (about 60% of a 64 GB node), split across i procs
2- Each function call creates temporary arrays that need to be garbage collected
3- As the number of procs increases, the code runs faster, as one would hope
4- For some reason, memory deallocation and garbage collection are faster in 0.5 than in 0.6
5- As a result, while memory is being allocated for run i, residual memory from runs i-1, i-2, ... is still being deallocated
6- The code runs out of memory...
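
For a rough sense of scale, the footprint of the two distributed arrays can be estimated from the dimensions in the reproducer. This is only a back-of-the-envelope sketch: it assumes that `drand` produces `Float64` elements (8 bytes each) and it ignores the temporaries created inside `test_gc`. The names `bytes_per_array` and `resident_bytes` are made up for illustration.

```julia
# Dimensions taken from the reproducer above.
n1, n2, n3 = 2001, 2001, 701

# Assumption: drand yields Float64 elements, i.e. 8 bytes per element.
bytes_per_array = n1 * n2 * n3 * sizeof(Float64)

# dA and dlA = similar(dA) are both alive during each loop iteration.
resident_bytes = 2 * bytes_per_array

# Convert to GiB for comparison with what top reports.
println("one array (GiB): ", bytes_per_array / 2^30)
println("dA + dlA  (GiB): ", resident_bytes / 2^30)
```

Under these assumptions the pair comes to roughly 42 GiB, which is in the same range as the "about 60%" of the node seen in top. It also means there is not enough room for a second full pair, so memory from run i-1 that has not yet been released would be enough to exhaust the node during run i.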

I am not sure if this is expected behavior, or why 0.5 was more robust.

P.S. I am on master for DistributedArrays.
Cheers!
