This repository was archived by the owner on Mar 12, 2021. It is now read-only.
I'm fairly new to CuArrays and Flux, and I've run into a problem where training halts after some number of epochs. There is no CUDA out-of-memory error, but GPU utilization is extremely high for this simple linear model (99.97% on a 1080 Ti). The code sometimes finishes all 500 epochs without problems, but other times halts around epoch 150.
using LinearAlgebra
using Flux
using CuArrays, CUDAnative
using Flux.Optimise: update!
using Flux: crossentropy

device!(1)
CuArrays.allowscalar(false)

pred_loss(x, y) = sum((x .- y) .^ 2)

# dimensions
B = 250
linear = Dense(400, 144) |> gpu

# normalize each column of the weight matrix to unit norm
linear.W .= linear.W ./ sqrt.(sum(linear.W .^ 2, dims=1));

# training
E = 500
opt_U = Descent(0.01)
for e = 1:E
    running_l = 0
    c = 0
    for b = 1:100
        y = rand(144, B) |> gpu
        R = zeros(400, size(y, 2)) |> gpu
        l = 0
        grads = gradient(params(linear.W)) do
            l = pred_loss(y, linear(R))
            running_l += l
            return l
        end
        update!(opt_U, linear.W, grads[linear.W])
        # re-normalize the columns after each update
        linear.W .= linear.W ./ sqrt.(sum(linear.W .^ 2, dims=1))
        c += 1
    end
    println("Epoch: $e, Running loss: $(running_l / c)")
end
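In case it helps narrow things down, here is a CPU-only, scaled-down sketch of the same loop (a hypothetical variant I put together, with CuArrays removed and the dimensions and epoch count shrunk); if this version also hangs, the problem is probably not in the GPU path:

```julia
using Flux
using Flux.Optimise: update!

# CPU-only, scaled-down version of the training loop above,
# for checking whether the hang reproduces without CuArrays.
pred_loss(x, y) = sum((x .- y) .^ 2)

B = 25
linear = Dense(40, 14)
linear.W .= linear.W ./ sqrt.(sum(linear.W .^ 2, dims=1))

opt_U = Descent(0.01)
for e = 1:5
    for b = 1:10
        y = rand(14, B)
        R = zeros(40, size(y, 2))
        grads = gradient(params(linear.W)) do
            pred_loss(y, linear(R))
        end
        update!(opt_U, linear.W, grads[linear.W])
        linear.W .= linear.W ./ sqrt.(sum(linear.W .^ 2, dims=1))
    end
    println("Epoch: $e done")
end
```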
I'm seeing this on Ubuntu 18.04 with CuArrays v2.1.0. I'd appreciate any pointers.