Some ensemble run directories don't get copied over to HPC #3019

@Aariq

Bug Description

When running an ED2 model, start_model_runs() sometimes fails to copy some ensemble run directories to the remote host (HPC). One possible reason is that rsync is currently run inside a for-loop, and there may be limits on how many connections to the server can be open at once or how often new connections can be made. It would be more efficient to rsync all the ensemble files over at once, outside the for-loop, anyway, even if that doesn't fix this bug.
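A rough sketch of the batching idea, not PEcAn's implementation: rsync accepts several source paths in one invocation, so every run directory can go over a single connection. The helper name `build_rsync_args`, the host name, and the paths and run IDs below are all made up for illustration.

```r
# Hypothetical helper (not part of PEcAn.remote): build the argument vector
# for ONE rsync call that transfers every run directory at once.
build_rsync_args <- function(rundir, run_ids, host_name, host_rundir,
                             delete = TRUE) {
  # One source path per run; no trailing slash, so rsync recreates each
  # directory by name under the destination.
  src <- file.path(rundir, run_ids)
  # The trailing slash marks the destination as a directory.
  dst <- paste0(host_name, ":", host_rundir, "/")
  # `if (delete)` yields NULL when delete is FALSE, and c() drops NULL.
  c("-a", if (delete) "--delete", src, dst)
}

# Example values only (assumed, not taken from a real configuration).
args <- build_rsync_args(
  rundir = "/data/run",
  run_ids = c("run_001", "run_002"),
  host_name = "hpc.example.org",
  host_rundir = "/scratch/run"
)
print(args)

# To perform the transfer (requires rsync and SSH access to the host):
# status <- system2("rsync", shQuote(args))
# if (status != 0) stop("rsync exited with status ", status)
```

Checking the exit status of that single call would also give one place to raise an error if the transfer fails.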

It's either happening here:

# (excerpt from inside the for-loop over runs)
PEcAn.remote::remote.copy.to(
  host = settings$host,
  src = file.path(settings$rundir, run_id_string),
  dst = settings$host$rundir,
  delete = TRUE)

Or maybe here (I can't remember which):

out <- PEcAn.remote::start_qsub(
  run = run,
  qsub_string = settings$host$qsub,
  rundir = settings$rundir,
  host = settings$host,
  host_rundir = settings$host$rundir,
  host_outdir = settings$host$outdir,
  stdout_log = "stdout.log",
  stderr_log = "stderr.log",
  job_script = "job.sh")

To Reproduce

Difficult to reproduce, sorry; the failure is intermittent.

Expected behavior

All files for ensemble runs should be copied over, and if they can't be, there should be an informative warning or error.
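One way such a warning could look, as a minimal sketch: compare the run IDs that should be on the host with the directory names actually found there. The helper `check_runs_copied` and the run IDs are hypothetical, and how the remote directory listing is obtained is left out.

```r
# Hypothetical helper (not part of PEcAn): warn about run directories that
# are expected on the host but absent from the remote listing.
check_runs_copied <- function(expected_ids, remote_dirs) {
  missing_ids <- setdiff(expected_ids, remote_dirs)
  if (length(missing_ids) > 0) {
    warning(
      length(missing_ids), " of ", length(expected_ids),
      " run directories were not copied to the host: ",
      paste(missing_ids, collapse = ", "),
      call. = FALSE
    )
  }
  # Return the missing IDs so callers can retry or abort.
  invisible(missing_ids)
}

# Example values only; remote_dirs would come from listing the host's rundir.
# tryCatch captures the warning text so it can be inspected here.
result <- tryCatch(
  check_runs_copied(
    expected_ids = c("run_001", "run_002", "run_003"),
    remote_dirs = c("run_001", "run_003")
  ),
  warning = function(w) conditionMessage(w)
)
print(result)
```

Returning the missing IDs, rather than only warning, leaves room for the caller to re-copy just those runs.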
