How to print metrics (aggregated across all workers in DDP mode) to the console at the end of each epoch? #6122

@rudaoshi


❓ Questions and Help

What is your question?

How can I print metrics (aggregated across all workers in DDP mode) to the console at the end of each epoch?

When I run the program on a remote cluster, it is not convenient to use TensorBoard to view the logs.

Code

What have you tried?

First, I looked for a hook that is called after the metrics have been gathered from all workers, but I was not able to find one.
Callbacks are supposed to solve this kind of problem, but I do not know which one is the correct entry point.
I've tried on_validation_epoch_end and on_validation_end; both print a message in every worker.

Second, to produce the log message, I tried logging and getLogger, but they did not work. I tried print, and it works, but I would still prefer to use logging. Any suggestions?
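
A common reason for `logging` output not appearing is that the logger has no handler and the default effective level is WARNING, so `info` messages are dropped. That is an assumption here, since the original logging setup is not shown. The sketch below uses only the standard library; the function name `make_console_logger` is hypothetical, and the `rank` argument is assumed to come from the caller (for example `trainer.global_rank`).

```python
import logging
import sys


def make_console_logger(name, rank):
    """Hypothetical helper: INFO to stdout on rank 0, WARNING and above elsewhere."""
    logger = logging.getLogger(name)
    # Non-zero ranks stay quiet unless something is actually wrong.
    logger.setLevel(logging.INFO if rank == 0 else logging.WARNING)
    if not logger.handlers:
        # Attach a handler only once, even if the function is called again.
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)
    # Avoid duplicate lines if the root logger also has handlers.
    logger.propagate = False
    return logger
```

A call such as `make_console_logger("train", rank).info("val_loss=%.4f", value)` would then print on rank 0 only, while warnings and errors still surface from every worker.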

What's your environment?

  • OS: Mac
  • Packaging: pip
  • Version: 1.1.8

Labels: question, won't fix