FYI, I started writing a simple Pascal VOC dataset class. https://github.com/fmassa/vision/tree/voc_dataset