jveitchmichaelis (Collaborator) commented on Aug 21, 2025

This PR adds support for a basic DinoV3 backbone for RetinaNet.

As this is a WIP, I've added a few improvements to the CLI for debugging and logging; some of these I'd like to PR separately. There is also a minor fix to the dataset so that it actually uses root_dir for CSVs with full image paths, and a new config option for the log folder.

To use Comet, make COMET_API_KEY and COMET_WORKSPACE available in your environment.
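For example, in a shell session (both values are placeholders): `export COMET_API_KEY=<your-key>` and `export COMET_WORKSPACE=<your-workspace>`.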

Train with:

```bash
[uv run] deepforest --config-name dinov3 train
```

Please try to use the CLI as much as possible so we can test the user experience.

For development, I'd suggest making another config file with the train/val directories set up, for example:

```yaml
defaults:
  - dinov3
  - _self_

train:
  csv_file:
  root_dir:

validation:
  csv_file:
  root_dir:
```
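Assuming a file like the above is saved somewhere on the CLI's config search path (the name `local.yaml` here is just an example), training can then be launched with `uv run deepforest --config-name local train`.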

This will probably fail CI because we need to add a secret to pull the weights for testing. Locally the sanity checks pass (inference + train forward).

jveitchmichaelis marked this pull request as a draft on August 21, 2025
bw4sz self-requested a review on August 21, 2025
jveitchmichaelis force-pushed the dinov3 branch 2 times, most recently from 6639118 to 4996ef0 on August 22, 2025
jveitchmichaelis (Collaborator, Author) commented on Aug 22, 2025

I think these are roughly the different paths we're comparing (except that we would always start from a COCO-pretrained ResNet):

```mermaid
flowchart TD

    %% Datasets -> Backbones
    ImageNet([ImageNet]) --> ResNet[ResNet Backbone]
    ImageNet -.-> MSCOCO([MS-COCO])
    MSCOCO --> ResNet

    Sat493M([Sat-493M]) --> Dinov3[Dinov3 Backbone]
    LVD1689M([LVD-1689M]) --> Dinov3

    %% Backbones -> Pretrained RetinaNet
    ResNet --> Baseline[Pre-Trained RetinaNet]
    Dinov3 --> Baseline

    %% Fine-tuning paths
    Baseline --> FineTuned([Hand Annotations])
    Baseline -.-> LIDAR([Weak LIDAR Supervision])
    LIDAR --> FineTuned

    %% Merge paths into evaluation
    FineTuned --> NeonTree([NeonTreeEvaluation])
```

jveitchmichaelis (Collaborator, Author) commented on Aug 22, 2025

In-progress training logs can be found here: https://www.comet.com/jveitchmichaelis/deepforest/view/new/panels

To-dos:

- Evaluation for DinoV3-ViT-L (300M params)
- Evaluation for DinoV3-ViT-7B (7B params)
- Evaluation for ResNet50 (25M params) to confirm reproducibility of the existing pipeline

Currently performing cross-evaluation on the training dataset, followed by a "holdout" run on all of train + NeonTreeEval. All DinoV3 backbones are frozen for now, whereas we generally fine-tune ResNet. Previous hyper-parameters for ResNet:

- 40 epochs
- lr 1e-4

There may also be different hyper-parameters to consider for feature pooling, following the conventions in ViTDet: https://arxiv.org/abs/2203.16527
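As a rough illustration of what that could look like, here is a minimal sketch (plain PyTorch, not this PR's code; the module names, channel widths, and scale factors are all assumptions) of a ViTDet-style "simple feature pyramid" that builds multi-scale maps from a single stride-16 ViT feature map:

```python
import torch
import torch.nn as nn


class SimpleFeaturePyramid(nn.Module):
    """ViTDet-style neck: multi-scale maps built from one stride-16 ViT feature map.

    Illustrative sketch only; channel widths and scale factors are assumptions.
    """

    def __init__(self, in_channels: int = 1024, out_channels: int = 256):
        super().__init__()
        # Scale factors relative to the stride-16 input map: 4x up, 2x up, identity, 2x down.
        self.up4 = nn.Sequential(
            nn.ConvTranspose2d(in_channels, in_channels // 2, kernel_size=2, stride=2),
            nn.GELU(),
            nn.ConvTranspose2d(in_channels // 2, in_channels // 4, kernel_size=2, stride=2),
        )
        self.up2 = nn.ConvTranspose2d(in_channels, in_channels // 2, kernel_size=2, stride=2)
        self.same = nn.Identity()
        self.down2 = nn.MaxPool2d(kernel_size=2, stride=2)
        # Project every pyramid level to a common channel width for the detector head.
        self.projections = nn.ModuleList(
            [
                nn.Sequential(
                    nn.Conv2d(c, out_channels, kernel_size=1),
                    nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
                )
                for c in (in_channels // 4, in_channels // 2, in_channels, in_channels)
            ]
        )

    def forward(self, x: torch.Tensor) -> list[torch.Tensor]:
        # x: (B, C, H/16, W/16) -- ViT patch tokens reshaped into a 2D grid.
        levels = [self.up4(x), self.up2(x), self.same(x), self.down2(x)]
        return [proj(level) for proj, level in zip(self.projections, levels)]


if __name__ == "__main__":
    tokens = torch.randn(1, 1024, 64, 64)  # e.g. a 1024x1024 image at patch size 16
    for feat in SimpleFeaturePyramid()(tokens):
        print(feat.shape)  # strides 4, 8, 16, 32; 256 channels each
```

Unlike a ResNet FPN, the pyramid here is derived entirely from the last ViT feature map, so the upsampling/pooling choices become additional hyper-parameters to tune.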
