Conversation

ianstenbit
Contributor

I've verified that the numerics for SAM match correctly after this change.

This removes the requirement that users do the awkward imagenet mean+stddev scaling themselves -- now the model rescales correctly as it should (this is how we did the scaling for ViT as well)

@ianstenbit
Contributor Author

/gcbrun

@ianstenbit
Contributor Author

/gcbrun

@ianstenbit
Contributor Author

/gcbrun

@@ -123,6 +124,11 @@ def __init__(
# Use common rescaling strategy across keras_cv
x = keras.layers.Rescaling(1.0 / 255.0)(x)

# VITDet scales inputs based on the standard ImageNet mean/stddev.
Contributor

So we want to apply both types of rescaling? I'm a bit confused.

Contributor Author

The include_rescaling check and its associated Rescaling layer ensure that the inputs are scaled into the 0-1 range.

The subsequent bit then normalizes using the ImageNet mean and stddev, under the prior assumption that inputs are already scaled 0-1.
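The two-stage scaling described in this comment can be sketched as follows. This is an illustrative NumPy stand-in for the keras_cv layers, not the actual implementation; the `preprocess` function name and the use of NumPy are assumptions for the sake of a self-contained example.

```python
# Illustrative sketch (not the keras_cv code) of the two-stage scaling:
# step 1 maps [0, 255] -> [0, 1], step 2 standardizes with ImageNet stats.
import numpy as np

# ImageNet statistics, expressed on a 0-1 scale.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def preprocess(images, include_rescaling=True):
    """Scale uint8 images to 0-1, then standardize with ImageNet stats."""
    x = np.asarray(images, dtype=np.float64)
    if include_rescaling:
        # Step 1: equivalent of keras.layers.Rescaling(1.0 / 255.0).
        x = x / 255.0
    # Step 2: standardize, assuming inputs are already in [0, 1].
    return (x - IMAGENET_MEAN) / IMAGENET_STD

# A mid-gray pixel (128, 128, 128) lands near zero after standardizing.
out = preprocess(np.full((1, 1, 3), 128, dtype=np.uint8))
```

Note that the second step is only correct if the first has already run (or the caller passed 0-1 inputs), which is exactly the prior assumption stated above.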

Contributor

@jbischof left a comment

Sounds like you already verified, but wouldn't rescaling by 255 affect the ImageNet rescaling?

@ianstenbit
Contributor Author

Sounds like you already verified, but wouldn't rescaling by 255 affect the ImageNet rescaling?

Correct -- see comment inline

@ianstenbit ianstenbit merged commit 7712a81 into keras-team:master Sep 22, 2023
@ianstenbit ianstenbit deleted the vitdet-scaling branch September 22, 2023 23:31
Comment on lines +127 to +130
# VITDet scales inputs based on the standard ImageNet mean/stddev.
x = (x - ops.array([0.229, 0.224, 0.225], dtype=x.dtype)) / (
ops.array([0.485, 0.456, 0.406], dtype=x.dtype)
)
Contributor

@tirthasheshpatel Sep 23, 2023

Two things:

  • I think you got the mean and std mixed up :) From the SAM repo, mean is [123.675, 116.28, 103.53] ([0.485, 0.456, 0.406] when normalized) and std is [58.395, 57.12, 57.375] ([0.229, 0.224, 0.225] when normalized).
  • We should not do this because the SAM model first normalizes, then pads. If padded inputs are passed, the preprocessing operation would not remain the same. It's giving me suboptimal outputs when I run the demos.

@ianstenbit Let's revert this and document the preprocess step, what do you think?

Contributor Author

Yes this does look like the mean/std are backwards -- I'll fix that. Can you send your demos so that I can run them?
My test demo looked fine with this change, which makes me think it's not sensitive enough.

Contributor

Yes this does look like the mean/std are backwards -- I'll fix that. Can you send your demos so that I can run them?

Here: https://colab.research.google.com/drive/1wHTsYfmmZVuC71I4St1NshaAOQ6nUPFg?usp=sharing

I ran them with the patch in #2087 and the outputs look better. Output masks are now much closer to the demo on the original repo, with slight noise here and there (I think this is because the padded outlines take on a non-zero value after the normalization step). Nothing big though.
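The noise on padded outlines has a simple arithmetic explanation: a zero-valued padding pixel stops being zero once mean/std normalization is applied after padding. A quick illustrative check (not code from the PR):

```python
# Illustrative: zero padding (already on a 0-1 scale) becomes strongly
# negative after ImageNet-style normalization, so padded regions
# contribute non-zero activations downstream.
import numpy as np

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

padded_pixel = np.zeros(3)                 # zero padding
normalized = (padded_pixel - mean) / std   # roughly [-2.12, -2.04, -1.80]
```

This is why the order of operations matters: SAM normalizes first and pads afterwards, so its padded border stays at the (normalized) zero value rather than at `-mean / std`.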

My test demo looked fine with this change, which makes me think it's not sensitive enough.

I agree, it's not super-sensitive, which is good! I just thought there might be cases where this could lead to a huge difference. What do you think about keeping the normalization step an opt-in rather than always having it on?
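The opt-in idea suggested above could look something like the following sketch. The flag name `include_imagenet_normalization` is hypothetical, chosen here only for illustration; it is not the actual keras_cv API:

```python
# Hypothetical sketch of opt-in ImageNet normalization; the flag name
# `include_imagenet_normalization` is illustrative, not a real API.
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])

def preprocess(x, include_imagenet_normalization=False):
    x = np.asarray(x, dtype=np.float64) / 255.0
    if include_imagenet_normalization:
        # Leave this off when inputs are padded, so padding stays zero.
        x = (x - IMAGENET_MEAN) / IMAGENET_STD
    return x

plain = preprocess(np.zeros((1, 1, 3)))          # padding stays zero
shifted = preprocess(np.zeros((1, 1, 3)), True)  # padding becomes non-zero
```

Defaulting the flag to off would preserve the pre-PR behavior for callers who pad before preprocessing.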

ghost pushed a commit to y-vectorfield/keras-cv that referenced this pull request Nov 16, 2023
* Update VITDet to conform to KerasCV scaling standards

* dtype fix
yuvraj-wale pushed a commit to yuvraj-wale/keras-cv that referenced this pull request Feb 8, 2024
* Update VITDet to conform to KerasCV scaling standards

* dtype fix
4 participants