Description
Hello, thanks for the great work on this project!
I ran into an issue that looks very similar to the one reported here:
TransformerLensOrg/TransformerLens#1005
In my case, I was able to temporarily fix it using the approach proposed in this PR:
TransformerLensOrg/TransformerLens#999
That patch resolves the batched-vs-single discrepancy as long as I use left padding. For example:
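A minimal sketch of the comparison I mean (the prompt strings and the max-difference check are just illustrative):

```python
import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct", device="cuda", dtype=torch.bfloat16
)

prompts = ["The capital of France is", "Hi"]

# Batched forward pass with left padding, so the final position is aligned
# across prompts of different lengths.
batch_tokens = model.to_tokens(prompts, padding_side="left")
batch_logits = model(batch_tokens)[:, -1]

# The same prompts, one forward pass each (no padding involved).
single_logits = torch.stack([model(model.to_tokens(p))[0, -1] for p in prompts])

# With the PR linked above applied, this difference is small (numerical noise).
print((batch_logits - single_logits).abs().max())
```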
However, the same issue reappears when I run ReplacementModel: passing a batch of prompts as input produces different results than running the same prompts one by one. For example:
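A sketch of the failing case. I am assuming here that ReplacementModel inherits the usual HookedTransformer interface (`to_tokens`, calling the model returns logits) and that `from_pretrained` takes the model name plus the transcoder set; the exact arguments may differ by circuit-tracer version:

```python
import torch
from circuit_tracer import ReplacementModel

# Assumption: from_pretrained takes the model name and the transcoder set;
# the exact signature may differ across circuit-tracer versions.
model = ReplacementModel.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct",
    "facebook/crv-8b-instruct-transcoders",
    dtype=torch.bfloat16,
)

prompts = ["The capital of France is", "Hi"]

# Same comparison as above, left-padded batch vs. one prompt at a time.
batch_tokens = model.to_tokens(prompts, padding_side="left")
batch_logits = model(batch_tokens)[:, -1]

single_logits = torch.stack([model(model.to_tokens(p))[0, -1] for p in prompts])

# Unlike the plain HookedTransformer run, this difference is large for me.
print((batch_logits - single_logits).abs().max())
```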
I am using meta-llama/Llama-3.1-8B-Instruct as the LLM, together with the layer-0 transcoder from facebook/crv-8b-instruct-transcoders. Since ReplacementModel does not seem to support multi-GPU setups, I am currently testing with a single layer on one GPU.
I’m wondering whether ReplacementModel changes some internal configuration (e.g. padding, attention mask handling, or other settings) that could cause this behavior.
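A quick way to compare those settings between the plain model and the ReplacementModel, using the standard HookedTransformer config and Hugging Face tokenizer attributes (nothing here is specific to this repo):

```python
# Settings I would expect to match between the two models.
print(model.tokenizer.padding_side)   # "left" vs. "right"
print(model.cfg.default_prepend_bos)  # whether a BOS token is prepended
print(model.cfg.attention_dir)        # should be "causal"
```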