Batch vs single prompt outputs differ when using ReplacementModel #67

@louislichen

Description

Hello, thanks for the great work on this project!

I ran into an issue that looks very similar to the one reported here:
TransformerLensOrg/TransformerLens#1005

In my case, I was able to temporarily fix it using the approach proposed in this PR:
TransformerLensOrg/TransformerLens#999

This works for left padding. For example:

(screenshot omitted)

However, when I run ReplacementModel, the same issue appears again. Specifically, when I pass a batch of prompts as input, the results differ from running the same prompts one by one. For example:

(screenshot omitted)

I am using meta-llama/Llama-3.1-8B-Instruct as the LLM, and the layer-0 transcoder from facebook/crv-8b-instruct-transcoders. Since ReplacementModel does not seem to support multi-GPU setups, I am currently testing with a single layer on one GPU.
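To make the discrepancy concrete, here is the kind of comparison harness I am using (a minimal sketch; `get_logits` stands in for any wrapper around the model, e.g. one that calls `ReplacementModel` and returns the last-token logits per prompt, so the helper name and shapes are assumptions, not part of the library's API):

```python
import torch

def compare_batch_vs_single(get_logits, prompts, atol=1e-4):
    """Compare last-token logits from one batched call vs. per-prompt calls.

    `get_logits(prompts)` is assumed to return a tensor of shape
    (len(prompts), vocab_size) holding the last-token logits for each prompt.
    Returns the max absolute difference and whether it is within `atol`.
    """
    # One forward pass over the whole batch (padding applied across prompts).
    batched = get_logits(prompts)
    # One forward pass per prompt (no cross-prompt padding involved).
    singles = torch.cat([get_logits([p]) for p in prompts], dim=0)
    max_diff = (batched - singles).abs().max().item()
    return max_diff, max_diff <= atol
```

With the plain TransformerLens model plus the left-padding fix from TransformerLensOrg/TransformerLens#999, this reports a difference within tolerance; with `ReplacementModel` wrapping the same model, the reported difference is large, which is the behavior shown in the screenshots above.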

I’m wondering whether ReplacementModel changes some internal configuration (e.g. padding, attention mask handling, or other settings) that could cause this behavior.
