bugfixes to kylesayrs/transform_apply #375

Merged
kylesayrs merged 1 commit into kylesayrs/transform_apply from bdellabe/apply_transforms on Jul 2, 2025

brian-dellabetta
Contributor

Bugfixes to get this branch working with kylesayrs/transform-modifier.

Currently failing on load-up in vllm:

Loading safetensors checkpoint shards:   0% Completed | 0/14 [00:00<?, ?it/s]
ERROR 07-01 22:26:57 [core.py:515] EngineCore failed to start.
ERROR 07-01 22:26:57 [core.py:515] Traceback (most recent call last):
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 506, in run_engine_core
ERROR 07-01 22:26:57 [core.py:515]     engine_core = EngineCoreProc(*args, **kwargs)
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 390, in __init__
ERROR 07-01 22:26:57 [core.py:515]     super().__init__(vllm_config, executor_class, log_stats,
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/v1/engine/core.py", line 76, in __init__
ERROR 07-01 22:26:57 [core.py:515]     self.model_executor = executor_class(vllm_config)
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/executor/executor_base.py", line 53, in __init__
ERROR 07-01 22:26:57 [core.py:515]     self._init_executor()
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/executor/uniproc_executor.py", line 48, in _init_executor
ERROR 07-01 22:26:57 [core.py:515]     self.collective_rpc("load_model")
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/executor/uniproc_executor.py", line 57, in collective_rpc
ERROR 07-01 22:26:57 [core.py:515]     answer = run_method(self.driver_worker, method, args, kwargs)
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/utils.py", line 2671, in run_method
ERROR 07-01 22:26:57 [core.py:515]     return func(*args, **kwargs)
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/v1/worker/gpu_worker.py", line 180, in load_model
ERROR 07-01 22:26:57 [core.py:515]     self.model_runner.load_model()
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/v1/worker/gpu_model_runner.py", line 1601, in load_model
ERROR 07-01 22:26:57 [core.py:515]     self.model = model_loader.load_model(
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/model_executor/model_loader/base_loader.py", line 41, in load_model
ERROR 07-01 22:26:57 [core.py:515]     self.load_weights(model, model_config)
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/model_executor/model_loader/default_loader.py", line 269, in load_weights
ERROR 07-01 22:26:57 [core.py:515]     loaded_weights = model.load_weights(
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/model_executor/models/llama.py", line 601, in load_weights
ERROR 07-01 22:26:57 [core.py:515]     return loader.load_weights(
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/model_executor/models/utils.py", line 278, in load_weights
ERROR 07-01 22:26:57 [core.py:515]     autoloaded_weights = set(self._load_module("", self.module, weights))
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/model_executor/models/utils.py", line 236, in _load_module
ERROR 07-01 22:26:57 [core.py:515]     yield from self._load_module(prefix,
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/model_executor/models/utils.py", line 209, in _load_module
ERROR 07-01 22:26:57 [core.py:515]     loaded_params = module_load_weights(weights)
ERROR 07-01 22:26:57 [core.py:515]   File "/home/bdellabe/projects/.venv/lib/python3.10/site-packages/vllm/model_executor/models/llama.py", line 465, in load_weights
ERROR 07-01 22:26:57 [core.py:515]     param = params_dict[name]
ERROR 07-01 22:26:57 [core.py:515] KeyError: 'layers.14.mlp.down_proj.u_output.perm'

Signed-off-by: Brian Dellabetta <[email protected]>
brian-dellabetta changed the title from "bugfixes" to "bugfixes to kylesayrs/transform_apply" on Jul 1, 2025
@kylesayrs
Contributor

Yep, these are needed as discussed.

As for the error, you're trying to load a model with online rotations, but vllm doesn't implement online rotations (yet).
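
A mismatch like this can be surfaced before the engine starts by comparing the checkpoint's tensor names against the parameter names the target model registers. The following is a minimal sketch under stated assumptions: `find_unexpected_keys` and the sample name lists are hypothetical and this is not vllm's loader; only the `u_output.perm` key is taken from the traceback above.

```python
def find_unexpected_keys(checkpoint_keys, expected_params):
    """Return checkpoint names that the target model has no parameter for.

    Hypothetical helper for illustration; not part of vllm or compressed-tensors.
    """
    expected = set(expected_params)
    return sorted(name for name in checkpoint_keys if name not in expected)


# Assumed sample data: the weight name is illustrative, the `.perm` name
# is the key reported in the KeyError above.
checkpoint_keys = [
    "layers.14.mlp.down_proj.weight",
    "layers.14.mlp.down_proj.u_output.perm",
]
# Stand-in for a loader's name -> parameter mapping (values are irrelevant here).
expected_params = {"layers.14.mlp.down_proj.weight": None}

unexpected = find_unexpected_keys(checkpoint_keys, expected_params)
print(unexpected)
```

Any name returned here would raise a `KeyError` in a loader that indexes its parameter dict directly by checkpoint name.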

kylesayrs merged commit 85f40b5 into kylesayrs/transform_apply on Jul 2, 2025
kylesayrs deleted the bdellabe/apply_transforms branch on July 2, 2025 01:11
dsikka pushed a commit that referenced this pull request Jul 9, 2025
* add utilities

Signed-off-by: Kyle Sayers <[email protected]>

* add tests

Signed-off-by: Kyle Sayers <[email protected]>

* add additional tests

Signed-off-by: Kyle Sayers <[email protected]>

* add utils and tests

Signed-off-by: Kyle Sayers <[email protected]>

* Implement transform factories

Signed-off-by: Kyle Sayers <[email protected]>

* add permutations

Signed-off-by: Kyle Sayers <[email protected]>

* add delete_offload_module

Signed-off-by: Kyle Sayers <[email protected]>

* key inverses by weight

Signed-off-by: Kyle Sayers <[email protected]>

* fix tests

Signed-off-by: Kyle Sayers <[email protected]>

* standardize random hadamard

Signed-off-by: Kyle Sayers <[email protected]>

* prepend input hooks

Signed-off-by: Kyle Sayers <[email protected]>

* apply sqrt division first

Signed-off-by: Kyle Sayers <[email protected]>

* use divided hadamards

Signed-off-by: Kyle Sayers <[email protected]>

* fix typo

Signed-off-by: Kyle Sayers <[email protected]>

* add random option

Signed-off-by: Kyle Sayers <[email protected]>

* use random seeds, rename matrix multiply

Signed-off-by: Kyle Sayers <[email protected]>

* add deterministic generation to random matrix

Signed-off-by: Kyle Sayers <[email protected]>

* fix perm math

Signed-off-by: Kyle Sayers <[email protected]>

* update docstrings

Signed-off-by: Kyle Sayers <[email protected]>

* update docstrings

Signed-off-by: Kyle Sayers <[email protected]>

* cleanup

Signed-off-by: Kyle Sayers <[email protected]>

* cleanup 2

Signed-off-by: Kyle Sayers <[email protected]>

* make seed optional

Signed-off-by: Kyle Sayers <[email protected]>

* remove iterable check and missing return value

Signed-off-by: Kyle Sayers <[email protected]>

* Remove unrelated changes

* simplify code

Signed-off-by: Kyle Sayers <[email protected]>

* implement apply, use in tests

Signed-off-by: Kyle Sayers <[email protected]>

* use hadamards database file

Signed-off-by: Kyle Sayers <[email protected]>

* try manifest

Signed-off-by: Kyle Sayers <[email protected]>

* try setup, update hadamards list

Signed-off-by: Kyle Sayers <[email protected]>

* fix setup

Signed-off-by: Kyle Sayers <[email protected]>

* add docstrings, cleanup

Signed-off-by: Kyle Sayers <[email protected]>

* fix setup, thank you @dbarbuzzi

Signed-off-by: Kyle Sayers <[email protected]>

* remove numpy, add tests

Signed-off-by: Kyle Sayers <[email protected]>

* solidify dtype, add gpu tests

Signed-off-by: Kyle Sayers <[email protected]>

* fix docstring

Signed-off-by: Kyle Sayers <[email protected]>

* add device option

Signed-off-by: Kyle Sayers <[email protected]>

* construct on execution device, cache on offload device

Signed-off-by: Kyle Sayers <[email protected]>

* save construction device changes for later

Signed-off-by: Kyle Sayers <[email protected]>

* construct on execution device, cache on offload device

* cite nja sloane

Signed-off-by: Kyle Sayers <[email protected]>

* remove dreg

Signed-off-by: Kyle Sayers <[email protected]>

* put on device via safe_open

Signed-off-by: Kyle Sayers <[email protected]>

* nits and docstrings

Signed-off-by: Kyle Sayers <[email protected]>

* update docstring

Signed-off-by: Kyle Sayers <[email protected]>

* Merge

* merge with construct: construct in float32

Signed-off-by: Kyle Sayers <[email protected]>

* construct with same dtype, constructing on fp32 found no difference

Signed-off-by: Kyle Sayers <[email protected]>

* remove unnecessary imports

Signed-off-by: Kyle Sayers <[email protected]>

* bugfixes (#375)

Signed-off-by: Brian Dellabetta <[email protected]>

* use factory_kwargs

Signed-off-by: Kyle Sayers <[email protected]>

* add frozen dict to deps

Signed-off-by: Kyle Sayers <[email protected]>

* fix style

Signed-off-by: Kyle Sayers <[email protected]>

* merge

Signed-off-by: Kyle Sayers <[email protected]>

* use delete_offload_module

Signed-off-by: Kyle Sayers <[email protected]>

* add docstrign

Signed-off-by: Kyle Sayers <[email protected]>

* use parametrize

Signed-off-by: Kyle Sayers <[email protected]>

* remove random from tests

Signed-off-by: Kyle Sayers <[email protected]>

---------

Signed-off-by: Kyle Sayers <[email protected]>
Signed-off-by: Brian Dellabetta <[email protected]>
Co-authored-by: Brian Dellabetta <[email protected]>