Add Mixture of Experts override to the UI #7527
Open
Quiet-Joker wants to merge 14 commits into oobabooga:main from
Conversation
Added handling for MoE expert overrides and reset logic for model loading.
Added MoE expert information variables for model loading.
Refactor MoE expert settings to use shared variables instead of model_settings.
- Removed spurious f prefix from the no-placeholder string in update_gpu_layers_and_vram.
- Refactored the VRAM formula into named intermediate variables (kv_term, layer_term) to sidestep the W503/W504 conflict; those two rules are mutually exclusive (one forbids a line break before a binary operator, the other after), so restructuring is the only clean resolution. A sketch of the style follows.
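For illustration, a minimal sketch of that named-intermediates style, assuming a simplified formula (the real one in update_gpu_layers_and_vram has more terms; kv_term and layer_term are the names from this PR, everything else here is assumed):

```python
def estimate_vram(n_layers, bytes_per_layer, n_ctx, kv_bytes_per_token):
    # Naming the sub-expressions means no line ever has to break around a
    # binary operator, so neither W503 (break before operator) nor
    # W504 (break after operator) can fire.
    kv_term = n_ctx * kv_bytes_per_token      # KV-cache contribution
    layer_term = n_layers * bytes_per_layer   # offloaded-layer weights
    return kv_term + layer_term
```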
- Both except Exception as e: blocks captured e but then used traceback.format_exc() instead, leaving e as a dead binding. Changed to except Exception: in both places.
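The pattern in question, sketched with a stand-in loader (the real call sites are in this PR's diff):

```python
import traceback

def load_model(path):
    raise RuntimeError(f"failed to load {path}")  # stand-in failure

# Before: `except Exception as e:` bound e but never used it, because
# traceback.format_exc() reads the active exception from the interpreter,
# not from the bound name. After: drop the unused binding.
try:
    load_model("model.gguf")
except Exception:
    print(traceback.format_exc())
```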
Removed the row_split checkbox from the UI model menu.
Stops gather_interface_values from reading a stale True when the user clicks Load after switching models.
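A minimal sketch of that reset, assuming the flag name from the guard in the next item (where exactly the PR hooks this into model switching is not shown here):

```python
from modules import shared  # text-generation-webui's shared state module

def reset_moe_override():
    # Clear the override when a new model is selected, so that
    # gather_interface_values sees a fresh False rather than the previous
    # model's leftover True when the user clicks Load.
    shared.args.moe_experts_override_enabled = False
```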
Tighten the guard to also require moe_total_experts > 0, so even if shared.args.moe_experts_override_enabled is somehow left True, a non-MoE model (which has no expert_count key in its GGUF) can never trigger the override. This is the defensive backstop.
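Sketched as a predicate (the flag and moe_total_experts are named in this PR; the function wrapper is illustrative):

```python
def should_apply_moe_override(args, moe_total_experts):
    # Both conditions must hold: the user enabled the override AND the
    # loaded GGUF actually reported an expert count. A dense model has no
    # expert_count key, so moe_total_experts stays 0 and the override is
    # rejected even if the flag was left stale.
    return args.moe_experts_override_enabled and moe_total_experts > 0
```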
Checklist:
- Add the ability to override the number of experts on models such as Gemma, GLM, Mistral, etc., without having to manually write the command in the extra flags of the llama server.
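For context, the manual route this replaces is llama.cpp's --override-kv extra flag; a hedged sketch of building it (the per-architecture metadata key is assumed to follow the <arch>.expert_used_count pattern — verify against the model's GGUF metadata):

```python
def build_expert_override_flag(arch: str, n_experts: int) -> str:
    # llama-server accepts --override-kv KEY=TYPE:VALUE, with int as the
    # type for expert counts. The key prefix varies by architecture
    # (e.g. "qwen2moe", or "llama" for Mixtral), so it is a parameter here.
    return f"--override-kv {arch}.expert_used_count=int:{n_experts}"

# Example: build_expert_override_flag("qwen2moe", 4)
# -> "--override-kv qwen2moe.expert_used_count=int:4"
```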