n-k (Contributor) commented Nov 3, 2025

Add support for tensor buffer type overrides:

  • Added a new impl block for LlamaModelParams with functions for adding buffer type overrides
  • Currently only CPU overrides can be set
  • Added a convenience function equivalent to llama.cpp's --cpu-moe (-cmoe) CLI flag
  • Updated the simple example with a --cmoe option
  • Tested with GPT OSS 20B
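
For readers unfamiliar with buffer type overrides, the idea in the list above can be sketched roughly as follows. This is a standalone illustration, not the code from this PR: `ModelParamsSketch`, `BufferType`, both method names, and `is_moe_expert_tensor` are hypothetical stand-ins, and the pattern string is an assumption about what llama.cpp's CPU-MoE flag registers (the exact regex may differ between llama.cpp versions). The real `LlamaModelParams` wraps the C struct and may look quite different.

```rust
/// Where a matching tensor's weights should be placed.
/// Only `Cpu` is modelled, mirroring the "CPU overrides only" scope above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BufferType {
    Cpu,
}

/// ASSUMPTION: a regex of this shape selects the MoE expert feed-forward
/// tensors. Check the llama.cpp sources for the exact pattern in your version.
const MOE_EXPERTS_PATTERN: &str = r"\.ffn_(up|down|gate)_exps";

/// Hypothetical stand-in for the model parameters; NOT the real `LlamaModelParams`.
#[derive(Debug, Default)]
struct ModelParamsSketch {
    /// (tensor-name pattern, target buffer type) pairs, in insertion order.
    overrides: Vec<(String, BufferType)>,
}

impl ModelParamsSketch {
    /// Register a pattern whose matching tensors should stay in CPU memory.
    fn add_cpu_buft_override(&mut self, pattern: &str) {
        self.overrides.push((pattern.to_owned(), BufferType::Cpu));
    }

    /// Convenience wrapper: keep all MoE expert tensors on the CPU.
    fn add_cpu_moe_override(&mut self) {
        self.add_cpu_buft_override(MOE_EXPERTS_PATTERN);
    }
}

/// Hand-written equivalent of `MOE_EXPERTS_PATTERN`, used here only because
/// the standard library has no regex engine.
fn is_moe_expert_tensor(name: &str) -> bool {
    [".ffn_up_exps", ".ffn_down_exps", ".ffn_gate_exps"]
        .iter()
        .any(|needle| name.contains(needle))
}
```

In this sketch, calling `add_cpu_moe_override()` on a default `ModelParamsSketch` records a single CPU override, and a tensor name such as `blk.0.ffn_up_exps.weight` would be selected by it while an attention tensor such as `blk.0.attn_q.weight` would not.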

MarcusDunn (Contributor) left a comment

Remove the eprintln in favor of tracing (or remove it entirely) and this looks good.

n-k (Contributor, Author) commented Nov 4, 2025

Removed the unnecessary eprintln.

MarcusDunn (Contributor) commented

Will merge and release once the checks pass.

MarcusDunn (Contributor) commented

It's going to need a cargo fmt.

n-k (Contributor, Author) commented Nov 5, 2025

Done. Formatting is fixed.

@MarcusDunn MarcusDunn merged commit 1187928 into utilityai:main Nov 5, 2025
3 of 5 checks passed