Description
As visible in the referenced commit below, the GPTQ authors have published some improvements to their quantization method, and those changes are now in QWOP's implementation.
I imagine some testing is required, since QWOP didn't land the changes as a PR with confirmed quality before merging. That said, adding a flag to TGW so that --new-eval is passed on to QWOP would at least be a good place to start for testing.
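As a rough idea of what the passthrough could look like, here is a minimal Python sketch of a launcher-side flag that gets forwarded to the GPTQ-for-LLaMa quantization script. The script path, positional arguments, and subprocess-style invocation are illustrative assumptions, not TGW's actual code; only the --new-eval flag name comes from the GPTQ-for-LLaMa readme.

```python
import argparse
import subprocess
import sys

# Hypothetical sketch: forward --new-eval from a TGW-style launcher to the
# GPTQ-for-LLaMa quantization script. The script location and argument
# layout below are assumptions for illustration only.
parser = argparse.ArgumentParser()
parser.add_argument("--model", required=True, help="model directory to quantize")
parser.add_argument("--new-eval", action="store_true",
                    help="pass --new-eval through to GPTQ-for-LLaMa")
args, passthrough = parser.parse_known_args()

cmd = [sys.executable, "repositories/GPTQ-for-LLaMa/llama.py", args.model, "c4"]
if args.new_eval:
    cmd.append("--new-eval")
cmd.extend(passthrough)  # forward any other unrecognized flags unchanged

subprocess.run(cmd, check=True)
```

Using parse_known_args keeps the launcher from having to mirror every new quantization flag; anything it doesn't recognize is simply forwarded.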
Additional Context
Added to the readme for GPTQ-for-LLaMA:
- Changed to support new features proposed by GPTQ.
- Slightly adjusted preprocessing of C4 and PTB for more realistic evaluations (used in our updated results); can be activated via the flag --new-eval.
- Two new tricks: --act-order (quantizing columns in order of decreasing activation size) and --true-sequential (performing sequential quantization even within a single Transformer block). Those fix GPTQ's strangely bad performance on the 7B model (from 7.15 to 6.09 Wiki2 PPL) and lead to slight improvements on most models/settings in general.
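Regarding the --act-order trick in the last item above, the idea is just to choose the column quantization order by activation statistics. A conceptual PyTorch sketch of that ordering step follows; the use of a Hessian-diagonal proxy for "activation size" is an assumption here, and the real GPTQ implementation differs in detail.

```python
import torch

def act_order_permutation(activation_stat: torch.Tensor) -> torch.Tensor:
    """Illustrative only: return a column order of decreasing activation
    statistic (e.g. a per-column Hessian diagonal), which is the idea
    behind --act-order. Not the actual GPTQ code."""
    return torch.argsort(activation_stat, descending=True)

# Columns with larger activation energy get quantized first.
stat = torch.tensor([0.3, 2.1, 0.9, 1.5])
print(act_order_permutation(stat))  # tensor([1, 3, 2, 0])
```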