@YangWang92 I would very much like to add support for VPTQ quantization in GPTQModel, for both quantization and inference. Since GPTQModel is now a modular codebase after a recent refactor, we are able to add new quantization methods such as QQQ via a plugin (processor) structure. We are also part of the upstream HF Transformers, Optimum, PEFT, vLLM, and SGLang ecosystem when it comes to quantization support.
I see there are some samples showing how to quantize, but I'm not sure how to generate the Hessians first. GPTQ also uses Hessians built from calibration data fed through module forwards, so I'm not sure if this is the same thing.
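To clarify what I mean by GPTQ-style Hessians, this is roughly the per-module accumulation GPTQ performs over calibration forwards (a simplified sketch in plain PyTorch, not VPTQ's code):

```python
import torch

def accumulate_hessian(calib_inputs, in_features):
    """GPTQ-style running Hessian H = (2 / n) * sum(x x^T), built from
    the inputs a forward hook captures for one Linear module."""
    H = torch.zeros(in_features, in_features)
    n = 0
    for x in calib_inputs:                      # x: [batch, seq, in_features]
        x = x.reshape(-1, in_features).float()  # flatten to [tokens, in_features]
        H *= n / (n + x.shape[0])               # rescale the previous running sum
        n += x.shape[0]
        H += (2.0 / n) * (x.T @ x)              # add this batch's contribution
    return H
```

Is this the same kind of Hessian VPTQ expects, or is it collected differently?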
Ref: https://github.com/microsoft/VPTQ/issues/126
The above guide requires a `hessian_path` and an `inv_hessian_path` matrix, but I don't see a Hessian processor. Or are the Hessian paths optional, so that if the files don't exist they are generated on the fly (slowly)?
Thanks!
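For reference, the on-the-fly fallback I have in mind is the usual GPTQ-style damped inverse-Hessian precompute, roughly as sketched below (just the math; the file names and the exact format VPTQ expects for `hessian_path` / `inv_hessian_path` are assumptions on my part):

```python
import torch

def precompute_inv_hessian(H: torch.Tensor, percdamp: float = 0.01) -> torch.Tensor:
    """Dampen H and return the upper Cholesky factor of its inverse,
    as GPTQ does before quantizing a module."""
    damp = percdamp * torch.mean(torch.diag(H))
    H = H + damp * torch.eye(H.shape[0], dtype=H.dtype, device=H.device)
    H_inv = torch.cholesky_inverse(torch.linalg.cholesky(H))  # H^{-1} via Cholesky
    return torch.linalg.cholesky(H_inv, upper=True)

# Hypothetical usage: persist both matrices so they can be loaded by path later.
# torch.save(H, "hessian.pt")
# torch.save(precompute_inv_hessian(H), "inv_hessian.pt")
```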