[FEATURE] ADD VPTQ #1463

Open
Qubitium opened this issue Mar 14, 2025 · 3 comments

@Qubitium
Collaborator

Qubitium commented Mar 14, 2025

@YangWang92 I would very much like to add VPTQ support to GPTQModel for both quantization and inference. Since the recent refactor, GPTQModel is a modular codebase, so we can now add new quantization methods such as QQQ via a plugin (process) structure. We are also part of the upstream HF Transformers, Optimum, Peft, vLLM, and SGLang ecosystem when it comes to quantization support.

I see there are some quantization samples, but I am not sure how to generate the Hessians first. GPTQ also uses Hessians, computed from calibration data fed through the module forwards, so I am not sure whether this is the same thing.
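For reference, here is a minimal sketch of the GPTQ-style accumulation mentioned above: a running estimate of H = (2 / n) * sum of x xᵀ over the inputs a linear layer sees during calibration. It is framework-agnostic NumPy and not VPTQ's or GPTQModel's actual code. The class name `HessianAccumulator` and its methods are hypothetical, and counting every token row as one sample is a simplifying assumption. In practice the inputs would be captured with a forward hook on each module.

```python
import numpy as np


class HessianAccumulator:
    """Running estimate of H = (2 / n) * sum_i x_i x_i^T (hypothetical helper)."""

    def __init__(self, in_features: int):
        self.H = np.zeros((in_features, in_features), dtype=np.float64)
        self.nsamples = 0

    def add_batch(self, x: np.ndarray) -> None:
        # x holds the inputs to one linear layer, shape (..., in_features).
        # Flatten all leading dimensions so each row is one input vector.
        x = x.reshape(-1, x.shape[-1]).astype(np.float64)
        total = self.nsamples + x.shape[0]
        # Rescale the previous average, then fold in the new batch.
        self.H *= self.nsamples / total
        self.H += (2.0 / total) * (x.T @ x)
        self.nsamples = total
```

Calling `add_batch` once per calibration batch leaves `acc.H` equal to the Hessian over all rows seen so far, without keeping the activations in memory.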

Ref: https://github.com/microsoft/VPTQ/issues/126

The above guide requires `hessian_path` and `inv_hessian_path` matrices, but I don't see a Hessian processor. Or are the Hessian paths optional, so that if the files do not exist they are generated on the fly (slowly)?
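To make the second of those two inputs concrete, here is a sketch of one common way to get an inverse from an accumulated Hessian: add diagonal damping, then invert through a Cholesky factorization. This follows the GPTQ convention (the `percdamp` name is borrowed from there). Whether VPTQ derives its inverse Hessian this way is an assumption, and the on-disk format expected at `hessian_path` / `inv_hessian_path` is not specified in this thread, so nothing is written to disk. `damped_inverse` is a hypothetical helper name.

```python
import numpy as np


def damped_inverse(H: np.ndarray, percdamp: float = 0.01) -> np.ndarray:
    """Invert a symmetric PSD Hessian after adding diagonal damping."""
    H = H.astype(np.float64, copy=True)
    # Damping proportional to the mean diagonal makes H positive definite
    # even when the calibration data leaves it rank deficient.
    damp = percdamp * float(np.mean(np.diag(H)))
    H[np.diag_indices_from(H)] += damp
    # Cholesky factorization H = L L^T gives H^-1 = L^-T L^-1.
    L = np.linalg.cholesky(H)
    L_inv = np.linalg.inv(L)
    return L_inv.T @ L_inv
```

An all-zero Hessian (a layer that never saw calibration data) gets zero damping and makes the factorization fail, so such layers need separate handling.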

Thanks!

@YangWang92

YangWang92 commented Mar 19, 2025

Hi @Qubitium, sorry about the late reply, and thanks for pinging me. Let me quickly check your requirements.

@YangWang92

BTW, my colleague @wejoncy integrated VPTQ in his qllm. Here is an example of how qllm processes the hessian matrix. https://github.com/wejoncy/QLLM/blob/main/qllm/quantization/vptq/qllm_hessian.py

@Qubitium
Collaborator Author

> BTW, my colleague @wejoncy integrated VPTQ in his qllm. Here is an example of how qllm processes the hessian matrix. https://github.com/wejoncy/QLLM/blob/main/qllm/quantization/vptq/qllm_hessian.py

@YangWang92 I think this is exactly what I was looking for!
