Commit f3d5326

chore(model gallery): add kalomaze_qwen3-16b-a3b
Signed-off-by: Ettore Di Giacinto <[email protected]>
1 parent c0a206b commit f3d5326

1 file changed: +23 -0 lines changed

gallery/index.yaml

Lines changed: 23 additions & 0 deletions
@@ -472,6 +472,29 @@
     - filename: Qwen3-30B-A1.5B-High-Speed.Q4_K_M.gguf
       sha256: 2fca25524abe237483de64599bab54eba8fb22088fc21e30ba45ea8fb04dd1e0
       uri: huggingface://mradermacher/Qwen3-30B-A1.5B-High-Speed-GGUF/Qwen3-30B-A1.5B-High-Speed.Q4_K_M.gguf
+- !!merge <<: *qwen3
+  name: "kalomaze_qwen3-16b-a3b"
+  urls:
+    - https://huggingface.co/kalomaze/Qwen3-16B-A3B
+    - https://huggingface.co/bartowski/kalomaze_Qwen3-16B-A3B-GGUF
+  description: |
+    A man-made horror beyond your comprehension.
+
+    But no, seriously, this is my experiment to:
+
+    measure the probability that any given expert will activate (over my personal set of fairly diverse calibration data), per layer
+    prune 64/128 of the least used experts per layer (with reordered router and indexing per layer)
+
+    It can still write semi-coherently without any additional training or distillation done on top of it from the original 30b MoE. The .txt files with the original measurements are provided in the repo along with the exported weights.
+
+    Custom testing to measure the experts was done on a hacked version of vllm, and then I made a bespoke script to selectively export the weights according to the measurements.
+  overrides:
+    parameters:
+      model: kalomaze_Qwen3-16B-A3B-Q4_K_M.gguf
+  files:
+    - filename: kalomaze_Qwen3-16B-A3B-Q4_K_M.gguf
+      sha256: 34c86e1a956349632a05af37a104203823859363f141e1002abe6017349fbdcb
+      uri: huggingface://bartowski/kalomaze_Qwen3-16B-A3B-GGUF/kalomaze_Qwen3-16B-A3B-Q4_K_M.gguf
 - &gemma3
   url: "github:mudler/LocalAI/gallery/gemma.yaml@master"
   name: "gemma-3-27b-it"
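
The model description in this entry outlines a two-step procedure: measure how often each expert activates per layer, then keep only the most-used half and reindex the router. The following is a minimal sketch of that idea on plain Python lists, not the author's actual script (which is described only as "bespoke" and is not shown in this commit). The function names, the list-of-rows router layout, and the choice to keep surviving experts in their original relative order are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def activation_probabilities(counts: Sequence[int], num_tokens: int) -> List[float]:
    """Estimate, for one layer, the probability that each expert activates.

    `counts[i]` is how many calibration tokens were routed to expert i
    (hypothetical measurement format; the repo's .txt files may differ).
    """
    if num_tokens <= 0:
        raise ValueError("num_tokens must be positive")
    return [c / num_tokens for c in counts]


def experts_to_keep(counts: Sequence[int], keep: int) -> List[int]:
    """Indices of the `keep` most-used experts, returned in ascending order."""
    # Sort by descending count; break ties by lower index for determinism.
    ranked = sorted(range(len(counts)), key=lambda i: (-counts[i], i))
    return sorted(ranked[:keep])


def prune_layer(
    router_rows: Sequence[Sequence[float]],
    experts: Sequence[object],
    counts: Sequence[int],
    keep: int,
) -> Tuple[List[Sequence[float]], List[object], Dict[int, int]]:
    """Drop the least-used experts of one layer and reindex what remains.

    Assumes the router holds one row per expert, so removing an expert
    means removing its row and shifting later experts down.
    """
    kept = experts_to_keep(counts, keep)
    new_router = [router_rows[i] for i in kept]
    new_experts = [experts[i] for i in kept]
    # Map each surviving expert's old index to its new, compacted index.
    index_map = {old: new for new, old in enumerate(kept)}
    return new_router, new_experts, index_map


if __name__ == "__main__":
    # Toy layer with 4 experts, pruned to 2 (the entry prunes 64 of 128).
    router = [[0.1], [0.2], [0.3], [0.4]]
    names = ["e0", "e1", "e2", "e3"]
    print(prune_layer(router, names, counts=[10, 3, 7, 0], keep=2))
```

Because the router rows and expert list are filtered with the same index list, the router's output position for an expert always matches that expert's new position, which is the property "reordered router and indexing per layer" appears to refer to.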

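
Each `files` entry in the diff pairs a filename with a `sha256` value. As a rough illustration of how such a checksum can be checked after a manual download, here is a small sketch using only Python's standard `hashlib`. It is not LocalAI's own verification code, and the helper names `sha256_of` and `matches_gallery_checksum` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks.

    Chunked reading keeps memory use flat for multi-gigabyte GGUF files.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_gallery_checksum(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with the `sha256` value from a gallery entry."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the entry added in this commit, the expected value would be the `sha256` string listed next to `kalomaze_Qwen3-16B-A3B-Q4_K_M.gguf`, passed as `expected_hex` together with the local path of the downloaded file.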