Open
Description
I want to read tensors from a GGUF file and run downstream inference. For an IQ1_S-quantized tensor, the reader returns these fields:
ReaderTensor(name='blk.22.ffn_down_exps.weight', tensor_type=<GGMLQuantizationType.IQ1_S: 19>, shape=memmap([2048, 7168, 256], dtype=uint64), n_elements=3758096384, n_bytes=734003200, data_offset=48603148224, data=memmap([[[ 94, 24, 191, ..., 63, 82, 74],
[255, 21, 227, ..., 235, 230, 234],
[185, 23, 138, ..., 209, 33, 115],
...,
...,
[ 98, 24, 79, ..., 100, 143, 198],
[141, 23, 244, ..., 253, 90, 94],
[ 37, 24, 112, ..., 102, 106, 84]]],
shape=(256, 7168, 400), dtype=uint8), field=ReaderField(offset=5260585, name='blk.22.ffn_down_exps.weight', parts=[memmap([27], dtype=uint64), memmap([ 98, 108, 107, 46, 50, 50, 46, 102, 102, 110, 95, 100, 111,
119, 110, 95, 101, 120, 112, 115, 46, 119, 101, 105, 103, 104,
116], dtype=uint8), memmap([3], dtype=uint32), memmap([2048, 7168, 256], dtype=uint64), memmap([19], dtype=uint32), memmap([48597887488], dtype=uint64)], data=[1, 3, 4, 5], types=[]))
Is there an explanation of how each of these fields is used when dequantizing the tensor?
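For reference, here is a small sanity check of the numbers in the dump. It is only a sketch and relies on one assumption that does not appear in the output itself: that an IQ1_S block packs 256 weights into 50 bytes (a 2-byte fp16 scale, 32 bytes of `qs`, and 16 bytes of `qh`, as in ggml's `block_iq1_s`). Under that assumption the `shape`, `n_elements`, `n_bytes`, and `data` shape agree with each other, and the `parts` entries read as name length, name bytes, number of dimensions, dimensions, type id, and offset relative to the start of the tensor data section.

```python
import math

# Values copied from the ReaderTensor dump above.
shape = (2048, 7168, 256)        # GGUF order: fastest-varying dimension first
n_elements = 3758096384
n_bytes = 734003200
data_shape = (256, 7168, 400)    # NumPy order (reversed); last axis counts raw bytes
data_offset = 48603148224        # absolute file offset, read as a single number
relative_offset = 48597887488    # parts[5]: offset within the tensor data section
name_len = 27                    # parts[0]
name_bytes = [98, 108, 107, 46, 50, 50, 46, 102, 102, 110, 95, 100, 111,
              119, 110, 95, 101, 120, 112, 115, 46, 119, 101, 105, 103, 104,
              116]               # parts[1]

# Assumption (not in the dump): IQ1_S block layout of 256 weights per block,
# stored as fp16 scale (2 bytes) + qs (256/8 bytes) + qh (256/16 bytes).
BLOCK_ELEMS = 256
BLOCK_BYTES = 2 + BLOCK_ELEMS // 8 + BLOCK_ELEMS // 16

# The element count is the product of the logical dimensions.
assert math.prod(shape) == n_elements

# Each row of 2048 weights is split into blocks along the first GGUF dimension.
blocks_per_row = shape[0] // BLOCK_ELEMS
row_bytes = blocks_per_row * BLOCK_BYTES

# The raw byte array has the reversed shape, with the row length in bytes.
expected_data_shape = (shape[2], shape[1], row_bytes)
expected_n_bytes = n_elements // BLOCK_ELEMS * BLOCK_BYTES

# parts[1] holds the UTF-8 bytes of the tensor name.
name = bytes(name_bytes).decode("utf-8")

# The difference between the two offsets is where the tensor data section starts.
tensor_data_start = data_offset - relative_offset

print(BLOCK_BYTES, blocks_per_row, row_bytes)
print(expected_data_shape == data_shape, expected_n_bytes == n_bytes)
print(name, len(name) == name_len, tensor_data_start)
```

This only explains the bookkeeping fields. How the bytes inside each 50-byte block map to weights is not covered by the dump and would still need to be confirmed against the ggml source.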