Zheng Han
traphix
AI & ML interests: None yet
Recent Activity
new activity 2 days ago: RedHatAI/Qwen3.6-35B-A3B-NVFP4 · Regarding the correctness of the int4 quantization script
new activity 5 days ago: RedHatAI/Qwen3.6-35B-A3B-NVFP4 · Creation details?
new activity 5 days ago: apolo13x/Qwen3.5-27B-quantized.w4a16 · Any creation details?
Organizations: None yet
Regarding the correctness of the int4 quantization script
1 · #5 opened 2 days ago by traphix
Creation details?
1 · #3 opened 5 days ago by traphix
Any creation details?
#1 opened 5 days ago by traphix
oneshot vs model_free_ptq? which one has better recovery?
1 · #1 opened 19 days ago by traphix
W4A16 quant
👍 2 · 5 · #1 opened 2 months ago by timroethig
Any creation details?
#2 opened 26 days ago by traphix
Creation details?
1 · #8 opened 29 days ago by traphix
Creation details?
#2 opened about 1 month ago by traphix
Which framework was used for FP8 quantization? LLM-compressor?
2 · #1 opened about 1 month ago by traphix
GPTQ quantization
2 · #2 opened 2 months ago by ArtemSultanov
Which framework was used to quantize this model? llm-compressor? Or can you share the quantization Python script?
#1 opened about 1 month ago by traphix
Which framework was used to quantize this model? llm-compressor? Or can you share the quantization Python script?
1 · #2 opened about 1 month ago by traphix
Question about weight_observer?
2 · #1 opened about 1 month ago by traphix
INT4 w4a16 quantization?
➕ 1 · #1 opened about 2 months ago by traphix
Quantization code for int4 (w4a16)?
#6 opened about 2 months ago by traphix
Tokenizer you are loading with an incorrect regex pattern
1 · #2 opened 4 months ago by traphix
Failed to find a kernel that can implement the WNA16 linear layer
#1 opened 4 months ago by traphix
vllm error: Extra inputs are not permitted
#1 opened 4 months ago by traphix
Can A100 run Qwen3-235B-A22B-Instruct-2507-NVFP4?
#1 opened 5 months ago by traphix
Error on 4 x L40s
➕ 2 · 1 · #4 opened 7 months ago by traphix