Michael Goin
mgoin
AI & ML interests
LLM inference optimization, compression, quantization, pruning, distillation
Recent Activity
new activity 5 days ago: poolside/Laguna-XS.2-INT4 (Add base_model)
new activity 5 days ago: poolside/Laguna-XS.2-NVFP4 (Add base_model)
new activity 5 days ago: poolside/Laguna-XS.2-FP8 (Add base_model)
Add base_model
#1 opened 5 days ago by mgoin
Add base_model
#1 opened 5 days ago by mgoin
Add base_model
#1 opened 5 days ago by mgoin
Update MXFP4 format to compressed-tensors
1 comment · #3 opened 3 months ago by mgoin
Support for B200s?
👀 5 · 3 comments · #7 opened 8 months ago by shriramc
Quantization recipe?
2 comments · #3 opened 10 months ago by veden
Not working with vLLM 0.9.1
5 comments · #1 opened 10 months ago by zacksiri
Update config.json with the correct state
#1 opened 10 months ago by dsikka
Make model config compatible with Hugging Face MiniMax implementation
5 comments · #39 opened 11 months ago by geetu040
Missing Tokenizer/Processor for use with Transformers
👍 1 · 5 comments · #3 opened 11 months ago by mgoin
How should I input the image?
1 comment · #3 opened about 1 year ago by CyberWolf0
Cannot start with vllm serve
1 comment · #2 opened about 1 year ago by VenomEY
Fix processor_class to match upstream
#4 opened about 1 year ago by zifeitong
Remove image_processor_type
#1 opened about 1 year ago by pooya-davoodi-parasail
OSError: nm-testing/Llama-3_1-Nemotron-Ultra-253B-v1-FP8-dynamic does not appear to have a file named decilm.py
2 comments · #2 opened about 1 year ago by TheDrummer
How to deploy this model without an internet connection
1 comment · #1 opened about 1 year ago by superahn
Why not FP8 with static and per-tensor quantization?
👍 1 · 2 comments · #2 opened about 1 year ago by wanzhenchn
Address discrepancies in the languages supported by the Mistral Small 3.1 2503
🔥 1 · 3 comments · #54 opened about 1 year ago by fpaupier
Please update the chat template
1 comment · #1 opened about 1 year ago by stelterlab
FP8 Dynamic/W8A16 Quants Please
6 comments · #44 opened about 1 year ago by rjmehta