Active filters: quark
superbigtree/Mistral-Nemo-Instruct-2407-FP8_aq
12B • Updated • 23
aigdat/Llama-3.2-1B-Instruct-awq-uint4-float16
0.4B • Updated • 1
aigdat/Llama-3.2-3B-Instruct-awq-uint4-float16
0.8B • Updated • 1
aigdat/Phi-3.5-mini-instruct-awq-uint4-float16
0.6B • Updated • 2
aigdat/DeepSeek-R1-Distill-Qwen-1.5B_quantized_int4_bfloat16
0.4B • Updated • 1
aigdat/Qwen3-0.6B_quantized_int4_float16
0.2B • Updated • 4
aigdat/Arch-Function-Chat-3B_quantized_int4_float16
0.7B • Updated • 3
aigdat/DeepCoder-14B-Preview_quantized_int4_float16
3B • Updated • 7
aigdat/Qwen2.5-Coder-1.5B-Instruct_quantized_int4_bfloat16
0.4B • Updated • 1
aigdat/Qwen2.5-Coder-7B-Instruct_quantized_int4_bfloat16
1B • Updated • 2
aigdat/Qwen2.5-3B-Instruct_quantized_int4_bfloat16
0.7B • Updated • 3
aigdat/Qwen2.5-Coder-32B-Instruct_quantized_int4_bfloat16
5B • Updated • 1
aigdat/Llama-xLAM-2-8b-fc-r_quantized_int4_bfloat16
fxmarty/qwen_1.5-moe-a2.7b-mxfp4
8B • Updated • 6.18k
amd/Llama-3.3-70B-Instruct-MXFP4-Preview
38B • Updated • 4.37k • 2
fxmarty/deepseek_r1_3_layers_mxfp4
8B • Updated • 109 • 1
fxmarty/Llama-4-Scout-17B-16E-Instruct-2-layers-mxfp4
5B • Updated • 3.32k • 1
371B • Updated • 98.5k • 5
mohitsha/Llama-2-7b-hf-w_mx_fp4_per_group_sym
4B • Updated
amd/Llama-3.1-405B-Instruct-MXFP4-Preview
218B • Updated • 447 • 1
amd/DeepSeek-R1-MXFP4-ASQ
363B • Updated • 3k • 1
haoyang-amd/qwen1.5-0.5B-ptpc
0.5B • Updated • 1
amd/DeepSeek-R1-0528-MXFP4
356B • Updated • 19.2k • 1
fxmarty/Llama-3.1-70B-Instruct-2-layers-mxfp6
3B • Updated • 4.38k
fxmarty/qwen1.5_moe_a2.7b_chat_w_fp4_a_fp6_e2m3
8B • Updated • 5.19k
fxmarty/qwen1.5_moe_a2.7b_chat_w_fp6_e2m3_a_fp6_e2m3
11B • Updated • 2
fxmarty/qwen1.5_moe_a2.7b_chat_w_fp6_e3m2_a_fp6_e3m2
11B • Updated • 6.13k
amd/Llama-2-70b-chat-hf-WMXFP4-AMXFP4-KVFP8-Scale-UINT8-MLPerf-GPTQ
37B • Updated • 5
sudhab1988/rakuten-7b-awq-g128-int4-asym-fp16-hf
1B • Updated • 1
matmelis/Llama_3.2_1B_w_uint4_gptq
0.4B • Updated • 6