# Model Overview

## Description:
The NVIDIA MiniMax-M2.5-NVFP4 model is the quantized version of MiniMax's MiniMax-M2.5 model, an auto-regressive language model that uses an optimized transformer architecture. For more information, see the [MiniMax-M2.5 model card](https://huggingface.co/MiniMaxAI/MiniMax-M2.5). The NVIDIA MiniMax-M2.5-NVFP4 model is quantized with the [NVIDIA Model Optimizer](https://github.com/NVIDIA/Model-Optimizer).

This model is ready for commercial/non-commercial use. <br>

## Usage

To serve this checkpoint with [SGLang](https://github.com/sgl-project/sglang), you can start the docker image `lmsysorg/sglang:latest` and run the sample command below:

```sh
python3 -m sglang.launch_server --model nvidia/MiniMax-M2.5-NVFP4 \
```
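Once the server is up, it exposes an OpenAI-compatible HTTP API (by default on port 30000; adjust if you pass `--port` to `launch_server`). As a minimal sketch, you could query it with `curl`; the host, port, and prompt below are illustrative placeholders, not part of the original instructions:

```sh
# Query the server's OpenAI-compatible chat completions endpoint.
# Host, port, prompt, and max_tokens are illustrative; match them to your deployment.
curl http://localhost:30000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "nvidia/MiniMax-M2.5-NVFP4",
        "messages": [{"role": "user", "content": "Briefly explain NVFP4 quantization."}],
        "max_tokens": 128
      }'
```

The response follows the standard chat completions JSON shape, with the generated text under `choices[0].message.content`.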