zhiyucheng committed on
Commit 2c320c5 · verified · 1 Parent(s): 3e01c4b

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -17,7 +17,7 @@ tags:
  # Model Overview
 
  ## Description:
- The NVIDIA MiniMax-M2.5-NVFP4 model is the quantized version of MiniMax's MiniMax-M2.5 model, which is an auto-regressive language model that uses an optimized transformer architecture. For more information, please check [here](https://huggingface.co/MiniMaxAI/MiniMax-M2.5). The NVIDIA MiniMax-M2.5 NVFP4 model is quantized with [TensorRT Model Optimizer](https://github.com/NVIDIA/TensorRT-Model-Optimizer).
+ The NVIDIA MiniMax-M2.5-NVFP4 model is the quantized version of MiniMax's MiniMax-M2.5 model, which is an auto-regressive language model that uses an optimized transformer architecture. For more information, please check [here](https://huggingface.co/MiniMaxAI/MiniMax-M2.5). The NVIDIA MiniMax-M2.5 NVFP4 model is quantized with [Nvidia Model Optimizer](https://github.com/NVIDIA/Model-Optimizer).
 
  This model is ready for commercial/non-commercial use. <br>
 
@@ -104,7 +104,7 @@ This model was obtained by quantizing the weights and activations of MiniMax-M2.
 
  ## Usage
 
- To serve this checkpoint with [SGLang](https://github.com/sgl-project/sglang), you can start the docker `lmsysorg/sglang:nightly-dev-20260313-c21ddbc7` and run the sample command below:
+ To serve this checkpoint with [SGLang](https://github.com/sgl-project/sglang), you can start the docker `lmsysorg/sglang:latest` and run the sample command below:
 
  ```sh
  python3 -m sglang.launch_server --model nvidia/MiniMax-M2.5-NVFP4 \
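For context, the Usage change above swaps the pinned nightly image for the `latest` tag; a minimal sketch of serving the checkpoint with that image might look like the following. The README's own launch command is truncated in this diff, so the port, `--host`, and `docker run` flags below are assumptions for illustration, not part of the commit — consult the full README for the complete flag list.

```shell
# Pull the SGLang image referenced by the updated README (tag per the commit).
docker pull lmsysorg/sglang:latest

# Launch the server inside the container. GPU access, shared memory size,
# and the port mapping are assumed values, not taken from the README.
docker run --gpus all --shm-size 32g -p 30000:30000 \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model nvidia/MiniMax-M2.5-NVFP4 \
    --host 0.0.0.0 --port 30000
```

Pinning a dated nightly tag makes the setup reproducible but stale; `latest` tracks upstream fixes at the cost of occasional breakage, which is presumably the trade-off this commit accepts.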