# Model Overview

## Description:
The NVIDIA MiniMax-M2.5-NVFP4 model is the quantized version of MiniMax's MiniMax-M2.5 model, an auto-regressive language model that uses an optimized transformer architecture. For more information, see the [MiniMax-M2.5 model card](https://huggingface.co/MiniMaxAI/MiniMax-M2.5). The NVIDIA MiniMax-M2.5-NVFP4 model is quantized with the [NVIDIA Model Optimizer](https://github.com/NVIDIA/Model-Optimizer).

This model is ready for commercial/non-commercial use. <br>

## Usage

To serve this checkpoint with [SGLang](https://github.com/sgl-project/sglang), you can start the docker image `lmsysorg/sglang:latest` and run the sample command below:

```sh
python3 -m sglang.launch_server --model nvidia/MiniMax-M2.5-NVFP4 \
```
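Once the server is up, it exposes an OpenAI-compatible HTTP API (by default on port 30000; adjust if you pass `--port` to `launch_server`). As a minimal sketch, you could query it with `curl`; the host, port, and prompt below are illustrative placeholders, not part of the original instructions:

```sh
# Query the server's OpenAI-compatible chat completions endpoint.
# Host, port, prompt, and max_tokens are illustrative; match them to your deployment.
curl http://localhost:30000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "nvidia/MiniMax-M2.5-NVFP4",
        "messages": [{"role": "user", "content": "Briefly explain NVFP4 quantization."}],
        "max_tokens": 128
      }'
```

The response follows the standard chat completions JSON shape, with the generated text under `choices[0].message.content`.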