censorship
Thanks for the great model!!
This is the best-performing Gemma 2 RP finetune I've ever used. The only downside is maybe the censorship: it won't go any further once a scene heads toward that kind of content.
I tried Big Tiger, and it would drive toward unsafe topics just fine (some characters actively try to wound me, etc.). Maybe try increasing the weight for Tiger, or just straight up use Tiger as the base model? Its intelligence didn't really drop when I tried Tiger for general use.
I think it would be the greatest model available yet if that works out. Gemma 2 even outperforms 70B models in some cases in my experience, plus it's much smaller.
Glad to hear you like it so far!
I'll look to do a v2 sometime in the future, as I want to do a merge using the new della method, but there are errors with Gemma 2 at the moment. Have you tried the supplied system prompts? I've noticed it's much more likely to be de-censored with a good system prompt, unlike regular Gemma 2.
Yes, I've tried it. It's not that it refuses some topics; the model just can't actively play the dark side, as if it never knows what it should do in that circumstance. So it kind of falls off if the scenario involves war or something. Other than that, this is the best model I have used, even among 70B models.
Thanks for answering. I will support you.
It seems more coherent than the base Gemma-2, and intelligent as well, but unfortunately, whatever I tried, it's very censored and constantly refused, even with different prompts. It's a pity because it really seems so promising. If only it were uncensored (and a bit more adventurous), this would be a killer model and just the right size for a 24 GB machine.
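On the "right size for a 24 GB machine" point, a rough back-of-the-envelope sketch may help. It assumes about 27 billion parameters (read off the "27b" in the model name) and counts the weights only, ignoring the KV cache and runtime overhead. The 4.5 bits-per-weight figure is a hypothetical stand-in for a typical 4-bit-class quant, not something stated in this thread.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


# Assumption: parameter count taken from the "27b" in the model name.
N_PARAMS = 27e9

# Assumption: 4.5 bits/weight is an illustrative value for a 4-bit-class quant.
for label, bits in [("16-bit", 16.0), ("8-bit", 8.0), ("~4.5-bit quant", 4.5)]:
    print(f"{label}: {weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

Under these assumptions, the 16-bit and 8-bit weights alone exceed 24 GiB, while a 4-bit-class quant leaves some headroom for context.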