Instructions to use SteelStorage/G2-MS-Nyxora-27b with libraries, inference providers, notebooks, and local apps.

Libraries

How to use SteelStorage/G2-MS-Nyxora-27b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="SteelStorage/G2-MS-Nyxora-27b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("SteelStorage/G2-MS-Nyxora-27b")
model = AutoModelForCausalLM.from_pretrained("SteelStorage/G2-MS-Nyxora-27b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use SteelStorage/G2-MS-Nyxora-27b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "SteelStorage/G2-MS-Nyxora-27b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "SteelStorage/G2-MS-Nyxora-27b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
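
The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are illustrative and not part of vLLM. It assumes the server started above is listening on `http://localhost:8000`; passing `base_url="http://localhost:30000"` should work the same way for the SGLang server, since both expose the OpenAI-compatible `/v1/chat/completions` route.

```python
import json
import urllib.request

MODEL_ID = "SteelStorage/G2-MS-Nyxora-27b"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, base_url="http://localhost:8000", model=MODEL_ID, timeout=120):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(prompt, base_url, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-style responses put the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With a server running, `chat("What is the capital of France?")` returns the reply as a string. Building the request is kept separate from sending it so the payload can be inspected without a live server.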

SGLang

How to use SteelStorage/G2-MS-Nyxora-27b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "SteelStorage/G2-MS-Nyxora-27b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "SteelStorage/G2-MS-Nyxora-27b",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "SteelStorage/G2-MS-Nyxora-27b" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use SteelStorage/G2-MS-Nyxora-27b with Docker Model Runner:
```
docker model run hf.co/SteelStorage/G2-MS-Nyxora-27b
```

censorship

by hazkun - opened Jul 20, 2024

Discussion

hazkun

Jul 20, 2024 • edited Jul 21, 2024

thanks for the great model!!

this is the best-performing gemma 2 rp finetune i've ever used. the only downside may be censorship: it won't go any further once the story heads toward such content.

i tried big tiger, and it would drive toward unsafe topics just fine (some characters actively try to wound me, etc.). maybe try increasing the weight for tiger, or just use tiger as the base model outright? its intelligence didn't really drop when i tried tiger for general use.

i think it would be the greatest model available yet if that works out. in my experience gemma 2 even outperforms 70B models in some cases, plus it's relatively much smaller.

Steelskull

Steel Storage org Jul 21, 2024 • edited Jul 21, 2024

Glad to hear you like it so far!
i'll look into doing a v2 sometime in the future, since i want to try a merge using the new della method, but it currently throws errors with gemma 2. have you tried the supplied system prompts? i've noticed the model is much more likely to be de-censored with a good sys prompt, unlike regular gemma-2.
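
For anyone applying a system prompt through `apply_chat_template`: stock Gemma 2's chat template rejects a `system` role, and whether this merge's template behaves the same way is an assumption here. A minimal sketch of one workaround is to fold the system text into the first user turn before templating. The helper name `fold_system_prompt` is hypothetical, and the prompt text in the usage note is a placeholder, not one of the supplied system prompts.

```python
def fold_system_prompt(messages):
    """Merge all system messages into the first user message.

    Assumption: the chat template only accepts "user" and "assistant"
    roles, as stock Gemma 2's template does.
    """
    system_parts = []
    rest = []
    for message in messages:
        if message["role"] == "system":
            system_parts.append(message["content"])
        else:
            rest.append(dict(message))  # copy so the caller's list is untouched
    if not system_parts:
        return rest
    prefix = "\n\n".join(system_parts)
    if rest and rest[0]["role"] == "user":
        # Prepend the system text to the first user turn.
        rest[0]["content"] = prefix + "\n\n" + rest[0]["content"]
    else:
        # No leading user turn: send the system text as its own user turn.
        rest.insert(0, {"role": "user", "content": prefix})
    return rest
```

For example, `fold_system_prompt([{"role": "system", "content": "<your system prompt>"}, {"role": "user", "content": "Who are you?"}])` yields a single user message that starts with the system text, which can then be passed to the pipeline or to `tokenizer.apply_chat_template` as usual.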

hazkun

Jul 21, 2024

Yes, I've tried it. It's not that it refuses some topics; the model just can't actively try to be on the dark side, as if it never knows what it should do in that circumstance. So it kind of falls off if the scenario involves war or something. Other than that, this is the best model I have used, even among 70b models.

Thanks for answering. I will support you.

JbJaz

Jul 21, 2024

It seems more coherent than the base Gemma-2, and intelligent as well, but unfortunately, whatever I tried, it's very censored and constantly refused, even with different prompts. It's a pity because it really seems so promising; if only it were uncensored (and a bit more adventurous), this would be a killer model and just the right size for a 24 GB machine.
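
As a rough check on the 24 GB remark, here is a back-of-the-envelope sketch of how much memory the weights alone would take at different precisions. The parameter count is an assumption taken from the "27b" in the model name, and the bit widths are generic examples rather than formats mentioned in this thread. The estimate ignores the KV cache, activations, and runtime overhead, so real usage will be higher.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate size of the weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 27e9  # assumption: taken from the "27b" in the model name

for bits in (16, 8, 4):
    print(f"{bits}-bit: {weight_memory_gb(NUM_PARAMS, bits):.1f} GB")
# prints:
# 16-bit: 54.0 GB
# 8-bit: 27.0 GB
# 4-bit: 13.5 GB
```

Under these assumptions, only the roughly 4-bit case leaves headroom on a 24 GB card.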

hazkun changed discussion status to closed Jul 23, 2024
