align with transformers merging (#19)
- align with transformers merging (26ee82e31d75b2a6cb77fe50bef93e60e96c3b32)

README.md CHANGED
@@ -38,8 +38,9 @@ base_model:
 | Youtu-LLM-2B-GGUF | Instruct model of Youtu-LLM-2B, in GGUF format | 🤗 [Model](https://huggingface.co/tencent/Youtu-LLM-2B-GGUF)|
 
 ## 📰 News
-- [2026.01.
-- [2026.01.
+- [2026.01.28] You can now directly use Youtu-LLM with [Transformers](https://github.com/huggingface/transformers/pull/43166).
+- [2026.01.07] You can now fine-tune Youtu-LLM with [ModelScope](https://mp.weixin.qq.com/s/JJtQWSYEjnE7GnPkaJ7UNA).
+- [2026.01.04] You can now fine-tune Youtu-LLM with [LlamaFactory](https://github.com/hiyouga/LlamaFactory/pull/9707).
 
 <a id="benchmarks"></a>
 
@@ -89,8 +90,12 @@ base_model:
 ## 🚀 Quick Start
 This guide will help you quickly deploy and invoke the **Youtu-LLM-2B** model. This model supports "Reasoning Mode", enabling it to generate higher-quality responses through Chain of Thought (CoT).
 
-### 1. Environment Preparation
+<details>
+<summary>Transformers below 5.0.0.dev0</summary>
+
+If you wish to use Youtu-LLM-2B with an earlier version of transformers, please make sure to download the model repository from before this [commit](https://huggingface.co/tencent/Youtu-LLM-2B/commit/5690998a0a4cae7a7ec970d09262745e00bb6c5c).
 
+### 1. Environment Preparation
 Ensure your Python environment has the `transformers` library installed and that the version meets the requirements.
 
 ```bash
@@ -169,6 +174,93 @@ thought, final_answer = parse_reasoning(full_response)
 print(f"\n{'='*20} Thought Process {'='*20}\n{thought}")
 print(f"\n{'='*20} Final Answer {'='*20}\n{final_answer}")
 ```
+</details>
+
+<details>
+<summary>Transformers 5.0.0.dev0 or higher</summary>
+
+### 1. Environment Preparation
+Ensure your Python environment has the `transformers` library installed and that the version meets the requirements.
+
+```bash
+git clone https://github.com/huggingface/transformers.git
+cd transformers
+
+# pip
+pip install '.[torch]'
+
+# uv
+uv pip install '.[torch]'
+
+```
+
+### 2. Core Code Example
+
+The following example demonstrates how to load the model, enable Reasoning Mode, and use the `re` module to parse the "Thought Process" and the "Final Answer" from the output.
+
+```python
+import re
+from transformers import AutoTokenizer, AutoModelForCausalLM
+
+# 1. Configure Model
+model_id = "tencent/Youtu-LLM-2B"
+
+# 2. Initialize Tokenizer and Model
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(
+    model_id,
+    device_map="auto"
+)
+
+# 3. Construct Dialogue Input
+prompt = "Hello"
+messages = [{"role": "user", "content": prompt}]
+
+# Use apply_chat_template to construct input; set enable_thinking=True to activate Reasoning Mode
+input_text = tokenizer.apply_chat_template(
+    messages,
+    tokenize=False,
+    add_generation_prompt=True,
+    enable_thinking=True
+)
+
+model_inputs = tokenizer([input_text], return_tensors="pt").to(model.device)
+print("Input prepared. Starting generation...")
+
+# 4. Generate Response
+outputs = model.generate(
+    **model_inputs,
+    max_new_tokens=512,
+    do_sample=True,
+    temperature=1.0,
+    top_k=20,
+    top_p=0.95,
+    repetition_penalty=1.05
+)
+print("Generation complete!")
+
+# 5. Parse Results
+full_response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+def parse_reasoning(text):
+    """Extract the thought process within <think> tags and the subsequent answer content."""
+    thought_pattern = r"<think>(.*?)</think>"
+    match = re.search(thought_pattern, text, re.DOTALL)
+
+    if match:
+        thought = match.group(1).strip()
+        answer = text.split("</think>")[-1].strip()
+    else:
+        thought = "(No explicit thought process generated)"
+        answer = text
+    return thought, answer
+
+thought, final_answer = parse_reasoning(full_response)
+
+print(f"\n{'='*20} Thought Process {'='*20}\n{thought}")
+print(f"\n{'='*20} Final Answer {'='*20}\n{final_answer}")
+```
+</details>
 
 ### 3. Key Configuration Details
 
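The `parse_reasoning` helper added in this diff is pure string handling, so it can be sanity-checked without downloading the model. A minimal sketch, using a hard-coded sample string (a hypothetical stand-in for `tokenizer.decode(...)` output in Reasoning Mode):

```python
import re

def parse_reasoning(text):
    """Extract the thought process within <think> tags and the subsequent answer content."""
    thought_pattern = r"<think>(.*?)</think>"
    match = re.search(thought_pattern, text, re.DOTALL)

    if match:
        thought = match.group(1).strip()
        answer = text.split("</think>")[-1].strip()
    else:
        thought = "(No explicit thought process generated)"
        answer = text
    return thought, answer

# Hypothetical decoded output when enable_thinking=True
sample = "<think>The user greets me, so I should greet back.</think>Hello! How can I help you today?"
thought, answer = parse_reasoning(sample)
print(thought)  # The user greets me, so I should greet back.
print(answer)   # Hello! How can I help you today?

# Without <think> tags, the whole text is treated as the answer
thought2, answer2 = parse_reasoning("Hello!")
print(thought2)  # (No explicit thought process generated)
```

Note that `re.DOTALL` is what lets the thought span multiple lines, since CoT traces usually contain newlines.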