---
license: llama3.1
datasets:
- survivi/Llama-3-SynE-Dataset
- hfl/stem_zh_instruction
- llamafactory/alpaca_zh
- llamafactory/alpaca_gpt4_zh
- hfl/ruozhiba_gpt4
- codingsteven/Llama-3-8B-chat
language:
- zh
metrics:
- accuracy
base_model:
- meta-llama/Llama-3.1-8B
model-index:
- name: Control-LLM-Llama3.1-8B-SynE-Full-Parameter-Tuning
  results:
  - task:
      type: pretraining-evaluation
    dataset:
      type: mixed
      name: Pretraining Evaluation Dataset
    metrics:
    - name: exact_match,strict-match (meta_pretrain)
      type: exact_match
      value: 0.45445720757159036
      stderr: 0.0035036029889520047
      verified: false
    - name: exact_match,strict-match (meta_bbh_3shot_cot_pretrain)
      type: exact_match
      value: 0.6482875134387959
      stderr: 0.005918167158231359
      verified: false
    - name: acc,none (meta_mmlu_5shot_pretrain)
      type: accuracy
      value: 0.649480131035465
      stderr: 0.004026616190778244
      verified: false
    - name: exact_match,strict-match (meta_mmlu_pro_5shot_pretrain)
      type: exact_match
      value: 0.34956781914893614
      stderr: 0.004347262544061378
      verified: false
  - task:
      type: chinese-evaluation
    dataset:
      type: mixed
      name: Chinese Evaluation Dataset
    metrics:
    - name: acc,none (ceval-valid)
      type: accuracy
      value: 0.5898959881129272
      stderr: 0.012699457390113113
      verified: false
    - name: exact_match,strict-match (ceval-valid-pretrain-cot_zh)
      type: exact_match
      value: 0.40193164933135217
      stderr: 0.01265090064840271
      verified: false
    - name: acc,none (cmmlu)
      type: accuracy
      value: 0.6018822310481782
      stderr: 0.004420298073040671
      verified: false
    - name: exact_match,strict-match (cmmlu_pretrain_cot_zh)
      type: exact_match
      value: 0.4425833189431877
      stderr: 0.004506238417180843
      verified: false
pipeline_tag: text-generation
library_name: transformers
---

# Control-LLM-Llama3.1-8B-SynE-Full-Parameter-Tuning

This model is Llama-3.1-8B fine-tuned with full-parameter tuning on the SynE dataset for multilingual (Chinese) tasks.
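
For readers who want to try the model, here is a minimal sketch using the Transformers `pipeline` helper. It assumes the Hub repository id `ControlLLM/Llama-3.1-8B-SynE-FPT` and that `transformers` plus a backend such as PyTorch are installed; the helper name `generate` and the default of 64 new tokens are illustrative choices, not part of the model card.

```python
MODEL_ID = "ControlLLM/Llama-3.1-8B-SynE-FPT"  # Hub repository id (assumed)


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Return the prompt plus a generated continuation (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import pipeline

    # Builds a text-generation pipeline; weights are downloaded on first use.
    pipe = pipeline("text-generation", model=MODEL_ID)
    outputs = pipe(prompt, max_new_tokens=max_new_tokens)
    # The pipeline returns a list of dicts; "generated_text" includes the prompt.
    return outputs[0]["generated_text"]
```

Calling `generate("中国的首都是")` downloads the 8B-parameter weights on first use, so a GPU with enough memory is advisable.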

## Linked Paper

This model is associated with the paper [Control LLM: Controlled Evolution for Intelligence Retention in LLM](https://huggingface.co/papers/2501.10979).

## Linked Open Source Code - Training, Eval and Benchmark

The training, evaluation, and benchmark code is available on GitHub: [Control-LLM](https://github.com/linkedin/ControlLLM).

## Evaluation Results

Here is an overview of the evaluation results and findings.

### Benchmark Results Table

The table below summarizes evaluation results across Chinese tasks and original capabilities. All scores are percentages.

| **Model**            | **CEval** | **CEvalC** | **CMMLU** | **CMMLUC** | **C-Avg** | **BBH**  | **MLU**  | **MLUP** | **O-Avg** | **Overall** |
|----------------------|-----------|------------|-----------|------------|-----------|----------|----------|----------|-----------|-------------|
| Llama3.1-8B          | 48.3      | 12.8       | 51.1      | 14.1       | 13.9      | 65.2     | 65.4     | 35.5     | 45.9      | 29.9        |
| Llama-3-SynE         | 57.7      | 22.3       | 57.1      | 22.8       | 22.8      | 61.9     | 64.0     | 32.6     | 42.9      | 32.9        |
| **Full Param Tune**  | 59.0      | 40.2       | **60.2**  | 44.3       | 43.8      | 64.8     | 64.9     | 35.0     | 45.4      | 44.6        |
| Stack Expansion      | 56.0      | 32.7       | 55.2      | 33.4       | 33.3      | 62.3     | 65.6     | 35.3     | 44.8      | 39.1        |
| Concat-Lerp*         | 57.1      | 34.8       | 57.0      | 37.4       | 37.1      | 64.4     | 64.6     | 35.8     | 45.9      | 41.5        |
| **Hybrid Expansion** | **58.9**  | 44.7       | 57.9      | 44.3       | 44.4      | 65.1     | **65.7** | 36.9     | 46.8      | 45.6        |
| **Control LLM***     | 57.0      | **44.7**   | 56.0      | **44.9**   | **44.8**  | **68.2** | 65.6     | **37.9** | **48.5**  | **46.7**    |

---

### Explanation:

- **CEval**: Chinese Evaluation
- **CEvalC**: Chinese Evaluation (CoT, Chain of Thought)
- **CMMLU**: Chinese MMLU
- **CMMLUC**: Chinese MMLU (CoT)
- **C-Avg**: Chinese capability, size-weighted average across CEval, CEvalC, CMMLU, and CMMLUC
- **BBH**: BigBench Hard
- **MLU**: MMLU (Massive Multitask Language Understanding)
- **MLUP**: MMLU Pro
- **O-Avg**: Original capability, size-weighted average across BBH, MLU, and MLUP
- **Overall**: Combined average across all tasks
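
To make "size-weighted average" concrete, the sketch below weights each task's score by its number of examples. The scores are the Full Param Tune values from this card's metadata; the example counts (1346, 11582, 6511, 12032) are assumptions that this card does not state. Under those assumed counts, averaging only the two chain-of-thought Chinese tasks and only the two exact-match original tasks happens to reproduce the table's C-Avg (43.8) and O-Avg (45.4) for that row. That is an observation from the numbers, not a statement by the authors about which tasks enter each average.

```python
from typing import Sequence, Tuple


def size_weighted_average(tasks: Sequence[Tuple[float, int]]) -> float:
    """Average per-task scores, weighting each by its number of examples."""
    total_examples = sum(size for _, size in tasks)
    return sum(score * size for score, size in tasks) / total_examples


# Scores are taken from the card metadata; example counts are ASSUMED.
chinese_cot = [
    (0.40193164933135217, 1346),   # ceval-valid-pretrain-cot_zh (CEvalC)
    (0.4425833189431877, 11582),   # cmmlu_pretrain_cot_zh (CMMLUC)
]
original_exact_match = [
    (0.6482875134387959, 6511),    # meta_bbh_3shot_cot_pretrain (BBH)
    (0.34956781914893614, 12032),  # meta_mmlu_pro_5shot_pretrain (MLUP)
]

c_avg = size_weighted_average(chinese_cot)
o_avg = size_weighted_average(original_exact_match)
print(f"C-Avg ~ {100 * c_avg:.1f}, O-Avg ~ {100 * o_avg:.1f}")
```

Because the larger task dominates the weighting, each average sits closer to the score of the task with more examples than a plain mean would.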

### Full Parameter Tuning on Chinese-SynE

The following plot illustrates the catastrophic forgetting of full-parameter tuning in terms of hidden-state alignment drift.

[Figure: hidden-state alignment drift under full-parameter tuning]
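
As a rough illustration of what "hidden-state alignment drift" could mean, the sketch below compares per-layer hidden states of a base model and a tuned model on the same tokens and reports one minus the mean cosine similarity per layer. This is an assumed, simplified metric chosen for illustration; it is not necessarily the measure used for the plot or in the paper, and the function names and toy vectors are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def layer_drift(
    base_states: Sequence[Sequence[Vector]],
    tuned_states: Sequence[Sequence[Vector]],
) -> List[float]:
    """Per-layer drift: 1 - mean cosine similarity over token positions.

    Both inputs are indexed as [layer][token] -> hidden vector.
    """
    drifts = []
    for base_layer, tuned_layer in zip(base_states, tuned_states):
        sims = [cosine_similarity(b, t) for b, t in zip(base_layer, tuned_layer)]
        drifts.append(1.0 - sum(sims) / len(sims))
    return drifts


# Toy data: two layers, two tokens, two-dimensional hidden vectors.
base = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
tuned = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [0.0, 1.0]]]
drift = layer_drift(base, tuned)
print(drift)
```

In this toy case the first layer is unchanged (drift 0) and the second layer has one of two token vectors rotated to be orthogonal (drift 0.5); with real models the vectors would come from the hidden states of the base and tuned checkpoints on identical inputs.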