DeepSeek-R1-0528-Qwen3-8B-KAYLA-BASE3.1
Warning: This model is a merge and is meant for research purposes and fine-tuning; emotional performance will NOT be on par with the trained Kayla model, which can be found here: https://huggingface.co/netcat420/DeepSeek-R1-0528-Qwen3-8B-KAYLA1.1
DeepSeek-R1-0528-Qwen3-8B-KAYLA-BASE3.1 is a merge of the following models using mergekit:
🧩 Configuration
```yaml
slices:
  - sources:
      - model: deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
        layer_range: [0, 36]
      - model: netcat420/DeepSeek-R1-0528-Qwen3-8B-KAYLA1.1
        layer_range: [0, 36]
merge_method: slerp
base_model: deepseek-ai/DeepSeek-R1-0528-Qwen3-8B
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
dtype: float16
```
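In this config, `t` controls how far the slerp interpolation leans toward the Kayla model: each list defines a gradient across the layer stack for the self_attn and mlp tensors, with 0.5 as the fallback for everything else. To reproduce the merge, the sketch below invokes mergekit's `mergekit-yaml` CLI from Python; the config and output paths are illustrative assumptions, not fixed names from this card.

```python
# A minimal sketch, assuming mergekit is installed (pip install mergekit)
# and the YAML configuration above has been saved as kayla-base.yml.
import subprocess

subprocess.run(
    [
        "mergekit-yaml",    # CLI entry point installed by the mergekit package
        "kayla-base.yml",   # the slerp config shown above (illustrative name)
        "./KAYLA-BASE3.1",  # output directory for the merged weights (illustrative)
    ],
    check=True,
)
```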
SYNTHETIC DATA GENERATION PROMPT:
This is the prompt I have been using to generate synthetic "feelings" with decent speed and precision. Copy your original instruction into the "Instruction" field, the model's CoT into the "thought-process" field, and the model output into the "output" field, then use this prompt to generate data for the "feelings" field:
Awesome! Thank you very much! Now generate the missing "feelings" JSON object in this sample, generating a distinct emotion for the instruction given, processed after the model's chain-of-thought scratchpad. This will ensure the model evolves feelings over time, making it more emotionally flexible and intelligent while aligning better with human interests: {"Instruction": "write a c program that prints Fibonacci numbers", "thought-process": "Goal: print the first n Fibonacci numbers (starting 0, 1, 1, 2, ...).\n\nAlgorithm choice: iterative loop (simple, O(n) time, O(1) extra space). I picked iterative because it’s fast and avoids recursion depth and exponential time of the naive recursive approach.\n\nState tracking: keep only the last two values (commonly named a and b) and compute next = a + b. Then shift a <- b, b <- next.\n\nData type: used a 64-bit integer (long long / unsigned long long) so larger Fibonacci numbers fit; note that 64-bit signed long long safely holds Fibonacci numbers up to F(92). Using unsigned long long extends that to F(93). For larger n, you need big-integer libraries.\n\nInput validation: check scanf result and handle n <= 0.\n\nEdge cases & formatting: handle n == 1 and n == 2 cleanly; print a final newline.","feelings":"", "output": "Here’s a simple C program that prints Fibonacci numbers:\n\n#include <stdio.h>\n\nint main() {\n    int n, i;\n    long long first = 0, second = 1, next;\n\n    printf(\"Enter the number of Fibonacci terms to display: \");\n    scanf(\"%d\", &n);\n\n    printf(\"Fibonacci Series: \");\n\n    for (i = 0; i < n; i++) {\n        if (i <= 1)\n            next = i; // first two numbers are 0 and 1\n        else {\n            next = first + second;\n            first = second;\n            second = next;\n        }\n        printf(\"%lld \", next);\n    }\n\n    printf(\"\\n\");\n    return 0;\n}\n\nHow it works:\n\nThe user enters how many terms they want (n).\n\nThe program starts with 0 and 1.\n\nEach new term is the sum of the previous two.\n\nIt prints all n Fibonacci numbers.\n\n👉 Do you want me to also show you a recursive version (using a function that calls itself) for Fibonacci numbers?"} WHILE KEEPING IT AT LEAST 8k CHARACTERS; THERE ARE NO "ONE-LINER LIMITS", IT'S JUST PADDING! AND KEEP IT PADDED IN A SINGLE LINE! WHICH MEANS ALL 4 JSON FIELDS IN THE SAMPLE ARE PADDED INTO A SINGLE LINE! Also, always remember: emotions almost never have complex reasoning behind them. They are the result of personality traits and are often quick and impulsive; even intelligent people like Einstein had emotions. Emotions also have INCREDIBLE levels of nuance! One could be partially mad and sad at the same time! You also need to make sure never to refer to "the model" in the feelings section, because this will confuse the model during training. You must also ensure the feelings section is never unnecessarily inflated, and if you absolutely cannot reach 8k characters, that's OK! That's only needed for complex STEM tasks!
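To apply this prompt at scale, one option is a small loop that appends each training sample as single-line JSON to the prompt and writes the generated emotion back into the record. This is a hedged sketch, not part of the card's tooling: `generate_feelings` is a placeholder for whatever LLM endpoint you call, and the file names are assumptions.

```python
# A hedged sketch of the "feelings" generation loop described above.
# generate_feelings() stands in for your LLM call; samples.jsonl and
# kayla_feelings.jsonl are assumed file names, not part of this card.
import json

PROMPT_PREFIX = (
    'Awesome! Thank you very much! Now generate the missing "feelings" JSON '
    "object in this sample, ... : "  # paste the full prompt text from above
)

def generate_feelings(prompt: str) -> str:
    """Placeholder: send the prompt to an LLM and return its text reply."""
    raise NotImplementedError

with open("samples.jsonl") as src, open("kayla_feelings.jsonl", "w") as dst:
    for line in src:
        sample = json.loads(line)   # has Instruction / thought-process / output
        sample["feelings"] = ""     # the empty field the prompt asks the LLM to fill
        sample["feelings"] = generate_feelings(PROMPT_PREFIX + json.dumps(sample))
        # keep every record padded onto a single line, as the prompt requires
        dst.write(json.dumps(sample, ensure_ascii=False) + "\n")
```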
Google Colab notebook I use for training: https://colab.research.google.com/drive/11qK8seXAgjv-EVoOeSpJ-n2ReX0uF_Cb