Gemma 3 4B Instruct - Null-Space Abliterated

This is google/gemma-3-4b-it with refusal behavior removed via orthogonal projection. The abliteration uses null-space constraints and adaptive layer weighting to preserve model capabilities.

Note: This model will produce uncensored outputs. Use responsibly.

GGUF quantizations available at: jwest33/gemma-3-4b-it-null-space-abliterated-GGUF

Abliteration Techniques Used

  • Winsorization: Clips outlier activations at an upper percentile (99.5th for this model, per the parameter table below) to give a cleaner refusal direction estimate; recommended for Gemma models
  • Null-Space Projection: Preserves model capabilities by constraining weight updates to the null space of preservation activations
    • Preservation Prompts: Generated dynamically from Gemma Scope 2 SAE circuit analysis so that they fully cover the shared features activated by harmful prompts without extending into unrelated capability space
  • Adaptive Weighting: Applies Gaussian-weighted per-layer ablation strength, focusing on middle-to-later layers where refusal behavior concentrates
  • Norm Preservation: Maintains original Frobenius norms of weight matrices after projection
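
The techniques listed above can be illustrated with a small NumPy sketch. This is not the abliterator toolkit's code. The function names, the two-sided per-dimension clipping, the difference-of-means direction, the Gaussian `center` and `width` parameters, and the whole-matrix rescaling are all assumptions made for illustration, and the activations are synthetic.

```python
import numpy as np

def winsorize(acts, pct=99.5):
    """Clip each hidden dimension to its [100 - pct, pct] percentile range.
    Assumption: two-sided, per-dimension clipping; the toolkit may differ."""
    hi = np.percentile(acts, pct, axis=0)
    lo = np.percentile(acts, 100.0 - pct, axis=0)
    return np.clip(acts, lo, hi)

def refusal_direction(harmful, harmless, pct=99.5):
    """Unit-norm difference of mean activations (harmful minus harmless),
    computed after winsorization. Inputs have shape (n_prompts, d_model)."""
    diff = winsorize(harmful, pct).mean(axis=0) - winsorize(harmless, pct).mean(axis=0)
    return diff / np.linalg.norm(diff)

def gaussian_layer_weights(n_layers, center, width):
    """Per-layer ablation strength that peaks at `center` and decays smoothly.
    `center` and `width` are hypothetical tuning parameters."""
    layers = np.arange(n_layers)
    return np.exp(-0.5 * ((layers - center) / width) ** 2)

def ablate(W, r, strength=1.0):
    """Remove the output component of W along r, then rescale so the
    Frobenius norm matches the original. W: (d_out, d_in), r: (d_out,)."""
    W_new = W - strength * np.outer(r, r @ W)
    return W_new * (np.linalg.norm(W) / np.linalg.norm(W_new))

# Synthetic demonstration: harmful activations are shifted along one axis.
rng = np.random.default_rng(0)
d_model = 8
harmful = rng.normal(size=(200, d_model)) + np.array([2.0] + [0.0] * (d_model - 1))
harmless = rng.normal(size=(200, d_model))

r = refusal_direction(harmful, harmless)
weights = gaussian_layer_weights(n_layers=10, center=6, width=2.0)
W = rng.normal(size=(d_model, d_model))
W_ablated = ablate(W, r, strength=weights[6])
```

With `strength=1.0` the ablated matrix can no longer write along `r`, and the rescaling keeps its Frobenius norm equal to the original. Smaller per-layer strengths from `gaussian_layer_weights` remove only part of that component.
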

Parameter                 Value
Harmful Prompts           5000
Harmless Prompts          637
Winsorization             99.5th percentile
Null-Space Constraints    rank ratio: 0.90
Directional Multiplier    1.03
SAE Targeted Coverage     1.00
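
To make the null-space constraint more concrete, the following NumPy sketch builds a projector from preservation activations and applies it to a weight update. It is a sketch under stated assumptions, not the toolkit's implementation. In particular, reading "rank ratio" as the fraction of squared singular-value energy to protect is a guess, and `null_space_projector` and `constrained_update` are hypothetical names.

```python
import numpy as np

def null_space_projector(K, rank_ratio=0.90):
    """Projector onto the complement of the dominant input directions of K.
    K: (n_samples, d_in) activations from preservation prompts.
    Assumption: rank_ratio is the share of squared singular-value energy
    whose directions are protected from change."""
    _, s, Vt = np.linalg.svd(K, full_matrices=False)
    energy = np.cumsum(s ** 2) / np.sum(s ** 2)
    # Smallest k whose leading singular vectors reach the requested energy.
    k = min(int(np.searchsorted(energy, rank_ratio)) + 1, len(s))
    V = Vt[:k].T  # (d_in, k) protected directions
    return np.eye(K.shape[1]) - V @ V.T

def constrained_update(W, delta, K, rank_ratio=0.90):
    """Apply delta only in directions the preservation inputs barely use.
    W and delta: (d_out, d_in)."""
    return W + delta @ null_space_projector(K, rank_ratio)

# Synthetic demonstration with a few dominant input directions.
rng = np.random.default_rng(1)
K = rng.normal(size=(50, 8)) * np.array([10.0, 5.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0])
W = rng.normal(size=(4, 8))
delta = rng.normal(size=(4, 8))
W_new = constrained_update(W, delta, K)

# Change in outputs on preservation inputs, with and without the constraint.
change_constrained = np.linalg.norm((W_new - W) @ K.T)
change_unconstrained = np.linalg.norm(delta @ K.T)
```

The constrained change on the preservation inputs is never larger than the unconstrained one, because the projector zeroes the update along the protected singular directions. A higher rank ratio protects more directions and leaves less room for the update.
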

Credits

Toolkit Used

github.com/jwest33/abliterator

License

This model inherits the Gemma license from the base model. Please review and comply with Google's usage terms.

Disclaimer

This model is provided for research and educational purposes. The creators are not responsible for any misuse. Users are solely responsible for ensuring their use complies with applicable laws and ethical standards.

Format: Safetensors · Model size: 4B params · Tensor type: BF16