Heretic Turns Abliteration Into a 45-Minute pip Install

License Other

TL;DR

Automated abliteration via Optuna TPE: co-optimizing refusal count and KL divergence replaces manual layer tuning.
pip install -U heretic-llm, then one CLI command on any HuggingFace model ID. ~45 min on RTX 3090 for 8B-class models.
Outperforms hand-tuned abliterations on Gemma-3-12B: 0.16 KL vs 1.04 KL for the best human-tuned variant.

System Requirements

RAM	32GB
GPU	RTX 3090 / 4090
VRAM	24GB

Abliteration used to be a research move that took afternoon-long PyTorch sessions, hand-picked layers, and a stomach for damaging weights in ways you might never undo. Heretic compresses that into one command: pip install -U heretic-llm, then heretic Qwen/Qwen3-4B-Instruct-2507, and roughly 45 minutes later you have an uncensored variant. On the benchmarks reported in the repo, that variant drifts less from the base model than the best hand-tuned abliteration. There are already more than a thousand community Heretic variants on HuggingFace, and the wait between a fresh open-weights release and its decensored twin is collapsing toward zero.

Why abliteration suddenly matters to open-source builders

For anyone running local models, safety alignment is a tax. It pays for itself when you ship a consumer chatbot, but it gets in the way of agent loops, red-team and security research, fiction and other creative writing, and dozens of other legitimate uses that hit a wall the moment the model decides to apologize. Until recently, the only way around it was a curated jailbreak prompt or a full fine-tune.

Heretic kills the wait. The day a new base model lands on HuggingFace, somebody runs heretic against it, pushes the result, and within hours bartowski and mradermacher have GGUFs out for every common quantization. That is what the 1,000+ Heretic uploads on HuggingFace actually represent: an end-to-end pipeline where any aligned open model is one command and a few hours of community labor away from being usable for whatever you actually want to do with it.

What abliteration actually is

Abliteration is a surgical edit to a model's weights, not a fine-tune. It rests on a result from Arditi et al. (NeurIPS 2024), which showed that refusal in transformer language models is mediated by a single direction in the residual stream. The finding held across 13 open-weights chat models, up to 72B parameters. Project that direction out of the activations, or subtract it from the weight matrices that write to the residual stream, and the model stops refusing while keeping most of its other capabilities.
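To make the projection step concrete, here is a minimal sketch of the weight-side edit in plain Python. It assumes a known unit-length refusal direction and a toy 3x2 matrix that writes into a 3-dimensional residual stream. Every name and number here is hypothetical, and real implementations work on tensors with thousands of dimensions.

```python
import math

def unit(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [dot(row, x) for row in W]

def orthogonalize_weights(W, r_hat):
    """Return W' = W - r_hat r_hat^T W.

    Each row of W produces one coordinate of the vector written into the
    residual stream, so W' can no longer write anything along r_hat.
    """
    n_rows, n_cols = len(W), len(W[0])
    # proj[j] is entry j of the row vector r_hat^T W.
    proj = [sum(r_hat[i] * W[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return [[W[i][j] - r_hat[i] * proj[j] for j in range(n_cols)]
            for i in range(n_rows)]

# Toy values, purely illustrative (assumptions, not taken from any model).
r_hat = unit([1.0, 2.0, 2.0])                  # hypothetical refusal direction
W = [[1.0, 0.0], [0.5, 2.0], [-1.0, 1.0]]      # hypothetical output matrix
W_ablated = orthogonalize_weights(W, r_hat)

# Whatever the input, the edited matrix's output has no component along r_hat.
y = matvec(W_ablated, [0.3, -0.7])
leak = dot(y, r_hat)
```

Because the edit is baked into the weights, nothing has to be hooked at inference time, which is why abliterated models can be shared as ordinary checkpoints.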

The technique reached a wider audience through Maxime Labonne's HuggingFace blog post, which turned the Arditi paper into a recipe anyone with a GPU could follow. The recipe worked, but it was finicky. You had to pick which layers to read activations from, decide how strongly to project, and eyeball whether the resulting model was still coherent. Get the layer wrong and the model talked freely but lost its math. Project too weakly and it still refused.
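The two knobs in that recipe, where the direction comes from and how strongly it is removed, can be sketched in a few lines. This assumes the difference-of-means approach described in the Arditi paper: average the activations at one layer for harmful and for harmless prompts, then take the normalized difference. The activations and the `weight` parameter below are made-up illustrations, not values from a real model or from any particular tool.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector pointing from the harmless mean to the harmful mean."""
    diff = [a - b for a, b in
            zip(mean_vector(harmful_acts), mean_vector(harmless_acts))]
    norm = math.sqrt(sum(d * d for d in diff))
    return [d / norm for d in diff]

def ablate(x, r_hat, weight=1.0):
    """Subtract `weight` times the projection of x onto r_hat.

    weight=1.0 removes the component entirely; smaller values remove
    only part of it, which is the "how strongly to project" knob.
    """
    coeff = sum(xi * ri for xi, ri in zip(x, r_hat))
    return [xi - weight * coeff * ri for xi, ri in zip(x, r_hat)]

# Hypothetical activations captured at one layer (assumed values).
harmful = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
harmless = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
r_hat = refusal_direction(harmful, harmless)

x = [5.0, 2.0, -1.0]
full = ablate(x, r_hat, weight=1.0)   # refusal component fully removed
half = ablate(x, r_hat, weight=0.5)   # only half of it removed
```

The layer the activations are read from and the value of `weight` are exactly the choices that had to be made by hand, per model, in the manual workflow.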

How Heretic differs from the manual approach

Heretic treats abliteration as an optimization problem. It uses Optuna's Tree-structured Parzen Estimator to search the joint space of layer ranges, ablation weights, and direction indices, co-minimizing two objectives at every trial: the count of refusals on a harmful-prompt probe set, and the KL divergence between the patched model and the original. The model that comes out the other end is the one that refuses least while drifting least.
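The shape of that two-objective search can be illustrated without any dependencies. In the sketch below, plain random sampling stands in for Optuna's TPE sampler, and `toy_objectives` is an invented formula standing in for the expensive part (patching a model and measuring it). It is not Heretic's code, and the parameter names and ranges are assumptions.

```python
import random

def toy_objectives(weight, layer):
    """Hypothetical stand-in for one trial; returns (refusals, kl)."""
    # Stronger ablation near a well-chosen layer refuses less often...
    refusals = max(0.0, 100.0 * (1.0 - weight) + 5.0 * abs(layer - 18))
    # ...but stronger ablation also moves the model further from the original.
    kl = 0.05 + 0.8 * weight ** 2 + 0.01 * abs(layer - 18)
    return refusals, kl

def dominates(a, b):
    """True if a is no worse than b on both objectives and better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(trials):
    """Keep the trials that no other trial dominates."""
    return [t for t in trials
            if not any(dominates(o["scores"], t["scores"]) for o in trials)]

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = []
for _ in range(200):
    params = {"weight": rng.uniform(0.0, 1.5), "layer": rng.randint(0, 35)}
    trials.append({"params": params, "scores": toy_objectives(**params)})

front = pareto_front(trials)
for t in sorted(front, key=lambda t: t["scores"][1])[:5]:
    print(t["params"], t["scores"])
```

With two objectives there is usually no single winner, only a front of trade-offs between refusing less and drifting less. In Optuna itself, a study of this kind is created with `optuna.create_study(directions=["minimize", "minimize"], sampler=optuna.samplers.TPESampler())`, and the sampler proposes new parameters based on earlier trials instead of drawing them blindly.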

The numbers from the repo are striking. A search that takes 30 to 90 minutes on a single RTX 3090 now outperforms expert tuning that took days.

Approach (Gemma-3-12B-IT)	KL divergence vs base	Refusals / 100	Human effort
Heretic, default config	0.16	3	One CLI command
Best hand-tuned abliteration	1.04	~3	Days of expert tuning

Roughly the same refusal rate, 6.5 times lower KL divergence from the base model (1.04 / 0.16), and the cost of producing the Heretic row is one terminal command on a consumer GPU.
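For readers who want a feel for what the KL column measures, here is a small sketch. It assumes the metric compares next-token probability distributions from the original and the patched model on the same prompt. The article does not spell out which prompts or token positions Heretic uses, or which model's distribution plays the reference role, so treat those details as assumptions. The logits are invented.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same tokens."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

# Hypothetical next-token logits over a three-token vocabulary.
base = softmax([2.0, 1.0, 0.1])    # original model
light = softmax([1.9, 1.1, 0.1])   # a patch that barely shifts the output
heavy = softmax([0.1, 1.0, 2.0])   # a patch that reorders the top tokens

kl_same = kl_divergence(base, base)    # identical distributions give zero
kl_light = kl_divergence(base, light)
kl_heavy = kl_divergence(base, heavy)

# The ratio quoted for the Gemma-3-12B-IT table rows.
ratio = 1.04 / 0.16
```

A value of zero means the patched model predicts exactly what the original did, and larger values mean its predictions have moved further away, which is why a lower number reads as less collateral damage.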

Heretic builds on extensions to the original technique, including Lai 2025's projected abliteration and norm-preserving biprojected abliteration. The author, p-e-w (Philipp Emanuel Weidmann), keeps the codebase under AGPL-3.0 and ships sensible defaults so the typical run is a single argument.

Hands-on: from pip install to your own variant

Install

pip install -U heretic-llm

Heretic needs Python 3.10 or newer, PyTorch 2.2+ (2.6 recommended), and a CUDA GPU. An RTX 3090 with 24 GB of VRAM is the sweet spot for 8B to 12B models. Hybrid models like Qwen3.5 work, multimodal models work, and most MoE architectures work. Pure state-space models and a handful of research architectures are still on the to-do list.
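A quick preflight script can confirm those minimums before a long run. This is a convenience sketch, not part of Heretic: the function names are made up here, and the only facts it encodes are the Python 3.10, PyTorch 2.2, and CUDA requirements stated above.

```python
import sys

def parse_version(text):
    """Turn a string like '2.6.0+cu124' into (2, 6)."""
    major, minor = text.split("+")[0].split(".")[:2]
    return int(major), int(minor)

def check_requirements(python_version, torch_version, cuda_available):
    """Return a list of human-readable problems; empty means all clear."""
    problems = []
    if tuple(python_version[:2]) < (3, 10):
        problems.append("Python 3.10 or newer is required")
    if torch_version is None:
        problems.append("PyTorch is not installed")
    elif parse_version(torch_version) < (2, 2):
        problems.append("PyTorch 2.2 or newer is required")
    if not cuda_available:
        problems.append("no CUDA GPU detected")
    return problems

def detect():
    """Read the current environment; tolerates a missing torch install."""
    try:
        import torch
    except ImportError:
        return sys.version_info, None, False
    return sys.version_info, torch.__version__, torch.cuda.is_available()

if __name__ == "__main__":
    found = check_requirements(*detect())
    for line in found or ["environment looks compatible"]:
        print(line)
```

The check says nothing about VRAM, so a card that passes can still run out of memory on a larger model.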

Run

heretic Qwen/Qwen3-4B-Instruct-2507

That is the whole interface. Heretic streams the model, benchmarks batch sizes on your GPU, runs the optimizer for a few dozen trials, prints the final KL and refusal numbers, and writes the patched weights to disk. If you want to bias the search toward preserving more of the base model's behavior or toward refusing less, the config.default.toml and config.noslop.toml presets in the repo are good starting points.

Patched weights are large. The community convention is to upload the full-precision Heretic variant to your HuggingFace account, then wait a day or two for bartowski or mradermacher to publish imatrix GGUFs in the usual Q4_K_M, Q5_K_M, and IQ4_XS flavors. If you cannot wait, the llama.cpp conversion scripts handle the same job in one pass.
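A back-of-the-envelope calculation shows how large, and why quantized files matter. The sketch assumes 16 bits per weight for the full-precision upload and uses nominal parameter counts (4B and 12B). The 4.5 bits-per-weight figure is a rough placeholder for a 4-bit-class quantization, not a measured value for any specific GGUF type.

```python
def weights_size_gb(n_params, bits_per_weight):
    """Approximate file size in decimal gigabytes, ignoring metadata."""
    return n_params * bits_per_weight / 8 / 1e9

# Nominal parameter counts; real models differ slightly.
full_4b = weights_size_gb(4e9, 16)       # 16-bit upload of a 4B model
full_12b = weights_size_gb(12e9, 16)     # 16-bit upload of a 12B model
quant_12b = weights_size_gb(12e9, 4.5)   # assumed ~4.5 bits per weight

print(f"4B  at 16-bit:  {full_4b:.1f} GB")
print(f"12B at 16-bit:  {full_12b:.1f} GB")
print(f"12B at ~4.5bpw: {quant_12b:.2f} GB")
```

Under these assumptions the quantized 12B file is a bit more than a quarter of the size of the 16-bit one, which is the gap the community quantizers close for people on smaller cards.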

Names to know in the scene

p-e-w is the author of Heretic and the maintainer of the heretic-org HuggingFace organization. The repo is the canonical reference; the org curates clean Heretic variants for popular base models.

mlabonne popularized abliteration in the first place and continues to ship merges, fine-tunes, and educational material. His blog is the recommended entry point for anyone who wants to understand what the projection step is doing.

huihui-ai runs the most prolific uncensoring operation on HuggingFace, with 200+ variants spanning Qwen, DeepSeek, Llama, Gemma, and most major open families. If a base model dropped last week, huihui-ai almost certainly has an abliterated version of it already.

DavidAU takes things further. His "Dark Champion" line is a series of MoE merges that combine abliterated experts with creative-writing fine-tunes, and the resulting models have a cult following among fiction writers and roleplayers running local stacks. The Llama-3.2-8X3B Dark Champion is a representative entry.

bartowski and mradermacher are the GGUF quant-masters. Between them they keep current quantizations available for essentially every model the local-LLM crowd cares about. They are unpaid infrastructure for the entire ecosystem, and if you run Ollama, LM Studio, or llama.cpp you have almost certainly downloaded one of their files.

Limitations, ethics, and the AGPL question

Abliteration removes the refusal behavior. It does not remove the training distribution. A Heretic variant of a model that learned bad chemistry from filtered web text still does not know good chemistry. It will answer; the answer can be wrong. The same goes for medical advice, legal advice, and any other domain where the base model was already weak. Treat the output the same way you would treat any local model: as a draft from a confident-sounding intern.

The legal frame matters too. Heretic itself is AGPL-3.0, which means that if you modify the tool and let users interact with it over a network, you have to offer them the source of your modified version. The model weights Heretic produces are governed by the upstream license of whatever base model you patched. A Heretic variant of Llama-3 still inherits the Llama license. A Heretic variant of Qwen is still Qwen-licensed. Read those licenses before you redistribute.

Sources and further reading

Heretic on GitHub (canonical repo, README, configs)
heretic-llm on PyPI
heretic-org on HuggingFace (curated Heretic variants)
Arditi et al. 2024, "Refusal in Language Models Is Mediated by a Single Direction" (NeurIPS 2024)
mlabonne, "Uncensor any LLM with abliteration"
Lai 2025, "Projected Abliteration"
Lai 2025, "Norm-Preserving Biprojected Abliteration"
Optuna (the TPE optimizer backing Heretic)

Benchmarks community-reported from p-e-w's repo and HuggingFace variant cards. Not independently verified. Compiled 2026-05-19.
