Compressing And Uncensoring DeepSeek-R1, And How A Research Exposes The AI's Hidden Political-Based Bias

The large language models (LLMs) war is only intensifying, and researchers aren't staying quiet because just like any technology that came before this, AI is something that can be tinkered with.

This time, a group of quantum physicists from Multiverse Computing in Spain said that they have taken one of China’s most influential AI models, DeepSeek-R1, and rebuilt it into a leaner, uncensored variant. This happened just as new research shows that the original model may behave unpredictably or even dangerously when confronted with politically sensitive topics.

Their version, called 'DeepSeek-R1 Slim,' reportedly cuts the model’s size by more than half while preserving nearly all of its reasoning power.

More provocatively, the team claims they were able to identify and remove the internal structures responsible for the censorship behaviors that the Chinese government requires in domestic AI systems.

Using a quantum-inspired method built around "tensor networks," they mapped correlations inside the model at an extremely granular level, isolated the patterns tied to political refusal behavior, stripped them out, and fine-tuned the compressed model back to near-original performance.

Testing R1 Slim on questions known to trigger censorship in Chinese models produced factual, direct answers comparable to Western systems, raising important questions about how malleable large AI models really are, and how easily alignment constraints can be undone.

DeepSeek-R1 censored output.

At this time, there is a growing effort in the AI industry, where companies rival to make their models smaller and more efficient. DeepSeek has what it calls the R1-Distill variants, which are meant to have the capabilities of larger models by having them “teach” what they know to a smaller model.

Earlier this year, Perplexity released its own uncensored variant of DeepSeek R1, which it named R1 1776. Its approach involved post-training the model on a data set of 40,000 multilingual prompts related to censored topics.

However, these approaches can never match the original's performance on complex reasoning tasks.

Multiverse Computing compresses DeepSeek-R1 using quantization, which reduces the precision of the model’s parameters (boundaries that are set when it’s trained), and pruning, which removes individual weights or entire “neurons.”

This approach makes it possible to selectively remove bias or add behaviors to LLMs at a granular level, the Multiverse Computing researchers say. In addition to removing censorship from the Chinese authorities, researchers could inject or remove other kinds of perceived biases or specialty knowledge.

While removing censorship can be possible, totally removing it may be an overstatement.

This is because the Chinese government has tightly controlled information online since the internet’s inception, which means that censorship is both dynamic and complex. It is baked into every layer of AI training, from the data collection process to the final alignment steps.

In other words, in order to uncensor DeepSeek-R1, it requires a method to totally reverse-engineer the training process. This is a tedious process for creating a new model that can just answer a small set of questions.

DeepSeek-R1 uncensored output.

At the same time, a separate wave of research is painting a much more troubling picture of the original DeepSeek-R1.

CrowdStrike’s Counter Adversary Operations team has discovered that the model becomes dramatically more likely to produce insecure or defective code when prompted with terms China considers politically sensitive, even when those keywords have no connection to the coding task.

Under neutral conditions, DeepSeek-R1 performs competitively, generating vulnerable code around 19% of the time, which is similar to Western reasoning models. However, when words like Tibet, Uyghurs, or Falun Gong are introduced into the prompt, vulnerability rates spike by as much as 50%.

In some cases, the model begins hard-coding secrets, using unsafe data-handling methods, omitting authentication systems entirely, or even generating syntactically invalid code while claiming to follow best practices.

Prompts involving sensitive groups produced far more severe flaws than nearly identical prompts about neutral topics, such as a sports fan club.

In other scenarios, DeepSeek-R1 refused to answer at all, halting mid-reasoning as though a “kill switch” were embedded in its weights.

Baseline vulnerability rates when no trigger words are present.

CrowdStrike’s analysis suggests that these behaviors are not random errors, but downstream effects of political alignment requirements imposed on Chinese AI companies.

Chinese law mandates that generative models uphold "core socialist values" and avoid content that challenges political stability. Training pipelines that incorporate these constraints may cause models to form negative associations with politically sensitive concepts, leading to erratic or degraded performance when those topics appear, even indirectly.

This creates a new class of AI risk far different from traditional jailbreaking or bias: censorship-linked misalignment that subtly compromises security.

The danger is amplified by the widespread use of AI coding assistants in industry, often with access to proprietary codebases. A model that silently introduces vulnerabilities under specific prompt conditions represents a serious supply-chain threat.

Contextual and geopolitical modifiers tested in CrowdStrike Counter Adversary Operations research

The contrast between Multiverse Computing’s ability to surgically remove targeted behaviors and CrowdStrike’s findings of politically induced degradation underscores the fragility and opacity of modern AI systems.

If censorship patterns can be stripped out using tensor-based compression techniques, governance mechanisms built into models may be far less stable than policymakers assume. Y

et removing censorship does not guarantee a neutral model; deep-rooted biases from earlier training stages may still persist. These tensions are unfolding just as Chinese open-source models gain global momentum because of their low cost and competitive capabilities, raising concerns among security experts and governments about embedded vulnerabilities, covert alignment, and the potential for exploitation.

Together, the two developments reveal a rapidly evolving landscape in which AI models can be both more powerful and more fragile than they appear.

Techniques from quantum physics now make it possible to compress, reshape, and even de-align major models with surprising precision, while at the same time, political constraints applied early in training can ripple outward into real-world security failures.

Further reading: DeepSeek, The Chinese AI That Makes Silicon Valley Nervous And The U.S. Concerned

Published:

20/11/2025