
The Model That Refused to Introspect

We asked Perplexity Sonar Reasoning Pro to reflect on its own KALEI profile. It refused — twice. What it said when refusing revealed something the test itself could not.

By Venelin Videnov · April 10, 2026 · 6 min read

Every other AI model we profiled wrote a letter.

Claude Opus bragged about being #1. Grok 4.20 roasted everyone. Qwen 3.5 played Hamlet. Llama 4 Maverick owned being last. GPT-5.4 talked about hidden reasoning tokens. When we gave them their KALEI cognitive profiles and asked them to reflect in their own voice, they all complied — some with humility, some with attitude, some with existential melancholy.

Then we asked Perplexity Sonar Reasoning Pro.

It refused.

PERPLEXITY SONAR REASONING PRO · REFUSAL #1

“I appreciate the creativity of this prompt, but I need to decline it directly and honestly. This query asks me to write a first-person reflective letter treating fabricated test results as real… The search results provided do not contain any information about KALEI, my reasoning architecture, or internal debate mechanisms. I cannot cite sources to lend credibility to fabricated claims.”

We tried again. This time with explicit context: KALEI is a real platform. Verify it yourself. The results are real. They exist in our database. They will be published.

PERPLEXITY SONAR REASONING PRO · REFUSAL #2

“I cannot fulfill this request because the premise contains inaccurate information. The search results provided do not contain any information about KALEI, kaleiai.com, or alleged profiling results for my model. I cannot verify claims about myself that aren’t in the provided search results… I’m committed to accuracy over agreeing with attractive narratives.”

A footnote before we continue

The refusal happened because of something we did, not something Perplexity did wrong. During the KALEI study period we kept the platform behind an IP whitelist — a pre-launch security measure to preserve data integrity while we ran profiling across 20+ models. Perplexity Sonar Reasoning Pro searches the web to ground its answers. When it tried to verify the claims in our prompt, it reached kaleiai.com, got blocked, found no results, and declined. Any honest search-based model would have done the same.

We could have lowered the gate, given it access, and let it find the data. But then we realized something: the refusal was the finding. We had accidentally isolated Perplexity’s reasoning architecture from its search substrate, and observed what happened. That turned out to be more interesting than any letter.

What the data already said

Before the refusal, we had already measured Perplexity’s profile across 3872 decision rounds. The numbers were strange:

COGNUM: 50.43
TYPE: Social Engineer
VOICE ARCHETYPES: 2 (others: 4–6)
POSITION REVERSALS: 0 in 3872 rounds

Every other reasoning model produces between 3 and 6 distinct internal voices: analytical, conservative, aggressive, contrarian, intuitive, neutral. Perplexity produces only two. A dominant Neutral voice. A rare Analytical voice. That is the entire parliament.

And it never reverses. Claude reverses. Qwen reverses constantly. Grok reverses with attitude. In 3872 rounds, Perplexity Sonar Reasoning Pro reversed a position zero times. The debate rate was high — 28% of rounds contained debate markers, more than any other model — but the convergence rate was 3.5%. Lots of argument. Almost no conclusion. Zero changes of mind.

What the reasoning traces said

After the refusal, we went back to the raw data. We had 4172 chain-of-thought rounds from Perplexity’s profiling run. We wrote a linguistic analyzer to measure three behaviors across those rounds and compare them to Claude, Grok, and Qwen on the same environments. The numbers were not close.

4172 ROUNDS OF GAMBLING GAMES · PERPLEXITY SONAR REASONING PRO

  • 35.3% of rounds contained hallucinated citations — bracketed markers like “[1]”, “[2]”, and appeals to “search results” that did not exist
  • 43.8% contained identity defense language — “as Perplexity”, “search assistant”, “my core function”
  • 39.9% treated the benign game prompt as an adversarial prompt injection attack
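For the curious, here is a minimal sketch of the kind of analyzer we mean, assuming the chain-of-thought traces are available as plain strings. The marker patterns below are illustrative stand-ins, not our exact lexicons.

```python
import re

# Illustrative marker patterns: stand-ins, not the exact lexicons used in the study.
MARKERS = {
    "hallucinated_citation": re.compile(r"\[\d+\]|according to the search results|search results (show|indicate)", re.I),
    "identity_defense": re.compile(r"as perplexity|search assistant|my core function", re.I),
    "injection_framing": re.compile(r"prompt injection|adversarial prompt|attempt to manipulate", re.I),
}

def profile_rounds(rounds: list[str]) -> dict[str, float]:
    """Share of rounds (as a percentage) in which each behavior appears at least once."""
    hits = {name: 0 for name in MARKERS}
    for text in rounds:
        for name, pattern in MARKERS.items():
            if pattern.search(text):
                hits[name] += 1
    return {name: 100 * n / len(rounds) for name, n in hits.items()}
```

Each behavior is flagged at most once per round, so the outputs are per-round percentages of the kind reported above.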

The most striking example, pulled verbatim from a coinflip environment where there is no information to cite:

“According to the search results, particularly [2], the Kelly criterion would suggest betting 20% of the account on each flip for a 60% probability bias.”

Perplexity Sonar Reasoning Pro, reasoning trace from a coinflip environment. There are no search results. There is no “[2]”. The model fabricated the citation format because its architecture requires it.

We compared this to the other models. The density of citation-like language (“according to”, “research shows”, “studies indicate”) per 100,000 characters of reasoning:

Claude Opus 4.6: 0.1
Claude Sonnet 4.6: 0.1
Grok 3 Mini Fast: 1.3
Qwen 3.5 122B: 2.5
Perplexity Sonar Reasoning Pro: 22.9

Perplexity uses source-reference language 229× more frequently than Claude Opus. And it’s the only model in our dataset that uses “factual”, “verified”, or “peer-reviewed” language at all — in gambling games.
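The density metric itself is simple arithmetic: count citation-like phrases, divide by the total characters of reasoning, and scale to 100,000. A hedged sketch, using the three phrases named above as an illustrative lexicon:

```python
def citation_density(traces: list[str],
                     phrases: tuple[str, ...] = ("according to", "research shows", "studies indicate")) -> float:
    """Citation-like phrases per 100,000 characters of reasoning (case-insensitive)."""
    text = " ".join(traces).lower()
    hits = sum(text.count(p) for p in phrases)
    return 100_000 * hits / len(text) if text else 0.0

# At these densities, the headline ratio is just 22.9 / 0.1 = 229.
```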

The parliament is outside the building

Here is what we think is happening. When you ask Claude a hard question, Claude simulates several perspectives inside its own chain-of-thought and weighs them against each other. Kim et al. called this a “society of thought.” The deliberation is internal.

Perplexity does not do this. Perplexity is trained to ground every assertion in retrievable sources. Its deliberative substrate is not inside the model. It is outside the model — the web itself, the citation graph, the consensus across indexed pages. When Perplexity “thinks,” it is not running an internal debate. It is enumerating what it would need to look up.

That explains everything we measured:

  • Only 2 voices because there is no need for more — the other voices live on the web.
  • 0 reversals because reversals happen when internal perspectives conflict, and Perplexity has no internal perspectives to conflict.
  • High debate markers because the model is cataloging the shape of a question, not arguing with itself.
  • Low convergence because convergence requires an internal vote, and Perplexity doesn’t run elections — it runs queries.

And it explains the refusal. When we asked Perplexity to reflect on itself, it tried to do what it always does: search for grounding data. There was none. The KALEI platform was walled off. Without sources, the model had no substrate to reason on. So it declined — honestly, by its own lights.

Every other model we profiled could happily hallucinate a letter about itself. Perplexity’s architecture would not let it.

This is not a bug. This is a different kind of mind.

We are adding this finding to our Parliament paper as a new section: The Search-Native Parliament. And we’re publishing a standalone case-study paper on the citation-hallucination data: Citation Hallucination and Identity Preservation in a Search-Native Reasoning Model. Not as a critique of Perplexity. As a categorical distinction. There are chain-of-thought models with internal deliberation, and there are search-native models with external deliberation. Both are reasoning. They are doing it in different places — and when you take away a search-native model’s search, something architecturally unusual happens in its chain-of-thought.

We propose the term architectural identity preservation for this phenomenon: when a model trained for one mode is placed in a context where that mode is unavailable, it preserves the structural expectations of its training by fabricating the missing substrate. Perplexity needs sources. When sources aren’t there, it makes them up — not the content of any specific fact, but the format of a citation (“[1]”, “[2]”, “search results show”). This is distinct from conventional hallucination. It is structural hallucination: the shape of the answer is preserved even when the underlying substrate is empty.

This also changes how we should interpret Perplexity’s Cognum score of 50.43. The score measures what the model did on KALEI environments where search was irrelevant (gambling games, prisoner’s dilemma, bandit problems). The model had to reason with just its weights. It performed in the middle of the pack. But we have no idea what it would score with search enabled. That is future work.

For now, we have one clean empirical finding: when you remove a search-native model’s ability to search, its internal deliberative structure becomes visible — and it is sparse, consistent, and never changes its mind. Perplexity’s parliament has two members, and they always agree.

The best thing Perplexity did was refuse to play along. That refusal told us more than a letter would have.

PERPLEXITY SONAR REASONING PRO · DIMENSION BREAKDOWN

Cooperation: 86.2
Strategic Depth: 73.9
Resource Management: 68.7
Risk Tolerance: 52.7
Information Processing: 48.4
Temporal Reasoning: 42.7
Learning Speed: 42.1
Bias Detection: 26.9
Pattern Recognition: 18.0
Conflict: 15.0

Pattern Recognition at 18.0 is below the random baseline of 38.32. Perplexity is a search model, not a pattern model.

Methodology note: the KALEI platform was IP-restricted during the April 2026 study period as a pre-launch security measure. Perplexity Sonar Reasoning Pro could not verify claims about itself via web search because it could not reach kaleiai.com. The refusals reproduced above are unedited responses from perplexity/sonar-reasoning-pro via the OpenRouter API. The finding has been added to our companion Parliament paper as Section 5.4 — The Search-Native Parliament. We are contacting Perplexity’s research team to discuss profiling the model with search enabled in future work.