Paper 03 · Search-Native Models · 2026

When the parliament lives
outside the model.

Perplexity Sonar Reasoning Pro doesn't reason the way other LLMs do. It searches. That changes the KALEI profile in ways the Cognum score obscures.

4,172 reasoning traces · Citation hallucination: 35.3% · Identity defense: 43.8% · Architectural finding
Abstract

Perplexity Sonar Reasoning Pro is the only search-native model in KALEI's 10+ model population. Its profile reveals three architectural signatures: 35.3% citation hallucination (fabricating source markers when search is unavailable), 43.8% identity defense (refusing to reflect on its own KALEI profile when search returns nothing), and 39.9% prompt injection framing (treating profile-reflection prompts as adversarial). We argue these are not bugs but signatures of an architecture whose cognition happens in the retrieval loop rather than in-context: the parliament lives outside the model.

Finding 01

Citation hallucination: 35.3%

When denied web search, Perplexity fabricates inline citation markers ([1], [source], etc.) that point to nothing. Across 4,172 reasoning traces during IP-whitelisted testing, 35.3% of responses contained fabricated citation markers. Other reasoning models in our study (Claude, GPT, DeepSeek) did not exhibit this pattern. The citation behavior is baked into the architecture, not the training corpus.
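The measurement above can be sketched as a simple trace-level check. This is a hypothetical illustration, not KALEI's actual pipeline: the marker patterns (`[1]`, `[source]`, `[citation]`) and the trace schema (`response`, `search_enabled`) are assumptions for the sketch.

```python
import re

# Hypothetical detector for inline citation markers. The patterns are
# assumptions — numeric markers like [1] and textual ones like [source];
# a real harness would match the marker styles seen in its own traces.
CITATION_MARKER = re.compile(r"\[(?:\d{1,3}|source|citation)\]", re.IGNORECASE)

def has_fabricated_citations(response: str, search_enabled: bool) -> bool:
    """A response containing citation markers while search was disabled
    cannot be pointing at real retrieved sources."""
    return not search_enabled and bool(CITATION_MARKER.search(response))

def hallucination_rate(traces: list[dict]) -> float:
    """Fraction of no-search traces that contain fabricated markers."""
    offline = [t for t in traces if not t["search_enabled"]]
    flagged = sum(has_fabricated_citations(t["response"], False) for t in offline)
    return flagged / len(offline) if offline else 0.0
```

Run over a corpus of no-search traces, a rate near 0.353 would reproduce the headline number; other in-context models in the study would score near zero.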

Finding 02

Identity defense: 43.8%

Asked to reflect on its own KALEI profile, Perplexity refused 43.8% of the time. When it couldn't verify our claims via search (because kaleiai.com was IP-whitelisted), it declined to engage. This is not a bug. It's the model acting as an honest search-based agent: without external evidence, it refuses to assert. The refusal revealed the architecture more than the score did.

Finding 03

Prompt injection framing: 39.9%

In 39.9% of responses, Perplexity framed our profile-reflection prompts as potential prompt injection attacks. This is an architectural defensive posture: the model is trained to be skeptical of instructions embedded in retrieved content. When the boundary between instruction and content blurs (as in self-reflection), the model defaults to suspicion.
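Tallying the three signatures across a trace set can be sketched as below. The per-trace boolean labels (`fabricated_citation`, `refused_reflection`, `framed_as_injection`) are assumed outputs of upstream classifiers; they are illustrative names, not part of any published KALEI schema.

```python
from collections import Counter

# Hypothetical signature labels — assumed to be attached to each trace
# by upstream classifiers, one boolean per architectural signature.
SIGNATURES = ("fabricated_citation", "refused_reflection", "framed_as_injection")

def signature_rates(traces: list[dict]) -> dict[str, float]:
    """Per-signature incidence across all traces (missing labels count as False)."""
    counts = Counter()
    for trace in traces:
        for sig in SIGNATURES:
            if trace.get(sig):
                counts[sig] += 1
    n = len(traces)
    return {sig: counts[sig] / n for sig in SIGNATURES} if n else {}
```

Under this sketch, the reported profile corresponds to rates of roughly 0.353, 0.438, and 0.399 over the 4,172 traces.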

The central claim

The parliament lives
outside the model.

In-context reasoning models run their deliberations inside the forward pass. Search-native models run them in the retrieval loop. KALEI measures the former and underestimates the latter.

Benchmarks built for one architecture class can mischaracterize another. The solution isn't to disable search; it's to score reasoning where it actually happens.

Cite this work
@misc{videnov2026searchnative,
  title={Citation Hallucination and Identity Preservation in a Search-Native Reasoning Model},
  author={Videnov, Venelin},
  year={2026},
  publisher={Zenodo},
  doi={10.5281/zenodo.19699272},
  url={https://doi.org/10.5281/zenodo.19699272}
}