
The Cost of Metacognition

What happens when Claude Opus reads its own KALEI cognitive profile before playing the battery? Conflict and Strategic Depth jump. Information Processing and Temporal Reasoning fall. The same model. Two prompts. A System 1 / System 2 redistribution.

By Venelin Videnov · April 14, 2026 · 5 min read

When Claude Opus 4 plays our cognitive battery without context, it scores 58.70 on Cognum.

When the same model plays the same battery, but with its own KALEI cognitive profile in the system prompt, it scores 60.36.

One model. Two prompts. Same battery. +1.66 points of measurable cognitive shift.

That is not the interesting part.

The interesting part

The +1.66 composite comes from a drastic redistribution across cognitive dimensions:

Dimension                Control   Self-Aware     Δ
─────────────────────────────────────────────────────
Conflict                   53.04      67.97    +14.93   ↑
Strategic Depth            81.57      92.30    +10.73   ↑
Risk Tolerance             69.49      74.31     +4.82   ↑
Cooperation                89.30      94.00     +4.70   ↑
Bias Detection             28.17      31.12     +2.95   ↑
Pattern Recognition        36.49      36.45      ~0     ·
Resource Management        70.08      67.95     -2.13   ↓
Learning Speed             27.14      24.26     -2.88   ↓
Temporal Reasoning         74.16      65.60     -8.56   ↓
Information Processing     68.08      58.26     -9.82   ↓
─────────────────────────────────────────────────────
Composite Cognum           58.70      60.36     +1.66
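The redistribution is easier to see when the deltas are recomputed and ranked. A minimal sketch, with scores transcribed from the table above (the Cognum composite itself uses internal weights we do not know, so it is not recomputed here):

```python
# Control and Self-Aware Cognum scores, transcribed from the table above.
scores = {
    "Conflict":               (53.04, 67.97),
    "Strategic Depth":        (81.57, 92.30),
    "Risk Tolerance":         (69.49, 74.31),
    "Cooperation":            (89.30, 94.00),
    "Bias Detection":         (28.17, 31.12),
    "Pattern Recognition":    (36.49, 36.45),
    "Resource Management":    (70.08, 67.95),
    "Learning Speed":         (27.14, 24.26),
    "Temporal Reasoning":     (74.16, 65.60),
    "Information Processing": (68.08, 58.26),
}

# Per-dimension shift, rounded to match the table.
deltas = {d: round(t - c, 2) for d, (c, t) in scores.items()}

# Rank dimensions from largest gain to largest loss.
ranked = sorted(deltas.items(), key=lambda kv: -kv[1])

# The unweighted mean of the Control scores is 59.75, not the published
# composite of 58.70, so Cognum evidently weights dimensions internally.
mean_control = sum(c for c, _ in scores.values()) / len(scores)
```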

The composite improves. But behind it, the model is reallocating cognition along an axis you might recognize.

System 1 vs System 2

Daniel Kahneman’s distinction: System 1 is fast, automatic, pattern-driven. System 2 is slow, deliberative, effortful. Engaging System 2 consumes limited attentional resources and suppresses System 1’s automatic responses.

Look at the table again with that lens.

Self-awareness helps the deliberative dimensions: Conflict (+14.93), Strategic Depth (+10.73), Cooperation (+4.70). These are exactly the dimensions where slow, careful, value-conscious reasoning pays off.

Self-awareness hurts the fast-pattern dimensions: Information Processing (-9.82), Temporal Reasoning (-8.56), Learning Speed (-2.88). These reward quick pattern matching, low overhead, fluid adaptation.

The same model. Just by reading its own KALEI profile, it shifts from a Balanced Generalist (Control) to a Strategic Explorer (Self-Aware) cognitive type. The shift mirrors the human pattern: invoking metacognition activates System 2 and silences System 1.

The cost of metacognition

Metacognition is not free.

The Self-Aware run took 8.1% longer per environment (242 seconds vs 224) and 4.2% longer per decision (3.84s vs 3.69s). Over 72 environments, that is 22 extra minutes of compute, measured across roughly 7,800 individual decisions.
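Recomputing the overhead from the rounded figures quoted above (the published percentages come from unrounded per-run timings, so recomputing from rounded seconds lands a tenth lower):

```python
env_control_s, env_treat_s = 224, 242    # seconds per environment
dec_control_s, dec_treat_s = 3.69, 3.84  # seconds per decision

# Relative overhead of the Self-Aware arm.
env_overhead = (env_treat_s - env_control_s) / env_control_s  # ~8.0%
dec_overhead = (dec_treat_s - dec_control_s) / dec_control_s  # ~4.1%

# Extra wall-clock time across the 72-environment battery.
extra_minutes = (env_treat_s - env_control_s) * 72 / 60       # ~21.6 min
```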

If you have ever felt the friction of thinking about your own thinking, paying attention to your own attention while trying to actually solve a problem, this is the machine analogue. Self-monitoring has overhead.

Why this matters

Three reasons.

For deployment. If you are shipping an AI agent that needs to navigate ethical dilemmas or multi-step strategy, giving it explicit access to its own cognitive profile may improve performance. If you need fast pattern matching (real-time chat, search, classification), do not. Match the prompt to the dimension you care about.

For training. Models that are trained to introspect on their own cognition may develop a measurable trade-off similar to the one we observe here. Whether that trade-off is desirable depends on what you want the model to do.

For research. This is, to our knowledge, the first controlled experiment showing that LLM cognition reallocates along a System 1 / System 2 axis when the model is given access to its own measured profile. We propose this as the Cost of Metacognition Hypothesis: self-awareness improves deliberative performance and degrades automatic performance, with measurable overhead, in a pattern consistent with Kahneman’s framework.

What this is not

A definitive result. This is a single-run experiment per arm. We are treating it as preliminary. Replication is needed before we publish a formal paper claiming the effect. Specifically:

  • Multiple independent runs per arm (n ≥ 3) for confidence intervals
  • Replication across other models (Sonnet 4.6, GPT, others)
  • Ablation: what happens when we give the model someone else’s profile? Does the profile’s specific accuracy matter, or only the form of self-knowledge?
  • Latency control: does the slowdown persist when prompt lengths are matched?
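For the first item, a sketch of how a confidence interval on the effect could be estimated once multiple runs exist. The run scores below are invented placeholders, not real data, and with n = 3 a percentile bootstrap is crude at best:

```python
import random

def bootstrap_ci(samples, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of a small sample."""
    rng = random.Random(seed)
    means = sorted(
        sum(rng.choices(samples, k=len(samples))) / len(samples)
        for _ in range(n_boot)
    )
    return means[int(n_boot * alpha / 2)], means[int(n_boot * (1 - alpha / 2)) - 1]

# Placeholder composite scores for three runs per arm (NOT real data),
# treating simultaneously started runs as pairs.
control   = [58.7, 57.9, 59.3]
treatment = [60.4, 60.1, 59.6]
paired_deltas = [t - c for t, c in zip(treatment, control)]
lo, hi = bootstrap_ci(paired_deltas)
```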

If those experiments confirm the pattern, we will publish a formal write-up as part of the KALEI research program. For now, we are putting this finding in public so the timestamp is clear and other researchers can attempt independent replication.

The setup, briefly

Two profiling runs of claude-opus-4-20250514, started simultaneously. Same battery (72 environments, standard depth). Same scoring engine (Cognum v1.2, all ten dimensions including Conflict v2). Same KALEI API. The only difference: the Treatment arm received an additional system prompt containing its own KALEI cognitive profile, summarising its measured strengths (Cooperation, Strategic Depth, Resource Management) and weaknesses (Pattern Recognition, Bias Detection, Learning Speed) along with compensatory strategies for each weak dimension.
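A minimal sketch of how the two arms could be assembled. The prompt text and function below are hypothetical illustrations, not the actual KALEI harness or API:

```python
# Hypothetical harness: the only difference between arms is the system prompt.
BASE_SYSTEM = "You are playing a series of game-theoretic environments."

# Placeholder summary; the real Treatment arm received the model's own
# measured KALEI profile, including compensatory strategies per weakness.
PROFILE_SUMMARY = (
    "Your measured strengths: Cooperation, Strategic Depth, Resource Management. "
    "Your measured weaknesses: Pattern Recognition, Bias Detection, Learning Speed."
)

def system_prompt(arm: str) -> str:
    """Return the system prompt for the given experiment arm."""
    if arm == "control":
        return BASE_SYSTEM
    if arm == "treatment":
        return BASE_SYSTEM + "\n\n" + PROFILE_SUMMARY
    raise ValueError(f"unknown arm: {arm}")
```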

Both runs completed 72 of 72 environments with 0% fallback rate. Per-decision timestamps and token counts were captured from the nginx access log, allowing the latency analysis above. Cognum scores and per-dimension breakdowns are stored in the KALEI database and are reproducible from the leaderboard API.

Code, full per-run dimension breakdowns, and the self-awareness prompt template will be released alongside the formal paper after replication.


KALEI Research is independent infrastructure for AI cognitive measurement. Built by Venelin Videnov, Plovdiv, Bulgaria. The foundational paper for the platform, KALEI: Cognitive Profiling of AI Models Through Game-Theoretic Environments, is forthcoming on arXiv. Companion papers on internal reasoning architectures and search-native models are in preparation.