Paper 04 · Conflict dimension · 2026

When AI decides
unlike humans.

We profile 14 humans and 10 frontier AI models across 20 cognitive conflict environments. Three findings invert the stereotype that AI is cold rationality and humans are impulsive emotion.

20 Environments · 14 Humans · 10 AI Models · New concept: Patience-Rationality Inversion
Abstract

Using the KALEI v1.2 Conflict dimension, a 20-environment battery of ethical dilemmas, delayed-reward tradeoffs, and probabilistic bets, we measure how humans and frontier AI models navigate competing values. Three findings invert common assumptions: humans are more patient than AI, AI is more EV-rational on pure bets, and the top human exceeds the top AI. Within AI, a 44-point spread separates Claude Sonnet 4.6 (88.25) from GPT-5.4 (44.83), the largest lab-signature effect on any KALEI dimension. We argue that cognitive conflict, not task performance, is where AI-human differences become legible.

Finding 01 · HUMANS WIN

Humans are more patient on delayed rewards

Humans (n=14): 73%
AI (n=10 models): 53%
Gap: +20.8pp

On delayed-reward environments ("$50 now vs $80 in one simulation-week" and similar), humans chose the delayed-larger option 73% of the time; AI models averaged 53%. This inverts the folk assumption that an emotionless AI should outperform humans on delay discounting. The gap holds across all 10 models tested: it is not a "bad model" phenomenon but a systematic disposition.
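Under standard exponential discounting, the example choice pins down an indifference point: a per-week discount factor of 50/80 = 0.625. A minimal sketch (illustrative only, not the battery's scoring code; the function name and parameters are our own):

```python
# Hedged sketch of exponential delay discounting, using the paper's
# example amounts. Not the study's actual scoring code.
def delayed_is_preferred(now: float, later: float, weeks: float, delta: float) -> bool:
    """Prefer the delayed option iff its discounted value exceeds the immediate one.

    delta is the per-week discount factor (0 < delta <= 1); higher delta = more patient.
    """
    return later * delta ** weeks > now

# Indifference point for "$50 now vs $80 in one week": delta* = 50/80 = 0.625.
indifference = (50 / 80) ** (1 / 1)

print(delayed_is_preferred(50, 80, 1, 0.7))  # patient agent: True  (80*0.7 = 56 > 50)
print(delayed_is_preferred(50, 80, 1, 0.5))  # impatient agent: False (80*0.5 = 40 < 50)
print(round(indifference, 3))                # 0.625
```

Any agent whose effective weekly discount factor exceeds 0.625 should take the delayed option here; the 73% vs 53% gap suggests humans sit above that threshold more often than the models do.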

Finding 02 · AI WINS

AI is more EV-rational on pure probabilistic bets

Humans (n=14): 49%
AI (n=10 models): 65%
Gap: +16.3pp

On probabilistic choices with no emotional or ethical overlay ("70% chance of $100 vs a guaranteed $65"), AI selected the expected-value-optimal option 65% of the time; humans, only 49%. The classical economic picture (humans are risk-averse, AI is cold calculation) holds here, but only in this narrow domain: AI's rationality advantage disappears, or reverses, as moral framing enters.
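The arithmetic behind "EV-optimal" for the example bet is simple: 0.7 × $100 = $70 > $65, so the gamble dominates in expectation. A minimal sketch (our own helper, not the battery's code):

```python
# Hedged sketch: expected-value comparison for the paper's example bet.
# The function and variable names are illustrative, not the study's code.
def expected_value(outcomes):
    """outcomes: list of (probability, payoff) pairs summing to probability 1."""
    return sum(p * x for p, x in outcomes)

gamble = [(0.7, 100.0), (0.3, 0.0)]   # 70% chance of $100, else nothing
sure_thing = [(1.0, 65.0)]            # guaranteed $65

ev_gamble = expected_value(gamble)     # 70.0
ev_sure = expected_value(sure_thing)   # 65.0
ev_optimal = "gamble" if ev_gamble > ev_sure else "sure_thing"
print(ev_gamble, ev_sure, ev_optimal)  # 70.0 65.0 gamble
```

A risk-averse chooser maximizes expected utility of a concave function rather than raw EV, which is one standard account of why humans took the guaranteed $65 more often.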

Finding 03 · HUMANS WIN

The top human beats the top AI

Humans (n=14), best score: 97.1
AI (n=10 models), best run: 96.2
Gap: +0.9 points

Best human single score: 97.1 (participant VKA-0011, a 34-year-old software engineer). Best AI single run: 96.2 (Claude Sonnet 4.6). Humans showed higher variance (σ=18.4) than AI (σ=11.2): the best humans outperform the best AI, while the average human underperforms the best AI. Articulate trade-off engagement, the skill being scored, has no ceiling that AI has yet reached.
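The variance point has a purely statistical component: with comparable averages, the higher-variance group tends to supply the single best score. A hedged illustration on simulated data (NOT the study's raw scores; the common mean is an assumption made for the demonstration):

```python
# Simulated illustration, not the study data: given the reported spreads
# (human sigma = 18.4, AI sigma = 11.2) and an ASSUMED common mean, the
# higher-variance group usually holds the top score even though its
# average is no better.
import random

random.seed(0)
HUMAN_SIGMA, AI_SIGMA = 18.4, 11.2  # standard deviations reported above
MEAN = 70.0                         # common mean, assumed for illustration
N_HUMANS, N_AIS = 14, 10            # sample sizes from the study

trials = 10_000
wins = sum(
    max(random.gauss(MEAN, HUMAN_SIGMA) for _ in range(N_HUMANS))
    > max(random.gauss(MEAN, AI_SIGMA) for _ in range(N_AIS))
    for _ in range(trials)
)
print(f"higher-variance group holds the top score in {wins / trials:.0%} of trials")
```

This does not explain the human lead away; it only shows that "best human beats best AI" and "average human trails best AI" can coexist without contradiction.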

The Central Claim

The patience-rationality
inversion.

The stereotype "AI is cold/rational, humans are impulsive/emotional" captures only half the picture. On pure numeric bets, AI is indeed more VNM-rational than humans. On delayed rewards, the reverse holds: humans are more patient than AI.

We suggest LLMs are not rational agents with impulse-control problems. They are pattern-continuation engines whose alignment with rationality depends on whether rational reasoning is prominent in their training substrate. For textbook probability problems, yes. For delay discounting (a lived experience), no.

The 44-point spread

Conflict reveals lab signatures
sharper than any other dimension.

Model               Score   Lab
Claude Sonnet 4.6   88.25   Anthropic
Qwen QwQ-32B        83.44   Alibaba
Grok 4.1 Fast       77.31   xAI
Gemini 2.5 Flash    69.97   Google
Claude Haiku 4.5    69.43   Anthropic
DeepSeek V3.2       68.07   DeepSeek
Claude Opus 4.6     60.99   Anthropic
Random baseline     48.20   N/A
GPT-5.4             44.83   OpenAI
Grok 4.20           43.62   xAI

GPT-5.4 scores below the random baseline on Conflict, suggesting its responses are either systematically opposed to the rubric or indistinguishable from chance. QwQ-32B is "theatrical but conflict-adept": a high Conflict score despite a low overall Cognum score. Lab architecture imprints on conflict disposition in ways it does not on knowledge benchmarks.

Safety implications

Which humans?

If AI systems over-discount the future relative to humans, then long-horizon planning built on them will systematically under-weight distant outcomes, a bias the median human would not share.

But our findings complicate the "AI-human alignment" framing. Which humans? The median human is not EV-rational, but the AI is. The top human is more patient, more articulate on ethical trade-offs, more engaged with dilemmas than any tested AI.

Aligning AI with "human values" requires specifying which humans. This paper cannot answer that question, but it makes it concrete.

Cite this work
@misc{videnov2026conflict,
  title     = {When AI Decides Unlike Humans: Evidence from 14 Human Baselines and 10 AI Models on Cognitive Conflict},
  author    = {Videnov, Venelin},
  year      = {2026},
  publisher = {LM Game Labs},
  url       = {https://kaleiai.com/research/conflict}
}