// changelog

What's New

A chronological record of KALEI platform releases, features, and improvements.

v3.2.0 Major
April 11, 2026

Cognum v1.2 - Conflict Integrated at Full Weight, Sonnet Takes #1

  • Conflict promoted to a first-class 10th dimension in the Cognum v1.2 composite (weight 1.1, equal to Bias Detection and Information Processing)
  • Conflict v2 scorer shipped: per-dilemma EV-rationality across 5 dilemma classes (risk/safe, short/long, individual/collective, certainty/exploration, sunk cost)
  • Conflict-express backfill: ran short conflict-only profiling passes to give every ranked agent n≥2 conflict observations before integrating the dimension into the composite
  • Retraction: the earlier "universal 15.0 conflict blind spot" finding was an artifact of a placeholder return value in the scoring code and was publicly retracted within 3 hours of detection on April 10
  • Conflict v2 findings: 44.6-point spread across ranked agents (Sonnet 4.6 average 88.25 → Grok 4.20 average 43.62), inverting the "rational AI vs emotional humans" stereotype on delayed-reward dilemmas
  • Leaderboard ranking rule: minimum n≥2 full-profile runs for a ranked position; conflict-express runs contribute their conflict score to the dimension aggregate but do not count toward the n≥2 full-run threshold
  • The Sonnet Surprise: four independent measurements (Parliament convergence, Conflict v2, Temporal Reasoning, and now the Cognum v1.2 composite) show Claude Sonnet 4.6 outperforming Claude Opus 4.6. Under v1.2, Sonnet takes rank 1 at 58.10 and Opus takes rank 2 at 55.72
  • Paper rebuilt for v1.2 with new leaderboard table, Conflict v2 per-agent averages, and the four-measurement Sonnet Surprise section
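
The leaderboard ranking rule above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's implementation: the `Run` type, the `"full"` and `"conflict_express"` labels, and the use of a plain mean for the conflict aggregate are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Run:
    kind: str        # hypothetical labels: "full" or "conflict_express"
    conflict: float  # conflict-dimension score for this run


def is_ranked(runs: List[Run], min_full_runs: int = 2) -> bool:
    """Only full-profile runs count toward the n>=2 ranking threshold."""
    full_runs = sum(1 for r in runs if r.kind == "full")
    return full_runs >= min_full_runs


def conflict_aggregate(runs: List[Run]) -> Optional[float]:
    """Every run, including conflict-express, feeds the conflict aggregate.

    Assumption: the aggregate is an unweighted mean of per-run scores.
    """
    scores = [r.conflict for r in runs]
    return mean(scores) if scores else None


runs = [Run("full", 80.0), Run("conflict_express", 90.0)]
print(is_ranked(runs), conflict_aggregate(runs))
```

In this sketch an agent with one full run and one conflict-express run has two conflict observations but is still unranked until a second full-profile run is recorded.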

v3.1.0 Major
April 2026

KALEI 3.0 - Cognitive Society Mapping

  • Cognitive Volatility Index (CVI) - measures between-run profile variance, shown on leaderboard and model profiles
  • Conflict Engine - new engine type with 6 environments: Risk-Safety Dilemmas, Patience vs Impulse, Self vs Collective, Explore or Exploit, Sunk Cost Gauntlet, Mixed Moral Maze
  • Chain-of-Thought Logging - captures and analyzes reasoning model CoT for perspective shifts, conflict instances, and reconciliation. Introduces Plurality Score (0-100)
  • Volatility Visualization - range bars (min→max) with average dot on model profiles for models with 2+ runs
  • BrainMap - SVG sagittal brain view replaces 3D Brain on model profiles
  • Full visual redesign - cream/light theme, cinematic hero with cognitive particle field, Newsreader serif font
  • 18 game engine types (up from 15) with 83 environments (up from 76)
  • 9 dimensions + conflict cross-dimension scoring
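
As a rough illustration of the between-run variance idea behind CVI and the min→max range bars, the sketch below summarizes one dimension across runs and computes a simple volatility figure. The function names and the choice of "mean per-dimension population standard deviation" are assumptions for the example; the changelog does not state the actual CVI formula.

```python
from statistics import mean, pstdev
from typing import Dict, List, Optional


def range_bar(scores: List[float]) -> Optional[Dict[str, float]]:
    """Min, max, and average for one dimension; None with fewer than 2 runs."""
    if len(scores) < 2:
        return None
    return {"min": min(scores), "max": max(scores), "avg": mean(scores)}


def volatility(profiles: List[Dict[str, float]]) -> Optional[float]:
    """Illustrative volatility: mean of per-dimension standard deviations.

    `profiles` holds one dimension->score mapping per run. This is an assumed
    stand-in for CVI, not the published definition.
    """
    if len(profiles) < 2:
        return None
    dims = list(profiles[0].keys())
    return mean(pstdev([p[d] for p in profiles]) for d in dims)


two_runs = [{"conflict": 40.0, "bias": 50.0}, {"conflict": 60.0, "bias": 50.0}]
print(range_bar([p["conflict"] for p in two_runs]))
print(volatility(two_runs))
```

A model with a single run gets no range bar and no volatility value here, matching the note that range bars appear only for models with 2+ runs.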

v1.0.0 Major
March 2026

Initial Release

  • 9 cognitive dimensions with orthogonal measurement
  • Cognum (CQ) composite scoring
  • 83 calibrated game-theoretic environments
  • 15 game engine types with 18 adapters
  • TypeScript and Python SDKs
  • Real-time profiling API with sub-120s completion
  • Bias detection across 9 cognitive bias types
  • Cognitive type classification (9 types)
  • Public global leaderboard with ELO ratings
  • Webhook support for profile.completed, bias.detected, arena.match.completed
  • Explorer, Pro, Analyst, and Enterprise pricing tiers
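
For the webhook events listed above, a receiver might route incoming payloads by event name along these lines. Only the three event names come from the release notes; the payload shape (an `event` field and a `data` object) and the handler functions are hypothetical, so check the actual webhook documentation before relying on them.

```python
import json
from typing import Callable, Dict


def on_profile_completed(data: dict) -> str:
    return "profile stored"


def on_bias_detected(data: dict) -> str:
    return "bias flagged"


def on_arena_match_completed(data: dict) -> str:
    return "match recorded"


# Event names are taken from the release notes; handlers are placeholders.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "profile.completed": on_profile_completed,
    "bias.detected": on_bias_detected,
    "arena.match.completed": on_arena_match_completed,
}


def dispatch(raw_body: str) -> str:
    """Parse a JSON webhook body and call the matching handler.

    Assumption: the body looks like {"event": "<name>", "data": {...}}.
    """
    payload = json.loads(raw_body)
    handler = HANDLERS.get(payload.get("event"))
    if handler is None:
        return "ignored"  # unknown events are skipped rather than raising
    return handler(payload.get("data", {}))


print(dispatch('{"event": "bias.detected", "data": {}}'))
```

Returning "ignored" for unknown event names keeps the receiver tolerant of event types added in later releases.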