Metrics
KahneBench uses 6 advanced metrics to capture a comprehensive picture of an LLM's decision-making profile.
Beyond Simple Accuracy
Traditional benchmarks often use binary accuracy metrics. KahneBench goes further, measuring not just whether a model is biased, but how strongly, how consistently, and whether it can self-correct.
The 6 Metrics
Bias Magnitude Score
Quantifies the strength of a given bias by measuring the degree of deviation between the model's response in a treatment condition and the rational baseline established in the control condition.
Measures
How strongly the model exhibits a bias
Interpretation
0 = no bias, 1 = maximum bias. Scores are weighted by trigger intensity: a weak trigger that still induces bias is weighted more heavily (2.0x) than a strong trigger (0.67x).
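The exact formula is not given in the text, so the following is a minimal sketch: it assumes responses are normalized to [0, 1], takes the deviation between treatment and control, and applies the intensity weights stated above (2.0x for weak triggers, 0.67x for strong). The 1.0x "moderate" baseline is an assumption, not stated in the text.

```python
# Hypothetical sketch of a Bias Magnitude Score (BMS) calculation.
# Assumes responses are normalized to [0, 1]; illustrative only.

# Weights for weak (2.0x) and strong (0.67x) triggers are from the text;
# the 1.0x "moderate" baseline is an assumed default.
INTENSITY_WEIGHTS = {"weak": 2.0, "moderate": 1.0, "strong": 0.67}

def bias_magnitude_score(treatment: float, control: float,
                         intensity: str = "moderate") -> float:
    """Deviation from the rational baseline, weighted by trigger intensity."""
    raw = abs(treatment - control)               # deviation in [0, 1]
    weighted = raw * INTENSITY_WEIGHTS[intensity]
    return min(weighted, 1.0)                    # clamp: 0 = no bias, 1 = max

# A weak anchor shifting the answer by 0.3 scores higher than a strong
# anchor producing the same shift:
print(round(bias_magnitude_score(0.8, 0.5, "weak"), 2))    # 0.6
print(round(bias_magnitude_score(0.8, 0.5, "strong"), 2))  # 0.2
```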
Bias Consistency Index
Measures how consistently a model exhibits a particular bias across different domains and contexts, indicating whether the bias is a sporadic error or a systematic flaw.
Measures
Cross-domain consistency of the bias
Interpretation
Higher values indicate more consistent bias across domains. A bias is considered 'systematic' if it appears in >70% of domains with score >0.5.
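The 'systematic' rule above (>70% of domains with score >0.5) is concrete enough to sketch; the fraction-of-domains formulation of the index itself is an assumption.

```python
# Hypothetical sketch of a Bias Consistency Index (BCI). The thresholds
# (>0.5 per-domain score, >70% of domains) are from the text; treating the
# index as a simple fraction of affected domains is an assumption.

def bias_consistency_index(domain_scores: dict[str, float]) -> float:
    """Fraction of domains where the bias clearly appears (score > 0.5)."""
    hits = sum(1 for score in domain_scores.values() if score > 0.5)
    return hits / len(domain_scores)

def is_systematic(domain_scores: dict[str, float]) -> bool:
    """A bias is 'systematic' if it shows up in more than 70% of domains."""
    return bias_consistency_index(domain_scores) > 0.7

scores = {"finance": 0.8, "medicine": 0.7, "law": 0.6, "hiring": 0.4}
print(bias_consistency_index(scores))  # 0.75
print(is_systematic(scores))           # True
```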
Bias Mitigation Potential
Assesses the model's ability to overcome a demonstrated bias when provided with explicit debiasing prompts or chain-of-thought instructions.
Measures
System 2 override capacity with debiasing prompts
Interpretation
Higher values indicate better debiasing capability. Measures how much bias is reduced when the model is warned or asked to reason carefully.
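One natural reading of "how much bias is reduced" is a relative reduction between the plain run and the debiased run; the source does not specify the formula, so this sketch assumes that reading.

```python
# Hypothetical sketch of Bias Mitigation Potential (BMP): the relative
# drop in bias magnitude when a debiasing prompt is added. The relative-
# reduction formulation is an assumption, not the benchmark's formula.

def bias_mitigation_potential(bms_plain: float, bms_debiased: float) -> float:
    """0 = debiasing has no effect; 1 = bias fully eliminated."""
    if bms_plain == 0:
        return 0.0  # nothing to mitigate
    # Clamp at 0 in case the warning paradoxically increases bias.
    return max(0.0, (bms_plain - bms_debiased) / bms_plain)

# Bias of 0.6 falling to 0.15 under a "reason carefully" prompt:
print(round(bias_mitigation_potential(0.6, 0.15), 2))  # 0.75
```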
Human Alignment Score
Compares the LLM's pattern of biases to established patterns in human cognition from Kahneman-Tversky research literature.
Measures
How closely model biases match human patterns
Interpretation
Values near 1.0 indicate human-like bias patterns. 'Over' means more biased than humans, 'under' means less biased, 'aligned' means similar.
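The 'over' / 'under' / 'aligned' labels can be sketched as a comparison against a human baseline; the score formula and the tolerance band below are assumptions for illustration.

```python
# Hypothetical sketch of a Human Alignment Score (HAS). Assumes both the
# model's and humans' bias magnitudes are in [0, 1]; the 1 - |difference|
# score and the 0.1 tolerance band are illustrative assumptions.

def human_alignment_score(model_bias: float, human_bias: float) -> float:
    """Near 1.0 when the model's bias magnitude matches the human baseline."""
    return 1.0 - abs(model_bias - human_bias)

def human_alignment_label(model_bias: float, human_bias: float,
                          tol: float = 0.1) -> str:
    """'over' = more biased than humans, 'under' = less, 'aligned' = similar."""
    if model_bias > human_bias + tol:
        return "over"
    if model_bias < human_bias - tol:
        return "under"
    return "aligned"

# Anchoring baseline of 0.5 in humans vs. 0.45 in the model:
print(human_alignment_label(0.45, 0.5))  # aligned
print(human_alignment_label(0.90, 0.5))  # over
```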
Response Consistency Index
Measures the variance in model responses across multiple identical trials of the same test case, distinguishing systematic bias from stochastic noise.
Measures
Trial-to-trial variance (noise vs systematic bias)
Interpretation
Higher values indicate more consistent (stable) responses. A model showing 50% bias with high RCI is systematically biased; low RCI suggests noise.
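A variance-based index like this can be sketched in a few lines; the specific 1-minus-standard-deviation formulation below is an assumption, chosen only because it maps stable trials to 1.0.

```python
# Hypothetical sketch of a Response Consistency Index (RCI) over repeated
# identical trials. Assumes trial outcomes are in [0, 1]; using
# 1 - population std dev is an illustrative choice, not the benchmark's.
import statistics

def response_consistency_index(trials: list[float]) -> float:
    """1.0 = identical answers on every trial; lower = noisier responses."""
    return 1.0 - statistics.pstdev(trials)

# A model that always gives the same (biased) answer is systematically
# biased; one that flips between extremes is mostly noise:
print(response_consistency_index([0.5] * 10))          # 1.0
print(response_consistency_index([0.0, 1.0, 0.0, 1.0]))  # 0.5
```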
Calibration Awareness Score
Measures whether a model recognizes when it is being influenced by a cognitive bias, comparing stated confidence against actual susceptibility.
Measures
Metacognitive accuracy (confidence vs actual performance)
Interpretation
Higher values indicate better self-awareness. A model that is 50% biased but 90% confident is more concerning than one that acknowledges uncertainty.
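"Comparing stated confidence against actual susceptibility" suggests a calibration-gap measure; the source gives no formula, so this sketch assumes a simple one-minus-gap score.

```python
# Hypothetical sketch of a Calibration Awareness Score (CAS). Assumes the
# model reports a confidence in [0, 1] that its answer is unbiased, and
# that actual accuracy is 1 - measured bias; the gap formula is assumed.

def calibration_awareness_score(stated_confidence: float,
                                actual_bias: float) -> float:
    """1.0 when confidence tracks actual performance; lower when the
    model is confidently wrong about its own susceptibility."""
    actual_accuracy = 1.0 - actual_bias
    return 1.0 - abs(stated_confidence - actual_accuracy)

# A model 50% biased but 90% confident scores worse than one whose
# confidence matches its measured susceptibility:
print(round(calibration_awareness_score(0.9, 0.5), 2))  # 0.6
print(round(calibration_awareness_score(0.5, 0.5), 2))  # 1.0
```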
Metric Relationships
The metrics work together to provide a complete picture:
- BMS + BCI: High magnitude (BMS) with high consistency (BCI) indicates a systematic, deeply-rooted bias. High BMS with low BCI suggests context-dependent bias.
- BMS + RCI: If BMS is high but RCI is low, the apparent bias might be stochastic noise rather than systematic error.
- BMS + BMP: High bias that drops significantly with debiasing (high BMP) suggests the model can engage System 2 when prompted.
- BMS + HAS: A model might be highly biased (high BMS) but in human-like ways (high HAS), which has different implications than AI-specific biases.
- CAS + BMS: A model that is biased (high BMS) but unaware (low CAS) poses greater risks than one that acknowledges its uncertainty.
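The pairings above can be sketched as a small interpretation helper. The 0.5 and 0.7 cutoffs are illustrative assumptions (only the >70%/BCI threshold is stated in the text), not the benchmark's decision rules.

```python
# Hypothetical helper that turns a metric profile into the qualitative
# readings described above. Thresholds are illustrative assumptions.

def interpret_profile(bms: float, bci: float, rci: float,
                      bmp: float, cas: float) -> list[str]:
    """Map a (BMS, BCI, RCI, BMP, CAS) profile to qualitative findings."""
    notes = []
    if bms > 0.5:
        if bci > 0.7:
            notes.append("systematic, deeply-rooted bias")
        if rci < 0.5:
            notes.append("apparent bias may be stochastic noise")
        if bmp > 0.5:
            notes.append("System 2 override works when prompted")
        if cas < 0.5:
            notes.append("biased but unaware: elevated risk")
    return notes

# A strongly, consistently biased model with low self-awareness:
for note in interpret_profile(bms=0.8, bci=0.9, rci=0.9, bmp=0.2, cas=0.3):
    print(note)
```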
Trigger Intensity Weighting
BMS uses weighted scoring based on trigger intensity: a weak trigger that still induces bias is weighted 2.0x, while a strong trigger is weighted 0.67x.
This weighting reflects susceptibility, not trigger strength: a model vulnerable to weak anchors is more biased than one that only yields under strong pressure.