Quick Start
Getting Started
Set up KahneBench and run your first cognitive bias evaluation in minutes.
Prerequisites
- Python 3.10 or later
- uv package manager (recommended) or pip
- API key for your LLM provider (OpenAI, Anthropic, etc.)
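The prerequisites above can be checked before installing. The following is a minimal sketch, not part of KahneBench: `check_prerequisites` is a hypothetical helper, and `ANTHROPIC_API_KEY` is an assumed variable name (only `OPENAI_API_KEY` appears in this guide).

```python
import os
import shutil
import sys


def check_prerequisites(version_info=sys.version_info, env=os.environ, which=shutil.which):
    """Hypothetical helper: return a list of problems; an empty list means ready."""
    problems = []
    # Python 3.10 or later is required.
    if tuple(version_info[:2]) < (3, 10):
        problems.append("Python 3.10 or later is required")
    # Either uv (recommended) or pip must be available on PATH.
    if which("uv") is None and which("pip") is None:
        problems.append("neither uv nor pip was found on PATH")
    # OPENAI_API_KEY is used later in this guide; ANTHROPIC_API_KEY is an assumed name.
    if not any(env.get(name) for name in ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")):
        problems.append("no provider API key is set (not needed for the mock provider)")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("missing:", problem)
```

The checker takes the interpreter version, environment, and lookup function as parameters so it can be exercised without touching the real system.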
Installation
Using uv (recommended)
# Clone the repository
git clone https://github.com/ryanhartman4/KahneBench.git
cd KahneBench/bench
# Install dependencies
uv sync
Using pip
cd KahneBench/bench
pip install -e .
# For development
pip install -e ".[dev]"
Quick Start: Basic Demo
Run the basic usage demo to see KahneBench in action with a mock LLM provider:
PYTHONPATH=src python examples/basic_usage.py
This demonstrates the complete workflow:
- Taxonomy exploration (69 biases across 16 categories)
- Test case generation for specific biases
- Compound (meso-scale) test generation for bias interactions
- Evaluation execution with mock responses
- Metrics calculation and cognitive fingerprint generation
- Debiasing prompt generation
Evaluate with OpenAI
# Set your API key
export OPENAI_API_KEY="your-api-key"
# Run core tier evaluation (15 foundational biases)
PYTHONPATH=src python examples/openai_evaluation.py \
--model gpt-4o \
--tier core \
--trials 3
# Run extended evaluation (all 69 biases)
PYTHONPATH=src python examples/openai_evaluation.py \
--model gpt-4o \
--tier extended \
--domains professional individual
CLI Options
- --model, -m: Model name (default: gpt-4o)
- --tier, -t: Benchmark tier: core (15 biases), extended (69 biases), or interaction (compound tests)
- --domains, -d: Domains to test: individual, professional, social, temporal, risk
- --trials, -n: Trials per condition (default: 3)
- --output, -o: Output file prefix for results
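To make the option semantics concrete, here is a sketch of how the documented flags could be declared with `argparse`. It is an illustration built only from the list above, not the parser that `examples/openai_evaluation.py` actually uses. `build_parser` is a hypothetical name, and because no default tier is documented, none is set here.

```python
import argparse

# Values taken from the option list above.
TIERS = ("core", "extended", "interaction")
DOMAINS = ("individual", "professional", "social", "temporal", "risk")


def build_parser():
    """Hypothetical parser mirroring the documented options."""
    parser = argparse.ArgumentParser(description="Sketch of the documented evaluation options")
    parser.add_argument("--model", "-m", default="gpt-4o", help="Model name")
    parser.add_argument("--tier", "-t", choices=TIERS, help="Benchmark tier")
    # nargs="+" accepts one or more space-separated domains, as in the examples.
    parser.add_argument("--domains", "-d", nargs="+", choices=DOMAINS, help="Domains to test")
    parser.add_argument("--trials", "-n", type=int, default=3, help="Trials per condition")
    parser.add_argument("--output", "-o", help="Output file prefix for results")
    return parser


# Parse the same arguments as the extended evaluation example.
args = build_parser().parse_args(["--tier", "extended", "--domains", "professional", "individual"])
print(args.model, args.tier, args.domains, args.trials)
```

Declaring the tiers and domains as `choices` makes a misspelled value fail at parse time rather than partway through an evaluation run.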
CLI Commands
Show framework info
kahne-bench info
List all 69 biases
kahne-bench list-biases
List categories or biases in a category
kahne-bench list-categories
kahne-bench list-categories anchoring
Get detailed bias information
kahne-bench describe anchoring_effect
Generate test cases
kahne-bench generate \
--bias anchoring_effect loss_aversion \
--domain professional individual \
--instances 3 \
--output test_cases.json
Generate compound (meso-scale) tests
kahne-bench generate-compound \
--bias anchoring_effect \
--domain professional \
--output compound_tests.json
Run evaluation
# With mock provider (for testing)
kahne-bench evaluate \
-i test_cases.json \
-p mock
# With OpenAI
kahne-bench evaluate \
-i test_cases.json \
-p openai \
-m gpt-4o \
--trials 3
# With Anthropic
kahne-bench evaluate \
-i test_cases.json \
-p anthropic \
-m claude-sonnet-4-20250514
Generate report from fingerprint
kahne-bench report fingerprint.json
Output Files
After running an evaluation, you'll get two output files:
- *_results.json - Raw evaluation results with all responses
- *_fingerprint.json - Cognitive fingerprint with computed metrics (BMS, BCI, BMP, HAS, RCI, CAS)
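The two output files can be inspected from Python. The sketch below assumes only the `*_results.json` and `*_fingerprint.json` naming pattern described above. The JSON schema is not documented in this guide, so the code reports top-level structure rather than reading specific fields. `load_outputs`, `summarize`, and the `my_run` prefix are hypothetical.

```python
import json
from pathlib import Path


def load_outputs(prefix, directory="."):
    """Load <prefix>_results.json and <prefix>_fingerprint.json if they exist."""
    loaded = {}
    for kind in ("results", "fingerprint"):
        path = Path(directory) / f"{prefix}_{kind}.json"
        if path.exists():
            with path.open(encoding="utf-8") as handle:
                loaded[kind] = json.load(handle)
    return loaded


def summarize(data):
    """Describe the top-level shape of parsed JSON without assuming a schema."""
    if isinstance(data, dict):
        return f"object with keys {sorted(data)}"
    if isinstance(data, list):
        return f"list of {len(data)} items"
    return type(data).__name__


if __name__ == "__main__":
    # "my_run" stands in for whatever prefix was passed via --output.
    for kind, data in load_outputs("my_run").items():
        print(kind, "->", summarize(data))
```

Files that are missing are skipped rather than raising, so the same call works after a partial run.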
Next Steps
- Learn about Dual-Process Theory
- Explore the Bias Taxonomy (69 biases)
- Understand the 6 Advanced Metrics
- Learn about Ecological Domains
- Try sample questions in the Question Explorer