
CLI for prompt engineering: version control, test-driven evaluation, A/B testing with statistical confidence.
Prompts as code: prompts live in Markdown files with YAML frontmatter and are tested against JSON fixtures, with bulk evaluation reporting pass/fail results. A/B tests report 95% confidence intervals computed by bootstrap resampling. A local React dashboard shows traces and cost analysis.
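
To illustrate the statistics behind the A/B comparison, here is a minimal sketch of a percentile bootstrap confidence interval for the difference in pass rate between two prompt variants. It is not this tool's implementation: the function name `bootstrap_diff_ci`, its parameters, and the sample data are all hypothetical, and the percentile method is only one of several ways to build a bootstrap interval.

```python
import random


def bootstrap_diff_ci(a, b, n_resamples=2000, confidence=0.95, seed=0):
    """Percentile bootstrap CI for mean(b) - mean(a).

    Hypothetical helper for illustration; `a` and `b` are lists of
    per-test outcomes (1 = pass, 0 = fail) for two prompt variants.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each variant with replacement, at its original size.
        sample_a = rng.choices(a, k=len(a))
        sample_b = rng.choices(b, k=len(b))
        diffs.append(sum(sample_b) / len(b) - sum(sample_a) / len(a))
    diffs.sort()
    # Take the central `confidence` mass of the resampled differences.
    alpha = 1.0 - confidence
    lo_index = int((alpha / 2) * n_resamples)
    hi_index = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return diffs[lo_index], diffs[hi_index]


# Made-up outcomes: variant A passes 60 of 100 tests, variant B passes 85 of 100.
variant_a = [1] * 60 + [0] * 40
variant_b = [1] * 85 + [0] * 15

low, high = bootstrap_diff_ci(variant_a, variant_b)
print(f"95% CI for pass-rate difference (B - A): [{low:.3f}, {high:.3f}]")
```

If the interval excludes zero, the difference between the variants is unlikely to be resampling noise at the chosen confidence level. With small fixture sets the interval will be wide, which is a reason to report the interval rather than a bare pass-rate delta.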