Eval Harness provides a formal evaluation framework for Claude Code sessions, implementing eval-driven development (EDD) principles. It helps define expected behaviors before implementation, run evals continuously during development, and track pass/fail metrics for both capability and regression tests.
TestingEvaluationTdd+3