AI Agent Testing in CI/CD

When you test AI agents in CI/CD, the practical question is not just "does the output look okay?" but "did my agent's behavior change in a way that should block this merge?" EvalView is built for that workflow.

What EvalView tests in CI

Recommended workflow

evalview generate --agent http://localhost:8000
evalview snapshot tests/generated --approve-generated
evalview check --json --fail-on REGRESSION
evalview ci comment --results tests/generated/generated.report.json
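
The last two commands above could be wrapped in a small gate function for a CI step. This is a minimal sketch, not EvalView's documented integration: it assumes the baseline was already created with the generate and snapshot commands, and that `evalview check --fail-on REGRESSION` exits non-zero when it finds a regression. The function name `run_gate` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of a CI gate built only from the commands listed above.
# Assumption: `evalview check --fail-on REGRESSION` exits non-zero on a
# regression. Confirm this against the EvalView documentation.

run_gate() {
  local status=0

  # Compare current agent behavior against the approved snapshot and
  # remember the exit code instead of stopping immediately.
  evalview check --json --fail-on REGRESSION || status=$?

  # Post the results comment even if the check failed, so reviewers can
  # see why the job is red. A failure to comment does not change the result.
  evalview ci comment --results tests/generated/generated.report.json || true

  # Propagate the check's exit code so the CI job can block the merge.
  return "$status"
}
```

Calling `run_gate` as the final command of the CI step makes the step fail whenever the check fails, while the results comment is still attempted. Capturing the exit code rather than aborting on the first error is a design choice here; a pipeline that should skip the comment on failure can simply run the two commands in sequence.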

Works especially well for
