agentrial is an open-source Python framework that runs your AI agent N times on each test case and gives you confidence intervals instead of pass/fail.
Your agent passed 10/10 runs? The 95% Wilson confidence interval says the true reliability could be as low as 72%. agentrial catches that.
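To see where the 72% figure comes from, here is a minimal sketch of the Wilson score interval in plain Python. It is not agentrial's own code: the function name `wilson_interval` and the default `z = 1.96` (a 95% confidence level) are assumptions made for illustration.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Return the (lower, upper) Wilson score interval for a binomial pass rate.

    Hypothetical helper for illustration; z = 1.96 corresponds to 95% confidence.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    # The interval is centered on a pass rate shrunk toward 0.5, not on p_hat itself.
    center = (p_hat + z2 / (2 * trials)) / denom
    half_width = (z / denom) * math.sqrt(
        p_hat * (1 - p_hat) / trials + z2 / (4 * trials * trials)
    )
    return max(0.0, center - half_width), min(1.0, center + half_width)


low, high = wilson_interval(10, 10)
print(f"10/10 passes: lower bound {low:.1%}")  # lower bound 72.2%
```

With a perfect 10/10 record the lower bound reduces to n / (n + z²) = 10 / 13.84, which is why ten clean runs still leave room for a true pass rate near 72%.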
- Multi-trial evaluation — Wilson confidence intervals on pass rates, bootstrap resampling on cost/latency
- Failure attribution — Fisher exact test pinpoints which step in your pipeline breaks
- Regression detection — compare versions in CI/CD, exit code 1 blocks the PR on significant drops
- Framework-agnostic — adapters for LangGraph, CrewAI, AutoGen, Pydantic AI, OpenAI Agents SDK, smolagents, or any Python callable
- Local-first — no accounts, no telemetry, no cloud. MIT license.
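The regression gate in the list above can be sketched with a one-sided Fisher exact test on the pass counts of two versions. This is an illustration under stated assumptions, not agentrial's implementation: the listing names the Fisher exact test for failure attribution and does not say which test the regression check uses, and the names `fisher_one_sided` and `regression_exit_code`, the `alpha = 0.05` threshold, and the counts are all hypothetical.

```python
from math import comb


def fisher_one_sided(pass_old: int, fail_old: int, pass_new: int, fail_new: int) -> float:
    """One-sided Fisher exact p-value for 'the new version passes less often'.

    With the table margins fixed, the number of passes in the new version
    follows a hypergeometric distribution; the p-value is the probability
    of seeing pass_new or fewer passes.
    """
    n_old = pass_old + fail_old
    n_new = pass_new + fail_new
    total_pass = pass_old + pass_new
    total = n_old + n_new
    smallest_k = max(0, total_pass - n_old)  # fewest passes the new version could have
    tail = sum(
        comb(total_pass, k) * comb(total - total_pass, n_new - k)
        for k in range(smallest_k, pass_new + 1)
    )
    return tail / comb(total, n_new)


def regression_exit_code(pass_old: int, fail_old: int, pass_new: int, fail_new: int,
                         alpha: float = 0.05) -> int:
    """Return 1 (block the PR) on a significant drop in pass rate, else 0."""
    p_value = fisher_one_sided(pass_old, fail_old, pass_new, fail_new)
    return 1 if p_value < alpha else 0


# Made-up counts: baseline 48/50 passes, candidate 30/50 passes.
print(regression_exit_code(48, 2, 30, 20))
```

In a CI job, the returned value would be passed to `sys.exit` so that a significant drop fails the pipeline, while ordinary run-to-run noise does not.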
pip install agentrial
About this launch
agentrial by Alessandro Potenza will be launched on June 29th, 2027.


