Evaluations
Agent quality scoring via DeepEval + Ragas (14+ metrics)
order-processor v2.1 · 2 hrs ago · order-test-50
47/50 passed (94%)
Accuracy: 94%
Relevance: 91%
Faithfulness: 96%
Latency Score: 88%
support-bot v3.0.1 · 1 day ago · support-qa-100
92/100 passed (92%)
Accuracy: 92%
Relevance: 88%
Faithfulness: 94%
Latency Score: 95%
doc-analyzer v1.5 · 3 days ago · invoice-extract-30
29/30 passed (97%)
Accuracy: 97%
Relevance: 93%
Faithfulness: 98%
Latency Score: 72%
price-optimizer v1.2 · 5 days ago · pricing-test-20
16/20 passed (80%)
Accuracy: 80%
Relevance: 75%
Faithfulness: 85%
Latency Score: 65%
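
The pass percentages shown for each run are consistent with dividing passed cases by total cases and rounding to the nearest whole percent. A minimal Python sketch of that arithmetic follows. The `runs` mapping and the `pass_rate` helper are hypothetical names introduced for illustration, not part of DeepEval or Ragas, and the rounding rule is an assumption inferred from the displayed numbers.

```python
# Pass counts copied from the run summaries above: (passed, total).
# The dictionary layout is illustrative, not an export format of any tool.
runs = {
    "order-test-50": (47, 50),
    "support-qa-100": (92, 100),
    "invoice-extract-30": (29, 30),
    "pricing-test-20": (16, 20),
}


def pass_rate(passed: int, total: int) -> int:
    """Return the pass rate as a whole-number percentage (assumed rounding)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * passed / total)


rates = {name: pass_rate(passed, total) for name, (passed, total) in runs.items()}

for name, rate in rates.items():
    print(f"{name}: {rate}%")
```

Under this assumption, 29/30 (about 96.7%) displays as 97%, which matches the invoice-extract-30 run.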