DeepEval

Open-source LLM evaluation framework, similar to Pytest but specialized for unit testing LLM outputs, with RAG evaluation metrics and CI/CD integration.

evaluation testing llm open-source pytest

Similar AI Evaluation Tools

LMArena

Crowdsourced AI model benchmarking and evaluation platform for comparing LLMs side-by-side with community-driven leaderboards.

AI Evaluation Tools free

Maxim AI

End-to-end evaluation and observability platform for AI agents, featuring simulation testing, automated scoring, regression checks, and production monitoring.

AI Evaluation Tools freemium

RAGAS

Open-source framework for evaluating RAG pipelines and AI applications. Provides metrics for faithfulness, context recall, factual correctness, and answer relevancy.

AI Evaluation Tools open-source

Humanloop

Enterprise-grade AI evaluation platform with prompt management, LLM observability, and human-in-the-loop feedback workflows. Acquired by Anthropic.

AI Evaluation Tools freemium

Confident AI

Enterprise LLM evaluation and monitoring platform by the creators of DeepEval. Provides dashboards, regression testing, and production monitoring for AI applications.

AI Evaluation Tools freemium

Satisfi Labs

Agent Performance Console that brings executive-level accountability to AI workforces. Provides ROI dashboards, conversational analytics, revenue tracking, and operational metrics for enterprises deploying AI agents at scale.

AI Evaluation Tools

Category: AI Evaluation Tools

Pricing: Open Source

Website: github.com/confident-ai/deepeval
