🐢 Open-Source Evaluation & Testing library for LLM Agents
Updated Jan 28, 2026 - Python
Agentic testing for agentic codebases
Deliver safe & effective language models
MIT-licensed framework for testing LLMs, RAG pipelines, and chatbots. Configurable via YAML and integrable into CI pipelines for automated testing.
GPT4Go: AI-Powered Test Case Generation for Golang 🧪
52-week journey from QA/SDET to GenAI Testing - learning in public with weekly mini-projects, code, and honest documentation of struggles and wins.
A Python library for verifying code properties using natural language assertions.
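👁 Zero-code, zero-annotation automated testing tool for CV AI 🚀 Eliminates large amounts of manual bounding-box drawing and labeling, and lets you quickly run automated tests, with no code, on computer vision AI image-recognition algorithms: pedestrian detection, animal and plant classification, face recognition, OCR license plate recognition, rotation correction, dance pose estimation, image matting and segmentation, and more. It can also download test reports and export training and test datasets in one click.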
Open-source framework for stress-testing LLMs and conversational AI. Identify hallucinations, policy violations, and edge cases with scalable, realistic simulations. Join the Discord: https://discord.gg/ssd4S37WNW
Turn plain English into Robot Framework files with AI. No dependencies, no hassle — just validated, ready-to-run tests
Automated testing for Model Context Protocol servers. Ship MCP Servers with confidence.
Exercises for the book "Basiswissen KI-Testen"
🚀 First multimodal AI-powered visual testing plugin for Claude Code. AI that can SEE your UI! 10x faster frontend development with closed-loop testing, browser automation, and Claude 4.5 Sonnet vision.
Prompture is an API-first library for requesting structured JSON output from LLMs (or any structure), validating it against a schema, and running comparative tests between models.
4-stage evaluation framework for testing Claude Code plugin component triggering. Validates skills, agents, and commands activate correctly via programmatic detection and LLM judgment.
Agent testing library that uses an agent to test your agent, in Go.
A professional collection of AI prompts for QA (Quality Assurance) professionals, designed to help test engineers and QA teams work more efficiently throughout the software testing lifecycle.
Evaluate - The Robust LLM Testing Framework 🦀
A CLI for testing your UI. Easy
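AI-powered test case generation system built on a RAG knowledge base deployed with DeepSeek + Bailian (百炼). Includes requirements analysis, test case generation, an intelligent operations assistant, product guides, and more.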