Insights
Writing on AI testing, evaluation methodology, and quality engineering practice.
Article
The State of AI Testing in 2026
A survey of current practices, tools, and challenges in testing AI-powered software systems.
Article
Why LLM Evaluation Is Different
Traditional software testing paradigms fall short when evaluating large language models. Here is why.
Article
Building Reliable AI Agent Pipelines
Engineering patterns for testing and validating multi-step agent workflows in production.
Article
Metrics That Matter in AI Quality
Beyond accuracy: a framework for choosing evaluation metrics that align with real-world requirements.