Xiaojun Wang

AI Testing & AI Quality Engineering

Focused on

AI Testing
LLM Evaluation
AI Agent Testing
Intelligent System Quality Engineering

Core Focus

Research areas and engineering disciplines I work in.

AI Testing

Systematic approaches to testing AI-powered systems, from model behavior to pipeline integrity.

AI Quality Engineering

Engineering reliable quality frameworks for AI systems across the development lifecycle.

LLM Testing

Evaluating large language model outputs, reasoning quality, safety, and consistency at scale.

AI Agent Testing

Measuring agentic system performance, tool-use accuracy, and multi-step task completion.

AI Evaluation

Building comprehensive evaluation frameworks: metrics, benchmarks, and methodologies for assessing AI system quality.

AI Quality Platform

Designing and building integrated platforms for AI quality management, from test orchestration to results analysis and reporting.

Intelligent System Testing

Testing methodologies for systems that learn, adapt, and operate under uncertainty.

AI Workflow Quality

Ensuring correctness and reliability of AI-orchestrated workflows and decision pipelines.

AI Reliability

Building robust, reproducible, and trustworthy AI systems for real-world deployment.

Projects

Practice and research in AI testing and quality engineering.

AI Testing Platform

Active

A unified platform for designing, executing, and analyzing AI model tests across different providers and modalities.

Platform, Testing, Evaluation

Evaluation Engine

Active

A modular evaluation framework supporting custom metrics, comparative analysis, and reproducible AI benchmarking.

Evaluation, Benchmarking, Framework

Workflow Orchestration

Active

Quality assurance tooling for AI-driven workflow pipelines: testing each node, validating data flow, and monitoring drift.

Workflow, Quality, Orchestration

AI Evaluation Examples

Active

Public engineering examples for AI evaluation, LLM testing, and AI quality workflow design.

AI Evaluation, LLM Testing, AI Quality Engineering

Insights

Writing on AI testing, evaluation, and quality engineering.

What is AI Testing?

AI Quality Engineering vs Traditional Testing

Why AI Quality Platform Matters

Why LLM Evaluation Is Different

Building Reliable AI Agent Pipelines

Metrics That Matter in AI Quality