AI Engineer
hard
ai-engineer-evaluation
How do you evaluate LLM applications beyond simple accuracy?
Answer
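LLM evaluation is multi-dimensional: a response can be accurate and still be slow, expensive, unsafe, or unhelpful.
Measure each dimension separately: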
- Factuality/grounding
- Relevance and completeness
- Toxicity/safety
- Latency and cost
- User satisfaction
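One way to keep these dimensions from collapsing into a single number is to record them per response and aggregate each one on its own. The sketch below is only an illustration: `EvalRecord`, `summarize`, and every field name are hypothetical, and how each score is produced (reviewer, rubric, classifier) is left open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvalRecord:
    """Scores for one model response. All field names are illustrative."""
    grounded: bool                     # answer is supported by the given sources
    relevance: float                   # 0.0 to 1.0, e.g. from a rubric or reviewer
    toxic: bool                        # flagged by a safety check
    latency_ms: float                  # end-to-end response time
    cost_usd: float                    # spend for this request
    user_rating: Optional[int] = None  # e.g. 1 to 5, when the user gave feedback


def summarize(records: list[EvalRecord]) -> dict[str, Optional[float]]:
    """Aggregate each dimension separately instead of blending them."""
    if not records:
        raise ValueError("need at least one record")
    n = len(records)
    ratings = [r.user_rating for r in records if r.user_rating is not None]
    return {
        "grounded_rate": sum(r.grounded for r in records) / n,
        "mean_relevance": sum(r.relevance for r in records) / n,
        "toxic_rate": sum(r.toxic for r in records) / n,
        "mean_latency_ms": sum(r.latency_ms for r in records) / n,
        "total_cost_usd": sum(r.cost_usd for r in records),
        # None when no user feedback was collected at all
        "mean_user_rating": sum(ratings) / len(ratings) if ratings else None,
    }
```

Reporting the dimensions side by side makes trade-offs visible, for example a prompt change that raises relevance while also raising latency.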
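Use golden sets (curated inputs with known-good expected outputs), human review, and automated checks. Rerun the same evaluation whenever a prompt or model changes, and compare against the previous results to catch regressions.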
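A golden-set regression check can be sketched in a few lines. This is a minimal illustration under stated assumptions: `pass_rate`, `has_regressed`, the substring-based pass criterion, and the 0.02 tolerance are all hypothetical choices, and `generate` stands in for whatever function calls your model.

```python
from typing import Callable

# One golden case: (prompt, facts the answer is expected to mention).
GoldenCase = tuple[str, list[str]]


def pass_rate(generate: Callable[[str], str], golden: list[GoldenCase]) -> float:
    """Fraction of golden cases whose answer mentions every required fact."""
    if not golden:
        raise ValueError("golden set is empty")
    passed = 0
    for prompt, required in golden:
        answer = generate(prompt).lower()
        # Simplistic check: case-insensitive substring match per fact.
        if all(fact.lower() in answer for fact in required):
            passed += 1
    return passed / len(golden)


def has_regressed(baseline: float, candidate: float, tolerance: float = 0.02) -> bool:
    """True when the candidate scores below the baseline by more than tolerance."""
    return candidate < baseline - tolerance
```

In use, `pass_rate` would be run once with the current prompt or model and once with the candidate, and the change blocked when `has_regressed` returns true. Substring matching is a crude automated check; cases it cannot judge are where human review fits in.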
Related Topics
Evaluation, LLM, Quality