Tag
#llm-ops
2 posts tagged llm-ops.
- ops
Closing the Eval-Prod Gap: Online Evaluation as Observability
Offline eval scores are green and production is worse. The gap is not a measurement error; it is structural. Here is how to instrument online evaluation so production quality becomes observable.
- ops
End-to-End Tracing for LLM Applications: What Belongs in a Span
Production LLM apps span multiple model calls, tool invocations, retrieval steps, and retries. A complete trace makes them debuggable; a sparse one leaves you guessing.