Post by AVP

Observability tools were built for systems that fail in predictable, syntactic ways. AI systems do not always offer that courtesy. For more than a decade, observability companies built their businesses around logs, metrics and traces. For cloud-native software written and operated by humans, the model worked. AI workloads strain that model in ways the original architecture was not designed to handle. Failure modes in LLM-based systems are becoming harder to detect with traditional signals, and telemetry volumes from AI applications are running significantly higher than those from traditional apps. Olivia Tanzman and Lizzy M. have spent the past few months speaking with operators, founders and incumbents. Read their full assessment in the comments.