{"id":4229,"date":"2026-06-04T08:12:41","date_gmt":"2026-06-04T08:12:41","guid":{"rendered":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/2026\/06\/04\/agentic-observability-is-not-a-chatbot-over-telemetry\/"},"modified":"2026-06-04T08:12:41","modified_gmt":"2026-06-04T08:12:41","slug":"agentic-observability-is-not-a-chatbot-over-telemetry","status":"publish","type":"post","link":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/2026\/06\/04\/agentic-observability-is-not-a-chatbot-over-telemetry\/","title":{"rendered":"Agentic Observability is Not a Chatbot Over Telemetry\u00a0"},"content":{"rendered":"<div><img data-opt-id=748832225  fetchpriority=\"high\" decoding=\"async\" width=\"770\" height=\"330\" src=\"https:\/\/devops.com\/wp-content\/uploads\/2021\/04\/Observability-DeepFactor.jpg\" class=\"attachment-large size-large wp-post-image\" alt=\"\" \/><\/div>\n<p><img data-opt-id=23095151  fetchpriority=\"high\" decoding=\"async\" width=\"150\" height=\"150\" src=\"https:\/\/devops.com\/wp-content\/uploads\/2021\/04\/Observability-DeepFactor-150x150.jpg\" class=\"attachment-thumbnail size-thumbnail wp-post-image\" alt=\"\" \/><\/p>\n<p><span data-contrast=\"auto\">The first wave of AI in observability is easy to misread.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">The obvious use case is incident investigation: Ask a question, get a summary, identify a suspicious deployment, find the slow endpoint, maybe save an engineer a few minutes during an incident.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">That\u2019s useful, but it is not the real shift.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Agentic observability is not just 
a better root-cause analysis (RCA) assistant. It is a different way to interact with the observability system itself.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">For years, observability has been built around human-operated workflows. Engineers write queries, inspect dashboards, compare timelines, jump between logs, metrics, traces, Kubernetes data, cloud metadata, and deployment events, then manually decide what to do next.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Sometimes the next step is another query. Sometimes it\u2019s a monitor, dashboard, pipeline change, ticket, runbook update, or code fix. The system exposes data. The engineer connects the dots. That model is starting to show its limits.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Modern systems <a href=\"https:\/\/devops.com\/opentelemetry-graduation-sets-stage-for-ai-observability\/\" target=\"_blank\" rel=\"noopener\">generate more telemetry<\/a> than teams can reasonably navigate manually. Architectures are more dynamic. Ownership is more distributed. AI-generated code and agentic applications are creating production behavior that is harder to predict from source code alone.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">The issue is simple: The old interface is too slow for the amount of context now required.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Not because dashboards are bad or query languages are obsolete. 
But because the default workflow still assumes a human has the time, context, and memory to manually reconstruct the operational story.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Agentic observability changes that interface. The agent shouldn\u2019t be treated as a separate product vertical. It\u2019s a mode of interaction across the observability plane. And RCA is just one use case.\u00a0<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">A useful observability agent should help with five jobs.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><b><span data-contrast=\"auto\">Job No. 1: Understanding system behavior.\u00a0<\/span><\/b><span data-contrast=\"auto\">This is the familiar category: investigate incidents, compare deployments, map endpoints, dissect latency, inspect service behavior, find regressions, and build timelines across signals. A useful observability agent cannot just give an answer. It needs to show how it got there: What data it used, what it ruled out, what assumptions it made, and where uncertainty remains.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">This matters because production systems don\u2019t reward plausible guesses. Engineers need something they can inspect, challenge, and use under pressure.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><b><span data-contrast=\"auto\">Job No. 2: Controlling the ingestion pipeline.\u00a0<\/span><\/b><span data-contrast=\"auto\">Most\u00a0observability\u00a0environments are messy. Logs are inconsistent. Metrics are duplicated. 
Traces are partial. Attributes mean different things across teams. Some data is valuable, and some is noise. Much of it is expensive to move, store, and query.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">An agent should help teams shape that data by standardizing telemetry, converting logs to metrics where appropriate, aggregating noisy streams, suggesting source-level drops, reducing unnecessary cardinality, and moving the system toward a more canonical observability model.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">That\u2019s not only about cost, although cost is part of it. It is about quality. Bad telemetry structure compounds everywhere. It makes dashboards worse, alerts noisier, investigations slower, and AI less useful, because the agent inherits the same messy data model humans have been fighting for years.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><b><span data-contrast=\"auto\">Job No. 3: Managing observability assets.<\/span><\/b><span data-contrast=\"auto\">\u00a0Dashboards, monitors, saved queries, and runbooks are still mostly created and maintained manually. In many companies, they drift quickly. A dashboard gets built during a migration and stays forever. A monitor gets copied between services without context. 
A runbook is accurate for six months, then slowly becomes folklore.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Agentic observability should make these assets easier to create and easier to keep aligned with reality.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">If an investigation reveals the signals that actually mattered, the system should help turn them into a monitor. If a team keeps asking the same questions about a service, the system should help build the dashboard. If an alert fires without enough context, the system should help enrich it.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><b><span data-contrast=\"auto\">Job No. 4: Personalization.<\/span><\/b><span data-contrast=\"auto\">\u00a0Every company has context that doesn\u2019t exist in generic telemetry: which services matter most, which dependencies are fragile, which names are historical accidents, which flows represent real customer pain, which signals are noisy but harmless, and which signals are quiet but dangerous.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">An observability agent needs a way to use that context.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">This doesn\u2019t mean pretending the agent understands the business in some vague, human way. 
It means giving it concrete instructions, service knowledge, ownership context, and investigation patterns so it can reason with the same operating assumptions the team already uses.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Without that context, the agent can still help. But it will stay generic.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><b><span data-contrast=\"auto\">Job No. 5: Delegation.<\/span><\/b><span data-contrast=\"auto\">\u00a0The observability platform isn\u2019t always the final stop. If the agent identifies a likely code issue, the context may need to be moved into the IDE. If it finds a recurring failure mode, it may need to create a ticket or update a runbook. If a remediation was deployed, it should help verify whether the system actually recovered.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">This is where MCP and agent-to-agent workflows matter.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">MCP lets external agents bring observability data into their own workflows. That matters for incident response, but also for code generation, code review, automation, release validation, and security investigation.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<h3><b><span data-contrast=\"auto\">Observability\u2019s New Role in the AI Stack<\/span><\/b><\/h3>\n<p><span data-contrast=\"auto\">MCP alone isn\u2019t enough. Giving an external agent access to telemetry doesn\u2019t guarantee good reasoning. 
The model, prompt, tools, context and workflow all matter.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Embedded observability agents solve a different problem. They can be designed around operational workflows, native telemetry structures, and product-specific context. But they also carry higher expectations. If the agent lives inside the observability platform, users will expect it to be reliable, explainable, secure, and useful when the pressure is real.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">The likely future is a combination: observability-native agents for deep operational workflows, MCP for external access, and agent-to-agent handoffs for broader automation. As software becomes more agentic, observability becomes more important.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Code tells us what the system was intended to do. Production behavior tells us what it actually did. The more code is generated, modified, and operated by agents, the more important the behavioral record becomes.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">Agentic observability isn\u2019t about removing engineers from the loop. It is about making the loop faster, better informed, and easier to operate at the scale modern systems require.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><span data-contrast=\"auto\">The real shift is not AI as a feature inside observability. 
It\u2019s a new interface for operating software.<\/span><span data-ccp-props='{\"201341983\":0,\"335559738\":240,\"335559739\":240,\"335559740\":240}'>\u00a0<\/span><\/p>\n<p><a href=\"https:\/\/devops.com\/agentic-observability-is-not-a-chatbot-over-telemetry\/\" target=\"_blank\" class=\"feedzy-rss-link-icon\">Read More<\/a><\/p>","protected":false},"excerpt":{"rendered":"<p>The first wave of AI in observability is easy to misread.\u00a0 The obvious use case is incident investigation: Ask a [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":4230,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[5],"tags":[],"class_list":["post-4229","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-devops"],"_links":{"self":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts\/4229","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/comments?post=4229"}],"version-history":[{"count":0,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/posts\/4229\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/media\/4230"}],"wp:attachment":[{"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/media?parent=4229"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/categories?post=4229"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rssfeedtelegrambot.bnaya.co.il\/index.php\/wp-json\/wp\/v2\/tags?post=4229"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}