Datadog, a New York-based company specializing in cloud observability for enterprise applications and infrastructure, has unveiled new capabilities that expand its core platform.
During its annual DASH conference, Datadog introduced “Bits AI,” a generative AI assistant designed to help engineers resolve application issues in real time. The company also launched an end-to-end solution for monitoring large language models (LLMs).
The primary goal of these new offerings, particularly the AI assistant, is to simplify observability for enterprise teams. However, they are currently in beta testing and not yet generally available. Datadog is working with a limited number of customers to fine-tune the capabilities before making them accessible to a wider audience.
The new Bits AI addresses the challenges teams face in monitoring applications and infrastructure. Responding to natural language commands, it provides end-to-end incident management support. By learning from customers’ data, including logs, metrics, traces, and real-user transactions, as well as institutional knowledge from sources such as Confluence pages and Slack conversations, Bits AI can quickly offer insights and guidance for troubleshooting and remediation, reducing the time required to resolve problems.
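To make that retrieval-then-summarize pattern concrete, a minimal sketch might look like the following Python snippet. The data, scoring function, and prompt are illustrative assumptions for this article, not Datadog’s implementation or API.

```python
# Hypothetical sketch of combining telemetry and institutional knowledge
# to answer a natural-language incident question. Not Datadog's code.

TELEMETRY = [
    ("logs", "checkout-service: 502 errors spiked at 14:03 UTC"),
    ("traces", "p99 latency on /payments rose from 180ms to 2.4s"),
]
KNOWLEDGE = [
    ("confluence", "Runbook: 502s on checkout usually follow a bad deploy of payments-api"),
    ("slack", "#incidents: rolling back payments-api v1.8 fixed a similar spike in May"),
]

def retrieve(question: str, corpus, limit: int = 3):
    """Naive keyword-overlap scoring; a production system would use embeddings."""
    terms = set(question.lower().split())
    scored = sorted(corpus, key=lambda item: -len(terms & set(item[1].lower().split())))
    return scored[:limit]

def build_prompt(question: str) -> str:
    """Assemble retrieved context into a prompt for an LLM to summarize."""
    evidence = retrieve(question, TELEMETRY) + retrieve(question, KNOWLEDGE)
    context = "\n".join(f"[{source}] {text}" for source, text in evidence)
    return (
        "You are an incident-response assistant.\n"
        f"Question: {question}\n"
        f"Context:\n{context}\n"
        "Suggest likely causes and remediation steps, citing the context."
    )

print(build_prompt("Why are checkout 502 errors spiking?"))
```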
Michael Gerstenhaber, VP of product at Datadog, explained that Bits AI combines statistical analysis and machine learning with LLMs to analyze data, predict system behavior, and generate responses; the assistant is powered by OpenAI’s LLMs. It can coordinate incident response by assembling on-call teams in Slack and providing automated status updates to stakeholders. If the problem is at the code level, it offers a concise explanation of the error and a suggested code fix that can be applied with a few clicks, along with a unit test to validate the solution.
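For a sense of what a suggested fix paired with a generated unit test could look like, consider the invented example below. The bug, function name, and test are purely illustrative and are not drawn from Datadog’s product.

```python
# Illustrative example of a code-level suggestion: a concise fix plus a
# unit test that validates it. Entirely hypothetical.

# Before: crashes with ZeroDivisionError when no requests were recorded.
# def error_rate(errors, requests):
#     return errors / requests

def error_rate(errors: int, requests: int) -> float:
    """Suggested fix: guard against division by zero."""
    if requests == 0:
        return 0.0
    return errors / requests

# Suggested unit test to validate the fix (pytest style).
def test_error_rate_handles_zero_requests():
    assert error_rate(0, 0) == 0.0
    assert error_rate(5, 100) == 0.05
```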
Datadog’s competitor, New Relic, has also introduced a similar AI assistant called Grok, which offers support for issue identification and software problem-solving through a chat interface.
In addition to Bits AI, Datadog expanded its platform with an end-to-end solution for LLM observability. This tool aggregates data from various sources, including AI applications, models, and integrations, to help engineers quickly detect and resolve problems related to LLMs. The observability tool monitors and alerts users about model usage, costs, and API performance. It also analyzes model behavior to detect instances of hallucinations and drift based on different data characteristics, such as prompt and response lengths, API latencies, and token counts.
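A simple way to picture the kind of signal such a tool computes is to compare recent prompt and response token counts and API latencies against a baseline window and flag large shifts as possible drift. The sketch below is an assumption-laden illustration; the thresholds and field names are not Datadog’s.

```python
# Hypothetical drift check over LLM call metrics. Field names and the 50%
# threshold are assumptions for illustration, not Datadog's implementation.

from statistics import mean

def drift_alerts(baseline: list, recent: list, threshold: float = 0.5) -> list:
    """Flag metrics whose recent mean deviates more than `threshold` from baseline."""
    alerts = []
    for metric in ("prompt_tokens", "response_tokens", "latency_ms"):
        base = mean(call[metric] for call in baseline)
        curr = mean(call[metric] for call in recent)
        if base and abs(curr - base) / base > threshold:
            alerts.append(f"{metric}: baseline {base:.1f} -> recent {curr:.1f}")
    return alerts

baseline_calls = [{"prompt_tokens": 120, "response_tokens": 300, "latency_ms": 800}] * 50
recent_calls = [{"prompt_tokens": 125, "response_tokens": 900, "latency_ms": 2400}] * 50

print(drift_alerts(baseline_calls, recent_calls))
# ['response_tokens: baseline 300.0 -> recent 900.0',
#  'latency_ms: baseline 800.0 -> recent 2400.0']
```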
Datadog’s LLM Observability offering brings together two traditionally separate teams, application developers and ML engineers, to collaborate on operational and model issues such as latency delays, cost spikes, and performance degradations.
While Datadog’s innovations are noteworthy, it faces competition from New Relic and Arize AI, both of which have launched integrations and tools aimed at simplifying the running and maintenance of LLMs.
Given the rising popularity of LLMs within enterprises for accelerating key business functions, monitoring solutions like Datadog’s are expected to be in high demand. Companies are increasingly adopting LLM tools, particularly from OpenAI, to optimize customer service and other critical processes involving data queries.