Deep dive into AI agent observability with Microsoft Agent Framework - tracing with open-source tools Aspire Dashboard and Langfuse

The first part covered the basic architecture and the reasons why I like to use the OpenTelemetry collector, which let us focus last time on metrics and their use in Azure Monitor for Prometheus and Azure Managed Grafana. Today we dive into tracing, the most important discipline for AI agent observability, and we start with open-source tools: Aspire Dashboard for a quick developer view and Langfuse.

Open-source tracing for AI agents

OpenTelemetry is eating the observability world and I am a huge fan. After many years of combining proprietary tools and approaches with a zoo of alternative open-source SDKs, OpenTelemetry arrived with the ambition to unify metrics, logs, and tracing under a single protocol, SDK, and API. There are numerous backends that provide visualization and data persistence, but for quick developer assessments they tend to be cumbersome, slow, expensive, or resource-hungry. That is why the .NET team developed Aspire Dashboard, a simple solution running in a single lightweight container where you can visualize OpenTelemetry data quickly and nearly in real time.

Here we can see tracing from a multi-tenant Magentic-style solution in Microsoft Agent Framework.

In my case I also enabled logging of individual messages and responses.

I also collect custom attributes, such as session ID, logged-in user, user roles, department, and so on.

Custom attributes in Aspire Dashboard spans
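To make the idea of custom attributes concrete, here is a minimal sketch in Python. It is not the code from my solution: the attribute keys (`session.id`, `user.id`, `user.roles`, `app.department`) and the helper names are assumptions chosen for illustration. The only thing it relies on is that the span object exposes a `set_attribute(key, value)` method, as OpenTelemetry spans do, and a small in-memory stand-in is used so the example runs without any SDK installed.

```python
from typing import Any, Iterable, Protocol


class SpanLike(Protocol):
    """Anything that offers set_attribute(key, value), like an OpenTelemetry span."""

    def set_attribute(self, key: str, value: Any) -> None: ...


def business_attributes(
    session_id: str, user_id: str, roles: Iterable[str], department: str
) -> dict[str, Any]:
    """Build the custom attributes to attach to a span (key names are assumptions)."""
    return {
        "session.id": session_id,
        "user.id": user_id,
        "user.roles": list(roles),
        "app.department": department,  # hypothetical application-specific key
    }


def annotate(span: SpanLike, attributes: dict[str, Any]) -> None:
    """Copy every attribute onto the span."""
    for key, value in attributes.items():
        span.set_attribute(key, value)


class RecordingSpan:
    """In-memory stand-in for a real span, used only for this illustration."""

    def __init__(self) -> None:
        self.attributes: dict[str, Any] = {}

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


if __name__ == "__main__":
    span = RecordingSpan()
    annotate(span, business_attributes("s-1", "alice", ["admin"], "finance"))
    print(span.attributes)
```

With a real tracer, the same `annotate` call would be made on the currently active span instead of `RecordingSpan`.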

As you can verify in the first part of this series, my OpenTelemetry collector is configured to filter certain fields and hash others, thereby masking the original information. I have created another Aspire instance to which the OTEL collector sends the filtered and anonymized data. Notice that there is no conversation content at all and the user ID is masked, yet I still have the necessary technical information such as timings and token consumption.

Anonymized Aspire view without conversation content
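The effect of that filtering and hashing can be sketched in a few lines of Python. This is only an illustration of the transformation, not the collector configuration itself (that lives in the collector and was shown in the first part). The attribute keys and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
from typing import Any

# Assumed names of attributes that carry conversation content and must be dropped.
CONTENT_KEYS = {"gen_ai.input.messages", "gen_ai.output.messages"}
# Assumed names of attributes that identify a person and must be hashed.
HASHED_KEYS = {"user.id"}


def sanitize(attributes: dict[str, Any]) -> dict[str, Any]:
    """Return a copy without content attributes and with identifiers hashed."""
    clean: dict[str, Any] = {}
    for key, value in attributes.items():
        if key in CONTENT_KEYS:
            continue  # conversation content never reaches the sanitized backend
        if key in HASHED_KEYS:
            clean[key] = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
        else:
            clean[key] = value  # timings, token counts, and similar stay intact
    return clean


if __name__ == "__main__":
    raw = {
        "user.id": "alice",
        "gen_ai.input.messages": "What is my salary?",
        "gen_ai.usage.input_tokens": 12,
    }
    print(sanitize(raw))
```

Because the same input always produces the same hash, traces belonging to one user can still be correlated in the sanitized view without revealing who that user is.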

This is what a tool call looks like.

For quick monitoring data snapshots, Aspire Dashboard is excellent - simple, tiny, fast.

Looking for a tracing tool that is open source and built specifically for AI scenarios? Langfuse is very popular, but unfortunately even it is not fully open in terms of project governance (it is not under the CNCF, the Apache Software Foundation, or the Linux Foundation); it is closer to an open-core project with an MIT-licensed core. Nevertheless, it is a very good solution, so let's take a look.

Right from the home screen you can see that Langfuse is not only about observability but also touches on the area of evaluation, which we will cover later in this series.

Langfuse navigation to tracing and evaluations

Langfuse observability and datasets view

This is what a specific trace looks like. The presentation differs from Aspire, but the underlying information is essentially the same.

However, some values are not present in the data directly but are derived from it. The conversion of token consumption into monetary cost is excellent.
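The arithmetic behind such a derived cost is simple. The sketch below is my own illustration, not Langfuse's implementation: the model name and the prices per million tokens are hypothetical, and real price lists differ per model and change over time.

```python
from decimal import Decimal

# Hypothetical prices in USD per one million tokens.
PRICES = {
    "example-model": {"input": Decimal("2.50"), "output": Decimal("10.00")},
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def trace_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Derive the cost of one call from its token counts and the price table."""
    price = PRICES[model]
    total = price["input"] * input_tokens + price["output"] * output_tokens
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # 1200 input tokens and 300 output tokens on the hypothetical model.
    print(trace_cost("example-model", 1200, 300))
```

`Decimal` is used instead of floats so that small per-call amounts add up without rounding noise when summed over many traces.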

Of course we can again nicely see the conversation itself.

Langfuse directly parses some well-known parameters, for example user ID. This allows it to immediately provide an overview of users and their token consumption.

From there you can drill down and see individual traces, sessions, and so on.
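A per-user overview like this boils down to grouping spans by the user identifier and summing their token counts. The following minimal sketch shows the idea, assuming each span is represented as a plain dictionary of attributes and that the keys are `user.id`, `gen_ai.usage.input_tokens`, and `gen_ai.usage.output_tokens`. It does not reflect how Langfuse stores or queries its data.

```python
from collections import defaultdict
from typing import Any, Iterable


def tokens_per_user(spans: Iterable[dict[str, Any]]) -> dict[str, int]:
    """Sum input and output tokens for each user found in the span attributes."""
    totals: dict[str, int] = defaultdict(int)
    for attributes in spans:
        user = attributes.get("user.id", "unknown")  # spans without a user are grouped
        totals[user] += attributes.get("gen_ai.usage.input_tokens", 0)
        totals[user] += attributes.get("gen_ai.usage.output_tokens", 0)
    return dict(totals)


if __name__ == "__main__":
    spans = [
        {"user.id": "alice", "gen_ai.usage.input_tokens": 100, "gen_ai.usage.output_tokens": 20},
        {"user.id": "bob", "gen_ai.usage.input_tokens": 50},
        {"user.id": "alice", "gen_ai.usage.output_tokens": 5},
    ]
    print(tokens_per_user(spans))
```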

Langfuse also goes into evaluations, but we will discuss that later. You can take a captured conversation, add it to some dataset, annotate it, try it in a simulator, and so on.

For me, Langfuse is the best AI-focused tracing tool in the open-source category. Even though it is not fully open in terms of governance, it is my first choice when I have to stay in the self-managed world. We will discuss its evaluation capabilities later; it has its place there too, even though projects like DeepEval are strong competition, albeit with a somewhat different focus.

Today we explored the open-source options: Aspire for quick developer snapshots and Langfuse as a specialized open-source AI solution. Further alternatives lie in non-specialized developer tools and in hosted solutions focused on AI scenarios. We will look at those next time, namely Azure Monitor and Azure AI Foundry.

Aspire Dashboard

small, fast, and excellent for a developer view nearly in real time.

Anonymization

the same telemetry stream can be sent via the OTEL collector once fully and once sanitized.

Langfuse

specialized AI observability tool with tracing, token economics, and a link to evaluations.

Tool choice

Aspire for quick troubleshooting, Langfuse as a self-managed AI tracing option.

The next step is to look at service-based variants - Azure Monitor and Azure AI Foundry.