Lab 2: Exploring AI Traces in Dynatrace
Duration: ~30 minutes
In this lab, you'll explore the traces generated by your AI application in Dynatrace and learn what insights are available for LLM and RAG observability.
🎯 Learning Objectives
- Navigate to distributed traces in Dynatrace
- Analyze LLM call details including prompts and completions
- Understand token usage and cost attribution
- Explore RAG pipeline spans (embeddings, vector search, completion)
- Create basic queries for AI observability
Why Dynatrace for AI Observability?
| Capability | Basic Tracing | Dynatrace |
|---|---|---|
| Collect traces | OpenTelemetry | Native OTLP + OpenLLMetry |
| See token counts | In span attributes | Unified with cost analysis |
| Correlate to infra | Manual | Davis AI auto-correlation |
| Root cause analysis | You investigate | Davis AI automatic RCA |
| Anomaly detection | Static thresholds | AI-powered baselines |
| Take action | External tools | Built-in Workflows |
Step 1: Access Dynatrace
1.1 Open Dynatrace
Open the Dynatrace environment URL provided by your instructor:
https://YOUR_ENV.live.dynatrace.com
1.2 Login
Use the credentials provided by your instructor.
Step 2: Find Your Service
2.1 Navigate to the AI Observability App
- In the left navigation menu, click Search
- Search for AI Observability and select the app from the list to open it

2.2 Explore Service Health
- Click Service Health at the top
- Choose ai-chat-service-{YOUR_ATTENDEE_ID} from the list on the left and click Update
This lets you view your service health metrics, such as Errors, Traffic and Latency, Cost, and Guardrails.
Step 3: Explore Prompt and Trace Data
3.1 Explore Prompts
- Click Explorer at the top
- Choose ai-chat-service-{YOUR_ATTENDEE_ID} from the list
This is where you access deeper data about your AI service.

3.2 Access Traces and Spans
Select View traces at the top right.

This will bring you to the Distributed Tracing app with a list of spans.

3.3 Select a Trace
Click on any trace to view the details. You should see traces for your /chat endpoint.

🎭 Your Mission (Choose Your Persona)
From this point forward, you'll focus on different aspects depending on your role. Both paths cover all steps, but with different emphasis.
💻 Developer: "Why is my RAG giving bad answers?"
Your story: You've deployed a RAG-powered chatbot, but users are complaining that it sometimes gives irrelevant or incomplete answers. You need to understand:
- Is the vector search retrieving the right documents?
- Is the context being formatted correctly for the LLM?
- What prompts are actually being sent to the model?
Your goal: Learn to trace a request end-to-end, inspect prompts/completions, and identify where your RAG pipeline might be breaking down.
Focus on: Steps 4, 5, and 6 (marked with 💻)
🔧 SRE/Platform: "How much is this AI service costing us?"
Your story: Your team just launched an AI feature and leadership wants to know:
- What's the cost?
- What's the capacity?
- Can we scale this?
Your goal: Build queries that give you token economics visibility, understand cost attribution, and prepare data for capacity planning.
Focus on: Steps 7, 8, and 9 (marked with 🔧)
💻 Step 4: Analyze an AI Trace
4.1 Understanding the Trace Structure
A typical RAG request trace includes these spans:
rag_chat_pipeline.workflow (Main RAG pipeline)
├── analyze_query_intent.task (Classify user query type)
│   └── AzureChatOpenAI.chat (LLM call for classification)
├── retrieve_documents.task (Document retrieval)
│   ├── openai.embeddings (Generate query embedding)
│   └── chroma.query (Vector store search)
├── generate_context.task (Format retrieved docs)
└── generate_response.task (Generate final answer)
    └── AzureChatOpenAI.chat (LLM completion call)
4.2 Examine the LLM Span
Click on the AzureChatOpenAI.chat span under analyze_query_intent.task to see:
| Attribute | Description |
|---|---|
| gen_ai.system | The LLM provider (Azure) |
| gen_ai.request.model | The model requested (gpt-4o-2024-11-20) |
| gen_ai.response.model | The model that responded |
| gen_ai.request.temperature | Temperature setting (e.g., 0.7) |
| gen_ai.usage.input_tokens | Number of input tokens |
| gen_ai.usage.output_tokens | Number of output tokens |
| gen_ai.usage.cache_read_input_tokens | Cached input tokens (prompt caching) |
4.3 View Prompts and Responses
Note: Depending on configuration, you may see:
- gen_ai.prompt.0.content - The input prompt content
- gen_ai.prompt.0.role - The prompt role (user, system)
- gen_ai.completion.0.content - The generated response content
- gen_ai.completion.0.role - The completion role (assistant)
- gen_ai.completion.0.finish_reason - Why generation stopped (stop, length)
This visibility is crucial for debugging AI applications!
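The indexed attribute names above (gen_ai.prompt.0.content, gen_ai.completion.0.role, and so on) flatten a conversation into span attributes. The following is a minimal Python sketch of how those keys could be regrouped into ordered messages, assuming you have a span's attributes available as a plain dictionary. The sample values and the group_messages helper are hypothetical and are not part of the lab application or any Dynatrace API.

```python
import re
from collections import defaultdict

# Hypothetical span attributes, shaped like the flattened keys listed above.
span_attributes = {
    "gen_ai.prompt.0.role": "system",
    "gen_ai.prompt.0.content": "You are a helpful assistant.",
    "gen_ai.prompt.1.role": "user",
    "gen_ai.prompt.1.content": "What is observability?",
    "gen_ai.completion.0.role": "assistant",
    "gen_ai.completion.0.content": "Observability is ...",
    "gen_ai.completion.0.finish_reason": "stop",
    "gen_ai.usage.input_tokens": 42,
}

# Matches keys such as gen_ai.prompt.0.content or gen_ai.completion.0.role.
_INDEXED_KEY = re.compile(r"^gen_ai\.(prompt|completion)\.(\d+)\.(\w+)$")


def group_messages(attributes):
    """Regroup flattened prompt/completion attributes into ordered message lists."""
    grouped = {"prompt": defaultdict(dict), "completion": defaultdict(dict)}
    for key, value in attributes.items():
        match = _INDEXED_KEY.match(key)
        if match is None:
            continue  # Not an indexed prompt/completion attribute.
        kind, index, field = match.groups()
        grouped[kind][int(index)][field] = value
    return {
        kind: [messages[i] for i in sorted(messages)]
        for kind, messages in grouped.items()
    }


conversation = group_messages(span_attributes)
for message in conversation["prompt"] + conversation["completion"]:
    print(f'{message["role"]}: {message["content"]}')
```

Reading the messages in order like this makes it easier to check whether the retrieved context actually reached the model and in what role.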
💻 Step 5: Analyze Embedding Spans
5.1 Find the Embedding Span
In the trace view, locate the openai.embeddings span.
5.2 Examine Embedding Details
Key attributes include:
| Attribute | Description |
|---|---|
| gen_ai.request.model | Embedding model (text-embedding-3-large) |
| gen_ai.usage.input_tokens | Tokens in the text being embedded |
| gen_ai.system | The provider (Azure) |
💻 Step 6: Vector Store Spans
6.1 Find the Vector Store Span
Look for chroma.query or similar vector database spans.
6.2 Key Insights
Click on the chroma.query span to see database attributes:
| Attribute | Description |
|---|---|
| db.system | The vector database (chroma) |
| db.operation | The operation performed (query) |
| db.chroma.query.n_results | Number of documents retrieved (e.g., 3) |
| db.chroma.query.embeddings_count | Number of embeddings in the query (e.g., 1) |
🔧 Step 7: Token Optimization
Understanding Token Limits
Every AI model has a maximum number of input and output tokens that it can accommodate. In this lab, we're using GPT-4o with the following limits:
| Model | Max Input Tokens | Max Output Tokens |
|---|---|---|
| GPT-4o | 128,000 | 16,384 |
7.1 Create a New Notebook
- Navigate to Notebooks in the left-hand menu
- Click + Notebook at the top to create a new notebook
- Name it: AI Observability - {YOUR_ATTENDEE_ID}
- For each DQL query, create a new DQL tile in your notebook
Lookup Tables
For this lab, we've made use of lookup tables. Lookup tables let us upload reference tables that we can use to enrich our data in Dynatrace. In this case, we've created a lookup table with the maximum input and output tokens for each model, so the DQL queries that follow keep working if the limits change.
To see the table for this lab, run the following DQL query:
load "/lookups/ai/azure-openai/model-max-tokens"
To see all lookup tables, run the following DQL query:
fetch dt.system.files
7.2 Find the Biggest Token Spenders and Understand What Percentage of Token Limits are Used
//Find the Biggest Token Spenders and Understand What Percentage of Token Limits are Used
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter isNotNull(gen_ai.usage.input_tokens)
| summarize
total_input = sum(gen_ai.usage.input_tokens),
total_output = sum(gen_ai.usage.output_tokens),
avg_input = avg(gen_ai.usage.input_tokens),
avg_output = avg(gen_ai.usage.output_tokens),
request_count = count(),
by: {gen_ai.response.model}
| fieldsAdd total_tokens = total_input + total_output
| lookup [load "/lookups/ai/azure-openai/model-max-tokens"], sourceField:gen_ai.response.model, lookupField:model
| filter isNotNull(lookup.model)
| fieldsAdd input_token_usage_percent = (avg_input / lookup.max.tokens.input)*100
| fieldsAdd output_token_usage_percent = (avg_output / lookup.max.tokens.output)*100
| fieldsRemove "lookup*"
| fields gen_ai.response.model, request_count, total_input, total_output, avg_input, avg_output, input_token_usage_percent, output_token_usage_percent
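To make the percentage calculation in the query concrete, here is a small Python sketch of the same arithmetic. It assumes the GPT-4o limits from the table above; the average token counts are made-up sample values, and MODEL_MAX_TOKENS and usage_percent are illustrative names rather than anything defined in Dynatrace.

```python
# Limits copied from the table above; the dictionary layout is illustrative.
MODEL_MAX_TOKENS = {"gpt-4o": {"input": 128_000, "output": 16_384}}


def usage_percent(avg_tokens: float, max_tokens: int) -> float:
    """Return average token usage as a percentage of the model limit."""
    return avg_tokens / max_tokens * 100


# Hypothetical averages, standing in for avg_input and avg_output in the query.
avg_input = 3_200
avg_output = 2_048

limits = MODEL_MAX_TOKENS["gpt-4o"]
input_pct = usage_percent(avg_input, limits["input"])
output_pct = usage_percent(avg_output, limits["output"])

print(f"input:  {input_pct:.2f}% of limit")
print(f"output: {output_pct:.2f}% of limit")
```

With these sample values the service uses about 2.5% of the input limit and 12.5% of the output limit, which would leave plenty of headroom.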
🔧 Step 8: Using Notebooks for AI Analysis
Dynatrace Notebooks let you run DQL queries against your span data and visualize the results.
8.1 Open Your Notebook
- If you created a notebook in Step 7.1, reopen it from Notebooks in the left-hand menu
- Otherwise, click + Notebook at the top to create one and name it AI Observability - {YOUR_ATTENDEE_ID}
- For each DQL query, create a new DQL tile in your notebook
8.2 Query: Model Usage Distribution
//Model Usage Distribution
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter isNotNull(gen_ai.response.model)
| summarize request_count = count(), by: {gen_ai.response.model}
| sort request_count desc
Consider changing the visualization to make the data more intuitive! Click Options > Visualization and select "Pie".
8.3 Query: Average Response Time by Operation
//Average Response Time by Operation
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| summarize
avg_duration = avg(duration),
by: {span.name}
| sort avg_duration desc
Consider changing the visualization to make the data more intuitive! Click Options > Visualization and select "Categorical".
🔧 Step 9: Token Economics Analysis
Understanding Token Costs
Tokens directly translate to cost. Here's the Azure OpenAI pricing used in this lab:
| Model | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) |
|---|---|---|
| GPT-4o | $2.50 | $10.00 |
| GPT-4o-mini | $0.15 | $0.60 |
| text-embedding-3-large | $0.13 | N/A |
Lookup Tables
For this lab, we've made use of lookup tables. Lookup tables let us upload reference tables that we can use to enrich our data in Dynatrace. In this case, we've created a lookup table with our Azure pricing, so the DQL queries that follow keep working if prices change.
To see the table for this lab, run the following DQL query:
load "/lookups/ai/azure-openai/model-costs"
To see all lookup tables, run the following DQL query:
fetch dt.system.files
9.1 Find Your Biggest Token Spenders
//Find Your Biggest Token Spenders
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter isNotNull(gen_ai.usage.input_tokens)
| summarize
total_input = sum(gen_ai.usage.input_tokens),
total_output = sum(gen_ai.usage.output_tokens),
avg_input = avg(gen_ai.usage.input_tokens),
request_count = count(),
by: {gen_ai.response.model}
| fieldsAdd total_tokens = total_input + total_output
| lookup [load "/lookups/ai/azure-openai/model-costs"], sourceField:gen_ai.response.model, lookupField:model
| filter isNotNull(lookup.model)
| fieldsAdd estimated_cost_usd = (total_input * lookup.input.cost + total_output * if(isNull(lookup.output.cost),0.00,else:lookup.output.cost)) / 1000000
| fieldsRemove "lookup*"
| sort estimated_cost_usd desc
💡 Tip: High avg_input tokens? Your system prompt or context might be too large. Consider summarizing retrieved documents before adding them to the context.
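The cost formula in the query above multiplies each token total by its per-million price and treats a missing output price as zero. A rough Python sketch of the same calculation follows, using the prices from the pricing table. The token totals are invented sample numbers, and the dictionary keys are assumptions: the model names stored in the lab's lookup table may be spelled differently.

```python
# Prices per 1M tokens, copied from the pricing table above.
# None marks a model with no output price (embeddings).
PRICE_PER_MILLION_USD = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
    "text-embedding-3-large": {"input": 0.13, "output": None},
}


def estimated_cost_usd(model: str, total_input: int, total_output: int) -> float:
    """Mirror the query: missing output prices count as zero."""
    price = PRICE_PER_MILLION_USD[model]
    output_price = price["output"] if price["output"] is not None else 0.0
    return (total_input * price["input"] + total_output * output_price) / 1_000_000


# Hypothetical token totals for one reporting period.
chat_cost = estimated_cost_usd("gpt-4o", 1_200_000, 300_000)
embedding_cost = estimated_cost_usd("text-embedding-3-large", 500_000, 0)

print(f"gpt-4o: ${chat_cost:.2f}")
print(f"text-embedding-3-large: ${embedding_cost:.3f}")
```

In this sample the 300,000 output tokens cost as much as the 1.2 million input tokens, because output is priced four times higher for GPT-4o. That is why trimming verbose responses can matter as much as trimming prompts.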
9.2 Prompt Caching Effectiveness
Azure OpenAI caches prompts of 1,024 tokens or more. Check your cache hit rate:
//Prompt Caching Effectiveness
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter isNotNull(gen_ai.usage.cache_read_input_tokens)
| summarize
cached_tokens = sum(gen_ai.usage.cache_read_input_tokens),
total_tokens = sum(gen_ai.usage.input_tokens)
| fieldsAdd cache_rate_percent = (toDouble(cached_tokens) / toDouble(total_tokens)) * 100
💡 Tip: Low cache rate (<30%)? You're paying more than necessary! Standardize system prompts and use longer static prefixes (1024+ tokens).
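As a quick sanity check on the cache rate calculation, the sketch below reproduces it in Python with made-up token counts. The cache_rate_percent helper is hypothetical, and the zero-input guard is an addition for illustration that the DQL query itself does not include.

```python
def cache_rate_percent(cached_tokens: int, total_input_tokens: int) -> float:
    """Share of input tokens served from the prompt cache, as a percentage."""
    if total_input_tokens == 0:
        return 0.0  # No input tokens recorded, so there is nothing to rate.
    return cached_tokens / total_input_tokens * 100


# Hypothetical totals, standing in for cached_tokens and total_tokens in the query.
cached_tokens = 6_000
total_input_tokens = 20_000

rate = cache_rate_percent(cached_tokens, total_input_tokens)
print(f"cache rate: {rate:.1f}%")

# The 30% threshold comes from the tip above.
if rate < 30:
    print("Low cache rate: consider standardizing prompt prefixes")
```

Here 6,000 of 20,000 input tokens came from the cache, which sits right at the 30% threshold mentioned in the tip.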
9.3 Token Trend Analysis
Track token usage over time to catch runaway costs early:
//Token Trend Analysis
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter isNotNull(gen_ai.usage.input_tokens)
| makeTimeseries
total_input = sum(gen_ai.usage.input_tokens),
total_output = sum(gen_ai.usage.output_tokens),
request_count = count()
9.4 What To Do With Token Data
| Finding | Indicates | Action |
|---|---|---|
| High input tokens | Large prompts/context | Reduce system prompt, compress context |
| High output tokens | Verbose responses | Add length constraints to prompts |
| Low cache rate | Inconsistent prompts | Standardize prompt templates |
| Token spikes | Potential abuse/bugs | Set up alerts, investigate queries |
| Output > Input | Complex questions | Normal for detailed answers |
Checkpoint
Before proceeding to Lab 3, verify you can:
- Find your service in Dynatrace
- View distributed traces for your AI requests
- Identify LLM spans and their attributes
- See token usage metrics and understand cost implications
- Calculate token costs using DQL queries
- Create basic DQL queries for AI observability
- Understand the trace structure (HTTP → Embedding → Vector → LLM)
Troubleshooting
"No traces found"
- Verify your service name matches your ATTENDEE_ID
- Wait 1-2 minutes for traces to appear
- Check that your application is running and receiving requests
- Verify that DT_ENDPOINT and DT_API_TOKEN are correct
"Missing LLM attributes"
- Ensure you're using the traceloop-sdk
- Some attributes may require specific Traceloop configuration
- Check the span details for any available attributes
"Service not appearing"
- Send a few more requests to your application
- Refresh the Dynatrace UI
- Use search (Cmd/Ctrl + K) to find your service
What You've Learned
💻 Developer Takeaways
You now know how to debug your RAG pipeline using traces:
- Navigate the trace structure to understand your RAG workflow
- Inspect LLM spans to see prompts, completions, and model parameters
- Analyze embedding spans to verify query vectorization
- Check vector store spans to confirm document retrieval
- Use span attributes to debug why your AI gives certain responses
Next time your RAG gives a bad answer: Open the trace, check the retrieved documents, and inspect what prompt was actually sent to the LLM.
🔧 SRE/Platform Takeaways
You now have visibility into AI service costs and performance:
- Create Notebooks with DQL queries for token analysis
- Calculate estimated costs using token pricing formulas
- Monitor prompt caching effectiveness to optimize spend
- Track token trends over time to catch runaway costs
- Identify your biggest token spenders by model
Take back to your team: The DQL queries you built are ready for dashboards and alerts.
Great Progress!
You've explored AI traces in Dynatrace and understand how to analyze LLM observability data. Now let's learn how to use Dynatrace MCP for agentic AI interactions!