🔍 Lab 2: Exploring AI Traces in Dynatrace

Duration: ~30 minutes

In this lab, you'll explore the traces generated by your AI application in Dynatrace and learn what insights are available for LLM and RAG observability.


🎯 Learning Objectives

  • Navigate to distributed traces in Dynatrace
  • Analyze LLM call details including prompts and completions
  • Understand token usage and cost attribution
  • Explore RAG pipeline spans (embeddings, vector search, completion)
  • Create basic queries for AI observability

๐Ÿ† Why Dynatrace for AI Observability?

| Capability | Basic Tracing | Dynatrace |
| --- | --- | --- |
| Collect traces | ✅ OpenTelemetry | ✅ Native OTLP + OpenLLMetry |
| See token counts | ✅ In span attributes | ✅ Unified with cost analysis |
| Correlate to infra | ❌ Manual | ✅ Davis AI auto-correlation |
| Root cause analysis | ❌ You investigate | ✅ Davis AI automatic RCA |
| Anomaly detection | ❌ Static thresholds | ✅ AI-powered baselines |
| Take action | ❌ External tools | ✅ Built-in Workflows |

Step 1: Access Dynatrace

1.1 Open Dynatrace

Open the Dynatrace environment URL provided by your instructor:

https://YOUR_ENV.live.dynatrace.com

1.2 Login

Use the credentials provided by your instructor.


Step 2: Find Your Service

2.1 Navigate to the AI Observability App

  1. In the left navigation menu, click Search
  2. Search for AI Observability and select the app from the list to open

    Open AI Observability App

2.2 Explore Service Health

  1. Click Service Health at the top
  2. Choose ai-chat-service-{YOUR_ATTENDEE_ID} from the list on the left and click Update

    Service Health

This will allow you to view your service health metrics such as Errors, Traffic and Latency, Cost, and Guardrails.


Step 3: Explore Prompt and Trace Data

3.1 Explore Prompts

  1. Click Explorer at the top
  2. Choose ai-chat-service-{YOUR_ATTENDEE_ID} from the list

This is where you access deeper data about your AI service.

Explore Prompts

3.2 Access Traces and Spans

Select View traces at the top right.

View Traces

This will bring you to the Distributed Tracing app with a list of spans.

Spans List

3.3 Select a Trace

Click on any trace to view its details. You should see traces for your /chat endpoint.

Trace Dive


🎭 Your Mission (Choose Your Persona)

From this point forward, you'll focus on different aspects depending on your role. Both paths cover all steps, but with different emphasis.

💻 Developer: "Why is my RAG giving bad answers?"

Your story: You've deployed a RAG-powered chatbot, but users are complaining that sometimes it gives irrelevant or incomplete answers. You need to understand:

  • Is the vector search retrieving the right documents?
  • Is the context being formatted correctly for the LLM?
  • What prompts are actually being sent to the model?

Your goal: Learn to trace a request end-to-end, inspect prompts/completions, and identify where your RAG pipeline might be breaking down.

Focus on: Steps 4, 5, and 6 (marked with 💻)

🔧 SRE/Platform: "How much is this AI service costing us?"

Your story: Your team just launched an AI feature and leadership wants to know:

  • What's the cost?
  • What's the capacity?
  • Can we scale this?

Your goal: Build queries that give you token economics visibility, understand cost attribution, and prepare data for capacity planning.

Focus on: Steps 7, 8, and 9 (marked with 🔧)


💻 Step 4: Analyze an AI Trace

4.1 Understanding the Trace Structure

A typical RAG request trace includes these spans:

๐Ÿ“ rag_chat_pipeline.workflow (Main RAG pipeline)
  โ””โ”€โ”€ ๐Ÿ“ analyze_query_intent.task (Classify user query type)
      โ””โ”€โ”€ ๐Ÿ“ AzureChatOpenAI.chat (LLM call for classification)
  โ””โ”€โ”€ ๐Ÿ“ retrieve_documents.task (Document retrieval)
      โ””โ”€โ”€ ๐Ÿ“ openai.embeddings (Generate query embedding)
      โ””โ”€โ”€ ๐Ÿ“ chroma.query (Vector store search)
  โ””โ”€โ”€ ๐Ÿ“ generate_context.task (Format retrieved docs)
  โ””โ”€โ”€ ๐Ÿ“ generate_response.task (Generate final answer)
      โ””โ”€โ”€ ๐Ÿ“ AzureChatOpenAI.chat (LLM completion call)

4.2 Examine the LLM Span

Click on the AzureChatOpenAI.chat span under the analyze_query_intent.task to see:

| Attribute | Description |
| --- | --- |
| gen_ai.system | The LLM provider (Azure) |
| gen_ai.request.model | The model requested (gpt-4o-2024-11-20) |
| gen_ai.response.model | The model that responded |
| gen_ai.request.temperature | Temperature setting (e.g., 0.7) |
| gen_ai.usage.input_tokens | Number of input tokens |
| gen_ai.usage.output_tokens | Number of output tokens |
| gen_ai.usage.cache_read_input_tokens | Cached input tokens (prompt caching) |
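To pull these same attributes across many requests at once, you can run a query along the following lines in a Notebook. This is a sketch following the DQL patterns used later in this lab; the `start_time` field and the 20-row limit are assumptions:

```
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter isNotNull(gen_ai.request.model)
// list the most recent LLM calls with their model and token attributes
| fields start_time, span.name, gen_ai.request.model, gen_ai.response.model, gen_ai.request.temperature, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens
| sort start_time desc
| limit 20
```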

4.3 View Prompts and Responses

Note: Depending on configuration, you may see:

  • gen_ai.prompt.0.content - The input prompt content
  • gen_ai.prompt.0.role - The prompt role (user, system)
  • gen_ai.completion.0.content - The generated response content
  • gen_ai.completion.0.role - The completion role (assistant)
  • gen_ai.completion.0.finish_reason - Why generation stopped (stop, length)

This visibility is crucial for debugging AI applications!
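If prompt/completion capture is enabled in your environment, a query like this sketch can surface recent prompt/response pairs for debugging (attribute names as listed above; availability depends on your Traceloop configuration):

```
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter isNotNull(gen_ai.completion.0.content)
// inspect what was actually sent to, and returned by, the model
| fields gen_ai.prompt.0.role, gen_ai.prompt.0.content, gen_ai.completion.0.content, gen_ai.completion.0.finish_reason
| limit 10
```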


💻 Step 5: Analyze Embedding Spans

5.1 Find the Embedding Span

In the trace view, locate the openai.embeddings span.

5.2 Examine Embedding Details

Key attributes include:

| Attribute | Description |
| --- | --- |
| gen_ai.request.model | Embedding model (text-embedding-3-large) |
| gen_ai.usage.input_tokens | Tokens in the text being embedded |
| gen_ai.system | The provider (Azure) |
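To see embedding volume in aggregate rather than one span at a time, a sketch in the same DQL style as the queries later in this lab (the `openai.embeddings` span-name filter is an assumption based on the trace structure above):

```
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter span.name == "openai.embeddings"
// aggregate embedding traffic per embedding model
| summarize 
    embed_calls = count(),
    total_tokens = sum(gen_ai.usage.input_tokens),
    avg_tokens = avg(gen_ai.usage.input_tokens),
  by: {gen_ai.request.model}
```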

💻 Step 6: Vector Store Spans

6.1 Find the Vector Store Span

Look for chroma.query or similar vector database spans.

6.2 Key Insights

Click on the chroma.query span to see database attributes:

| Attribute | Description |
| --- | --- |
| db.system | The vector database (chroma) |
| db.operation | The operation performed (query) |
| db.chroma.query.n_results | Number of documents retrieved (e.g., 3) |
| db.chroma.query.embeddings_count | Number of embeddings in the query (e.g., 1) |
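These attributes can also be aggregated to sanity-check retrieval behavior, for example average search latency and result counts. A sketch, assuming `db.chroma.query.n_results` is stored as a numeric value:

```
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter db.system == "chroma"
// how fast is vector search, and how many documents come back?
| summarize 
    query_count = count(),
    avg_duration = avg(duration),
    avg_results = avg(db.chroma.query.n_results)
```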

🔧 Step 7: Token Optimization

Understanding Token Limits

All AI models have a maximum number of input and output tokens they can accommodate. In our case, we're using GPT-4o with the following limits:

| Model | Max Input Tokens | Max Output Tokens |
| --- | --- | --- |
| GPT-4o | 128,000 | 16,384 |

7.1 Create a New Notebook

  1. Navigate to Notebooks in the left-hand menu
  2. Click + Notebook on the top to create a new notebook
  3. Name it: AI Observability - {YOUR_ATTENDEE_ID}
  4. For each DQL query, create a new DQL tile in your Notebook.

Lookup Tables

For this lab, we've made use of lookup tables. Lookup tables allow us to upload referenceable tables that we can use to enrich our data in Dynatrace. In this case, we've created a lookup table with the maximum input/output tokens for our LLM model to make the following DQL queries more dynamic and robust in case these limits ever change in the future.

To see the table for this lab, run the following DQL query:

load "/lookups/ai/azure-openai/model-max-tokens"

To see all lookup tables, run the following DQL query:

fetch dt.system.files

Documentation

7.2 Find the Biggest Token Spenders and Understand What Percentage of Token Limits are Used

//Find the Biggest Token Spenders and Understand What Percentage of Token Limits are Used
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter isNotNull(gen_ai.usage.input_tokens)
| summarize 
    total_input = sum(gen_ai.usage.input_tokens),
    total_output = sum(gen_ai.usage.output_tokens),
    avg_input = avg(gen_ai.usage.input_tokens),
    avg_output = avg(gen_ai.usage.output_tokens),
    request_count = count(),
  by: {gen_ai.response.model}
| fieldsAdd total_tokens = total_input + total_output
| lookup [load "/lookups/ai/azure-openai/model-max-tokens"], sourceField:gen_ai.response.model, lookupField:model
| filter isNotNull(lookup.model)
| fieldsAdd input_token_usage_percent = (avg_input / lookup.max.tokens.input)*100
| fieldsAdd output_token_usage_percent = (avg_output / lookup.max.tokens.output)*100
| fieldsRemove "lookup*"
| fields gen_ai.response.model, request_count, total_input, total_output, avg_input, avg_output, input_token_usage_percent, output_token_usage_percent

🔧 Step 8: Using Notebooks for AI Analysis

Dynatrace Notebooks provide powerful querying capabilities for AI observability.

8.1 Create a New Notebook

  1. Navigate to Notebooks in the left-hand menu
  2. Click + Notebook on the top to create a new notebook
  3. Name it: AI Observability - {YOUR_ATTENDEE_ID}
  4. For each DQL query, create a new DQL tile in your Notebook.

8.2 Query: Model Usage Distribution

//Model Usage Distribution
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter isNotNull(gen_ai.response.model)
| summarize request_count = count(), by: {gen_ai.response.model}
| sort request_count desc

Consider changing the visualization to make the data more intuitive! Click Options > Visualization and select "Pie".

8.3 Query: Average Response Time by Operation

//Average Response Time by Operation
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| summarize 
    avg_duration = avg(duration),
  by: {span.name}
| sort avg_duration desc

Consider changing the visualization to make the data more intuitive! Click Options > Visualization and select "Categorical".


🔧 Step 9: Token Economics Analysis

Understanding Token Costs

Tokens directly translate to cost. Here's the current Azure OpenAI pricing:

| Model | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) |
| --- | --- | --- |
| GPT-4o | $2.50 | $10.00 |
| GPT-4o-mini | $0.15 | $0.60 |
| text-embedding-3-large | $0.13 | N/A |
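As a quick sanity check on what these rates mean in practice, here is an illustrative back-of-the-envelope calculation (the traffic and token numbers are made up):

```
1,000,000 requests/month on GPT-4o, averaging 500 input + 200 output tokens:

  input:  1,000,000 × 500 = 500M tokens → 500 × $2.50  = $1,250
  output: 1,000,000 × 200 = 200M tokens → 200 × $10.00 = $2,000
                                          estimated total ≈ $3,250/month
```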

Lookup Tables

For this lab, we've made use of lookup tables. Lookup tables allow us to upload referenceable tables that we can use to enrich our data in Dynatrace. In this case, we've created a lookup table with our Azure pricing to make the following DQL queries more dynamic and robust in case prices ever change in the future.

To see the table for this lab, run the following DQL query:

load "/lookups/ai/azure-openai/model-costs"

To see all lookup tables, run the following DQL query:

fetch dt.system.files

Documentation

9.1 Find Your Biggest Token Spenders

//Find Your Biggest Token Spenders
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter isNotNull(gen_ai.usage.input_tokens)
| summarize 
    total_input = sum(gen_ai.usage.input_tokens),
    total_output = sum(gen_ai.usage.output_tokens),
    avg_input = avg(gen_ai.usage.input_tokens),
    request_count = count(),
    by: {gen_ai.response.model}
| fieldsAdd total_tokens = total_input + total_output
| lookup [load "/lookups/ai/azure-openai/model-costs"], sourceField:gen_ai.response.model, lookupField:model
| filter isNotNull(lookup.model)
| fieldsAdd estimated_cost_usd = (total_input * lookup.input.cost + total_output * if(isNull(lookup.output.cost),0.00,else:lookup.output.cost)) / 1000000
| fieldsRemove "lookup*"
| sort estimated_cost_usd desc

💡 Tip: High avg_input tokens? Your system prompt or context might be too large. Consider summarizing retrieved documents before adding to context.
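To find the individual requests behind a high average, a sketch that ranks single spans by token volume so the heaviest requests can be opened as traces (the 10-row limit is arbitrary):

```
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter isNotNull(gen_ai.usage.input_tokens)
| fieldsAdd total_tokens = gen_ai.usage.input_tokens + gen_ai.usage.output_tokens
// rank individual LLM calls by token volume
| sort total_tokens desc
| fields trace.id, span.name, gen_ai.response.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, total_tokens
| limit 10
```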

9.2 Prompt Caching Effectiveness

Azure OpenAI caches prompt prefixes longer than 1,024 tokens. Check your cache hit rate:

//Prompt Caching Effectiveness
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter isNotNull(gen_ai.usage.cache_read_input_tokens)
| summarize 
    cached_tokens = sum(gen_ai.usage.cache_read_input_tokens),
    total_tokens = sum(gen_ai.usage.input_tokens)
| fieldsAdd cache_rate_percent = (toDouble(cached_tokens) / toDouble(total_tokens)) * 100

💡 Tip: Low cache rate (<30%)? You're paying more than necessary! Standardize system prompts and use longer static prefixes (1024+ tokens).
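A per-model breakdown can show whether one prompt template is dragging the overall cache rate down. A sketch in the same style as the query above:

```
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter isNotNull(gen_ai.usage.cache_read_input_tokens)
// cache hit rate per model, worst first
| summarize 
    cached_tokens = sum(gen_ai.usage.cache_read_input_tokens),
    total_tokens = sum(gen_ai.usage.input_tokens),
  by: {gen_ai.response.model}
| fieldsAdd cache_rate_percent = (toDouble(cached_tokens) / toDouble(total_tokens)) * 100
| sort cache_rate_percent asc
```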

9.3 Token Trend Analysis

Track token usage over time to catch runaway costs early:

//Token Trend Analysis
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
| filter isNotNull(gen_ai.usage.input_tokens)
| makeTimeseries 
    total_input = sum(gen_ai.usage.input_tokens),
    total_output = sum(gen_ai.usage.output_tokens),
    request_count = count()

9.4 What To Do With Token Data

| Finding | Indicates | Action |
| --- | --- | --- |
| High input tokens | Large prompts/context | Reduce system prompt, compress context |
| High output tokens | Verbose responses | Add length constraints to prompts |
| Low cache rate | Inconsistent prompts | Standardize prompt templates |
| Token spikes | Potential abuse/bugs | Set up alerts, investigate queries |
| Output > Input | Complex questions | Normal for detailed answers |
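For the "token spikes" finding, a simple starting point is to list outlier requests above a fixed threshold. The 10,000-token cutoff and the `start_time` field in this sketch are assumptions; tune the threshold to your own baseline:

```
fetch spans
| filter service.name == "ai-chat-service-{YOUR_ATTENDEE_ID}"
// 10,000 is an illustrative threshold; pick one based on your avg_input
| filter gen_ai.usage.input_tokens > 10000
| fields start_time, trace.id, span.name, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens
| sort gen_ai.usage.input_tokens desc
```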

✅ Checkpoint

Before proceeding to Lab 3, verify you can:

  • Find your service in Dynatrace
  • View distributed traces for your AI requests
  • Identify LLM spans and their attributes
  • See token usage metrics and understand cost implications
  • Calculate token costs using DQL queries
  • Create basic DQL queries for AI observability
  • Understand the trace structure (HTTP → Embedding → Vector → LLM)

🆘 Troubleshooting

"No traces found"

  1. Verify your service name matches your ATTENDEE_ID
  2. Wait 1-2 minutes for traces to appear
  3. Check that your application is running and receiving requests
  4. Verify the DT_ENDPOINT and DT_API_TOKEN are correct

"Missing LLM attributes"

  1. Ensure youโ€™re using the traceloop-sdk
  2. Some attributes may require specific Traceloop configuration
  3. Check the span details for any available attributes

"Service not appearing"

  1. Send a few more requests to your application
  2. Refresh the Dynatrace UI
  3. Use search (Cmd/Ctrl + K) to find your service

What You've Learned

💻 Developer Takeaways

You now know how to debug your RAG pipeline using traces:

  1. ✅ Navigate the trace structure to understand your RAG workflow
  2. ✅ Inspect LLM spans to see prompts, completions, and model parameters
  3. ✅ Analyze embedding spans to verify query vectorization
  4. ✅ Check vector store spans to confirm document retrieval
  5. ✅ Use span attributes to debug why your AI gives certain responses

Next time your RAG gives a bad answer: Open the trace, check the retrieved documents, and inspect what prompt was actually sent to the LLM.

🔧 SRE/Platform Takeaways

You now have visibility into AI service costs and performance:

  1. ✅ Create Notebooks with DQL queries for token analysis
  2. ✅ Calculate estimated costs using token pricing formulas
  3. ✅ Monitor prompt caching effectiveness to optimize spend
  4. ✅ Track token trends over time to catch runaway costs
  5. ✅ Identify your biggest token spenders by operation

Take back to your team: the DQL queries you built are ready for dashboards and alerts.


🎉 Great Progress!

You've explored AI traces in Dynatrace and understand how to analyze LLM observability data. Now let's learn how to use Dynatrace MCP for agentic AI interactions!