Automatically capture document ingestion, embedding, retrieval, and LLM synthesis within LlamaIndex pipelines.

Setup

To capture the complete picture, including both the pipeline structure and the underlying token costs, initialize the LlamaIndex integration together with the integration for your LLM provider (e.g., OpenAI, Anthropic, or Google GenAI).
```python
import traceroot
from traceroot import Integration

# Initialize LlamaIndex alongside your LLM provider to capture tokens and costs
traceroot.initialize(integrations=[
    Integration.LLAMA_INDEX,
    Integration.OPENAI  # Or ANTHROPIC, GOOGLE_GENAI, etc.
])
```

Usage

Once initialized, all LlamaIndex operations are captured automatically:
```python
from llama_index.core import Document, Settings, VectorStoreIndex
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.llm = OpenAI(model="gpt-4o-mini")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")

documents = [
    Document(text="TraceRoot is an open-source observability platform for AI agents."),
    Document(text="TraceRoot supports OpenTelemetry-compatible tracing via a Python SDK."),
]

# Index construction is automatically traced (SentenceSplitter, Embedding)
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

# Each query is automatically traced (retrieval + LLM synthesis)
response = query_engine.query("What is TraceRoot?")
print(response)
```

What Gets Captured

| Attribute | Description |
| --- | --- |
| Document chunking | Each SentenceSplitter call as a span |
| Embedding calls | Each batch embedding with text inputs and vector outputs |
| Retrieval | Top-k retrieval with query and retrieved documents |
| LLM synthesis | Response generation with context and final answer |
| Tokens & cost | Input/output tokens and calculated cost per LLM call |
| Latency | Duration of each pipeline stage |