January 12, 2026

How Canva built an Agentic Support Experience using Langfuse

Learn how Canva's 4-person ML team built an AI support experience surpassing all baseline evaluation targets, powered by Langfuse observability across Java and Python stacks.

Felix Krauth

About Canva

Canva is the visual communication platform used by over 250 million monthly active users worldwide. From presentations to social media graphics to full brand kits, Canva has democratized design for individuals and enterprises alike.

Building a Multi-Agent Support Experience

Canva is building on Langfuse to develop and operate their agentic customer support experience. Their setup has evolved from a simple chat experience to a multi-layered and multi-agent system with access to many tools, sub-agents, and internal systems of record for context retrieval.

The core revolves around the in-app chat (Help Assistant) and an asynchronous ticket resolution agent (Omni Agent).

Screenshot of Canva's Support Agent

Help Assistant: The Help Assistant is the user-facing chat panel that handles the majority of support volume. When a user opens the help interface, their query gets routed to specialized sub-agents:

  • Design assistance - “How do I remove a background?”
  • Account actions - Refunds, subscription changes
  • Feature requests - Routed to bug lists and roadmaps

Omni Agent: Omni Agent is a more sophisticated system that works asynchronously on submitted tickets. It interfaces with users through the Help Assistant or e-mail. If Omni Agent can’t resolve the ticket, it escalates to human support.

“We call it Omni Agent because it has access to a large amount of tools, functionalities, and user data,” says Andreas Schuster. “It can dig into account history, execute complex multi-step resolutions, and handle edge cases the fast path can’t.”

Two Stacks, One Platform

Canva’s multi-language architecture made handling different tech stacks a core requirement for their LLM operations platform.

Help Assistant runs on Java, the backbone of much of Canva’s infrastructure. The team integrated via OpenTelemetry, which doesn’t lock them into a single observability solution.

Omni Agent runs as a Python ML worker, taking full advantage of Langfuse’s native Python SDK and the faster iteration cycles that come with it.
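
For illustration, here is a minimal sketch of what instrumenting a Python worker like Omni Agent with Langfuse's `@observe` decorator can look like. It assumes a recent version of the Langfuse Python SDK; the function names, ticket fields, and tool logic are hypothetical placeholders rather than Canva's actual code.

```python
from langfuse import get_client, observe  # assumes a recent (v3) Python SDK

# Reads LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and LANGFUSE_HOST from the environment.
langfuse = get_client()

@observe()
def lookup_account(ticket_id: str) -> dict:
    # Stand-in for a tool call; nested @observe functions appear as child spans of the trace.
    return {"plan": "pro", "region": "jp"}

@observe()
def resolve_ticket(ticket_id: str, query: str) -> str:
    account = lookup_account(ticket_id)
    # ... call the LLM with the user query plus account context ...
    return f"Resolved {ticket_id} for a {account['plan']} user."

if __name__ == "__main__":
    print(resolve_ticket("T-123", "I was charged twice"))
    langfuse.flush()  # ensure buffered traces are exported before the worker exits
```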

How Canva uses Langfuse

Canva takes full advantage of the entire Langfuse suite across Observability, Prompt Management, and Evaluation. What started as a tight engineering core has expanded across roles:

  • ML Engineers: Deep debugging, trace analysis, online and offline evaluation setup
  • Product Managers: Prompt iteration, replay testing, quality monitoring
  • QA Team: Annotation queues, systematic quality scoring
  • Content Designers: Maintaining and improving response and RAG content
  • Domain Experts: Topic-, market-, or language-specific QA

One example: Canva’s Japanese market requires a precise, formal business tone. A marketing manager in Japan set up a dedicated LLM-as-a-judge evaluator to monitor tone of voice, without engineering help. This is a massive enabler: the person who knows the subject matter best can build and run evaluators independently.

"We have realized that to build good AI systems, you need to inject domain expertise which is not within an engineer's scope. Langfuse makes that possible. It hits the sweet spot between engineering requirements and empowerment of non-technical users to contribute their domain expertise.
Andreas Schuster
Andreas Schuster, Head of Product, AI Help Experience at Canva

Tracing for Debugging

Tracing captures error information, warnings, and metadata across every step. Engineers use Metadata and Tags to search and filter efficiently, while the Playground replay functionality lets anyone re-run a generation with the exact system prompt from that moment, which is critical for reproducing issues.

Screenshot of Canva's Tracing
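
As a rough illustration of this workflow, a trace can be tagged and enriched with metadata from code so it is easy to filter in the Langfuse UI. The tag names and metadata keys below are illustrative assumptions, and the snippet again presumes a recent Langfuse Python SDK.

```python
from langfuse import get_client, observe  # assumes a recent (v3) Python SDK

langfuse = get_client()

@observe()
def handle_query(query: str, surface: str) -> str:
    # Tags and metadata make this trace searchable in Langfuse,
    # e.g. filter for all "omni-agent" traces that came in via the "email" surface.
    langfuse.update_current_trace(
        tags=["omni-agent", surface],
        metadata={"release": "2026-01-10", "locale": "ja-JP"},  # illustrative keys
    )
    # ... agent logic ...
    return "..."
```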

Prompt Management

Langfuse’s Prompt Management has become a key enabler. Prompts are versioned, changes can be tested before deployment, and critically, non-technical team members can make updates independently.

Screenshot of Canva's Prompt Management

“The prompt management system is well-designed,” says Sergey Iakovlev. “Versioning, the ability to promote or roll back changes to and from production, is a big enabler. When product managers can make changes without involving engineering, it frees up a lot of time and makes everything faster.”
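
To make the deployment-label workflow concrete, fetching a prompt from application code might look roughly like the sketch below. The prompt name and template variable are hypothetical; the point is that promoting or rolling back a version in the Langfuse UI changes what this call returns, without a code deployment.

```python
from langfuse import get_client  # assumes a recent (v3) Python SDK

langfuse = get_client()

# Fetch whichever prompt version currently carries the "production" label.
# Promoting or rolling back a version in the UI changes what this returns.
prompt = langfuse.get_prompt("help-assistant-system", label="production")  # hypothetical prompt name

# Fill in the template variables (the variable name is also hypothetical).
system_prompt = prompt.compile(user_locale="ja-JP")
```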

Evals and Experiments

While the two systems, Help Assistant and Omni Agent, differ in request volume and integration complexity, the team has gradually unified their evaluation approaches, with only minor differences remaining. “If something works well in one system, we quickly implement it for the other as well,” says Sergey.

Here are Canva’s approaches to offline and online evals:

  • Experiments & Datasets: In development, both systems can be tested by running offline Experiments on data stored in Datasets. Canva has stored input data for default paths and edge cases together with their expected outputs. To make tool-using agents testable, the team mocks tool call responses during experiments (see the sketch after this list).

  • LLM-as-a-Judge: Both systems run slightly different sets of evaluators that score them across 15-20 metrics, both offline (in development) and online (in production).

  • Custom Scores: In addition to LLM-as-a-Judge, Canva has implemented custom deterministic scoring logic via the API/SDK, which is executed during offline Experiments.

  • Human Annotation: To complement the automated online/offline evals, the QA team runs systematic annotation workflows, manually inspecting 20-100 production cases per week.

  • Shadow Mode: To test Omni Agent with real system data, Canva often deploys changes behind a “shadow mode” flag. This lets them observe Omni Agent behavior in production without affecting the overall UX. They use prompt deployment labels to manage this workflow.
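
As referenced in the Experiments & Datasets item above, an offline experiment loop with mocked tool responses and a custom deterministic score could look roughly like this. The dataset name, mocked tool, agent stub, and score are all hypothetical, and the exact dataset API differs between Langfuse SDK versions (this sketch assumes a recent Python SDK).

```python
from langfuse import get_client  # assumes a recent (v3) Python SDK

langfuse = get_client()

def mocked_account_tool(ticket_id: str) -> dict:
    # Tool responses are mocked so agent behavior can be evaluated deterministically.
    return {"plan": "pro", "refund_eligible": True}

def run_agent(query: str, account_tool) -> str:
    # Stand-in for the real agent; during experiments it receives the mocked tool.
    account = account_tool("T-123")
    return "refund_issued" if account["refund_eligible"] else "escalated_to_human"

dataset = langfuse.get_dataset("omni-agent-edge-cases")  # hypothetical dataset name

for item in dataset.items:
    # Each item run is linked to the dataset, so results show up as an Experiment.
    with item.run(run_name="2026-01-12-mocked-tools") as root_span:
        output = run_agent(item.input, mocked_account_tool)
        root_span.update_trace(input=item.input, output=output)
        # Custom deterministic score: exact match against the stored expected output.
        root_span.score_trace(name="exact_match", value=float(output == item.expected_output))

langfuse.flush()
```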

From Self-Hosting to Cloud

Canva started with a self-hosted deployment during early product development but later migrated to Langfuse Cloud to reduce internal workload and focus on building the best possible AI support system.

“I could just run Langfuse locally. Being open source is a huge differentiator. It lets the team validate the tooling before kicking off all required approvals in legal and procurement,” says Sergey.

Once the value was proven, they migrated to Langfuse Cloud. “Running such a large system at scale means we need to maintain a lot with our own team,” Sergey explains. “We don’t have capacity for all the maintenance. It’s a platform effort.”

Why Canva chose Langfuse

The team evaluated several LLM observability platforms. Langfuse won for several reasons:

  • Open source: Allowed Sergey to build confidence in Langfuse’s capabilities before engaging in commercial discussions
  • Framework agnostic: Canva uses raw LLM clients, no frameworks
  • OpenTelemetry support: Critical for the Java stack, no vendor lock-in
  • End-to-end LLM operations platform: Full suite across observability, prompt management, and evaluation
  • Shipping velocity: “Other vendors came back with Figma prototypes. Meanwhile, Langfuse shipped two features. That’s when we knew.”
"Langfuse makes our engineers' life so much easier. Without Langfuse, our AI systems would be a black box. Only engineers would know what's happening, and only after deep investigation into logs.
Sergey Iakovlev
Sergey Iakovlev, Lead ML Engineer, User Voice at Canva

Business Impact

01

Driving better user experiences

Building on Langfuse, a 4-person ML team enabled Canva to automate repeatable support requests, driving better resolutions for users at lower cost.


02

Multi-Agent System at Scale

AI support handles 80% of user interactions across 250M monthly active users through a sophisticated multi-agent architecture.


03

Faster Iteration Speed

Engineers ship faster, and domain experts are empowered to directly improve the system without requiring engineering involvement.


04

Improved AI Output Quality

Overall system quality improved significantly through the inclusion of non-technical team members and domain experts.


05

Single Platform across Tech Stacks

Langfuse runs across both Canva's Java and Python stacks, providing a single observability platform for their entire multi-agent support system.
