---
title: "Agentic AI Process Observability: Discovering Behavioral Variability"
type: source
tags: [agentic-bpm, observability, process-mining, causal-discovery, llm-agents, variability, static-analysis]
authors: [Fournier, Fabiana; Limonad, Lior; David, Yuval]
year: 2025
venue: "PMAI'25: Process Management in the AI Era Workshop, co-located with ECAI 2025, Bologna"
kind: paper
raw_path: "raw/ABPS/Agentic AI process observability.pdf"
sources: []
key_claims:
  - "LLM-backed agent frameworks (CrewAI, LangGraph, AutoGen) produce non-deterministic execution trajectories that demand dedicated observability tooling."
  - "Agent execution trajectories (timestamped tool invocations) can be treated as process event logs, enabling process mining and causal process discovery."
  - "Distinguish intended variability (developer-specified decision points) from unintended variability (accidental variation points arising from under-specification)."
  - "Approach combines Heuristics Miner (temporal/frequency view) with causal process mining (functional dependencies among tool invocations) to reveal gateway split points."
  - "LLM-based static analysis matches gateway rule statements against the agent specification source text to classify split points as decision vs variation points."
  - "Reliability calculation for gateways: minimum sample size n = Z² p(1-p)/E² plus rare-branch detection via (1-p)^n < 0.05."
  - "Case study on a CrewAI calculator with Decomposer, Calculator, and Manager agents (290 runs of 1+2-3*4/5) reveals breach-of-responsibility where Manager agent invokes math_tools despite not being granted access."
  - "Even with LLM temperature=0, notable behavioral instability remains; observability is essential for tightening natural-language agent specifications (vibe coding)."
  - "Contribution positioned within emerging Agentic BPM (AI-Augmented BPMS manifesto; Agentic BPM of Vu et al. 2025)."
created: 2026-04-15
updated: 2026-04-15
---

# Agentic AI Process Observability: Discovering Behavioral Variability

## Summary
Fournier, Limonad, and David (IBM Research, Israel) propose an observability approach for Agentic AI applications built atop LLM frameworks such as CrewAI, LangGraph, and AutoGen. Because such agents operate non-deterministically (the same prompt can yield different execution trajectories), developers need tooling that exposes where and why behavior varies. The paper reframes **agent execution trajectories** as process event logs, in which each event is a timestamped tool invocation by a specific agent. This enables direct application of [[concepts/process-discovery]] and causal process discovery techniques ([[sources/2012-vanderaalst-process-mining-manifesto]]).
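
The reframing of trajectories as event logs can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout `(run_id, timestamp, agent, tool)`, the `agent:tool` label format, and the tool names `multiply` and `divide` are assumptions made for illustration. The dependency measure is the standard Heuristics Miner formula, which the paper's tooling may parameterize differently.

```python
from collections import Counter, defaultdict
from datetime import datetime

def to_event_log(records):
    """Group (run_id, timestamp, agent, tool) records into one trace per run."""
    traces = defaultdict(list)
    for run_id, ts, agent, tool in records:
        # Activity label = agent + tool; the "agent:tool" format is an assumption.
        traces[run_id].append((datetime.fromisoformat(ts), f"{agent}:{tool}"))
    # Order each trace by timestamp and keep only the activity labels.
    return {run: [act for _, act in sorted(evts)] for run, evts in traces.items()}

def directly_follows(log):
    """Count how often activity a is immediately followed by activity b."""
    counts = Counter()
    for trace in log.values():
        counts.update(zip(trace, trace[1:]))
    return counts

def dependency(counts, a, b):
    """Heuristics Miner dependency measure: (|a>b| - |b>a|) / (|a>b| + |b>a| + 1)."""
    ab, ba = counts[(a, b)], counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)

# Hypothetical records, deliberately out of order to show the timestamp sort.
records = [
    ("run-1", "2025-01-01T10:00:02", "Calculator", "multiply"),
    ("run-1", "2025-01-01T10:00:01", "Decomposer", "evaluate_parentheses"),
    ("run-1", "2025-01-01T10:00:03", "Calculator", "divide"),
    ("run-2", "2025-01-01T11:00:01", "Decomposer", "evaluate_parentheses"),
    ("run-2", "2025-01-01T11:00:02", "Manager", "multiply"),
]

log = to_event_log(records)
dfg = directly_follows(log)
dep = dependency(dfg, "Decomposer:evaluate_parentheses", "Calculator:multiply")
```

In this toy log the two runs diverge after `Decomposer:evaluate_parentheses`, which is the kind of split point the mined model is meant to surface.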

The central conceptual contribution is the distinction between **intended variability** (decision points explicitly specified by the developer) and **unintended variability** (variation points that emerge from under-specified prompts or tool assignments). Each branching point in the mined causal model is classified as either a decision point or a variation point.

The pipeline has six steps:

1. Trajectory generation across k runs.
2. Event-log processing, with agent+tool as the activity and each run as a trace.
3. Process and causal discovery, producing two complementary views: Heuristics Miner for temporal flow and causal process mining for tool-invocation dependencies.
4. Rule derivation per gateway.
5. **LLM-based static analysis**, prompting LLaMA 3.3 70B Instruct to match each rule statement against the agent specification source.
6. Reliability calculation: a normal approximation to the binomial sizes the sample (n = Z² p(1-p)/E²), and the condition (1-p)^n < 0.05 is used to detect rare branches.
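
The reliability calculation in step 6 can be worked through with a short Python sketch. The two formulas are the ones quoted in this note; the default `z=1.96` (a 95% confidence level), the function names, and the example values of `p` and `margin` are assumptions, not figures taken from the paper.

```python
import math

def min_sample_size(p, margin, z=1.96):
    """Runs needed to estimate a branch probability p within +/- margin.

    Implements n = Z^2 * p * (1 - p) / E^2, rounded up to whole runs.
    z=1.96 (95% confidence) is an assumed default.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

def runs_to_see_branch(p, miss_prob=0.05):
    """Smallest n such that (1 - p)^n < miss_prob.

    (1 - p)^n is the chance that a branch with probability p never
    appears in n independent runs.
    """
    return math.floor(math.log(miss_prob) / math.log(1 - p)) + 1

def smallest_detectable_p(n, miss_prob=0.05):
    """Branch probability at which (1 - p)^n equals miss_prob for a given n."""
    return 1 - miss_prob ** (1 / n)

n_estimate = min_sample_size(p=0.5, margin=0.05)   # worst case p = 0.5
n_rare = runs_to_see_branch(p=0.01)                # a 1% branch
p_floor = smallest_detectable_p(n=290)             # 290 runs, as in the case study
```

Under these assumptions, 290 runs are enough to observe, with 95% probability, any branch taken in roughly 1% of runs or more. Estimating a branch probability near 0.5 to within five percentage points would need more runs than that.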

A toy CrewAI calculator (Decomposer / Calculator / Manager agents, math tools, 290 runs over "1+2-3*4/5") illustrates the approach. The mined causal view surfaces a **breach of responsibility**: the Manager agent invoked math tools despite not being explicitly granted access. A second issue is a loophole: `evaluate_parentheses` is invoked even when the input contains no parentheses. Both are fixed by tightening the specification (adding `tools=[]` on the Manager and strengthening the backstory).
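
A breach of responsibility of this kind can also be checked directly against the event log once the intended tool grants are written down. The sketch below is an illustration rather than the paper's method, which relies on the mined causal view and LLM-based static analysis. The `GRANTED` mapping, the tool names other than `evaluate_parentheses`, and the sample traces are all hypothetical.

```python
from collections import Counter

# Hypothetical tool grants per agent; the Manager is granted no tools.
GRANTED = {
    "Decomposer": {"evaluate_parentheses"},
    "Calculator": {"add", "subtract", "multiply", "divide"},
    "Manager": set(),
}

def find_breaches(traces, granted):
    """Count (agent, tool) invocations where the agent was not granted the tool."""
    breaches = Counter()
    for trace in traces:
        for agent, tool in trace:
            # Agents missing from the mapping are treated as having no grants.
            if tool not in granted.get(agent, set()):
                breaches[(agent, tool)] += 1
    return breaches

# Hypothetical traces: each trace is the ordered list of (agent, tool) events of one run.
traces = [
    [("Decomposer", "evaluate_parentheses"), ("Calculator", "multiply"), ("Calculator", "divide")],
    [("Calculator", "multiply"), ("Manager", "divide"), ("Manager", "divide")],
]

breaches = find_breaches(traces, GRANTED)
```

A check like this only catches violations of grants that are stated explicitly. Variation that stems from an under-specified prompt still needs the discovery and classification steps described above.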

The paper frames observability as a prerequisite for future **self-debugging adaptive agents** and treats this as early work in agent process observability.

## Connections
- Extends [[concepts/agentic-bpm]] with an operational observability layer.
- Applies [[concepts/process-discovery]] (Heuristics Miner, [[entities/wil-van-der-aalst]]) to agent trajectories.
- Introduces new concepts: [[concepts/agent-process-observability]], [[concepts/behavioral-variability]], [[concepts/causal-process-discovery]].
- New entities: [[entities/fabiana-fournier]], [[entities/lior-limonad]], [[entities/yuval-david]].
- Cites [[sources/2023-dumas-ai-augmented-bpms]] (ABPMS manifesto) as part of the Agentic BPM renaissance.
- Relates to [[concepts/self-modification]]: observability feeds the adaptation/evolution loop.
- Complements [[concepts/explainability-apm]]: causal views support rationale articulation.
- OCED (Object-Centric Event Data) standard referenced as relevant abstraction *(unverified — referenced-not-ingested)*.