---
title: Process Model Complexity Metrics
type: concept
tags: [bpm, modelling, quality, metrics, complexity, understandability, error-prediction]
sources:
  - "[[sources/2010-mendling-reijers-vanderaalst-7pmg]]"
  - "[[sources/2018-dumas-fundamentals-of-bpm]]"
created: 2026-05-04
updated: 2026-05-04
---

# Process Model Complexity Metrics

Bottom-up structural metrics computed mechanically over a process model graph. They are used to predict **error probability** and **comprehensibility**, and to operationalise the size- and complexity-related guidelines from [[concepts/7pmg]].

[[sources/2010-mendling-reijers-vanderaalst-7pmg]] cites these metrics as the **empirical foundation** for guidelines G1, G2, G4, G7. They were validated on the SAP Reference Model (600 EPCs) and on a 2000-EPC industrial corpus.

## Core metrics (control-flow models: BPMN, EPC, YAWL, Petri nets)

### Size

- **|N|**: number of nodes (activities + events + gateways).
- **|A|**: number of arcs (sequence flows).

Empirically: error probability rises sharply with size; more than 50 elements → error probability > 50% ([[concepts/7pmg]] G7).

### Density

$$\text{density} = \frac{|A|}{|N| \cdot (|N|-1)}$$

Ratio of actual arcs to theoretically possible arcs, where the denominator is the maximum number of arcs in a directed graph without self-loops. Higher density → more interconnected → harder to understand.

### Connector degree

For each gateway/connector c:

- **in-degree(c)** = number of incoming arcs
- **out-degree(c)** = number of outgoing arcs
- **degree(c)** = in-degree + out-degree

Aggregates:

- **avg-degree**: average across all connectors
- **max-degree**: worst-case connector

7PMG G2 ("minimise routing paths per element") targets degree directly. High avg/max degree negatively correlates with understanding (n=73 student questionnaire study).
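
The size, density, and connector-degree definitions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `ProcessGraph` class, the node-kind names (`"task"`, `"event"`, `"xor"`, `"and"`, `"or"`), and the toy model are all assumptions introduced here.

```python
from dataclasses import dataclass

# Node kinds treated as connectors (gateways); the kind names are assumptions.
CONNECTOR_KINDS = {"and", "xor", "or"}


@dataclass
class ProcessGraph:
    """Hypothetical minimal model graph: node id -> kind, plus directed arcs."""
    nodes: dict[str, str]
    arcs: list[tuple[str, str]]


def size(g: ProcessGraph) -> tuple[int, int]:
    """Return (|N|, |A|)."""
    return len(g.nodes), len(g.arcs)


def density(g: ProcessGraph) -> float:
    """|A| / (|N| * (|N| - 1)); taken as 0.0 for fewer than two nodes."""
    n = len(g.nodes)
    return len(g.arcs) / (n * (n - 1)) if n > 1 else 0.0


def connector_degrees(g: ProcessGraph) -> dict[str, tuple[int, int, int]]:
    """Map each connector id to (in-degree, out-degree, degree)."""
    result = {}
    for node, kind in g.nodes.items():
        if kind not in CONNECTOR_KINDS:
            continue
        indeg = sum(1 for _, target in g.arcs if target == node)
        outdeg = sum(1 for source, _ in g.arcs if source == node)
        result[node] = (indeg, outdeg, indeg + outdeg)
    return result


def degree_aggregates(g: ProcessGraph) -> tuple[float, int]:
    """Return (avg-degree, max-degree) over connectors, or (0.0, 0) if none."""
    totals = [d for _, _, d in connector_degrees(g).values()]
    if not totals:
        return 0.0, 0
    return sum(totals) / len(totals), max(totals)


# Toy model: start -> A -> XOR-split -> (B | C) -> XOR-join -> end
example = ProcessGraph(
    nodes={"start": "event", "A": "task", "xs": "xor", "B": "task",
           "C": "task", "xj": "xor", "end": "event"},
    arcs=[("start", "A"), ("A", "xs"), ("xs", "B"), ("xs", "C"),
          ("B", "xj"), ("C", "xj"), ("xj", "end")],
)

print(size(example))               # (7, 7)
print(round(density(example), 3))  # 0.167
print(degree_aggregates(example))  # (3.0, 3)
```

Reporting the per-connector tuples alongside the aggregates keeps a single high-degree connector visible even when the average looks acceptable.
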
### Control-Flow Complexity (CFC, Cardoso)

Sum across split connectors of:

- **XOR-split**: contributes its number of outgoing arcs
- **OR-split**: contributes 2^(out-degree) − 1 (the number of non-empty branch subsets)
- **AND-split**: contributes 1

Each term counts the states a split can induce: exactly one branch for XOR, any non-empty subset of branches for OR, all branches together for AND. CFC therefore sums local branching choices rather than counting end-to-end execution paths. Validated against perceived complexity (Cardoso) and predictive of errors (Mendling).

### Depth and nesting

- **Depth**: maximum nesting level of structured blocks.
- Deeper nesting → harder to comprehend for human readers.

### Mismatch

Count of split-connectors whose matching join is of a different type (e.g., AND-split matched by XOR-join). 7PMG G4 ("model as structured as possible") aims to minimise mismatch.

### Connector heterogeneity

Diversity of connector types (entropy across {AND, XOR, OR}).

### Cross-connectivity

A graph-theoretic metric (Vanderfeesten et al.) capturing how strongly nodes are causally linked. Higher cross-connectivity correlates with fewer errors.

### Cyclomatic complexity (graph-theoretic)

$$V(G) = |A| - |N| + 2 \cdot p$$

where *p* is the number of connected components. Adopted from McCabe's software complexity. For most well-formed process models *p* = 1.

## Interpretation

| Metric (as it increases) | Effect on understanding | Effect on error probability |
|---|---|---|
| Size (\|N\|, \|A\|) | ↓ | ↑ |
| Density | ↓ | ↑ |
| Avg/Max connector degree | ↓ | ↑ |
| CFC | ↓ | ↑ |
| Depth | ↓ | ↑ |
| Mismatch | ↓ | ↑ |
| OR-connector count | ↓ | ↑ |
| Cross-connectivity | ↑ | ↓ |

Empirically validated on EPC corpora; replication on BPMN is partial.
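
As a concrete reading of the CFC and cyclomatic-complexity definitions under Core metrics, the following Python sketch computes both over a plain node/arc representation. The node-kind strings, the function names, and the rule that a connector only counts as a split when it has at least two outgoing arcs are assumptions made for illustration.

```python
def cfc(nodes: dict[str, str], arcs: list[tuple[str, str]]) -> int:
    """Control-flow complexity: sum of per-split contributions."""
    total = 0
    for node, kind in nodes.items():
        fan_out = sum(1 for source, _ in arcs if source == node)
        if fan_out < 2:
            continue  # joins and non-branching nodes are not splits
        if kind == "xor":
            total += fan_out           # exactly one branch is taken
        elif kind == "or":
            total += 2 ** fan_out - 1  # any non-empty subset of branches
        elif kind == "and":
            total += 1                 # all branches, a single state
    return total


def connected_components(nodes: dict[str, str],
                         arcs: list[tuple[str, str]]) -> int:
    """Count weakly connected components with a small union-find."""
    parent = {n: n for n in nodes}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for source, target in arcs:
        parent[find(source)] = find(target)
    return len({find(n) for n in nodes})


def cyclomatic(nodes: dict[str, str], arcs: list[tuple[str, str]]) -> int:
    """V(G) = |A| - |N| + 2p."""
    return len(arcs) - len(nodes) + 2 * connected_components(nodes, arcs)


# Fragment with one split of each type (hypothetical ids).
frag_nodes = {"x": "xor", "o": "or", "a": "and",
              "t1": "task", "t2": "task", "t3": "task",
              "t4": "task", "t5": "task"}
frag_arcs = [("x", "t1"), ("x", "o"),
             ("o", "t2"), ("o", "t3"), ("o", "a"),
             ("a", "t4"), ("a", "t5")]

print(cfc(frag_nodes, frag_arcs))         # 2 + (2**3 - 1) + 1 = 10
print(cyclomatic(frag_nodes, frag_arcs))  # 7 - 8 + 2*1 = 1
```

The three-way OR-split alone contributes 7 of the 10 points, which shows why OR connectors weigh so heavily on this metric.
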
## Use in 7PMG

| 7PMG Guideline | Targets metric |
|---|---|
| G1 (few elements) | size |
| G2 (minimise routing paths) | avg/max connector degree |
| G3 (one start, one end) | reduces start/end count → enables soundness analysis |
| G4 (structured) | mismatch → 0 |
| G5 (avoid OR) | OR-connector count → 0 |
| G6 (verb-object labels) | label-ambiguity (orthogonal axis) |
| G7 (decompose if >50) | size threshold |

## Use in evaluation rubrics

For an as-is BPMN model produced from interview discovery:

1. Compute the metrics above with a static analysis tool (Camunda Modeler shows size; ProM plug-ins compute CFC, density, mismatch).
2. Flag thresholds: size > 50 (G7); max degree > 4 (G2 violation); any OR connector (G5); density > 0.1 (typically too dense).
3. Cross-reference with 7PMG guidelines to suggest fixes.

## Limitations

- **Predictive, not deterministic**: metrics correlate with errors but do not entail them. A high-CFC model may be perfectly correct; a low-CFC model may still hide a soundness violation.
- **Notation-specific weights**: metric weights were calibrated on EPCs; BPMN re-calibration is partial.
- **Pragmatic-only signal**: these metrics target understandability/error-likelihood, not [[concepts/sequal-framework|semantic]] correctness (whether the model matches reality) or [[concepts/soundness|behavioural]] correctness.
- **Aggregation hides locality**: a model with one very-high-degree connector may have acceptable averages; report max as well as avg.

## Tool support

- **ProM**: Petri-net / EPC plug-ins compute CFC, mismatch, structuredness.
- **Camunda Modeler**: element count, basic structural validation.
- **Signavio / SAP Signavio**: bottom-up metrics + 7PMG-style heuristics.
- **bpmn.io** ecosystem: programmatic access to BPMN graph for custom metric computation.
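
Once one of these tools exposes the metric values, the threshold step of the evaluation rubric could be scripted roughly as follows. This is a sketch under stated assumptions: `Thresholds` and `flag_model` are hypothetical names, "size" is taken to mean the node count |N|, and the cut-offs are simply the ones quoted in the rubric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Rubric cut-offs quoted in the text; adjust per project."""
    max_size: int = 50          # G7
    max_degree: int = 4         # G2
    max_or_connectors: int = 0  # G5
    max_density: float = 0.1    # rule of thumb, not a 7PMG guideline


def flag_model(node_count: int, max_degree: int, or_connectors: int,
               density: float,
               limits: Thresholds = Thresholds()) -> list[str]:
    """Return one message per exceeded threshold (empty list if none)."""
    flags = []
    if node_count > limits.max_size:
        flags.append(f"G7: size {node_count} > {limits.max_size}, "
                     "consider decomposing")
    if max_degree > limits.max_degree:
        flags.append(f"G2: max connector degree {max_degree} > "
                     f"{limits.max_degree}")
    if or_connectors > limits.max_or_connectors:
        flags.append(f"G5: {or_connectors} OR connector(s) present")
    if density > limits.max_density:
        flags.append(f"density {density:.3f} > {limits.max_density}")
    return flags


for message in flag_model(node_count=62, max_degree=5,
                          or_connectors=1, density=0.02):
    print(message)
```

One caveat on the density cut-off: a purely sequential model with n nodes has n − 1 arcs and therefore density 1/n, so any such model with fewer than 10 nodes exceeds 0.1. It may be worth applying that check only to larger models.
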
## Related

[[concepts/7pmg]] (the actionable distillation of these metrics) · [[concepts/process-model-quality]] · [[concepts/soundness]] (orthogonal correctness layer) · [[concepts/sequal-framework]]

## Sources

[[sources/2010-mendling-reijers-vanderaalst-7pmg]] (primary; empirical validation) · [[sources/2018-dumas-fundamentals-of-bpm]]