---
title: "Process Model Quality & Soundness — Evaluation Guide for BPMN Models from Qualitative Interviews"
type: synthesis
tags: [bpm, modelling, quality, soundness, 7pmg, sequal, bpmn, evaluation, llm-assisted-review, rubric, methodology]
sources:
  - "[[sources/2018-dumas-fundamentals-of-bpm]]"
  - "[[sources/2018-dumas-fundamentals-of-bpm-ch5-discovery]]"
  - "[[sources/1998-vanderaalst-verification-of-workflow-nets]]"
  - "[[sources/2010-mendling-reijers-vanderaalst-7pmg]]"
  - "[[sources/2006-krogstie-sindre-jorgensen-revised-sequal-framework]]"
  - "[[sources/2012-ottensooser-graphical-vs-textual]]"
  - "[[sources/2011-vanderaalst-process-mining-book]]"
  - "[[sources/2008-pesic-declare-manual]]"
created: 2026-05-04
updated: 2026-05-04
---

# Process Model Quality & Soundness — Evaluation Guide for BPMN Models from Qualitative Interviews

A consolidated, exhaustive guide for evaluating a BPMN model produced by a process analyst from qualitative interview data. Synthesised from five primary-source layers: SEQUAL ([[sources/2006-krogstie-sindre-jorgensen-revised-sequal-framework]]), Dumas's tripartite ([[sources/2018-dumas-fundamentals-of-bpm]] §5.4), 7PMG ([[sources/2010-mendling-reijers-vanderaalst-7pmg]]), workflow soundness ([[sources/1998-vanderaalst-verification-of-workflow-nets]]), and conformance dimensions ([[sources/2011-vanderaalst-process-mining-book]] Ch. 8 + Dumas §11.4.4). The pragmatic-quality layer is also empirically anchored in Ottensooser et al. ([[sources/2012-ottensooser-graphical-vs-textual]]).

The guide is **operational**: it concludes in §10 with an LLM-assisted review rubric — a structured prompt scaffold the analyst (or an LLM acting as reviewer) applies to a candidate BPMN model.

For the *upstream* interview-conduct guidance (how to elicit the content the model is built from), see the companion [[syntheses/interview-structuring-for-process-models]].

---

## 1. Scope and audience

**You** are a process analyst who has just completed (or is finalising) interviews with domain experts about an existing process. The deliverable is a **BPMN as-is process model** — a structured artefact intended to:

1. Be **validated** by the same domain experts.
2. Be **understandable** by stakeholders who did not participate in modelling.
3. Be **mechanically verifiable** so that obvious correctness flaws (deadlocks, dead activities, unreachable states) can be caught before the model is signed off.
4. Serve as a baseline for redesign, simulation, monitoring, or executable workflow specification.

The guide covers: what quality dimensions exist (§2), how each is verified for a BPMN model from interview data (§§3–8), how they trade off (§9), and how to operationalise the evaluation as an LLM-assisted review (§10). §11 covers anti-patterns; §12 the workflow; §13 acknowledged gaps.

---

## 2. The quality framework — five layers, one stack

A BPMN model from interviews is judged across **five layered concerns**, derived by integrating the SEQUAL semiotic framework with the BPM-textbook tripartite and the formal verification layer.

```
┌─────────────────────────────────────────────────────┐
│ 5. Goal-fulfilment / organisational quality         │  ← does the model serve the modelling goal?
├─────────────────────────────────────────────────────┤
│ 4. Pragmatic quality                                │  ← is the model understandable / actionable?
├─────────────────────────────────────────────────────┤
│ 3. Semantic quality                                 │  ← does the model match reality?
├─────────────────────────────────────────────────────┤
│ 2. Syntactic quality                                │  ← is the model well-formed?
│    2a. Notation-syntactic (BPMN rules)              │
│    2b. Behavioural-syntactic (soundness)            │
├─────────────────────────────────────────────────────┤
│ 1. Physical / empirical quality                     │  ← is the model accessible & readable?
└─────────────────────────────────────────────────────┘
```

The five layers correspond to SEQUAL quality types ([[concepts/sequal-framework]]):

| This guide's layer | SEQUAL types it covers | Verification mode |
|---|---|---|
| 1. Physical / empirical | physical + empirical | Tooling check |
| 2a. Notation-syntactic | syntactic (M ⊆ L) | Mechanical (BPMN tool) |
| 2b. Behavioural-syntactic | (extension — formal correctness inside the syntactic bucket per Dumas §5.4) | Mechanical ([[concepts/soundness\|soundness]] checker) |
| 3. Semantic | semantic + perceived semantic | Domain-expert validation |
| 4. Pragmatic | pragmatic | Reader test + 7PMG/metrics |
| 5. Organisational | social + organisational | Process owner sign-off |

**Conformance-checking dimensions** (fitness, precision, generalisation, simplicity — [[concepts/conformance-checking]]) belong to a *different axis*: they evaluate a model against an event log, not against reality or audience. They become relevant *after* discovery (when monitoring/mining begins). Section 8 covers them briefly for completeness; for an as-is model with no log, they do not yet apply.

---

## 3. Layer 1 — Physical and empirical quality

Often invisible because most modern BPMN tooling solves it by default; still worth a single check.

### 3.1 Physical quality

- The model exists as a stored, accessible artefact (BPMN 2.0 XML or equivalent).
- Single source of truth: avoid PowerPoint sketches, hand-redrawn versions, screenshot-only models.
- Versioned in a system the audience can access.

### 3.2 Empirical quality

- **Layout**: left-to-right or top-to-bottom flow; no crossing arcs unless unavoidable.
- **Spacing**: connectors and tasks aligned on a grid.
- **Colour** (if used): purposeful and consistent (e.g. lane colours for organisations).
- **Font**: legible at the size the model will be read.

Tools like Camunda Modeler, bpmn.io, Signavio handle most of this automatically.
Bad layout *will* sink pragmatic quality regardless of correctness — Dumas §5.1.2 expert-box note: "neat diagrams engage stakeholders".

---

## 4. Layer 2a — Notation-syntactic quality (BPMN rules)

Mechanical, tool-enforceable, non-negotiable.

### 4.1 Element-level rules

- **Sequence flows** connect flow elements only (events, activities, gateways) — not data objects, not lanes.
- **Tasks** have exactly one incoming and one outgoing sequence flow (in basic models). Multiple flows must go through gateways.
- **Events**:
  - **Start events**: no incoming sequence flow.
  - **End events**: no outgoing sequence flow.
  - **Intermediate events**: catching or throwing variants typed correctly (message, timer, error, signal, escalation, compensation).
  - **Boundary events** attached to activities, not to other events or gateways.
- **Gateways**:
  - Logical type set (XOR / AND / OR / event-based / complex).
  - Exclusive gateway (XOR) should have an explicit *default* flow if the conditions might all be false.
  - Parallel gateway (AND) joins all incoming branches before continuing.
  - Inclusive gateway (OR) — flag for review per [[concepts/7pmg]] G5.
- **Pools / lanes**: each task assigned to exactly one lane; a pool encapsulates one organisational entity.
- **Message flows**: cross pools only; never within a pool.
- **Data objects / data stores**: connected via association (dotted line), not sequence flow.

### 4.2 Common violations

- Sequence flow crossing pool boundaries (should be message flow).
- Task with two incoming sequence flows (an implicit XOR-merge under BPMN semantics, so every arriving token triggers the task; discouraged, use an explicit gateway).
- End event followed by sequence flow ("End" must terminate).
- Gateway with single in + single out (vestigial — remove).

### 4.3 Tooling

- **Camunda Modeler** — real-time syntactic validation; refuses some illegal constructs.
- **bpmn.io** — same engine.
- **Signavio / SAP Signavio** — server-side validation + suggestion engine.
- **bpmnlint** (open-source) — linting rules library; integrate into CI.
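
Several of the element-level rules in §4.1 and the violations in §4.2 reduce to counting incoming and outgoing sequence flows per element. The following is a minimal sketch of that idea, not a replacement for bpmnlint or a modeller's built-in validation. It assumes the model is available as BPMN 2.0 XML in the standard model namespace; the function name `lint_bpmn`, the `SAMPLE` document, and the finding messages are hypothetical illustrations.

```python
import xml.etree.ElementTree as ET
from collections import Counter

BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"  # BPMN 2.0 model namespace
TASK_TAGS = {"task", "userTask", "serviceTask", "manualTask",
             "sendTask", "receiveTask", "scriptTask", "businessRuleTask"}
GATEWAY_TAGS = {"exclusiveGateway", "parallelGateway", "inclusiveGateway",
                "eventBasedGateway", "complexGateway"}


def lint_bpmn(xml_text):
    """Return (element_id, message) findings for a few rules from 4.1 and 4.2."""
    root = ET.fromstring(xml_text)

    # Count sequence flows entering and leaving each element id.
    incoming, outgoing = Counter(), Counter()
    for flow in root.iter(f"{{{BPMN_NS}}}sequenceFlow"):
        outgoing[flow.get("sourceRef")] += 1
        incoming[flow.get("targetRef")] += 1

    findings = []
    prefix = f"{{{BPMN_NS}}}"
    for el in root.iter():
        if not el.tag.startswith(prefix):
            continue  # skip diagram-interchange and vendor extension elements
        kind = el.tag[len(prefix):]
        eid = el.get("id")
        n_in, n_out = incoming[eid], outgoing[eid]
        if kind == "startEvent" and n_in > 0:
            findings.append((eid, "start event has an incoming sequence flow"))
        elif kind == "endEvent" and n_out > 0:
            findings.append((eid, "end event has an outgoing sequence flow"))
        elif kind in TASK_TAGS:
            if n_in > 1:
                findings.append((eid, "task has several incoming flows; use a gateway"))
            if n_out > 1:
                findings.append((eid, "task has several outgoing flows; use a gateway"))
        elif kind in GATEWAY_TAGS and n_in == 1 and n_out == 1:
            findings.append((eid, "gateway with single in and single out is vestigial"))
    return findings


# Hypothetical toy model: the end event loops back into task A.
SAMPLE = """<definitions xmlns="http://www.omg.org/spec/BPMN/20100524/MODEL">
  <process id="p">
    <startEvent id="start"/>
    <task id="A" name="Check claim"/>
    <exclusiveGateway id="g1"/>
    <endEvent id="end"/>
    <sequenceFlow id="f1" sourceRef="start" targetRef="A"/>
    <sequenceFlow id="f2" sourceRef="A" targetRef="g1"/>
    <sequenceFlow id="f3" sourceRef="g1" targetRef="end"/>
    <sequenceFlow id="f4" sourceRef="end" targetRef="A"/>
  </process>
</definitions>"""

if __name__ == "__main__":
    for element_id, message in lint_bpmn(SAMPLE):
        print(f"{element_id}: {message}")
```

On the toy model this reports task `A` (two incoming flows), gateway `g1` (vestigial), and the end event (outgoing flow). The sketch ignores pools, message flows, and boundary events, which need the richer rule sets of the tools listed above.
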

---

## 5. Layer 2b — Behavioural-syntactic quality (soundness)

The structural rules of §4 do not catch *behavioural* errors: deadlocks, livelocks, dead activities, orphan tokens. Those need [[concepts/soundness|soundness]] verification per [[sources/1998-vanderaalst-verification-of-workflow-nets]].

### 5.1 The soundness criterion

A BPMN model translates (under standard semantics) to a [[concepts/workflow-net|workflow net]]. The model is **sound** iff the WF-net is sound, i.e. iff:

1. **Option to complete**: from every reachable state, the end event is reachable. *No deadlocks.*
2. **Proper completion**: when the end event fires, no orphan tokens are left in the process. *No tokens stuck on parallel branches.*
3. **No dead transitions / activities**: every modelled activity is executable from some reachable state. *No unreachable work.*

These are exactly the three clauses Dumas §5.4 paraphrases as "option to complete · proper completion · no dead activities".

### 5.2 Why soundness is decidable in practice

For **free-choice** WF-nets — the class most naturally produced by BPMN with AND/XOR gateways — soundness is verifiable in **polynomial time** ([[sources/1998-vanderaalst-verification-of-workflow-nets]] Theorem 12). Theorem 11 shows the reduction: PN sound ⇔ extended net (PN + transition o→i) is live and bounded. Standard Petri-net analysis tools then suffice.

In industrial WFMS deployments the author surveyed (Dutch Customs, Justice, banks, insurance), almost all WF-nets are free-choice or *almost* free-choice (free-choice + inquiry transitions, Theorem 16 — also polynomial-time decidable).
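
To make the three soundness clauses of §5.1 concrete, the sketch below checks them by brute-force exploration of the reachable markings of a small WF-net. This is deliberately not the polynomial-time procedure referenced in §5.2; it is an exhaustive search that is only practical for small, bounded nets. The net encoding (a dict mapping each transition to its input and output places), the place names `i` and `o`, and the function names are assumptions made for illustration.

```python
from collections import Counter, deque


def fire(marking, pre, post):
    """Return the successor marking, or None if the transition is not enabled."""
    tokens = Counter(dict(marking))
    if any(tokens[p] < 1 for p in pre):
        return None
    for p in pre:
        tokens[p] -= 1
    for p in post:
        tokens[p] += 1
    return frozenset((p, n) for p, n in tokens.items() if n > 0)


def check_soundness(transitions, source="i", sink="o", max_states=10_000):
    """Check the three classical soundness clauses by exhaustive exploration."""
    initial = frozenset({(source, 1)})
    final = frozenset({(sink, 1)})

    # Forward exploration: build the reachability graph from the initial marking.
    successors, fired = {}, set()
    queue = deque([initial])
    while queue:
        marking = queue.popleft()
        if marking in successors:
            continue
        if len(successors) >= max_states:
            raise RuntimeError("state space too large; the net may be unbounded")
        successors[marking] = []
        for name, (pre, post) in transitions.items():
            nxt = fire(marking, pre, post)
            if nxt is not None:
                fired.add(name)
                successors[marking].append(nxt)
                queue.append(nxt)

    # Backward closure: which reachable markings can still reach the final one?
    can_finish = {final} if final in successors else set()
    changed = True
    while changed:
        changed = False
        for marking, nexts in successors.items():
            if marking not in can_finish and any(n in can_finish for n in nexts):
                can_finish.add(marking)
                changed = True

    return {
        "option_to_complete": all(m in can_finish for m in successors),
        "proper_completion": all(m == final for m in successors
                                 if dict(m).get(sink, 0) > 0),
        "no_dead_transitions": fired == set(transitions),
    }


# AND-split matched by an AND-join: a block-structured, sound fragment.
SOUND_NET = {
    "split": ({"i"}, {"p1", "p2"}),
    "A": ({"p1"}, {"p3"}),
    "B": ({"p2"}, {"p4"}),
    "join": ({"p3", "p4"}, {"o"}),
}

# XOR-split feeding an AND-join: only one branch ever gets a token.
DEADLOCK_NET = {
    "chooseA": ({"i"}, {"p1"}),
    "chooseB": ({"i"}, {"p2"}),
    "join": ({"p1", "p2"}, {"o"}),
}

if __name__ == "__main__":
    print(check_soundness(SOUND_NET))
    print(check_soundness(DEADLOCK_NET))
```

The second net encodes an XOR-split followed by an AND-join: the join never becomes enabled, so the search reports a failed option to complete and a dead transition. For real models, a dedicated checker remains the appropriate tool.
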

### 5.3 The four behavioural anomalies (Dumas §5.4.1)

Soundness violations manifest as one of four anomalies — each with its own concept page:

| Anomaly | Symptom | Typical cause | Violates |
|---|---|---|---|
| **[[concepts/deadlock]]** | Token stuck; instance never completes | XOR-split → AND-join (mismatch); branch injected into AND-block | (1) option to complete |
| **[[concepts/livelock]]** | Token cycles in loop forever | Loop with always-true exit, or no exit gateway, or unsatisfiable exit | (1) option to complete |
| **[[concepts/lack-of-synchronization]]** | Multiple tokens on same flow; activity executes more times than intended | AND-split → XOR-join (mismatch); OR-split → XOR-join | (2) proper completion |
| **[[concepts/dead-activity]]** | Activity never executable in any instance | Provably-false branch condition; downstream of a deadlock; disconnected | (3) no dead activities |

A sound model contains *none* of these. Verifying soundness = checking the absence of all four. See [[concepts/token-semantics]] for the underlying token-flow reasoning.

**Common appearance patterns** in models from interviews:

| Pattern | Anomaly | Why it appears |
|---|---|---|
| Mismatched AND-split / XOR-join | Lack of synchronisation | Branch added late without updating join |
| Mismatched XOR-split / AND-join | Deadlock | Analyst inverts the gateway type |
| Activity reachable only via a gateway condition that's never true | Dead activity | "What-if" branches added speculatively |
| Implicit join (multiple sequence flows into a task) | Lack of synchronisation | Quick-and-dirty modelling shortcut |
| Loop without exit condition | Livelock | Iterate-without-checking pattern from interview |
| Loop with always-true exit | Livelock | Inverted loop polarity |

#### Block structure as anomaly prevention

[[concepts/block-structure]] (single-entry-single-exit fragments with matching split/join type) is **sound by construction** — no anomaly can arise within a block.
Most discovery-phase models can be built block-structured; deviations should be the conscious exception, not the default.

### 5.4 Verification tools

- **Camunda Modeler** — basic structural validation; not full soundness.
- **Signavio** — soundness checks built in (commercial).
- **ProM** — academic gold-standard; multiple soundness plug-ins.
- **WOFLAN** — original [[sources/1998-vanderaalst-verification-of-workflow-nets]] tool.
- **bpmn-js-token-simulation** — interactive token-flow simulation; useful for visualising deadlocks.

### 5.5 Soundness variants

Beyond classical soundness, the literature defines (referenced-not-yet-ingested):

- **Weak soundness** — drops "no dead transitions".
- **Relaxed soundness** — every transition participates in *some* successful run.
- **Lazy soundness** — tolerates activities that are still running after the end has been reached.
- **k-soundness** — soundness for k concurrent cases (k tokens in the source place).
- **Data-aware soundness** — extends to data-flow correctness.

For an as-is model from interviews, **classical soundness is the right gate**. The weaker variants are useful when a model deliberately keeps dead branches as documentation ("we know this is a dead end, but it is there in policy").

---

## 6. Layer 3 — Semantic quality (does the model match reality?)

This is where the bulk of analyst skill is invested. SEQUAL distinguishes three flavours ([[concepts/sequal-framework]]):

| Flavour | Set correspondence | When it applies |
|---|---|---|
| **Descriptive semantic quality** | M vs D (current domain) | as-is model |
| **Prescriptive semantic quality** | M vs D^O (optimal domain) | to-be model |
| **Semantic quality** *(formerly "perceived")* | K vs M | model fits stakeholder mental models |

For an as-is BPMN model from interviews, **descriptive** quality dominates. Two sub-criteria, both from Dumas §5.4.2 / Krogstie 2006:

### 6.1 Validity (M ⊆ D — no false statements)

Every statement the model makes is consistent with reality.

**How to verify**:

- **Translate the model to natural language** before showing to domain experts ([[sources/2012-ottensooser-graphical-vs-textual]]: experimental n=196 evidence shows untrained readers gain *no statistically significant* understanding from BPMN alone, p=0.15; written use cases help both trained and untrained readers, p<0.01).
- A useful artefact is a **structured written use case** (trigger, primary actor, main success scenario as numbered steps, extensions for exceptions) — present it *before* the BPMN; that order maximises comprehension across audiences (Ottensooser H5, p<0.01).
- Walk the expert through the natural-language version and let them point out falsifications.

**Common validity failures**:

- An activity that "sometimes happens" modelled as always happening (no XOR gateway).
- Activity granularity mismatch — model shows a step the expert doesn't recognise as a coherent task.
- Wrong actor — task assigned to the wrong lane.
- Hidden upstream dependency — an activity is shown as triggered by event X when it actually requires X *and* Y.

### 6.2 Completeness (D ⊆ M — no missing essential paths)

No essential alternative path is missing — but per SEQUAL, *absolute* completeness is infeasible (D too large). Practical target: **feasible completeness** — all paths the modelling goal requires are present.

**How to verify**:

- Active probing: "what other outcome is possible here?"; "who else could perform this?"; "is there a scenario where this branch is skipped?"
- Sunny-day vs rainy-day balance — the well-known interview pitfall ([[sources/2018-dumas-fundamentals-of-bpm-ch5-discovery]]). Use rainy-day questions derived from the exception taxonomy:
  - Internal business exception (out-of-stock, ineligible application).
  - External business exception (customer cancels, supplier defaults).
  - Internal technology exception (system unresponsive).
  - Activity timeout (deadline missed).
- Cross-handoff verification — when activity A is followed by activity B in the model, separately ask the performer of B what they receive and from whom; mismatches reveal incomplete or wrong handoffs.

**Common completeness failures**:

- No exception path from any activity (sunny-day model only).
- No timeout handling for activities with implicit deadlines.
- Backward-flow (rework) loops missing.
- Authorisation/approval branches absent.

### 6.3 Iteration discipline

Expect ≥2 validation iterations per domain expert ([[sources/2018-dumas-fundamentals-of-bpm-ch5-discovery]] §5.2.2). Final approval from the **process owner** closes semantic validation (§5.4.2). Mark uncertainty *on the model* (e.g. coloured sticky / annotation) so the next interview has targeted questions.

### 6.4 Why the analyst cannot validate semantically alone

Dumas §5.1.2 challenge 1 — **fragmented knowledge**: tasks are split across specialists; each expert has deep local knowledge but conflicting upstream/downstream assumptions. The analyst can only catch this by interviewing both ends of every handoff and confronting the disagreement.

---

## 7. Layer 4 — Pragmatic quality (is the model understandable and actionable?)

SEQUAL's revised pragmatic quality is *more demanding* than mere comprehension: the model must enable **learning** (knowledge gain) and **action** (domain change toward goal). For an as-is model, action-enablement reduces to "the model can be read by stakeholders well enough to ground the next-phase decision" (analysis, redesign, sign-off).

### 7.1 7PMG — the operational toolkit

[[concepts/7pmg]] is the empirically-grounded distillation of pragmatic-quality engineering for control-flow models:

| # | Guideline | Mechanical check |
|---|---|---|
| **G1** | Few elements | Count nodes; flag if growing without reason |
| **G2** | Min routing paths per element | Max connector degree (in+out) ≤ 4; avg ≤ 3 |
| **G3** | One start, one end | Count start/end events |
| **G4** | Structured (split-join matched) | Mismatch metric → 0 |
| **G5** | Avoid OR | OR-connector count → 0 |
| **G6** | Verb-object labels | Label-style audit |
| **G7** | Decompose if >50 | Total elements ≤ 50 per view |

Apply the guidelines as transformations — they preserve behaviour modulo branching bisimulation. The model becomes more readable without behavioural change.

### 7.2 Empirical evidence

[[sources/2010-mendling-reijers-vanderaalst-7pmg]] cites the underlying studies:

- **Process model understanding** (n=73 across TU/e + Madeira + Vienna): OR-joins and average connector degree negatively correlate with comprehension.
- **Error probability** on SAP Reference Model (600 EPCs) and an industry corpus (2000 EPCs): size and complexity drive errors; >50 elements → >50% error probability.
- **Label ambiguity** (n=29 postgrad experiment): verb-object significantly less ambiguous and more useful than action-noun.

[[sources/2012-ottensooser-graphical-vs-textual]] additionally shows (n=196, p<0.01) that **for non-experts, a structured written use case presented BEFORE the BPMN diagram yields the highest comprehension across all reader groups**. Implication: pragmatic quality for mixed audiences requires pairing the BPMN with a written companion document.
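
The mechanical checks in the 7PMG table of §7.1 that rely only on counting (G2, G3, G5, G7) can be sketched over a plain node-and-arc representation. The thresholds below are the ones stated in that table; the node-kind vocabulary (`start`, `end`, `task`, `xor`, `and`, `or`), the function name `check_7pmg`, and the example model are assumptions for illustration. G4 (structuredness) and G6 (label style) need more than counting and are left out.

```python
from collections import Counter

CONNECTORS = {"xor", "and", "or"}  # assumed labels for gateway node kinds


def check_7pmg(nodes, arcs):
    """Flag counting-based 7PMG guidelines for a model given as nodes and arcs.

    nodes: dict mapping node id -> kind; arcs: list of (source id, target id).
    """
    # Degree = incoming + outgoing arcs per node.
    degree = Counter()
    for src, tgt in arcs:
        degree[src] += 1
        degree[tgt] += 1

    connector_degrees = [degree[n] for n, kind in nodes.items() if kind in CONNECTORS]
    max_degree = max(connector_degrees, default=0)
    avg_degree = (sum(connector_degrees) / len(connector_degrees)
                  if connector_degrees else 0.0)
    kinds = Counter(nodes.values())

    flags = []
    if max_degree > 4 or avg_degree > 3:
        flags.append("G2")  # routing paths per connector too high
    if kinds["start"] != 1 or kinds["end"] != 1:
        flags.append("G3")  # not exactly one start and one end event
    if kinds["or"] > 0:
        flags.append("G5")  # OR connectors present
    if len(nodes) > 50:
        flags.append("G7")  # candidate for decomposition
    return {"size": len(nodes), "max_degree": max_degree,
            "avg_degree": avg_degree, "flags": flags}


# Hypothetical interview-derived fragment: two triggers merged by an OR-join.
NODES = {"s1": "start", "s2": "start", "A": "task", "B": "task",
         "g": "or", "e": "end"}
ARCS = [("s1", "A"), ("s2", "B"), ("A", "g"), ("B", "g"), ("g", "e")]

if __name__ == "__main__":
    print(check_7pmg(NODES, ARCS))
```

For the example fragment the sketch flags G3 (two start events) and G5 (one OR connector); the single connector has degree 3, which stays within the G2 limits.
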

### 7.3 Bottom-up complexity metrics

Where 7PMG is the *qualitative* checklist, [[concepts/process-model-complexity-metrics]] provides the underlying quantitative measures:

| Metric | What it captures | 7PMG link |
|---|---|---|
| **\|N\|, \|A\|** | Size | G1, G7 |
| **Density = \|A\| / (\|N\|·(\|N\|-1))** | Interconnectedness | G1 |
| **avg / max connector degree** | Routing complexity per node | G2 |
| **CFC (Cardoso Control-Flow Complexity)** | Path count under split semantics | G2, G5 |
| **Mismatch** | Splits unmatched by joins of the same type | G4 |
| **OR-connector count** | OR usage | G5 |
| **Cross-connectivity** | Causal coherence | (cross-cutting) |
| **Depth** | Nesting level | (G7) |

Compute these statically with ProM or programmatic BPMN parsers; report as a quality dashboard alongside the model.

### 7.4 Pragmatic-quality red flags from interview-derived models

| Red flag | Likely cause | Fix |
|---|---|---|
| 70+ activities in one diagram | Insufficient sub-process abstraction | Apply G7; identify single-entry-single-exit blocks |
| Multiple start events (each interview's "trigger" became an event) | Discovery-phase artefact | Apply G3; use one start with subsequent XOR |
| OR-join in the middle of the model | Analyst hedging on synchronisation | Apply G5; use XOR with explicit conditions |
| Long noun-phrase labels ("Complaint analysis and processing") | Dictation-style transcription from interviews | Apply G6; rewrite as verb-object |
| Connector with degree 6+ | Analyst aggregating multiple choice points | Apply G2; split into nested gateways |

---

## 8. Layer 5 — Organisational quality, social quality, conformance dimensions

### 8.1 Organisational quality (M vs G)

Does the model fulfil the modelling goal? An as-is model produced for *redesign* has different goal-fitness criteria than one produced for *documentation* or for *executable workflow specification*.
For each goal:

| Goal | Quality emphasis |
|---|---|
| Documentation | Pragmatic (readable for newcomers) + semantic (validity) |
| Compliance evidence | Semantic (completeness on rainy-day paths) + soundness |
| Redesign baseline | Semantic (validity, completeness) + complexity metrics for diagnostics |
| Simulation | Semantic + soundness + quantitative annotations (durations, probabilities) |
| Executable workflow | Soundness + executable BPMN constraints (data types, mappings, [[frameworks/dmn]] for rules) |

Closing organisational quality means **process owner sign-off**, not analyst self-assessment ([[sources/2018-dumas-fundamentals-of-bpm-ch5-discovery]] §5.4.2).

### 8.2 Social quality (agreement among stakeholders)

When multiple stakeholders reviewed the model and disagree, what does "the model" represent?

- Workshop discovery surfaces disagreement faster than serial interviews ([[methods/process-discovery-methods]] §5.2.3).
- Where disagreement persists, **annotate the model** with the disagreement as documentation rather than silently picking one side.
- The model can be sound and pragmatic and still **socially un-agreed** — that is a separate quality axis SEQUAL flags.

### 8.3 Conformance dimensions (when an event log exists)

[[concepts/conformance-checking]] provides four log-relative quality dimensions that become applicable *after* discovery, when monitoring or process mining begins:

- **Fitness** — can the model replay the traces in the log?
- **Precision** — does the model only allow behaviour seen in the log, or over-generalise?
- **Generalisation** — will the model handle unseen-but-plausible future behaviour?
- **Simplicity** — Occam: smallest adequate explanation.

These trade off — the conformance equivalent of the [[concepts/devils-quadrangle|Devil's Quadrangle]].
They are **not relevant for a freshly-built as-is model with no log**, but should be revisited once an event log is available (the model can then be cross-validated against actual executions). --- ## 9. How the layers trade off The layers are not all independent. Some interact: | Tension | Description | |---|---| | **Soundness ↔ pragmatic structuredness** | 7PMG G4 "structured" simplifies soundness analysis (free-choice) — they reinforce. | | **Semantic completeness ↔ pragmatic simplicity** | Adding rainy-day paths (semantic completeness) increases size (pragmatic G1, G7 violations). Resolution: decompose to sub-processes (G7). | | **Pragmatic G1 (few elements) ↔ G2 (low routing degree)** | Reducing degree often requires adding intermediate connectors. 7PMG explicitly notes this; the paper's pragmatic priority ordering is workshop-derived. | | **Pragmatic G6 (verb-object) ↔ semantic specificity** | "Write down complaint" is shorter than "Write down complaint with form AZ2" but loses information. Trade depends on modelling goal — executable models need specificity; documentation can drop it. | | **Soundness ↔ semantic validity** | A model can be sound (no deadlocks) yet semantically invalid (e.g., wrong activity order). Soundness is a gate, not a target. | | **Pragmatic ↔ social** | A perfectly-structured model may suppress disagreement. Annotated disagreement is more socially valid but pragmatically messier. | There is no global optimum. **Resolve by modelling goal** (organisational quality): compliance-driven models prioritise completeness + soundness; redesign-driven models prioritise complexity-metrics + structure for diagnosis; documentation-driven models prioritise pragmatic G6 + semantic validity. --- ## 10. LLM-assisted review rubric This section is the operational deliverable: a structured prompt the analyst (or an LLM running review automation) applies to a candidate BPMN model. ### 10.1 Required inputs to the LLM 1. 
**The BPMN model** — preferred: BPMN 2.0 XML (machine-parsable). Acceptable: rendered diagram (image) + activity/gateway list. Acceptable for behavioural review: textual representation (activity → next-activity table + gateway type per fork). 2. **Modelling goal** — one of: documentation · compliance · redesign baseline · simulation · executable workflow. Influences prioritisation across layers. 3. **Audience description** — who will read the model? Domain experts in this process · domain experts in adjacent processes · trained business analysts · executives · modelling novices. 4. **Available companion artefacts** — written use case · interview transcripts · prior model version · event log (if any). ### 10.2 The structured review prompt ```text You are a BPMN model reviewer applying the Process Model Quality & Soundness Evaluation Guide. Output a structured report with five layered findings (L1–L5) and a final verdict. For each layer, output: - VERDICT: pass / pass-with-issues / fail / not-evaluable - FINDINGS: bullet list of concrete issues with element-level references (use BPMN element IDs or labels). - FIXES: for each finding, propose the smallest fix that resolves it without changing modelled behaviour, citing the relevant guideline (e.g. "Apply 7PMG G3" or "Restructure to satisfy proper completion"). - SEVERITY: critical (must fix) / high / medium / low. LAYER 1 — PHYSICAL & EMPIRICAL - Is the model rendered legibly? - Layout: predominant flow direction, crossing arcs, alignment. - Spacing / font / colour — purposeful? LAYER 2A — NOTATION-SYNTACTIC (BPMN rules) Check: - Sequence flows connect only flow elements. - Tasks single-in / single-out (or via gateway). - Start events have no incoming flow; end events no outgoing. - Boundary events attached to activities only. - Gateway types set; XOR has default if conditions are partial. - Pool/lane assignment correct; message flows cross pools only. - Data objects connected via association. 
LAYER 2B — BEHAVIOURAL-SYNTACTIC (soundness) Reason about the WF-net translation. Verify the three classical soundness conditions: 1. Option to complete: from every reachable state, the end event is reachable. Identify any deadlocks. 2. Proper completion: when the end event fires, no orphan tokens remain on parallel branches. Identify any "lost" tokens. 3. No dead transitions: every activity is executable from some reachable state. Identify any unreachable activities. Mention applicable verification: free-choice WF-net implies polynomial-time decidability (Aalst 1997, Theorem 12). LAYER 3 — SEMANTIC (descriptive — model vs reality) This layer requires the candidate model alongside the available companion artefacts (use case, transcripts). - VALIDITY: are all modelled elements consistent with the available evidence? Flag any element not supported by the source material. - COMPLETENESS: identify likely-missing elements by applying the exception taxonomy: a) internal business exception b) external business exception c) internal technology exception d) activity timeout For each activity, ask: is at least one rainy-day path modelled? If transcripts are absent, mark this layer "not-evaluable" and suggest the analyst conduct a completeness validation conversation with the domain expert. - HANDOFFS: for each cross-lane handoff, flag the receiving activity and ask whether the receiver's expectations are documented and consistent. LAYER 4 — PRAGMATIC (understandability & action-enablement) Apply 7PMG (Mendling, Reijers, Aalst 2010): - G1 size: count nodes; flag if > 30 in single view. - G2 routing: max connector degree; flag if > 4. avg if > 3. - G3 single start/end: count; flag if > 1 of either. - G4 structuredness: identify split connectors without matching joins of the same type (mismatch metric); flag each. - G5 OR usage: count OR connectors; flag any. - G6 labels: classify each activity label as verb-object, action-noun, or other; flag non-verb-object. 
- G7 decomposition: if total elements > 50, propose subprocess candidates (single-entry-single-exit blocks). Also compute and report: - Density: |A| / (|N|·(|N|-1)). Flag if > 0.10. - Cyclomatic complexity: |A| - |N| + 2. Flag if > 10. For the audience type, comment on whether the BPMN should be paired with a written use case (Ottensooser et al. 2012: non-experts gain no statistically significant understanding from BPMN alone; written use cases help all readers; UC-then-BPMN ordering maximises comprehension). LAYER 5 — ORGANISATIONAL & SOCIAL - Does the model fit the modelling goal? (state goal upfront) - Are there annotated disagreements / parking-lot items? - Is process-owner sign-off recorded? FINAL VERDICT - If any layer is "fail" or has "critical" severity findings: the model is NOT ready for sign-off; list the gating issues. - If "pass-with-issues": list the highest-priority fix order using this default priority (override by modelling goal): 1. Soundness (L2b critical) — must fix. 2. Notation-syntactic (L2a critical) — must fix. 3. Semantic completeness (L3) — must fix for compliance/redesign goals. 4. 7PMG G3 single start/end (enables soundness). 5. 7PMG G5 (eliminate OR). 6. 7PMG G4 (structuredness). 7. 7PMG G2 (degree reduction). 8. 7PMG G1 / G7 (size / decomposition). 9. 7PMG G6 (label rewrite). - If "pass": confirm sign-off readiness. Output JSON: { "L1_physical_empirical": {...}, "L2a_notation_syntactic": {...}, "L2b_behavioural_syntactic_soundness": {...}, "L3_semantic": {...}, "L4_pragmatic": { "G1_size": {...}, "G2_routing": {...}, "G3_start_end": {...}, "G4_structuredness": {...}, "G5_or_usage": {...}, "G6_labels": {...}, "G7_decomposition": {...}, "metrics": {"|N|": ..., "|A|": ..., "density": ..., "max_degree": ..., "avg_degree": ..., "CFC": ..., "cyclomatic": ..., "mismatch_count": ...} }, "L5_organisational_social": {...}, "verdict": "pass" | "pass-with-issues" | "fail" | "not-evaluable", "priority_fix_list": [...] 
} ``` ### 10.3 What the LLM cannot do - **Layer 3 (semantic) requires evidence**: without transcripts, use cases, or expert access, semantic validity is unknowable — the LLM should mark "not-evaluable" rather than hallucinate. This is a hard limit: the [[sources/2018-dumas-fundamentals-of-bpm-ch5-discovery]] §5.1.2 challenge 1 — fragmented knowledge — applies to the LLM as much as to the analyst. The model is not the territory. - **Layer 5 organisational sign-off** is a human commitment, not an LLM artefact. The LLM can flag missing sign-off but not provide it. - **Soundness on non-trivial BPMN** may require a real verification tool (WOFLAN, ProM, Camunda) — the LLM can flag candidate violations but should not claim definitive soundness for complex models without tool corroboration. ### 10.4 Calibration notes Run the rubric on a known-good model first to baseline the LLM's calibration. Common drift patterns: - LLMs over-flag G1 size violations on intentionally large but well-decomposed models — anchor with G7 decomposition reasoning. - LLMs sometimes confuse XOR-with-default-flow as an OR construct — verify with explicit gateway-type metadata. - LLMs may confidently call a model "sound" without doing the formal reduction — require the LLM to *construct the WF-net translation* explicitly before claiming soundness. --- ## 11. Anti-patterns (top 12) Consolidating Dumas §5.4 + 7PMG paper §3.4 + interview-synthesis observations: 1. **Sunny-day-only model** — no rainy-day paths. Most common discovery failure. 2. **Multiple start events** — one per interview "trigger". Violates G3. 3. **Implicit join** — multiple sequence flows into a task without a gateway. 4. **OR-join with vague semantics** — analyst hedging. Replace with XOR or AND. 5. **Mismatched split-join types** — AND-split → XOR-join (proper completion violation). 6. **Activity at micro-step level** ("put document on fax machine") — violates Dumas activity-level discipline. 7. 
**Single-source authority for cross-lane handoff** — receiving end never validated. 8. **Long noun-phrase labels** — violates G6. 9. **Connector with degree 6+** — should split. 10. **Loop without exit condition** — livelock; soundness violation. 11. **Showing raw BPMN to domain expert for validation** — Ottensooser counter-evidence. 12. **Treating every complaint as a structural variant** — frequency-blind modelling. Use frequency questions ([[sources/2018-dumas-fundamentals-of-bpm-ch5-discovery]] §5.2.2 and §6.3.1) to separate sporadic anecdote from structural variant. --- ## 12. Evaluation workflow (recommended order) For a candidate model, evaluate in this order — early failures invalidate later layers: ``` 1. L1 (physical/empirical) — quick, near-zero-cost │ ▼ 2. L2a (notation-syntactic) — mechanical, fast (Camunda / bpmnlint) │ ─ if fails: fix, return to step 2 ▼ 3. L2b (soundness) — mechanical, polynomial for free-choice │ ─ if fails: must fix before semantic validation; │ soundness violations make semantic walk-through pointless ▼ 4. L4 partial (7PMG G3 only) — single start/end enables proper soundness │ (loop back to step 3 if changes affect behaviour) ▼ 5. L3 (semantic) — translate to natural language; expert walkthrough. │ This is the expensive, irreducibly-human step. ▼ 6. L4 full (7PMG G1, G2, G4, G5, G6, G7 + metrics) — refinement │ (these transformations preserve behaviour; do not loop to L3) ▼ 7. L5 (organisational) — process owner sign-off │ ▼ 8. (when log exists) Conformance dimensions — fitness, precision, generalisation, simplicity (§8.3) — model maintenance phase ``` Rule of thumb: **fix soundness before validating semantically**. A model with a deadlock will have an expert notice the deadlock or — worse — not notice it but invalidate other parts of the walkthrough. Resolve correctness before correspondence. --- ## 13. 
Acknowledged gaps (next ingest candidates)

This guide rests on the primary sources listed in the frontmatter, but several adjacent sources would tighten its rigour:

- **Lindland, Sindre, Sølvberg 1994 (IEEE Software)**: the *original* SEQUAL paper (this guide uses the Krogstie 2006 revision).
- **Mendling 2008** *Metrics for Process Models*: the bottom-up complexity-metrics monograph; would deepen [[concepts/process-model-complexity-metrics]].
- **Verbeek, Basten, van der Aalst 2001** *Diagnosing Workflow Processes Using Woflan*: would give concrete diagnostics for soundness violations.
- **Trcka, van der Aalst, Sidorova 2009** *Data-Flow Anti-Patterns*: would add a data-aware soundness layer.
- **Reijers & Mendling 2008** *Modularity in Process Models*: empirical foundation for the G7 decomposition threshold.
- **Rosemann 22 Modelling Pitfalls**: the exhaustive anti-pattern catalogue (referenced from [[syntheses/interview-structuring-for-process-models]]).
- **Moody 2009** *The 'Physics' of Notations*: pragmatic-quality theory for visual notations.
- **Cardoso 2005** *Control-Flow Complexity for Web Processes*: primary source for the CFC metric.

When these are ingested, this synthesis can be upgraded delta-by-delta, particularly §5 (richer soundness variants and diagnostics), §7 (deeper metric calibration), and §11 (full Rosemann 22-pattern catalogue).

---

## 14.
Quick-reference card

```
LAYER          CRITERION              HOW VERIFIED                      OWNER
────────────────────────────────────────────────────────────────────────────────────────
L1  phys/emp   Accessible, legible    BPMN tool                         Analyst
L2a notation   BPMN rules             bpmnlint / Camunda                Analyst
L2b sound      Option-to-complete     Soundness checker (free-choice    Analyst
               Proper-completion      ⇒ polynomial)
               No dead transitions
L3  semantic   Validity (M ⊆ D)       Expert NL walkthrough             Domain expert
               Completeness (D ⊆ M)   + rainy-day taxonomy probing      Analyst + expert
L4  pragmatic  G1–G7 + metrics        Static analyser + reader test     Analyst (+ LLM)
L5  organis    Goal fulfilment        Owner sign-off                    Process owner
────────────────────────────────────────────────────────────────────────────────────────
(post-go-live, with log)
               Fitness, Precision,    ProM / Disco / Celonis            Analyst
               Generalisation, Simplicity
```

---

## Sources & related

**Primary**: [[sources/1998-vanderaalst-verification-of-workflow-nets]] · [[sources/2010-mendling-reijers-vanderaalst-7pmg]] · [[sources/2006-krogstie-sindre-jorgensen-revised-sequal-framework]] · [[sources/2018-dumas-fundamentals-of-bpm]] (§5.4 quality, §11.4 conformance) · [[sources/2018-dumas-fundamentals-of-bpm-ch5-discovery]] · [[sources/2012-ottensooser-graphical-vs-textual]]

**Concepts**: [[concepts/process-model-quality]] · [[concepts/soundness]] · [[concepts/workflow-net]] · [[concepts/7pmg]] · [[concepts/sequal-framework]] · [[concepts/process-model-complexity-metrics]] · [[concepts/conformance-checking]]

**Companion synthesis (upstream)**: [[syntheses/interview-structuring-for-process-models]], the elicitation guide that produces the model this synthesis evaluates.

**Frameworks**: [[frameworks/bpmn]] · [[frameworks/declare]] (declarative analogue) · [[frameworks/dmn]] (decision-logic adjunct)
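
**Worked sketch (evaluation order, §12)**: the gated ordering in §12 can be illustrated in a few lines of Python. This is a minimal sketch under stated assumptions, not part of any cited source: `Gate`, `evidence`, `run_gates`, and the dictionary keys are hypothetical names, and each check is a stub that reads a verdict produced elsewhere (linter output, soundness report, walkthrough notes). Treating a missing verdict as "not-evaluable" rather than a pass follows §10.3; step 8 (conformance) is left out because it applies only once an event log exists.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical representation: a candidate model is a dict of verdicts
# already produced by tools or humans (True = pass, False = fail).
Model = dict


@dataclass(frozen=True)
class Gate:
    step: int
    layer: str
    check: Callable[[Model], Optional[bool]]  # None means "no evidence"


def evidence(key: str) -> Callable[[Model], Optional[bool]]:
    """Build a stub check that reads a pre-computed verdict by key."""
    return lambda model: model.get(key)


def run_gates(model: Model, gates: list[Gate]) -> list[tuple[int, str, str]]:
    """Apply gates in order and stop at the first failure,
    because early failures invalidate later layers."""
    results = []
    for gate in gates:
        outcome = gate.check(model)
        if outcome is None:
            status = "not-evaluable"  # assumption: record and continue
        elif outcome:
            status = "pass"
        else:
            status = "fail"
        results.append((gate.step, gate.layer, status))
        if status == "fail":
            break
    return results


# Steps 1 to 7 of the recommended order; key names are illustrative only.
GATES = [
    Gate(1, "L1 physical/empirical", evidence("legible")),
    Gate(2, "L2a notation-syntactic", evidence("lint_clean")),
    Gate(3, "L2b soundness", evidence("sound")),
    Gate(4, "L4 partial (G3)", evidence("single_start_end")),
    Gate(5, "L3 semantic", evidence("expert_validated")),
    Gate(6, "L4 full (G1, G2, G4-G7 + metrics)", evidence("pragmatic_ok")),
    Gate(7, "L5 organisational", evidence("owner_signed_off")),
]

# A model that lints cleanly but has a soundness violation.
candidate = {"legible": True, "lint_clean": True, "sound": False,
             "expert_validated": True}

for step, layer, status in run_gates(candidate, GATES):
    print(step, layer, status)
```

In this run the pipeline stops at step 3, so the expert verdict recorded for L3 is never consulted, which mirrors the rule of thumb of fixing soundness before validating semantically.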