---
title: "Large Process Models: A Vision for Business Process Management in the Age of Generative AI"
type: source
tags: [llm, bpm, vision, large-process-models, neuro-symbolic, foundation-models, knowledge-graphs]
authors: [Kampik Timotheus; Warmuth Christian; Rebmann Adrian; Agam Ron; Egger Lukas N. P.; Gerber Andreas; Hoffart Johannes; Kolk Jonas; Herzig Philipp; Decker Gero; van der Aa Han; Polyvyanyy Artem; Rinderle-Ma Stefanie; Weber Ingo; Weidlich Matthias]
year: 2024
venue: "Künstliche Intelligenz 39 (arXiv:2309.00900v3, Jan 2025)"
kind: paper
raw_path: "raw/ABPS/2024-kampik-large-process-models.pdf"
correction: "[[sources/2024-kampik-large-process-models-correction]]"
key_claims:
- Pure LLMs are insufficient for BPM because their behaviour is unpredictable, at times undesirable, and frequently illogical — an unacceptable property for decisions with critical business implications.
- A Large Process Model (LPM) is a neuro-symbolic software system that combines the correlation power of LLMs with the analytical precision and reliability of knowledge-based systems and automated reasoning.
- The LPM architecture has four layers - process data and knowledge sources, a process atom layer, run-time contextualization (process query + prediction engines), and BPM tools and integration (modelling, analysis/mining, execution).
- A process atom is an atomic fact about a process (or a relation between facts) that cannot be split further without losing business meaning; atoms form a bridge between LLM-style natural language and executable symbolic queries and models.
- A process-fine-tuned LLM (not one trained from scratch) is contextualized per industry, region, regulation, or organisation; fine-tuning plus prompt contextualization is preferred over scratch training.
- LPMs enable organizations to receive context-specific, tailored process and business models, analytical deep-dives, and improvement recommendations — substantially lowering the time and effort required for business transformation.
- Classical symbolic BPM tooling remains necessary; conformance and performance assessments must yield hard guarantees, not approximate guesses, so LLMs augment rather than replace BPM software.
- Implementing an LPM is feasible only in part - the vision highlights substantial open research challenges across data, reasoning, trust, and tool integration.
created: 2026-04-13
updated: 2026-05-06
sources: []
---

# Large Process Models — Kampik et al. 2024

Vision paper from **SAP Signavio** (Kampik, Warmuth, Rebmann, Agam, Egger, Gerber, Hoffart, Kolk, Herzig, Decker) with academic co-authors from Vienna (Han van der Aa), Melbourne (Artem Polyvyanyy), TUM (Stefanie Rinderle-Ma, Ingo Weber), and Humboldt-Universität zu Berlin (Matthias Weidlich). Published in *Künstliche Intelligenz* vol. 39 (2024). The paper articulates a conceptual framework — **Large Process Models (LPMs)** — for BPM software in the age of generative AI.

## Summary

The paper's central claim is that pure LLMs are structurally unfit for [[concepts/business-process|business process]] contexts where decisions have critical operational implications: their statistics-based output is unpredictable, at times not desirable, and frequently illogical. Yet their correlation power over large, heterogeneous corpora is genuinely useful for BPM tasks such as drafting [[frameworks/bpmn|BPMN]] / [[frameworks/dmn|DMN]] / [[frameworks/cmmn|CMMN]] models, generating analytical deep-dives, and recommending improvements. To reconcile these, Kampik et al. propose a **neuro-symbolic** architecture that fuses LLMs with symbolic data management (knowledge graphs), automated reasoning, and classical process analytics.

An LPM is explicitly *not* an LLM trained from scratch on process data.
Instead it is a composite system with four layers:

1. **Process data and knowledge sources** — structured knowledge in knowledge graphs, language embeddings in vector stores, unstructured documents, tabular relational data (event logs included), and KPI / experience data.
2. **Process atom layer** — a layer of *process atoms*: atomic facts about a process (or relations between facts) that cannot be split without losing business meaning. Atoms are the bridge between natural-language LLM output and executable symbolic artefacts. They are conceptually equivalent to declarative-model constraints ([[frameworks/declare|DECLARE]]) and can be sourced by LLMs for the purpose of detecting *semantic anomalies* — behaviour unusual given the available knowledge.
3. **Run-time contextualization** — process query engines (over knowledge graphs and event data) and process prediction engines (classical ML + foundation models) provide the live grounding layer. On top sits a **process-fine-tuned LLM** contextualized per process vertical, region/regulation, and organisation.
4. **BPM tools and integration** — process modelling, analysis/mining, and execution tooling that consumes and emits artefacts the LPM can reason about. The human / machine process manager specifies BPM goals, consumes recommendations, and provides feedback.

The paper motivates the LPM from two failed extremes in BPM's history: purely imperative model-driven execution (BPMN with DMN rulebases) that rarely survives contact with real organisational change, and purely statistical ML approaches that struggle with BPM's knowledge intensity, cold-start training cost, and the reinforcement-learning credit-assignment delay characteristic of long-running processes. The LPM is positioned as the *pars pro toto* bridge between the two.

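
To make the process atom layer more concrete, the following is a minimal Python sketch. It assumes an atom can be represented as a DECLARE-style *response* constraint ("if the trigger activity occurs, the response activity must eventually follow") paired with a natural-language description. The `ProcessAtom` class, its fields, the `violated_atoms` helper, and the invoice activity names are hypothetical illustrations; the paper does not define a concrete data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessAtom:
    """Hypothetical atom: a DECLARE-style response constraint."""
    description: str  # natural-language form an LLM could read or emit
    trigger: str      # activity that activates the constraint
    response: str     # activity that must eventually follow the trigger

    def holds(self, trace: list[str]) -> bool:
        """Return True if every trigger is eventually followed by a response."""
        pending = False
        for activity in trace:
            if activity == self.trigger:
                pending = True
            elif activity == self.response:
                pending = False
        return not pending


def violated_atoms(atoms: list[ProcessAtom], trace: list[str]) -> list[ProcessAtom]:
    """Deterministically list the atoms a trace violates (candidate anomalies)."""
    return [atom for atom in atoms if not atom.holds(trace)]


# Assumed example atoms; in the LPM vision these could be proposed by an LLM.
atoms = [
    ProcessAtom("Every received invoice is eventually approved.",
                trigger="receive_invoice", response="approve_invoice"),
    ProcessAtom("Every approved invoice is eventually paid.",
                trigger="approve_invoice", response="pay_invoice"),
]

trace = ["receive_invoice", "approve_invoice", "archive_invoice"]
for atom in violated_atoms(atoms, trace):
    print("violated:", atom.description)
```

In this sketch the natural-language `description` is the part an LLM would handle, while `holds` is a plain symbolic check whose result does not depend on any model output. That split is one possible reading of the "bridge" role the summary attributes to atoms.
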
The authors explicitly align the LPM vision with the [[sources/2023-dumas-ai-augmented-bpms|AI-Augmented BPM manifesto]] — the goal of making processes *"adaptable, proactive, explainable, and context-sensitive"* — and argue that achieving these properties requires the symbolic/subsymbolic fusion the LPM enacts. The paper closes by highlighting risks (data silos, knowledge representation, trust, maintenance) and concrete research challenges for each architectural layer.

## Key claims

- **LLMs alone are insufficient for BPM.** Unpredictability, undesirable output, and illogical behaviour are disqualifying in decision-critical business contexts.
- **Neuro-symbolic fusion is the path forward.** Symbolic data management (knowledge graphs, automated reasoning) must complement LLM correlation power.
- **Process atoms are a new conceptual primitive.** They generalise declarative constraints and provide the executable substrate the LLM layer writes to.
- **Fine-tune, don't train from scratch.** Foundation-model fine-tuning + prompt contextualization is preferred for BPM verticals; scratch training is cost-prohibitive and brittle under [[concepts/concept-drift]].
- **Classical BPM tooling stays.** Conformance and performance assessments require hard guarantees; LLMs augment but do not replace [[methods/process-mining-basics|process mining]] and [[concepts/conformance-checking|conformance]] machinery.
- **The LPM is a framework, not a model.** Depending on the task at hand it is instantiated from the general architecture — different queries, predictions, and tool chains compose different LPMs.
- **Agent-based and reasoning loops are first-class.** The authors note that reasoning loops "have been at the center of AI research for decades" and are re-surging in the LLM context — but warn that nascent agent systems do not yet make full use of planning, reasoning, or RL capabilities.

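
The "fine-tuning plus prompt contextualization" claim can be illustrated with a small Python sketch. It assumes that contextualization amounts to prepending the process vertical, region, organisation, and a few retrieved facts to a user request before it reaches a fine-tuned LLM. `ProcessContext`, `build_prompt`, and the sample values are hypothetical, and no model is called.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessContext:
    """Hypothetical context record; the paper does not prescribe these fields."""
    vertical: str
    region: str
    organisation: str
    facts: list[str] = field(default_factory=list)  # e.g. retrieved process atoms


def build_prompt(context: ProcessContext, request: str) -> str:
    """Assemble a contextualized prompt string; no model is called here."""
    lines = [
        f"Process vertical: {context.vertical}",
        f"Region/regulation: {context.region}",
        f"Organisation: {context.organisation}",
        "Known facts:",
    ]
    lines += [f"- {fact}" for fact in context.facts]
    lines += ["", f"Request: {request}"]
    return "\n".join(lines)


# Assumed sample values for illustration only.
context = ProcessContext(
    vertical="procure-to-pay",
    region="EU",
    organisation="ExampleCorp",
    facts=["Every approved invoice is eventually paid."],
)
prompt = build_prompt(context, "Draft a BPMN model of invoice handling.")
print(prompt)
```

The same base model could then serve different verticals or organisations by swapping the `ProcessContext`, which is the kind of reuse that makes this approach cheaper than training a separate model per context.
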
## Framing distinctions introduced

- **LPM ≠ local process model.** A footnote explicitly distinguishes the proposed LPM from the pre-existing "Local Process Model" notation (small frequent-behaviour patterns in event logs).
- **LPM ≠ BPM-LLM.** An LPM is not an LLM trained on domain-specific data; it is a *system* in which a fine-tuned LLM is one component among several.
- **Imperative vs declarative BPM maturity.** BPMN/DMN achieved mainstream maturity for imperative models; [[frameworks/cmmn|CMMN]] and declarative approaches did not (Camunda's "CMMN never lived up to its potential"). LPM sidesteps this by pushing declarative logic into the process atom layer rather than into a user-facing modelling language.
- **Object-level vs meta-level correctness.** A symbolic rule can be logically correct (object-level) yet dated, inconsistent, or wrongly modelled from a domain perspective (meta-level). LPMs aim to continuously reconcile both levels using the data/knowledge layer.

## §5 — How LPMs facilitate BPM (three benefits)

The paper articulates three concrete benefits an LPM-augmented BPM stack would deliver:

1. **Reduction of effort & expertise required for knowledge-based BPM tasks.** LPMs lower the entry bar for running BPM initiatives by (i) finding and contextualising information automatically, (ii) turning unstructured/semi-structured information into models and queries, and (iii) enriching context with logically inferred or statistically plausible facts. Specific capabilities envisioned: turning natural-language text into process models and queries; enhancing models/queries from natural-language feedback; recommending changes to process models; scaling generic insights via auto-generated query templates instantiated per organisation. The paper anticipates *conversational process modelling* and *conversational process mining* will reach production in the coming years.
2. **Improvement of process observability.** Process observability (analogous to data observability in distributed systems) is the extent to which a process is correctly observed and understood given the business objective. It tends to remain low when relying on a single method (modelling alone, or event-log mining alone) because event logs cover only a subset of process activity. By fusing knowledge and data from multiple sources, an LPM can: (i) turn unstructured informal process knowledge into actionable models and queries by traversing organisational knowledge silos; (ii) discover data sources in large information system landscapes and recommend ETL scripts; (iii) enable foundation-model-based forecasting that avoids per-organisation model training.
3. **Convergence of process design, execution, and analysis.** LPMs may eventually deliver continuous automated process improvement where the three lifecycle phases converge. This generalises the prescriptive-monitoring agenda (matching analysis insights to actions, fusing data to assess implications, continuously fine-tuning deployed changes). The vision aligns with [[sources/2023-chapela-campa-augmented-process-execution|Chapela-Campa & Dumas 2023]]'s augmented-process-execution pyramid but pushes further into autonomous closed-loop adaptation.

## §6 — Three-step feasibility roadmap

The paper offers a "feasibility-oriented" roadmap (not a research agenda):

1. **Augmenting modelling and analysis with contextualised knowledge.** Considered *generally feasible* given the current state of the art. Key challenges are *engineering*: (i) rigid evaluation metrics for generative AI-supported model/query generation; (ii) LLM-friendly data exchange formats as middle-layer representations between symbolic and sub-symbolic systems; (iii) assessing fine-tuning vs. retrieval-augmented generation. → [[sources/2026-licardo-bpmn-assistant]] is a concrete instantiation of this step.
2. **Fusing unstructured and tabular data for actionable insights.** Considered a *substantial challenge*. Calls for: (i) conversational process-mining foundations; (ii) hypothesis-elicitation methods over tabular and unstructured data; (iii) scalable algorithms for evaluating those hypotheses.
3. **Automating continuous improvement with the human in control.** Considered *blue-sky*. Open questions: enterprise-software design for maximal modular flexibility on the *knowledge level*; how analysis-oriented BPM software changes when it becomes mission-critical to execution; how to define guardrails for autonomous process-execution systems so that *standardisation versus tailored optimisation* trade-offs can be shifted toward the latter.

The orthogonal feasibility challenges across all three steps are: data ingestion and curation under questionable LLM-generated quality; reliability and ethics oversight for stochastic-parrot / bullshit-generator failure modes; verifiability of inferences vs human cognitive load; choice-architecture effects on humans deferring to LPM recommendations; the cost frontier of foundation-model training, operation, and maintenance.

## §7 — Discussion: positioning vs other LLM-BPM proposals

The paper compares LPM to several parallel proposals:

- **BloombergGPT** — domain-specific LLM trained from scratch for finance. LPM differs by *not* training from scratch; instead it fine-tunes plus prompt-contextualises a general LLM, accepting lock-in but avoiding the cost-frontier issue.
- **Vidgof et al. — vision and research agenda for LLMs in BPM** — primarily aligned with the BPM lifecycle from a *management* view. LPM differs by being feasibility-oriented and software-architecture-centric.
- **ProcessGPT (Beheshti et al.)** — transformer-based approach for recommending next actions in knowledge-intensive processes during execution.
  LPM differs in scope (entire BPM lifecycle vs execution-recommendation only) and posture (no proposal to train a process-execution-trace foundation model from scratch).
- **Berti & Qafari — off-the-shelf LLMs for process mining** — answers user queries and generates symbolic queries on event data. LPM positions this work as a *subset* of LPM capabilities, providing first partial evidence of LPM feasibility.
- **Klievtsova et al., Grohs et al.** — conceptual/experimental starting points for LPMs in process modelling. LPM frames these as supportive evidence of Step-1 feasibility while noting the field still lacks experimental work demonstrating LLM effectiveness in *process-execution* contexts.

§7.2 also flags **Generative AI for BPM beyond LLMs**: process models often live as figures in slide decks, and image-to-formal-model generation is an emerging direction; audio-to-process-model from interviews is feasible but speech-to-text pre-processing is more practical than direct foundation-model audio handling.

## Positioning vs related work in this wiki

- **Predecessor:** [[sources/2023-dumas-ai-augmented-bpms|AI-Augmented BPMS (Dumas et al. 2023)]] — LPM explicitly adopts the four ABPMS characteristics (adaptable, proactive, explainable, context-sensitive) as design targets and contributes an **LLM-era architecture** for realising them.
- **Successor-in-dialogue:** [[sources/2026-calvanese-agentic-bpm-manifesto|APM Manifesto (2026)]] — cites LPM as the LLM-centric BPM vision APM *positions itself against*, arguing that "while LLM potential facilitates some tasks, achieving self-improving processes… requires more than current-state LLMs." The two papers agree on neuro-symbolic hybridisation but disagree on the centre of gravity: LPM centres on a fine-tuned LLM with symbolic scaffolding; APM centres on goal-directed agents with LLMs as one reasoning substrate among several.
- **Parallel:** [[sources/2023-chapela-campa-augmented-process-execution|Chapela-Campa & Dumas 2023]] — takes the analytics-side view of the same transition (descriptive → predictive → prescriptive → augmented). LPM and the augmented-execution pyramid are complementary framings: LPM focuses on the software stack, Chapela-Campa on the analytical capability progression.
- **Textbook baseline:** [[sources/2018-dumas-fundamentals-of-bpm|Fundamentals of BPM (2018)]], [[sources/2011-vanderaalst-process-mining-book|Van der Aalst Process Mining textbook]] — LPM treats these as the classical BPM tooling the system must integrate with, not replace.

## Connections

**Concepts:** [[concepts/agentic-bpm]] · [[concepts/process-aware-information-system]] · [[concepts/explainability-apm]] · [[concepts/context-aware-bpm]] · [[concepts/conformance-checking]] · [[concepts/declarative-process-modelling]] · [[concepts/concept-drift]] · [[concepts/business-process]]

**Frameworks:** [[frameworks/bpmn]] · [[frameworks/dmn]] · [[frameworks/cmmn]] · [[frameworks/declare]] · [[frameworks/bdi-agents]]

**Methods:** [[methods/process-mining-basics]] · [[methods/agent-oriented-process-mining]]

**Authors (entities):** [[entities/timotheus-kampik]] · [[entities/artem-polyvyanyy]] · [[entities/stefanie-rinderle-ma]]

**Related sources:** [[sources/2023-dumas-ai-augmented-bpms]] · [[sources/2026-calvanese-agentic-bpm-manifesto]] · [[sources/2023-chapela-campa-augmented-process-execution]] · [[sources/2012-vanderaalst-process-mining-manifesto]] · [[sources/2011-vanderaalst-process-mining-book]] · [[sources/2026-licardo-bpmn-assistant]] (instantiates §6 Step 1) · [[sources/2025-varsani-neuro-symbolic-ai-sap-erp]] (concrete neuro-symbolic LLM/ERP integration) · [[sources/2024-kampik-large-process-models-correction]] (Fig. 1 erratum)