---
title: "Synthesis: The PPM Landscape (2008–2026)"
type: synthesis
tags: [ppm, synthesis, landscape, survey]
created: 2026-04-13
updated: 2026-05-11
---

# Synthesis: The PPM Landscape (2008–2026)

Cross-cutting synthesis of the `raw/Predictive process monitoring/` corpus after the autonomous ingest batch of 2026-04-13. That batch ingested thirty-three unique papers spanning 2008–2023; ingests up to 2026-05-11 extend coverage to 2026. This page situates each method family, traces the research lineage, and connects the corpus to [[concepts/agentic-bpm|APM]].

## 1. Prediction targets

Three canonical targets organise the corpus:

- **[[concepts/next-activity-prediction|Next-activity prediction]]** — recursively extensible to full-suffix prediction.
- **[[concepts/remaining-time-prediction|Remaining-time prediction]]** — sometimes framed as completion-time or cycle-time prediction.
- **[[concepts/outcome-prediction|Outcome prediction]]** — binary/multi-class case closure labels (SLA compliance, business outcome, LTL satisfaction).

Anomaly detection ([[sources/2019-nolle-binet-anomaly|BINet]]) and failure prediction ([[sources/2018-borkowski-event-based-failure-prediction]]) are adjacent — same architectures, different targets.

## 2. Model-family evolution — five eras

### Era 1 — Classical probabilistic (2008–2016)

Transition systems, Markov chains, HMMs, non-parametric regression, decision trees. Interpretable but dataset-brittle.

- [[sources/2008-vandongen-cycle-time-prediction]] — non-parametric regression (foundational).
- [[sources/2013-lakshmanan-markov-semi-structured]] — instance-specific Markov for case management.
- [[sources/2014-polato-data-aware-remaining-time]] — data-aware annotated transition system.
- [[sources/2014-ceci-sequential-pattern-mining-ppm]] — sequential pattern mining.
- [[sources/2015-leontjeva-complex-symbolic-encodings]] — **encoding taxonomy** (hub).
- [[sources/2015-metzger-comparing-combining-ppm]] — ML vs constraint-based ensembles.
- [[sources/2016-ferilli-woman-process-prediction]] — symbolic/WoMan.
- [[sources/2016-unuvar-path-information-parallel]] — decision-tree, parallel paths.

### Era 2 — Deep learning debut (2017)

The pivotal year. Three concurrent 2017 papers establish neural PPM.

- [[sources/2017-evermann-deep-learning-runtime]] — **first RNN/LSTM application** (feasibility study).
- [[sources/2017-tax-lstm-process-prediction]] — **canonical LSTM-PPM**; unified multi-task architecture.
- [[sources/2017-navarin-lstm-data-aware-remaining-time]] — data-aware LSTM for remaining time.
- [[sources/2017-senderovich-intra-inter-case]] — **inter-case features** (orthogonal innovation).
- [[sources/2017-difrancescomarino-a-priori-ppm]] — a-priori knowledge in LSTM.
- [[sources/2017-verenich-white-box-flow-analysis]] — **white-box** alternative via flow analysis.

### Era 3 — Neural refinement (2018–2020)

Data-awareness, hyperparameter optimisation, interpretability, and attention.

- [[sources/2018-schoenig-deep-learning-discrete-continuous]] — mixed discrete/continuous attributes.
- [[sources/2018-difrancescomarino-genetic-hpo-ppm]] — GA-based HPO framework.
- [[sources/2018-verenich-apromore-ppm]] — tooling milestone (Nirdizati/Apromore).
- [[sources/2019-hinkka-exploiting-event-attributes-rnn]] — attribute clustering for scalability.
- [[sources/2019-nolle-binet-anomaly]] — multi-perspective anomaly (PPM-adjacent).
- [[sources/2020-jalayer-attention-ppm]] — **attention over LSTM hidden states** (transition to Transformers).
- [[sources/2020-rama-maneiro-deep-learning-ppm-review]] — DL-PPM survey & benchmark (hub).

### Era 4 — Transformers & beyond (2021–2023)

Attention takes over; GNNs and multi-task learning emerge.

- [[sources/2021-bukhsh-processtransformer]] — **ProcessTransformer** (pivotal).
- [[sources/2023-wang-mtlformer-multitask-transformer]] — multi-task Transformer.
- [[sources/2023-duong-gnn-remaining-cycle-time]] — GNN for remaining time.
- [[sources/2023-cao-gated-rnn-explainable]] — explainable gated RNN via reachability graph.

### Era 5 — Foundation-model PPM (2024–2026)

Pre-trained LLMs invoked via in-context learning enter PPM and are particularly competitive in **data-scarce** settings.

- [[sources/2026-padella-llm-features-ppm]] — **flagship LLM-PPM source** (Padella, de Leoni & Dumas 2026). Gemini 2.5 Flash Thinking trained on 100 traces matches or surpasses CatBoost and PGTNet trained on full event logs, across three logs (BPI12, Bac, Hospital) and two targets (Total Time, Activity Occurrence). Methodological contributions: ρ_seq encoding, [[concepts/semantic-hashing-probe|semantic-hashing probe]], [[concepts/beta-learner-distillation|β-learner distillation]]. See concept hub [[concepts/llm-based-ppm]].

### Surveys & standards (spanning all eras)

- [[sources/2016-verenich-general-framework-ppm]] — programmatic call.
- [[sources/2016-teinemaa-structured-unstructured-ppm]] — text + structured.
- [[sources/2019-verenich-survey-ppm]] — **remaining-time survey & benchmark**.
- [[sources/2020-rama-maneiro-deep-learning-ppm-review]] — DL-PPM benchmark.
- [[sources/2023-berti-ocel-2-specification]] — OCEL 2.0 event-log standard (infrastructure).
- [[sources/2021-dumas-process-mining-2-from-insights-to-action]] — prescriptive-monitoring vision.

### Supporting references

- [[sources/2018-smith-dont-decay-learning-rate]] — general DL optimisation.
- [[sources/2019-lee-nsga-ii-deap-tutorial]] — NSGA-II tutorial.

## 3. Cross-cutting design axes

Every PPM paper in the corpus can be located along four axes:

| Axis | Options |
|---|---|
| **Prediction target** | Next-activity · Suffix · Remaining-time · Outcome · Failure/Anomaly |
| **Architecture** | Markov / transition system / pattern mining / decision tree / LSTM / GRU / attention / Transformer / GNN / hybrid |
| **Encoding** | Boolean / frequency / index-based / complex symbolic / learned embeddings (see [[concepts/trace-encoding]]) |
| **Data perspective** | Control-flow only · Data-aware (intra-case) · Inter-case (queue / workload) · Multi-perspective (incl. text) |

## 4. The "no single best method" recurring claim

From Metzger 2015 through Tax 2017 to Di Francescomarino 2018, the same finding recurs: **no single PPM technique wins across all datasets**. This motivated:

- HPO-driven frameworks ([[sources/2018-difrancescomarino-genetic-hpo-ppm]]).
- Cross-benchmarks ([[sources/2019-verenich-survey-ppm]], [[sources/2020-rama-maneiro-deep-learning-ppm-review]]).
- Configurable tooling ([[sources/2018-verenich-apromore-ppm]]).

The claim still holds, which implies that an APM agent should **not commit to a single PPM backend** but select one per process.

## 5. Connections to APM

[[sources/2026-calvanese-agentic-bpm-manifesto]] positions PPM as the set of techniques an agent invokes for the **"Recommend"** role of [[concepts/conversational-actionability]]. Specifically:

- **Framing signals** — PPM predictions feed the agent's intentional model (is the case on track?).
- **Adaptation triggers** — a deteriorating PPM forecast is a signal for [[concepts/self-modification|adaptation]].
- **Explainability** — the explainable-PPM line ([[sources/2017-verenich-white-box-flow-analysis]], [[sources/2023-cao-gated-rnn-explainable]], Mehdiyev & Fettke cited in the Manifesto) serves the [[concepts/explainability-apm|APM explainability]] requirement.
- **OCEL 2.0** — [[sources/2023-berti-ocel-2-specification]] is the object-centric substrate APM agents will likely operate over.

## 5b. Riess research arc (2022–2026)

The wiki owner's own research programme on PPM/PrPM spans seven sources — see [[syntheses/riess-research-arc]] and the author hub [[entities/mike-riess]]. Key contributions to the PPM landscape:

- [[sources/2023-riess-temporal-loss-remaining-cycle-time|Riess 2023]] — temporally weighted L1 losses for earliness + introduction of **Temporal Consistency** as a third evaluation axis for remaining-time models.
- [[sources/2024-riess-synbps-simulation-framework|Riess 2024 (SynBPS)]] — parametric synthetic-event-log framework for controlled-variable PPM evaluation; critique of benchmark-addiction in the field.
- [[sources/2022-riess-metaheuristics-concept-drift-survey|Riess 2022]] — metaheuristics for concept-drift adaptation; catalogues the evaluation-rigour gap in drift-adaptation research.
- [[sources/2025-riess-jorgensen-brage-benchmark-norwegian-llm|Riess & Jørgensen 2025 (BRAGE)]] — zero-shot LLM classification as a drift-robust alternative to supervised classifiers in customer-service dialogue analysis.

## 6. Open threads for future ingests

- Seminal outcome-PPM benchmark (**Teinemaa, Dumas, Rosemann, Maggi 2019, ACM TKDD**) is *not* in `raw/` — a priority external source to acquire.
- **Camargo, Dumas, Gonzalez-Rojas 2019** (data-aware LSTM for next-activity + resource + time) is referenced by Schönig and Bukhsh but not in `raw/`.
- **Prescriptive process monitoring** — now covered by [[sources/2014-groger-prescriptive-analytics-bpo|Gröger et al. 2014]] (early rBPO), [[sources/2015-krumeich-prescriptive-control-business-processes|Krumeich, Werth & Loos 2015]] (CEP/ED-BPM enterprise architecture for process manufacturing — *added 2026-05-11*), and [[sources/2022-kubrak-prescriptive-ppm-slr|Kubrak et al. 2022]] (SLR).
See canonical hub [[concepts/prescriptive-process-monitoring]] and synthesis [[syntheses/prescriptive-process-monitoring-lineage]]. Still missing from `raw/`: Teinemaa et al. 2018 alarm-based; Fahrenkrog-Petersen et al. 2022; Metzger, Kley & Palm 2020 online RL; Bozorgi et al. 2021 causal; Shoush & Dumas 2021.
- ~~**LLM-based PPM** work (post-2023) — still emerging; worth monitoring.~~ **CLOSED 2026-05-11** by ingest of [[sources/2026-padella-llm-features-ppm|Padella, de Leoni & Dumas 2026]] — flagship paper introducing the [[concepts/llm-based-ppm|LLM-PPM]] method family, [[concepts/beta-learner-distillation|β-learner distillation]] for interpretability, and [[concepts/semantic-hashing-probe|semantic-hashing probe]] for embodied-knowledge isolation. The conference-version Padella, Frazzetto, Navarin & de Leoni 2026 BPM Forum paper (ref [25] of the journal version) remains a referenced-not-ingested predecessor.
- **Concept drift in PM** — coverage strengthened 2026-05-11 with [[sources/2021-sato-concept-drift-pm-survey|Sato et al. 2021 SLR]] (45 papers, 5-axis taxonomy, reproducibility-crisis flag) and [[sources/2020-baier-handling-concept-drift-bpm|Baier, Reimold & Kühl 2020]] (industrial P2P case, Page-Hinkley + `last`-batch retraining wins). See [[syntheses/concept-drift-in-pm]]. Still missing: Maaradji et al. 2017, Ostovar et al. 2020, Yeshchenko et al. 2019.
- **Integrative BI ↔ BPM ↔ GenAI ↔ DSS framing** — added 2026-05-11 with [[sources/2026-theodorakopoulos-bi-bpm-genai-review]] (narrative review proposing a 5-layer integrative framework; complementary to APM's agent-centric pyramid).

## 7. Practical recommendations for APM use

- For quick wins on **next-activity**: start with [[sources/2021-bukhsh-processtransformer|ProcessTransformer]] (canonical, easily reproducible).
- For **remaining-time**: consult [[sources/2019-verenich-survey-ppm|the Verenich survey]] then pick a data-aware LSTM/Transformer baseline.
- For **outcome**: use [[concepts/trace-encoding|index-based or complex symbolic]] encodings + gradient-boosted trees as a strong non-deep baseline.
- For **explainability**: use white-box flow analysis, or post-hoc explanation via a gated-RNN reachability-graph mapping.
- For **benchmarking**: the 12 public logs from [[sources/2020-rama-maneiro-deep-learning-ppm-review|Rama-Maneiro 2020]] are the community standard, but beware benchmark contamination (APM Manifesto challenge C3).
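
To make the outcome recommendation above concrete, here is a minimal sketch of an index-based encoding for trace prefixes. It is a simplified, control-flow-only illustration and not the exact formulation of any source in the corpus: the function names, the `<pad>` symbol, and the toy traces are all assumptions, and event attributes are ignored.

```python
from typing import Dict, List, Sequence

PAD = "<pad>"  # hypothetical padding symbol for prefixes shorter than max_len


def build_vocab(traces: Sequence[Sequence[str]]) -> Dict[str, int]:
    """Map each activity label (plus the padding symbol) to a column offset."""
    labels = sorted({activity for trace in traces for activity in trace})
    return {label: i for i, label in enumerate([PAD] + labels)}


def index_encode(prefix: Sequence[str], vocab: Dict[str, int], max_len: int) -> List[int]:
    """One-hot encode the activity found at each position of a prefix.

    Every position owns its own block of len(vocab) columns, so event order
    is preserved, unlike boolean or frequency encodings. Activities missing
    from vocab raise KeyError in this sketch.
    """
    padded = list(prefix[:max_len]) + [PAD] * (max_len - len(prefix))
    vector = [0] * (max_len * len(vocab))
    for position, activity in enumerate(padded):
        vector[position * len(vocab) + vocab[activity]] = 1
    return vector


# Toy event log (assumed data): two cases, encoded at prefix length 2.
traces = [["register", "check", "approve"], ["register", "reject"]]
vocab = build_vocab(traces)
features = [index_encode(trace[:2], vocab, max_len=2) for trace in traces]
```

Each row of `features` is a fixed-width vector, so it could be passed, together with the case outcome labels, to whichever gradient-boosted tree implementation is preferred. In practice one model is often trained per prefix length or prefix bucket; that choice is left open here.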