---
title: "Synthesis: Concept Drift in Process Mining and PPM"
type: synthesis
tags: [concept-drift, process-mining, ppm, synthesis, drift-detection, model-maintenance]
sources:
  - "[[sources/2021-sato-concept-drift-pm-survey]]"
  - "[[sources/2020-baier-handling-concept-drift-bpm]]"
  - "[[sources/2022-riess-metaheuristics-concept-drift-survey]]"
  - "[[sources/2026-padella-llm-features-ppm]]"
created: 2026-05-11
updated: 2026-05-11
---

# Synthesis: Concept Drift in Process Mining and PPM

How two surveys with different angles, one industrial case study, and one emerging-method paper together describe the state of [[concepts/concept-drift|concept drift]] in BPM as of May 2026.

## The three core sources are non-overlapping

| Source | Scope | What it covers | Year |
|---|---|---|---|
| **[[sources/2021-sato-concept-drift-pm-survey\|Sato et al. 2021]]** | Concept drift *in process mining* | Detection + online PM under drift — 45 papers, 5-axis taxonomy (type, duration, dynamic, perspective, analysis mode), 5 challenges (detection, change-point detection, localisation, characterisation, change-process discovery) | 2021 |
| **[[sources/2022-riess-metaheuristics-concept-drift-survey\|Riess 2022]]** | Metaheuristics for drift *adaptation* (cross-domain, not BPM-specific) | Population-based metaheuristics (GA, PSO, etc.) for retraining/reselecting models under drift across engineering, finance, social science | 2022 |
| **[[sources/2020-baier-handling-concept-drift-bpm\|Baier, Reimold & Kühl 2020]]** | Empirical industrial case study | P2P (procure-to-pay) process at a large German company; 70 k transactions 2016–2018; Naïve Bayes + Page-Hinkley/ADWIN + 3 retraining strategies | 2020 |

The triangulation: **Sato tells us what to detect**, **Riess tells us how to adapt**, **Baier tells us what works in production**.

## Convergent findings

### Drift handling works — *if* you do it

Baier's headline result: accuracy on throughput-time prediction rises from 0.5400 (static baseline) to **0.7010** (drift-handled). This is a **+16.1 percentage-point absolute** improvement (roughly +30% relative to the baseline) from continuous monitoring + adaptation, on a real industrial event log.
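The size of this gain depends on whether it is read in absolute or relative terms. A quick check of the arithmetic, using only the two accuracy figures quoted above (variable names are illustrative):

```python
baseline = 0.5400  # static-baseline accuracy quoted above
handled = 0.7010   # drift-handled accuracy quoted above

# Absolute gain, in percentage points.
absolute_pp = (handled - baseline) * 100
# Relative gain, as a percentage of the baseline accuracy.
relative_pct = (handled - baseline) / baseline * 100

print(f"absolute: +{absolute_pp:.1f} pp")   # absolute: +16.1 pp
print(f"relative: +{relative_pct:.1f} %")   # relative: +29.8 %
```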

### `last` strategy outperforms `next` and `mixed`

Baier compares three data-selection strategies for retraining the model after a detected drift:
- `next` — retrain on the batch *after* drift detection
- `mixed` — retrain on data spanning the detection
- `last` — retrain on the *last* batch *before* drift detection

`last` wins consistently across both detectors (Page-Hinkley, ADWIN) and all four batch sizes (500, 1000, 2000, 5000). Intuition: drift detectors fire with lag, so the most recent pre-alarm batch already belongs to the new concept.
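The three strategies differ only in which slice of the event stream feeds the retraining step. A minimal sketch, assuming the alarm is raised at a known event position and that `mixed` takes half its window from each side of the alarm; the function name, the `alarm_idx` parameter, and the exact window boundaries are assumptions for illustration, not Baier's implementation:

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def select_retraining_window(
    events: Sequence[T], alarm_idx: int, batch_size: int, strategy: str
) -> Sequence[T]:
    """Return the events used for retraining after an alarm at alarm_idx."""
    if strategy == "last":
        # The batch immediately before the alarm: available right away.
        start, stop = alarm_idx - batch_size, alarm_idx
    elif strategy == "next":
        # The batch immediately after the alarm: requires waiting for data.
        start, stop = alarm_idx, alarm_idx + batch_size
    elif strategy == "mixed":
        # A window straddling the alarm (assumed split: half before, half after).
        half = batch_size // 2
        start, stop = alarm_idx - half, alarm_idx + (batch_size - half)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    # Clip to the data that actually exists.
    return events[max(start, 0):min(stop, len(events))]


events = list(range(10_000))  # stand-in for an ordered event stream
last = select_retraining_window(events, 4000, 500, "last")    # events 3500..3999
nxt = select_retraining_window(events, 4000, 500, "next")     # events 4000..4499
mixed = select_retraining_window(events, 4000, 500, "mixed")  # events 3750..4249
```

Under this reading, `last` is also the only strategy that can retrain at the moment of the alarm; `next` and `mixed` must wait for post-alarm data to arrive.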

### Small batches outperform large

Baier finds that accuracy ranks by batch size as 500 > 1000 > 2000 > 5000 across detectors. Larger batches are *too slow to adapt* to subsequent drift. This suggests a deployment heuristic: pick the smallest batch size your inference latency budget allows.

### Drift detector choice matters less than retraining-strategy choice

Page-Hinkley narrowly beats ADWIN in Baier, but the *strategy* gap is wider than the *detector* gap. The data-selection question dominates the detection-algorithm question for practical performance.
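For readers unfamiliar with the detector, here is a minimal from-scratch sketch of the Page-Hinkley test applied to a stream of prediction errors. The parameter values (`delta`, `threshold`, `min_samples`) and the synthetic error stream are illustrative assumptions, not Baier's configuration:

```python
class PageHinkley:
    """Page-Hinkley test for an upward shift in the mean of a stream."""

    def __init__(self, delta=0.005, threshold=50.0, min_samples=30):
        self.delta = delta              # tolerated magnitude of change
        self.threshold = threshold      # alarm threshold (lambda)
        self.min_samples = min_samples  # warm-up before alarms are allowed
        self.n = 0
        self.mean = 0.0
        self.cum = 0.0      # cumulative deviation from the running mean
        self.cum_min = 0.0  # smallest cumulative deviation seen so far

    def update(self, x: float) -> bool:
        """Consume one observation; return True if drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.n >= self.min_samples and (self.cum - self.cum_min) > self.threshold


def first_alarm(stream, detector):
    """Index of the first alarm in the stream, or None if none is raised."""
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


# Synthetic 0/1 error stream: 20% error rate, then 60% from index 1000 on.
stable = [1.0, 0.0, 0.0, 0.0, 0.0] * 200
drifted = [1.0, 1.0, 1.0, 0.0, 0.0] * 200
alarm_at = first_alarm(stable + drifted, PageHinkley(threshold=20.0))
print(alarm_at)  # some index after 1000: the alarm lags the true change point
```

The lag between the true change point and the alarm is the same lag that the retraining strategies have to work around.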

## Divergent findings / open tensions

### Sato's reproducibility crisis vs. Riess's evaluation rigour critique

Both flag evaluation gaps but from different angles:
- **Sato**: F-score TP/FP/FN definitions vary across 14 of the surveyed drift-detection papers; no shared benchmark; no agreed detection-delay protocol.
- **Riess**: 4 of 17 metaheuristic-adaptation studies don't report class distribution despite using accuracy as sole metric; drift type/pattern unreported when real-world data used; no head-to-head comparison of population-based metaheuristics on the same drift task.
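Sato's point about inconsistent TP/FP/FN definitions is easy to see in code. The sketch below uses one plausible definition (a detection counts as a true positive if it falls within `max_lag` events after a true change point, with each change point matched at most once); the function name, the matching rule, and the example change points are all assumptions for illustration, not a definition taken from any surveyed paper:

```python
def detection_f_score(true_points, detected_points, max_lag):
    """F-score of detected change points under a lag-tolerance matching rule."""
    unmatched = sorted(detected_points)
    tp = 0
    for t in sorted(true_points):
        # First still-unmatched detection within [t, t + max_lag], if any.
        hit = next((d for d in unmatched if t <= d <= t + max_lag), None)
        if hit is not None:
            unmatched.remove(hit)
            tp += 1
    fp = len(unmatched)           # detections that matched no true change
    fn = len(true_points) - tp    # true changes that were never detected
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


true_points = [1000, 3000]            # hypothetical ground-truth drifts
detected_points = [1040, 3300, 4500]  # hypothetical detector output

strict = detection_f_score(true_points, detected_points, max_lag=100)
lenient = detection_f_score(true_points, detected_points, max_lag=500)
print(round(strict, 2), round(lenient, 2))  # 0.4 0.8
```

The same detector output scores 0.4 or 0.8 depending only on the tolerance, which is why F-scores reported under different matching rules are not comparable.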

**Synthesis**: the drift literature relevant to PM suffers from *two* parallel reproducibility crises, one on the detection side (PM-specific, per Sato) and one on the adaptation side (cross-domain, per Riess). Both surveys converge on the recommendation: publish drift characteristics, class distributions, and comparable evaluation protocols.
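Why class distributions matter when accuracy is the sole metric: on a skewed label set, a model that always predicts the majority class already scores high. A small sketch with a hypothetical 90/10 label split (the labels and function names are illustrative, not data from any of the sources):

```python
from collections import Counter


def class_distribution(labels):
    """Share of each class in the label set."""
    counts = Counter(labels)
    return {cls: n / len(labels) for cls, n in counts.items()}


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent class."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)


labels = ["on_time"] * 90 + ["late"] * 10  # hypothetical skewed outcomes

print(class_distribution(labels))          # {'on_time': 0.9, 'late': 0.1}
print(majority_baseline_accuracy(labels))  # 0.9
```

A reported accuracy can only be judged against this baseline, which is unavailable when the class distribution is not published.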

### Online vs. offline

Sato documents that only 7 of 45 surveyed PM papers handle drift online; the remaining 38 are offline detection. Baier's industrial case operates *near-online* (stream-of-batches, not stream-of-events). [[concepts/predictive-process-monitoring|PPM]] deployments overwhelmingly need online or near-online behaviour, yet the literature is dominated by offline analysis.

### Control-flow drift vs. multi-perspective drift

Sato's perspective taxonomy distinguishes control-flow, time, resource, data, and multi-order drift. The literature is heavily biased toward control-flow drift; time/resource/data drift (which Baier's case exemplifies — an *automation-level* feature shift) is under-served.

## Implications for LLM-PPM

[[sources/2026-padella-llm-features-ppm|Padella et al. 2026]] do not directly test drift robustness, but several of their findings are drift-relevant:

- **Embodied prior knowledge** (confirmed by [[concepts/semantic-hashing-probe|semantic hashing]]) means the LLM brings prior structure that classical models can only learn from data. This *may* make LLM-PPM more robust to short-term distribution shifts, but it could also prove *brittle* if the new concept conflicts with pre-training priors.
- **In-context learning** with 100 examples is closer to *near-online retraining* (per Baier's `last` strategy) than to traditional fixed-model training: the LLM "retrains" implicitly with every query.
- **Benchmark contamination risk** (the [[concepts/semantic-hashing-probe|semantic-hashing probe]] addresses this) is parallel to drift evaluation rigour — both demand the same discipline of decomposing what the model "knew" before vs. learned from the test data.
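The in-context-learning analogy can be made concrete with a sliding pool of recent examples: if the few-shot examples are always the most recently completed cases, every query implicitly "retrains" on post-drift data. This is a hypothetical sketch, not Padella et al.'s pipeline; how they select their 100 examples is not stated in this note, and the class name, prompt layout, and case strings are invented for illustration:

```python
from collections import deque


class RecentExamplePool:
    """Keeps only the most recent labelled cases for few-shot prompting."""

    def __init__(self, max_examples=100):
        # Oldest examples are evicted automatically once the pool is full.
        self._pool = deque(maxlen=max_examples)

    def __len__(self):
        return len(self._pool)

    def add(self, trace_text, outcome):
        """Register a completed case and its known outcome."""
        self._pool.append((trace_text, outcome))

    def build_prompt(self, running_trace):
        """Assemble a few-shot prompt ending with the case to predict."""
        shots = "\n".join(f"Trace: {t}\nOutcome: {o}" for t, o in self._pool)
        return f"{shots}\nTrace: {running_trace}\nOutcome:"


pool = RecentExamplePool(max_examples=3)  # tiny pool for demonstration
for i in range(5):
    pool.add(f"case-{i}", "on_time" if i % 2 == 0 else "late")

prompt = pool.build_prompt("case-5")  # contains case-2..case-4 only
```

Whether such recency-based selection actually helps under drift is exactly the kind of question the open question below leaves unanswered.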

**Open question**: how do LLM-PPM predictions degrade under drift, particularly when the drift contradicts strong pre-training priors? Empirical work is missing.

## Reading order

1. [[sources/2021-sato-concept-drift-pm-survey|Sato et al. 2021]] — taxonomy & open-problem inventory.
2. [[sources/2020-baier-handling-concept-drift-bpm|Baier, Reimold & Kühl 2020]] — concrete industrial case + retraining-strategy lessons.
3. [[sources/2022-riess-metaheuristics-concept-drift-survey|Riess 2022]] — adaptation toolkit (metaheuristics).
4. [[sources/2026-padella-llm-features-ppm|Padella et al. 2026]] — drift-adjacent implications of LLM-PPM.

## Open threads

- Maaradji et al. 2017 (sudden/gradual drift detection from execution traces) — referenced in Sato + Baier, not yet in `raw/`.
- Bose, van der Aalst, Žliobaitė & Pechenizkiy 2011 (foundational PM-drift paper) — referenced everywhere, not yet in `raw/`.
- Ostovar et al. 2020 (robust drift characterisation) — referenced in Sato, not in `raw/`.
- Yeshchenko et al. 2019 (visual-analytics drift detection) — referenced in Sato, not in `raw/`.
- **LLM-PPM + drift**: no ingested paper combines the two empirically.
- **Causal drift** — when is a detected drift *intervention-worthy* vs. *spurious*? Currently treated as detection-only.

## Related

- Concept hub: [[concepts/concept-drift]].
- PPM hub: [[concepts/predictive-process-monitoring]].
- LLM-PPM hub: [[concepts/llm-based-ppm]].
- Landscape: [[syntheses/ppm-landscape]].
- Reading list: [[syntheses/llm-bpm-reading-list]].