---
title: Remaining Time Prediction
type: concept
tags: [ppm, prediction, regression, time, earliness, temporal-consistency]
sources:
  - "[[sources/2017-tax-lstm-process-prediction]]"
  - "[[sources/2017-navarin-lstm-data-aware-remaining-time]]"
  - "[[sources/2019-verenich-survey-ppm]]"
  - "[[sources/2023-riess-temporal-loss-remaining-cycle-time]]"
  - "[[sources/2026-padella-llm-features-ppm]]"
created: 2026-04-13
updated: 2026-05-11
---

# Remaining Time Prediction

A [[concepts/predictive-process-monitoring|PPM]] task: given a prefix of an event trace, predict the **remaining execution time** until the case completes.

## Formulation

- **Input:** prefix `⟨e₁, …, eₖ⟩` with timestamps.
- **Output:** scalar `t_remaining` (regression).
- **Training data:** for each prefix extracted from a completed case, label = (case completion timestamp − prefix-end timestamp).

## Canonical approaches

- **Transition system annotation** — van der Aalst et al.; annotate states of a transition system with mean remaining time.
- **Regression trees / ensembles** — feature-engineered. Includes the CatBoost regressor used as a benchmark in [[sources/2026-padella-llm-features-ppm|Padella et al. 2026]].
- **[[concepts/lstm-ppm|LSTM]] regression head** — often in multi-task setup with next-activity prediction.
- **[[concepts/transformer-ppm|Transformer]] regressors** — recent state of the art (e.g. PGTNet — Process Graph Transformer Network — used as benchmark in Padella et al. 2026).
- **[[concepts/llm-based-ppm|LLM-based prediction]]** — emerging family, particularly competitive in data-scarce settings. [[sources/2026-padella-llm-features-ppm|Padella, de Leoni & Dumas 2026]] document Gemini 2.5 Flash Thinking trained on 100 traces matching or surpassing CatBoost / PGTNet on the full BPI12 / Bac / Hospital event logs for Total Time MAE.

## Evaluation — three axes

- **Accuracy** — MAE / MAPE / RMSE in time units (hours, days).
- **Earliness** — accuracy at early prefixes, when information is minimal and prediction is hardest ([[sources/2019-verenich-survey-ppm|Verenich et al. 2019]]; [[sources/2023-riess-temporal-loss-remaining-cycle-time|Riess 2023]]). Operationally critical for queue prioritisation and dynamic resource planning.
- **Temporal Consistency (TC)** — the degree to which consecutive predictions follow the natural monotonic decrease of true remaining time ([[sources/2023-riess-temporal-loss-remaining-cycle-time|Riess 2023]]). Proposed axis; none of the standard losses evaluated (unweighted MAE, exponential / power / moderate temporal decay) produce strictly monotonically decreasing predictions.
- **Error curves over prefix length** — accuracy typically improves as more of the case has been observed.

### Loss-function design

Standard: unweighted **L1 (MAE)** — robust to time-between-event outliers and typically outperforms MSE ([[sources/2019-verenich-survey-ppm|Verenich et al. 2019]]).

**Temporally weighted L1** ([[sources/2023-riess-temporal-loss-remaining-cycle-time|Riess 2023]]): three variants — exponential decay (MAEEtD), power decay (MAEPtD), moderate decay (MAEMtD) — upweight early-prefix residuals. Exponential decay gave statistically significant earliness improvements on 2 of 4 public event logs; should be treated as a tunable hyper-parameter, with a potential trade-off against temporal consistency.

## Challenges

- **Long-tail case durations** — outlier long cases skew error metrics.
- **Case heterogeneity** — different variants have very different remaining-time distributions.
- **Non-stationarity** — process behaviour drifts; models need retraining ([[concepts/concept-drift]]).
- **Data-awareness matters** — incorporating case data (attributes, resources) outperforms control-flow-only ([[sources/2017-navarin-lstm-data-aware-remaining-time]] and others).
- **Small-benchmark external validity** — most published methods evaluated on 3–9 public event logs; controlled synthetic evaluation via [[concepts/business-process-simulation|parametric simulation]] ([[sources/2024-riess-synbps-simulation-framework|SynBPS]]) recommended for data-generating-process-factor analysis.

## Related

[[concepts/predictive-process-monitoring]] · [[concepts/next-activity-prediction]] · [[concepts/outcome-prediction]] · [[concepts/trace-encoding]] · [[concepts/concept-drift]] · [[concepts/business-process-simulation]] · [[concepts/llm-based-ppm]]
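
## Sketch: prefix labels and per-prefix MAE

The label rule from the Formulation section and the error-curve view from the Evaluation section can be illustrated with a minimal Python sketch. It is not taken from any of the cited sources. The function names `remaining_time_labels` and `mae_by_prefix_length` are hypothetical, and the sketch assumes that a case's completion timestamp equals the timestamp of its last event and that timestamps are already sorted.

```python
from collections import defaultdict
from datetime import datetime


def remaining_time_labels(timestamps):
    """Build (prefix_length, remaining_seconds) pairs for one completed case.

    Assumption: the case completes at its last event, so the label for a
    prefix ending at event k is (last timestamp - timestamp of event k).
    """
    completion = timestamps[-1]
    return [
        (k, (completion - ts).total_seconds())
        for k, ts in enumerate(timestamps, start=1)
    ]


def mae_by_prefix_length(records):
    """Group absolute errors by prefix length and average them.

    records: iterable of (prefix_length, y_true, y_pred) in one time unit.
    Returns {prefix_length: MAE}, i.e. the data behind an error curve.
    """
    errors = defaultdict(list)
    for k, y_true, y_pred in records:
        errors[k].append(abs(y_true - y_pred))
    return {k: sum(v) / len(v) for k, v in sorted(errors.items())}


# Hypothetical case with three events at 09:00, 10:00 and 12:00.
case = [
    datetime(2024, 1, 1, 9),
    datetime(2024, 1, 1, 10),
    datetime(2024, 1, 1, 12),
]
labels = remaining_time_labels(case)
```

For the hypothetical case above, `labels` holds 10800, 7200 and 0 seconds for prefix lengths 1, 2 and 3. The full-length prefix always has a label of zero, so whether to keep it in training or evaluation data is a design choice that this sketch leaves open. Reading the output of `mae_by_prefix_length` at small prefix lengths gives a rough view of earliness.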