---
title: "Deep Learning for Predictive Business Process Monitoring: Review and Benchmark"
type: source
tags: [ppm, deep-learning, survey, benchmark, hub]
authors: ["Rama-Maneiro, Efrén", "Vidal, Juan C.", "Lama, Manuel"]
year: 2020
venue: "arXiv:2009.13251v1 [cs.LG]"
kind: paper
raw_path: "raw/Predictive process monitoring/Deep learning for predictive business process monitoring - Review and benchmark 2020.pdf"
created: 2026-04-13
updated: 2026-04-13
key_claims:
  - Systematic literature review of deep-learning PPM approaches up to 2020.
  - Empirical benchmark of 10 approaches on 12 publicly available event logs.
  - Identifies difficulties in selecting the most suitable approach for a specific problem.
  - Proposes a categorisation of DL approaches for PPM across prediction target, architecture, encoding, and data perspective.
---

# Rama-Maneiro, Vidal, Lama 2020 — Deep Learning PPM Review & Benchmark

**Hub survey/benchmark paper** for deep-learning [[concepts/predictive-process-monitoring|PPM]]. Combines a systematic literature review ([[methods/systematic-literature-review|SLR]]; selection criteria, architecture analysis) with an empirical benchmark of 10 concrete approaches on 12 public logs.

## Why this matters
- Provides a **coherent taxonomy** of DL-PPM methods: by prediction target (next-activity / remaining-time / outcome), architecture (LSTM / GRU / CNN / attention / ensemble), and encoding strategy.
- The benchmark is one of the primary references for choosing a PPM approach.
- Uses public logs that are the community standard (BPI Challenges, Helpdesk, Sepsis), so its results can be compared with other work evaluated on the same logs, provided preprocessing and splits match.
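
The taxonomy axes listed above can be read as a small data model. A minimal Python sketch, assuming hypothetical names throughout: `Target`, `Architecture`, `Approach`, and `by_target` are illustrative, the `approach-A/B/C` entries are placeholders rather than methods from the paper, and the encoding values are invented for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Target(Enum):
    NEXT_ACTIVITY = "next-activity"
    REMAINING_TIME = "remaining-time"
    OUTCOME = "outcome"


class Architecture(Enum):
    LSTM = "LSTM"
    GRU = "GRU"
    CNN = "CNN"
    ATTENTION = "attention"
    ENSEMBLE = "ensemble"


@dataclass(frozen=True)
class Approach:
    name: str                   # placeholder label, not a method from the paper
    targets: frozenset          # one approach may cover several prediction targets
    architecture: Architecture
    encoding: str               # free text: this note does not enumerate encodings


def by_target(approaches, target):
    """Return the approaches that address the given prediction target."""
    return [a for a in approaches if target in a.targets]


# Placeholder catalogue, for illustration only.
catalogue = [
    Approach("approach-A", frozenset({Target.NEXT_ACTIVITY}),
             Architecture.LSTM, "one-hot"),
    Approach("approach-B", frozenset({Target.NEXT_ACTIVITY, Target.REMAINING_TIME}),
             Architecture.CNN, "embedding"),
    Approach("approach-C", frozenset({Target.OUTCOME}),
             Architecture.ATTENTION, "embedding"),
]

print([a.name for a in by_target(catalogue, Target.NEXT_ACTIVITY)])
```

Filtering by target first mirrors how the taxonomy would be used in practice: the prediction target is fixed by the business question, while architecture and encoding are the choices left open.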

## Bridges
Together with [[sources/2016-teinemaa-outcome-ppm-review]] (outcome-PPM review) and [[sources/2019-verenich-survey-ppm]] (remaining-time survey), this is part of the **survey triad** a PPM researcher should start from.

## Cited by
- [[sources/2023-riess-temporal-loss-remaining-cycle-time]] — Riess cites Rama-Maneiro et al. as the canonical DL-PPM benchmark reference when motivating the need for a third evaluation axis (temporal consistency) beyond accuracy and earliness.
- [[sources/2023-riess-phd-thesis-ppm]] — used in the thesis background as the primary DL-PPM survey.

## External-validity caveat
The 12-log benchmark is the community standard but constitutes a closed sample — cross-log transfer is not evaluated, so benchmark-leading results do not guarantee performance on a specific organisation's process. The [[concepts/rct-limitations|external-validity]] critique Cartwright & Hardie apply to policy RCTs (*"it worked there, but will it work here?"*; see [[sources/2023-anjum-rocca-phi403-lecture-11-is-more-data-better]] and [[sources/2023-anjum-rocca-phi403-lecture-19-what-rcts-do-not-show]]) applies directly to PPM benchmarking.
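
To make the distinction concrete, the sketch below contrasts a within-log protocol (train and test on the same log) with a leave-one-log-out protocol that would probe cross-log transfer. Both functions are hypothetical illustrations, not procedures from the paper, and the log names are just the families mentioned in this note. In practice cross-log transfer would also require reconciling activity vocabularies across logs, which this sketch ignores.

```python
def within_log_splits(log_names):
    """Within-log protocol: each log is both the training and the test source."""
    return [([name], name) for name in log_names]


def leave_one_log_out(log_names):
    """Hypothetical transfer protocol: test on a log never seen in training."""
    splits = []
    for held_out in log_names:
        train = [n for n in log_names if n != held_out]
        splits.append((train, held_out))
    return splits


logs = ["BPI Challenge", "Helpdesk", "Sepsis"]

for train, test in leave_one_log_out(logs):
    print(f"train on {train}, test on {test}")
```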

## Connections
**Concepts:** [[concepts/predictive-process-monitoring]] · [[concepts/lstm-ppm]] · [[concepts/transformer-ppm]] · [[concepts/trace-encoding]] · [[concepts/rct-limitations]]