#evaluation
10 notes.
- AI Agent Benchmarks & Productivity Measurement (BPM/wiki/concepts/ai-agent-benchmarks)
- BPMN Assistant: An LLM-Based Approach to Business Process Modeling (BPM/wiki/sources/2026-licardo-bpmn-assistant)
- Business Process Simulation (BPM/wiki/concepts/business-process-simulation)
- LLMs Corrupt Your Documents When You Delegate (BPM/wiki/sources/2026-laban-schnabel-neville-llms-corrupt-documents-delegate)
- Process Model Quality & Soundness — Evaluation Guide for BPMN Models from Qualitative Interviews (BPM/wiki/syntheses/process-model-quality-and-soundness-evaluation-guide)
- Remaining Cycle Time Prediction: Temporal Loss Functions and Prediction Consistency (BPM/wiki/sources/2023-riess-temporal-loss-remaining-cycle-time)
- Study sketch — SynBPS-APM: A controlled testbed for agentic BPM (BPM/wiki/syntheses/study-sketch-synbps-apm)
- Study sketch — Temporal consistency of LLM-agent runtime recommendations (BPM/wiki/syntheses/study-sketch-temporal-consistency-agents)
- SynBPS: A Parametric Simulation Framework for the Generation of Event-Log Data (BPM/wiki/sources/2024-riess-synbps-simulation-framework)
- The BRAGE Benchmark: Evaluating Zero-shot Learning Capabilities of LLMs for Norwegian Customer Service Dialogues (BPM/wiki/sources/2025-riess-jorgensen-brage-benchmark-norwegian-llm)