#benchmark
6 notes.
- Deep Learning for Predictive Business Process Monitoring: Review and Benchmark (BPM/wiki/sources/2020-rama-maneiro-deep-learning-ppm-review)
- LLMs Corrupt Your Documents When You Delegate (BPM/wiki/sources/2026-laban-schnabel-neville-llms-corrupt-documents-delegate)
- Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity (BPM/wiki/sources/2025-becker-metr-ai-developer-productivity)
- Survey and Cross-benchmark Comparison of Remaining Time Prediction Methods in Business Process Monitoring (BPM/wiki/sources/2019-verenich-survey-ppm)
- The BRAGE Benchmark: Evaluating Zero-shot Learning Capabilities of LLMs for Norwegian Customer Service Dialogues (BPM/wiki/sources/2025-riess-jorgensen-brage-benchmark-norwegian-llm)
- TheAgentCompany: Benchmarking LLM Agents on Consequential Real-World Tasks (BPM/wiki/sources/2024-xu-the-agent-company-benchmark)