
AI Agent Benchmarks & Productivity Measurement

#ai-agent-benchmark #llm-agents #productivity #rct #evaluation #methodology
On this page
  • Two methodological poles
  • Synthetic but multi-tool agent benchmarks
  • Field RCT with human developers
  • What each methodology is blind to
  • Methodological lessons
  • Relevance to APM / BPM
  • Related