#ai-agent-benchmark
2 notes.
AI Agent Benchmarks & Productivity Measurement
BPM/wiki/concepts/ai-agent-benchmarks
TheAgentCompany: Benchmarking LLM Agents on Consequential Real-World Tasks
BPM/wiki/sources/2024-xu-the-agent-company-benchmark