#simulated-environment
1 note.
TheAgentCompany: Benchmarking LLM Agents on Consequential Real-World Tasks
BPM/wiki/sources/2024-xu-the-agent-company-benchmark