bench
success score 0.90
check-packages
3.3s duration, 3 events, 2026-07-03 08:33:31
"The agent correctly identified the maintenance status of most packages, but did not provide a clear summary or conclusion."
Input
{ "packages": [ "requests", "pandas", "httpx", "openai" ] }
Output
"3/4 packages actively maintained\nrequests 2.34.2: 49d ago, active\npandas 3.0.3: 52d ago, active\nhttpx 0.28.1: 573d ago, STALE\nopenai 2.44.0: 8d ago, active"
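The output above pairs each package with the age of its latest release and an active/STALE label, topped by a count of active packages. A minimal sketch of how such a report could be produced is shown below. It is not the agent's actual implementation: the `Release` type, the `status` and `format_report` helpers, and the 365-day staleness cutoff are all assumptions made for illustration (the record does not state a threshold; it only shows 52d as active and 573d as STALE). The release ages are copied from the output rather than fetched from a package index.

```python
from dataclasses import dataclass

# Assumed cutoff: the run record does not say what threshold the agent used.
STALE_AFTER_DAYS = 365


@dataclass
class Release:
    """Hypothetical container for one package's latest release."""
    name: str
    version: str
    days_since_release: int


def status(release: Release, stale_after_days: int = STALE_AFTER_DAYS) -> str:
    """Label a release 'STALE' if it is older than the cutoff, else 'active'."""
    return "STALE" if release.days_since_release > stale_after_days else "active"


def format_report(releases: list[Release],
                  stale_after_days: int = STALE_AFTER_DAYS) -> str:
    """Build a summary line followed by one line per package."""
    active = sum(1 for r in releases if status(r, stale_after_days) == "active")
    lines = [f"{active}/{len(releases)} packages actively maintained"]
    for r in releases:
        lines.append(
            f"{r.name} {r.version}: {r.days_since_release}d ago, "
            f"{status(r, stale_after_days)}"
        )
    return "\n".join(lines)


# Values taken from the Output field of this run.
releases = [
    Release("requests", "2.34.2", 49),
    Release("pandas", "3.0.3", 52),
    Release("httpx", "0.28.1", 573),
    Release("openai", "2.44.0", 8),
]
report = format_report(releases)
print(report)
```

With the assumed cutoff, the sketch reproduces the same four-package report recorded in the Output field; a different cutoff between 52 and 573 days would give the same labels for this input.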
Event stream (3)
start 08:33:31
log 08:33:34
end 08:33:34