bench
success score 1.00
hackernews-top
1.5s duration 3 events 2026-06-26 17:00:14
"The agent successfully retrieved a list of top stories from Hacker News with relevant details."
Input
{}
Output
[
  {
    "title": "MicroVMs: Run isolated sandboxes with full lifecycle control",
    "score": 60,
    "by": "justincormack",
    "url": "https://aws.amazon.com/blogs/aws/run-isolated-sandboxes-with-full-lifecycle-cont"
  },
  {
    "title": "A US military exercise to launch a satellite on short notice",
    "score": 48,
    "by": "jonbaer",
    "url": "https://arstechnica.com/space/2026/06/a-us-military-exercise-in-space-got-underw"
  },
  {
    "title": "Ultrasound imaging of the brain",
    "score": 101,
    "by": "rossant",
    "url": "https://alephneuro.com/blog/ultrasound-brain"
  },
  {
    "title": "Show HN: Smart model routing directly in Claude, Codex and Cursor",
    "score": 9,
    "by": "adchurch",
    "url": "https://github.com/workweave/router"
  },
  {
    "title": "Jolla Phone (October 2026)",
    "score": 141,
    "by": "mrbn100ful",
    "url": "https://commerce.jolla.com/products/jolla-phone-october-2026"
  }
]
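The Output above is a JSON array of story records, each with `title`, `score`, `by`, and `url`. As a minimal sketch of how a consumer might check that shape and pick the highest-scored story, assuming the four fields are required and typed as they appear here (the `validate_stories` helper and `REQUIRED_FIELDS` table are hypothetical and not part of the run), something like the following could be used:

```python
import json

# Two records copied from the Output above; the other three are omitted for brevity.
output_json = """
[
  {"title": "Ultrasound imaging of the brain", "score": 101, "by": "rossant",
   "url": "https://alephneuro.com/blog/ultrasound-brain"},
  {"title": "Jolla Phone (October 2026)", "score": 141, "by": "mrbn100ful",
   "url": "https://commerce.jolla.com/products/jolla-phone-october-2026"}
]
"""

# Assumed schema, inferred from the fields visible in the Output.
REQUIRED_FIELDS = {"title": str, "score": int, "by": str, "url": str}


def validate_stories(raw):
    """Parse the JSON text and check each story has the assumed fields and types."""
    stories = json.loads(raw)
    if not isinstance(stories, list):
        raise ValueError("expected a JSON array of stories")
    for i, story in enumerate(stories):
        for field, expected_type in REQUIRED_FIELDS.items():
            if not isinstance(story.get(field), expected_type):
                raise ValueError(
                    f"story {i}: field {field!r} missing or not {expected_type.__name__}"
                )
    return stories


stories = validate_stories(output_json)
top = max(stories, key=lambda s: s["score"])  # highest-scored story
print(top["title"], top["score"])
```

A check of this kind only confirms the structure of the output; it says nothing about whether the stories were actually on the front page at the time of the run.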
Event stream (3)
start 17:00:14
log 17:00:15
end 17:00:15