ESI‑Bench: The ImageNet‑Style Benchmark for Embodied Spatial Intelligence
ESI‑Bench, introduced by Fei‑Fei Li's team, turns the observer into an active agent in order to evaluate embodied spatial intelligence across 10 task categories and 3,081 instances. Its results indicate that perception is not the bottleneck, that action strategies are critical, and that imperfect 3D reconstructions can hurt performance. Compared with humans, current models suffer from action blindness and metacognitive deficits.
