One-off tests don’t measure AI’s true impact. We’re better off shifting to more human-centered, context-specific methods.
For decades, artificial intelligence has been evaluated by asking whether machines outperform humans. From chess to advanced math, from coding to essay writing, the performance of AI models and applications is tested against that of individual humans completing tasks.