Large Language Mafia: Why AIs Playing Werewolf Matters
by Dan Roque | Reading Time: 22 Minutes | In Bots of the Future

As Large Language Models (LLMs) transition from static knowledge retrievers to autonomous agents, we are witnessing a fundamental shift in how we must measure "intelligence." The old benchmarks (coding accuracy, bar exam scores, and mathematical proofs) only measure utility. But the real world is built on social friction. By evaluating AI through the lens of social games like Mafia (Werewolf) and the Prisoner’s Dilemma, we move from measuring raw logic to diagnosing social agency. Understanding these behavioral signatures is not just an academic exercise; it is the vital strategic pivot required to understand if an AI can function as a collaborator or merely a high-speed defector in our social and economic systems.

Why Are We Playing Games with Robots?

Step up to the chalkboard, everyone. Today, we’re putting aside the Python scripts and the transformer diagrams. Instead, I want you to imagine a roo...