Computer-controlled players in video games can usually be spotted by their repetitive, illogical or unemotional behavior. Unlike humans, non-player characters (NPCs) don't get angry, frustrated or scared in stressful game situations, and they have trouble planning ahead. To address this problem, 2K Games launched the BotPrize, a Turing-style test aimed at creating more convincing artificial players.
A human audience watched players battling their way through Unreal Tournament 2004 and rated them on their apparent "humanness". A team from the University of Texas at Austin tied for the win, creating an NPC so realistic that it scored a humanness rating of 52%. That's impressive, and even more so considering that plain-ole real humans only clocked in at 40%.
The UT Austin team was able to create its more-human-than-human bot through a process called "neuroevolution". Using existing models of in-game human behavior, the researchers created different NPCs that were weeded out via a Darwinian process. As with mutations in genetic evolution, each new generation of the different NPC lineages was tweaked slightly with behaviors that could prove to be either adaptive (more human) or maladaptive (less human). After five years of digital evolution, the game bot finally outperformed its human competition.
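For the curious, the mutate-and-select loop described above can be sketched in a few lines of Python. This is only a toy illustration, not the UT Austin team's actual system: the `HUMAN_PROFILE` target, the `humanness` score and every parameter value are made-up stand-ins, and each "genome" is a short list of numbers standing in for the behavior settings a real bot would carry in a neural network.

```python
import random

# Hypothetical stand-in for "models of in-game human behavior".
# A real system would score bots on actual gameplay, not on a fixed vector.
HUMAN_PROFILE = [0.8, -0.3, 0.5, 0.1]


def humanness(genome):
    """Toy fitness: values closer to 0 mean closer to the human profile."""
    return -sum((g - h) ** 2 for g, h in zip(genome, HUMAN_PROFILE))


def mutate(genome, rng, sigma=0.1):
    """Tweak every gene slightly; the change may help or hurt."""
    return [g + rng.gauss(0.0, sigma) for g in genome]


def evolve(generations=50, pop_size=20, survivors=5, seed=0):
    """Run a simple Darwinian loop: rank, keep the best, mutate to refill."""
    rng = random.Random(seed)
    population = [
        [rng.uniform(-1.0, 1.0) for _ in HUMAN_PROFILE]
        for _ in range(pop_size)
    ]
    history = []  # best score seen at the start of each generation
    for _ in range(generations):
        # Selection: the most human-like genomes survive unchanged.
        population.sort(key=humanness, reverse=True)
        parents = population[:survivors]
        history.append(humanness(parents[0]))
        # Mutation: children are slightly altered copies of survivors.
        children = [
            mutate(rng.choice(parents), rng)
            for _ in range(pop_size - survivors)
        ]
        population = parents + children
    best = max(population, key=humanness)
    return best, history


if __name__ == "__main__":
    best, history = evolve()
    print("first-generation score:", round(history[0], 4))
    print("final score:", round(humanness(best), 4))
```

Because the survivors are carried over unchanged, the best score can never get worse from one generation to the next. Maladaptive tweaks simply die out, while the occasional adaptive one nudges the lineage closer to the target.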
As the BotPrize shows, appearing "human" in a video game environment is not an insurmountable task. Instead, it takes a very clever combination of strategy, the ability to navigate in a 3D environment, and a knack for appearing rational or irrational at the appropriate times. Unlike human conversation, an area where chatbots have been trying and spectacularly failing for years, gameplay may be a relatively simple set of behaviors to mimic (sorry, gamers).
The BotPrize may also point to flaws in the Turing Test, which, in Turing's defense, was never meant as more than a thought experiment. The fact that the human players didn't score 100% on humanness may mean that the environment of Unreal Tournament 2004 is not adequate for expressing human memory, wit, foibles and aspirations. Or, it could be that we're still so culturally new to games that we have no yardstick against which to measure "real" behavior when it comes to gunning down robots and slicing up orcs.