The NetHack Challenge 2021 is over, and the results clearly expose the weaknesses of current deep learning systems. The game's complex mechanics, with hunger, exhaustion, pets, gods, monsters, riddles, poisonous waterholes, and hallucinations, quickly and frequently lead to the character's death. When that happens, the game restarts and the world is regenerated.
NetHack Is More Complex Than StarCraft II
Navigating the enormously complex game world requires ingenuity and creativity, and NetHack players have been exchanging strategies for decades. They have written a comprehensive wiki that can help newcomers on the arduous journey to the amulet. According to the researchers, a winning NetHack run takes human experts roughly 25 to 50 times as many game steps as a StarCraft II match. The comparison matters because modern video games such as StarCraft II, Dota 2, and Minecraft have been a focus of AI research for several years. With AlphaStar, DeepMind created a system that can defeat human professional players in StarCraft II.
However, NetHack has an advantage as a training environment for artificial intelligence: its game world is more open and complex than that of many other video games, yet thanks to the game's low computational requirements, AI agents can be trained in a few hours or days on standard graphics cards. Systems like AlphaStar, by contrast, take weeks or months to learn complex video games like StarCraft II and rely on supercomputers to do so. In June, the researchers launched the NetHack Challenge with the support of Meta and DeepMind. The goal: to build the best NetHack bot.
NetHack Challenge 2021 Reveals The Gap Between The Two Approaches
The many submissions fall into two distinct approaches: reinforcement learning and symbolic systems. The latter resemble classic bots and rely on external knowledge collected by humans: pre-programmed survival strategies for food, inventory use, fighting or fleeing, and higher-level goals such as seeking out the Gnomish Mines. The deep learning submissions, on the other hand, relied partially or entirely on reinforcement learning. In this comparison, the more modern AI methods lost out to the symbolic systems: the three best teams all used symbolic bots. On average, symbolic bots achieved two to three times as many points, and in some cases up to ten times as many. On their best runs, symbolic bots survived three times longer than deep learning bots and made it down to dungeon level 22. The leading cause of death for all bots was starvation.
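The contrast between the two camps can be sketched in a toy example. Everything below is invented for illustration (a one-dimensional "corridor" with a hunger budget, not the NetHack Learning Environment or any actual challenge submission): a symbolic bot follows a single hand-written rule, while an RL bot has to discover the same behavior from reward signals via tabular Q-learning.

```python
import random

# Toy 1-D "dungeon corridor": the agent starts at cell 0 and must reach
# the food at the last cell before its hunger budget runs out.
# Invented for illustration only -- this is NOT the NetHack Learning
# Environment or any real challenge entry.
CORRIDOR_LEN = 6
GOAL = CORRIDOR_LEN - 1
ACTIONS = (-1, +1)          # step left, step right
MAX_STEPS = 30              # "hunger": the episode ends after this many steps

def step(pos, action):
    """Apply an action; return (new_pos, reward, done)."""
    new_pos = min(max(pos + action, 0), GOAL)
    if new_pos == GOAL:
        return new_pos, 1.0, True   # found the food
    return new_pos, 0.0, False

# Symbolic bot: one hand-written survival rule, no learning involved.
def symbolic_policy(pos):
    return +1                       # rule: always walk toward the food

# RL bot: tabular Q-learning. The behavior policy is uniform random
# exploration; Q-learning is off-policy, so it can still recover the
# optimal greedy policy from random play.
def train_q_learning(episodes=2000, alpha=0.5, gamma=0.9):
    random.seed(0)
    q = {(s, a): 0.0 for s in range(CORRIDOR_LEN) for a in ACTIONS}
    for _ in range(episodes):
        pos = 0
        for _ in range(MAX_STEPS):
            a = random.choice(ACTIONS)
            new_pos, reward, done = step(pos, a)
            best_next = max(q[(new_pos, x)] for x in ACTIONS)
            q[(pos, a)] += alpha * (reward + gamma * best_next - q[(pos, a)])
            pos = new_pos
            if done:
                break
    return q

def survives(policy):
    """Run one episode; True if the bot reaches the food before starving."""
    pos = 0
    for _ in range(MAX_STEPS):
        pos, _, done = step(pos, policy(pos))
        if done:
            return True
    return False

q = train_q_learning()
learned_policy = lambda pos: max(ACTIONS, key=lambda a: q[(pos, a)])
print(survives(symbolic_policy), survives(learned_policy))  # both should reach the food
```

In this trivial world both bots end up with the same behavior, but the cost differs: the symbolic rule works immediately, while the RL bot needs thousands of episodes of trial and error, which hints at why, in a world as deep as NetHack, the hand-coded systems still came out ahead.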
NetHack: A Ray Of Hope For Deep Learning
The devastating first round also carries good news: NetHack appears to be a suitable testbed for major challenges that deep learning methods still face. The best neural bot also beat the researchers' baseline agent presented in the 2020 publication by a factor of five. Yet all systems, even the symbolic ones, remain far behind human performance: in almost 500,000 runs, no system managed to complete NetHack. The NetHack Challenge is a reminder that the quest for AGI is not over and that there is more than one horse in the race.
Another major challenge is enabling learning machines to reason and plan. "How do we reconcile logical thinking with gradient-based learning? The NetHack Challenge rubs that question in our faces." The journey has only just begun, and a second NetHack Challenge in 2022 could bring advances in deep learning. In the second round, the focus is expected to shift toward integrating external knowledge from the NetHack wiki and game recordings of human professionals. In our AI podcast DEEP MINDS on YouTube, Spotify, Apple, Soundcloud, Amazon Music, Google Podcasts, or via RSS feed, participating AI researcher Tim Rocktäschel explains why these challenges are essential for reinforcement learning beyond NetHack.