Pushing the State of the Art
When it comes to DRL training, an agent is given the freedom to play a game repeatedly, making and learning from its own decisions. Each time the agent makes a decision, it is given a signal that describes how successful it was. Those signals allow the agent to learn through trial and error: Strategies that seem to produce positive signals are reinforced, and behaviors that lead to bad outcomes are used less and less, explained Staley.
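The trial-and-error loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Meta Arcade code. It assumes a hypothetical five-cell corridor in place of an arcade game and uses a small lookup table (tabular Q-learning) where a DRL agent would use a neural network. The function name `train`, the reward values and all parameters are illustrative assumptions.

```python
import random

def train(n_states=5, episodes=300, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy corridor: start in cell 0, goal in the last cell."""
    rng = random.Random(seed)
    # q[s][a]: estimated value of action a (0 = left, 1 = right) in cell s
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap on episode length
            # Explore occasionally; otherwise pick the action that looks best
            # so far (ties are broken at random).
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            done = s_next == n_states - 1
            # The signal: +1 for reaching the goal, a small penalty otherwise.
            reward = 1.0 if done else -0.01
            target = reward if done else reward + gamma * max(q[s_next])
            # Reinforce or weaken the chosen action based on the signal.
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                break
    return q

q_table = train()
# Preferred action in each non-goal cell after training
policy = ["right" if right > left else "left" for left, right in q_table[:-1]]
print(policy)
```

Actions that lead toward the goal accumulate higher values and are chosen more often, while actions that waste steps are chosen less and less. Note that the agent's skill here is measured only against the corridor it was trained on.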
But how well can an agent trained on one kind of maze solve circular mazes? Or mazes displayed in a different color? Additionally, what tools and methods can be used to train an agent that solves mazes in general? Should it be shown many colors and then different maze types? Or maybe the other way around?
“Those are difficult questions to answer because the expertise of the agent is entirely measured against the training problem itself,” Staley said. “What my colleagues and I realized when studying these topics is that we rarely have the tools to properly ask these research questions. That’s what prompted Meta Arcade.”
The tool’s name reflects its objective: Meta Arcade not only allows researchers to train AI agents through gaming but also prompts researchers to evaluate the games themselves. By creating new gaming environments through Meta Arcade, researchers can create problems and therefore benchmarks to evaluate algorithm performance, Ashcraft explained. This enables DRL researchers to create rich problem sets and compare one algorithm’s problem-solving capabilities to those of another.
“The value in creating new environments and setting new benchmarks,” Ashcraft said, “is that it helps us push the state of the art.”
Genetic research on fruit flies set the path for research on more complex organisms, and AI techniques developed for chess playing were foundational to solving problems like data mining and molecular dynamics, explained Mike Wolmetz, who manages APL’s Human and Machine Intelligence program.
Just as computer chess was once called the “fruit fly of AI,” Wolmetz said, Meta Arcade is the fruit fly of lifelong machine-learning research: a critical mechanism through which more complex problems can be solved.
“Meta Arcade is helping the Lab solve problems related to agent adaptability, including maritime overhead imagery recognition and missile defense in unpredictable contexts,” he said.
Meta Arcade was developed with support from an APL team that includes Wolmetz as well as DARPA Lifelong Learning Machines project manager and technical lead Gautam Vallabha, robotics software engineer Kapil Katyal, electrical engineer Chris Ratto and AI researcher Cash Costello.
Meta Arcade Applied
In work funded by the Office of Naval Research (ONR), APL researchers are using Meta Arcade to study strategies for producing agents that remain robust when their perception inputs or tasks change.
Jared Markowitz, an AI researcher in REDD who leads the ONR-funded project, said that insights gained from the arcade’s testing environments are being used to produce more versatile maritime platform defense agents capable of handling different fleet geometries, threat types and countermeasures. “Meta Arcade is also helping to refine algorithms that can classify overhead ocean imagery collected under variable viewing conditions,” he noted.
Tamim Sookoor, a computer scientist in APL’s Asymmetric Operations Sector, and former staff member Christina Selby applied Meta Arcade while leading a project for the Johns Hopkins University Institute for Assured Autonomy. The project, RADICS (Runtime Assurance of Distributed Intelligent Control Systems), sought to understand and predict how DRL models will fail in a given scenario. Meta Arcade enabled the team to observe and quantify a DRL model’s uncertainty with respect to specific changes in the environment, like game background color and ball speed.
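The idea of probing a fixed agent against controlled changes in its environment can be illustrated with a toy sweep. The sketch below is not the RADICS method and does not use Meta Arcade's interface. It assumes a hypothetical ball-catching game, a hand-written paddle policy instead of a trained DRL model, and it reports a simple catch rate rather than a model's uncertainty. The names `catch_rate` and `ball_speed` and the grid dimensions are illustrative assumptions.

```python
import math

def catch_rate(ball_speed, width=11, height=10):
    """Fraction of start columns from which a fixed paddle policy catches the ball."""
    catches = 0
    for ball_col in range(width):               # try every start column
        paddle = width // 2                     # paddle starts in the center
        steps = math.ceil(height / ball_speed)  # steps until the ball lands
        for _ in range(steps):
            # Fixed policy: move one column toward the ball each step.
            if paddle < ball_col:
                paddle += 1
            elif paddle > ball_col:
                paddle -= 1
        catches += paddle == ball_col
    return catches / width

# Sweep one environment parameter while holding the policy fixed.
for speed in (1, 2, 5, 10):
    print(f"ball_speed={speed:2d}  catch_rate={catch_rate(speed):.2f}")
```

Holding the policy fixed and varying one parameter at a time shows where performance starts to break down. In this toy, the paddle catches every ball at low speeds and misses most of them once the ball falls faster than the paddle can cross the screen.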
As APL’s sponsors look to deploy AI agents in unpredictable real-world environments, the Lab’s DRL community will continue to develop intelligent agents with the ability to quickly and reliably adapt their strategies to changing conditions in the field, according to Ratto, who leads the ISC’s Artificial Intelligence Group.
“Meta Arcade will challenge the larger AI research community to develop better tools that improve AI robustness and strengthen trust in an AI agent’s decision-making,” he said.