Study Questions AI’s Communication Prowess

In the strategy game Diplomacy, set just before World War I, winning depends not on chance but on negotiating skill. Players, representing the military powers of Europe, spend their time forging alliances, building trust, and ultimately betraying each other in pursuit of territorial control. “The best negotiator will rise to victory,” says Avalon Hill, the game’s publisher.

So, when an AI model from Meta named CICERO entered an online Diplomacy league in 2022 and trounced human players across 40 games, it appeared as though machines had cracked the code of human-like communication.

Strategic prowess

However, a new study from USC Viterbi’s Information Sciences Institute suggests otherwise. CICERO’s victories, the researchers found, have more to do with its strategic prowess than with any mastery of conversation. In fact, its communication abilities still fall short of those displayed by human players.

The study’s findings may provide new insight into how AI interacts with humans—not just during a board game but in solving everyday problems.

“We’re interested in AI-human communication,” the researchers explain. “A key question is: How much deception is the AI using?”

To investigate, the team set up a series of Diplomacy games, with CICERO pitted against human players. Over the course of 24 games and 200 hours of competition, the team collected over 27,000 in-game messages. But unlike previous studies that focused on the AI’s win rate, this research dug deeper into CICERO’s ability to deceive and persuade—skills central to Diplomacy.

Abstract reasoning

Using a method called Abstract Meaning Representation (AMR), the researchers analyzed in-game messages by converting them into structured data. This allowed them to compare what players said they would do with what they actually did. For example, if Germany told England, “I’ll help you invade Sweden next turn,” the researchers could check whether Germany really followed through or instead took a contradictory action.
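The check described above can be illustrated with a minimal sketch. This is not the study's actual AMR pipeline; the data structures and names below are hypothetical simplifications, assuming each message has already been parsed down to a structured commitment that can be compared against the orders a player actually submitted.

```python
# Hypothetical sketch: compare a stated commitment (extracted from a
# parsed message) against the orders the player actually submitted.
from dataclasses import dataclass

@dataclass(frozen=True)
class Commitment:
    speaker: str   # e.g. "GERMANY"
    action: str    # e.g. "SUPPORT"
    target: str    # e.g. "ENGLAND"
    province: str  # e.g. "SWE"

def is_broken(commitment: Commitment,
              actual_orders: set[tuple[str, str, str]]) -> bool:
    """A stated commitment counts as broken (potentially deceptive)
    if no submitted order matches it."""
    key = (commitment.action, commitment.target, commitment.province)
    return key not in actual_orders

# Germany told England: "I'll help you invade Sweden next turn"...
promise = Commitment("GERMANY", "SUPPORT", "ENGLAND", "SWE")

# ...but Germany's submitted orders that turn did something else.
orders = {("MOVE", "GERMANY", "DEN")}

print(is_broken(promise, orders))  # True: the promise was not kept
```

Scaled across thousands of messages, a tally of broken versus kept commitments gives one simple way to quantify how often a player says one thing and does another.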

This approach helped the team measure deception and persuasion, and compare CICERO’s communication abilities to those of human players.

Despite CICERO winning 20 of the 24 games, the study found that its messages often didn’t align with its actions. “If you listen to what it’s saying, it’s nonsense,” the researchers note. “It’s repeating things that a Diplomacy player has said before. But it doesn’t match up with what it’s doing.”

Restricted comms

In other tests, CICERO was restricted in its communication—sometimes unable to send messages at all, and other times allowed only basic strategic information. Surprisingly, these restrictions didn’t have much impact on its success, suggesting that negotiation wasn’t a major factor in its victories.

Human players, on the other hand, excel at lying. The study revealed that people are far more deceptive and persuasive than CICERO. In fact, once players realized they were up against an AI, they tended to lie to CICERO more frequently.

“What makes CICERO so effective is its experience—it’s seen a lot of Diplomacy games and knows what moves to make,” the researchers explain. “But it struggles to be convincing or deceptive, and it doesn’t respond much to what other players are saying.”

While Diplomacy is just a game, understanding AI deception in this context could have broader applications. The researchers believe this knowledge could help in developing AI tools to counter real-world threats, such as identifying misinformation online.

“There are plenty of bad actors out there,” they conclude. “We want to help protect against that by offering an extra layer of defense.”
