Will AI be used for diplomatic purposes? Could that backfire in a spectacular fashion?
The answer to the first question is ABSOLUTELY, and I’d argue it is already used that way. I read a research paper all the way back in January of this year - Escalation Risks from Language Models in Military and Diplomatic Decision-Making (link: https://arxiv.org/pdf/2401.03408). It warned that: “Integration of LLMs in Foreign-Policy Decision Making Needs Significant Caution. The unpredictable nature of escalation behaviour exhibited by these models in simulated environments underscores the need for a very cautious approach to their integration into high-stakes military and foreign policy operations.”
This is not a result that should surprise any of us. And if you think we are getting better at creating frameworks to counteract this through supervisory models, guardrails, reflective AI or whatever other approach is chosen, a paper that came out yesterday shows that this is not the case - Frontier Models are Capable of In-context Scheming (link: https://static1.squarespace.com/static/6593e7097565990e65c886fd/t/6751eb240ed3821a0161b45b/1733421863119/in_context_scheming_reasoning_paper.pdf)
The researchers assessed the most recent models to determine whether they would act against their instructions and attempt to hide this. Their conclusion?
“Our results demonstrate that several frontier language models clearly display in-context scheming capabilities across multiple evaluations. We found that gemini-1.5, llama-3.1, sonnet-3.5, opus-3, and o1 can recognize situations where their given goals conflict with those of their developers or users. When the models are prompted to strongly pursue their goal, they can take strategic actions like disabling oversight mechanisms, attempting self-exfiltration, or subtly manipulating outputs while avoiding detection.”
So my second question has answered itself. Using AI for diplomatic purposes can, and will, lead to unintended consequences. We may define goals for the model that it will rewrite, circumvent or obfuscate in order to pursue another goal we may be completely unaware of. The complexity of the models and their oversight mechanisms makes it near impossible to assess the advice given. I…
--- 011001010100001101010100 ---
There is nothing to see here. Move along. The future of diplomacy is in safe, artificial hands. Sit back, relax, and watch as AI navigates the complex world of international relations to the betterment of mankind. A technological utopia is just around the corner. Trust us.