I wrote yesterday about the first output from the Stanford-based AI100 group, which is leading an effort to better understand the various risks involved in the growth of AI.
Joining them is a new facility being set up by UC Berkeley called the Center for Human-Compatible Artificial Intelligence. The center is designed to explore how human values can be built into AI, and to ensure that the AI we develop benefits humanity.
More human AI
The center, led by Stuart Russell, will explore the interface between humans and robots, and seek to work out how machines can be built to better grasp what humans actually want.
Rather than relying on overt signals, for instance, AI might be programmed to spot subtle cues in the way we behave.
“My objective … is primarily to look at these long-term questions of how you design AI systems so that you are guaranteed to be happy with the outcomes,” Russell says.
At the moment, AI systems largely take what we say to them literally, but much of human communication is more subtle than that. A good example is King Midas, who famously wished that everything he touched would turn to gold, only to realize that the literal fulfillment of that wish was far from positive.
It’s a line of thinking that Oxford’s Nick Bostrom famously illustrated with his paperclip example.
Suppose we have an AI whose only goal is to make as many paper clips as possible. The AI will quickly realize that it would be much better if there were no humans, because humans might decide to switch it off, and if they did there would be fewer paper clips. Human bodies also contain a lot of atoms that could be made into paper clips. The future the AI would be steering towards would be one with a great many paper clips but no humans.
The UC Berkeley team hopes that future AI systems will be able to avoid such dilemmas and act in ways that fit our values. They aim to achieve this through approaches such as inverse reinforcement learning, in which the robot infers what people value by watching how they behave.
“Rather than have robot designers specify the values, which would probably be a disaster,” Russell says, “instead the robots will observe and learn from people. Not just by watching, but also by reading. Almost everything ever written down is about people doing things, and other people having opinions about it. All of that is useful evidence.”
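To make that idea concrete, here is a toy sketch of the feature-matching intuition behind inverse reinforcement learning: given the states a (hypothetical) human expert chose to visit, we recover reward weights that would explain those choices. The environment, features, and demonstrations below are invented purely for illustration, and the sketch deliberately ignores transition dynamics and planning, which real IRL algorithms have to handle.

```python
import numpy as np

# Toy environment: 6 states, each described by 2 hand-made features,
# [distance_to_goal, is_hazard]. Both the features and the values are
# invented for this illustration.
features = np.array([
    [0.0, 0.0],   # goal state
    [1.0, 0.0],
    [2.0, 0.0],
    [1.0, 1.0],   # hazard
    [2.0, 1.0],   # hazard
    [3.0, 0.0],
], dtype=float)

# Hypothetical expert demonstrations: indices of states the human visited.
# The expert clearly prefers states near the goal and avoids hazards.
demos = [0, 1, 0, 2, 1, 0, 5, 2, 1, 0]

# Expert feature expectations: the average feature vector over visited states.
mu_expert = features[demos].mean(axis=0)

# Learn reward weights w so that a softmax distribution over states,
# p(s) proportional to exp(w . phi(s)), reproduces the expert's feature
# expectations. This is the core of maximum-entropy-style IRL, stripped
# of dynamics and planning for brevity.
w = np.zeros(features.shape[1])
learning_rate = 0.1

for _ in range(500):
    logits = features @ w
    p = np.exp(logits - logits.max())
    p /= p.sum()
    mu_model = p @ features                       # model's expected features
    w += learning_rate * (mu_expert - mu_model)   # gradient of the log-likelihood

# Recovered reward per state: higher near the goal, strongly negative on hazards.
rewards = features @ w
for s, r in enumerate(rewards):
    print(f"state {s}: estimated reward {r:+.2f}")
```

The point of the sketch is the direction of inference: the designer never writes down "hazards are bad"; the learner concludes it from the fact that the demonstrations consistently avoid them, which is exactly the shift Russell describes from specifying values to observing them.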
Of course, this isn’t an easy task, as people go about things in idiosyncratic ways, and that variability makes it hard to know the correct response in any particular circumstance.
By thinking about the problem in this way, however, we are also thinking more deeply about what it means to be human and which values we hold dear. That in itself is a valuable outcome, but hopefully centers such as this one, together with Oxford’s Future of Humanity Institute, the Elon Musk-backed OpenAI and the numerous corporate AI ethics groups that are springing up, will ensure that our next steps with AI are positive ones.