Many of the advances in AI over the past year or so have come about as a result of Deep Neural Networks (DNNs), whose power lies in their ability to cope with high-dimensional inputs. However, DNNs can be vulnerable to what are known as adversarial examples: inputs that have been deliberately modified to provoke a particular response from the network. Such deliberate misclassification is often the work of hackers seeking to manipulate AI technologies.
These kinds of attacks are especially damaging because they're so hard to detect: the alterations made to the images, videos and other data used to train and guide AI systems are virtually imperceptible to a human observer. Worse, the alterations can be effective even if the hacker doesn't know the exact design of the DNN.
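To make the idea concrete, here is a minimal sketch of one well-known attack of this kind, the fast gradient sign method: it nudges every pixel of an input by a tiny amount in whichever direction most increases the model's loss, so the altered image still looks unchanged to a human even though the classifier's answer can flip. The model, labels and eps value below are illustrative placeholders, not details taken from the IBM paper.

```python
import tensorflow as tf

def fgsm_example(model, x, y_true, eps=0.01):
    """Craft adversarial examples with the fast gradient sign method (FGSM).

    model  : a tf.keras classifier producing class probabilities
    x      : a batch of inputs, e.g. images scaled to [0, 1]
    y_true : one-hot labels for x
    eps    : per-pixel perturbation size (kept small so the change is
             virtually imperceptible to a human viewer)
    """
    x = tf.convert_to_tensor(x)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = tf.keras.losses.categorical_crossentropy(y_true, model(x))
    grad = tape.gradient(loss, x)
    # Step each pixel by +/- eps in the direction that increases the loss.
    x_adv = x + eps * tf.sign(grad)
    return tf.clip_by_value(x_adv, 0.0, 1.0)
```

Attacks of this family also transfer surprisingly well between models, which is why a perturbation can work even against a network the hacker has never seen.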
Researchers from IBM Research Ireland have recently published a paper outlining strategies for stopping such attacks. The paper has been accompanied by the release of the Adversarial Robustness Toolbox on GitHub.
Keeping AI safe
The toolbox aims to support both researchers and developers in creating new defense techniques, and in deploying those techniques in real-world scenarios.
“One of the biggest challenges with some of the existing models to defend against adversarial AI is they are very platform specific. The IBM team designed their Adversarial Robustness Toolbox to be platform agnostic. Whether you’re coding/developing in Keras or TensorFlow, you can apply the same library to build defenses in,” says IBM.
The library has been written in Python and takes a three-pronged approach to defending DNNs:
- Measuring robustness – The first step is to test the robustness of the DNN. This often involves simply measuring how much accuracy drops when inputs have been deliberately altered (see the sketch following this list).
- Hardening the defenses – The second step is to harden the DNN so that it resists adversarial inputs. This could involve preprocessing the inputs or augmenting the training data with adversarial examples.
- Real-time detection – The final step is to construct methods for flagging adversarial inputs in real time.
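As a rough sketch of what the first two of these steps might look like in practice (assuming a recent release of the toolbox; class names, module paths and required arguments have changed between versions, and the model, dataset and parameter values below are placeholders rather than details from the paper), one can wrap a model, measure how far its accuracy falls on adversarially altered test inputs, and then retrain it on a mix of clean and adversarial examples:

```python
import numpy as np
import tensorflow as tf
from art.attacks.evasion import FastGradientMethod
from art.defences.trainer import AdversarialTrainer
from art.estimators.classification import TensorFlowV2Classifier

# Placeholder data and model: a small MNIST-style classifier.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., np.newaxis].astype("float32") / 255.0
x_test = x_test[..., np.newaxis].astype("float32") / 255.0
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Wrap the model so the toolbox can attack, evaluate and retrain it.
classifier = TensorFlowV2Classifier(
    model=model,
    nb_classes=10,
    input_shape=(28, 28, 1),
    loss_object=tf.keras.losses.CategoricalCrossentropy(),
    optimizer=tf.keras.optimizers.Adam(),
    clip_values=(0.0, 1.0),
)
classifier.fit(x_train, y_train, batch_size=128, nb_epochs=3)

# Step 1 - measure robustness: generate adversarial test inputs and compare
# accuracy on clean versus altered data.
attack = FastGradientMethod(estimator=classifier, eps=0.1)
x_test_adv = attack.generate(x=x_test)
clean_acc = np.mean(np.argmax(classifier.predict(x_test), axis=1) == np.argmax(y_test, axis=1))
adv_acc = np.mean(np.argmax(classifier.predict(x_test_adv), axis=1) == np.argmax(y_test, axis=1))
print(f"accuracy on clean inputs: {clean_acc:.3f}, on adversarial inputs: {adv_acc:.3f}")

# Step 2 - harden the model: adversarial training mixes adversarial examples
# into the training data (here, half of each batch).
trainer = AdversarialTrainer(classifier, attacks=attack, ratio=0.5)
trainer.fit(x_train, y_train, batch_size=128, nb_epochs=5)
```

The toolbox also ships detector and input-preprocessing classes that cover the third step, though they are not shown here.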
The toolbox comes with extensive documentation and support to help people get going as quickly as possible, and the paper provides a range of examples of the various methods in action.
Ensuring AI systems are secure and reliable is likely to become increasingly important as they take on a growing range of vital tasks. The Adversarial Robustness Toolbox is therefore a valuable project, not least because it supports an ecosystem of contributors from across industry and academia. It will be fascinating to track as more people join the community and get involved in improving the robustness of DNN systems.