How To Protect The IP Of AI

As AI has taken on ever greater importance among the priorities of organizations around the world, it is understandable that efforts are underway to protect the intellectual property of algorithms with strategic importance.

A recent paper from IBM Research highlights one strategy being developed to provide this protection.  The researchers' approach takes inspiration from the digital watermarking that helps to protect video, audio and photos.

Watermarking is typically done in two stages.  The first is an embedding stage, where a mark, such as the word “COPYRIGHT”, is placed over the photo so that people can detect whether it has been used illegally.  The second is a detection stage, where owners can extract this watermark and use it as legal evidence of ownership.
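
For photos, the embedding stage can be as simple as drawing the mark over the image.  The sketch below uses the Pillow imaging library and is purely illustrative; the file paths, placement and transparency are assumptions, not a prescribed method.

```python
# Minimal sketch of a visible image watermark; paths and styling are illustrative.
from PIL import Image, ImageDraw

def embed_watermark(path_in, path_out, text="COPYRIGHT"):
    """Embedding stage: draw a semi-transparent mark over the photo."""
    img = Image.open(path_in).convert("RGBA")
    overlay = Image.new("RGBA", img.size, (0, 0, 0, 0))
    draw = ImageDraw.Draw(overlay)
    draw.text((img.width // 4, img.height // 2), text, fill=(255, 255, 255, 128))
    Image.alpha_composite(img, overlay).convert("RGB").save(path_out)

# Detection stage (not shown): the owner later extracts or locates this mark
# in a suspect copy and uses the match as evidence of ownership.
```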

Embedding watermarks

The IBM team believe a similar approach can be used with deep neural network (DNN) systems.  They suggest that embedding watermarks into such systems can help to verify their ownership and therefore deter theft.  Doing so, however, requires fundamentally different methods from those used to embed watermarks into other digital assets.

The paper describes an approach for doing just that, whilst also outlining a remote verification mechanism that uses API calls to determine the ownership of an AI system.  The team developed three distinct methods for generating different kinds of watermarks (a rough code sketch of all three follows the list):

  1. embedding meaningful content together with the original training data as watermarks into the protected DNNs,
  2. embedding irrelevant data samples as watermarks into the protected DNNs, and
  3. embedding noise as watermarks into the protected DNNs.
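
The paper does not ship reference code, so the following is only a rough sketch of the three generation strategies, assuming image data held as NumPy arrays in [0, 1]; the array shapes, helper names and parameters are illustrative, not IBM's implementation.

```python
# Rough sketch of the three watermark (trigger set) generation strategies.
# Assumes images as NumPy arrays of shape (N, H, W, C) with values in [0, 1].
import numpy as np

def content_watermarks(train_images, target_label, mark_value=1.0):
    """1. Meaningful content: stamp a small visible pattern (a toy 'logo')
    onto copies of real training images."""
    marked = train_images.copy()
    marked[:, :8, :8, :] = mark_value          # toy logo in the top-left corner
    return marked, np.full(len(marked), target_label)

def unrelated_watermarks(out_of_domain_images, target_label):
    """2. Irrelevant samples: images from a completely different domain,
    all assigned to a chosen target label."""
    return out_of_domain_images, np.full(len(out_of_domain_images), target_label)

def noise_watermarks(train_images, target_label, scale=0.3, seed=0):
    """3. Noise: add a fixed pseudo-random noise pattern to training images."""
    rng = np.random.default_rng(seed)
    pattern = rng.normal(0.0, scale, size=train_images.shape[1:])
    marked = np.clip(train_images + pattern, 0.0, 1.0)
    return marked, np.full(len(marked), target_label)
```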

These were tested on two large, public datasets, with the watermarks designed to provoke an unexpected but controlled response from any model into which they had been embedded.
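
Embedding then amounts to training the network on the original data mixed with the trigger set, so that the triggers alone elicit that controlled response.  A minimal sketch, assuming the content_watermarks() helper above, a hypothetical build_model() that returns a compiled classifier, and existing x_train/y_train arrays; this is not the paper's exact training recipe.

```python
# Minimal embedding sketch: train on the original data plus the trigger set so
# the finished model maps the triggers to a chosen target label.
import numpy as np

wm_x, wm_y = content_watermarks(x_train[:500], target_label=0)
x_mix = np.concatenate([x_train, wm_x])
y_mix = np.concatenate([y_train, wm_y])

model = build_model()   # hypothetical constructor returning a compiled model
model.fit(x_mix, y_mix, epochs=10, batch_size=64)
```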

The authors admit that watermarking is a method that has been tried before, but previous efforts have been hampered by needing access to the model's parameters.  That doesn't work in the real world, as stolen models will usually be deployed remotely, and IP thieves don't tend to publicize the parameters of the models they've stolen.
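
This is where the remote, black-box verification comes in: the owner only needs to query the suspect service's prediction API with the trigger samples and check how often the controlled response comes back.  A rough sketch, assuming a hypothetical REST endpoint that accepts an image and returns a predicted label:

```python
# Black-box ownership check against a remote prediction API.
# The endpoint URL and JSON format are hypothetical; only ordinary API calls
# are needed, never access to the model's parameters.
import requests

def verify_ownership(endpoint, trigger_images, target_label, threshold=0.9):
    hits = 0
    for img in trigger_images:
        resp = requests.post(endpoint, json={"image": img.tolist()})
        if resp.json().get("label") == target_label:
            hits += 1
    match_rate = hits / len(trigger_images)
    # A match rate far above chance suggests the watermark is present.
    return match_rate >= threshold, match_rate
```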

The system isn’t perfect, as the authors themselves admit.  For instance, if a stolen model is only deployed internally it is largely impossible to detect, so policing it relies on the stolen system being deployed online.  Similarly, the watermarking method cannot prevent models from being stolen via their prediction APIs, but the researchers say such approaches have enough limitations of their own that this isn’t a sizeable loophole.

They’re currently looking to deploy the system internally at IBM, before scaling it up and deploying it with clients.
