OpenAI Gym launches to support reinforcement learning

Elon Musk has been an outspoken skeptic about the potential for AI to have a positive influence on our lives.  He also regards the development of AI as inevitable, so last December he helped launch OpenAI in a bid to ensure that whatever developments do occur are positive ones.

The non-profit project aims to ensure that any research done in AI is open and accessible for free rather than locked away in corporate research vaults.

The OpenAI Gym

The project’s first official output was announced recently: the OpenAI Gym, which exists to help researchers and developers build learning algorithms.

It’s a toolkit designed to assist in the development of reinforcement learning algorithms.  These are algorithms that learn by acting, receiving feedback on how well each action worked out (usually in the form of a numerical reward), and adjusting what they do next.
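To make that feedback loop concrete, here is a minimal sketch in Python.  It does not use the Gym package itself: the LineWorld class and the run_random_episode function are hypothetical names invented for illustration, loosely modelled on the reset-and-step style of interaction that such toolkits expose.  The agent here simply acts at random, so it shows the loop of action and reward rather than any actual learning.

```python
import random


class LineWorld:
    """Hypothetical toy environment: walk along a line from cell 0 to the goal cell."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        """Start a new episode and return the first observation."""
        self.position = 0
        return self.position

    def step(self, action):
        """Apply an action (0 = left, 1 = right); return (observation, reward, done)."""
        move = 1 if action == 1 else -1
        # The ends of the line act as walls, so the position is clamped.
        self.position = min(max(self.position + move, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0  # feedback arrives only at the goal
        return self.position, reward, done


def run_random_episode(env, rng, max_steps=1000):
    """Run one episode with a random policy; return (total_reward, steps_taken)."""
    env.reset()
    total_reward = 0.0
    for steps in range(1, max_steps + 1):
        action = rng.choice([0, 1])
        _, reward, done = env.step(action)
        total_reward += reward
        if done:
            return total_reward, steps
    return total_reward, max_steps


if __name__ == "__main__":
    total, steps = run_random_episode(LineWorld(), random.Random(0))
    print(f"reward={total}, steps={steps}")
```

A learning algorithm would replace the random choice of action with one informed by the rewards seen so far.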

The platform provides users with sample code and case studies to help them take their first steps into reinforcement learning, with examples including teaching an algorithm to play a basic video game.  It isn’t quite at AlphaGo levels, but reinforcement learning was a part of that successful foray into gaming, so it does at least provide inspiration for future coders.

It’s an interesting break from the more familiar supervised form of machine learning, which relies on huge amounts of labelled data to train the algorithm.  Reinforcement learning, by contrast, is potentially more scalable, as it allows the machine to generate its own experience and learn as it goes via trial and error.
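Trial and error can be illustrated with one of the simplest reinforcement learning setups, a multi-armed bandit.  The sketch below is my own illustration rather than anything taken from the Gym or from Fanuc: the function name, the payout probabilities and the epsilon value are all assumptions.  The agent starts with no data at all, mostly pulls whichever arm currently looks best, and occasionally tries another at random to check whether its estimates are wrong.

```python
import random


def epsilon_greedy_bandit(payout_probs, steps=5000, epsilon=0.1, seed=0):
    """Learn arm values by trial and error; return (estimates, pull_counts).

    payout_probs holds each arm's hidden chance of paying a reward of 1.
    The agent never sees these numbers, only the rewards it receives.
    """
    rng = random.Random(seed)
    n_arms = len(payout_probs)
    estimates = [0.0] * n_arms  # current guess of each arm's average reward
    counts = [0] * n_arms       # how many times each arm has been pulled

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try an arm at random
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit best guess
        reward = 1.0 if rng.random() < payout_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running average reward for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8])
    print("estimated values:", [round(e, 2) for e in estimates])
    print("pull counts:", counts)
```

The training data here is produced by the agent’s own actions, which is the contrast with learning from a fixed, pre-collected dataset.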

Reinforcement learning

Reinforcement learning has also gained traction in industry, for example at Japanese company Fanuc.  Its industrial robots are capable of figuring out the best way to perform a particular task in a matter of hours.

This marks a significant improvement on current robots, which need extensive programming in order to perform the precise tasks required of them.  The Fanuc approach can potentially save huge amounts of time in this ‘teaching’ phase.

The robots use deep reinforcement learning to rapidly pick up new tricks by brute force of continual trial and error.  After just eight hours of training they’re capable of around 90% accuracy.

This becomes even more useful because the robots are capable of instantly sharing their knowledge with their ‘peers’.  It’s an approach I wrote about last year, when Rethink Robotics used an array of cameras and sensors to look at objects from a range of angles, and then shared that learning and experience with other robots.

Beneficial robotics

Of course, the learning approach taken is no guarantee that the output will be used for the benefit of mankind, nor does it provide a buffer against nefarious uses of AI.

After all, there’s nothing really to stop a Google researcher from building on the work that’s freely shared on OpenAI.  History is littered with such examples of ‘recombination’, so there’s certainly no guarantee that the project will be able to achieve its aims.  Nevertheless, it remains a venture worth keeping an eye on.

