Prudence Pitch

15:58 15/01/2020

DeepMind recently proposed a technique that optimizes for discrete and continuous actions simultaneously, treating hybrid problems in their native form.

(Tech) The team’s model-free algorithm leverages reinforcement learning, a training technique that rewards autonomous agents for accomplishing goals. It solves control problems with continuous action spaces, control problems with discrete action spaces, and hybrid optimal control problems with controlled and autonomous switching.
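To make the idea of a hybrid action concrete, here is a minimal sketch of a policy head that emits a discrete choice and a continuous vector in the same step. It is not DeepMind's implementation. The factorization into an independent categorical and Gaussian distribution is an assumption made for illustration, and every name and number in it (`sample_hybrid_action`, `hybrid_log_prob`, the logits, means, and standard deviations) is hypothetical.

```python
import math
import random


def softmax(logits):
    """Convert unnormalized scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hybrid_action(logits, means, stds, rng):
    """Draw one discrete choice and one continuous vector in a single step."""
    probs = softmax(logits)
    discrete = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    continuous = [rng.gauss(m, s) for m, s in zip(means, stds)]
    return discrete, continuous


def hybrid_log_prob(action, logits, means, stds):
    """Log-probability of a hybrid action under the assumed factorized policy."""
    discrete, continuous = action
    # Categorical part.
    log_p = math.log(softmax(logits)[discrete])
    # Independent Gaussian part, one term per continuous dimension.
    for x, m, s in zip(continuous, means, stds):
        log_p += -0.5 * ((x - m) / s) ** 2 - math.log(s) - 0.5 * math.log(2 * math.pi)
    return log_p


rng = random.Random(0)
# Hypothetical policy outputs: 3 discrete modes, 2 continuous dimensions.
logits = [0.2, 1.5, -0.3]
means, stds = [0.0, 0.5], [1.0, 0.2]
action = sample_hybrid_action(logits, means, stds, rng)
print(action, hybrid_log_prob(action, logits, means, stds))
```

Because the log-probability is a plain sum of the discrete and continuous terms, both parts can be optimized together without first discretizing the continuous values or relaxing the discrete ones.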



Furthermore, it allows for novel solutions to existing robotics problems by augmenting the action space with meta actions or other such schemes, enabling strategies that can address challenges like mechanical wear and tear during AI training.

The researchers validated their approach on a range of simulated and real-world benchmarks, including a Rethink Robotics Sawyer robot arm. They say that, given the task of reaching, grasping, and lifting a cube where the reward was the sum of the three sub-tasks, their algorithm outperformed existing approaches, which were unable to solve the task.

In a separate experiment, the team set their algorithm loose on a Parameterized Action Space Markov Decision Process (PAMDP), a hierarchical problem in which agents first select a discrete action and then a continuous set of parameters for that action. In this case, the agent was tasked with manipulating the robot arm so that it inserted a peg into a hole, with the reward computed from the hole position and kinematics.
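The two-stage structure of a PAMDP action can be sketched in a few lines. The action names, parameter bounds, and helper functions below are hypothetical and are not taken from the team's experiment. Uniform random sampling stands in for a learned policy purely to show the shape of the action space.

```python
import random
from dataclasses import dataclass

# Hypothetical action set: each discrete action has its own parameter bounds.
PARAM_BOUNDS = {
    "move": [(-1.0, 1.0), (-1.0, 1.0)],  # dx, dy
    "rotate": [(-3.14, 3.14)],           # angle in radians
    "insert": [(0.0, 1.0)],              # normalized force
}


@dataclass(frozen=True)
class ParameterizedAction:
    kind: str
    params: tuple


def sample_parameterized_action(rng):
    """Sample a discrete action first, then its continuous parameters."""
    # Stage 1: pick the discrete action.
    kind = rng.choice(sorted(PARAM_BOUNDS))
    # Stage 2: pick continuous parameters valid for that action only.
    params = tuple(rng.uniform(lo, hi) for lo, hi in PARAM_BOUNDS[kind])
    return ParameterizedAction(kind, params)


def is_valid(action):
    """Check that the parameters match the chosen action's count and bounds."""
    bounds = PARAM_BOUNDS.get(action.kind)
    if bounds is None or len(bounds) != len(action.params):
        return False
    return all(lo <= p <= hi for p, (lo, hi) in zip(action.params, bounds))


rng = random.Random(0)
for _ in range(3):
    print(sample_parameterized_action(rng))
```

Note that the number and range of continuous parameters depend on which discrete action was chosen, which is what makes the problem hierarchical rather than a flat combination of the two spaces.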

