Every so often someone puts out a list ranking robots: the top ten X robots. IEEE Spectrum recently did a fun one on humanoid robots, highlighting ASIMO and four more recently developed bots.
Alternatively, Heather Knight has been getting some press for the opposite kind of list. Rather than a top 10, she’s doing a robot census at CMU. Frank Tobe of the Robot Report also left some interesting stats in the comments, pointing to current robot usage numbers and projected future usage.
So far Knight has tallied ~550 robots on campus, and she is getting the word out to count robots more broadly. So log in and count yer bots!
In the paper, they investigate using a human reward signal in combination with environmental rewards for a reinforcement learning agent. In particular, they analyze eight different ways to combine these two reward signals for performance gains. The work makes an important contribution by formalizing the impact of social guidance on a reinforcement learning process.
As learning agents move from research labs to the real world, it is increasingly important that human users, including those without programming skills, be able to teach agents desired behaviors. Recently, the TAMER framework was introduced for designing agents that can be interactively shaped by human trainers who give only positive and negative feedback signals. Past work on TAMER showed that shaping can greatly reduce the sample complexity required to learn a good policy, can enable lay users to teach agents the behaviors they desire, and can allow agents to learn within a Markov Decision Process (MDP) in the absence of a coded reward function. However, TAMER does not allow this human training to be combined with autonomous learning based on such a coded reward function. This paper leverages the fast learning exhibited within the TAMER framework to hasten a reinforcement learning (RL) algorithm’s climb up the learning curve, effectively demonstrating that human reinforcement and MDP reward can be used in conjunction with one another by an autonomous agent. We tested eight plausible TAMER+RL methods for combining a previously learned human reinforcement function, H, with MDP reward in a reinforcement learning algorithm. This paper identifies which of these methods are most effective and analyzes their strengths and weaknesses. Results from these TAMER+RL algorithms indicate better final performance and better cumulative performance than either a TAMER agent or an RL agent alone.
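To picture what combining the two signals might look like, here is a minimal sketch that adds a scaled copy of the human reinforcement function H to the MDP reward before a standard tabular Q-learning update. Everything specific in it is an assumption made for illustration: the additive form, the weight `beta`, the toy `human_reinforcement` function, and the two-action setup are hypothetical and are not claimed to be one of the paper's eight methods.

```python
from collections import defaultdict


def shaped_reward(mdp_reward, human_value, beta=1.0):
    """Combine MDP reward R with predicted human reinforcement H.

    Assumed additive form: R' = R + beta * H.
    """
    return mdp_reward + beta * human_value


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning step using the (possibly shaped) reward."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def human_reinforcement(state, action):
    """Hypothetical stand-in for a previously learned H(s, a)."""
    return 1.0 if action == "right" else -1.0


# Toy usage: the environment gives no reward, but H favors "right".
actions = ["left", "right"]
q = defaultdict(float)
r = shaped_reward(mdp_reward=0.0,
                  human_value=human_reinforcement(0, "right"),
                  beta=0.5)
q_update(q, state=0, action="right", reward=r,
         next_state=1, actions=actions)
```

With `beta` set to zero this reduces to plain Q-learning on the MDP reward, so the weight is one knob for how much the agent leans on the human signal.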
Charlie Kemp and his students took Georgia Tech’s PR2 down to the CNN studios last week for a live demo! They showed off some RFID-assisted manipulation: the robot autonomously drove up and delivered a pill bottle to the newscaster. Their demo set up some comments from Willow Garage about the future of personal robotics, where robots are going to take over our repetitive tasks to free up our time for creative human endeavors. When asked when the PR2 or other such robots are going to be affordable for everyday folks, Keenan Wyrobek says it’s not 20 years out, but still a couple of years away.
It’s the DARPA Grand Challenge on steroids, a Prius driving in traffic!
This is cool in and of itself for robotics, but even better is a company with huge resources deciding to put a lot of effort into robotics. They have been quite hush-hush about what they might be working on, but have quietly been inviting lots of top-notch robotics researchers to take a 1..2…n year sabbatical from their academic jobs and come work on robotics at Google. So I expect this is the first of many cool things to come out of Google-Robotics.