Every so often there’s a list categorizing robots, the top ten X robots. Like this fun one that IEEE Spectrum recently did on humanoid robots, highlighting ASIMO and four more recently developed bots.
Alternatively, Heather Knight has been getting some press for the opposite kind of list. Rather than a top-10 she’s doing a robot census at CMU. Frank Tobe of the Robot Report also left some interesting stats in the comments, pointing to current robot usage numbers and projected future usage.
So far Knight has tallied ~550 robots on campus, and is getting the word out to count robots more broadly. So log in and count yer bots!
In the paper they investigate using a human reward signal in combination with environmental rewards for a reinforcement learning agent. In particular they analyze eight different ways to combine these two reward signals for performance gains. This makes an important contribution in formalizing the impact of social guidance on a reinforcement learning process.
As learning agents move from research labs to the real world, it is increasingly important that human users, including those without programming skills, be able to teach agents desired behaviors. Recently, the TAMER framework was introduced for designing agents that can be interactively shaped by human trainers who give only positive and negative feedback signals. Past work on TAMER showed that shaping can greatly reduce the sample complexity required to learn a good policy, can enable lay users to teach agents the behaviors they desire, and can allow agents to learn within a Markov Decision Process (MDP) in the absence of a coded reward function. However, TAMER does not allow this human training to be combined with autonomous learning based on such a coded reward function. This paper leverages the fast learning exhibited within the TAMER framework to hasten a reinforcement learning (RL) algorithm’s climb up the learning curve, effectively demonstrating that human reinforcement and MDP reward can be used in conjunction with one another by an autonomous agent. We tested eight plausible TAMER+RL methods for combining a previously learned human reinforcement function, H, with MDP reward in a reinforcement learning algorithm. This paper identifies which of these methods are most effective and analyzes their strengths and weaknesses. Results from these TAMER+RL algorithms indicate better final performance and better cumulative performance than either a TAMER agent or an RL agent alone.
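To make the idea of combining the two signals concrete, here is a minimal sketch of one plausible combination: adding a weighted human reinforcement prediction to the MDP reward before a standard Q-learning update. The abstract does not list the eight methods, so this is an illustration rather than the paper's algorithm; the names `combined_reward`, `q_update`, the toy `H`, and the weight `beta` are all assumptions made for the example.

```python
def combined_reward(mdp_reward, human_reward, beta):
    """Add a weighted human reinforcement prediction to the MDP reward."""
    return mdp_reward + beta * human_reward


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning update; q is a dict keyed by (state, action)."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def H(state, action):
    """Hypothetical stand-in for a previously learned human reinforcement function."""
    return 1.0 if action == "right" else -1.0


actions = ["left", "right"]
q = {}

# The environment gives -1 per step; the (toy) human model approves of "right".
r = combined_reward(mdp_reward=-1.0, human_reward=H(0, "right"), beta=0.5)
q_update(q, state=0, action="right", reward=r, next_state=1, actions=actions)
```

With `beta` set to 0 this reduces to plain Q-learning on the MDP reward, which is a handy sanity check when experimenting with how much weight to give the human signal.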
Charlie Kemp and his students took Georgia Tech’s PR2 down to the CNN studios last week for a live demo! They showed off some RFID-assisted manipulation: the robot autonomously drove up and delivered a pill bottle to the newscaster. Their demo set up some comments from Willow Garage about the future of personal robotics, where robots are going to take over our repetitive tasks to free up our time for creative human endeavors. When asked when the PR2 or other such robots are going to be affordable for everyday folks, Keenan Wyrobek says it’s not 20 years out, but still a couple of years away.
It’s the DARPA Grand Challenge on steroids, a Prius driving in traffic!
This is cool in and of itself for robotics, but even better is a company with huge resources deciding to put a lot of effort into robotics. They have been quite hush-hush about what they might be working on, but have quietly been inviting lots of top-notch robotics researchers to take a 1, 2, … n-year sabbatical from their academic jobs and come work on robotics at Google. So I expect this is the first of many cool things to come out of Google-Robotics.
The electrical signals inside Lyric’s chips represent probabilities, instead of 1s and 0s. While the transistors of conventional chips are arranged into components called digital NAND gates, which can be used to implement all possible digital logic functions, those in a probability processor make building blocks known as Bayesian NAND gates. … Whereas a conventional NAND gate outputs a “1” if neither of its inputs match, the output of a Bayesian NAND gate represents the odds that the two input probabilities match. This makes it possible to perform calculations that use probabilities as their input and output.
Sounds like their initial impact will be in the flash memory market, making error-checking faster and more efficient. But I can definitely see how this kind of hardware could have a major impact in Machine Learning and Robotics. Most of statistical machine learning has its roots in the kind of math that this processor is designed for. This could make reasoning about larger (more real-world) problems increasingly feasible.
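For a rough feel of what "probabilities in, probabilities out" means for a gate, here is a small sketch in software. It is not Lyric's circuit, and it does not reproduce the "inputs match" reading in the quote above. It assumes the two inputs are independent binary variables and computes the probability that an ordinary NAND of them outputs 1, which works out to 1 - p_a * p_b. The function names are made up for the example.

```python
from itertools import product


def nand(a: int, b: int) -> int:
    """Ordinary digital NAND: 0 only when both inputs are 1."""
    return 0 if (a == 1 and b == 1) else 1


def prob_nand(p_a: float, p_b: float) -> float:
    """P(NAND output = 1) for independent inputs with P(a=1)=p_a, P(b=1)=p_b."""
    return 1.0 - p_a * p_b


def prob_nand_by_enumeration(p_a: float, p_b: float) -> float:
    """Same quantity, computed by summing over all four input combinations."""
    total = 0.0
    for a, b in product((0, 1), repeat=2):
        weight = (p_a if a else 1.0 - p_a) * (p_b if b else 1.0 - p_b)
        total += weight * nand(a, b)
    return total


# Two coin-flip inputs: the output is 1 in three of the four equally likely cases.
print(prob_nand(0.5, 0.5))
print(prob_nand_by_enumeration(0.5, 0.5))
```

When both inputs are certain (probability 0 or 1), `prob_nand` gives the same answer as the digital gate, so the usual logic is a special case of the probabilistic one.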
Last week I attended the IEEE International Conference on Development and Learning, held at the University of Michigan. This is an interesting conference that I’ve been going to for the past few years. Its goal is to very explicitly mingle researchers working on Machine Learning and Robotics with researchers working on understanding human learning and development.
My lab had two presentations:
“Optimality of Human Teachers for Robot Learners” (M. Cakmak, A. L. Thomaz): Here we take the notion of teaching in Machine Learning Theory, and analyze the extent to which people teaching our robot are adhering to theoretically optimal strategies. Turns out they teach about positive examples optimally, but not negative. And we can use active learning in the negative space to make up for people’s non-optimality.
“Batch vs. Interactive Learning by Demonstration” (P. Zang, R. Tian, A.L. Thomaz, C. Isbell): We show the computational benefits of collecting LbD examples online rather than in a batch fashion. In an interactive setting people automatically improve their teaching strategy when it is sub-optimal.
And here are some cool things I learned at ICDL.
Keynote speaker Felix Warneken gave a really interesting talk about the origins of cooperative behavior in humans. Are people helpful and good at teamwork because we learn it, or do we have some predisposition? His work takes you through a series of great experiments with young children, showing that helping and cooperation are things we are at least partly hardwired to do.
Chen Yu, from Indiana, does some really nice research looking into how babies look around a scene, and how this differs from the way adults or even older children do it. They do this by having them wear headbands with cameras, which lets them do some nice correlations across multiple video streams and audio streams to analyze the data. For younger children, visual selection is very tied to manual selection. And the success of word learning is determined by the visual dominance of the named target.
Vollmer et al., from Bielefeld, did an analysis of their motionese video corpus, and showed the different ways that a child learner gives feedback to an adult teacher. In particular, this feedback changes from being dominated by gaze behaviors to more complex anticipatory gestures between the ages of 8 months and 30 months.
Several papers touched on the topic of intrinsic motivation for robots, as inspired by babies and other natural learners. Over the past few years there has been growing interest in this idea. People have gone from focusing on curiosity and novelty to competence and mastery. There were papers on this topic from Barto’s lab, and from Oudeyer’s. The IM CLeVeR project was also presented; this is a large EU-funded collaboration that aims to address intrinsic motivation for robots.
Exciting news recently in the realm of robotics and public policy. Robotics is recommended as a Science and Technology Priority for the 2012 budget. The recent OSTP/OMB memo lists six challenge areas, the first of which is “Promoting sustainable economic growth and job creation,” and one of the three recommendations in this section is:
“Support R&D in advanced manufacturing to strengthen U.S. robotics, cyber-physical systems, and flexible manufacturing.”
Congratulations to Henrik Christensen and all of those in the robotics community that have worked hard over the past couple of years to educate the science and technology policy makers about the ways in which robotics research and development can have a positive impact on the U.S. economy and society.
The Nineteenth Edition of the Robotics Program at AAAI is happening right now in Atlanta, GA, July 12-15th, at the Westin Peachtree Plaza. This year’s event is co-chaired by Monica Anderson and Andrea Thomaz, and is sponsored by the NSF, Microsoft Research, and iRobot.
The AAAI Robotics Program has a long tradition of demonstrating innovative research at the intersection of robotics and artificial intelligence. This year, the AAAI-10 Robotics Program will feature a workshop on “Enabling Intelligence through Middleware” (July 12th) and a robotics exhibition (July 13-15th) with the following demonstrations of intelligent robotics on display Tues-Thursday, in the Vinings rooms on the 6th floor (near registration).
Robotic Chess: Small Scale Manipulation Challenge: The AAAI-2010 Small-Scale Manipulation Challenge is designed to highlight advances in embodied intelligence using smaller-than-human-size robots. Robotic chess requires the integration of sensing, planning and actuation and provides an opportunity for performance on a common, well-defined task. The chess challenge will run on Tuesday July 13th, with matches at 10am, 1pm, and 4pm. Additionally, the chess robots will be on display throughout the exhibit.
Learning by Demonstration Challenge: This will be the second annual exhibit and challenge on robot Learning by Demonstration (LbD). The purpose of this event is to bring together research and commercial groups to demonstrate complete platforms performing LbD tasks. Our long-term aim is to define increasingly challenging experiments for future LbD events and greater scientific understanding of the area. This year 5 teams will bring robots that will learn a sorting task from a human teacher. The LbD challenge will run on Wednesday July 14th at 4pm. And the LbD robots will be on display throughout the exhibit.
Robotics Education Track: This venue offers an accessible and flexible opportunity for undergraduate, early graduate, or pre-college student teams to design, implement, and demonstrate an autonomous robotic system. The tasks involved span physically-embodied AI: exploration, interaction, and learning within an unknown environment. In the long run, we hope to motivate hands-on AI robotics investigation both for its own sake and in service to other academic disciplines and educational goals. This year we have 8 university teams contributing to this part of the exhibit.