So, Where’s My Robot?

Thoughts on Social Machine Learning

Guided Exploration

How autonomous should learning be?

This is a question I’ve been thinking about in my research. Starting from the assumption that machines will need to learn from people, an open question is how much human interaction, and at what level, should be required.

It’s interesting to look at prior work in relation to this question of human interaction. If we think of a spectrum from guidance to exploration, prior work is generally situated at the two extremes:

(1) On the guidance end of the spectrum is a system that is completely dependent on human instruction and guidance: Learning by Imitation (Schaal review), Programming by Example (Lieberman), Programming by Natural Language (Lauria), Learning via Tutelage (Lockerd, Nicolescu).

(2) On the exploration end is a system that mostly learns through self-exploration and takes some advantage of the human partner: Robot Shaping (Saksida), Reinforcement Learning with human rewards (Isbell)(Stern)(Evans), Learning by animal training techniques (Blumberg)(Kaplan).

Both of these extremes have real benefits. On the guidance end, the human has complete control of the learning interaction. However, this requires that the teacher know exactly what the robot needs to do, which is not always the case in a teaching situation. Imagine teaching someone to ride a bicycle: it is easier to give high-level feedback than precise instructions about the movement. Another benefit of an exploratory learner is that it does not require the human’s presence or undivided attention in order for learning to take place.

Clearly the answer is that we need both! A social learner cannot simply occupy a single point on this spectrum; it must have both capabilities. So a question I’m tackling in my research is how to seamlessly incorporate both guidance and exploration, resulting in a system that can learn on its own but can also take full advantage of a human partner who is there to provide guidance. The goal is to slide dynamically along this guidance-exploration spectrum.
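To make the idea of sliding along the spectrum a little more concrete, here is a minimal sketch in Python. It is only an illustration, not the system from my research: the function name `choose_action`, the `trust` and `epsilon` parameters, and the table of estimated action values are all hypothetical. The learner follows a human suggestion when one is offered (with probability `trust`), and otherwise falls back on its own exploration.

```python
import random


def choose_action(values, suggestion=None, trust=1.0, epsilon=0.1, rng=random):
    """Pick one action, blending human guidance with self-exploration.

    values     -- dict mapping each action to the learner's estimated value
    suggestion -- action proposed by a human partner, or None if nobody is guiding
    trust      -- probability of following the suggestion when one is given
                  (1.0 = pure guidance, 0.0 = ignore the human entirely)
    epsilon    -- probability of trying a random action when acting alone
    All of these names are assumptions made for this sketch.
    """
    # Guidance: a human is present and the learner decides to listen.
    if suggestion is not None and suggestion in values and rng.random() < trust:
        return suggestion
    # Exploration: occasionally try something at random.
    if rng.random() < epsilon:
        return rng.choice(sorted(values))
    # Otherwise act on what has been learned so far.
    return max(values, key=values.get)


estimates = {"left": 0.1, "right": 0.9}
guided = choose_action(estimates, suggestion="left", trust=1.0)
alone = choose_action(estimates, suggestion=None, epsilon=0.0)
```

With `trust` at 1.0 the learner sits at the guidance end whenever a human speaks up; with no suggestion it behaves as an ordinary exploratory learner. Varying `trust` over time would be one simple way to move between the two ends.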

November 1st, 2006 Posted by | Machine Learning | no comments
