Curiosity and Learning
The folks over at robots.net posted recently about the Intelligent Adaptive Curiosity work of Oudeyer and Kaplan at Sony’s Computer Science Lab in Paris. As this is one of my favorite topics these days, I thought I’d do a follow-up.
In most machine learning examples, learning is an explicit activity: the system is designed to learn a particular thing at a particular time. People, on the other hand, have a motivation or drive to learn new things and achieve new goals. They are curious about new environments and experiences, and are able to judge their level of mastery in an environment. Learning is not a separate activity, but is part of all activity.
Internal motivation drives the exploration process, causes the learner to recognize learning opportunities and take advantage of them, and keeps the right amount of focus on a particular problem. Essentially, these internal motivations help a child, an adult, or an animal learn the right thing at the right time.
Machine learning researchers would *love* to have a machine that learns flexibly, proactively explores an environment, and understands what kinds of things can be achieved in it. This is why a few researchers have recently focused on how to incorporate internal motivation into machine learning algorithms, to try to achieve some of the efficient learning by exploration that is seen in humans and animals. The following are two different approaches in Motivated Reinforcement Learning:
Intelligent Adaptive Curiosity is an approach that uses a Progress Drive, where learning progress is defined as the error in the prediction model, P(S_{t+1} | S_t, a). In essence, the agent is ‘motivated’ to learn the world completely, as the reward signal is defined by the agent’s world knowledge.
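To make the idea of a prediction-based reward concrete, here is a minimal sketch that compares two candidate curiosity signals: the raw prediction error, and the drop in prediction error over time. Everything in it is an assumption made for illustration: the running-average predictor, the two toy outcome streams, and the names `prediction_errors`, `error_reward`, and `progress_reward` are hypothetical and are not taken from Oudeyer and Kaplan’s implementation.

```python
import random


def prediction_errors(outcomes, learning_rate=0.05):
    """Predict each outcome with a running estimate; return the absolute error per step."""
    estimate = 0.0
    errors = []
    for outcome in outcomes:
        errors.append(abs(outcome - estimate))
        # Toy predictor (an assumption): nudge the estimate toward what was observed.
        estimate += learning_rate * (outcome - estimate)
    return errors


def error_reward(errors, window=50):
    """Curiosity signal 1: mean prediction error over the most recent window."""
    recent = errors[-window:]
    return sum(recent) / len(recent)


def progress_reward(errors, window=50):
    """Curiosity signal 2: how much the mean error dropped from the previous window to the latest."""
    older = errors[-2 * window:-window]
    recent = errors[-window:]
    return sum(older) / len(older) - sum(recent) / len(recent)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
steps = 100
learnable = [1.0] * steps                                 # a situation the predictor can master
noisy = [rng.uniform(-1.0, 1.0) for _ in range(steps)]    # a situation with nothing to learn

learnable_errors = prediction_errors(learnable)
noisy_errors = prediction_errors(noisy)

print("error reward:    learnable =", round(error_reward(learnable_errors), 3),
      " noisy =", round(error_reward(noisy_errors), 3))
print("progress reward: learnable =", round(progress_reward(learnable_errors), 3),
      " noisy =", round(progress_reward(noisy_errors), 3))
```

In this toy setup the error-based signal stays high on the noisy stream no matter how long the agent watches it, while the progress-based signal is large only where the predictor is actually improving, and it fades once the learnable stream has been mastered.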
Intrinsically Motivated Reinforcement Learning uses intrinsic rewards in combination with extrinsic environmental rewards. In this case, intrinsic reward is proportional to the novelty of a state transition: 1 - P(S_{t+1} | S_t). New ‘skills’, or options, are learned via Q-learning, where the reward is the combination of the intrinsic reward and any extrinsic reward from the environment. Thus, a novel state change initially increases the reward received after that state change, and this bonus diminishes over time until the reward is only the extrinsic reward from the environment.
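The novelty bonus described above can be sketched with a small tabular learner. This is a simplified illustration under stated assumptions, not the published algorithm: it uses flat one-step Q-learning rather than options, it estimates P(S_{t+1} | S_t) with a simple exponential average, and the class name `CuriousQLearner`, its parameters, and the lamp example are all hypothetical.

```python
from collections import defaultdict


class CuriousQLearner:
    """Tabular Q-learning whose reward is extrinsic reward plus a novelty bonus."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, model_rate=0.5):
        self.actions = list(actions)
        self.alpha = alpha            # Q-learning step size
        self.gamma = gamma            # discount factor
        self.model_rate = model_rate  # step size for the transition estimate (an assumption)
        self.q = defaultdict(float)   # Q[(state, action)]
        self.p = defaultdict(lambda: defaultdict(float))  # p[state][next_state] ~ P(next | state)

    def intrinsic_reward(self, state, next_state):
        """Novelty of a transition: 1 - P(next_state | state)."""
        return 1.0 - self.p[state][next_state]

    def update(self, state, action, next_state, extrinsic_reward):
        """Apply one Q-learning update and return the intrinsic bonus that was used."""
        bonus = self.intrinsic_reward(state, next_state)
        reward = extrinsic_reward + bonus

        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_error = reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td_error

        # Update the transition estimate: decay every successor of this state,
        # then add probability mass to the successor that was actually observed.
        successors = self.p[state]
        for seen in list(successors):
            successors[seen] *= 1.0 - self.model_rate
        successors[next_state] += self.model_rate
        return bonus


learner = CuriousQLearner(actions=["press"])
# Repeat the same transition with no extrinsic reward and watch the bonus fade.
bonuses = [learner.update("lamp_off", "press", "lamp_on", 0.0) for _ in range(4)]
print(bonuses)  # [1.0, 0.5, 0.25, 0.125]
```

The first time the lamp turns on, the transition is fully novel and earns the maximum bonus; each repetition halves it, so eventually only the extrinsic reward (here zero) is left to drive learning.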
I think computational implementations of curiosity are a great start. What drives are needed besides curiosity? Piaget talked about the two competing motivations of novelty (curiosity) and mastery. I think an important next step in these curiosity-driven systems is a multifaceted motivation system that represents the push-pull of novelty and mastery. The system should be driven to new situations because it seeks novelty, but then pull back to practice known skills because there is also a desire to master the environment.
Artificial curiosity is definitely an exciting topic. Once you consider its implications, it seems like a necessary component of any developmental robot.
One comment… the progress drive in the Intelligent Adaptive Curiosity algorithm is actually defined as the reduction in prediction errors over time, not just the error itself. This method was first introduced by Juergen Schmidhuber* (http://www.idsia.ch/~juergen).
* Schmidhuber, J. 1991. Curious model-building control systems. In Proceedings of the International Joint Conference on Neural Networks, vol. 2, 1458-1463. IEEE.
Comment by Tyler Streeter | October 31, 2006
Hi, Tyler. Thanks for the clarification.
Comment by A.L.T. | November 1, 2006
Sure, no problem. I don’t mean to be picky, but I think the distinction is important. As Schmidhuber explains, the problem with using curiosity rewards proportional to prediction errors is that agents can be attracted to situations that are too complex (including those with random elements). Using curiosity rewards proportional to the reduction in prediction error over time will attract agents only to those situations with high potential for learning. Interestingly, these situations can change over time (from the agent’s subjective viewpoint) as the agent becomes more capable – situations that are initially confusing might eventually become interesting.
Comment by Tyler Streeter | November 7, 2006
[...] One that we’ve mentioned previously, novelty is a compelling idea for a learning machine. Animal and human learners are incredibly [...]
Pingback by So, Where's My Robot? | April 27, 2007
Hi,
Has anyone tried to implement the IAC algorithm?
If so, please contact me at sduche@hotmail.com and help me!
Comment by Sylvain Duché | June 27, 2007