Alison Gopnik recently wrote an opinion article in The New York Times. Gopnik is a psychologist who studies child development and “Theory of Mind.”
I find much of Gopnik’s work inspiring for robot learning, and the ideas in this article are a good example. She lays out evidence and findings related to the difference between adult and child learning. In many ways children are much better at learning and exploring than adults. They observe the world and form theories consistent with a keen probabilistic analysis of the events they see. These theories guide their “play,” or exploration, in a way that efficiently gathers information about their complex and dynamic world.
The description of adult versus child-like learning sounds like the traditional explore/exploit tradeoff in machine learning. But it raises a question we are often asked about robot learning: do we actually want robots to explore like children? I think the answer is yes and no. We probably don’t want robots to need a babysitter, but we do want them to exhibit the kind of creativity and experimentation that you see in some of Gopnik’s studies of causal structure, for example.
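The explore/exploit tradeoff mentioned above is usually illustrated with a multi-armed bandit. Here is a minimal epsilon-greedy sketch of it, purely for illustration: with some probability the agent tries anything (child-like exploration), otherwise it sticks with what has worked best so far (adult-like exploitation).

```python
import random

def epsilon_greedy(estimates, epsilon=0.2):
    """With probability epsilon, explore (child-like); otherwise
    exploit the arm with the best running value estimate."""
    if random.random() < epsilon:
        return random.randrange(len(estimates))      # explore: pick any arm
    return max(range(len(estimates)), key=lambda i: estimates[i])  # exploit

def update(estimates, counts, arm, reward):
    """Incremental mean update of the chosen arm's value estimate."""
    counts[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / counts[arm]
```

Setting `epsilon` high gives the obsessively exploring child; setting it near zero gives the exploiting adult.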
I’m most excited about the idea that Gopnik ends the article with: “But what children observe most closely, explore most obsessively and imagine most vividly are the people around them. There are no perfect toys; there is no magic formula. Parents and other caregivers teach young children by paying attention and interacting with them naturally and, most of all, by just allowing them to play.”
I think that the importance of social learning in human development is a strong argument for robot learning by demonstration or instruction—that we should be looking for the shortcuts and computational gains we can get from leveraging a partner.
Empathy: The ability to understand and share the feelings of another. It’s an important concept for social robots because it is the foundation of any human interaction. I can’t see what’s going on in your brain, so in order to interact with you, I have to infer your mental states based on my own perceptions and experiences.
This account of Theory of Mind is known as “Simulation Theory,” and the proposed neural basis of the theory is the mirror neuron. Until recently these neurons had only been specifically identified in lab monkeys. But this week an article in the Science Journal reports on Dr. Iacoboni’s work at UCLA, published in a PLoS ONE article.
“Preliminary data from unpublished experiments this spring suggest that researchers at UCLA, probing the exposed brain tissue of patients undergoing neurosurgery, for the first time have isolated individual human brain cells that act as mirror neurons…”
Mirror neurons can be thought of as “subconscious seeds of social behavior…Located in the brain’s motor cortex, which orchestrates movement and muscle control, the cells fire when we perform an action and also when we watch someone else do the same thing. When someone smiles or wrinkles her nose in distaste, motor cells in your own brain associated with those expressions resonate in response like a tuning fork, triggering a hint of the feeling itself.”
Robotics researchers have recently been inspired by simulation theory in thinking about action understanding and imitation. From an engineering standpoint, the idea of repurposing generative mechanisms for recognition is elegant. Among the open questions I find interesting: how to map perception onto action, and how to do action generation and inference at the same time, or at least multi-tasked in a way that keeps the signals from getting crossed.
One of my favorite topics these days is motivation. In trying to figure out what should motivate robot behavior, I find myself learning more about human motivation. I just finished a good book by Gregory Berns — “Satisfaction: Sensation Seeking, Novelty, and the Science of Finding True Fulfillment”.
Berns, an M.D./Ph.D. at Emory, is a fun storyteller. The book details years of his research into the neural basis of satisfaction. What satisfaction do people get out of money, sex, food, exercise? Berns tackles these and many other questions about how the human brain experiences and evaluates the world around it.
Due to how my brain is wired, I read this and think, what does that mean for robots? So, here are some aspects of the neural basis of human satisfaction that I think should inspire designers of robots and learning machines.
Work for it
Satisfaction is directly linked to action. An fMRI experiment looked at the brain’s response to receiving monetary rewards. People watched shapes on a screen and pressed a button when a triangle appeared. At random times a $1 would appear on the screen to indicate they were receiving a reward. The group that had to press the button to transfer the $1 into their account showed a bigger response than the people who got it as a freebie. “Reward, at least as far as the brain is concerned, is in the action and not in the payoff.” He also mentions a study finding that rats prefer to work for rewards rather than get them for free.
So, a learning machine should seek out “rewards,” but a good state that comes about for free is much less valuable than one brought about by the machine’s own actions. This seems like a good way to focus a learner on the aspects of the world it has control over. It forces us to consider the entire integrated, embodied system: learning, reward, and experience do not happen in a vacuum. Additionally, this finding argues against learning techniques that rely entirely on observation.
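One toy way to encode this “work for it” bias: weight an incoming reward by whether the agent’s own action was contingent on it. This is just a sketch; the `free_discount` parameter and its value are my invention, not anything from Berns.

```python
def effective_reward(reward, caused_by_self, free_discount=0.5):
    """Value a reward the agent earned through its own action at face
    value; discount an identical reward that arrived for free."""
    return reward if caused_by_self else free_discount * reward
```

A learner driven by `effective_reward` would naturally spend more of its time on the parts of the world it can actually influence.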
Keep the options open
This one comes from looking at how people reason about risk and the utility of money. “The buying of possibilities, and not the actual goods purchased, is what accounts for the allure of money. When you increase the number of options available to you, risk actually decreases….our brains seem to have a built-in bias to [this]. People prefer more choices.”
This indicates that a machine, as part of its decision-making process, may need the ability to maximize expected future opportunity (available actions) in addition to expected future payoffs/rewards.
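As a sketch of what that could look like, an action’s score might combine its expected payoff with a bonus for the number of actions it leaves open afterward. Everything here (the `beta` weighting, the log form) is an assumption of mine, not something from the book:

```python
import math

def option_aware_value(expected_reward, n_future_options, beta=0.1):
    """Score an action by its expected payoff plus a bonus for the
    choices it preserves; the log keeps the bonus from dominating
    as the number of options grows."""
    return expected_reward + beta * math.log(max(n_future_options, 1))
```

Under this scoring, an action that earns slightly less but keeps many more options open can win, which matches the “buying of possibilities” intuition in the quote.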
All about novelty
The intricate ways in which the brain deals with novelty are a major topic of the book. Essentially, our brains constantly try to make sense of the world and predict the future. New information is the best way to build better models and do a better job of predicting. So the brain really, really likes new information, good or bad! The striatum is the area of the brain that seems to play the biggest role here. It appears to determine the importance of all information that is encountered (which reminds me of Damasio’s somatic marker hypothesis, though I’m not sure he theorized about the striatum in particular). It “lights up” on prediction of pain or pleasure, indicating that “something significant” is about to happen.
This says that people working on curiosity drives are going in the right direction, driving behavior based on the agent’s ability to predict outcomes in the world. I’m interested in what kinds of representations are really going to work for this. A robot can’t have a complete world model, so what representation is necessary to be able to do this kind of immediate overall assessment that we see in the striatum?
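A minimal sketch of such a curiosity drive: reward the agent by the prediction error of a learned forward model, so transitions it cannot yet predict are “interesting” and repeated ones become boring. The linear predictor here is a deliberately crude stand-in for whatever representation actually does the job:

```python
import numpy as np

class CuriosityDrive:
    """Intrinsic reward = prediction error of a learned forward model.
    Surprising transitions, good or bad, are what get rewarded."""

    def __init__(self, dim, lr=0.5):
        self.W = np.zeros((dim, dim))   # linear next-state predictor
        self.lr = lr

    def intrinsic_reward(self, state, next_state):
        pred = self.W @ state
        error = next_state - pred
        # crude gradient step toward the observed transition
        self.W += self.lr * np.outer(error, state)
        return float(np.linalg.norm(error))
```

The key property is that the same transition, seen again, yields less intrinsic reward: the drive pushes the agent toward whatever it cannot yet predict.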
The Information Gap
Novel events can lead to either a retreat or an explore response. Curiosity is when novel events trigger exploration: when the brain perceives an “information gap” between what it knows and what it wants to know. It follows naturally that the more you know, the more you’re curious about…a sort of learning snowball effect. This is a real chicken-and-egg problem for machine learning, because you have to already know something in order to perceive the information gap.
Shared insight is more fun
Much of the book deals with an individual’s perceptions and experience. But he does mention briefly that social interaction adds another dimension to novelty. It seems that novelty encountered as part of a team is even more rewarding than flying solo. Shared insight, “you see what I see,” is exciting probably because it is more rare than novelty alone.
This is a subtle but interesting point for social robots. The shared experience is an important part of situated learning for human teachers and learners. So, I think that the ability for a machine to communicate to a human teacher: “hey something new just happened, and I noticed it…”, is going to be very important for the teacher’s ability to easily intervene in the learning process.
Berns sums up nicely in a concluding definition: “Satisfaction is the uniquely human need to impart meaning to one’s actions.” I’ll have to look him up once I’m in Atlanta and pick his brain about robot motivations. Maybe I’ll pretend I skipped the chapter about how he and his wife jazzed up their sex life (blush).
What should drive a robot’s behavior? Today, robots are designed to do a particular thing, or designed to learn to do a particular thing. What if, instead, robots were internally motivated to do and learn to do useful new things? Maybe there are ways to computationally model the kinds of motivational systems that humans and animals have to create a flexible framework for the meta-control of an autonomous agent’s behavior.
So, what should your robot “want”? I’ve been doing some digging into what motivates human behavior and what efficiently and effectively drives a learner toward good learning experiences. Here’s a working list of useful motivations that we may need computational equivalents of for a social robot.
(Some of the books/papers that influence this list: Piaget, Lave, Thoman, Meltzoff)
Novelty: One that we’ve mentioned previously, novelty is a compelling idea for a learning machine. Animal and human learners are incredibly tuned to novelty, and are able to detect and seek out novel events in the world in a safe and efficient way.
Mastery: A complement to novelty, mastery is another important motivation in human learning: that inherent pleasure in ‘figuring it out.’ In many ways it’s a great balance to novelty, the competing drive between finding new things and understanding and mastering the things you’ve found. This is similar to the explore-exploit tradeoff in Reinforcement Learning, but the novelty-mastery tradeoff is rooted in the future goals and survival of the system rather than the abstract goal of maximizing “reward”.
Like-me: This one may be a uniquely human motivation, though the jury is still out. A ‘like-me’ bias, the propensity and ability to map between others’ actions and one’s own, is seen at a very early age. As children grow older, interacting with adults, they come to understand that the adult is ‘like-me’ and is therefore a source of information about actions and skills. This is said to be an important overall motivation for children’s learning: their desire to be like adults and to participate in the adult world.
Interaction: The inherent ability and desire to engage, communicate, and interact with others is seen from an early age. Children put themselves in a good position to learn new things by recognizing and seeking proximity to their caregivers. Two-month-olds can actively engage in communication or turn-taking routines with adults. Studies have shown that infants can start and stop communication with their mother through gesture and gaze, and that it is the infants who control the pace of the turn-taking interaction. For a social robot, a critical part of learning effectively in its environment will be quickly developing the ability to recognize whom it should interact with and being motivated to try to interact with its ‘caregivers’.
Collaboration/partnership: I have less psychological evidence for this one, but it seems like a good motivation for a social robot. I’m of course not the first to suggest that a robot should be fundamentally motivated to be helpful and collaborative with its human partners. The important part will be how to design a system that successfully balances this motivation with other system goals.
For those of us working towards Social Machine Learning, the ongoing research into the evolutionary roots of this ability in humans can provide some great insight. This Science Times article reports on a recent symposium, “The Mind of the Chimpanzee,” where scientists presented research on various aspects of chimpanzee emotional, cultural, and cognitive abilities.
It’s now widely acknowledged that chimps have some level of social learning ability. Some of the best evidence of this is tool use, which “seems to vary from one isolated chimp community to another.” This indicates that chimps in different communities are picking up skills by emulation.
They fold leaves in a mat to sponge water out of tree hollows and scoop algae off stream surfaces. They collect edible ants with sticks. They take stouter tree branches and pound the juicy palm fiber to a pulp, preparing another favorite food.
Dr. Sanz, who has worked with her husband, David B. Morgan, on some of the research, described mother chimps’ carefully withdrawing from a hole sticks swarming with black termites while their infants looked on. These social interactions, she said, passed on essential techniques and behaviors to the next generation.
The article is a nice introduction that points to some recent work on chimpanzee social and cognitive abilities. Here are a couple of other good reads on the topic of non-human social learning abilities:
C. M. Heyes & B. G. Galef (Eds.), Social Learning in Animals: The Roots of Culture. San Diego, CA: Academic Press, 1996.
K. Dautenhahn & C. Nehaniv (Eds.), Imitation in Animals and Artifacts. Cambridge, MA: MIT Press, 2002.
What is a Social Machine Learner actually meant to learn… As humans, we are fundamentally wired to interpret the actions of other people in goal-oriented ways, which really speeds up teaching and learning. People are very good at seeing an example of something and abstracting out what the “point” was. Studies have shown that kids will learn the correct goal of a new task even when they never saw a correct demonstration of it (Meltzoff).
Given that their social partners will act and interpret action in intentional and goal-oriented ways, a Social Machine Learning system will need to continually refine its concept of what the human partner means to communicate and what the activity is about.
Csibra’s theory of human action understanding gives some inspiration about how our social machines will have to interpret their human partners. In the theory, activity has the representation [context][action][goal], and a series of experiments with infants finds that they have efficiency expectations with respect to each of these three components (Csibra 2003). For instance, given a goal and a context, infants expect the most efficient action to be used (and are surprised when it is not); the experiments show the ability to infer goal and context in a similar fashion. In one experiment, 9- to 12-month-old infants were repeatedly shown animations of a ball jumping over an obstacle to reach and contact a second ball. Here the jumping action is instrumental to the goal (contacting the second ball). After habituating to this animation, the infants were shown a test configuration with the obstacle removed. In one test condition the approaching ball performed the same jumping action to reach the other ball; in the second, it made the more efficient straight-line approach. Using looking time as a measure of broken expectations, Csibra found that the infants used a goal-oriented interpretation: despite having habituated to the jumping action, they accepted the novel straight-line action and were surprised by the now-unnecessary jumping. Infants thus seem to understand observed actions in terms of their goals rather than their surface motion.
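The efficiency principle behind these experiments can be sketched computationally: given a context and a goal, the “rational” expectation is the cheapest action that still achieves the goal, and an observed action is surprising when a cheaper feasible one exists. The action set, `cost`, and `achieves` functions below are placeholders I made up to illustrate the idea:

```python
def expected_action(actions, context, goal, achieves, cost):
    """Efficiency principle: among the actions that achieve the goal
    in this context, an observer expects the cheapest one."""
    feasible = [a for a in actions if achieves(a, context, goal)]
    return min(feasible, key=lambda a: cost(a, context)) if feasible else None

def violates_expectation(observed, actions, context, goal, achieves, cost):
    """An observed action is surprising when a cheaper feasible action
    exists -- e.g. jumping when no obstacle blocks the straight path."""
    best = expected_action(actions, context, goal, achieves, cost)
    return best is not None and cost(observed, context) > cost(best, context)
```

With an obstacle present, jumping is the only feasible action, so it is expected; remove the obstacle and the cheaper straight-line approach makes the same jump surprising, which is exactly the looking-time pattern in the infant study.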
This kind of “efficiency” representation would be great for Social Machine Learners because it leads to a reasonable generalization of activity across contexts. For instance, if the system is always trying to build a better model of the [context] component of an activity representation, this will lead to the ability to say, “this looks like the kind-of-situation where I do X” or abstracted even further “I feel like doing X.” Also, this representation implies the flexibility to learn multiple ways to accomplish the same goal.
This is VERY different from most Machine Learning examples out there. Often systems are designed to learn a particular thing at a particular time. The goal is defined by the designer, either in the nature of the training data or in the description of an optimization criterion. Or, in many systems, there is an implicit goal of learning a “complete” model of the environment.
How do we bridge this gap, to make machine learners able to flexibly learn new goals (concepts) from interacting with a non-expert human partner? I’ve been trying to tackle various aspects of this question in the systems I’ve been building for the past few years now. I think this is a good example of an aspect of Social Machine Learners where the “Social” element has to be a fundamental part of the system, not just a nice interface that is slapped on at the end. The machine needs to understand and represent the world in social (goal-oriented) ways in order to learn in the way that a social partner is going to expect.
Do you know of a system/algorithm that you believe is actually learning a new goal that it wasn’t specifically designed to learn? Leave a comment!
Picking up on the last post, I really like to think about how machine learning could or should be more like human learning. So, even though I’m a computer scientist by training, I read a lot of psychology literature as inspiration for building more flexible, efficient, personable and teachable machines.
Situated Learning is a field of study that looks at the social world of a child and how it contributes to their development. Throughout development, a child’s learning is aided in crucial ways by the structure and support of their environment and especially their social environment.
In a situated learning interaction, a good instructor structures the task appropriately with timely feedback and guidance. The learner contributes to the process by communicating their internal state (understanding, confusion, attention, etc.). This tightly coupled interaction enables the learner to leverage instruction to build the appropriate representations and associations. This situated learning process stands in contrast to typical machine learning scenarios, which are often neither interactive nor intuitive for the human partner.
Here are a few key qualities of human learning that we need to consider for teachable machines:
Learning is a part of all activity
In most machine learning examples, learning is an explicit activity: the system is designed to learn a particular thing at a particular time. Humans, on the other hand, have a motivation for learning, a drive to be a better “system,” and an ability to seek out the expertise of others. Learning is not a separate activity; it is part of all activity.
Teachers scaffold the learning process
An important characteristic of a good learner is the ability to learn both on one’s own and by interacting with another. Children are capable of exploring and learning on their own, but in the presence of a teacher they can take advantage of the social cues and communicative acts provided to accomplish more. For instance, the teacher often guides the child’s search process by providing timely feedback, luring the child to perform desired behaviors, and controlling the environment so the appropriate cues are easy to attend to, thereby allowing the child to learn more effectively, appropriately, and flexibly.
Expression provides feedback to guide a teacher
To be a good instructor, one must maintain a mental model of the learner’s state (e.g., what is understood so far, what remains confusing or unknown) in order to appropriately structure the learning task with timely feedback and guidance. The learner helps the instructor by expressing their internal state via expressions, gestures, or vocalizations that reveal understanding, confusion, attention, etc.