Operant & Classical Conditioning
Classical conditioning forms associations between stimuli (CS and US). Operant conditioning, on the other hand, forms an association between behaviors and the resulting events.
Classical conditioning involves respondent behavior, which occurs as an automatic response to a certain stimulus. Operant conditioning involves operant behavior, which operates on the environment and produces rewarding or punishing stimuli.
Two characteristics help us distinguish the two forms of conditioning. In classical conditioning, the organism learns associations between two stimuli, and its behavior is respondent, that is, automatic. In operant conditioning, the organism learns associations between its behavior and resulting events; the organism operates on the environment.
Transforming Couch Potatoes with Operant Conditioning
Psychologist David Allison of Columbia University College of Physicians and Surgeons has reported a nifty application of operant conditioning principles to both weight control and leisure management in children. Presenting at the Experimental Biology meeting in Washington, DC, in April 1999, Allison described how his team successfully got overweight, sedentary children moving while watching TV.
The researchers wondered what would happen if kids had to ride a stationary bicycle to keep the television on. So they created TV-cycles and randomly assigned overweight 8- to 12-year-olds to two conditions. In one condition, children had to pedal in order to keep the TV on. In the second condition, a bicycle was present but not necessary for the TV’s operation. Results? Children who had to pedal to watch TV biked an average of an hour a week, while the others biked an average of only eight minutes. The treatment group watched one hour of TV per week, while the controls watched 20 hours. Equally significant was the finding that the treatment group significantly decreased overall body fat.
Skinner’s Experiments
Skinner’s experiments extend Thorndike’s thinking, especially his law of effect. This law states that rewarded behavior is likely to occur again.
Operant Chamber
Using Thorndike’s law of effect as a starting point, Skinner developed the operant chamber, or the Skinner box, to study operant conditioning.
From The Essentials of Conditioning and Learning, 3rd Edition, by Michael P. Domjan, 2005. Used with permission by Thomson Learning, Wadsworth Division.
The operant chamber, or Skinner box, comes with a bar or key that an animal manipulates to obtain a reinforcer like food or water. The bar or key is connected to devices that record the animal’s response.
Edward Thorndike’s law of effect states that rewarded behavior is likely to recur. Using this as his starting point, Skinner explored the principles and conditions of learning through operant conditioning, in which behavior operates on the environment to produce rewarding or punishing stimuli. Skinner used an operant chamber (Skinner box) in his pioneering studies with rats and pigeons.
In his experiments, Skinner used shaping, a procedure in which reinforcers, such as food, guide an animal’s natural behavior toward a desired behavior. By rewarding responses that are ever closer to the final desired behavior (successive approximations), and ignoring all other responses, researchers can gradually shape complex behaviors. Because nonverbal animals and babies can respond only to what they perceive, their reactions demonstrate which events they can discriminate.
This rat knows the difference between classical music and jazz. Watch: http://www.youtube.com/watch?v=d3PrZeCmXd0
Shaping
Shaping is the operant conditioning procedure in which reinforcers guide behavior towards the desired target behavior through successive approximations.
Examples: a rat shaped to sniff mines; a manatee shaped to discriminate objects of different shapes, colors, and sizes.
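The idea of successive approximations can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Skinner's procedure: behavior is reduced to a single number, the learner varies randomly around its current typical response, and the names `shape`, `spread`, and `criterion` are hypothetical. A response is reinforced only when it is closer to the target than anything reinforced so far; every other response is ignored.

```python
import random
from typing import List


def shape(target: float, start: float, trials: int,
          rng: random.Random, spread: float = 1.0) -> List[float]:
    """Toy model of shaping (all quantities are illustrative assumptions).

    Returns the learner's typical response after each reinforced trial.
    """
    typical = start                    # centre of the learner's current behavior
    criterion = abs(target - start)    # a response must beat this error to be reinforced
    history = [typical]
    for _ in range(trials):
        # Natural variation around the current typical behavior.
        response = typical + rng.uniform(-spread, spread)
        error = abs(target - response)
        if error < criterion:
            # A closer approximation: reinforce it, so it becomes the new
            # typical behavior, and tighten the criterion for the next reward.
            typical = response
            criterion = error
            history.append(typical)
        # All other responses are ignored: no reinforcer, no change.
    return history


if __name__ == "__main__":
    steps = shape(target=10.0, start=0.0, trials=200, rng=random.Random(1))
    print(f"reinforced steps: {len(steps) - 1}")
    print(f"first response: {steps[0]:.2f}, last response: {steps[-1]:.2f}")
```

Because the criterion only ever tightens, each reinforced response in the returned history is closer to the target than the one before it, which is the defining feature of reinforcing successive approximations.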
Positive Reinforcement of “Remote Controlled” Rats
Sanjiv Talwar and colleagues at the State University of New York, Downstate Medical Center, Brooklyn, have reported a fascinating application of operant conditioning principles. The researchers implanted tiny stimulating electrodes into the brains of five rats and then used a laptop computer to guide them over obstacles and through mazes. “Our rats,” report the research team, “were easily guided through pipes and across elevated runways and ledges, and could be instructed to climb or jump.” They were even able to lead the rats over piles of rubble and through bright, open fields, an environment rats normally avoid. Such remote-controlled rats may eventually serve as “living robots” for land-mine detection and search-and-rescue missions after a disaster or terrorist attack. For example, a rat fitted with a microphone and video camera could be directed to where people are believed to be buried alive.
How does this all work? The researchers planted electrodes in two regions of the rat’s brain: the somatosensory cortex, which receives signals when the rat’s whiskers brush against something, and the medial forebrain bundle, whose activation produces reward signals. A tiny electronic backpack on top of each rat took signals from the laptop that was up to 500 feet away. When the left somatosensory cortex was stimulated, the rat interpreted it as a signal that something had brushed its right whiskers and it immediately turned right. Similarly, activating the right somatosensory cortex made the rat turn left. After the rat made the correct turn, the researchers activated the electrode in the rat’s reward center, thereby delivering positive reinforcement.
Types of Reinforcers
Reinforcement: Any event that strengthens the behavior it follows. A heat lamp positively reinforces a meerkat’s behavior in the cold.
http://www.youtube.com/watch?v=6DWbV5VKZxc
Primary & Secondary Reinforcers Primary Reinforcer: An innately reinforcing stimulus like food or drink. Conditioned Reinforcer: A learned reinforcer that gets its reinforcing power through association with the primary reinforcer. Money is a conditioned reinforcer. The actual paper bills are not themselves reinforcing. However, the paper bills can be used to acquire primary reinforcers such as food, water, and shelter. Therefore, the paper bills become reinforcers as a result of pairing them with the acquisition of food, water, and shelter.
Immediate & Delayed Reinforcers
Immediate Reinforcer: A reinforcer that occurs instantly after a behavior. Example: a rat gets a food pellet for a bar press.
Delayed Reinforcer: A reinforcer that arrives some time after the behavior. Example: a paycheck that comes at the end of a week.
We may be inclined to choose small immediate reinforcers (watching TV) over large delayed reinforcers (getting an A in a course), which require consistent study.
Reinforcement Schedules
• Continuous Reinforcement: Reinforces the desired response each time it occurs. Learning is rapid, but so is extinction if rewards cease.
• Partial Reinforcement: Reinforces a response only part of the time. Though this results in slower acquisition in the beginning, it shows greater resistance to extinction later on.
Partial reinforcement schedules vary according to the number of responses required before a reward (ratio schedules) or the time that must elapse before a response is rewarded (interval schedules).
Ratio Schedules
Fixed-ratio schedule: Reinforces a response only after a specified number of responses (e.g., piecework pay).
Variable-ratio schedule: Reinforces a response after an unpredictable number of responses. Behavior on this schedule is hard to extinguish because of the unpredictability (e.g., gambling, fishing).
Interval Schedules
Fixed-interval schedule: Reinforces a response only after a specified time has elapsed (e.g., preparing for an exam only when the exam draws close).
Variable-interval schedule: Reinforces a response at unpredictable time intervals, which produces slow, steady responding (e.g., a pop quiz).
Schedules of Reinforcement
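The four partial-reinforcement schedules can be written as small decision rules that answer one question for each response: is this response reinforced? The sketch below is illustrative only. The function names, the ranges used to draw the "unpredictable" counts and waits, and the one-response-per-second pattern are all assumptions made for the example, not part of the original material.

```python
import random
from typing import Callable, List

Schedule = Callable[[float], bool]  # takes the response time, returns True if reinforced


def fixed_ratio(n: int) -> Schedule:
    """Reinforce every n-th response."""
    count = 0

    def respond(t: float) -> bool:
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return respond


def variable_ratio(mean_n: int, rng: random.Random) -> Schedule:
    """Reinforce after an unpredictable number of responses (1 to 2*mean_n - 1)."""
    count = 0
    required = rng.randint(1, 2 * mean_n - 1)

    def respond(t: float) -> bool:
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)
            return True
        return False
    return respond


def fixed_interval(seconds: float) -> Schedule:
    """Reinforce the first response made once `seconds` have passed since the last reinforcer."""
    last = 0.0

    def respond(t: float) -> bool:
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False
    return respond


def variable_interval(mean_seconds: float, rng: random.Random) -> Schedule:
    """Reinforce the first response after an unpredictable wait (0 to 2*mean_seconds)."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)

    def respond(t: float) -> bool:
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False
    return respond


def run(schedule: Schedule, response_times: List[float]) -> List[bool]:
    """Apply a schedule to a sequence of response times."""
    return [schedule(t) for t in response_times]


if __name__ == "__main__":
    times = [float(t) for t in range(1, 21)]  # one response per second for 20 seconds
    rng = random.Random(0)
    schedules = {
        "fixed-ratio 5": fixed_ratio(5),
        "variable-ratio 5": variable_ratio(5, rng),
        "fixed-interval 5s": fixed_interval(5.0),
        "variable-interval 5s": variable_interval(5.0, rng),
    }
    for name, schedule in schedules.items():
        marks = "".join("R" if r else "." for r in run(schedule, times))
        print(f"{name:22s} {marks}")
```

Note the design difference the code makes visible: ratio schedules keep a count of responses and ignore the clock, while interval schedules track the time since the last reinforcer and ignore how many responses were made in between.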
Superstitious Behavior
According to Skinner, a superstitious behavior is a response that is accidentally reinforced; that is, there is no prearranged contingency between the response and reinforcement. Because the behavior and reinforcement occur together, the behavior is repeated and, by chance, is again followed by reinforcement. This process may explain why we carry a half dollar as a good luck piece, wear the same slacks when taking tests, and step over cracks in the sidewalk.
In one study, Skinner placed hungry pigeons in a Skinner box where food was presented for five seconds at regular intervals. The food was made available regardless of the pigeon’s behavior. Six of the eight pigeons exhibited “superstitious” behavior. One pigeon happened to be turning counterclockwise when the food was presented early in the experiment, and so it would reliably turn two or three times in a counterclockwise direction between reinforcements. A second bird received food after thrusting its head into one of the upper corners of the cage. Two other pigeons learned to swing their upper bodies in a pendulum motion.
Skinner reported that a 15-second interval between reinforcements was ideal for the development of these superstitious behaviors. Longer intervals decreased the likelihood that the same behavior would occur at the time of the next reinforcement. Shorter intervals limited the number and kinds of behaviors that might precede reinforcement. In such cases, only the response “head lowered in front of the cup entrance (food dispenser)” was likely to be reinforced.
http://www.youtube.com/watch?v=TtfQlkGwE2U&feature=related
Punishment
Punishment is an aversive event that decreases the behavior it follows.
Punishment attempts to decrease the frequency of a behavior. It either administers an undesirable consequence (for example, spanking) or withdraws something desirable (for example, taking away a favorite toy). Negative reinforcement, by contrast, removes something undesirable (an annoying beeping sound) to increase the frequency of a behavior (fastening a seatbelt).
Punishment is not simply the logical opposite of reinforcement, for it can have several undesirable side effects, including suppressing rather than changing unwanted behaviors, creating fear, and teaching aggression.
Although there may be some justification for occasional punishment (Larzelere & Baumrind, 2002), it usually leads to negative effects.
1. Results in unwanted fears.
2. Conveys no information to the organism.
3. Justifies pain to others.
4. Causes unwanted behaviors to reappear in its absence.
5. Causes aggression towards the agent.
6. Causes one unwanted behavior to appear in place of another.
Extending Skinner’s Understanding
Skinner acknowledged inner thought processes and biological underpinnings, but many psychologists criticized him because they believed he discounted them.
Cognition & Operant Conditioning
Evidence of cognitive processes during operant learning comes from rats that explore a maze without an obvious reward. The rats seem to develop cognitive maps, or mental representations, of the layout of the maze (their environment).
Latent Learning
Such cognitive maps are based on latent learning, which becomes apparent only when an incentive is given (Tolman & Honzik, 1930).
Operant Conditioning and Cognition
Rats exploring a maze seem to develop a mental representation (a cognitive map) of the maze even in the absence of reward. Their latent learning becomes evident only when there is some incentive to demonstrate it. Children, too, may learn from watching a parent, but demonstrate the learning much later when needed. The conclusion: There is more to learning than associating a response with a consequence. There is also cognition.
Mindful Learning
Ellen Langer’s distinction between mindful and mindless learning highlights the importance of cognitive processes in education.
Langer argues that learning requires mindful engagement with the material in question. Mindfulness, she writes, is a “flexible state of mind in which we are actively engaged in the present, noticing new things, and sensitive to context.” Being mindful involves drawing novel distinctions and thereby avoiding mind-sets that limit us. When we are in a state of mindlessness, “we act like automatons who have been programmed to act according to the sense our behavior made in the past, rather than the present.” Research findings over the past 25 years, notes Langer, suggest that mindfulness leads to increased competence, fewer accidents, and improved memory, creativity, and positive affect.
Mindlessness, Langer argues, comes about through both repetition and single exposure. For example, if we repeat some task many times, we may come to establish a mind-set for performing it. We may drive a familiar route so often that finally the car seems to arrive at the destination by itself. Similarly, if we process information without questioning it, that is, without considering the alternative ways it could be understood, we take it in mindlessly. It will not occur to us to reconsider it. Our commitment to “one” understanding may later be to our disadvantage. Langer identifies three myths or mind-sets that detract from our ability to learn.
Myth 1 is that “the basics should be learned so well that they become second nature.” The problem is that if we learn the basics so well, it will not occur to us to change them when we need to. In one study, Langer and her colleagues taught subjects a new sport, “smack-it ball,” in which the players wear a glovelike racket. Some were taught “this is how you play the game”; others were told, “here is how it could be played.” After all were well practiced, the researchers substituted a much heavier ball. Those who learned the game mindfully were better able to accommodate than those who took the basics for granted.
Myth 2 is that “to pay attention to something, we should hold it still and focus on it.” Attending to a still image is difficult; it fades from view. However, attending to an image mindfully, noticing different things about it, is easy. In several studies, Langer’s research team asked participants to pay attention to a stimulus or to notice new things about it. Whether the subjects were elderly or children with attention problems, instructions to vary the target of attention improved performance. Not only is the task easier, but people remember more about the target of their attention and like it better.
Myth 3 is that “it is important to learn how to delay gratification.” The problem with this idea is that it suggests tasks are inherently good or bad. Evaluation resides in our minds, not in the tasks. Work and study are not negative. However, we often make them appear to be so. Langer and Sofia Snow asked subjects to evaluate the humor in cartoons, in some cases calling the task “work” and in other cases “play.” When they called it work, people tended to enjoy it less and their minds were more likely to wander. In other studies, people engaged in activities they did not like (viewing art, watching football). Some were led to engage the task the way they typically did, while others were asked to notice new things about it. The more the subjects noticed, the more they liked the task. Mindful learning engages people and the experience tends to be positive.
Research indicates that people may come to see rewards, rather than intrinsic interest, as the motivation for performing a task. Again, this finding demonstrates the importance of cognitive processing in learning. By undermining intrinsic motivation, the desire to perform a behavior for its own sake, rewards can carry hidden costs. Extrinsic motivation is the desire to perform a behavior because of promised rewards or threats of punishment. A person’s interest often survives when a reward is used neither to bribe nor to coerce but to signal a job well done.
Intrinsic Motivation
Intrinsic Motivation: The desire to perform a behavior for its own sake.
Extrinsic Motivation: The desire to perform a behavior due to promised rewards or threats of punishment.
Over-justification Effect
Excessive rewards can undermine intrinsic motivation. Over-justification refers to the impact of promising a reward for doing what one already likes to do. The person comes to see the reward, rather than intrinsic interest, as the motivation for performing the task.
Many people find the over-justification effect counterintuitive, and, as such, it shows that psychological research often goes beyond common sense. Do you think that preschool children who normally enjoy drawing, and who later receive recognition for it in the form of “good player” badges and honor-roll boards, still enjoy it just as much?
Pay close attention to the following synopsis of research by M. R. Lepper and colleagues:
Only preschoolers showing high interest in drawing during free playtime were selected for the research. The children were tested individually and assigned randomly to one of three conditions. In the expected reward condition, children were shown a good player badge and told that if they did a good job of drawing, they could earn a badge and have their names put on the school honor-roll board. All children in this condition received the expected rewards. In the unexpected reward condition, children were asked to draw without any mention of the awards. Unexpectedly, at the end of the drawing period, all of these children were given the awards. Finally, in the control condition, children were asked simply to draw without promise or presentation of the awards. After this task, children were observed back in the classroom during free playtime, and the amount of time they spent drawing was recorded.
What were the similarities and differences among the three conditions for reward expected and reward received?
Expected reward condition: Children were shown a good player badge and told that if they did a good job of drawing, they could earn a badge and have their names put on the school honor-roll board. All children in this condition received the expected rewards.
Unexpected reward condition: Children were asked to draw without any mention of the awards. Unexpectedly, at the end of the drawing period, all of these children were given the awards.
Control condition: Children were asked simply to draw without promise or presentation of the awards.
Predict how much time you think the children from each condition would spend drawing during the later free play period. Which condition drew more or less in comparison to the others?
Your predictions may be quite different from the actual results. The correct prediction is that children from the expected-reward condition draw less than children from either the control or the unexpected-reward condition, with no significant differences between the latter two conditions.
How did this already justifiable activity become over-justified by the promise of added reward?
Interest can survive when rewards are used to communicate a job well done, not to bribe or to control.
Philip Zimbardo relates the amusing story of Nunzi, a shoemaker and an Italian immigrant. Every day after school a gang of young American boys came to his shop to taunt and to tease. After attempting in a variety of ways to get the boys to stop, Nunzi hit upon the following solution.
When the boys arrived the next day after school, he was in front of his store waving a fistful of dollar bills. “Don’t ask me why,” said Nunzi, “but I’ll give each of you a new dollar bill if you will shout at the top of your lungs 10 times: ‘Nunzi is a dirty Italian swine.’” Taking the money, the boys shouted the chants in unison. The next afternoon Nunzi successfully enticed the gang to repeat their taunts for a half dollar. On the third day, he had only a handful of dimes: “Business has not been good and I can only give you each 10 cents to repeat your marvelous performance of yesterday.”
“You must be crazy,” said the ringleader, “to think we would knock ourselves out screaming and cursing for a lousy dime.”
“Yah,” said another. “We got better things to do with our time than to do favors for only a dime.” And away the boys went, never to bother Nunzi again.
Do rewards sometimes undermine motivation in adults? Many studies now show this to be so. In one experiment, adults who were paid to lose weight at first lost pounds faster than those who were not paid. When payments stopped, the paid subjects regained some of the lost weight, while those who had not been paid continued to lose. Similarly, rewards can cast a pall over romantic love. Dating couples were asked to think of either the extrinsic rewards (for example, “she/he knows a lot of people”) or the intrinsic rewards (for example, “we always have a good time together”) they obtained from going out with their partners. When later asked to state their feelings, the couples who had thought about the extrinsic rewards evaluated themselves as being less in love than did those who had thought about the intrinsic rewards.
The simplest interpretation of these findings is that rewards lead people to think that an activity does not deserve doing in its own right. Why else would someone offer rewards? People therefore come to see the activity as a means rather than an end, and their actions come under the control of the extrinsic reward. When rewards are withdrawn, people judge the activity as no longer worth doing.
Edward Deci has argued that rewards do not inevitably undermine intrinsic motivation. He suggests that rewards (money, praise, gold stars, or candy bars) can be used in two ways: to control us or to inform us on how well we are doing in meeting the challenge of a particular task. When rewards are used to control or manipulate, they are likely to undermine intrinsic motivation. When they are used to inform, they may actually boost people’s feelings of competence and intrinsic motivation.
Deci reports research findings in which teachers’ use of rewards had either a positive or negative impact on intrinsic motivation. Teachers who valued order and control in the classroom tended to use rewards as sanctions. Those who favored autonomy, encouraging the children to take responsibility for their actions, tended to use rewards informationally. The former undermined intrinsic motivation, while the latter actually fostered it. In the Nunzi story, as well as in the other research examples, the recipients of the rewards probably viewed them as attempts to control rather than inform.
How rewards are presented often determines whether children will see them as controlling or informative. In one study, children were offered prizes for playing with a drum. For one group the prize was in plain view. For the other group the prize was hidden, and the leaders made no further mention of it during the children’s performance. Only the children with the reward in plain view showed a significant decrease in intrinsic motivation. Evidently, a clearly presented reward siphons attention away from enjoyment of the immediate task.
Anticipated rewards thus seem to have more serious (and negative) consequences than unanticipated rewards. People are more likely to see the latter as giving them information about their performance, since the reward was not presented at the beginning as a bribe. Rather than emphasizing rewards from the outset to control a class or a child, perhaps teachers and parents might better use them occasionally as an unexpected bonus.
Biological Predisposition
As with classical conditioning, an animal’s natural predispositions constrain its capacity for operant conditioning. Biological constraints predispose organisms to learn associations that are naturally adaptive.
Breland and Breland (1961) showed that animals drift towards their biologically predisposed instinctive behaviors. Training that attempts to override these tendencies will probably not endure because the animals will revert to their biologically predisposed patterns.
Skinner’s Legacy
Skinner argued that behaviors were shaped by external influences instead of inner thoughts and feelings. Critics argued that Skinner dehumanized people by neglecting their free will.
Skinner has been criticized for repeatedly insisting that external influences, not internal thoughts and feelings, shape behavior and for urging the use of operant principles to control people’s behavior. Critics argue that he dehumanized people by neglecting their personal freedom and by seeking to control their actions. Skinner countered: People’s behavior is already controlled by external reinforcers, so why not administer those consequences for human betterment?
Skinner argued that denial of the fact that we are controlled by our environment leaves us vulnerable to control by subtle and malignant circumstances and by malicious people. Governments and political leaders, he contended, may seek to control us for their own benefit rather than serve our best interest. Recognizing that behavior is shaped by its consequences is the first step in taking control of the environment and ensuring that it delivers consequences promoting desirable behavior. When we demand freedom, argued Skinner, what we really mean is freedom from aversive consequences and not freedom to make choices. In the final analysis, we can have “freedom” but only by arranging our own consequences and not by leaving it to “fate” or the “government.”
For Skinner, “dignity” was also an illusion. “We recognize a person’s dignity or worth,” he argued, “when we give him credit for what he has done.” We tend to do this when we are unable to readily identify the environmental factors that control another’s behavior. When a person makes an anonymous charitable donation, for example, we may attribute it to something inside the person, to his or her “altruism.” To credit people for doing good is to ignore the environmental factors that give rise to “good” behavior. Something in the person’s formative years has obviously shaped the desirable behavior. Only by identifying the external factors that gave rise to “doing good” can we bring them under control so that more people will do good more often. This movement toward a better society demands giving up the belief in “dignity.”
Did Skinner practice what he preached? Yes, as you can see here: “And now my labor is over. I have had my lecture. I have no sense of fatherhood. If my genetic and personal histories had been different, I should have come into possession of a different lecture. If I deserve any credit at all, it is simply for having served as a place in which certain processes could take place. I shall interpret your polite applause in that light.”
Applications of Operant Conditioning
Operant principles have been applied in a variety of settings. For example, in schools, on-line testing systems and interactive student software embody the operant ideal of individualized shaping and immediate reinforcement. In the workplace, positive reinforcement for jobs well done has boosted employee productivity. At home, parents can reward their children’s desirable behaviors and not reward those that are undesirable. To reach our personal goals, we can monitor and reinforce our own desired behaviors and cut back on incentives as the behaviors become habitual.
In School
Skinner introduced the concept of teaching machines that shape learning in small steps and provide reinforcement for correct responses.
At Work
Reinforcers affect productivity. Many companies now allow employees to share profits and participate in company ownership.
At Home
In children, reinforcing good behaviors increases their occurrence. Ignoring unwanted behaviors decreases their occurrence.
Operant vs. Classical Conditioning
Both classical and operant conditioning are forms of associative learning. They both involve acquisition, extinction, spontaneous recovery, generalization, and discrimination. Both classical and operant conditioning are influenced by biological and cognitive predispositions. The two forms of learning differ in an important way. In classical conditioning, organisms associate different stimuli that they do not control and respond automatically. In operant conditioning, organisms associate their own behaviors with their consequences.