Operant conditioning, sometimes called instrumental learning, was first extensively studied by Edward L. Thorndike, who observed the behavior of cats trying to escape from home-made puzzle boxes.

With repeated trials, ineffective responses occurred less frequently and successful responses occurred more frequently, so the cats escaped more and more quickly.

In short, some consequences strengthen behavior and some consequences weaken behavior. By plotting escape time against trial number, Thorndike produced the first known animal learning curves. Put another way, responses are retained when they lead to a successful outcome and discarded when they do not, or when they produce aversive effects. Such learning usually happens without being planned by any "teacher", but operant conditioning has been used by parents in teaching their children for thousands of years.

Skinner

B. F. Skinner is referred to as the father of operant conditioning, and his work is frequently cited in connection with this topic. His book "The Behavior of Organisms: An Experimental Analysis" [6] initiated his lifelong study of operant conditioning and its application to human and animal behavior.

Following the ideas of Ernst Mach, Skinner rejected Thorndike's reference to unobservable mental states such as satisfaction, building his analysis on observable behavior and its equally observable consequences.

Operant conditioning, in his opinion, better described human behavior because it examined the causes and effects of intentional behavior. To implement his empirical approach, Skinner invented the operant conditioning chamber, or "Skinner Box", in which subjects such as pigeons and rats were isolated and could be exposed to carefully controlled stimuli.

Unlike Thorndike's puzzle box, this arrangement allowed the subject to make one or two simple, repeatable responses, and the rate of such responses became Skinner's primary behavioral measure. Records of these response rates were the primary data that Skinner and his colleagues used to explore the effects on response rate of various reinforcement schedules.
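As a rough illustration of response rate as a measure, the sketch below counts responses over time and divides by the length of an observation window. The function names and the lever-press times are hypothetical and chosen only for illustration; they are not taken from Skinner's apparatus or data.

```python
from bisect import bisect_right


def cumulative_record(response_times, sample_times):
    """Cumulative number of responses made at or before each sample time."""
    ordered = sorted(response_times)
    return [bisect_right(ordered, t) for t in sample_times]


def response_rate(response_times, start, end):
    """Responses per unit time in the half-open window [start, end)."""
    if end <= start:
        raise ValueError("end must be later than start")
    count = sum(1 for t in response_times if start <= t < end)
    return count / (end - start)


# Made-up lever-press times in seconds, for illustration only.
presses = [1.0, 2.5, 3.0, 7.0, 8.0, 8.5, 9.0]

print(cumulative_record(presses, [0, 5, 10]))  # [0, 3, 7]
print(response_rate(presses, 5, 10))           # 0.8 responses per second
```

A steeper rise in the cumulative count over a window corresponds to a higher response rate in that window.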

Skinner defined new functional relationships such as "mands" and "tacts" to capture some essentials of language, but he introduced no new principles, treating verbal behavior like any other behavior controlled by its consequences, which included the behavior of the speaker's audience.

Concepts and procedures

Origins of operant behavior: operant variability

Operant behavior is said to be "emitted"; that is, initially it is not elicited by any particular stimulus.

Thus one may ask why it happens in the first place. The answer to this question is like Darwin's answer to the question of the origin of a "new" bodily structure, namely, variation and selection. Similarly, the behavior of an individual varies from moment to moment, in such aspects as the specific motions involved, the amount of force applied, or the timing of the response.

Variations that lead to reinforcement are strengthened, and if reinforcement is consistent, the behavior tends to remain stable.

However, behavioral variability can itself be altered through the manipulation of certain variables.

Reinforcement and punishment are defined by their effect on behavior. Either may be positive or negative. Positive reinforcement and negative reinforcement increase the probability of a behavior that they follow, while positive punishment and negative punishment reduce the probability of a behavior that they follow.

Another procedure is called "extinction". Extinction occurs when a previously reinforced behavior is no longer reinforced with either positive or negative reinforcement.

During extinction the behavior becomes less probable.

Behavior that was reinforced only occasionally typically takes longer to extinguish than behavior that was reinforced at every opportunity, because the organism has learned that repeated responses may be needed before reinforcement arrives.

Positive reinforcement occurs when a behavior (response) is rewarding or the behavior is followed by another stimulus that is rewarding, increasing the frequency of that behavior.

This procedure is usually called simply reinforcement.

Negative reinforcement occurs when a behavior (response) is followed by the removal of an aversive stimulus, increasing the frequency of that behavior. In the Skinner Box experiment, the aversive stimulus might be a loud noise sounding continuously inside the box; negative reinforcement would happen when the rat presses a lever to turn off the noise.

Positive punishment (also referred to as "punishment by contingent stimulation") occurs when a behavior (response) is followed by an aversive stimulus. Example: pain from a spanking, which would often result in a decrease in that behavior. Positive punishment is a confusing term, so the procedure is usually referred to as "punishment".

Negative punishment (penalty, also called "punishment by contingent withdrawal") occurs when a behavior (response) is followed by the removal of a stimulus.
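The four procedures above differ along two dimensions: whether a stimulus is added or removed after the response, and whether the behavior then becomes more or less probable. A minimal sketch of that two-way classification follows; the function name and the string labels are hypothetical and chosen only for illustration.

```python
def classify_consequence(stimulus_change, behavior_change):
    """Name the procedure from the stimulus change and its effect on behavior.

    stimulus_change: "added" or "removed" (what follows the response)
    behavior_change: "increases" or "decreases" (probability of the response)
    """
    names = {
        ("added", "increases"): "positive reinforcement",
        ("removed", "increases"): "negative reinforcement",
        ("added", "decreases"): "positive punishment",
        ("removed", "decreases"): "negative punishment",
    }
    key = (stimulus_change, behavior_change)
    if key not in names:
        raise ValueError(f"unrecognized combination: {key!r}")
    return names[key]


# Lever press turns off a loud noise, and lever pressing becomes more frequent.
print(classify_consequence("removed", "increases"))  # negative reinforcement

# A spanking follows the behavior, and the behavior becomes less frequent.
print(classify_consequence("added", "decreases"))    # positive punishment
```

Because the terms are defined by their effect on behavior, the same stimulus can fall into different cells for different individuals or situations.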

Extinction occurs when a behavior (response) that had previously been reinforced is no longer effective.

Example: a rat is first given food many times for pressing a lever, until the experimenter no longer gives out food as a reward.

The rat would typically press the lever less often and then stop. The lever pressing would then be said to be "extinguished." Reinforcement, punishment, and extinction are not terms whose use is restricted to the laboratory.

Schedules of reinforcement

Schedules of reinforcement are rules that control the delivery of reinforcement.

The rules specify either the time that reinforcement is to be made available, or the number of responses to be made, or both. Many rules are possible, but the following are the most basic and commonly used: [18] [9]

Fixed interval schedule: Reinforcement occurs following the first response after a fixed time has elapsed after the previous reinforcement.

This schedule yields a "break-run" pattern of response; that is, after training on this schedule, the organism typically pauses after reinforcement, and then begins to respond rapidly as the time for the next reinforcement approaches.

Variable interval schedule: Reinforcement occurs following the first response after a variable time has elapsed from the previous reinforcement. This schedule typically yields a relatively steady rate of response that varies with the average time between reinforcements.

Fixed ratio schedule: Reinforcement occurs after a fixed number of responses have been emitted since the previous reinforcement. An organism trained on this schedule typically pauses for a while after a reinforcement and then responds at a high rate.

If the response requirement is low there may be no pause; if the response requirement is high the organism may quit responding altogether.

Variable ratio schedule: Reinforcement occurs after a variable number of responses have been emitted since the previous reinforcement. This schedule typically yields a very high, persistent rate of response.
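The interval and ratio rules described so far can be sketched as small objects that decide, response by response, whether reinforcement is delivered. This is a minimal sketch under stated assumptions, not a standard implementation: the class names and the `respond` method are hypothetical, the start of the session is treated as the time of the "previous reinforcement", and the variable requirements are drawn from uniform distributions, which is only one of many possible choices.

```python
import random


class FixedRatio:
    """Reinforce every `ratio`-th response since the previous reinforcement."""

    def __init__(self, ratio):
        self.ratio = ratio
        self.count = 0

    def respond(self, now):
        self.count += 1
        if self.count >= self.ratio:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after a varying number of responses averaging `mean_ratio`."""

    def __init__(self, mean_ratio, rng=None):
        self.mean_ratio = mean_ratio
        self.rng = rng or random.Random()
        self.count = 0
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform over 1 .. 2*mean_ratio - 1, whose mean is mean_ratio.
        return self.rng.randint(1, 2 * self.mean_ratio - 1)

    def respond(self, now):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first response made `interval` or more after the last one."""

    def __init__(self, interval):
        self.interval = interval
        self.last = 0.0  # assumption: session start counts as a reinforcement

    def respond(self, now):
        if now - self.last >= self.interval:
            self.last = now
            return True
        return False


class VariableInterval:
    """Like FixedInterval, but the required wait varies around `mean_interval`."""

    def __init__(self, mean_interval, rng=None):
        self.mean_interval = mean_interval
        self.rng = rng or random.Random()
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform over 0 .. 2*mean_interval, whose mean is mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, now):
        if now - self.last >= self.wait:
            self.last = now
            self.wait = self._draw()
            return True
        return False


fr = FixedRatio(3)
print([fr.respond(t) for t in range(1, 7)])
# [False, False, True, False, False, True]

fi = FixedInterval(10)
print([fi.respond(t) for t in (2, 9, 11, 15, 21, 22)])
# [False, False, True, False, True, False]
```

On the ratio schedules only the number of responses matters, so `now` is ignored; on the interval schedules responding faster does not bring reinforcement sooner than the elapsed time allows.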

Continuous reinforcement: Reinforcement occurs after each response. Organisms typically respond as rapidly as they can, given the time taken to obtain and consume reinforcement, until they are satiated.

Factors that alter the effectiveness of reinforcement and punishment

The effectiveness of reinforcement and punishment can be changed.

A consequence becomes less effective once the individual is satiated with the stimulus involved. The opposite effect will occur if the individual becomes deprived of that stimulus: the effectiveness of the consequence will then increase.


If a dog is given a treat within five seconds of sitting, it will learn faster than if the treat is given after thirty seconds. Learning may be slower if reinforcement is intermittent, that is, if it follows only some instances of the same response. Responses reinforced intermittently are usually slower to extinguish than are responses that have always been reinforced. Humans and animals engage in cost-benefit analysis.

If a lever press brings ten food pellets, lever pressing may be learned more rapidly than if a press brings only one pellet. A pile of quarters from a slot machine may keep a gambler pulling the lever longer than a single quarter. Most of these factors serve biological functions.