What is Operant Conditioning and How Does it Work?
Every morning, billions of people around the world get up early and brave the traffic to get to their places of work. If they could, many of these people would avoid this daily routine.
Many people hate having to get up at 6, prepare themselves in a rush and head out for work, yet all of them do it anyway, because they know that at the end of the month, they will receive a paycheck.
What would happen if these people stopped receiving the paycheck at the end of the month? How many people would get up every morning and go to work if they were not promised a salary at the end of the month?
Very few, and the reason is something known as operant conditioning. Going to work five days a week in exchange for a salary is a great example of operant conditioning.
WHAT IS OPERANT CONDITIONING?
Operant conditioning, also referred to as Skinnerian conditioning or instrumental conditioning, is a learning method where desired and voluntary behavior is taught through the use of positive and negative incentives.
Through a system of rewards and punishments, individuals make an association between a specific behavior and the consequences of the behavior. The association of the behavior with a reward or punishment leads to a modification in the strength of the behavior.
Operant conditioning is not something new. We can all point out numerous examples of how rewards and punishments have shaped our own behavior. Growing up, we tried a number of behaviors and then learned from their consequences whether they were good or bad.
Operant conditioning plays a very powerful role in everyday learning, and we see the principles of operant conditioning at play almost every day.
Below are some examples of how a system of rewards and punishments is used to influence behavior on a daily basis:
- A sales person receiving a bonus for hitting his targets. The bonus acts as a reward, encouraging the sales person to continue hitting his targets.
- A parent giving a child a prize for excellent grades to encourage the child to continue performing well in school.
- An employee who is habitually late to work is scolded by the boss, leading to a decrease in the behavior.
- A student who remains in detention because of playing truant is likely to stop the behavior.
- Giving customers redeemable loyalty points for shopping at a specific store increases their likelihood of shopping at the same store.
Operant conditioning is based on three main assumptions. The first assumption is that any action leads to an experience that is a direct consequence of the action.
The second assumption is that the perceived quality of the consequences of an action influences the likelihood of the action being repeated.
The final assumption is that behavior is mainly influenced by external, rather than internal factors.
HOW DID OPERANT CONDITIONING COME ABOUT?
The concept of operant conditioning was first put forward by B. F. Skinner, an American psychologist, behaviorist and social philosopher. The term Skinnerian conditioning is a reference to his name. At the turn of the 20th century, psychologists had grown very interested in behaviorism.
The concept of classical conditioning had already been proposed. Behaviorists who subscribed to the classical conditioning concept believed that learning was a mental and emotional process. They believed that the best way of studying behavior and learning was by looking at the internal thoughts and motivations of an individual.
While Skinner did not deny that internal thoughts and motivations have an influence on behavior, he thought that viewing them as the key drivers of behavior was too simplistic to explain complex human behavior. Skinner theorized that the best way of understanding learning and human behavior was to look at an individual’s actions and the consequences of these actions.
In explaining his theory, Skinner came up with the term “operant conditioning.” Skinner defined an operant as any active and deliberate behavior that led to a consequence. Skinner’s theory of operant conditioning borrowed heavily from Edward Thorndike’s Law of Effect.
Thorndike’s principle stated that actions that lead to favorable outcomes have a higher probability of being repeated. On the other hand, actions that lead to unfavorable outcomes are less likely to be repeated.
Operant conditioning is based on an equally simple premise. Actions that are reinforced will be strengthened and are more likely to be repeated in future. For example, if you take some risks at work and your boss praises you for your courage, you are more likely to take another risk in future.
If you purchase from a particular store and they give you a discount, you are likely to shop from the same store again in future. In this case, receiving praise from your boss and receiving a discount from the store are positive reinforcements that encourage your behavior. The outcomes of your actions were desirable, thus strengthening the preceding actions.
Some actions, on the other hand, lead to undesirable consequences or punishment.
Such actions are weakened and are less likely to be repeated. If you took a risk at work and your boss scolded you for acting without running things through him, you will be less likely to take another risk at work.
Similarly, if you shop from a particular store and you later realize they sold you a low quality product, you are less likely to shop from them in future. In this case, the scolding from your boss and the poor quality product are undesired outcomes or punishments.
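The strengthening and weakening described above can be pictured as a probability that moves up or down after each consequence. The sketch below is only an illustration: the function name `update_probability`, the `rate` parameter and the linear update rule are assumptions made for this example and are not taken from Skinner or Thorndike.

```python
def update_probability(p: float, outcome: str, rate: float = 0.2) -> float:
    """Return the new probability of repeating a behavior.

    p       -- current probability of the behavior (0.0 to 1.0)
    outcome -- "reinforced" or "punished"
    rate    -- hypothetical learning rate; how strongly one consequence counts
    """
    if outcome == "reinforced":
        # A desirable consequence moves the probability toward 1.
        return p + rate * (1.0 - p)
    if outcome == "punished":
        # An undesirable consequence moves the probability toward 0.
        return p - rate * p
    raise ValueError("outcome must be 'reinforced' or 'punished'")


if __name__ == "__main__":
    p = 0.5  # assumed starting point: you take a risk at work half the time
    for outcome in ["reinforced", "reinforced", "punished"]:
        p = update_probability(p, outcome)
        print(f"{outcome:>10}: probability of taking another risk = {p:.2f}")
```

Under these assumptions, praise from your boss nudges the probability of taking another risk upward, while a scolding nudges it back down.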
To test his theory, Skinner invented the operant conditioning chamber, also known as the Skinner box, which he used to conduct experiments using animals. The operant conditioning chamber allowed Skinner to isolate small animals, such as rats and pigeons, and then expose them to carefully controlled stimuli.
Skinner also came up with another invention known as the cumulative recorder, which allowed him to keep a record of the response rates (the number of times an animal pressed a key or bar inside the Skinner box).
HOW OPERANT CONDITIONING WORKS
Skinner stated that individuals (both humans and animals) display two key types of behaviors. The first type is known as respondent behaviors. Respondent behavior refers to actions that occur automatically and on reflex. You don’t need any learning in order to display respondent behavior.
A good example of respondent behavior occurs when you touch something hot. Without thinking about it, you immediately draw your hand back from the hot surface.
Pavlov’s classic experiments with dogs are another great example of respondent behavior. Dogs automatically and involuntarily salivate when presented with food. By ringing a bell every time before presenting food to his dogs, Pavlov formed an association between the ringing of a bell and the presentation of food, and his dogs learned to salivate when they simply heard a bell, even if no food was presented.
Skinner noted that classical conditioning was good at explaining how respondent behaviors affected learning. However, not all learning is based on respondent behaviors. According to Skinner, the greatest learning came from voluntary actions and their consequences.
The second type of behavior that Skinner identified is known as operant behavior. Skinner defined operant behaviors as voluntary behaviors that act upon the environment, resulting in a consequence.
Unlike respondent behaviors, operant behaviors are under our conscious control, and can be learned voluntarily. According to Skinner, the outcomes of our actions have a major impact on the process of learning operant behaviors.
COMPONENTS OF OPERANT CONDITIONING
We noted earlier that operant conditioning is based on two major factors: reinforcement and punishment. Let us take a look at these two factors.
Reinforcement refers to any environmental consequence to an action that increases the likelihood of the action being repeated. Reinforcement strengthens behavior. There are two types of reinforcement:
Positive reinforcement: This refers to consequences where a favorable event or outcome is added following a certain behavior, leading to the strengthening of the behavior. For example, when you go the extra mile and receive praise from your boss, this is an example of positive reinforcement.
To show how positive reinforcement works, Skinner placed a hungry rat in the operant conditioning chamber. On one side of the chamber was a lever that dropped food pellets into the chamber when pressed. As the rat moved around the box, it would at some point accidentally press the lever, causing a pellet of food to drop into the chamber immediately.
Over time, the rat would learn that pressing the lever led to food being released, and it quickly learned to go directly to the lever whenever it was placed in the chamber. Receiving food every time it pressed the lever acted as positive reinforcement, ensuring that the rat would keep pressing the lever again and again.
Negative reinforcement: This refers to consequences where an unfavorable event or outcome is removed following a certain behavior. In this case, the behavior is strengthened not by the desire to get something good, but rather by the desire to get out of an unpleasant condition.
A good example of negative reinforcement is a teacher promising to exempt students who have perfect attendance from the final test. The test is something unpleasant for the students, but if they display certain behavior (perfect attendance), they won’t have to sit the test. This encourages them to attend all classes.
Such responses are referred to as negative reinforcement because the removal of the unfavorable event or outcome is rewarding to the individual. While they have not actually received anything, not sitting a test can still be seen as a reward.
To show how negative reinforcement works, Skinner placed a rat in the operant conditioning chamber and then delivered an unpleasant electric current through the floor of the chamber. As the rat moved about in discomfort, it would accidentally knock the lever, switching off the electric current immediately.
Over time, the rat would learn that it could escape from the unpleasant electric current by pressing the lever, and it would start going directly to the lever every time the current was switched on.
Punishment refers to any adverse or unwanted environmental consequence to an action that reduces the probability of the action being repeated. In other words, punishment weakens behavior. There are two types of punishment:
Positive punishment: This refers to consequences where an unfavorable or unpleasant event or outcome is presented or applied following a certain behavior in order to discourage the behavior.
For instance, when you get fined for a traffic infraction, that is an example of positive punishment. An unfavorable outcome (payment of the fine) is applied to discourage you from committing the infraction again.
Negative punishment: This refers to consequences where a favorable or pleasant outcome is removed following a certain behavior. This can also be referred to as punishment by removal. An example of negative punishment is where a parent denies a child the opportunity to watch television following misbehavior by the child.
Sometimes, it can be challenging to distinguish between punishment and negative reinforcement. What you need to remember is that reinforcement (both positive and negative) is meant to strengthen behavior, while punishment is used to weaken behavior.
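The four types of consequences come down to two questions: was something added or removed, and does the behavior get stronger or weaker? A minimal sketch of that decision is shown below. The function name `classify_consequence` and its string arguments are hypothetical and chosen only for this illustration.

```python
def classify_consequence(stimulus_change: str, behavior_effect: str) -> str:
    """Name the type of consequence from two answers.

    stimulus_change -- "added" if something is presented after the behavior,
                       "removed" if something is taken away
    behavior_effect -- "strengthens" or "weakens" the behavior
    """
    if stimulus_change not in ("added", "removed"):
        raise ValueError("stimulus_change must be 'added' or 'removed'")
    if behavior_effect not in ("strengthens", "weakens"):
        raise ValueError("behavior_effect must be 'strengthens' or 'weakens'")

    # "Positive" and "negative" describe adding or removing, not good or bad.
    sign = "positive" if stimulus_change == "added" else "negative"
    # Reinforcement strengthens behavior; punishment weakens it.
    kind = "reinforcement" if behavior_effect == "strengthens" else "punishment"
    return f"{sign} {kind}"


if __name__ == "__main__":
    # Examples taken from the text above.
    print(classify_consequence("added", "strengthens"))    # praise from your boss
    print(classify_consequence("removed", "strengthens"))  # exemption from the final test
    print(classify_consequence("added", "weakens"))        # traffic fine
    print(classify_consequence("removed", "weakens"))      # television taken away
```

Seen this way, an exemption from a test and the loss of television time both remove something, but only the first strengthens behavior, which is why it counts as reinforcement and not punishment.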
It is also good to note that reinforcement is more effective than punishment at changing behavior, for a number of reasons. These include:
- Punishment merely suppresses behavior. The behavior is not forgotten, and once the punishment is no longer present, the behavior might return.
- Punishment does not necessarily lead to desired behavior. It only discourages unwanted behavior.
- Punishment can lead to increased aggression – it teaches the individual that aggression is an acceptable way of dealing with problems.
- Punishment leads to fear, which can lead to other unwanted behavior. For instance, spanking a child for not performing well can lead to fear of school.
Apart from reinforcement and punishment, behaviorists discovered that operant conditioning is also influenced by reinforcement schedules.
A reinforcement schedule refers to the rules that determine when and how often behavior reinforcements are delivered.
Reinforcement schedules have an impact on how quickly behaviors are learned and the strength of the acquired behavior.
There are several different delivery schedules that can be used to influence the operant conditioning process. These include:
Continuous reinforcement: This is a schedule where a reinforcement is immediately delivered every time a response occurs. For instance, a food pellet is dropped immediately every time the lever is pressed. With continuous reinforcement, new behaviors are learned relatively quickly.
However, the response rate (the rate at which the rat presses the lever) is quite low. The learned behavior is also forgotten very quickly once reinforcement stops.
Fixed ratio reinforcement: This is a schedule where the reinforcement is delivered only after a behavior or response has occurred a specified number of times. For instance, a pellet of food is released every fifth time the rat presses the lever. With fixed ratio schedules, both the response rate and the extinction rate (the rate at which the learned behavior is forgotten) are medium.
Fixed interval reinforcement: This refers to a schedule where reinforcement is delivered after a specified interval of time, provided the correct response has been made at least once. The response rate is medium, though the responses tend to increase as the interval approaches and slow down following the delivery of the reinforcement.
Variable ratio reinforcement: This refers to a reinforcement schedule where reinforcement is delivered after an unpredictable number of responses. A good example of variable ratio reinforcement is gambling. Variable ratio reinforcement results in a very high response rate and a very slow extinction rate. This explains why gambling becomes addictive.
Variable interval reinforcement: This refers to a reinforcement schedule where reinforcement is delivered after an unpredictable interval of time has elapsed, provided the correct response has been made at least once. Variable interval reinforcement also results in a very high response rate and a very slow extinction rate.
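The five schedules differ only in the rule that decides whether a given response earns a reward. The sketch below expresses each rule as a small function. It is a simplified illustration, not a model of Skinner's apparatus: the function names, the use of seconds as the time unit, and the choice of uniform random draws for the "unpredictable" schedules are all assumptions made for this example. Continuous reinforcement is treated as a fixed ratio of one.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response (n=1 gives continuous reinforcement)."""
    count = 0

    def on_response(t):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return on_response


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses averaging mean_n."""
    remaining = rng.randint(1, 2 * mean_n - 1)

    def on_response(t):
        nonlocal remaining
        remaining -= 1
        if remaining <= 0:
            remaining = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return on_response


def fixed_interval(seconds):
    """Reinforce the first response made once `seconds` have passed."""
    last = 0.0

    def on_response(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False

    return on_response


def variable_interval(mean_seconds, rng):
    """Reinforce the first response after an unpredictable wait."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)

    def on_response(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return on_response


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so that repeated runs match
    schedules = {
        "continuous": fixed_ratio(1),
        "fixed ratio (5)": fixed_ratio(5),
        "variable ratio (~5)": variable_ratio(5, rng),
        "fixed interval (10 s)": fixed_interval(10),
        "variable interval (~10 s)": variable_interval(10, rng),
    }
    # The rat presses the lever once per second for 30 seconds.
    for name, schedule in schedules.items():
        rewarded = [t for t in range(1, 31) if schedule(t)]
        print(f"{name:>26}: rewarded presses at {rewarded}")
```

Running it prints, for each schedule, which lever presses were rewarded. The ratio schedules count presses, while the interval schedules only look at how much time has passed since the last reward.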
Apart from reinforcement schedules, there are a few other factors that influence the effectiveness of reinforcement and punishment. These include:
Satiation/Deprivation: Reinforcements lead to behavior change because of the individual’s craving for the reward/reinforcement. However, if the individual has received enough of the reward to satiate his or her craving, the individual will be less inclined to display the desired behavior.
When the individual has been deprived of the reward, on the other hand, the effectiveness of the reinforcement will be increased due to the increased craving for the reward. This explains why Skinner used hungry rats in his experiments.
Immediacy: Learning occurs faster when the consequence (reinforcement or punishment) is delivered immediately after an action or behavior. The more the consequence is delayed, the more ineffective it becomes.
Consistency: Reinforcements that are consistently delivered following every correct response lead to faster learning times. Intermittent delivery of reinforcements leads to slower learning, but then the learned behavior is harder to extinguish compared to when reinforcements are consistently delivered after each correct response.
Size: The amount of reinforcement or punishment also has an effect on the effectiveness of the consequence. When the reward is too little, it might not seem worthwhile to go through a lot of effort displaying the desired behavior for such a small reward.
Similarly, when the punishment is too small, the benefits of engaging in the unwanted behavior might outweigh the discomfort of experiencing the punishment.
APPLICATIONS OF OPERANT CONDITIONING AT THE WORKPLACE
Operant conditioning can be applied at the workplace in various ways, from instituting corporate culture and addressing interactions between employees to helping an organization achieve its annual targets.
Below are some ways operant conditioning can be useful at the workplace:
Positive reinforcement, one of the key components of operant conditioning, can be used to increase productivity at the workplace.
Providing employees with positive reinforcement, through verbal praise and through incentives such as bonuses, generous perks and pay increases, can motivate employees to work harder, leading to increased productivity for the entire organization.
Company culture is very important. It affects everything, from employee satisfaction to performance and how your organization is perceived in the media.
To cultivate a great company culture, managers should identify the behaviors that need to be encouraged within the workplace and those that need to be discouraged.
They can then come up with a system of rewards and punishments that are in line with the company’s desired culture.
Having your employees work in teams is a great way of harnessing the benefits of both reinforcement and punishment. Working in teams can help your employees cover each other’s weaknesses and achieve their targets, helping them receive praise or promotions (reinforcement).
At the same time, if certain members of the team are not working as hard as they should, they will incur negative backlash (punishment) from their team members, thus discouraging them from slacking off in future.
This way, working as a team provides reinforcement for good performance and hard work and at the same time provides punishment for those who go against the grain.
Using Sales as a Reward
Reinforcement is also commonly used to boost performance in sales departments. Many businesses provide bonuses for sales people who hit their targets. The bonus acts as positive reinforcement for achieving their targets.
This motivates the sales people to learn everything they need to do in order to close more sales, hit their targets and get the bonus.
Skinner’s theory of operant conditioning has been an important tool in helping psychologists understand how individuals learn and modify their behavior. The theory holds that our environment and its reactions to our actions have a major influence on our behavior.
Skinner’s theory of operant conditioning is something we use in daily life to either encourage behavior – by providing reinforcements – or to discourage behavior – by meting out punishment.
You can see examples of operant conditioning in various spheres of daily life, from teaching your children good behavior and pet training to encouraging good performance at work and teaching good discipline in the military.