What does it mean to stretch the ratio in dog training? To answer that, it helps to understand how dogs are trained in the initial stages of learning. Positive reinforcement is a powerful and effective training method: it strengthens desired behaviors while making training fun and rewarding, without resorting to pain, punishment or intimidation. To better understand what stretching the ratio means, we will therefore take a peek at what happens during the initial stages of learning and how we can reduce our reliance on treats while keeping the dog motivated and happy.
The Power of Positive Reinforcement
When we apply positive reinforcement, we are basically adding a consequence that, from the dog's perspective, is rewarding enough to entice him to repeat the behavior.
The power of positive reinforcement is that it results in behaviors increasing and strengthening.
So for example, if we are training our dog to sit and give him a treat every time his bottom touches the floor, with time and practice, we will see an increase of the sitting behavior.
When we provide our dogs with a reward for every desired response, we are using what is known as a Continuous Schedule of Reinforcement (CRF). This schedule is not limited to dog training. We can see plenty of examples of this happening in our everyday lives.
Every time we press the power button on our remote, the TV turns on (when the battery is not dead, of course); every time we turn the knob of our gas stove, the burner lights up; every time we insert a dollar bill in the vending machine, it releases our favorite soda.
The Problems With Using CRF
While a continuous schedule works great when we first start training a new behavior, if we keep rewarding the dog for every correct response we will eventually end up rewarding below-average responses too. For example, when we reward our dog for every correct sit, some of those sits will most likely be slow-to-respond sits, and we may even see sloppy sits (with the legs spread out to the side) mixed in every now and then.
By continuing to dole out treats for every single correct response, we remove opportunities for improvement and the quality of the behavior suffers. On top of that, the longer the dog is rewarded for every correct response, the harder it becomes to start phasing out those rewards, because the dog has relied on them for so long. The result is a dog who expects a reward every single time and risks getting frustrated when he doesn't get it.
"If you reward a dog for every correct response, approximately 50% of the time you will reward the dog for above-average responses and 50% of the time you will reward a dog for below-average responses. It is simply too silly to reward a dog for below-average responses." ~Ian Dunbar
A Bit of a Stretch
Stretching the ratio is the procedure used to gradually increase the number of responses required for the dog to earn reinforcement (rewards that increase/strengthen behaviors). We don't want to phase out the food rewards completely, otherwise the behavior risks becoming extinct, eventually disappearing from the dog's behavior repertoire.
So at some point, once the dog shows signs of responding at a steady rate, it's time to stretch the ratio and start working our way from a continuous schedule to an intermittent one, where the behavior is rewarded on some occasions and not on others, in an unpredictable pattern. This works great for maintaining the behavior and preventing it from becoming extinct.
This schedule indeed leads to permanence of the behavior. An intermittent schedule also works great for gradually thinning out those food rewards, so that the dog doesn't rely on them too much. Yes, gradually is the important keyword here!
"Stretching the ratio: gradually increasing the number of times a behavior must be performed to qualify for reinforcement. May produce ratio strain if done incorrectly." ~Science of Behavior
Preventing Ratio Strain
Just like an elastic band may break if you stretch it too much, your dog's behavior may start breaking apart if you stretch the ratio too much. Ratio strain is the technical term used to describe what happens when a dog's pattern of responding begins to break down because the ratio has been stretched too far or too fast.
It's the classic cliché seen in workplaces across the globe when workers start grumbling because they are overworked and underpaid.
So asking too much while offering a low rate of reinforcement can cause problems: dogs may get frustrated, show displacement behaviors and give up.
Just imagine what a person would do if the vending machine didn't deliver the soda after he inserted the dollar bill. Most likely, he would try pushing the buttons and possibly even kick the machine!
So to prevent this from happening, we can stretch the ratio very gradually, taking care not to thin the reinforcement schedule too quickly and not to make each jump in the number of responses required per reward too large.
The process of stretching the ratio must therefore be very gradual, as we're shaping persistence. We would start by giving the dog a treat for every successful sit (CRF); then, as the dog responds at a steady rate, we can start giving the treat every other sit; then we can start rewarding randomly, for example the third sit, then the second, then the fifth, and so on. This is a good time to start raising criteria, raising the bar and paying attention to what the dog does so we can start picking out only the best sits to reward, so that we improve quality. Once we have successfully stretched the ratio, we should see a dog who is on his toes and eager to work for that random reward, yes, just like a gambler playing the slots in Vegas!
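For readers who like to see the idea laid out step by step, here is a minimal Python sketch of the progression described above. It is only an illustration, not a training protocol: the function name `stretched_schedule`, the stage values and the number of trials per stage are all assumptions made for the example. Each stage rewards the dog after a random number of correct responses whose average equals the stage value, so a stage of 1 is a continuous schedule and larger values are thinner, less predictable schedules.

```python
import random


def stretched_schedule(stages, trials_per_stage, seed=0):
    """Return a list of booleans, one per correct response (True = reward it).

    `stages` is a list of average ratios, e.g. [1, 2, 3]. Within a stage, the
    number of responses needed for the next reward is drawn at random from
    1 .. 2*ratio - 1, so a ratio of 1 always means "reward every response".
    All names and numbers here are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    plan = []
    for ratio in stages:
        needed = rng.randint(1, 2 * ratio - 1)
        count = 0
        for _ in range(trials_per_stage):
            count += 1
            if count >= needed:
                plan.append(True)   # this response earns the treat
                count = 0
                needed = rng.randint(1, 2 * ratio - 1)
            else:
                plan.append(False)  # correct response, but no treat this time
    return plan


# Ten sits on a continuous schedule, then ten at an average of every
# second sit, then ten at an average of every third sit.
plan = stretched_schedule(stages=[1, 2, 3], trials_per_stage=10)
for i in range(0, len(plan), 10):
    print("".join("T" if rewarded else "." for rewarded in plan[i:i + 10]))
```

Each printed row is one stage, with `T` marking a rewarded sit. The first row is all treats, and the later rows show treats thinning out gradually rather than all at once. In real training the move to the next stage would depend on the dog responding at a steady rate, not on a fixed count of trials.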
"Casinos, believe me, use the power of the variable ratio schedule to develop behaviors, such as playing slot machines, that are very resistant to extinction, despite highly variable and unpredictable reinforcement."~Karen Pryor
An Up and Down Process
Moving from a continuous schedule to an intermittent one is not a clear-cut process like turning on a light switch. For example, when your dog learns to sit reliably in your living room (say, at least eight times out of ten), you may start giving treats randomly, but then, once you're out in the yard, where there are more distractions around, your best bet is to move back to a continuous schedule temporarily until your dog responds reliably in spite of those distractions.
Also, when training a behavior through shaping (a training method that entails rewarding successive approximations of the final behavior), you'll find yourself rewarding continuously at times and variably at others as you establish new criteria.
"Reinforcement may go from predictable to a little unpredictable back to predictable, as you climb, step by step, toward your ultimate goal...Marian Breland Bailey told me she called this a "shaping schedule." It's a natural part of the shaping process."~ Karen Pryor
Tip: If you pair each reward with praise (e.g., "good boy!"), your dog will associate those words with something good, so that when you're not giving treats, praise will still carry enough value to communicate a job well done!
Did you know? Stretching the ratio is astutely used in gambling establishments. Card sharks will let you win frequently during the early stages of play and then once you're hooked, they'll stretch the ratio gradually and then start winning more and more of the games, explains Paul Chance in the book "Learning and Behavior."
Variable Ratio Reinforcement Schedule or Reinforcement Variety?
Other than gradually stretching the ratio, are there any other ways to prevent or reduce ratio strain?
Recently, Ken Ramirez pointed out the use of "reinforcement variety." Turns out, what many trainers use is reinforcement variety rather than a variable ratio reinforcement schedule.
Reinforcement variety is preferable because it helps prevent the frustration associated with ratio strain and the process of moving to a variable schedule. What's the difference between the two? Let's take a closer look.
A variable ratio reinforcement schedule, as already mentioned, entails reinforcing responses only some of the time. Mary Burch and Jon S. Bailey, in the book "How Dogs Learn," compare the unpredictability of reinforcement under a variable ratio schedule to the way slot machines, fishing and the lottery work. This means that at times no reinforcement is delivered at all, and this can cause frustration, perhaps in part because dogs know they are performing correctly and therefore come to expect a reward.
Reinforcement variety, by contrast, offers the opportunity to reinforce the dog every time; only the type of reinforcement varies from one time to another and doesn't always involve food. You can switch between different types of primary reinforcers that dogs are naturally drawn to, such as food (chicken, hot dogs, freeze-dried liver) and natural activities the dog perceives as reinforcing (going out in the yard to explore and exercise, playing with other dogs, chasing a tossed ball, sniffing a bush).
Some reinforcement substitutes can also be mixed in and these include several secondary reinforcers most of us are familiar with (praise, pats, belly rubs or even the opportunity to perform another behavior). Such reinforcers need to have a strong conditioning history consisting of being paired consistently with primary reinforcers before being used on their own and they also need to be maintained to preserve their reinforcing power.
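To make the contrast with a variable ratio schedule concrete, here is a minimal Python sketch of reinforcement variety. It is an illustration under stated assumptions: the function name `reinforcement_variety` and the two lists are hypothetical, with the list entries borrowed from the examples in the text. The point is simply that every correct response receives some reinforcer and only the type changes.

```python
import random

# Hypothetical pools, based on the examples mentioned in the text.
PRIMARY = ["chicken", "hot dog", "freeze-dried liver",
           "chase a tossed ball", "sniff a bush"]
SECONDARY = ["praise", "pats", "belly rub"]


def reinforcement_variety(n_responses, seed=0):
    """Pick one reinforcer for each correct response.

    Unlike a variable ratio schedule, no response goes unrewarded;
    what varies is the kind of reward, and it is not always food.
    """
    rng = random.Random(seed)
    pool = PRIMARY + SECONDARY
    return [rng.choice(pool) for _ in range(n_responses)]


rewards = reinforcement_variety(8)
for number, reward in enumerate(rewards, start=1):
    print(f"sit {number}: {reward}")
```

This sketch treats all reinforcers as equally likely for simplicity. As noted above, secondary reinforcers only work if they have been consistently paired with primary ones, so in practice a trainer would weight the choices according to what the individual dog actually finds rewarding.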
Did you know? Although playing is a primary reinforcer, toys are considered secondary reinforcers because to a dog, a motionless ball is not rewarding in itself, but becomes valuable once it's associated with play (being wiggled, being tossed).
- Clicker Training, Extinction and Intermittent Reinforcement, retrieved from the web on Aug 10th, 2016
- Clicker Training, Reinforce Every Behavior? retrieved from the web on Aug 10th, 2016
- The Whole Dog Journal, Common Training Mistakes, retrieved from the web on Aug 10th, 2016
- Learning and Behavior: Active Learning Edition, 6th Edition, by Paul Chance, Cengage Learning (February 22, 2008)