CONCURRENT SCHEDULES OF REINFORCEMENT



Introduction & Definition

Concurrent schedules of reinforcement are a fundamental experimental paradigm in operant conditioning, providing insight into how organisms allocate behavior when multiple response options are available simultaneously. The arrangement differs from simple or multiple schedules in that the subject chooses continuously between two or more distinct reinforcement contingencies, each with its own requirements for the delivery of a reinforcer. A concurrent schedule is defined as an arrangement in which two or more reinforcement schedules, each correlated with an independent response requirement, are in effect at the same time, with the organism free to switch between the alternatives. Research using these schedules analyzes the patterns of responding that emerge in order to understand the behavioral mechanisms governing preference and choice, particularly how the distribution of responses aligns with the distribution of reinforcement obtained across the options. The arrangement resembles many real-world situations in which an organism, human or animal, must decide how to allocate finite time and energy among competing activities that yield different probabilistic outcomes, such as choosing between two jobs, two sources of food, or two methods of study, which makes concurrent schedules central to the predictive study of behavior.

Crucially, the inherent structure of concurrent schedules demands that the experimental subject actively allocates their behavior across the available options, providing researchers with a robust methodology for studying choice behavior in a controlled environment. For instance, an animal in a laboratory setting might be presented with two levers (Lever A and Lever B), where pressing Lever A operates on a Variable Interval 30-second (VI 30s) schedule, while pressing Lever B operates on a Variable Interval 60-second (VI 60s) schedule. The schedules operate simultaneously and independently; the subject is free to switch between pressing Lever A and pressing Lever B at any moment. The resulting distribution of lever presses relative to the distribution of reinforcers received provides the core data for analysis. The theoretical framework guiding this analysis often revolves around the powerful quantitative principle known as the Matching Law, which predicts a proportional relationship between relative response rate and relative reinforcement rate. This relationship holds that the proportion of responses directed toward one option will approximately match the proportion of total reinforcement derived from that same option, serving as the cornerstone for understanding how organisms optimize their behavioral investment in environments offering simultaneous choices.

Note that a concurrent schedule does not require the subject to respond on every available alternative. Subjects are selective, often allocating most responses to the option yielding the highest rate of reinforcement, or distributing responses according to the relative richness of the schedules. The behavioral output reflects the organism’s continuous evaluation of the environment, a dynamic decision-making process in which the momentary value and availability of reinforcement determine the probability of switching or staying. This contrasts sharply with multiple schedules, in which the component schedules are presented one at a time in alternation, each signaled by a distinct stimulus, so the subject cannot choose which schedule is in effect at a given moment. Simultaneous presentation and freedom of choice are the defining characteristics of concurrent schedules, making them the primary tool for investigating choice and behavioral economics in the laboratory, where researchers can manipulate variables such as reinforcer magnitude, quality, and delay.

Historical Context and Foundation in Operant Conditioning

The systematic investigation of choice behavior through concurrent schedules emerged directly from the broader framework of B.F. Skinner’s operant conditioning, which emphasized the relationship between behavior and its consequences. While Skinner and his early colleagues primarily focused on single-schedule control (e.g., Fixed Ratio, Variable Interval) to isolate fundamental principles of learning, it quickly became apparent that real-world behavior rarely occurs in isolation. Organisms are perpetually faced with choices between competing sources of reinforcement. The need to model these complex decision-making scenarios necessitated the development of experimental procedures that allowed for simultaneous alternatives, leading to the formal adoption and standardization of the concurrent schedule arrangement. Early experiments using these setups, often involving pigeons pecking at two keys, allowed researchers to move beyond simple response acquisition and extinction, delving into the more nuanced area of behavioral preference and resource allocation, paving the way for quantitative models of choice.

Prior to the detailed quantitative analyses provided by concurrent schedules, understanding choice was largely descriptive. However, the introduction of simultaneous alternatives allowed for precise measurement of relative preference. The foundation of this work was significantly advanced by the research of J. A. Nevin and particularly Richard Herrnstein, whose seminal work in the 1960s crystallized the quantitative relationship between behavior and reinforcement distribution. Herrnstein’s formulation of the Matching Law, discussed in detail below, elevated the analysis of concurrent schedules from a simple descriptive study to a powerful predictive science. This shift demonstrated that complex behavioral phenomena, such as preference and choice, could be accurately modeled using simple mathematical equations derived directly from empirical data collected under concurrent schedules, solidifying the importance of this specific experimental design in validating and extending operant principles into the domain of behavioral economics.

The theoretical underpinnings of concurrent schedules are rooted in the fundamental assumption that behavior is motivated by the maximization of reinforcement. Although organisms may not achieve perfect maximization in every moment, the overall pattern of behavior observed under concurrent schedules suggests a dynamic process of optimization. The subject constantly samples the environment, assessing the profitability of each available option. When a subject switches its response from Schedule A to Schedule B, it is effectively driven by the momentary perception that Schedule B currently offers a higher probability of reinforcement or a better overall payoff. This constant sampling and switching behavior ensures that, over time, the distribution of responses approximates the distribution of reinforcement, a principle that challenged earlier, non-quantitative theories of motivation and learning by providing a powerful, testable mathematical framework for behavioral allocation.

The Mechanics and Types of Concurrent Schedules

The core mechanism of concurrent schedules involves the independent operation of at least two distinct reinforcement contingencies. These contingencies are usually implemented using standard schedules of reinforcement, such as Variable Interval (VI), Fixed Interval (FI), Variable Ratio (VR), or Fixed Ratio (FR) schedules. While any combination is theoretically possible, research focusing on the Matching Law overwhelmingly utilizes combinations of VI schedules (e.g., Conc VI 30s VI 60s) because VI schedules generate steady, stable response rates that are largely free from the characteristic pausing associated with fixed schedules, thus providing cleaner data for analyzing choice allocation. The use of VI schedules ensures that reinforcement is always available stochastically, meaning that the organism cannot predict the exact moment of the next available reinforcer, forcing a continuous sampling strategy across both options.

A key structural feature necessary for the accurate assessment of choice under concurrent schedules is the implementation of the Changeover Delay (COD). Without a COD, subjects often engage in rapid, highly frequent switching between the two available options—a phenomenon known as “superstitious switching” or “pacing.” If a subject could switch rapidly without penalty, they might maximize reinforcement simply by ensuring they respond immediately after a reinforcer becomes available on either side, regardless of the effort required. The COD is a brief time period (e.g., 1.5 to 3 seconds) immediately following a switch from one schedule to the other, during which no reinforcement can be delivered, even if it is currently available on the newly chosen option. This delay effectively penalizes rapid switching, forcing the subject to commit to an option for a short period before being eligible for reinforcement. The COD ensures that the measured response allocation genuinely reflects preference for the schedule requirements, rather than an artifact of the physical ease of switching, yielding data that more accurately reflects the relative profitability of the two schedules being compared.
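The changeover-delay rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any particular laboratory control program: the names `VISchedule`, `ConcurrentWithCOD`, `vi_intervals`, and `cod_s` are hypothetical, the VI timer is assumed to hold an armed reinforcer until it is collected, and the very first response is assumed not to count as a changeover.

```python
import itertools
import random
from dataclasses import dataclass, field
from typing import Dict, Iterator, Optional


def vi_intervals(mean_s: float, rng: random.Random) -> Iterator[float]:
    """Yield exponentially distributed intervals with the given mean (seconds).

    An exponential distribution is one common way to approximate a VI
    schedule; it is an assumption here, not something the text specifies.
    """
    while True:
        yield rng.expovariate(1.0 / mean_s)


@dataclass
class VISchedule:
    """One interval timer: a reinforcer is armed once the interval elapses
    and stays armed until an eligible response collects it."""
    intervals: Iterator[float]
    next_setup: float = field(init=False)

    def __post_init__(self) -> None:
        self.next_setup = next(self.intervals)

    def collect(self, now: float) -> bool:
        """Return True and re-arm the timer if a reinforcer is available."""
        if now < self.next_setup:
            return False
        self.next_setup = now + next(self.intervals)
        return True


class ConcurrentWithCOD:
    """Two or more independent schedules plus a changeover delay (COD)."""

    def __init__(self, schedules: Dict[str, VISchedule], cod_s: float) -> None:
        self.schedules = schedules
        self.cod_s = cod_s
        self.current: Optional[str] = None
        self.switched_at = float("-inf")

    def respond(self, option: str, now: float) -> bool:
        """Register a response at time `now`; return True if it is reinforced."""
        if self.current is not None and option != self.current:
            self.switched_at = now  # a changeover restarts the COD clock
        self.current = option
        if now - self.switched_at < self.cod_s:
            return False  # inside the COD: any armed reinforcer is withheld
        return self.schedules[option].collect(now)


# Deterministic demo (fixed 10 s intervals stand in for a VI so the result
# is reproducible); use vi_intervals(...) for variable intervals.
demo = ConcurrentWithCOD(
    {"A": VISchedule(itertools.repeat(10.0)), "B": VISchedule(itertools.repeat(10.0))},
    cod_s=2.0,
)
outcomes = [
    demo.respond("A", 10.0),  # reinforcer armed on A, no changeover yet
    demo.respond("B", 11.0),  # switch to B: COD starts, reinforcer withheld
    demo.respond("B", 12.0),  # still within the 2 s COD
    demo.respond("B", 13.0),  # COD elapsed, armed reinforcer on B is collected
]
print(outcomes)  # [True, False, False, True]
```

In the demo, the reinforcer on option B is already armed when the subject switches, yet it is delivered only after the subject has stayed on B for the full delay, which is the penalty on rapid switching that the paragraph describes.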

The experimental notation for concurrent schedules is standardized to indicate the schedules operating simultaneously. For example, a schedule designated as Conc FR 10 VI 60s indicates a concurrent schedule where one response option operates on a Fixed Ratio 10 schedule (requiring 10 responses for reinforcement) and the other operates on a Variable Interval 60-second schedule (reinforcement available on average every 60 seconds). Experimenters manipulate various parameters to study their effect on choice:

  • Reinforcement Rate: Manipulating the interval or ratio requirement (e.g., Conc VI 30s VI 120s).
  • Reinforcer Quality: Offering different types of rewards (e.g., food pellets versus sugar water).
  • Reinforcer Magnitude: Varying the amount of the reward (e.g., one pellet versus three pellets).

By systematically varying these parameters, researchers can build precise quantitative models predicting how changes in reinforcement characteristics translate into changes in behavioral allocation, providing rigorous empirical support for the principles governing choice and preference.
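As an illustration of the notation, the following Python sketch splits a label such as Conc FR 10 VI 60s into its component schedules. The function name `parse_concurrent`, the `Component` record, and the exact label grammar accepted by the regular expression are assumptions made for this example; published labels vary in punctuation and spacing.

```python
import re
from typing import List, NamedTuple


class Component(NamedTuple):
    kind: str     # "FR", "VR", "FI", or "VI"
    value: float  # responses for ratio schedules, seconds for interval schedules
    unit: str     # "responses" or "s"


# A schedule code, a number, and an optional trailing "s" for seconds.
_COMPONENT = re.compile(r"\b(FR|VR|FI|VI)\s*(\d+(?:\.\d+)?)\s*(s)?\b", re.IGNORECASE)


def parse_concurrent(label: str) -> List[Component]:
    """Parse a label like 'Conc FR 10 VI 60s' into its component schedules."""
    text = label.strip()
    if not text.lower().startswith("conc"):
        raise ValueError("label must start with 'Conc'")
    components = []
    for kind, value, _seconds_suffix in _COMPONENT.findall(text):
        kind = kind.upper()
        unit = "responses" if kind in ("FR", "VR") else "s"
        components.append(Component(kind, float(value), unit))
    if len(components) < 2:
        raise ValueError("a concurrent schedule needs at least two components")
    return components


parsed = parse_concurrent("Conc FR 10 VI 60s")
print(parsed)
```

For the label above, the first component is a Fixed Ratio requiring 10 responses and the second is a Variable Interval with a 60-second mean.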

The Matching Law: Quantifying Choice Behavior

The most pivotal theoretical development stemming from the study of concurrent schedules is Richard Herrnstein’s Matching Law (1961). The Matching Law is a quantitative rule that describes the relationship between the relative rate of responding on a particular choice alternative and the relative rate of reinforcement obtained from that alternative. Mathematically, the basic Matching Law is expressed as $R_1 / (R_1 + R_2) = r_1 / (r_1 + r_2)$, where $R_1$ and $R_2$ represent the rates of responding on option 1 and option 2, respectively, and $r_1$ and $r_2$ represent the rates of reinforcement obtained from option 1 and option 2. In simpler terms, the proportion of responses directed toward Option 1 equals the proportion of total reinforcement derived from Option 1. This principle suggests that organisms distribute their behavior across simultaneous options in direct proportion to the payoff received from those options, demonstrating a remarkable sensitivity to relative reinforcement frequency.
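A short Python sketch makes the arithmetic concrete for a Conc VI 30s VI 60s arrangement. The function name `matching_proportion` is hypothetical, and the example assumes that the obtained reinforcement rates equal the scheduled ones (120 and 60 reinforcers per hour), which holds only approximately in practice.

```python
def matching_proportion(r1: float, r2: float) -> float:
    """Predicted share of responses on option 1 under strict matching:
    R1 / (R1 + R2) = r1 / (r1 + r2)."""
    total = r1 + r2
    if total <= 0:
        raise ValueError("total reinforcement rate must be positive")
    return r1 / total


# Scheduled reinforcers per hour (assumed equal to obtained rates).
rate_vi30 = 3600 / 30  # 120 per hour
rate_vi60 = 3600 / 60  # 60 per hour

share_vi30 = matching_proportion(rate_vi30, rate_vi60)
print(f"Predicted share of responses on the VI 30s option: {share_vi30:.3f}")
```

Under these assumptions, strict matching predicts that about two thirds of responses go to the VI 30s option.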

While the basic form describes ideal, perfect matching, empirical data often require a more general, or “generalized,” matching equation with two parameters that account for observed deviations: sensitivity and bias. The generalized Matching Law is typically expressed in logarithmic form: $\log(R_1/R_2) = a \cdot \log(r_1/r_2) + \log(c)$. In this equation, the parameter $a$ represents sensitivity (the slope of the line in logarithmic coordinates), indicating how strongly the organism’s behavior ratio changes with the reinforcement ratio. If $a$ equals 1 and $c$ equals 1, perfect matching occurs. If $a$ is less than 1, the organism exhibits undermatching, meaning the behavioral ratio is less extreme than the reinforcement ratio (e.g., they respond more equally than the reinforcement distribution dictates). The parameter $c$ represents bias ($\log(c)$ is the intercept), indicating an inherent preference for one choice alternative over the other, regardless of the reinforcement rate, perhaps due to physical location, response effort, or intrinsic stimulus preference. Analyzing these parameters allows researchers to diagnose specific factors influencing choice beyond simple reinforcement rates.
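Because the generalized equation is a straight line in logarithmic coordinates, sensitivity and bias can be estimated by ordinary least squares on the log ratios. The sketch below is one simple way to do this, not a prescribed analysis: the function name `fit_generalized_matching` is hypothetical, and the data are synthetic, noise-free values generated from $a = 0.8$ and $c = 1.2$ purely to show that the fit recovers them.

```python
import math
from typing import Sequence, Tuple


def fit_generalized_matching(
    responses: Sequence[Tuple[float, float]],
    reinforcers: Sequence[Tuple[float, float]],
) -> Tuple[float, float]:
    """Fit log(R1/R2) = a * log(r1/r2) + log(c) by least squares.

    Each element of `responses` is (R1, R2) and each element of
    `reinforcers` is (r1, r2) for one condition. Returns (a, c).
    """
    xs = [math.log10(r1 / r2) for r1, r2 in reinforcers]
    ys = [math.log10(big_r1 / big_r2) for big_r1, big_r2 in responses]
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two matched conditions")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("reinforcement ratios must vary across conditions")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx                 # sensitivity (slope)
    log_c = mean_y - a * mean_x   # intercept
    return a, 10 ** log_c         # bias c recovered from log(c)


# Synthetic, noise-free illustration (made-up numbers, not experimental data).
true_a, true_c = 0.8, 1.2
reinforcers = [(10, 40), (20, 40), (30, 30), (40, 20), (40, 10)]
responses = [(true_c * (r1 / r2) ** true_a, 1.0) for r1, r2 in reinforcers]

a_hat, c_hat = fit_generalized_matching(responses, reinforcers)
print(f"sensitivity a = {a_hat:.3f}, bias c = {c_hat:.3f}")
```

An estimated $a$ below 1 would indicate undermatching, and an estimated $c$ different from 1 would indicate bias toward one alternative.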

The Matching Law has proven to be an extraordinarily robust empirical finding, applicable across a vast range of species, including pigeons, rats, monkeys, and humans, and across various behaviors, reinforcing its status as a fundamental principle of behavioral science. Its predictive power extends beyond simple laboratory choice tasks, influencing models used in areas such as educational instruction (allocating study time), clinical behavior modification (choosing between adaptive and maladaptive behaviors), and even economics (consumer choice). The law effectively operationalizes the idea that organisms are economical in their behavior, naturally allocating effort to maximize the rate of return on investment, thereby providing a powerful mathematical bridge between psychology and microeconomics.

Variations in Concurrent Schedule Design

While the standard concurrent schedule involves immediate choice between two concurrently operating schedules (Conc VI VI), advanced variations have been developed to study more complex decision-making processes, particularly those involving delayed consequences and commitments. The most notable variation is the Concurrent Chain Schedule. Unlike standard concurrent schedules where reinforcement is delivered immediately upon meeting the response requirement, concurrent chain schedules involve two distinct phases: the choice link and the terminal link. In the choice link, the subject chooses between two initial responses (e.g., pressing Key A or Key B). This choice does not yield immediate reinforcement but instead leads the subject into the terminal link, which is associated with a specific schedule of reinforcement (e.g., Terminal Link A might be a VI 30s schedule, and Terminal Link B might be a VI 60s schedule). Once in the terminal link, the subject is committed to that schedule until reinforcement is obtained.

The concurrent chain procedure is primarily used to measure preference for reinforcement schedules themselves, rather than just preference for immediate reinforcement rates. By separating the choice phase from the consequence phase, researchers can investigate how organisms value different future outcomes, even if the absolute rate of reinforcement is held constant. For example, a subject might prefer one terminal-link schedule over another even when the overall reinforcement rate is matched; a commonly reported result is preference for a Variable Interval terminal link, which occasionally delivers reinforcement after a short delay, over a Fixed Interval terminal link with the same mean interval. Preference measured in the choice link of a concurrent chain schedule is often referred to as relative value or the measure of the effectiveness of the terminal schedule as a conditioned reinforcer, as the choice link response is reinforced only by the onset of the terminal link stimulus, which signals the availability of the primary reward.

Other variations include Concurrent Ratio Schedules (Conc FR FR or Conc VR VR), though these often present unique challenges. When two ratio schedules are run concurrently, the subject typically engages in exclusive responding, allocating 100% of responses to the schedule that yields the highest reinforcement rate per response (i.e., the smaller ratio requirement). This finding is known as the “breakdown of matching” under ratio schedules, as the organism switches to a simple maximization strategy rather than a proportional matching strategy. This divergence provides crucial information regarding the constraints on choice behavior, suggesting that when reinforcement is entirely response-dependent (ratio schedules), organisms are more likely to pursue the absolute maximum payoff, whereas when reinforcement is time-dependent (interval schedules), they exhibit the more subtle proportional allocation predicted by the Matching Law.
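The arithmetic behind exclusive responding on concurrent ratio schedules can be sketched as follows. The function name `reinforcers_earned` and the figures used (600 responses on a hypothetical Conc FR 10 FR 20) are illustrative assumptions, and the sketch ignores time and switching costs.

```python
def reinforcers_earned(
    total_responses: int, share_to_first: float, fr_first: int, fr_second: int
) -> int:
    """Reinforcers earned when a fixed number of responses is split between
    two fixed-ratio schedules (each completed ratio yields one reinforcer)."""
    first = round(total_responses * share_to_first)
    second = total_responses - first
    return first // fr_first + second // fr_second


# Hypothetical Conc FR 10 FR 20 with 600 responses to allocate.
all_fr10 = reinforcers_earned(600, 1.0, 10, 20)
split = reinforcers_earned(600, 0.5, 10, 20)
all_fr20 = reinforcers_earned(600, 0.0, 10, 20)
print(all_fr10, split, all_fr20)  # 60 45 30
```

Because every response moved from the smaller ratio to the larger one lowers the total payoff, the best allocation in this sketch is exclusive responding on the smaller ratio, consistent with the pattern described above.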

Applications in Behavior Analysis and Economics

The principles elucidated through concurrent schedules and the Matching Law have profound implications across diverse fields, extending the utility of operant conditioning beyond the laboratory. In Applied Behavior Analysis (ABA), concurrent schedules provide a powerful metaphor and methodology for understanding and modifying maladaptive behavior. Problem behaviors often exist in competition with alternative, adaptive behaviors. A child engaging in disruptive behavior ($R_1$) might be receiving reinforcement ($r_1$) in the form of adult attention, while engaging in appropriate play ($R_2$) might receive less frequent or less potent reinforcement ($r_2$). By viewing this interaction through the lens of a concurrent schedule, practitioners can quantitatively assess the relative reinforcement rates maintaining the problem behavior versus the alternative behavior.

The therapeutic strategy derived from this analysis is often referred to as Differential Reinforcement of Alternative Behavior (DRA). The goal is not simply to punish the problem behavior, but to shift the relative reinforcement distribution ($r_1 / r_2$) by reducing the reinforcement for the problem behavior and dramatically increasing the reinforcement rate and quality for the desired alternative behavior. According to the Matching Law, increasing the relative profitability of the alternative behavior should automatically lead to a corresponding increase in the relative response rate for that alternative behavior, thereby effectively replacing the maladaptive response. This framework provides a rigorous, data-driven foundation for behavioral intervention planning, moving away from subjective assessments of motivation toward objective, measurable reinforcement contingency adjustments.
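The predicted effect of such a shift can be sketched with the generalized matching equation. Everything numeric here is hypothetical: the reinforcement rates (per hour) are invented for illustration, the function name `predicted_share` is an assumption, and real interventions involve factors that this equation does not capture.

```python
def predicted_share(
    r_target: float, r_other: float, sensitivity: float = 1.0, bias: float = 1.0
) -> float:
    """Predicted proportion of responding allocated to the target behavior,
    from R_target / R_other = bias * (r_target / r_other) ** sensitivity."""
    if r_target < 0 or r_other <= 0:
        raise ValueError("rates must be non-negative, with r_other > 0")
    ratio = bias * (r_target / r_other) ** sensitivity
    return ratio / (1.0 + ratio)


# Hypothetical rates of reinforcement per hour.
# Before intervention: alternative behavior 2/h, problem behavior 10/h.
before = predicted_share(r_target=2.0, r_other=10.0)
# After intervention: alternative behavior 10/h, problem behavior 2/h.
after = predicted_share(r_target=10.0, r_other=2.0)
# Same intervention if the individual undermatches (sensitivity below 1).
after_undermatching = predicted_share(r_target=10.0, r_other=2.0, sensitivity=0.8)

print(f"before: {before:.3f}, after: {after:.3f}, "
      f"after with undermatching: {after_undermatching:.3f}")
```

With strict matching, the predicted share of the alternative behavior rises from one sixth to five sixths; with undermatching, the predicted gain is smaller, which is one reason the sensitivity parameter matters when planning how large a shift in reinforcement to arrange.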

Furthermore, concurrent schedule research has heavily influenced the field of Behavioral Economics. The Matching Law serves as a descriptive model for consumer choice, resource allocation, and job performance. For instance, workers often allocate effort across tasks in proportion to the relative financial or social payoff of those tasks. Economists use concepts derived from concurrent schedules to understand phenomena such as multitasking, where an individual allocates time between two competing sources of productivity (e.g., work email versus creative project), where the perceived reinforcement rates (success, completion, salary) dictate the distribution of effort. The elegance of the Matching Law is its ability to model these complex human choices with the same fundamental principles observed in simple animal choice experiments, suggesting a universal principle of behavioral allocation driven by relative profitability.

Challenges, Deviations, and Further Research

While the Matching Law provides an excellent first-order approximation of choice behavior, it is not universally predictive, and researchers have dedicated significant effort to understanding the circumstances under which deviations occur. The two primary deviations observed, as mentioned previously, are undermatching and bias. Undermatching, where the sensitivity parameter ($a$) is less than 1, is the most common deviation in standard concurrent VI VI schedules. Hypotheses for undermatching often center on imperfect experimental control, particularly the effectiveness of the Changeover Delay (COD). If the COD is insufficient, the slight reinforcement advantage gained by switching may dilute the true difference in the reinforcement schedules, resulting in a more equal distribution of responses than predicted.

Bias ($c$ not equal to 1) occurs when the organism exhibits a consistent, disproportionate preference for one option regardless of the reinforcement schedule ratios. Sources of bias can be external or internal. External bias includes subtle physical differences in the experimental setup, such as one response key requiring slightly less effort or being closer to the food dispenser. Internal bias relates to the organism’s innate or learned preferences, such as a natural tendency to favor the left response key over the right response key. Identifying and quantifying bias is critical because it highlights factors other than objective reinforcement rate that influence choice, providing valuable insights into motivational variables that might be overlooked in simpler experimental designs.

Future research continues to extend the concurrent schedule paradigm to increasingly complex scenarios, such as three- or four-alternative concurrent schedules, or schedules involving punishment components (concurrent reinforcement and punishment schedules). The development of advanced mathematical models, such as the Molecular Matching Law (which focuses on moment-to-moment decisions rather than overall response rates) and models incorporating temporal discounting, seeks to refine the predictive power of the framework. Ultimately, the concurrent schedule of reinforcement remains the gold standard for studying choice in psychology, serving as a powerful, flexible, and essential tool for uncovering the fundamental laws by which organisms allocate their behavior in environments of simultaneous alternatives.