Guillaume Calmettes
So far, we have seen how to compare whole groups of individuals (ex: basketball wins with vs without a sell-out crowd, commuting times on a steel vs carbon bike frame, performance score improvement with or without sleep deprivation, etc.).
All of these studies used an independent groups design (no systematic connection relating individuals in one group to individuals in another group).
Today we will learn about a different study design based on pairs of units.
For many situations, such a paired design allows more focused comparisons.
How would you go about collecting your data for each of the following:
Paired samples (also called dependent samples) are samples in which natural or matched couplings occur. This generates a dataset in which each data point in one sample is uniquely paired to a data point in the second sample.
Two data sets are "paired" when the following one-to-one relationship exists between values in the two data sets:
For a paired design, the response values come in pairs, with one response value in each pair for each explanatory group.
The data pairing can come from:
When a paired design is possible, you typically get more informative results because units within a pair tend to be similar to each other. When pairing is effective, differences within each pair on the response variable tend to be due mainly to the explanatory variable.
Can you study with music blaring? Does the presence of lyrics hurt students' ability to focus on their work?
Response variable: Memorization game score (ex: number of words students can remember from a sheet).
Explanatory variable: music with or without lyrics
Independent study design
Randomly assign students to two separate groups:
- one group that will listen to music with lyrics
- one group that will listen to music without lyrics
Paired study design
Paired design using repeated measures.
Each student will play the memorization game twice:
- once listening to music with lyrics
- once listening to music without lyrics
Paired design using matching.
Another way to create pairs could use a pretest to rank students according to how many words they could memorize and create pairs of students with similar abilities.
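The matching idea can be sketched in a few lines. This is only an illustration: the student names and pretest scores are hypothetical, the helper `make_matched_pairs` is not part of the course material, and the random assignment of conditions within each pair is an assumption about how such a design would typically be run.

```python
import random

# Hypothetical pretest scores (number of words memorized); not study data.
pretest = {"Ana": 14, "Ben": 9, "Chloe": 12, "Dev": 15, "Eli": 8, "Fay": 11}

def make_matched_pairs(scores, seed=0):
    """Rank students by pretest score, pair neighbours in the ranking,
    then randomly decide who in each pair gets which condition."""
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get, reverse=True)
    pairs = []
    # Step through the ranking two students at a time.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # random assignment within the pair (assumption)
        pairs.append({"lyrics": pair[0], "no_lyrics": pair[1]})
    return pairs

pairs = make_matched_pairs(pretest)
for p in pairs:
    print(p)
```

Each pair contains two students with adjacent pretest ranks, so differences within a pair are less likely to reflect differences in memorization ability.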
Pairing often makes it easier to detect statistical significance.
Can we still have random assignment and make cause-and-effect conclusions in paired design?
In our memorizing with or without lyrics example, if we see significant improvement in performance:
Randomly assign which song each person hears first: with lyrics or without. This cancels out any “experience” effect.
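A minimal sketch of this randomization of order, assuming a hypothetical list of student identifiers and a helper name (`assign_order`) chosen for illustration:

```python
import random

def assign_order(students, seed=0):
    """For each student, randomly order the two conditions."""
    rng = random.Random(seed)
    conditions = ["with lyrics", "without lyrics"]
    # rng.sample with k=2 returns the two conditions in random order.
    return {s: rng.sample(conditions, k=2) for s in students}

students = ["S1", "S2", "S3", "S4", "S5", "S6"]  # hypothetical identifiers
order = assign_order(students)
for student, sequence in order.items():
    print(student, "->", sequence[0], "first, then", sequence[1])
```

Every student still experiences both conditions; only the order is left to chance.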
Can we use pairing in observational studies?
Yes!
ex: If you are interested in which test was more difficult in a course, the first or the second, compute the difference in scores between the two tests for each individual and look at the average difference (this takes away the variability in general learning abilities).
The 2008 Olympics were full of controversy about new swimsuits possibly providing unfair advantages to swimmers, leading to new international rules that came into effect January 1, 2010, regarding swimsuit coverage and material.
Can a wetsuit really make someone swim faster? How much faster?
Twelve competitive swimmers and triathletes swam 1500m at maximum speed twice each, once wearing a wetsuit and once wearing a regular suit
The order of the trials was randomized
Maximum velocity (m/sec) recorded (one of several possible outcomes)
$H_0: \mu_\textrm{Wetsuit}-\mu_\textrm{No Wetsuit}=0$
$H_a: \mu_\textrm{Wetsuit}-\mu_\textrm{No Wetsuit}>0$
($H_a: \mu_\textrm{Wetsuit}-\mu_\textrm{No Wetsuit}\neq0$)
Speed (m/sec) | No Wetsuit | Wetsuit | Difference |
Mean | 1.43 | 1.51 | 0.078 |
NHST of means difference
(pooling the two samples, shuffling and splitting into two groups)
Using this naive method, we do not find any association between wearing a wetsuit and swimming speeds.
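For reference, the naive analysis can be sketched as follows, using the per-swimmer speeds tabulated later in this section. The function name, the number of shuffles and the seed are illustrative choices, and the one-sided alternative is used.

```python
import random

wetsuit = [1.57, 1.47, 1.42, 1.35, 1.22, 1.75,
           1.64, 1.57, 1.56, 1.53, 1.49, 1.51]
no_wetsuit = [1.49, 1.37, 1.35, 1.27, 1.12, 1.64,
              1.59, 1.52, 1.50, 1.45, 1.44, 1.41]

def naive_permutation_pvalue(a, b, n_shuffles=10_000, seed=0):
    """Pool both samples, shuffle, split into two groups, and count how
    often the shuffled difference in means reaches the observed one.
    This ignores the pairing of the data."""
    rng = random.Random(seed)
    observed = sum(a) / len(a) - sum(b) / len(b)
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        group_a, group_b = pooled[:len(a)], pooled[len(a):]
        diff = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)
        if diff >= observed:
            count += 1
    return observed, count / n_shuffles

observed_diff, p_naive = naive_permutation_pvalue(wetsuit, no_wetsuit)
print(f"observed difference: {observed_diff:.3f} m/sec")
print(f"one-sided p-value (pairing ignored): {p_naive:.3f}")
```

The shuffled differences are spread widely because the pooled speeds vary a lot from one swimmer to another, so the observed difference does not stand out.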
However, taking a closer look at the data, we see that every single swimmer swam faster wearing the wetsuit!
Surely this must provide conclusive evidence that swimmers are faster wearing a wetsuit.
What went wrong in our analysis?
We failed to take into account the paired structure of the data.
Wetsuit study (figure; legend: swimmers, triathletes)
Not surprisingly, there is a great deal of variability in the maximum velocities of the swimmers (individual variability among the observational units).
Because of all this variability, it is difficult to tell whether the difference in mean swim speed observed in the sample represents a real difference or is just due to random chance.
The key to analyzing paired data is to work with the difference within each pair of data values rather than between the two original samples.
This helps us to eliminate the variability across different units (different swimmers) and instead focus on what we really care about: the difference between the values with and without a wetsuit.
(By looking at this difference, the variability in general swimming ability is taken away).
Swimmer | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | $\bar{x}$ |
Wetsuit | 1.57 | 1.47 | 1.42 | 1.35 | 1.22 | 1.75 | 1.64 | 1.57 | 1.56 | 1.53 | 1.49 | 1.51 | 1.507 |
No Wetsuit | 1.49 | 1.37 | 1.35 | 1.27 | 1.12 | 1.64 | 1.59 | 1.52 | 1.50 | 1.45 | 1.44 | 1.41 | 1.429 |
Difference | 0.08 | 0.10 | 0.07 | 0.08 | 0.10 | 0.11 | 0.05 | 0.05 | 0.06 | 0.08 | 0.05 | 0.10 | 0.078 |
Note: the mean of the sample of differences is equal to the difference of the sample means: $\bar{x}_\textrm{Difference} = \bar{x}_\textrm{Wetsuit}-\bar{x}_\textrm{No Wetsuit}$
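To see why this holds, write $w_i$ and $u_i$ for the speed of swimmer $i$ with and without a wetsuit (symbols introduced here for illustration only). The mean is linear, so:

$$\bar{x}_\textrm{Difference} = \frac{1}{n}\sum_{i=1}^{n}(w_i - u_i) = \frac{1}{n}\sum_{i=1}^{n}w_i - \frac{1}{n}\sum_{i=1}^{n}u_i = \bar{x}_\textrm{Wetsuit}-\bar{x}_\textrm{No Wetsuit}$$

With the values in the table: $1.507 - 1.429 = 0.078$. The means agree, but the spread of the differences is much smaller than the spread of the original speeds, which is what the paired analysis exploits.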
Null hypothesis: If wearing a wetsuit had no effect on swimming speed, then the speeds recorded for each swimmer could have been equally likely observed under either condition (with or without a wetsuit).
In other words, the sign (positive/negative) of each observed difference in speed between the two conditions could equally likely have been the opposite.
Swimmer | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
Wetsuit | 1.57 | 1.47 | 1.42 | 1.35 | 1.22 | 1.75 | 1.64 | 1.57 | 1.56 | 1.53 | 1.49 | 1.51 |
No Wetsuit | 1.49 | 1.37 | 1.35 | 1.27 | 1.12 | 1.64 | 1.59 | 1.52 | 1.50 | 1.45 | 1.44 | 1.41 |
Difference | 0.08 | 0.10 | 0.07 | 0.08 | 0.10 | 0.11 | 0.05 | 0.05 | 0.06 | 0.08 | 0.05 | 0.10 |
Sign flipped | -0.08 | -0.10 | -0.07 | -0.08 | -0.10 | -0.11 | -0.05 | -0.05 | -0.06 | -0.08 | -0.05 | -0.10 |
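To obtain the distribution of the mean paired difference (or ratio) under the null hypothesis:
- compute the difference (or ratio) within each pair
- randomly flip the sign of each difference (or invert each ratio), as if the two conditions had been swapped within the pair
- compute the mean of the resulting values
- repeat many times
Keeping the structure of the data intact (pairs) allows us to filter out the variability across pairs.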
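The sign-flipping null model described above can be sketched as follows. Because there are only 12 pairs, this sketch enumerates all $2^{12} = 4096$ sign patterns exactly rather than drawing random ones; that substitution, the function name and the one-sided alternative are choices made for this illustration.

```python
from itertools import product

wetsuit = [1.57, 1.47, 1.42, 1.35, 1.22, 1.75,
           1.64, 1.57, 1.56, 1.53, 1.49, 1.51]
no_wetsuit = [1.49, 1.37, 1.35, 1.27, 1.12, 1.64,
              1.59, 1.52, 1.50, 1.45, 1.44, 1.41]

# Work with the within-pair differences, not the two original samples.
diffs = [w - u for w, u in zip(wetsuit, no_wetsuit)]

def sign_flip_pvalue(diffs):
    """One-sided p-value: proportion of sign patterns whose mean
    difference is at least as large as the observed mean difference."""
    n = len(diffs)
    observed = sum(diffs) / n
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=n):
        flipped_mean = sum(s * d for s, d in zip(signs, diffs)) / n
        # Small tolerance guards against floating-point round-off.
        if flipped_mean >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total

p_paired = sign_flip_pvalue(diffs)
print(f"one-sided p-value (paired): {p_paired:.6f}")
```

Since every observed difference is positive, only the pattern with no sign flipped reaches the observed mean, so the p-value is 1/4096.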
These data provide very convincing evidence that swimmers are faster on average when wearing wetsuits.
Because this was a randomized experiment, we have strong evidence that
wetsuits cause swimmers to swim faster.
The same data were analyzed in each case, but the conclusions reached were drastically different. It is very important to think about how the data were collected before proceeding with the analysis!
Studies leading to paired data are often more efficient (better able to detect differences between conditions) because you have controlled for a source of variation (individual variability among the observational units).
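A quick way to see how much variation the pairing removes is to compare the spread of the raw speeds with the spread of the within-swimmer differences. A minimal sketch using the wetsuit data (variable names are illustrative):

```python
from statistics import stdev

wetsuit = [1.57, 1.47, 1.42, 1.35, 1.22, 1.75,
           1.64, 1.57, 1.56, 1.53, 1.49, 1.51]
no_wetsuit = [1.49, 1.37, 1.35, 1.27, 1.12, 1.64,
              1.59, 1.52, 1.50, 1.45, 1.44, 1.41]
diffs = [w - u for w, u in zip(wetsuit, no_wetsuit)]

# Sample standard deviations (m/sec).
sd_wetsuit = stdev(wetsuit)
sd_no_wetsuit = stdev(no_wetsuit)
sd_diffs = stdev(diffs)

print(f"SD of speeds with wetsuit:    {sd_wetsuit:.3f}")
print(f"SD of speeds without wetsuit: {sd_no_wetsuit:.3f}")
print(f"SD of paired differences:     {sd_diffs:.3f}")
```

The differences vary far less than the individual speeds, which is why the same mean difference is hard to detect in the independent-groups analysis and easy to detect in the paired one.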
Conclusions
The differences (or ratios) form a single quantitative sample. We can obtain the 95% CI by drawing bootstrap samples, as previously seen.
We are 95% confident that for competitive swimmers and triathletes, wetsuits increase maximum swimming velocity by an average of between 0.066 and 0.089 meters per second.
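A minimal sketch of a percentile bootstrap on the paired differences, assuming 10,000 resamples and a fixed seed (both arbitrary choices, as is the function name). The exact bounds depend on the random resamples, but they should land close to the interval reported above.

```python
import random

# Within-swimmer differences (wetsuit minus no wetsuit), in m/sec.
diffs = [0.08, 0.10, 0.07, 0.08, 0.10, 0.11,
         0.05, 0.05, 0.06, 0.08, 0.05, 0.10]

def bootstrap_ci_mean(values, n_boot=10_000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    n = len(values)
    # Resample the differences with replacement and record each mean.
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return means[lo_idx], means[hi_idx]

lower, upper = bootstrap_ci_mean(diffs)
print(f"95% bootstrap CI for the mean difference: ({lower:.3f}, {upper:.3f})")
```

Note that the resampling is done on the differences, one value per swimmer, so the pairing is preserved in every bootstrap sample.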
The methodology is similar to that for a single quantitative sample (one-sample t-test).
The required validity conditions are similar as well.
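As a sketch of the theory-based counterpart, the t statistic for the paired differences can be computed by hand, treating the differences as one quantitative sample tested against a mean of 0. Variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Within-swimmer differences (wetsuit minus no wetsuit), in m/sec.
diffs = [0.08, 0.10, 0.07, 0.08, 0.10, 0.11,
         0.05, 0.05, 0.06, 0.08, 0.05, 0.10]

n = len(diffs)
standard_error = stdev(diffs) / sqrt(n)   # SE of the mean difference
t_stat = mean(diffs) / standard_error     # null value of the mean is 0
df = n - 1                                # degrees of freedom

print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```

Turning the statistic into a p-value requires the t distribution with 11 degrees of freedom; if SciPy is available, `scipy.stats.ttest_rel` applied to the two original columns performs the same paired test.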