Stats 13

Lecture 10

Paired samples comparison

Guillaume Calmettes

Independent vs dependent groups design

So far, we have seen how to compare whole groups of individuals (e.g., basketball wins with vs. without a sell-out crowd, commuting times on a steel vs. carbon bike frame, performance score improvement with or without sleep deprivation, etc.).
All of these studies used an independent groups design (no systematic connections relating individuals in one group to individuals in another group).

Today we will learn about a different study design based on pairs of units.
In many situations, such a paired design allows more focused comparisons.

What would you do?

How would you go about collecting your data for each of the following:

  • You want to compare grocery prices between Trader Joe's and Ralphs. Are prices different, on average?
  • You want to test the “Freshman 15” theory. Do college students gain, on average, 15 pounds during their first year?

Paired samples

Paired samples (also called dependent samples) are samples in which natural or matched couplings occur. This generates a dataset in which each data point in one sample is uniquely paired to a data point in the second sample.

Two data sets are "paired" when the following one-to-one relationship exists between values in the two data sets:

  • There is one pair of response values for each observational unit: each sample has the same number of data points
  • There is a built-in comparison: each data point in one sample is related to one, and only one, data point in the other sample

Paired samples

For a paired design, response values come in pairs, with one response value in each pair for each explanatory group.
The data pairing can come from:

  • matching similar individuals to create groups of two: this is referred to as a paired design using matching.
  • measuring the same individual (observational unit) twice, once under each condition (explanatory variable): this is referred to as a paired design using repeated measures.

When a paired design is possible, you typically get more informative results because units within a pair tend to be similar to each other. When pairing is effective, differences within each pair on the response variable tend to be due mainly to the explanatory variable.

Examples of paired samples

  • pre-test/post-test samples in which a factor is measured before and after an intervention
  • cross-over trials in which individuals are randomized to two treatments and then the same individuals are crossed-over to the alternative treatment
  • matched samples, in which individuals are matched on personal characteristics such as age and sex
  • duplicate measurements on the same biological samples
  • any circumstance in which each data point in one sample is uniquely matched to a data point in the second sample

Independent vs paired study design

Can you study with music blaring? Does the presence of lyrics hurt students' ability to focus on their work?
Response variable: Memorization game score (ex: number of words students can remember from a sheet).
Explanatory variable: music with or without lyrics

Independent study design

Randomly assign students to two separate groups:
- one group that will listen to music with lyrics
- one group that will listen to music without lyrics

Paired study design

Paired design using repeated measures.
Each student will play the memorization game twice:
- once listening to music with lyrics
- once listening to music without lyrics

Paired design using matching.
Another way to create pairs is to use a pretest to rank students according to how many words they can memorize, then pair students with similar abilities.
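A minimal sketch of how such matching could be done, assuming pairs are formed from adjacent pretest ranks and that one member of each pair is then randomly assigned to each condition. The function name `make_matched_pairs` and the pretest scores are hypothetical and only serve as an illustration.

```python
import random

def make_matched_pairs(pretest_scores, seed=0):
    """Pair students with adjacent pretest ranks, then randomly assign
    one member of each pair to each condition."""
    rng = random.Random(seed)
    # Rank students from lowest to highest pretest score.
    ranked = sorted(pretest_scores, key=pretest_scores.get)
    pairs = []
    # Walk down the ranking two students at a time
    # (with an odd number of students, the last one is left unpaired).
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # random assignment within the pair
        pairs.append({"lyrics": pair[0], "no_lyrics": pair[1]})
    return pairs

# Hypothetical pretest scores (number of words memorized), illustration only.
pretest = {"A": 12, "B": 7, "C": 15, "D": 8, "E": 11, "F": 14}
pairs = make_matched_pairs(pretest)
for p in pairs:
    print(p)
```

Each resulting pair contains two students of similar ability, so a within-pair difference in score mostly reflects the music condition rather than memorization skill.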

Pairing and Random Assignment

Pairing often makes it easier to detect statistical significance.
Can we still have random assignment and make cause-and-effect conclusions in paired design?

In our memorizing with or without lyrics example, if we see significant improvement in performance:

  • is it attributable to the type of song (with/without lyrics)?
  • is it attributable to the order in which the songs (with/without lyrics) were played? ("experience" effect)

Randomly assign each person to which song they hear first: with lyrics first, or without. This cancels out any “experience” effect.
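As an illustration, the order could be randomized with a short script like the one below. This is only a sketch: the function name `assign_orders` and the student labels are hypothetical, and each student's order is decided by an independent random shuffle.

```python
import random

def assign_orders(students, seed=0):
    """Randomly decide, for each student, which condition comes first."""
    rng = random.Random(seed)
    orders = {}
    for student in students:
        order = ["lyrics", "no_lyrics"]
        rng.shuffle(order)  # a fair coin flip between the two possible orders
        orders[student] = tuple(order)
    return orders

# Hypothetical student labels, illustration only.
students = ["S1", "S2", "S3", "S4", "S5", "S6"]
orders = assign_orders(students)
for student, order in orders.items():
    print(student, "->", " then ".join(order))
```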

Pairing and Observational studies

Can we use pairing in observational studies?

Yes!

Example: if you are interested in which test in a course was more difficult, the first or the second, compute the difference in scores for each individual and look at the average of these differences (this removes the variability in general learning abilities).

Wetsuits and swimming speed

The 2008 Olympics were full of controversy about new swimsuits possibly providing unfair advantages to swimmers, leading to new international rules that came into effect January 1, 2010, regarding swimsuit coverage and material.

Can a wetsuit really make someone swim faster? How much faster?

Wetsuits and swimming speed

Twelve competitive swimmers and triathletes swam 1500m at maximum speed twice each, once wearing a wetsuit and once wearing a regular suit

The order of the trials was randomized

Maximum velocity (m/sec) recorded (one of several possible outcomes)

$H_0: \mu_\textrm{Wetsuit}-\mu_\textrm{No Wetsuit}=0$

$H_a: \mu_\textrm{Wetsuit}-\mu_\textrm{No Wetsuit}>0$
($H_a: \mu_\textrm{Wetsuit}-\mu_\textrm{No Wetsuit}\neq0$)

Maximum velocity (m/sec):

            No Wetsuit   Wetsuit   Difference
Mean        1.43         1.51      0.078

Independent samples analysis

NHST of means difference
(pooling the two samples, shuffling and splitting into two groups)

Using this naive method, we do not find any association between wearing a wetsuit and swimming speeds.
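A sketch of this naive analysis in Python, using the 12 velocities recorded under each condition in the study. The function name `shuffle_test`, the seed, and the one-sided comparison (matching $H_a$) are assumptions made for this illustration.

```python
import random
from statistics import fmean

# Maximum velocities (m/sec) of the 12 swimmers under each condition.
wetsuit = [1.57, 1.47, 1.42, 1.35, 1.22, 1.75,
           1.64, 1.57, 1.56, 1.53, 1.49, 1.51]
no_wetsuit = [1.49, 1.37, 1.35, 1.27, 1.12, 1.64,
              1.59, 1.52, 1.50, 1.45, 1.44, 1.41]

def shuffle_test(a, b, n_resamples=10_000, seed=0):
    """Independent-groups test: pool the values, shuffle, split in two."""
    rng = random.Random(seed)
    observed = fmean(a) - fmean(b)
    pooled = a + b  # new list: the pairing between values is ignored
    count = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = fmean(pooled[:len(a)]) - fmean(pooled[len(a):])
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            count += 1
    return observed, count / n_resamples

observed_diff, p_naive = shuffle_test(wetsuit, no_wetsuit)
print(f"observed difference in means: {observed_diff:.3f} m/sec")
print(f"one-sided p-value (pairing ignored): {p_naive:.3f}")
```

Because the shuffling mixes values from different swimmers, the large swimmer-to-swimmer variability ends up in the null distribution and the observed difference does not stand out.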


Independent samples analysis

However, taking a closer look at the data, we see that every single swimmer swam faster wearing the wetsuit! Surely this must provide conclusive evidence that swimmers are faster wearing a wetsuit.
What went wrong in our analysis?

We failed to take into account the paired structure of the data.


Wetsuits and swimming speed study

[Figure: wetsuit study data, with swimmers and triathletes shown as separate groups]

Not surprisingly, there is a great deal of variability in the maximum velocities of the swimmers (individual variability among the observational units).

Because of all this variability, it is difficult to tell whether the difference in mean swim speed observed in the sample represents a real difference or is just due to random chance.


Paired samples analysis

The key to analyzing paired data is to work with the difference within each pair of data values rather than between the two original samples.
This helps us to eliminate the variability across different units (different swimmers) and instead focus on what we really care about: the difference between the values with and without a wetsuit. (By looking at this difference, the variability in general swimming ability is taken away).

Swimmer      1     2     3     4     5     6     7     8     9     10    11    12    $\bar{x}$
Wetsuit      1.57  1.47  1.42  1.35  1.22  1.75  1.64  1.57  1.56  1.53  1.49  1.51  1.507
No Wetsuit   1.49  1.37  1.35  1.27  1.12  1.64  1.59  1.52  1.50  1.45  1.44  1.41  1.429
Difference   0.08  0.10  0.07  0.08  0.10  0.11  0.05  0.05  0.06  0.08  0.05  0.10  0.078

Note: the mean of the sample of differences is equal to the difference of the sample means: $\bar{x}_\textrm{Difference} = \bar{x}_\textrm{Wetsuit}-\bar{x}_\textrm{No Wetsuit}$
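This identity follows directly from the definition of the mean. In the short derivation below, the symbols $w_i$ and $u_i$ (velocity of swimmer $i$ with and without a wetsuit) and $n$ (number of swimmers) are notation introduced here for illustration:

$$\bar{x}_\textrm{Difference} = \frac{1}{n}\sum_{i=1}^{n}(w_i - u_i) = \frac{1}{n}\sum_{i=1}^{n}w_i - \frac{1}{n}\sum_{i=1}^{n}u_i = \bar{x}_\textrm{Wetsuit}-\bar{x}_\textrm{No Wetsuit}$$

With the values in the table: $1.507 - 1.429 = 0.078$. The two analyses therefore share the same observed statistic; what changes is the variability it is compared against.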

Paired samples analysis

Null hypothesis: if wearing a wetsuit had no effect on swimming speed, then the two speeds recorded for each swimmer would have been equally likely to be observed under either condition (with or without a wetsuit).
In other words, the sign (positive/negative) of each observed difference in speed between the two conditions could equally likely have been the opposite.

Swimmer       1      2      3      4      5      6      7      8      9      10     11     12
Wetsuit       1.57   1.47   1.42   1.35   1.22   1.75   1.64   1.57   1.56   1.53   1.49   1.51
No Wetsuit    1.49   1.37   1.35   1.27   1.12   1.64   1.59   1.52   1.50   1.45   1.44   1.41
Difference    0.08   0.10   0.07   0.08   0.10   0.11   0.05   0.05   0.06   0.08   0.05   0.10
Sign flipped  -0.08  -0.10  -0.07  -0.08  -0.10  -0.11  -0.05  -0.05  -0.06  -0.08  -0.05  -0.10

Resampling for paired data NHST

To obtain the distribution of the mean paired difference (or ratio) expected under the null hypothesis:

  1. Calculate the effect size (difference or ratio) of the explanatory variable on each pair
  2. Randomly assign the sign (or ratio order [1, r$^{-1}$]) for the effect sizes
  3. Calculate the mean (or median) with that new arrangement of signs (or ratio order)
  4. Repeat steps 2-3 10,000 times (this yields 10,000 effect sizes that we could have observed if the explanatory variable had no effect)
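The steps above can be sketched in Python as follows, using the 12 within-swimmer differences from the wetsuit study. This is a minimal illustration rather than the course's own code: the function name `sign_flip_test`, the seed, the use of the mean, and the one-sided comparison (matching $H_a$) are assumptions.

```python
import random
from statistics import fmean

# Step 1: effect size within each pair (wetsuit - no wetsuit, in m/sec).
differences = [0.08, 0.10, 0.07, 0.08, 0.10, 0.11,
               0.05, 0.05, 0.06, 0.08, 0.05, 0.10]

def sign_flip_test(diffs, n_resamples=10_000, seed=0):
    """Paired-data resampling test based on random sign assignment."""
    rng = random.Random(seed)
    observed = fmean(diffs)
    null_means = []
    for _ in range(n_resamples):           # Step 4: repeat many times
        # Step 2: randomly assign a sign to each difference.
        flipped = [d * rng.choice((1, -1)) for d in diffs]
        # Step 3: mean of the differences for this arrangement of signs.
        null_means.append(fmean(flipped))
    # One-sided p-value: share of null means at least as large as observed
    # (the small tolerance guards against floating-point ties).
    count = sum(m >= observed - 1e-12 for m in null_means)
    return observed, count / n_resamples, null_means

observed, p_value, null_means = sign_flip_test(differences)
print(f"observed mean difference: {observed:.4f} m/sec")
print(f"one-sided p-value: {p_value:.4f}")
```

Only the signs are resampled, so each swimmer's two values always stay together and the variability between swimmers never enters the null distribution.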

Paired samples analysis

Keeping the structure of the data intact (pairs) allows us to filter out the variability across pairs.

These data provide very convincing evidence that swimmers are faster on average when wearing wetsuits.
Because this was a randomized experiment, we have strong evidence that wetsuits cause swimmers to swim faster.


Wetsuits and swimming speed

The same data were analyzed in each case, but the conclusions reached were drastically different. It is very important to think about how the data were collected before proceeding with the analysis!

Studies leading to paired data are often more efficient (better able to detect differences between conditions) because you have controlled for a source of variation (individual variability among the observational units).


Wetsuits and swimming speed

Conclusions

  • With a p-value $< 0.0001$, we have very strong evidence against the null hypothesis and can conclude that wearing a wetsuit increases, on average, your maximum speed.
  • We can draw a cause-and-effect conclusion since the researcher used random assignment for the order in which the two time trials (with/without wetsuit) took place for each swimmer (if all the swimmers had started with their "wetsuit trial", then we could have argued that the slower speeds recorded on the second attempt, without wetsuit, were due to fatigue).
  • The 12 selected athletes included males and females, triathletes and swimmers, so we can generalize to this specific population.

95% Confidence intervals

The differences (or ratios) form a single quantitative sample. We can extract the 95% CI by drawing bootstrap samples, as previously seen.

We are 95% confident that for competitive swimmers and triathletes, wetsuits increase maximum swimming velocity by an average of between 0.066 and 0.089 meters per second.
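A minimal sketch of such a bootstrap interval, assuming a percentile bootstrap of the mean difference. The function name `bootstrap_ci`, the seed, and the number of resamples are illustrative choices, and the exact bounds will vary slightly from run to run with a different seed.

```python
import random
from statistics import fmean

# Within-swimmer differences (wetsuit - no wetsuit, in m/sec).
differences = [0.08, 0.10, 0.07, 0.08, 0.10, 0.11,
               0.05, 0.05, 0.06, 0.08, 0.05, 0.10]

def bootstrap_ci(diffs, n_resamples=10_000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean difference."""
    rng = random.Random(seed)
    # Resample the differences with replacement and store each mean.
    boot_means = sorted(
        fmean(rng.choices(diffs, k=len(diffs))) for _ in range(n_resamples)
    )
    # Approximate positions of the lower and upper percentiles.
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return boot_means[lo_idx], boot_means[hi_idx]

ci_low, ci_high = bootstrap_ci(differences)
print(f"95% CI for the mean difference: [{ci_low:.3f}, {ci_high:.3f}] m/sec")
```

The resampling is done on the differences, not on the two original samples, so the pairing is preserved in every bootstrap sample.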

Theory based

The methodology is similar to that for a single quantitative sample (one-sample t-test), applied to the within-pair differences.

The required validity conditions are similar as well.
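As a sketch of the theory-based approach, the t statistic can be computed directly from the 12 differences. The variable names below are illustrative; the p-value itself would come from a t distribution with $n-1$ degrees of freedom, which a library routine such as `scipy.stats.ttest_rel` can provide (not used here, to keep the example dependency-free).

```python
from math import sqrt
from statistics import mean, stdev

# Within-swimmer differences (wetsuit - no wetsuit, in m/sec).
differences = [0.08, 0.10, 0.07, 0.08, 0.10, 0.11,
               0.05, 0.05, 0.06, 0.08, 0.05, 0.10]

n = len(differences)
mean_diff = mean(differences)
# Standard error of the mean difference (sample standard deviation / sqrt(n)).
se = stdev(differences) / sqrt(n)
# One-sample t statistic for H0: mean difference = 0.
t_stat = mean_diff / se
df = n - 1

print(f"mean difference = {mean_diff:.4f}, SE = {se:.4f}")
print(f"t = {t_stat:.2f} with {df} degrees of freedom")
```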