# Bootstrap Resampling

## Data Description

(Adapted from an example by Śaunak Sen)

Systolic blood pressure was measured in progeny from a backcross between two mouse strains. Fifty randomly chosen mice were genotyped at the `D4Mit214` marker. We want to detect an association between the `D4Mit214` marker genotype and blood pressure. The values are systolic blood pressures (in mm Hg) grouped by marker genotype: `BA` (heterozygous, 26 mice) or `BB` (homozygous, 24 mice).

## R code

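This is the bare R code, with `1,000` bootstrap replicates and significance level α = `0.05`. Its last line prints a 95% percentile confidence interval for the difference between the two means.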

```r
# Heterozygous (BA)
a = c(86, 88, 89, 89, 92, 93, 94, 94, 94, 95, 95, 96, 96, 97, 97, 98, 98, 99, 99, 101, 106, 107, 110, 113, 116, 118)

# Homozygous (BB)
b = c(89, 90, 92, 93, 93, 96, 99, 99, 99, 102, 103, 104, 105, 106, 106, 107, 108, 108, 110, 110, 112, 114, 116, 116)

# Difference between means of observed datasets
diff.observed = mean(b) - mean(a)

# Level of significance
alpha = 0.05

# Number of replicates
n = 1000

# Difference between means of bootstrapped datasets (n replicates),
# preallocated instead of grown from NULL
diff.bootstrap = numeric(n)

for (i in 1:n) {
  # Sample with replacement, keeping each group at its original size
  a.bootstrap = sample(a, length(a), replace = TRUE)
  b.bootstrap = sample(b, length(b), replace = TRUE)

  diff.bootstrap[i] = mean(b.bootstrap) - mean(a.bootstrap)
}

# Percentile confidence interval
quantile(diff.bootstrap, c(alpha/2, 1 - alpha/2))
```

## 1. Observed Samples

The vertical lines mark the means of the `BA` and `BB` samples. The “observed” difference between the two means (`BB` minus `BA`) is about `4.75` mm Hg.
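The two means and their difference can be checked directly from the data. The following is a minimal sketch; `mean.a` and `mean.b` are illustrative names that do not appear in the original code.

```r
# Heterozygous (BA) and homozygous (BB) blood pressures, as listed in the R code section
a = c(86, 88, 89, 89, 92, 93, 94, 94, 94, 95, 95, 96, 96, 97, 97, 98, 98, 99, 99, 101, 106, 107, 110, 113, 116, 118)
b = c(89, 90, 92, 93, 93, 96, 99, 99, 99, 102, 103, 104, 105, 106, 106, 107, 108, 108, 110, 110, 112, 114, 116, 116)

# Group means (illustrative variable names)
mean.a = mean(a)  # 2560 / 26, about 98.46
mean.b = mean(b)  # 2477 / 24, about 103.21

# Observed difference, BB minus BA, about 4.75
diff.observed = mean.b - mean.a

round(c(BA = mean.a, BB = mean.b, difference = diff.observed), 2)
```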

## 3. Difference between Means of Bootstrapped Samples

This is the distribution of the differences between the means of the bootstrapped datasets. The solid line is the observed difference. The dashed line is the mean of the bootstrapped differences, which should lie close to the observed difference.
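As a rough sketch of how this distribution and its two reference values could be computed, the loop can also be written with `replicate()`. The fixed seed is an assumption added here for reproducibility and is not part of the original code.

```r
a = c(86, 88, 89, 89, 92, 93, 94, 94, 94, 95, 95, 96, 96, 97, 97, 98, 98, 99, 99, 101, 106, 107, 110, 113, 116, 118)
b = c(89, 90, 92, 93, 93, 96, 99, 99, 99, 102, 103, 104, 105, 106, 106, 107, 108, 108, 110, 110, 112, 114, 116, 116)

set.seed(1)  # assumption: fixed seed so the run is repeatable
n = 1000     # number of replicates

# Observed difference (the solid line)
diff.observed = mean(b) - mean(a)

# One bootstrapped difference per replicate; sample() defaults to
# size = length(x), so each resample has the original group size
diff.bootstrap = replicate(
  n,
  mean(sample(b, replace = TRUE)) - mean(sample(a, replace = TRUE))
)

# Mean of the bootstrapped differences (the dashed line) next to the observed one
c(observed = diff.observed,
  bootstrap.mean = mean(diff.bootstrap),
  bootstrap.sd = sd(diff.bootstrap))
```

Calling `hist(diff.bootstrap)` followed by `abline(v = diff.observed)` and `abline(v = mean(diff.bootstrap), lty = 2)` would draw a comparable figure with base graphics.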

## 5. Bootstrapped and Null Differences

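The dark gray is the distribution of the bootstrapped differences, as above. The light gray is the distribution of the differences under the null hypothesis of no difference between the genotype means. It is generated by subtracting the mean of the bootstrapped differences from each of them, so that the distribution keeps its shape but is centered at zero.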
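The shift amounts to a single subtraction. The sketch below assumes the same resampling scheme as the R code section; `diff.null` is a hypothetical name introduced for illustration, and the seed is likewise an assumption.

```r
a = c(86, 88, 89, 89, 92, 93, 94, 94, 94, 95, 95, 96, 96, 97, 97, 98, 98, 99, 99, 101, 106, 107, 110, 113, 116, 118)
b = c(89, 90, 92, 93, 93, 96, 99, 99, 99, 102, 103, 104, 105, 106, 106, 107, 108, 108, 110, 110, 112, 114, 116, 116)

set.seed(1)  # assumption: fixed seed so the run is repeatable
n = 1000

# Bootstrapped differences (dark gray)
diff.bootstrap = numeric(n)
for (i in 1:n) {
  diff.bootstrap[i] = mean(sample(b, length(b), replace = TRUE)) -
    mean(sample(a, length(a), replace = TRUE))
}

# Null differences (light gray): same spread, re-centered at zero
diff.null = diff.bootstrap - mean(diff.bootstrap)

# The shift changes the center but not the spread
c(mean.bootstrap = mean(diff.bootstrap), mean.null = mean(diff.null))
c(sd.bootstrap = sd(diff.bootstrap), sd.null = sd(diff.null))
```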

## 6. P-value

The p-value is estimated as the proportion of the null (light gray) distribution that is equal to or more extreme than the “observed” difference (the solid vertical line), that is, the tail area beyond it.
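A minimal sketch of this estimate follows. It assumes the null distribution is obtained by centering the bootstrapped differences at zero; the names `p.one.sided` and `p.two.sided` and the fixed seed are illustrative additions. The one-sided version matches the single tail described above, and the two-sided version is shown only as a common variant.

```r
a = c(86, 88, 89, 89, 92, 93, 94, 94, 94, 95, 95, 96, 96, 97, 97, 98, 98, 99, 99, 101, 106, 107, 110, 113, 116, 118)
b = c(89, 90, 92, 93, 93, 96, 99, 99, 99, 102, 103, 104, 105, 106, 106, 107, 108, 108, 110, 110, 112, 114, 116, 116)

set.seed(1)  # assumption: fixed seed so the run is repeatable
n = 1000

diff.observed = mean(b) - mean(a)

# Bootstrapped differences, then shifted to form the null distribution
diff.bootstrap = replicate(
  n,
  mean(sample(b, replace = TRUE)) - mean(sample(a, replace = TRUE))
)
diff.null = diff.bootstrap - mean(diff.bootstrap)

# One-sided: share of null differences at least as large as the observed one
p.one.sided = mean(diff.null >= diff.observed)

# Two-sided variant: share at least as extreme in either direction
p.two.sided = mean(abs(diff.null) >= abs(diff.observed))

c(one.sided = p.one.sided, two.sided = p.two.sided)
```

With a finite number of replicates the estimate can come out as exactly zero, so some authors report `(count + 1) / (n + 1)` instead of the raw proportion.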