## Data Description

*(Adapted from an example by Śaunak Sen)*

Systolic blood pressure was measured in progeny from a backcross between two mouse
strains. Fifty randomly chosen mice were genotyped at the `D4Mit214` marker.
We want to detect an association between the `D4Mit214` marker genotype and
blood pressure. The values show the systolic blood pressure (in mmHg) by
marker genotype, `BA` (heterozygous) or `BB` (homozygous).

## Bootstrapping

## R code

This is the bare R code with the number of replicates set to `1,000` and α = `0.05`.

    # Heterozygous (BA)
    a <- c(86, 88, 89, 89, 92, 93, 94, 94, 94, 95, 95, 96, 96, 97, 97, 98, 98, 99, 99, 101, 106, 107, 110, 113, 116, 118)
    # Homozygous (BB)
    b <- c(89, 90, 92, 93, 93, 96, 99, 99, 99, 102, 103, 104, 105, 106, 106, 107, 108, 108, 110, 110, 112, 114, 116, 116)
    # Difference between means of observed datasets
    diff.observed <- mean(b) - mean(a)
    # Level of significance
    alpha <- 0.05
    # Number of replicates
    n <- 1000
    # Difference between means of bootstrapped datasets (n replicates),
    # preallocated so the vector is not grown inside the loop
    diff.bootstrap <- numeric(n)
    for (i in seq_len(n)) {
      # Sample with replacement, keeping each group at its original size
      a.bootstrap <- sample(a, length(a), replace = TRUE)
      b.bootstrap <- sample(b, length(b), replace = TRUE)
      diff.bootstrap[i] <- mean(b.bootstrap) - mean(a.bootstrap)
    }
    # Percentile confidence interval at level 1 - alpha
    quantile(diff.bootstrap, c(alpha/2, 1 - alpha/2))

See also Permutation Test.

## 1. Observed Samples

The vertical lines are the means of `BA` and `BB`.
The “observed” difference between the two means is about `4.75`.
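The figure quoted here can be checked directly from the data. The following is a minimal sketch using the two vectors from the R code above; the names `mean.a` and `mean.b` are introduced only for illustration.

```r
# Systolic blood pressure (mmHg) by genotype, copied from the R code above
a <- c(86, 88, 89, 89, 92, 93, 94, 94, 94, 95, 95, 96, 96, 97, 97, 98, 98, 99, 99, 101, 106, 107, 110, 113, 116, 118)  # BA
b <- c(89, 90, 92, 93, 93, 96, 99, 99, 99, 102, 103, 104, 105, 106, 106, 107, 108, 108, 110, 110, 112, 114, 116, 116)  # BB

# Group means (the vertical lines) and their difference
mean.a <- mean(a)
mean.b <- mean(b)
diff.observed <- mean.b - mean.a

# Rounded to two decimals: BA 98.46, BB 103.21, difference 4.75
round(c(BA = mean.a, BB = mean.b, difference = diff.observed), 2)
```

The two groups contain 26 and 24 mice, which together account for the 50 genotyped animals.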

## 2. Means of Bootstrapped Samples

## 3. Difference between Means of Bootstrapped Samples

Distribution of the differences between the bootstrapped datasets.
The solid line is the observed difference.
The dashed line is the mean of the bootstrapped differences.
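A plot of this kind could be drawn along the following lines. This is only a sketch: the fixed seed, the number of histogram breaks, the colors, and the axis labels are assumptions, not taken from the original figure.

```r
a <- c(86, 88, 89, 89, 92, 93, 94, 94, 94, 95, 95, 96, 96, 97, 97, 98, 98, 99, 99, 101, 106, 107, 110, 113, 116, 118)  # BA
b <- c(89, 90, 92, 93, 93, 96, 99, 99, 99, 102, 103, 104, 105, 106, 106, 107, 108, 108, 110, 110, 112, 114, 116, 116)  # BB

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
n <- 1000    # number of replicates
diff.observed <- mean(b) - mean(a)

# Each replicate resamples both groups with replacement at their original sizes
diff.bootstrap <- replicate(
  n,
  mean(sample(b, length(b), replace = TRUE)) - mean(sample(a, length(a), replace = TRUE))
)

# Histogram of the bootstrapped differences
hist(diff.bootstrap, breaks = 30, col = "gray40", border = "white",
     main = "Bootstrapped differences between means",
     xlab = "mean(BB) - mean(BA), mmHg")
abline(v = diff.observed, lty = "solid")          # observed difference
abline(v = mean(diff.bootstrap), lty = "dashed")  # mean of bootstrapped differences
```

The two lines normally fall close together, because the bootstrap distribution is centered near the observed statistic.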

## 4. Confidence Interval and Decision
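
One common way to turn the percentile interval into a decision is to check whether it contains zero. The sketch below assumes that rule, which the text here does not state explicitly; `ci` and `reject.null` are illustrative names, and the fixed seed is also an assumption.

```r
a <- c(86, 88, 89, 89, 92, 93, 94, 94, 94, 95, 95, 96, 96, 97, 97, 98, 98, 99, 99, 101, 106, 107, 110, 113, 116, 118)  # BA
b <- c(89, 90, 92, 93, 93, 96, 99, 99, 99, 102, 103, 104, 105, 106, 106, 107, 108, 108, 110, 110, 112, 114, 116, 116)  # BB

set.seed(1)    # assumption: fixed seed so the sketch is reproducible
alpha <- 0.05  # level of significance
n <- 1000      # number of replicates

diff.bootstrap <- replicate(
  n,
  mean(sample(b, length(b), replace = TRUE)) - mean(sample(a, length(a), replace = TRUE))
)

# Percentile confidence interval at level 1 - alpha
ci <- quantile(diff.bootstrap, c(alpha / 2, 1 - alpha / 2))

# Assumed decision rule: reject "no difference" when the interval excludes 0
reject.null <- unname(ci[1] > 0 || ci[2] < 0)

ci
reject.null
```

Because the interval endpoints vary from run to run, a lower bound close to zero can lead to different decisions under different seeds or replicate counts.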

## 5. Bootstrapped and Null Differences

The dark gray is the distribution of the bootstrapped differences, as above.
The light gray is the distribution of the differences under the null hypothesis,
generated by shifting the bootstrapped differences by their mean so that they are centered at zero.
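
The shift amounts to subtracting the mean of the bootstrapped differences from each of them. A minimal sketch follows; the name `diff.null`, the seed, and the plotting choices are assumptions made for illustration.

```r
a <- c(86, 88, 89, 89, 92, 93, 94, 94, 94, 95, 95, 96, 96, 97, 97, 98, 98, 99, 99, 101, 106, 107, 110, 113, 116, 118)  # BA
b <- c(89, 90, 92, 93, 93, 96, 99, 99, 99, 102, 103, 104, 105, 106, 106, 107, 108, 108, 110, 110, 112, 114, 116, 116)  # BB

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
n <- 1000    # number of replicates

diff.bootstrap <- replicate(
  n,
  mean(sample(b, length(b), replace = TRUE)) - mean(sample(a, length(a), replace = TRUE))
)

# Null distribution: same shape and spread, recentered at zero
diff.null <- diff.bootstrap - mean(diff.bootstrap)

# Overlay the two histograms on shared breaks and a shared x-axis
breaks <- seq(min(diff.null, diff.bootstrap), max(diff.null, diff.bootstrap), length.out = 41)
hist(diff.bootstrap, breaks = breaks, col = "gray40", border = "white",
     main = "Bootstrapped and null differences", xlab = "Difference between means, mmHg")
hist(diff.null, breaks = breaks, col = adjustcolor("gray80", alpha.f = 0.7),
     border = "white", add = TRUE)
```

Shifting changes only the location of the distribution, so the spread estimated by the bootstrap is carried over to the null unchanged.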

## 6. *P*-value

The *P*-value is estimated as the proportion of the null (light gray)
distribution that is equal to or more extreme than the “observed”
difference (the solid vertical line), that is, the tail beyond it.
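
In code, this tail proportion is the fraction of null differences at least as large as the observed one. The sketch below computes a one-sided value matching the single tail described above, and also a two-sided variant as an assumption for readers who want both tails; the variable names and the seed are illustrative.

```r
a <- c(86, 88, 89, 89, 92, 93, 94, 94, 94, 95, 95, 96, 96, 97, 97, 98, 98, 99, 99, 101, 106, 107, 110, 113, 116, 118)  # BA
b <- c(89, 90, 92, 93, 93, 96, 99, 99, 99, 102, 103, 104, 105, 106, 106, 107, 108, 108, 110, 110, 112, 114, 116, 116)  # BB

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
n <- 1000    # number of replicates
diff.observed <- mean(b) - mean(a)

diff.bootstrap <- replicate(
  n,
  mean(sample(b, length(b), replace = TRUE)) - mean(sample(a, length(a), replace = TRUE))
)
diff.null <- diff.bootstrap - mean(diff.bootstrap)

# One-sided: share of null differences at or beyond the observed difference
p.one.sided <- mean(diff.null >= diff.observed)

# Two-sided variant (an assumption, not stated in the text): both tails
p.two.sided <- mean(abs(diff.null) >= abs(diff.observed))

c(one.sided = p.one.sided, two.sided = p.two.sided)
```

With 1,000 replicates the estimate has a resolution of 0.001, so very small *P*-values can only be reported as below that resolution.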