# algorithms – Probability that two specific elements are in reservoir sample

Let us study the event that $$x_a$$ and $$x_b$$ are both in the reservoir at the end of the algorithm, where $$k < a < b$$.

In order for this to occur four things must happen:

1. On the $$a$$th iteration the random number must be $$leq k$$.
2. Between the $$a$$th and $$b$$th iteration $$x_a$$ must not be replaced.
3. On the $$b$$th iteration the random number generator must be $$leq k$$ but also not equal the location of $$x_a$$. WLOG we say that the random number must be $$leq k-1$$.
4. For the remaining iterations neither $$x_a$$ or $$x_b$$ must be replaced.

This gives us the equation

$$p = frac{k}{a} cdot prod_{i=a+1}^{b-1}(1 – 1/i)cdot frac{k-1}{b}cdot prod_{i=b+1}^n(1-2/i).$$

Left as an exercise to the reader, this solves to
$$p = frac{k}{a} cdot frac{a}{b-1}cdot frac{k-1}{b}cdot frac{(b-1)b}{(n-1)n},$$
$$p = frac{(k-1)k}{(n-1)n}.$$

Thus the probability that $$x_a, x_b$$ are both in the sample is independent of $$a, b$$ when $$k < a < b$$. The case where $$k geq a$$ is once again left as an exercise.