# probability – Variance of sum of indicator random variables.

I’m reading some notes and there’s a part that I’m not sure about.

Lemma 6.1. Given a sequence of random variables $$X_{1}, X_{2}, \ldots, X_{n}$$, let $$X=\sum_{i} X_{i}.$$ Then
$$\operatorname{Var}(X)=\sum_{i=1}^{n} \operatorname{Var}\left(X_{i}\right)+\sum_{i \neq j} \operatorname{Cov}\left(X_{i}, X_{j}\right)$$

When a random variable $$X$$ can be written as a sum of indicator random variables of a set of events $$\mathcal{A} \subset \Sigma$$, that is
$$X=\sum_{A \in \mathcal{A}} I_{A}$$
then it is possible to express the variance of $$X$$ in a simple way as a sum. We note that $$\mathbb{E}\left(I_{A}\right)=\mathbb{P}(A)$$, and that $$\left(I_{A}\right)^{2}=I_{A}$$ for all $$A$$, and so
$$\operatorname{Var}\left(I_{A}\right)=\mathbb{E}\left(\left(I_{A}\right)^{2}\right)-\left(\mathbb{E}\left(I_{A}\right)\right)^{2}=\mathbb{P}(A)-\mathbb{P}(A)^{2}=\mathbb{P}(A)(1-\mathbb{P}(A))$$
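As a quick sanity check of this identity (a sketch on a toy probability space I made up, a fair six-sided die with $$A$$ = "roll is even"):

```python
from fractions import Fraction

outcomes = range(1, 7)          # uniform probability space: a fair die
p = Fraction(1, 6)              # probability of each outcome

A = {2, 4, 6}                   # the event "roll is even"

# E(I_A) = P(A), and since (I_A)^2 = I_A the two raw moments coincide.
E_IA  = sum(p for w in outcomes if w in A)            # = P(A)
E_IA2 = sum(p * 1 ** 2 for w in outcomes if w in A)   # = E((I_A)^2) = E(I_A)

var_IA = E_IA2 - E_IA ** 2
assert var_IA == E_IA * (1 - E_IA)                    # P(A)(1 - P(A))
print(var_IA)                                         # 1/4
```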
Also, it is a simple check that
$$\operatorname{Cov}\left(I_{A}, I_{B}\right)=\mathbb{E}\left(I_{A} I_{B}\right)-\mathbb{E}\left(I_{A}\right) \mathbb{E}\left(I_{B}\right)=\mathbb{P}(A \cap B)-\mathbb{P}(A) \mathbb{P}(B)=\mathbb{P}(A)(\mathbb{P}(B \mid A)-\mathbb{P}(B))$$
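Checking this covariance identity on the same die, with two dependent events of my own choosing ($$A$$ = "even", $$B$$ = "at most 3"):

```python
from fractions import Fraction

p = Fraction(1, 6)
outcomes = set(range(1, 7))     # fair six-sided die

A = {2, 4, 6}                   # "even"
B = {1, 2, 3}                   # "at most 3"

P = lambda E: p * len(E & outcomes)

cov  = P(A & B) - P(A) * P(B)   # E(I_A I_B) - E(I_A)E(I_B)
cond = P(A & B) / P(A)          # P(B | A) = P(A ∩ B) / P(A)
assert cov == P(A) * (cond - P(B))
print(cov)                      # -1/12
```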
Therefore by Lemma 6.1 we see that
$$\operatorname{Var}(X)=\sum_{A \in \mathcal{A}} \mathbb{P}(A)(1-\mathbb{P}(A))+\sum_{A \neq B} \mathbb{P}(A)(\mathbb{P}(B \mid A)-\mathbb{P}(B))=\sum_{A \in \mathcal{A}} \mathbb{P}(A)\left(\sum_{B \in \mathcal{A}}(\mathbb{P}(B \mid A)-\mathbb{P}(B))\right)$$
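To convince myself the combined formula is right (note the $$B=A$$ term contributes $$\mathbb{P}(A \mid A)-\mathbb{P}(A)=1-\mathbb{P}(A)$$), I checked it against a direct variance computation on the die with three dependent events I picked arbitrarily:

```python
from fractions import Fraction

outcomes = set(range(1, 7))                      # fair six-sided die
p = Fraction(1, 6)
events = [{2, 4, 6}, {1, 2, 3}, {3, 4, 5, 6}]    # deliberately dependent

P = lambda E: p * len(E & outcomes)

# Direct computation: X(w) counts how many of the events contain w.
X = {w: sum(w in A for A in events) for w in outcomes}
EX  = sum(p * X[w] for w in outcomes)
EX2 = sum(p * X[w] ** 2 for w in outcomes)
direct = EX2 - EX ** 2

# The same variance via Var(X) = Σ_A P(A) Σ_B (P(B|A) - P(B)),
# where the inner sum runs over all B, including B = A.
formula = sum(
    P(A) * sum(P(A & B) / P(A) - P(B) for B in events)
    for A in events
)
assert direct == formula
print(direct)                                    # 2/9
```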

The last equality, when written out, would contain terms like $$\mathbb{P}(A)\left( \mathbb{P}(B \mid A) - \mathbb{P}(B) - \mathbb{P}(B \cap A) + \mathbb{P}(A)\mathbb{P}(B) \right)$$, so the last two summands must cancel each other out, which is only possible if $$A$$ and $$B$$ are independent. But do we know this as a fact?

I’m working with examples like the following:

For example, in a random graph, $$T$$ is the sum of the indicator random variables $$I_{\{A \text{ is a triangle}\}}$$ over $$A \subset [n]$$ such that $$|A|=3.$$ So to evaluate $$\operatorname{Var}(T)$$ we can use the above sum.
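For this example my attempt looks like the following (a sketch for $$G(n, p)$$ with small $$n=5$$ and $$p=1/2$$, chosen by me so that a brute-force check over all graphs is feasible). Here the only correlated pairs are triples sharing an edge, i.e. two vertices: the two triangles then span five distinct edges, so $$\mathbb{E}(I_A I_B)=p^5$$; triples sharing at most one vertex are edge-disjoint, hence independent.

```python
from fractions import Fraction
from itertools import combinations

n = 5
p = Fraction(1, 2)

triples = list(combinations(range(n), 3))
edges   = list(combinations(range(n), 2))

# Var(T) via the indicator formula, summing over ordered pairs (A, B).
var_T = Fraction(0)
for A in triples:
    for B in triples:
        if A == B:
            var_T += p**3 * (1 - p**3)       # Var(I_A)
        elif len(set(A) & set(B)) == 2:
            var_T += p**5 - p**6             # Cov(I_A, I_B), shared edge
        # sharing <= 1 vertex: edge-disjoint, hence Cov = 0

# Brute-force check: with p = 1/2 all 2^|edges| graphs are equally likely.
tot = tot2 = Fraction(0)
for mask in range(1 << len(edges)):
    present = {e for i, e in enumerate(edges) if mask >> i & 1}
    T = sum(all(e in present for e in combinations(tr, 2)) for tr in triples)
    tot += T
    tot2 += T * T
q = Fraction(1, 1 << len(edges))
assert var_T == q * tot2 - (q * tot) ** 2
print(var_T)
```

The nonzero-covariance bookkeeping is the point: the double sum collapses to the diagonal terms plus the shared-edge pairs, which is what makes the formula usable for large $$n$$.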