### a problem of moments

We would like to prove the following fact:

For any non-negative random variable \(X\), not identically zero, having finite first and second moments, \(\mathbb P(X>0) \ge (\mathbb EX)^2/\mathbb EX^2\).
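Before the proofs, here is a quick sanity check of my own (not part of the argument): evaluating both sides exactly for a small discrete distribution with an atom at zero.

```python
# Example distribution of my choosing: X takes value v with probability p.
dist = [(0, 0.5), (1, 0.3), (4, 0.2)]

p_pos = sum(p for v, p in dist if v > 0)   # P(X > 0)
m1 = sum(p * v for v, p in dist)           # E X
m2 = sum(p * v * v for v, p in dist)       # E X^2

print(p_pos, m1**2 / m2)
assert m1**2 / m2 <= p_pos                 # the claimed inequality
```

Here \((\mathbb EX)^2/\mathbb EX^2 = 1.21/3.5 \approx 0.346\), comfortably below \(\mathbb P(X>0) = 0.5\).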

The proof isn’t difficult. Here are three different ones.

**Proof 1.** We already know from Jensen’s inequality that \(\mathbb E f(X) \ge f(\mathbb E X)\) if \(f\) is convex. Taking \(f(x)=x^2\) gives \((\mathbb EX)^2/\mathbb EX^2 \le 1\) for any \(X\). The trick to sharpen this to \(\le \mathbb P(X>0)\) is to note that the probability mass at \(X=0\) contributes nothing to either moment. In particular, if \(F_X(t)\) is the distribution function of \(X\) (and \(\mathbb P(X>0)>0\), which our assumptions guarantee), define a random variable \(Y\) that is \(X\) with the mass at zero removed, that is, distributed according to \(F_Y(t)=(F_X(t)-F_X(0))/(1-F_X(0))\) for \(t \ge 0\). Then Jensen’s gives \((\mathbb EY)^2/\mathbb EY^2 \le 1\). However, \(\mathbb EY = \mathbb EX / (1-F_X(0)) = \mathbb EX / \mathbb P(X>0)\), and \(\mathbb EY^2 = \mathbb EX^2 / (1-F_X(0)) = \mathbb EX^2 / \mathbb P(X>0)\), so \((\mathbb EX)^2/\mathbb EX^2 = [(\mathbb EY)^2 \mathbb P(X>0)^2] / [\mathbb EY^2 \mathbb P(X>0)] \le \mathbb P(X>0)\). \(\blacksquare\)
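Proof 1's conditioning step can be sketched numerically (the distribution below is my own toy example): removing the atom at zero rescales both moments by the same factor \(1/\mathbb P(X>0)\).

```python
# X takes value v with probability p; Y is X conditioned on X > 0,
# i.e. the atom at zero dropped and the rest renormalized.
dist_x = [(0, 0.5), (1, 0.3), (4, 0.2)]
p_pos = sum(p for v, p in dist_x if v > 0)
dist_y = [(v, p / p_pos) for v, p in dist_x if v > 0]

def moment(dist, k):
    return sum(p * v**k for v, p in dist)

# The mass at zero contributes nothing to EX or EX^2, so conditioning
# just divides both moments by P(X > 0).
assert abs(moment(dist_y, 1) - moment(dist_x, 1) / p_pos) < 1e-12
assert abs(moment(dist_y, 2) - moment(dist_x, 2) / p_pos) < 1e-12
# Jensen applied to Y, which the proof then rescales:
assert moment(dist_y, 1)**2 <= moment(dist_y, 2)
```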

The statement would also work for non-positive \(X\), of course, by applying it to \(-X\); and an analogous statement can be made for arbitrary \(X\) comparing \(\mathbb P(X\ne 0)\) with some combination of moments for the positive and negative parts of \(X\).

**Proof 2.** Apparently this problem can also be proved by an application of the Cauchy-Schwarz inequality. Fix a probability space \((\Omega, \mathcal F, \mathbb P)\). The space \(L_2(\Omega)\) of real-valued random variables \(X:\Omega \to \mathbb R\) with finite second moment, equipped with the inner product \(\langle X,Y\rangle_{L_2(\Omega)}=\mathbb E XY\) and induced norm \(\Vert X\Vert_{L_2(\Omega)}=\sqrt{\mathbb EX^2}\), is a Hilbert space (modulo almost-sure equivalence). Given this, let us apply Cauchy-Schwarz to the two random variables \(X\) and \(\mathbf 1_{X>0}\):

\((\mathbb E X \mathbf 1_{X>0})^2 \le \mathbb EX^2\, \mathbb E\mathbf 1_{X>0}^2 = \mathbb EX^2\, \mathbb P(X>0)\), since \(\mathbf 1_{X>0}^2 = \mathbf 1_{X>0}\);

\((\mathbb E X)^2 \le \mathbb EX^2\, \mathbb P(X>0)\), by noting that \(X = X \mathbf 1_{X>0}\) almost surely. \(\blacksquare\)
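The two inner products in this proof can be computed concretely on a toy finite sample space with uniform measure (the example is mine, not from the post):

```python
# Six equally likely outcomes; X(omega) listed pointwise, nonnegative.
X = [0, 0, 0, 1, 2, 3]
ind = [1 if x > 0 else 0 for x in X]     # indicator of {X > 0}

def inner(u, v):
    """<U, V> = E[UV] under the uniform measure."""
    return sum(a * b for a, b in zip(u, v)) / len(u)

# X vanishes off {X > 0}, so <X, 1_{X>0}> = E X.
assert abs(inner(X, ind) - inner(X, [1] * len(X))) < 1e-12

# Cauchy-Schwarz: <X, 1_{X>0}>^2 <= ||X||^2 ||1_{X>0}||^2.
lhs = inner(X, ind)**2
rhs = inner(X, X) * inner(ind, ind)
assert lhs <= rhs
```

Note that \(\Vert\mathbf 1_{X>0}\Vert^2 = \mathbb P(X>0)\), which is how the probability appears on the right-hand side.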

This is a special case of something called the Paley-Zygmund inequality. I didn’t know such a thing existed.

**Proof 3.** This one only proves the case of non-negative integer-valued \(X\). It is well known that for such \(X\), \(\mathbb EX = \sum_{k=0}^\infty \mathbb P(X>k) = \mathbb P(X>0)+\mathbb P(X>1)+\cdots\). Basically \(\mathbb P(X=1)\) is counted once, \(\mathbb P(X=2)\) is counted twice, and so on. The analogous thing can be derived for \(\mathbb EX^2\), except now we need to count in squares. Happily we also know that squares accumulate by odd integers, i.e. \(n^2=1+3+5+\cdots+(2n-1)\), so \(\mathbb EX^2 = \sum_{k=0}^\infty (2k+1)\, \mathbb P(X>k) = \mathbb P(X>0)+3\mathbb P(X>1)+5\mathbb P(X>2)+\cdots\).
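Both tail-sum identities are easy to confirm on a small integer-valued distribution (again a toy example of my own):

```python
# X takes value v with probability p.
pmf = {0: 0.4, 1: 0.3, 2: 0.2, 5: 0.1}

def tail(k):
    """q_k = P(X > k)."""
    return sum(p for v, p in pmf.items() if v > k)

m1 = sum(p * v for v, p in pmf.items())          # E X
m2 = sum(p * v * v for v, p in pmf.items())      # E X^2
K = max(pmf)                                     # tails vanish beyond this

# E X = sum of tails; E X^2 = odd-weighted sum of tails.
assert abs(m1 - sum(tail(k) for k in range(K))) < 1e-12
assert abs(m2 - sum((2 * k + 1) * tail(k) for k in range(K))) < 1e-12
```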

Let’s simplify the notation a bit. Put \(q_k=\mathbb P(X>k)\), so \(q_0\ge q_1\ge q_2 \ge \cdots\). We just need to prove that \(q_0\ge (q_0+q_1+q_2+\cdots)^2 / (q_0+3q_1+5q_2+\cdots)\), which is to say, \((q_0+q_1+q_2+\cdots)^2 \le q_0(q_0+3q_1+5q_2+\cdots)\). Both series converge, since the moments are finite, so this just requires some accounting. On the left hand side, \((q_0+q_1+q_2+\cdots)^2\) expands to \(q_0^2+(q_1^2+2q_0q_1)+(q_2^2+2q_0q_2+2q_1q_2)+\cdots = Q_0+Q_1+Q_2+\cdots\), where \(Q_k \triangleq q_k^2 + 2 \sum_{i=0}^{k-1} q_iq_k\). Since \(q_i \le q_0\) for every \(i\), each of the \(2k+1\) products making up \(Q_k\) is at most \(q_0q_k\), so \(Q_k \le (2k+1)q_0q_k \triangleq R_k\). But \(R_0+R_1+R_2+\cdots\) is exactly the right hand side. So the left hand sum is dominated by the right hand sum. \(\blacksquare\)
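The term-by-term accounting can be checked mechanically for any nonincreasing tail sequence; here is a sketch on one of my own choosing:

```python
# A nonincreasing tail sequence q_k = P(X > k), zero beyond the last entry.
q = [0.6, 0.3, 0.1, 0.1, 0.1]

for k in range(len(q)):
    # Q_k collects the terms involving q_k in the expanded square.
    Q_k = q[k]**2 + 2 * sum(q[i] * q[k] for i in range(k))
    R_k = (2 * k + 1) * q[0] * q[k]
    assert Q_k <= R_k + 1e-12     # each q_i <= q_0 bounds all 2k+1 products

# Summing the bounds: (sum q_k)^2 <= q_0 * sum (2k+1) q_k.
lhs = sum(q)**2
rhs = q[0] * sum((2 * k + 1) * qk for k, qk in enumerate(q))
assert lhs <= rhs + 1e-12
```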

With some real analysis, this proof could be made to work for random variables that are not discrete, but it might also turn into a special case of Proof 1. In any case, it’s interesting in its own right.