<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Some stuff &#187; mathbb</title>
	<atom:link href="http://blog.yhuang.org/?feed=rss2&#038;tag=mathbb" rel="self" type="application/rss+xml" />
	<link>https://blog.yhuang.org</link>
	<description>here.</description>
	<lastBuildDate>Wed, 27 Aug 2025 08:50:58 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.1</generator>
		<item>
		<title>a problem of moments</title>
		<link>https://blog.yhuang.org/?p=934</link>
		<comments>https://blog.yhuang.org/?p=934#comments</comments>
		<pubDate>Sun, 07 Oct 2012 07:59:39 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[mathbb]]></category>
		<category><![CDATA[mathbf]]></category>
		<category><![CDATA[proof]]></category>
		<category><![CDATA[sum]]></category>
		<category><![CDATA[Vert]]></category>

		<guid isPermaLink="false">http://scripts.mit.edu/~zong/wpress/?p=934</guid>
		<description><![CDATA[We would like to prove the following fact: For any non-negative random variable having finite first and second moments, . The proof isn&#8217;t difficult. Here are three different ones. Proof 1. We already know from Jensen&#8217;s inequality that if is convex. This gives for any . The trick to make it be is to note [...]]]></description>
			<content:encoded><![CDATA[<p>We would like to prove the following fact:</p>
<p>For any non-negative random variable \(X\) having finite first and second moments, \(\mathbb P(X>0) \ge (\mathbb EX)^2/\mathbb EX^2\).</p>
<p>The proof isn&#8217;t difficult. Here are three different ones.<br />
<span id="more-934"></span><br />
<strong>Proof 1.</strong> We already know from Jensen&#8217;s inequality that \(\mathbb E f(X) \ge f(\mathbb E X)\) if \(f\) is convex. This gives \((\mathbb EX)^2/\mathbb EX^2 \le 1\) for any \(X\). The trick to make it be \(\le \mathbb P(X>0)\) is to note that the density at \(X=0\) contributes nothing to any moment. In particular, if \(F_X(t)\) is the distribution function of \(X\), then define a random variable \(Y\) that is \(X\) without the probability at zero, that is, according to the distribution function \(F_Y(t)=(F_X(t)-F_X(0))/(1-F_X(0))\). Then Jensen&#8217;s gives \((\mathbb EY)^2/\mathbb EY^2 \le 1\). However, \(\mathbb EY = \mathbb EX / (1-F_X(0)) = \mathbb EX / \mathbb P(X>0)\), and \(\mathbb EY^2 = \mathbb EX^2 / (1-F_X(0)) = \mathbb EX^2 / \mathbb P(X>0)\), so \((\mathbb EX)^2/\mathbb EX^2 = [(\mathbb EY)^2 \mathbb P(X>0)^2] / [\mathbb EY^2 \mathbb P(X>0)] \le \mathbb P(X>0)\). \(\blacksquare\)</p>
<p>The statement would also work for non-positive \(X\), of course; and an analogous statement can be made for arbitrary \(X\) comparing \(\mathbb P(X\ne 0)\) with some combination of moments for the positive and negative parts of \(X\).</p>
<p><strong>Proof 2.</strong> Apparently this problem can also be proved by an application of the Cauchy-Schwarz inequality. Assume the probability space \((\Omega, \mathcal F, \mathbb P)\). The space of finite second-moment real-valued random variables \(L_2(\Omega)=\{X(\omega):\Omega \to R\}\) with the inner product \(\langle X,Y\rangle_{L_2(\Omega)}=\mathbb E XY\) and induced norm \(\Vert X\Vert_{L_2(\Omega)}=\sqrt{\mathbb EX^2}\) is a Hilbert space (modulo \(L_2\) equivalence). Given this, let us apply Cauchy-Schwarz on the two random variables \(X\) and \(\mathbf 1_{X>0}\):</p>
\(\langle X, \mathbf 1_{X>0} \rangle^2 \le \Vert X \Vert^2 \Vert \mathbf 1_{X>0} \Vert^2\), by Cauchy-Schwarz<br />
\((\mathbb E X \mathbf 1_{X>0})^2 \le \mathbb EX^2 \mathbb E\mathbf 1_{X>0}\), specializing to \(L_2(\Omega)\)<br />
\((\mathbb E X)^2 \le \mathbb EX^2 \mathbb P(X>0)\), by noting that \(X = X \mathbf 1_{X>0}\).  \(\blacksquare\)
<p>This is a special case of something called the <a href="http://en.wikipedia.org/wiki/Paley%E2%80%93Zygmund_inequality">Paley-Zygmund inequality</a>. I didn&#8217;t know such a thing existed.</p>
<p><strong>Proof 3.</strong> This one only proves the discrete case. It is well known that for positive discrete random variables \(X\), \(\mathbb EX = \sum_{k=0}^\infty \mathbb P(X>k) = \mathbb P(X>0)+\mathbb P(X>1)+\cdots\). Basically \(\mathbb P(X=1)\) is counted once, \(\mathbb P(X=2)\) is counted twice, and so on. The analogous thing can be derived for \(\mathbb EX^2\), except now we need to count in squares. Happily we also know that squares accumulate by odd integers, i.e. \(n^2=1+3+5+\cdots+(2n-1)\), so \(\mathbb EX^2 = \sum_{k=0}^\infty \mathbb (2k+1) \mathbb P(X>k) = \mathbb P(X>0)+3\mathbb P(X>1)+5\mathbb P(X>2)+\cdots\).</p>
<p>Let&#8217;s simplify the notation a bit. Put \(q_k=\mathbb P(X>k)\), then \(q_0\ge q_1\ge q_2 \ge \cdots\). We just need to prove that \(q_0\ge (q_0+q_1+q_2+\cdots)^2 / (q_0+3q_1+5q_2+\cdots)\), which is to say, \((q_0+q_1+q_2+\cdots)^2 \le q_0(q_0+3q_1+5q_2+\cdots)\). The two sides both have limits, so this just requires some accounting. On the left hand side, \((q_0+q_1+q_2+\cdots)^2\) expands to \(q_0^2+(q_1^2+2q_0q_1)+(q_2^2+2q_0q_2+2q_1q_2)+\cdots = Q_0+Q_1+Q_2+\cdots\), where \(Q_k \triangleq q_k^2 + 2 \sum_{i=0}^{k-1} q_iq_k \le (2k+1)q_0q_k \triangleq R_k\). But \(R_0+R_1+R_2+\cdots\) is exactly the right hand side. So the left hand sum is dominated by the right hand sum.  \(\blacksquare\)</p>
<p>With some real analysis, this proof could be made to work for random variables that are not discrete, but it might also turn into a special case of Proof 1. In any case, it&#8217;s interesting in its own right.</p>
]]></content:encoded>
			<wfw:commentRss>https://blog.yhuang.org/?feed=rss2&#038;p=934</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>data structure problem</title>
		<link>https://blog.yhuang.org/?p=854</link>
		<comments>https://blog.yhuang.org/?p=854#comments</comments>
		<pubDate>Fri, 09 Mar 2012 04:50:41 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[entropy]]></category>
		<category><![CDATA[fraction]]></category>
		<category><![CDATA[mathbb]]></category>
		<category><![CDATA[query]]></category>
		<category><![CDATA[structure]]></category>

		<guid isPermaLink="false">http://scripts.mit.edu/~zong/wpress/?p=854</guid>
		<description><![CDATA[Another problem by fakalin. A data structure has the entropy bound if all queries have amortized time , where is the fraction of the time that key is queried. It has the working-set property if the time to search for an element is , where is the number of elements queried since the last access [...]]]></description>
			<content:encoded><![CDATA[<p>Another problem by fakalin.</p>
<blockquote><p>A data structure has the entropy bound if all queries have amortized time \(O(\sum_k p_k \log 1/p_k)\), where \(p_k\) is the fraction of the time that key \(k\) is queried. It has the working-set property if the time to search for an element \(x_i\) is \(O(\log t_i)\), where \(t_i\) is the number of elements queried since the last access to \(x_i\). Prove that the working-set property implies the entropy bound.</p></blockquote>
<p>This isn&#8217;t really a data structure problem, per se.<br />
<span id="more-854"></span><br />
The general intuition here is that, if the waiting time between two queries to a key \(k\) is \(t(k)\), then key \(k\) ends up taking up about a \(p_k = 1/t(k)\) fraction of the queries, and therefore the average query time is about \(\sum_{k\in K} 1/t(k) (\log t(k))\).</p>
<p>While this is exactly true for evenly spaced-out queries, the general case only requires a slight modification using any of the rudimentary convex inequalities such as:</p>
<p><strong>Jensen&#8217;s inequality:</strong> If \(f\) is a convex function and \(X\) is a random variable, then \(\mathbb{E}f(X)\ge f(\mathbb{E}X)\).</p>
<p>(The proof just uses the definition of what a convex function is. Furthermore, if we recognize that \(-\log x\) is a convex function, then we get \(\mathbb{E}\log(X)\le \log(\mathbb{E}X)\), which restates the well-known fact that the geometric mean is less than or equal to the arithmetic mean of a collection of real numbers.)</p>
<p>Now, let \(N\) be the total number of queries. Let \(n(k)\) be the number of queries on key \(k\). Let \(t_i(k)\) be the time between the \(i\)th query on key \(k\) and the previous query on the same key. Let \(\bar{a}(k)< C \sum_{i=1}^{n(k)} \log t_i(k) / n(k)\) (for some \(C>0\)) be the <em>average</em> query time looking up key \(k\), as guaranteed by the working-set property (*). Let \(\bar{t}(k) = \sum_{i=1}^{n(k)} t_i(k) / n(k)\) be the <em>average</em> time between queries for key \(k\).</p>
<p>Note that we must have \(\bar{t}(k) \le N/n(k)\). Furthermore, \(p_k = n(k)/N\) by definition, hence \(\bar{t}(k) \le 1/p_k\) (**). The average query time over all keys is therefore:</p>
<p>\(\sum_k \bar{a}(k) n(k) / N\)<br />
\(< \sum_k [C \sum_{i=1}^{n(k)} \log t_i(k) / n(k)] [n(k) / N]\), by (*)<br />
\(\le C \sum_k [\log \sum_{i=1}^{n(k)} t_i(k) / n(k)] [n(k) / N]\), by Jensen&#8217;s inequality<br />
\(= C \sum_k [\log \bar{t}(k)] p_k\)<br />
\(\le C \sum_k p_k \log 1/p_k\), by (**)</p>
<p>∎</p>
]]></content:encoded>
			<wfw:commentRss>https://blog.yhuang.org/?feed=rss2&#038;p=854</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>a polynomial problem</title>
		<link>https://blog.yhuang.org/?p=291</link>
		<comments>https://blog.yhuang.org/?p=291#comments</comments>
		<pubDate>Mon, 22 Nov 2010 21:52:13 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[answer]]></category>
		<category><![CDATA[expansion]]></category>
		<category><![CDATA[integer]]></category>
		<category><![CDATA[integer coefficients]]></category>
		<category><![CDATA[math math]]></category>
		<category><![CDATA[mathbb]]></category>
		<category><![CDATA[number]]></category>
		<category><![CDATA[number math]]></category>
		<category><![CDATA[polynomial coefficients]]></category>
		<category><![CDATA[positive integer]]></category>

		<guid isPermaLink="false">http://allegro.mit.edu/~zong/wpress/?p=291</guid>
		<description><![CDATA[The latest problem from fakalin. Took some wrong turns and hints to solve it&#8230; Given a polynomial with positive integer coefficients, how many evaluations of does it take to obtain the polynomial? (An polynomial with real coefficients would take the number of degrees plus 1 to specify, which, if it held in this case, would [...]]]></description>
			<content:encoded><![CDATA[<p>The latest problem from fakalin. Took some wrong turns and hints to solve it&#8230;</p>
<p>Given a polynomial \(f: \mathbb{Z}\to \mathbb{Z}\) with positive integer coefficients, how many evaluations of \(f\) does it take to obtain the polynomial?</p>
<p>(An \(f: \mathbb{R}\to \mathbb{R}\) polynomial with real coefficients would take the number of degrees plus 1 to specify, which, if it held in this case, would render the answer unbounded. But the correct answer in this case is surprisingly small.)<br />
<span id="more-291"></span><br />
It takes only 2 evaluations. Suppose in the following that \(b>0\). Let us note that a polynomial \(f(b) = a_n b^n + &#8230; + a_0 b^0\) specifies essentially a base \(b\) representation of the number \(f(b)\), in that \(a_n a_{n-1} &#8230; a_0\) is an expansion of \(f(b)\) in base \(b\). The only problem is this expansion is non-unique, as it is possible for any \(a_j \ge b\).</p>
<p>However, it is not possible for any \(a_j \ge f(b) + 1\), since for all \(j\), \(f(b) \ge a_j\) by the problem statement and assumption on \(b\). Then take \(B = f(b) + 1\). Now we are guaranteed that \(a_n a_{n-1} &#8230; a_0\) is the unique (and canonical) base \(B\) expansion of \(f(B)\), from which the polynomial coefficients immediately obtain.</p>
<p>So the two evaluations are at \(f(b)\) and \(f(B=f(b)+1)\).</p>
<p>Example: \(f(b) = 3b^2 + 2b + 1\). Evaluate at, e.g., \(b=1\) to get \(f(1) = 6\). Then evaluate at \(B=f(1)+1=7\) to get \(f(7)=162=321_{7}\).</p>
]]></content:encoded>
			<wfw:commentRss>https://blog.yhuang.org/?feed=rss2&#038;p=291</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
