entropy of English

This video on model scaling highlights an interesting point: use the best available models (nowadays, LLMs) to estimate the entropy rate of a source, in this case the English language. This isn't a new idea at all; empirical entropy measurements have been done in the past.

What’s interesting is that past estimates of the entropy of English were way, way higher. Shannon’s original estimates, for example, were 11.82 bits per word, or 2.62 bits per letter.
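As a toy illustration of the estimation idea (my own sketch, not from the video): a model's average negative log-probability on text upper-bounds the source's entropy rate, so better models push the estimate down. The function name `bits_per_char` and the sample text below are mine; a context-free letter-frequency model like this one lands in the 4-to-5-bit range, far above what a strong contextual model would give.

```python
import math
from collections import Counter

def bits_per_char(text, model_probs):
    """Cross-entropy of `text` under a character model, in bits per character.

    Any model's average negative log2-probability upper-bounds the true
    entropy rate of the source; tighter models give tighter estimates.
    """
    return -sum(math.log2(model_probs[c]) for c in text) / len(text)

# Toy example: a unigram (letter-frequency) model fit on the text itself.
text = "the quick brown fox jumps over the lazy dog".replace(" ", "")
counts = Counter(text)
unigram = {c: n / len(text) for c, n in counts.items()}

estimate = bits_per_char(text, unigram)
# A unigram model ignores all context, so the estimate stays well above
# Shannon's conditional (context-aware) figures for English.
print(f"{estimate:.2f} bits per character")
```

Swapping in an LLM's per-token probabilities in place of `unigram` is exactly the modern version of this measurement.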
(Read the article)

data structure problem

Another problem by fakalin.

A data structure has the entropy bound if all queries have amortized time \(O(\sum_k p_k \log 1/p_k)\), where \(p_k\) is the fraction of the time that key \(k\) is queried. It has the working-set property if the time to search for an element \(x_i\) is \(O(\log t_i)\), where \(t_i\) is the number of distinct elements queried since the last access to \(x_i\). Prove that the working-set property implies the entropy bound.

This isn’t really a data structure problem, per se.
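A sketch of one standard line of attack (my reading, not necessarily the article's proof; write the search cost as \(O(1 + \log t_i)\), and ignore each key's first access, which contributes only lower-order terms): over a sequence of \(m\) queries, key \(k\) is accessed \(p_k m\) times, and the gaps between consecutive accesses to \(k\) partition the sequence, so \(\sum_{i : x_i = k} t_i \le m\). By concavity of \(\log\) (Jensen's inequality),

\[
\sum_{i : x_i = k} \log t_i \;\le\; p_k m \,\log\!\left( \frac{1}{p_k m} \sum_{i : x_i = k} t_i \right) \;\le\; p_k m \,\log \frac{1}{p_k}.
\]

Summing over \(k\) bounds the total time by \(O\big(m + m \sum_k p_k \log(1/p_k)\big)\), which is the entropy bound amortized per query. The content is really just Jensen's inequality, not anything about data structures.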
(Read the article)