different kind of coupon collector’s problem

The classic coupon collector’s problem asks for the number of samples it takes to collect all coupon types from a sequence of coupons in which each of the \(k\) types of coupons has an unlimited number of copies.

Here is a different kind of problem: if there are limited copies of each of the \(k\) coupon types (say, \(d\) copies), how many samples does it take to collect \(t\) coupon types?

By the pigeonhole principle, we have two extremes: from \(n=0\) to \(n=t-1\) samples, it is impossible to collect \(t\) coupon types; and for \(n>(t-1)d\) samples, it is impossible not to. For number of samples in between, there is some monotonically decreasing probability that \(t\) or more types are not collected. Let’s call the latter probability \(P(n)\). The expected number of samples it takes to collect \(t\) types is then

\(\sum_{n=1}^{(t-1)d} P(n)\)

Finding \(P(n)\) for \(n\ge t\) doesn’t appear easy, though the combinatorics could be written down (in theory), maybe involving partition functions, Stirling numbers, or whatever. But the question is reduced to the following, which somebody must have solved: given a \(d \times k\) 0-1 matrix, in which \(n\) 1′s are placed randomly, what is the probability that exactly \(z\) columns are 0-weight (have 0 column sums)?

These papers give some approximations:



Does anybody know the actual answer?

No comments yet. Be the first.

Leave a reply