bounding overlaps

A Venn diagram gives a schematic view of joint counts on a set of \(n\) categories, e.g. \(c(S_1^n=s_1^n)\) where \(s_i\in\{0,1\}\). Each “patch” of the diagram corresponds to one of \(2^n\) possible values of \(s_1^n\).

If we have the total count \(C\triangleq \sum_{s_j: j\in \{1,…,n\}} c(S_1^n=s_1^n)\), then we can take the counts as probabilities by normalizing with \(p(S_1^n=s_1^n)=c(S_1^n=s_1^n)/C\).

Suppose we are given only singleton marginals \(p(S_i=1)\triangleq \sum_{s_j: j\in \{1,…,n\}\backslash i} p(S_1^n=s_1^n)\), then we can bound the other probabilities by imposing universal constraints on probabilities to be between 0 and 1.

To get the bounds, we solve a linear programming problem. The standard form is \(\min_x f’x \ \ \mathrm{s.t.}\ Ax \leq b\). Here, \(f\) is a length-\(2^n\) vector that indicates the particular marginal probability we want to bound, by selecting among the \(s_1^n\)-sequences. \(A\) is \([I_{2^n\times 2^n}; B; \mathbf{1}_{1\times 2^n}]\) where the columns correspond to all possible \(s_1^n\)-sequences, and where the rows of \(B\) are the \(n\) half-1 masks indicating the \(n\) singleton marginals. Finally, let \(u=[\mathbf{1}_{2^n\times1}; p(S_1=s_1);...;p(S_n=s_n); 1]\) and \(l=[\mathbf{0}_{2^n\times1}; p(S_1=s_1);...;p(S_n=s_n); 1]\).

Hence to get the lower bound on the desired marginal, we solve \(\min_x f’ x \ \ \mathrm{s.t.}\ A x \leq u\ \mathrm{and}\ -A x \leq -l\). Likewise, for the upper bound, \(-\min_x -f’ x \ \ \mathrm{s.t.}\ A x \leq u\ \mathrm{and}\ -A x \leq -l\).

For example, \(n=2\), \(p(S_1=1)=0.6\), \(p(S_2=1)=0.5\), then we can bound \(p(S_1=1,S_2=1)\) by \(0.1\le p(S_1=1,S_2=1)\le 0.5\).

Of course, we can add additional constraints when we know them. In some cases with categories, we have that \(p(S_1^n=0)=0\), that is, everything must belong to at least one category. Then it would appear the doublet marginal \(p(S_i=1,S_j=1)\) satisfies
\(\max \{0, p(S_i=1)+p(S_j=1)-1\}\le p(S_i=1,S_j=1)\)
\(\le \min \{(\sum_{i=1}^n p(S_i=1))-1, p(S_i=1), p(S_j=1)\}\).

No comments yet. Be the first.

Leave a reply