3
$\begingroup$

I have the following problem: given an $m \times n$ binary matrix $A$ like e.g. the following $9 \times 39$ matrix:

111000011100001101000001001111111111000
111000011100001101000000101111111111000
111111111111110010010100000011100000000
000111100011111100110000001100011111010
111111111111110010010010000011100000000
111000011100001101000000011111111111000
000111100011111100110000001100011111100
111111111111110010011000000011100000000
000111100011111100110000001100011111001

I want to find the maximum value of the product $(1+|C_1|)(1+|C_2|)$, where $C_1$ and $C_2$ are any two disjoint subsets of the column indexes of $A$, such that for every $i \in C_1$ and every $j \in C_2$ and for every $1 \le k \le m$, either $a_{ki} = 1$ or $a_{kj} = 1$. It is not required that $C_1 \cup C_2 = [n] = \{1, \ldots, n\}$.

For example, if we permute rows and columns of the above matrix we get:

111111111111111111000000000100100000000
111111111111111111000000000100010000000
111111111111111111000000000100001000000
111111111000000000111111111010000100000
111111111000000000111111111010000010000
111111111000000000111111111010000001000
000000000111111111111111111001000000100
000000000111111111111111111001000000010
000000000111111111111111111001000000001

and then it is clear that $\max{(1+|C_1|)(1+|C_2|)} = 10 \cdot 20 = 200$.

The matrix $A$ can be regarded as the biadjacency matrix of a bipartite graph and $C_1$ and $C_2$ correspond to two complete bipartite subgraphs such that the union of their respective "row" parts gives the "row" part of the bipartite graph corresponding to $A$.

I could try all possible choices for $C_1$ and then evaluate the product based on the $C_2$ that follows, but the algorithm has a time complexity greater than $O(2^n)$.

Can we do better than that? Is there any measure (e.g. some entropy measure) to at least estimate the maximum value?

A possible generalization of the problem would be computing (or estimating) the maximum of the product $\prod_{r=1}^{q}{(1+|C_r|)}$ (with any $q \ge 2$ that maximizes the product) and $C_1, \ldots ,C_q$ disjoint, such that for every $i \in C_s$ and every $j \in C_t$, $1 \le s \lt t \le q$, and for every $1 \le k \le m$, either $a_{ki} = 1$ or $a_{kj} = 1$.

$\endgroup$
1
  • 2
    $\begingroup$ We can define a graph over the columns, where two columns are connected if their union isn't $[m]$. The condition is exactly that there aren't two connected nodes with one in $C_1$ and one in $C_2$. This condition can be realized as a 2-SAT instance, but I'm not sure how you can optimize $(1 + |C_1|)(1 + |C_2|)$ over it. $\endgroup$ Commented Jun 28, 2024 at 14:02

2 Answers 2

4
$\begingroup$

Similarly to what Daniel Weber suggested in the comments, construct a graph $G$ on columns as vertices and connect every pair of vertices with an edge whenever the corresponding columns has at least one 1 in each row. You are interested in a non-induced biclique (i.e. complete bipartite subgraph) in $G$ with partite sets $C_1$ and $C_2$ having the maximum value $(1+|C_1|)(1+|C_2|)$. Such a biclique will necessarily be maximal, and there exist algorithms for finding all maximal bicliques - e.g. see this paper.

Alternatively, one can notice that there are only a finite number (less than $n^2$) values of $(1+|C_1|)(1+|C_2|)$, go over them in decreasing order and search for the corresponding complete bipartite graph(s) as a subgraph in $G$ until one is found.

$\endgroup$
1
  • $\begingroup$ Regarding the alternative method, once I fix a value for the product, how do I search efficiently for the corresponding complete bipartite subgraphs? $\endgroup$ Commented Jun 28, 2024 at 18:16
4
$\begingroup$

You can solve the problem via integer linear programming as follows. Let $$E=\left\{(j_1,j_2)\in[n]\times[n]: j_1 < j_2 \land \lnot \bigwedge_{i=1}^m (a_{ij_1} \lor a_{ij_2})\right\}$$ be the set of edges in the graph described in the comment by Daniel Weber. Let $P=\{1,2\}$ be the set of parts. For $j\in [n]$ and $p\in P$, let binary decision variable $x_{jp}$ indicate whether column $j$ appears in part $p$, so that $C_p=\{j\in[n]:x_{jp}=1\}$. For $p\in P$ and $k\in \{0,\dots,n\}$, let binary decision variable $y_{pk}$ indicate whether part $p$ has cardinality $k$. To linearize the product $(1+|C_1|)(1+|C_2|)$, take logarithms. The problem is to maximize $$\sum_{p \in P} \sum_{k=0}^n \log(1+k) y_{pk}$$ subject to linear constraints \begin{align} \sum_{k=0}^n y_{pk} &= 1 &&\text{for $p \in P$} \tag1\label1\\ \sum_{k=0}^n k y_{pk} &= \sum_{j=1}^n x_{jp} &&\text{for $p \in P$} \tag2\label2\\ \sum_{p \in P} x_{jp} &\le 1 &&\text{for $j \in [n]$} \tag3\label3\\ x_{j_1,p} + x_{j_2,3-p} &\le 1 &&\text{for $(j_1,j_2) \in E$ and $p \in P$} \tag4\label4 \end{align} Constraint \eqref{1} selects exactly one cardinality for $C_p$. Constraint \eqref{2} forces the selected cardinality for $C_p$ to be consistent with $x_{jp}$. Constraint \eqref{3} enforces disjoint parts. Constraint \eqref{4} prevents assigning columns $j_1$ and $j_2$ to different parts. One optimal solution is \begin{align} C_1&=\{1,2,3,8,9,10,15,16,18,27,28,29,30,31,32,33,34,35,36\} \\ C_2&=\{4,5,6,7,11,12,13,14,20\} \end{align} with $|C_1|=19$ and $|C_2|=9$.

$\endgroup$

You must log in to answer this question.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.