Let $x_1,\dots,x_n$ be a set of given vectors in $\mathbb{R}_{+}^d$. Let $c_1,\dots,c_n$ be given positive constants. I am interested in finding the vectors $w_1,\dots,w_n$ in $\mathbb{R}_{+}^d$ that solve the optimization problem \begin{align} \max_{w_1,\dots,w_n}\sum_{i=1}^{n}c_i\frac{\exp(w_i^Tx_i)}{\sum_{j=1}^{n}\exp(w_j^Tx_i)} \end{align} I am not even sure how to start on this, and I currently use an out-of-the-box optimization algorithm. Is this type of problem known in the literature? What would be a good starting point?
1 Answer
Here is an approach which is quite possibly sub-optimal, but I hope it helps.
Let me use the weighted form of the AM-GM inequality (see "Proofs from THE BOOK" for a beautiful proof) to get a lower bound on your cost function. Given positive numbers $a_1,\dots,a_n$ and positive weights $p_1,\dots,p_n$ such that $\sum^n_{k=1}p_k = 1$, the following is true: $$ p_1a_1+\cdots+p_na_n \geq a_1^{p_1}a_2^{p_2}\cdots a_n^{p_n}. $$ Applying this inequality with $a_i = \frac{\exp(w_i^Tx_i)}{\sum_{j=1}^{n}\exp(w_j^Tx_i)}$ and $p_i = \tilde{c}_i := \frac{c_i}{\sum^n_{k=1}c_k}$, we get: $$ \sum_{i=1}^{n}\left(\frac{c_i}{\sum^n_{k=1}c_k}\right)\frac{\exp(w_i^Tx_i)}{\sum_{j=1}^{n}\exp(w_j^Tx_i)} = \sum_{i=1}^{n}\tilde{c}_i\frac{\exp(w_i^Tx_i)}{\sum_{j=1}^{n}\exp(w_j^Tx_i)} \geq \prod_{i=1}^n \left(\frac{\exp(w_i^Tx_i)}{\sum_{j=1}^{n}\exp(w_j^Tx_i)}\right)^{\tilde{c}_i}. $$ The left-hand side is the original objective divided by the constant $\sum^n_{k=1}c_k$, so maximizing the right-hand side maximizes a lower bound on the (rescaled) original objective. I will use this as a relaxation of the original problem. To that end, we maximize the $\log$ of the right-hand side instead, since composing with a strictly increasing function does not change the maximizers. Thus, the relaxed problem is given by (after taking the $\log$): $$ \max_{w_1,\dots,w_n} \left\{\left(\sum^n_{i=1}\tilde{c}_i\left(w_i^Tx_i\right)\right) - \sum^n_{i=1}\tilde{c}_i\log\left(\sum_{j=1}^{n}\exp(w_j^Tx_i)\right)\right\}, ~\mbox{subject to}~ w_i\geq 0~\forall i. $$ The objective is concave (a linear term minus a nonnegative combination of log-sum-exp terms, each of which is convex), so this is a convex optimization problem that can be handed to CVXPY or a similar solver.
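To make the relaxed problem concrete, here is a minimal numerical sketch. It does not use CVXPY; it swaps in plain projected gradient ascent with NumPy, which is enough for a concave objective with the simple constraint $w_i \geq 0$. The function names, the random test data, the step size, and the iteration count are all assumptions chosen for illustration, not part of the original problem.

```python
import numpy as np

def softmax_rows(S):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    E = np.exp(S - S.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)

def original_objective(W, X, c):
    """sum_i c_i * exp(w_i^T x_i) / sum_j exp(w_j^T x_i)."""
    P = softmax_rows(X @ W.T)            # P[i, j] is built from the score w_j^T x_i
    return float(c @ np.diag(P))

def relaxed_objective(W, X, c_tilde):
    """Log of the AM-GM lower bound: sum_i c~_i * log P[i, i]."""
    P = softmax_rows(X @ W.T)
    return float(c_tilde @ np.log(np.diag(P)))

def relaxed_gradient(W, X, c_tilde):
    """Row k is c~_k x_k - sum_i c~_i P[i, k] x_i."""
    P = softmax_rows(X @ W.T)
    CX = c_tilde[:, None] * X
    return CX - P.T @ CX

def projected_gradient_ascent(X, c, steps=500):
    """Ascent on the relaxed objective, projecting onto w_i >= 0 after each step."""
    n, d = X.shape
    c_tilde = c / c.sum()
    W = np.zeros((n, d))
    # Conservative step size: the gradient is Lipschitz with constant at most
    # 0.5 * max_i ||x_i||^2, so this step stays below 1/L.
    step = 1.0 / np.max(np.sum(X ** 2, axis=1))
    # A finite maximizer need not exist, so we simply run a fixed number of steps.
    for _ in range(steps):
        W = np.maximum(W + step * relaxed_gradient(W, X, c_tilde), 0.0)
    return W

# Hypothetical data: n = 5 nonnegative vectors in R^3 and positive weights.
rng = np.random.default_rng(0)
X = rng.uniform(0.1, 1.0, size=(5, 3))
c = rng.uniform(0.5, 2.0, size=5)
c_tilde = c / c.sum()

W_hat = projected_gradient_ascent(X, c)
print("relaxed objective at W = 0:  ", relaxed_objective(np.zeros_like(X), X, c_tilde))
print("relaxed objective at W_hat:  ", relaxed_objective(W_hat, X, c_tilde))
print("original objective at W_hat: ", original_objective(W_hat, X, c))
print("upper bound sum(c):          ", c.sum())
```

With CVXPY, the same objective could be written using its `log_sum_exp` atom; the hand-rolled loop above is only meant to show the structure of the objective and its gradient.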
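As far as the extent of sub-optimality is concerned, note that $\sum_{i=1}^n c_i$ is a natural upper bound for the cost function, since each ratio is at most $1$. Thus the relaxation gap is at most the difference between $\sum_{i=1}^n c_i$ and the original cost evaluated at the $\arg\max$ of the relaxed problem. Also, because the relaxed objective is a lower bound, if the relaxed problem were unbounded, the original problem would be unbounded as well. Here both are in fact bounded above (by $\sum_{i=1}^n c_i$, and by $0$ after the $\log$), so the practical concern is rather that the supremum may only be approached as some $w_i$ grow without bound.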
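The bound on the gap can be written out as a short chain of inequalities. The symbols below are notation introduced here for illustration: $f$ is the original cost function, $g$ is the relaxed objective after taking the $\log$, $\hat{w}$ is a maximizer of the relaxed problem (assuming one exists), and $f^\star$ is the optimal value (supremum) of the original problem. $$ \left(\sum_{k=1}^n c_k\right) e^{g(\hat{w})} \;\leq\; f(\hat{w}) \;\leq\; f^\star \;\leq\; \sum_{i=1}^n c_i, \qquad\mbox{hence}\qquad 0 \;\leq\; f^\star - f(\hat{w}) \;\leq\; \sum_{i=1}^n c_i - f(\hat{w}). $$ The first inequality is the AM-GM bound multiplied by $\sum_{k=1}^n c_k$, the second holds because $\hat{w}$ is feasible for the original problem, and the third is the natural upper bound. So evaluating the original cost at $\hat{w}$ gives a computable estimate of how far it can be from optimal.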