$\begingroup$

Let $S_m=\{-1,1\}^m$ be the hypercube of signs. Define the set of "admissible weights" $W_m$ as the subset of $\{w\in\mathbb{R}_+^m : \|w\|^2=m\}$ with a "support property" of the corresponding threshold function $f_w: S_m\to \mathbb{R}$ given by

$$ f_w(s_1,\ldots,s_m)=s_1 w_1+\cdots+s_m w_m\;. $$

For $w$ to be admissible, we require that $f_w(s)\ne 0$ for all $s\in S_m$ and that, for every $i\in\{1,\ldots,m\}$, there exists some $s^*\in S_m$ such that the sign of $f_w$ flips when only the sign of $s^*_i$ is flipped:

$$ \mathrm{sgn}\,f_w(\ldots,s^*_i,\ldots) = -\mathrm{sgn}\,f_w(\ldots,-s^*_i,\ldots)\;.$$
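The two conditions can be checked by brute force for small $m$. Here is a minimal sketch, assuming plain enumeration of all $2^m$ sign vectors; the names `f` and `is_admissible` and the tolerance `tol` are my own illustrative choices, and the normalization $\|w\|^2=m$ is deliberately not checked here.

```python
from itertools import product


def f(w, s):
    """Linear form f_w(s) = s_1 w_1 + ... + s_m w_m."""
    return sum(wi * si for wi, si in zip(w, s))


def is_admissible(w, tol=1e-12):
    """Brute-force test of the two conditions (normalization not checked).

    tol is an assumed numerical threshold for treating f_w(s) as zero.
    """
    m = len(w)
    cube = list(product((-1, 1), repeat=m))

    # Condition 1: f_w(s) must be nonzero on every vertex of the hypercube.
    if any(abs(f(w, s)) <= tol for s in cube):
        return False

    # Condition 2: for each i there must be some s* where flipping
    # only s*_i changes the sign of f_w.
    for i in range(m):
        found = False
        for s in cube:
            flipped = s[:i] + (-s[i],) + s[i + 1:]
            if f(w, s) * f(w, flipped) < 0:
                found = True
                break
        if not found:
            return False
    return True
```

For example, `is_admissible((1, 1, 1))` should hold, while `is_admissible((2, 1))` should fail because flipping the second sign never changes the sign of $f_w$.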

I call this a "support property" because it forces every $w_i$ to be nonzero: only weights with support $m$ are admissible.

Here is the optimization problem:

Characterize the weights $w^*\in W_m$ that maximize the "margin"

$$ \mu_m=\min_{s\in S_m} |s\cdot w^*|\;. $$

Here's what I know so far. We start at $m=3$ since $W_2$ is the empty set.

$m=3: \qquad w^*=(1,1,1)\qquad\mu_3=1$

$m=4: \qquad w^*=\sqrt{4/7}\;(1,1,1,2)\qquad\mu_4=\sqrt{4/7}$

$m=5: \qquad w^*=(1,1,1,1,1)\qquad\mu_5=1$
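These three margins are easy to check numerically. The following is a small sketch, assuming direct enumeration of $S_m$; `margin` and `candidates` are illustrative names of mine, and the script only evaluates the weights listed above. It does not search $W_m$ or prove optimality.

```python
from itertools import product
from math import sqrt


def margin(w):
    """min over s in {-1,1}^m of |s . w|, by full enumeration."""
    return min(
        abs(sum(wi * si for wi, si in zip(w, s)))
        for s in product((-1, 1), repeat=len(w))
    )


# The candidate maximizers quoted above, keyed by m.
candidates = {
    3: (1.0, 1.0, 1.0),
    4: tuple(sqrt(4 / 7) * x for x in (1, 1, 1, 2)),
    5: (1.0,) * 5,
}

for m, w in candidates.items():
    norm_sq = sum(x * x for x in w)  # should equal m under the normalization
    print(m, norm_sq, margin(w))
```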

I suspect this odd-even pattern continues: a unique margin maximizer at odd $m$ and an $m$-fold set of maximizers at even $m$.

Edit:

Boolean threshold functions (BTFs) are miniature models of classification, where the points of a hypercube are partitioned into two sets by a hyperplane. The standard representation of a BTF is like the activation functions of machine learning:

$$ b_w(s)=\mathrm{sgn}(w\cdot s)\;. $$

There's a continuum of $w$'s that represent the same BTF, so it's natural to ask which is best. I think everyone would agree that you want to maximize the margin, the interval over which the "decision" takes place:

$$ \mu=\min_{s\in S_m} |w\cdot s|\;. $$

Geometrically, this is proportional to the width of the thickest slab that separates the two sets of hypercube points. This shows the thickest slabs that arise for $m=3$ :

[Figure: the thickest separating slabs for $m=3$]

The corresponding weights (with the same normalization as above) are:

$$ (\sqrt{3},0,0) $$ $$ (1,1,1) $$

The thickest one, with $\mu=\sqrt{3}$, is not a good choice for a 3-input BTF since it can be implemented as a 1-input BTF, whose only function is to flip or not flip the sign of the single input (like a NOT gate). My "support property" of admissible weights restricts the BTF-support to be $m$. When changing the sign of one input ($s_i$ above) has no effect on the value of the BTF, the corresponding weight can just as well be zero (and so $w$ no longer has support $m$).

Here is what happens for $m=2$. Without loss of generality, take $w=(w_1,w_2)$ with $w_1>w_2$ (when $w_1=w_2$ there exists an $s$ such that $w\cdot s=0$, which is excluded). Then the sign of $w_1 s_1+w_2 s_2$ is independent of $s_2$, which is no different from the 1-input BTF we get by setting $w_2=0$.
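To spell out the step for $m=2$ (a short worked inequality, assuming $w_1>w_2\ge 0$ as above): for every $s\in S_2$,

$$ |w_2 s_2| = w_2 < w_1 = |w_1 s_1| \quad\Longrightarrow\quad \mathrm{sgn}(w_1 s_1+w_2 s_2)=s_1\;. $$

So flipping $s_2$ never flips the sign, the support property fails for $i=2$, and $W_2$ is empty.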

BTFs have been around forever, but my conjectured relationship between support and a bound on the margin seems new.

$\endgroup$
  • $\begingroup$ It is pretty difficult for me to parse your question: it really isn't clear to me "where the quantifiers go". I also don't understand the motivation for the question. I think it would help if you could spend a little more time clarifying. Perhaps spelling out exactly what optimization problem you solved for $m=3$ and $m=4$ would be enough to clarify your question. $\endgroup$ Commented Jul 22 at 18:45
  • $\begingroup$ I don't agree that my quantifiers are ambiguous. The optimization is to maximize a quantity called $\mu_m$ over the continuous domain called $W_m$, where $\mu_m$ is expressed in terms of a minimization over the discrete set $S_m$. But I do agree the question could use some context, so I will add some material at the end. I'll also explain why $W_2$ is empty and how I came up with the unique optimizer for $m=3$. $\endgroup$ Commented Jul 22 at 22:09
