convex optimization – How do you call a linear programming problem when the solution should be “constrained” to a norm?

(apologies for the n00b question)

Let’s say we have a vector of length $n$, with to-be-determined values: $a_1, a_2, …,a_n$.

And we have information that partial sums of these elements are equal to something, say:
a_1 + a_2 + … + a_{k_1} = A_{1} \
a_{k_1+1} + a_{k_1+2} + … + a_n = B_1 \
a_1 + a_2 + … + a_{k_2} = A_2 \
a_{k_2+1} + a_{k_2+2} + … + a_n = B_2 \
a_1 + a_2 + … + a_{k_m} = A_m \
a_{k_m+1} + a_{k_m+2} + … + a_n = B_m \

Where $m<<n$: so we have much fewer such $m$ equations/constraints than the $n$ unknown values $a_i$.

If we want to know which combination of $a_i$ values can solve these equations, there are probably infinite many such combinations (or 0). So I’d like to add two constraints to this:

  1. $a_i>0$ for any i.
  2. I want the solution with $a_i$ values that are as “similar” to each other as possible. For example, keeping $sum_{i=1}^n (a_i – bar a)^2$ as small as possible (L2 norm). Where $bar a = sum frac{a_i}{n}$.

How is such optimization problem called? (would also love to know how to solve it, but I assume that once I have the name, I can find solvers).

memory hardware – Assembly Language Step By Step: Why deviate from the norm and design computers where the presence of voltage encodes a 0 bit?

That the presence of voltage across a switch encodes 12 is purely arbitrary… Jeff Duntemann’s book mentions:

We could as well have said that the lack of voltage indicates a binary
1 and vice versa (and computers have been built this way for obscure

I find this (italicized part) quite fascinating. It would be great if someone could shed some light on what “obscure” reason(s) may motivate people to do so?

fa.functional analysis – Determine the norm of a continuous linear operator $T:L^1[a,b]to L^1[a,b]$

I have encountered the following exercise:

Let $K(x,y)$ be a measurable function on $(a,b)times (a,b)$. The
function $I:yin (a,b) mapsto int_a^b |K(x,y)|, text{d}x in (0,+infty)$ belongs to $L^infty (a,b)$. Define an operator $T$ by
begin{equation} Tf: xin (a,b)mapstoint_a^b K(x,y) f(y) , text{d} y in (-infty,+infty),
qquad forall fin L^1(a,b).
Show that $T$ is a
continuous linear operator from $L^1(a,b)$ to $L^1(a,b)$, and
begin{equation} |T| = |I|_infty. end{equation}

Clearly, by Fubini’s theorem, we obtain
|Tf|_1& = int_a^b left| int_a^b K(x,y) f(y) , text{d} yright|, text{d}x
leq int_a^b int_a^b left| K(x,y) f(y)right|, text{d} y , text{d}x
\ &=int_a^b |f(y)|int_a^b left| K(x,y)right| , text{d}x , text{d} y
= int_a^b |f(y) I(y)|, text{d} y \& leq |I|_infty |f|_1

for all $fin L^1(a,b)$, thus $|T|leq |I|_infty$.

But I have no idea how to prove the converse inequality. I know we just need to find some $f_varepsilonin L^1(a,b)$ such that $|Tf_varepsilon|_1 geq (|I|_infty-varepsilon) |f_varepsilon|_1$ for any $varepsilon>0$. But I have difficluty in finding such an appropriate function arbitrarily close to attaining the infinity norm of $I$.

Any ideas would be greatly appreciated.

linear algebra – Follow-Up to “Least Squares with Euclidean $(L_2)$ Norm Constaint”

I would have a couple of follow-up questions to the following answer made by Royi:

The setup is the following:

$$ begin{alignat*}{3} text{minimize} & quad & frac{1}{2} left| A
x – b right|_{2}^{2} \ text{subject to} & quad & {x}^{T} x leq 1
end{alignat*} $$

The Lagrangian is given by:

$$ L left( x, lambda right) = frac{1}{2} left| A x – b
right|_{2}^{2} + frac{lambda}{2} left( {x}^{T} x – 1 right) $$

The KKT Conditions are given by:

$$ begin{align*} nabla L left( x, lambda right) = {A}^{T} left(
A x – b right) + lambda x & = 0 && text{(1) Stationary Point} \
lambda left( {x}^{T} x – 1 right) & = 0 && text{(2) Slackness} \
{x}^{T} x & leq 1 && text{(3) Primal Feasibility} \ lambda & geq
0 && text{(4) Dual Feasibility} end{align*} $$

1.) Why is the “Slackness” condition fulfilled, i.e. $lambdaleft( x^{T}x – 1 right) = 0$? After all, we only know that $leftvertleftvert xrightvertrightvert_{2}^{2} leq 1$, correct?

2.) Very related: Where do we know from that $lambdageq 0$?

3.) Here, the Lagrangian was defined by $mathcal L left( x, lambda right) := frac{1}{2} left| A x – b right|_{2}^{2} + frac{lambda}{2} left( {x}^{T} x – 1 right)$. One can also define the Lagrangian by $mathcal L left( x, lambda right) := frac{1}{2} left| A x – b right|_{2}^{2} – frac{lambda}{2} left( {x}^{T} x – 1 right)$, it shouldn’t matter at the end. But in the latter case, would $lambdageq 0$ still hold?

4.) I would have a very general question concerning the KKT theorem: Can one also apply them when one deals with equality constraints?

Scalar-by-vector derivative involving L2 norm and Hadamard product

I have a function $f(mathbf{x})$: $mathbb{R}^N rightarrow mathbb{R}$ given by:

$f(mathbf{x}) = lvert mathbf{A}(mathbf{x} circ mathbf{x})-mathbf{b} rvert^2$.

with $mathbf{A} in mathbb{R}^{Ntimes N}$ a constant matrix, $mathbf{b}$ the known vector $in mathbb{R}^N$, $lvert cdot rvert^2$ the $ell_2$ norm and $circ$ the hadamard product. I would like to know the expression about the gradient of $f$ with respect to $mathbf{x}$.

L1 Norm Optimization Solution – Mathematics Stack Exchange

I am trying to find the solution for the following optimization problem:

$max_{w} {z^Tw – lambda ||w – w_0||_1}$
where $z, w, w_0 in R^{Nx1}$ and $z, w_0$ are known.

We let $s$ be the subgradient of $lambda ||w – w_0||_1, hspace{2mm} s in { {u | u^T(w – w_0) = lambda ||w – w_0||_1, ||u||_{infty} leq lambda } }$.

If we write $lambda ||w – w_0||_1 = max_{||s||_{infty} leq lambda} s^T(w – w_0)$, the objective function can now be written as:
$max_{w} min_{||s||_{infty} leq lambda} {z^Tw – s^T(w – w_0)}$
$min_{||s||_{infty} leq lambda} max_{w} {z^Tw – s^T(w – w_0)}$.

The first order condition for the inner objective function is:
$0 = z – s hspace{1cm} (1)$
Substituting (1) into the objective function,
$min_{||s||_{infty} leq lambda} s^Tw_0$.

Hence, this objective function can be simplified as,
$min_s s^Tw_0 text{ s.t } -lambda leq s leq lambda$.

However, if all the steps above are correct (which I am not very certain about), I am not sure how the solution of s from the “simplified” optimization above will help me solve for w.
The sequence of steps taken to solve the initial optimization problem was inspired by the following paper.

Any help on this is very much appreciated. Thank you.

mg.metric geometry – A generalized norm function in $mathbb{R}^n$

We defined a new norm. The norm of $x in mathbb{R}^n$ is defined as
$$ N_P(x) = min {t geq 0 : x in tcdot P} enspace,$$
where $P$ is a centrally symmetric and convex body centered at the origin point.

We prove that it is a norm.

1.Identity of indiscernibles.
Obviously, $N_P(x)=0 Leftrightarrow x=0$.

2.Absolutely scalable.

Because of centrally symmetric property, $N_P(ax)=|a|N_P(x)$.

3.Triangle inequality.

Denote $N_P(x+y), N_P(x),N_P(y)$ as $t_0,t_1,t_2$ respectively. And let vectors $x+y,x,y$ go from the origin point and hit the border of $P$ at $a,b,c$ respectively.

Therefore $(x+y)=x+y$ implies $t_0vec{a}=t_1vec{b}+t_2vec{c}$, implies $vec{a}=frac{t_1}{t_0}vec{b}+frac{t_2}{t_0}vec{c}$

Suppose $t_0 > t_1+t_2$, thus $0leq frac{t_1}{t_0}+frac{t_2}{t_0}<1$.

However, this contradict to the convex property because border $bac$ is not convex. QED

We realized this new norm consists all possible norms in $mathbb{R}^n$, including $ell_p$.

Because for any norm $N(cdot)$, define $P={ x : N(x)leq 1}$, one can verify that $N_P=N$. It shows a simple fact: a norm is equivalent to the space which has unit norm.

Our question is, did anyone discover it before? What is the name? We googled it but did not get answers.

linear algebra – Derivative of a norm

I learned not use the Norm() function when computing vector derivative, so I use the dot product instead:

In: D(x.x, x)
Out: 1.x + x.1

What does the result mean? Is 1=(1, 1, .., 1) here? Why can’t it show just 2x as the result?
And Mathematica won’t resolve it when I define x?

In: 1.x + x .1 /. x -> {3, 4}
Out: {0.3 + 1.{3, 4}, 0.4 + 1.{3, 4}}