complexity theory – NP-hardness proofs and approximation classes of optimization problems with real values

I’m studying complexity theory and I have some questions regarding $NP$-harndess proofs of optimization problems with real values, and whether approximation classes can be applied for these problems too. Any reference is much appreciated.

For the questions, take the balanced MSSC problem as an example:

Balanced MSSC ($mathcal{P}$) – Given a set of $n$ points $X={x_1, dots, x_n } subset mathbb{R}^n$ find a partition $mathcal{C} = { C_1, dots, C_K }$ of $K$ clusters with equal cardinalities which minimizes $sum_{k=1}^K sum_{i: x_i in C_k} | x_i – c_k |^2$, where $c_k$ is the centroid of the points $x_i in C_k$.

Pyatkin et al. (2017) prooves the $NP$-hardness (please see Definition 3) of this optimization problem for $n/K=3$. For this, they use a polynomial-time reduction from the Partition into Triangles (PiT) $NP$-complete problem to a decision problem $mathcal{P}_D$ which is restricted to vectors $x_i in {0,1}^q$. They proove that solving PiT is equivalent to finding a partition in problem $mathcal{P}_D$ for which the value of the measure function ($m_{mathcal{P}}$) is smaller than or equal to some $W in mathbb{Q}$. According to my understanding $NP$-hardness result follows because in this way we could use the optimization problem as an oracle to solve the PiT problem.

Q1 – Defition 1 (please see below) does not include optimization problems with measure functions with real values. I assume the definition ‘naturally’ extends to this case and the above approach can be used because $W in mathbb{Q}$ in $mathcal{P}_D$, and the oracle is assumed to do the calculation in one step no matter if the measure function is real- or rational-valued. Is it correct?

Q2 – Let’s suppose we have an optimization problem with a real-valued function and we have a reduction from an $NP$-complete problem to a decision problem $mathcal{P}_D’$ with $W in mathbb{R}$. For the sake of argument let’s say the measure function is $sum_{k=1}^K sum_{i: x_i in C_k} | x_i – c_k |^p$ for some $p in mathbb{R}$ fixed, and $W$ is of the form $w^p, w in mathbb{Q}$. Can we conclude an $NP$-hardness result here in any way?

I’ve found this NP-hardness of an optimization problem with real value related question. In case of $mathcal{P}_D’$ and for a fixed $p$ the value of $W$ comes from a countable set so I think the $NP$-hardness follows but I’m not sure I understood everything correctly.

Also, Manyem (2013) uses an $NP$-optimization problem definition, where the measure function takes values in $mathbb{R}_0^{+}$. He notes that “when it comes to computer representation, rational numbers will be used”. At the same time he defines the decision version of the optimization problem, which also uses a constant $W in mathbb{R}_0^{+}.$ So I wonder whether a precise argument for $NP$-hardness could be made using this computer representation too.

Q3 – Can an optimization problem $mathcal{P}$ with a real-valued measure function $m_{mathcal{P}}$ be in $NPO$ (please see Definition 2), $APX$, etc.?

If my understanding is correct a measure function with arbitrary real values cannot be calculated in polynomial time, hence the optimization problem cannot be in $NPO$. I wonder whether a computer representation argument (or anything else) can be applied here, so that $mathcal{P}$ may be in any approximation classes.

— Key definitions from Ausiello et al. (2003) —

Definition 1 – An optimization problem $mathcal{P}$ is characterized by the following quadruple of objects $(I_{mathcal{P}}, {SOL}_{mathcal{P}}, m_{mathcal{P}}, {goal}_{mathcal{P}})$, where:

  1. $I_{mathcal{P}}$ is the set of instances of $mathcal{P}$;
  2. ${SOL}_{mathcal{P}}$ is a function that associates to any input instance $x in I_{mathcal{P}}$ the set of feasible solutions of $x$;
  3. $m_{mathcal{P}}$ is the measure function, defined for pairs $(x,y)$ such that $x in I_{mathcal{P}}$ and $y in {SOL}_{mathcal{P}}(x)$. For every such pair $(x,y)$, $m_{mathcal{P}}$ provides a positive integer which is the value of the feasible solution $y$;
  4. ${goal}_{mathcal{P}} in { min, max }$ specifies whether $mathcal{P}$ is a maximization or a minimization problem.

Ausiello et al. (2003) makes a note at point 3 that in practice for several problems the measure function is defined to have values in $mathbb{Q}$, and that it is equivalent with the above one.
Denote the value of an optimal solution of $x$ by $m^{*}(x)$. Then the corresponding decision problem to $mathcal{P}$ is the following:

Decision Problem ($mathcal{P}_D$) – Given an instance $x in I$ and a positive integer $W in mathbb{Z}^{+}$, decide whether $m^{*}(x) geq W$ (if goal $= max$) or whether $m^{*}(x) leq W$ (if goal $= min$). If goal $= max$, the set ${ (x,W) | x
in I wedge m^{*}(x) geq W }$
(or ${ (x,W) | x
in I wedge m^{*}(x) leq W }$
if goal $= min$) is called the underlying language of $mathcal{P}$.

Based on the above note I assume when the measure function is defined to have values in $mathbb{Q}$ then the variable $W$ may also have values in $mathbb{Q}$ in the above definition.

Definition 2 – An optimization problem $mathcal{P} = (I, SOL,m,goal)$ belongs to the class $NPO$ if the following holds:

  1. the set of instances $I$ is recognizable in polynomial time;
  2. there exists a polynomial $q$ such that, given an instance $x in I$, for any $y in SOL(x), |y| leq q(|x|)$ and, besides, for any $y$ such that $|y| leq q(|x|)$, it is decidable in polynomial time whether $y in SOL(x)$;
  3. the measure function $m$ is computable in polynomial time.

Definition 3 – An optimization problem $mathcal{P}$ is called $NP$-hard if, for every decision problem $mathcal{P}’ in NP, mathcal{P}’ leq_{T}^{p} mathcal{P}$, that is, $mathcal{P}’$ can be solved in polynomial time by an algorithm which uses an oracle that, for any instance $x in I_{mathcal{P}}$, returns an optimal solution $y^{*}(x)$ of $x$ along with its value $m^{*}_{mathcal{P}} (x)$.

References:

Ausiello, G., Crescenzi, P., Gambosi, G., Kann, V., Marchetti-Spaccamela, A. and Protasi, M. (2003). Complexity and approximation: Combinatorial optimization problems andtheir approximability properties. Springer-Verlag Berlin Heidelberg 1999. DOI: 10.1007/978-3-642-58412-1.

Manyem, P. (2013). Syntactic expressions to express NP-hard optimization problems and problems with zero duality gap. A Journal of Mathematical Programming and Operations Research. 62(9):1227-1246. DOI: 10.1080/02331934.2011.625027.

Pyatkin, A., Aloise, D. and Mladenović, N. (2017). NP-hardness of balanced minimum sum-of-squares clustering. Pattern Recognition Letters, 97:44–45. DOI: 10.1016/j.patrec.2017.05.033.