# machine learning – Axis aligned rectangles: why is A an ERM in the case of infinite domain?

I’m working on problem 2.3(a) in Shalev-Shwartz and Ben-David’s machine learning textbook, which states:

An axis aligned rectangle classifier in the plane is a classifier that assigns the value 1 to a point if and only if it lies inside a certain rectangle. Formally, given real numbers $$a_1 \leq b_1$$, $$a_2 \leq b_2$$, define the classifier $$h_{(a_1, b_1, a_2, b_2)}$$ by
$$h_{(a_1, b_1, a_2, b_2)}(x_1, x_2) = \begin{cases}1 & \text{if } a_1 \leq x_1 \leq b_1 \text{ and } a_2 \leq x_2 \leq b_2\\ 0 & \text{otherwise.}\end{cases}$$
The class of all axis aligned rectangles in the plane is defined as $$\mathcal{H}_{\mathrm{rec}}^2 = \{h_{(a_1, b_1, a_2, b_2)} : a_1 \leq b_1 \text{ and } a_2 \leq b_2\}$$ …rely on the realizability assumption.
Let $$A$$ be an algorithm that returns the smallest rectangle enclosing all positive examples in the training set. Show that $$A$$ is an ERM (empirical risk minimizer).
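To make sure I understand the algorithm, here is my reading of $$A$$ in code (the function names are mine): it takes the coordinate-wise min and max of the positively labeled points and returns the tightest enclosing axis-aligned rectangle.

```python
def tightest_rectangle(S):
    """A: return the smallest axis-aligned rectangle (a1, b1, a2, b2)
    enclosing all positive examples in S, a list of ((x1, x2), y) pairs."""
    pos = [x for x, y in S if y == 1]
    x1s = [p[0] for p in pos]
    x2s = [p[1] for p in pos]
    return min(x1s), max(x1s), min(x2s), max(x2s)

def h(rect, x):
    """The classifier h_{(a1, b1, a2, b2)} induced by a rectangle."""
    a1, b1, a2, b2 = rect
    return 1 if a1 <= x[0] <= b1 and a2 <= x[1] <= b2 else 0
```

For example, on `S = [((0, 0), 1), ((2, 3), 1), ((5, 5), 0)]` this returns the rectangle `(0, 2, 0, 3)`, which classifies the two positive points as 1 and the negative point as 0.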

I don’t understand why $$A$$ would be an ERM when the domain/instance space is infinite. Below is the counterexample I considered to this claim.

For simplicity, consider the one-dimensional case, where the rectangles are intervals. Sample an instance $$x \sim \mathcal{U}_{[0, 1]}$$ from the uniform distribution over $$[0, 1]$$, where the labeling function $$f : [0, 1] \to \{0, 1\}$$ is defined by
$$f(x) = \begin{cases} 1 & \text{if } x = 0 \text{ or } x = 1\\ 0 & \text{otherwise.} \end{cases}$$
Then the realizability assumption holds: there exists an $$h^*$$, whose rectangle is the singleton $$\{1\}$$, with error set $$\{0\}$$ of measure $$0$$. However, for the particular sample
$$S = \{0, 1/4, 1/2, 3/4, 1\},$$
the smallest interval returned by $$A$$ is $$[0, 1]$$, and the resulting classifier $$h_A$$ has empirical error $$L_S(h_A) = 3/5$$. In contrast, the classifier $$h^*$$ has empirical error $$L_S(h^*) = 1/5$$, lower than that of the classifier returned by $$A$$, even though its rectangle doesn’t contain all positive examples. Therefore $$A$$ does not minimize empirical risk.