# real analysis – What is meant by “class of indicator functions”?

In Section 5, "VC-Dimension of the Set of Functions", of the paper *Principles of Risk Minimization for Learning Theory* by V. Vapnik, the author says the following:

The theory of uniform convergence of empirical risk to actual risk developed in the 70’s and 80’s, includes a description of necessary and sufficient conditions as well as bounds for the rate of convergence (Vapnik, 1982). These bounds, which are independent of the distribution function $$P(x, y)$$, are based on a quantitative measure of the capacity of the set of functions implemented by the learning machine: the VC-dimension of the set.

For simplicity, these bounds will be discussed here only for the case of binary pattern recognition, for which $$y \in \{0, 1\}$$ and $$f(x, w), w \in W$$ is the class of indicator functions. The loss function takes only two values $$L(y, f(x, w)) = 0$$ if $$y = f(x, w)$$ and $$L(y, f(x, w)) = 1$$ otherwise. In this case, the risk functional (2) is the probability of error, denoted by $$p(w)$$. The empirical risk functional (3), denoted by $$v(w)$$, is the frequency of error in the training set.
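To check my reading of the loss and the empirical risk in that passage, here is a minimal sketch. The threshold form of `f`, the scalar parameter `w`, and the toy `samples` are my own assumptions for illustration; none of them come from the paper.

```python
def f(x, w):
    """Hypothetical {0, 1}-valued function: 1 if x >= w, else 0 (assumed form, not from the paper)."""
    return 1 if x >= w else 0


def loss(y, prediction):
    """0-1 loss from the quoted passage: 0 if y equals the prediction, 1 otherwise."""
    return 0 if y == prediction else 1


def empirical_risk(w, samples):
    """Frequency of error on the training set, i.e. v(w) in the quoted passage."""
    return sum(loss(y, f(x, w)) for x, y in samples) / len(samples)


# Toy training set of (x, y) pairs with y in {0, 1}, made up for illustration.
samples = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1), (0.3, 1)]

# Evaluate the empirical risk for a few values of the parameter w.
for w in (0.2, 0.5, 0.8):
    print(w, empirical_risk(w, samples))
```

With this toy data, each value of `w` gives its own error frequency, which is how I understand $$v(w)$$ depending on $$w$$.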

It is this part that I am confused about:

For simplicity, these bounds will be discussed here only for the case of binary pattern recognition, for which $$y \in \{0, 1\}$$ and $$f(x, w)$$, $$w \in W$$ is the class of indicator functions.

What exactly is meant by “class of indicator functions”, and which one of $$y \in \{0, 1\}$$ or $$f(x, w)$$ is this “class of indicator functions”? $$f$$ is the only function I see here, but I’m not sure how it’s a “class”. Could this be in reference to “equivalence classes”?