# neural networks – Looking for references for real-world scenarios of data-poisoning attack on labels while doing supervised learning

Consider the following mathematical model of training a neural net: suppose $$f_{w} : \mathbb{R}^n \rightarrow \mathbb{R}$$ is a neural net with weights $$w$$. Suppose that during training the adversary samples $$x \sim \mathcal{D}$$ from some distribution $$\mathcal{D}$$ on $$\mathbb{R}^n$$ and sends in training data of the form $$(x, \theta(x) + f_{w^*}(x))$$, i.e. the adversary corrupts the true labels generated by $$f_{w^*}$$ (for some fixed $$w^*$$) by adding a real number $$\theta(x)$$ to each of them.
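For concreteness, here is a minimal sketch (in PyTorch) of how such poisoned training data could be generated. The architecture of $$f_{w^*}$$, the choice of $$\mathcal{D}$$ as a standard Gaussian, and the particular corruption $$\theta$$ are all illustrative assumptions on my part, not part of the question.

```python
import torch
import torch.nn as nn

n = 10  # input dimension (illustrative)
torch.manual_seed(0)

# A small teacher net playing the role of f_{w*}; its weights w* are fixed.
teacher = nn.Sequential(nn.Linear(n, 32), nn.ReLU(), nn.Linear(32, 1))
for p in teacher.parameters():
    p.requires_grad_(False)

def theta(x):
    # Assumed corruption: a bounded, input-dependent shift chosen by the adversary.
    return 0.5 * torch.sin(x.sum(dim=1, keepdim=True))

def sample_poisoned_batch(batch_size=128):
    x = torch.randn(batch_size, n)   # x ~ D (standard Gaussian, an assumption)
    y = theta(x) + teacher(x)        # corrupted label: theta(x) + f_{w*}(x)
    return x, y
```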

Now suppose we want an algorithm which uses such corrupted training data and tries to get as close to $$w^*$$ as possible, i.e. despite receiving data corrupted in the above way, the algorithm tries to minimize (over $$w$$) the “original risk” $$\mathbb{E}_{x \sim \mathcal{D}} \left[ \frac{1}{2} \left( f_w(x) - f_{w^*}(x) \right)^2 \right]$$ as best as possible.
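Continuing the sketch above, a naive baseline would be to run plain ERM on the poisoned labels while monitoring a Monte Carlo estimate of the original risk; an algorithm in the sense of the question would aim to drive this original risk lower than naive ERM does, despite never seeing clean labels. The training loop below (which reuses `teacher`, `sample_poisoned_batch`, and `n` from the previous block) is only this baseline, not a robust method.

```python
# A student net playing the role of f_w, trained on poisoned data.
student = nn.Sequential(nn.Linear(n, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.SGD(student.parameters(), lr=1e-2)

def original_risk(num_samples=4096):
    # Monte Carlo estimate of E_{x~D}[ (1/2) (f_w(x) - f_{w*}(x))^2 ],
    # i.e. the clean risk the algorithm actually wants to minimize.
    with torch.no_grad():
        x = torch.randn(num_samples, n)
        return 0.5 * ((student(x) - teacher(x)) ** 2).mean().item()

for step in range(1000):
    x, y = sample_poisoned_batch()
    loss = 0.5 * ((student(x) - y) ** 2).mean()  # naive ERM on poisoned labels
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 200 == 0:
        print(f"step {step}: poisoned loss {loss.item():.4f}, "
              f"original risk {original_risk():.4f}")
```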

• Is there a real-life deep-learning application which comes close to the above framework, or which can motivate the above algorithmic aim?