Looking for references for real-world scenarios of data-poisoning attacks on labels in supervised learning

Consider the following mathematical model of training a neural net. Suppose $f_{w} : \mathbb{R}^n \rightarrow \mathbb{R}$ is a neural net with weights $w$. Suppose that during training an adversary samples $x \sim \mathcal{D}$ from some distribution $\mathcal{D}$ on $\mathbb{R}^n$ and sends in training data of the form $(x, \theta(x) + f_{w^*}(x))$, i.e., the adversary corrupts the true labels generated by $f_{w^*}$ (for some fixed $w^*$) by adding a real number $\theta(x)$ to them.
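
To make the setup concrete, here is a minimal simulation sketch of this corruption model in PyTorch. The choice of architecture, of $\mathcal{D}$ (a standard Gaussian), and of $\theta(x)$ (a sparse, large-magnitude additive perturbation) are all illustrative assumptions on my part, not part of the framework itself:

```python
import torch
import torch.nn as nn

n = 10  # input dimension; arbitrary choice for illustration

# Stand-in for the fixed ground-truth network f_{w*}: R^n -> R
# (the architecture here is an arbitrary assumption).
f_w_star = nn.Sequential(nn.Linear(n, 32), nn.ReLU(), nn.Linear(32, 1))

def sample_poisoned_batch(batch_size, corruption_scale=5.0, corrupt_frac=0.2):
    """Draw x ~ D and return the poisoned pair (x, theta(x) + f_{w*}(x)).

    D is taken to be a standard Gaussian and theta(x) is a sparse,
    large-magnitude additive corruption -- both are illustrative choices.
    """
    x = torch.randn(batch_size, n)                    # x ~ D
    with torch.no_grad():
        clean_y = f_w_star(x).squeeze(-1)             # true labels f_{w*}(x)
    mask = (torch.rand(batch_size) < corrupt_frac).float()
    theta_x = mask * corruption_scale * torch.randn(batch_size)
    return x, clean_y + theta_x                       # corrupted labels
```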

Now suppose we want an algorithm that uses such corrupted training data and still gets as close to $w^*$ as possible, i.e., despite seeing data corrupted in the above way, the algorithm tries to minimize (over $w$) the "original risk" $\mathbb{E}_{x \sim \mathcal{D}} \left[ \frac{1}{2} \left( f_w(x) - f_{w^*}(x) \right)^2 \right]$ as well as possible.
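
For reference, under the same illustrative assumptions as the sketch above, the original risk could be estimated by Monte Carlo as follows. (This is only for evaluation; the learner itself cannot compute this quantity, since it never observes the uncorrupted $f_{w^*}(x)$.)

```python
def original_risk(f_w, f_w_star, num_samples=10_000, n=10):
    """Monte Carlo estimate of E_{x ~ D}[ (1/2) (f_w(x) - f_{w*}(x))^2 ],
    with D again assumed to be a standard Gaussian."""
    x = torch.randn(num_samples, n)
    with torch.no_grad():
        gap = f_w(x).squeeze(-1) - f_w_star(x).squeeze(-1)
        return 0.5 * (gap ** 2).mean().item()
```

One would presumably train $f_w$ on the poisoned pairs with some robust objective (e.g., a Huber or trimmed loss) and use an estimate like this only to measure how well the clean risk has been recovered.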

  • Is there a real-life deep-learning application that comes close to the above framework or can motivate the above algorithmic aim?