data structures – What is the primary reference for the observation/discussion of how neural networks struggle with ambiguous training datasets?

It is well known that neural networks, such as convolutional neural networks, struggle with pattern recognition when the training set contains ambiguities (i.e., the same pattern appears with several different labels). However, I am struggling to locate the paper that directly discusses this issue or demonstrates it for the first time. If anybody can point me to the relevant paper(s), that would really help. Thanks!
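
To make concrete what I mean by an ambiguous training set, here is a minimal Python sketch. The toy data and the function name `best_achievable_accuracy` are made up for illustration. It does not model a neural network. It only shows that when one pattern carries conflicting labels, no deterministic classifier can fit every training sample.

```python
from collections import Counter, defaultdict


def best_achievable_accuracy(samples):
    """Upper bound on training accuracy for any deterministic classifier.

    `samples` is a list of (pattern, label) pairs. A deterministic model
    maps each pattern to exactly one label, so for each pattern it can be
    right at most as often as that pattern's majority label occurs.
    """
    label_counts = defaultdict(Counter)
    for pattern, label in samples:
        label_counts[pattern][label] += 1

    # Sum the majority-label count over all distinct patterns.
    correct = sum(
        counts.most_common(1)[0][1] for counts in label_counts.values()
    )
    return correct / len(samples)


# Hypothetical toy data: "pattern_a" appears with two different labels.
samples = [
    ("pattern_a", "cat"),
    ("pattern_a", "cat"),
    ("pattern_a", "dog"),
    ("pattern_b", "dog"),
]

print(best_achievable_accuracy(samples))
```

In this toy set, one of the three `pattern_a` samples must be misclassified whichever label the model picks, so the training accuracy cannot reach 100%.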