More important than typeface, size, and style is the use of space. Your readers need to see at a glance that this text belongs to the image as its caption. Try a narrow gap between the image and its caption and about 2em between that pair and the surrounding text, then check whether the caption still reads as part of the image.
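As a rough sketch of that spacing in CSS, assuming the image and its caption are wrapped in `figure` and `figcaption` elements (the markup and the 0.5em inner gap are assumptions for illustration, not values from the text):

```css
/* Assumed markup: <figure><img ...><figcaption>...</figcaption></figure> */
figure {
  /* Generous space separates the whole figure from the surrounding text. */
  margin: 2em 0;
}

figure img {
  /* Block display removes the inline baseline gap under the image. */
  display: block;
}

figcaption {
  /* A narrow gap keeps the caption visually attached to the image. */
  margin-top: 0.5em; /* assumed value; adjust by eye */
}
```

The point of the sketch is the ratio: the gap inside the figure is clearly smaller than the gap around it, so the caption groups with the image and not with the paragraph below.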
From what I've seen, captions are usually set at about 80% (0.8em) of the body text size. However, the smaller you make the text, the harder it is to read. Adding italics on top of the smaller size would overdo the effect. Either decrease the size or italicize the text, not both.
Italics usually signal emphasis on a word or phrase. Smaller text signals that the content is less important. Decide which message you want to send before changing the style or the size.
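The two alternatives might look like this in CSS. This is only a sketch: the class names `caption-small` and `caption-italic` are hypothetical, and you would apply one of them to a caption, not both.

```css
/* Option A: smaller, upright caption (signals "less important"). */
figcaption.caption-small {
  font-size: 0.8em; /* about 80% of the surrounding body text */
  font-style: normal;
}

/* Option B: body-size caption in italics (signals emphasis). */
figcaption.caption-italic {
  font-size: 1em;
  font-style: italic;
}
```

With a 16px body size, Option A renders the caption at 12.8px, so check that it remains comfortable to read before settling on it.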