neural networks – How long can the short-term memory last in an RNN?

For a recurrent neural network, the LSTM is one model of how the network carries memory across time steps. However, consider the case where the input is a long paragraph or even an entire article,
$$c_1 c_2 \ldots c_n$$
where the $c_i$ are characters. The LSTM works as expected when $n$ is not too large. But what if $n$ is large, say $10^5$? Clearly, the short-term memory would no longer behave as expected in the LSTM model.
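
For concreteness, here is a minimal sketch of the setup I have in mind (assuming PyTorch; the vocabulary and layer sizes are made up for illustration): a single LSTM pass over a character sequence of length $n = 10^5$, where the final state has to carry whatever remains of the early characters.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 128, 32, 64   # hypothetical sizes
n = 100_000                                        # sequence length n = 1e5

embed = nn.Embedding(vocab_size, embed_dim)
lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

chars = torch.randint(0, vocab_size, (1, n))       # stand-in for c_1 ... c_n
outputs, (h_n, c_n) = lstm(embed(chars))           # h_n, c_n: final hidden and cell state

# h_n and c_n must summarize all 1e5 characters; whether anything from the
# early positions survives in them is exactly what I am asking about.
print(outputs.shape, h_n.shape, c_n.shape)
```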

Logically, with each new input $c_{a+i}$, where $a$ is some fixed index and $i \geq 1$, the “information” (or the “probability” of the outcome) contributed at $c_a$ gets “modified” or even “suppressed”; this gating is exactly why the LSTM works in the first place. However, after sufficiently many steps $i$, the information contributed at $c_a$ might be suppressed completely.
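
To make the suppression concrete, here is the standard LSTM cell-state update (this part is the usual formulation; the decay estimate afterwards is only a rough back-of-the-envelope illustration of my concern):
$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t$$
Unrolling from step $a$ to step $a+k$ gives
$$c_{a+k} = \Big(\prod_{j=1}^{k} f_{a+j}\Big) \odot c_a + (\text{terms from inputs after } c_a),$$
so whatever $c_a$ wrote into the cell state survives only through the product of the intervening forget gates. If those gates average around $0.99$, the contribution is scaled by roughly $0.99^{1000} \approx 4 \times 10^{-5}$ after $1000$ steps, and is numerically negligible long before $k = 10^5$; the memory can persist only if the forget gates saturate near $1$ along that dimension.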

How long can the short-term memory last in an RNN, and how would this affect training?