pr.probability – Why is a test undersized?


I have a statistic $T_n$ for testing $H_0 \leftrightarrow H_1$, and I have proved that
$$n T_n \rightarrow_d \chi_K^2$$
under $H_0$. An asymptotic $\chi^2$ test can therefore be used: an asymptotic level-$\alpha$ test of the null hypothesis is obtained by rejecting $H_0$ whenever
\begin{equation*}
T_n > n^{-1} \chi_{K, 1 - \alpha}^2.
\end{equation*}

When I run simulations, however, I encounter a big problem. Even when $n$ is large (15,000, for instance), the empirical rejection rate under $H_0$ is much smaller than $\alpha$! Does this mean that the theoretical properties of $T_n$ that I have proved may be wrong? Or is this a ubiquitous phenomenon in chi-square testing, related to some more involved issues?
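
A size check of this kind can be sketched as follows. This is a toy illustration under stated assumptions, not the actual statistic from the question: it takes $K = 2$ and uses the hypothetical statistic $T_n = \|\bar X_n\|^2$ for i.i.d. $N(0, I_2)$ data, for which $n T_n$ is exactly $\chi_2^2$ under $H_0$. The function name `simulate_rejection_rate` and all parameter values are made up for the example. For $K = 2$ the critical value has the closed form $\chi_{2, 1 - \alpha}^2 = -2 \ln \alpha$, so only the standard library is needed.

```python
import math
import random


def simulate_rejection_rate(n=100, reps=2000, alpha=0.05, seed=0):
    """Monte Carlo estimate of the size of the test 'reject if T_n > crit / n'.

    Toy setting (an assumption for illustration): K = 2 and
    T_n = squared norm of the sample mean of n i.i.d. N(0, I_2) vectors.
    """
    rng = random.Random(seed)
    # For K = 2 the chi-square quantile is explicit: chi2_{2, 1-alpha} = -2 ln(alpha).
    crit = -2.0 * math.log(alpha)
    rejections = 0
    for _ in range(reps):
        # Sample means of the two independent coordinates under H_0.
        mean_x = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        mean_y = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        t_n = mean_x ** 2 + mean_y ** 2
        # Rejection rule from the question: T_n > n^{-1} * chi2_{K, 1-alpha}.
        if t_n > crit / n:
            rejections += 1
    rate = rejections / reps
    # Binomial standard error of the rate if the true size were exactly alpha.
    std_err = math.sqrt(alpha * (1.0 - alpha) / reps)
    return rate, std_err


rate, std_err = simulate_rejection_rate()
print(f"empirical rejection rate = {rate:.4f}, nominal = 0.05, MC s.e. = {std_err:.4f}")
```

Comparing the empirical rate with $\alpha$ in units of the Monte Carlo standard error shows whether a gap is simulation noise or a genuine size distortion. Running the same harness on a statistic whose null distribution is known, as here, also helps rule out a bug in the simulation code itself.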