pr.probability – consequence of “the best coupling” of two SDEs with different diffusion matrices

My question comes from a portion of a long review paper, which is attached below.
[image: excerpt from the review paper]

In the set-up, $\sigma_1$ and $\sigma_2$ are possibly different, constant diffusion matrices. To my knowledge, if we take the (“cheap”) synchronous coupling $B^1_t = B^2_t$, then we can estimate $\mathbb{E}(|X^1_t - X^2_t|^2)$ using, for instance, the Itô isometry or the BDG inequality. However, what is the corresponding estimate of $\mathbb{E}(|X^1_t - X^2_t|^2)$ under this new (“the best”) coupling? The reference paper FH16, published in The Annals of Probability, is overwhelmingly long and too technical for me. Thanks for any help!
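
For concreteness, here is a rough sketch of the synchronous-coupling estimate I have in mind. Everything in it beyond $\sigma_1$, $\sigma_2$, $B^i$ and $X^i$ is my own assumption and may not match the paper's set-up: I assume the SDEs have the form $dX^i_t = b(X^i_t)\,dt + \sigma_i\,dB^i_t$ with a common drift $b$ that is $L$-Lipschitz, a common initial condition $X^1_0 = X^2_0$, and a standard Brownian motion $B = B^1 = B^2$. Writing $\|\cdot\|_F$ for the Frobenius norm, the difference is

$$
X^1_t - X^2_t = \int_0^t \big(b(X^1_s) - b(X^2_s)\big)\,ds + (\sigma_1 - \sigma_2)\,B_t .
$$

Using $|a+c|^2 \le 2|a|^2 + 2|c|^2$, Cauchy–Schwarz on the time integral, and the Itô isometry on the stochastic term,

$$
\mathbb{E}\big(|X^1_t - X^2_t|^2\big) \le 2L^2 t \int_0^t \mathbb{E}\big(|X^1_s - X^2_s|^2\big)\,ds + 2\,\|\sigma_1 - \sigma_2\|_F^2\, t ,
$$

so Grönwall's inequality on $[0,T]$ would give, for $0 \le t \le T$,

$$
\mathbb{E}\big(|X^1_t - X^2_t|^2\big) \le 2\,\|\sigma_1 - \sigma_2\|_F^2\, t\, e^{2L^2 T t} .
$$

The constants here are not optimized; the point is only that under the synchronous coupling the bound is driven by $\|\sigma_1 - \sigma_2\|_F^2$, and I would like to know what replaces it under the other coupling.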