I am solving the following question from my textbook, but my answer is not tallying. Can someone please help? Probably my math with the variance is not correct, but I cannot figure out how to calculate it.

Let $p_x(x|\omega_i)$ be arbitrary densities with means $\mu_i$ and covariance matrices $\Sigma_i$ (not necessarily normal) for $i = 1, 2$. Let $y = w^T x$ be a projection, and let the induced one-dimensional densities $p(y|\omega_i)$ have means $\mu_i$ and variances $\sigma_i^2$.

Show that the criterion function

$$J_1(w) = \frac{(\mu_1 - \mu_2)^2}{\sigma_1^2 + \sigma_2^2}$$

is maximized by

$$w = (\Sigma_1 + \Sigma_2)^{-1}(\mu_1 - \mu_2).$$

My try:

$y = w^T x \Rightarrow \mu_1 - \mu_2 = w^T(\mu_1 - \mu_2)$. Here the mean difference on the left is between the projected densities on the $y$ line, and the one on the right-hand side is between the means of the $x$'s.

Similarly, $\sigma_i^2 = E[(y - \mu_i)^2] = E[(w^T x - w^T \mu_i)^2] = w^T E[(x - \mu_i)(x - \mu_i)^T] w = w^T \Sigma_i w$.
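To convince myself of this step, I ran a quick numerical sanity check (NumPy; the exponential components, the mixing matrix `A`, and the direction `w` are just made-up test data, not from the problem):

```python
import numpy as np

rng = np.random.default_rng(0)

# An arbitrary non-normal 2-D density: independent centered exponential
# components, mixed by a matrix A so that Cov(x) = Sigma = A @ A.T.
A = np.array([[1.0, 0.5],
              [0.0, 2.0]])
Sigma = A @ A.T

n = 400_000
z = rng.exponential(scale=1.0, size=(n, 2)) - 1.0  # mean 0, cov I, non-normal
x = z @ A.T                                        # Cov(x) = Sigma

w = np.array([0.7, -1.3])
y = x @ w                                          # projected samples y = w^T x

var_mc = y.var()             # Monte-Carlo variance of the projection
var_formula = w @ Sigma @ w  # sigma^2 = w^T Sigma w
print(var_mc, var_formula)
```

The sample variance of $y$ should match $w^T \Sigma w$ to within sampling error, even though the density is not normal.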

So $J_1(w) = \dfrac{w^T(\mu_1 - \mu_2)(\mu_1 - \mu_2)^T w}{w^T(\Sigma_1 + \Sigma_2)w}$, where the numerator is $\left(w^T(\mu_1 - \mu_2)\right)^2$ written as a quadratic form.

From here, to find the maximum, I differentiate $J_1$ with respect to $w$ and set the derivative to $0$.

I get an equation like $\left(w^T(\Sigma_1 + \Sigma_2)w\right)(\mu_1 - \mu_2)(\mu_1 - \mu_2)^T w = \left(w^T(\mu_1 - \mu_2)(\mu_1 - \mu_2)^T w\right)(\Sigma_1 + \Sigma_2)w$,

which seems to just restate an equality, and I am unable to proceed from here. I am learning ML; any help is greatly appreciated.
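As a sanity check on the claimed maximizer, I compared $J_1$ at $w = (\Sigma_1 + \Sigma_2)^{-1}(\mu_1 - \mu_2)$ against many random directions, and also plugged it into the stationarity equation above (the means and covariances here are made-up test values, not from the book):

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up example parameters (any means and positive-definite covariances work).
mu1 = np.array([1.0, 2.0, 0.0])
mu2 = np.array([-1.0, 0.5, 1.0])
S1 = np.array([[2.0, 0.3, 0.0],
               [0.3, 1.0, 0.2],
               [0.0, 0.2, 1.5]])
S2 = np.eye(3)

SW = S1 + S2   # Sigma_1 + Sigma_2
d = mu1 - mu2  # mu_1 - mu_2

def J1(w):
    # J_1(w) = (w^T d)^2 / (w^T SW w)
    return (w @ d) ** 2 / (w @ SW @ w)

w_star = np.linalg.solve(SW, d)  # claimed maximizer (Sigma_1 + Sigma_2)^{-1}(mu_1 - mu_2)

# J_1 is invariant to rescaling w, so comparing raw random directions suffices.
best_random = max(J1(rng.standard_normal(3)) for _ in range(10_000))
print(J1(w_star), best_random)

# w_star should satisfy the stationarity condition from dJ_1/dw = 0:
#   (w^T SW w) d d^T w = (w^T d d^T w) SW w
lhs = (w_star @ SW @ w_star) * d * (d @ w_star)
rhs = (w_star @ d) ** 2 * (SW @ w_star)
print(np.allclose(lhs, rhs))
```

In my runs $J_1(w^*)$ beats every random direction, and the two sides of the stationarity equation agree at $w^*$, so the closed form does look right; it is the algebra from the stationarity equation to the closed form that I cannot complete.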