# machine learning – Fisher Linear Discriminant Analysis – mathematical question

I am solving the following question from my textbook, but my answer does not tally with the expected result. Can someone please help? I suspect my handling of the variance is wrong, but I cannot see how to compute it correctly.

Let $$p_x(x|ω_i)$$ be arbitrary densities with means $$µ_i$$ and covariance matrices $$Σ_i$$
— not necessarily normal — for $$i = 1, 2.$$ Let $$y = w^t x$$ be a projection, and let the
induced one-dimensional densities $$p(y|ω_i)$$ have means $$µ_i$$ and variances $$σ^2_ i$$ .
Show that the criterion function
$$J_1(w) = \frac{(\mu_1 - \mu_2)^2}{\sigma_1^2 + \sigma_2^2}$$
is maximized by
$$w = (Σ_1 + Σ_2)^{−1}(µ_1 − µ_2)$$.

My try:
$$y=w^T x \Rightarrow \mu_1-\mu_2 = w^T(\mu_1-\mu_2).$$ Here the mean difference on the left is between the projected means (on the $$y$$ line), while the one on the right-hand side is between the means of the $$x$$'s.

Similarly, $$\sigma_i^2 = E\big[(y-\mu_i)^2\big] = E\big[(w^T x - w^T \mu_i)^2\big] = w^T E\big[(x-\mu_i)(x-\mu_i)^T\big] w = w^T \Sigma_i w.$$
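To convince myself this variance identity holds, I checked it numerically with a quick NumPy sketch (the mean, covariance, and projection vector below are my own made-up values, not from the problem):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D class distribution (illustrative values only).
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
w = np.array([0.7, -0.3])

# Draw samples, project onto w, and compare the empirical variance
# of y = w^T x with the closed form w^T Sigma w.
x = rng.multivariate_normal(mu, Sigma, size=200_000)
y = x @ w

empirical = y.var()
closed_form = w @ Sigma @ w
print(empirical, closed_form)  # the two should agree closely
```

The two numbers match up to sampling noise, so at least this step of my derivation seems right.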

So $$J_1(w)=\frac{w^T(\mu_1-\mu_2)(\mu_1-\mu_2)^T w}{w^T(\Sigma_1+\Sigma_2)w}.$$
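As a sanity check that the scalar form and this matrix quotient form of $$J_1$$ agree, a short NumPy snippet (means and covariances are made-up illustrative values):

```python
import numpy as np

# Hypothetical class parameters (my own choice, just for checking).
mu1, mu2 = np.array([1.0, 2.0]), np.array([-1.0, 0.5])
S1 = np.array([[1.5, 0.2], [0.2, 1.0]])
S2 = np.array([[0.8, -0.1], [-0.1, 1.2]])
w = np.array([0.4, 1.3])

d = mu1 - mu2
# Scalar form: (projected mean difference)^2 / (sum of projected variances).
J_scalar = (w @ d) ** 2 / (w @ S1 @ w + w @ S2 @ w)
# Matrix form: w^T d d^T w / w^T (S1 + S2) w.
J_matrix = (w @ np.outer(d, d) @ w) / (w @ (S1 + S2) @ w)
print(J_scalar, J_matrix)  # identical up to floating point
```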

From here, to find the maximum, I differentiate $$J_1$$ with respect to $$w$$ and set the derivative to $$0$$.

I get an equation like $$\big(w^T(\Sigma_1+\Sigma_2)w\big)\,(\mu_1-\mu_2)(\mu_1-\mu_2)^T w = \big(w^T(\mu_1-\mu_2)(\mu_1-\mu_2)^T w\big)\,(\Sigma_1+\Sigma_2)w,$$

in which $$w$$ appears on both sides, so it looks circular to me and I am unable to proceed from here. I am learning ML; any help is greatly appreciated.
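For what it's worth, I did verify numerically that the claimed answer $$w = (\Sigma_1+\Sigma_2)^{-1}(\mu_1-\mu_2)$$ beats random directions, so the target result seems correct and it is my algebra that is stuck (parameters below are again made-up values):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical class parameters (illustrative values only).
mu1, mu2 = np.array([1.0, 2.0]), np.array([-1.0, 0.5])
S1 = np.array([[1.5, 0.2], [0.2, 1.0]])
S2 = np.array([[0.8, -0.1], [-0.1, 1.2]])
d = mu1 - mu2
Sw = S1 + S2

def J1(w):
    """Criterion J_1(w) = (w^T d)^2 / (w^T (S1+S2) w)."""
    return (w @ d) ** 2 / (w @ Sw @ w)

# Claimed maximizer: (Sigma_1 + Sigma_2)^{-1} (mu_1 - mu_2).
w_star = np.linalg.solve(Sw, d)
best = J1(w_star)

# J_1 at many random directions never exceeds J_1(w_star).
ws = rng.normal(size=(10_000, 2))
ok = all(J1(w) <= best + 1e-12 for w in ws)
print(ok)  # True
```

(Note that $$J_1$$ is scale-invariant in $$w$$, which is why comparing against unnormalized random directions is still a fair check.)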