# Vectorization of Logistic Regression

I want to take the derivative with respect to $\Theta$ of:

$$\sum_{i = 1}^{m} w^{i}\left(y^{i} \log(h_{\Theta}(x^{i})) + (1-y^{i})\log(1-h_{\Theta}(x^{i}))\right)$$

Where $$h_{\Theta}(x)$$ is the sigmoid function:

$$\frac{1}{1+e^{-\Theta^{T}x^{i}}}$$
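As a concrete reference for the formula above, here is a minimal NumPy sketch of the sigmoid (the function and variable names are my own, not from the question):

```python
import numpy as np

def sigmoid(z):
    # Logistic function 1 / (1 + e^{-z}); works elementwise on arrays.
    return 1.0 / (1.0 + np.exp(-z))

# For a parameter vector theta and one example x,
# h_theta(x) = sigmoid(theta @ x).
```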

Here $i$ indexes the training examples (this is a machine-learning context).

I'm able to get the derivative with respect to a single component $\Theta_{j}$ to:

$$\sum_{i=1}^{m} w^{i}(y^{i} - h_{\Theta}(x^{i}))\,x_{j}^{i}$$

However, I’m unable to understand how it turns into:

$$X^{T}\big(w \odot (y - h_{\Theta}(X))\big)$$

where $w$, $y$, and $h_{\Theta}(X)$ are the length-$m$ vectors with entries $w^{i}$, $y^{i}$, and $h_{\Theta}(x^{i})$, and $\odot$ is the elementwise product.

It's really a disconnect that I'm having. I understand that matrix multiplication sums products of row entries with column entries, but I've never found a concrete way to see why the sum over $i$ naturally becomes a left-multiplication by $X^{T}$ here. I'm just looking to build stronger intuition.
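One way to build that intuition is to check the two forms numerically: the loop over examples and the $X^{T}$ product give the same vector, because row $j$ of $X^{T}$ holds feature $j$ of every example, so the dot product with the residual vector performs exactly the sum over $i$. A minimal NumPy sketch (all names and the random data are my own, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3                              # m examples, n features
X = rng.normal(size=(m, n))              # row i of X is example x^i
y = rng.integers(0, 2, size=m).astype(float)
w = rng.random(m)                        # per-example weights w^i
theta = rng.normal(size=n)

h = 1.0 / (1.0 + np.exp(-X @ theta))     # h_theta(x^i) for all i at once

# Loop form: component j is sum_i w^i (y^i - h^i) x_j^i
grad_loop = np.zeros(n)
for i in range(m):
    grad_loop += w[i] * (y[i] - h[i]) * X[i]

# Vectorized form: X^T (w * (y - h)), with * the elementwise product
grad_vec = X.T @ (w * (y - h))

print(np.allclose(grad_loop, grad_vec))  # True
```

The key observation: `X.T @ v` computes, for each $j$, the dot product of column $j$ of $X$ with `v`, and that dot product *is* the $\sum_{i=1}^{m}$ in the scalar formula.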