How does Gradient Descent treat multiple features?

As far as I know, when you reach the step in a gradient descent algorithm where you calculate step_size, you compute learning_rate * slope.

Now, the slope is obtained by taking the derivative of the cost function with respect to the feature whose optimal coefficient you want to find.

Let’s say that the cost function for the purposes of this question is the sum of squared residuals.
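To make the setup concrete, here is a rough sketch of one gradient-descent step with SSR as the cost (all numbers are made up, and the explicit coefficients $b_1$, $b_2$ are my own addition for illustration):

```python
# Rough sketch (made-up numbers) of one gradient-descent step with
# sum-of-squared-residuals (SSR) as the cost, assuming a two-feature model
# y_hat = b0 + b1*x1 + b2*x2 (the b1, b2 coefficients are my addition).
x1 = [1.0, 2.0, 3.0]
x2 = [0.5, 1.5, 2.5]
y  = [2.0, 4.0, 6.0]

b0, b1, b2 = 0.0, 0.0, 0.0
learning_rate = 0.01

residuals = [yi - (b0 + b1 * x1i + b2 * x2i)
             for yi, x1i, x2i in zip(y, x1, x2)]
ssr = sum(r ** 2 for r in residuals)

# Partial derivatives of SSR with respect to each coefficient.
# When differentiating with respect to b1, everything else
# (b0, b2, and both features) is treated as a constant.
d_b0 = -2.0 * sum(residuals)
d_b1 = -2.0 * sum(r * x1i for r, x1i in zip(residuals, x1))
d_b2 = -2.0 * sum(r * x2i for r, x2i in zip(residuals, x2))

# step_size = learning_rate * slope, applied to each coefficient
b0 -= learning_rate * d_b0
b1 -= learning_rate * d_b1
b2 -= learning_rate * d_b2
```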

My question is: how are the coefficients of the other features treated when differentiating? For instance, if I have the equation $y = b_0 + x_1 + x_2$, then taking the derivative of the cost function with respect to $x_1$ gives:

$\frac{d}{dx_1}\left(\left(\text{mean} - \left(b_0 + x_1 + x_2\right)\right)^2\right) =$

$2 \times \left(\text{mean} - b_0 - x_1 - x_2\right)\left(\frac{d}{dx_1}\left(x_2\right) + 1\right)$

In this case, how is a value obtained by substituting a value for $x_1$ while $\frac{d}{dx_1}(x_2)$ is still in the formula?

I watched a YouTube video (it starts at the right point) that says $x_2$ is a constant (even though it's a different feature) and that, therefore, $\frac{d}{dx_1}(x_2)$ is dropped when differentiating, leaving $2 \times \left(\text{mean} - b_0 - x_1 - x_2\right)(1)$. Is this the case, or am I missing something?
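To check the video's claim numerically, I tried the following quick sketch (all values made up): it holds $x_2$ fixed and compares a finite-difference derivative of the cost with the chain-rule result.

```python
# Numerical check (my own sketch, not from the video): hold x2 constant
# and compare the analytic partial derivative with a finite difference.
# All numbers below are made up for illustration.
mean, b0, x2 = 10.0, 1.0, 3.0

def cost(x1):
    # The single-term cost from the question: (mean - (b0 + x1 + x2))^2
    return (mean - (b0 + x1 + x2)) ** 2

x1 = 2.0
h = 1e-6

# Central finite difference approximates d(cost)/d(x1) with x2 held
# fixed, so the d/d(x1) of x2 term contributes nothing.
numeric = (cost(x1 + h) - cost(x1 - h)) / (2 * h)

# Chain rule: the inner -x1 contributes a factor of -1, and the x2 term
# drops out because it does not depend on x1.
analytic = -2 * (mean - b0 - x1 - x2)

print(numeric, analytic)  # both ≈ -8.0
```

The two values agree, which at least suggests that treating $x_2$ as a constant (so its derivative vanishes) is what actually happens; note the overall minus sign that the chain rule produces from the inner $-x_1$.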