We can create variants of a loss function, ridge regression in particular, by adding more regularization terms. One variant I saw in a book is given below:

$\min_{w \in \mathbf{R}^d} \alpha \|w\|^2 + (1-\alpha) \|w\|^4 + C \|y - X^T w\|^2$

where $y \in \mathbf{R}^n$, $w \in \mathbf{R}^d$, $X \in \mathbf{R}^{d \times n}$, $C \in \mathbf{R}$ is a regularization parameter, and $\alpha \in (0,1)$.

My question is: how does a change in $\alpha$ affect the optimization problem? More generally, how does adding more regularizers help? Why is one regularizer not enough?
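
For anyone who wants to experiment with this numerically, here is a minimal sketch of the objective above and its gradient, with $X$ stored as a `(d, n)` array to match the definitions. The function names, the synthetic data, the value of `C`, and the plain gradient descent with backtracking are all my own assumptions for illustration, not something taken from the book.

```python
import numpy as np

def objective(w, X, y, alpha, C):
    """alpha*||w||^2 + (1-alpha)*||w||^4 + C*||y - X^T w||^2, with X of shape (d, n)."""
    sq_norm = w @ w
    residual = y - X.T @ w
    return alpha * sq_norm + (1.0 - alpha) * sq_norm**2 + C * (residual @ residual)

def gradient(w, X, y, alpha, C):
    """Gradient of the objective with respect to w."""
    sq_norm = w @ w
    residual = y - X.T @ w
    return (2.0 * alpha * w
            + 4.0 * (1.0 - alpha) * sq_norm * w
            - 2.0 * C * (X @ residual))

def minimize_gd(X, y, alpha, C, iters=500):
    """Gradient descent with backtracking line search (illustrative, not tuned)."""
    w = np.zeros(X.shape[0])
    for _ in range(iters):
        g = gradient(w, X, y, alpha, C)
        f = objective(w, X, y, alpha, C)
        step = 1.0
        # Halve the step until the sufficient-decrease (Armijo) condition holds.
        while step > 1e-12 and objective(w - step * g, X, y, alpha, C) > f - 0.5 * step * (g @ g):
            step *= 0.5
        w = w - step * g
    return w

# Hypothetical synthetic data: d = 5 features, n = 50 samples.
rng = np.random.default_rng(0)
d, n = 5, 50
X = rng.normal(size=(d, n))
w_true = rng.normal(size=d)
y = X.T @ w_true + 0.1 * rng.normal(size=n)

for alpha in (0.1, 0.5, 0.9):
    w_hat = minimize_gd(X, y, alpha, C=0.01)
    print(f"alpha={alpha:.1f}  ||w||={np.linalg.norm(w_hat):.4f}")
```

Running this for a few values of `alpha` and `C` prints the norm of the fitted $w$, which gives one concrete way to see how the two penalty terms trade off on a given data set.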