We have a Horizontal Pod Autoscaler (HPA) installed on a GKE cluster. Most of the time the autoscaler works perfectly, but from time to time (especially during our customers' peak hours) it receives more and more 503 errors from Stackdriver.
Here are the errors we encounter:
Failed request to stackdriver api: googleapi: Error 503: The service is currently unavailable., backendError
and
"apiserver received an error that is not an metav1.Status: &googleapi.Error{Code:503, Message:"The service is currently unavailable.", Body:"{\n \"error\": {\n \"code\": 503,\n \"message\": \"The service is currently unavailable.\",\n \"errors\": [\n {\n \"message\": \"The service is currently unavailable.\",\n \"domain\": \"global\",\n \"reason\": \"backendError\"\n }\n ],\n \"status\": \"UNAVAILABLE\"\n }\n}\n", Header:http.Header(nil), Errors:[]googleapi.ErrorItem{googleapi.ErrorItem{Reason:"backendError", Message:"The service is currently unavailable."}}}"
Now I am a bit confused. Google recommends using Stackdriver as a metrics source for HPAs (https://cloud.google.com/kubernetes-engine/docs/tutorials/external-metrics-autoscaling), but if Stackdriver is not 100% available or fault-tolerant, the cluster simply breaks: the pods are not scaled up and resources are exhausted.
Does anyone know how to work around this?