Fargate - rolling update
dumps from long thread
Hey, I'm seeing a few Fargate tasks go from PENDING --> RUNNING --> STOPPED during a rolling update. Any idea why this would happen?
Ashwini 10 days ago
I assumed a task enters the RUNNING state only after all the criteria (ALB health check, necessary permissions, Parameter Store access, etc.) are fulfilled. But somehow that's not the case.
I verified this by deliberately configuring the wrong health check port on the ALB, and the task entered the RUNNING state anyway.
Ladean Unser 10 days ago
I've seen this a few times as well. In our case, the task was either unable to start or was failing the load balancer health check. Is there anything in the logs for the stopped tasks? Or any indication from the status of the stopped tasks?
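For anyone following up on the "status of the stopped tasks" question, here is a minimal sketch of pulling the stop details out of a task description. It assumes the input is shaped like the response of boto3's ECS `describe_tasks` call; the function name `summarize_stopped_tasks` and all sample values are made up for illustration.

```python
def summarize_stopped_tasks(describe_tasks_response):
    """Collect the stop details for every STOPPED task in the response."""
    summaries = []
    for task in describe_tasks_response.get("tasks", []):
        if task.get("lastStatus") != "STOPPED":
            continue  # tasks that are still pending or running are skipped
        summaries.append({
            "taskArn": task.get("taskArn"),
            "stopCode": task.get("stopCode"),
            "stoppedReason": task.get("stoppedReason"),
            "containers": [
                {
                    "name": c.get("name"),
                    "exitCode": c.get("exitCode"),
                    "reason": c.get("reason"),
                }
                for c in task.get("containers", [])
            ],
        })
    return summaries


# Made-up sample data in the shape of a describe_tasks response.
sample_response = {
    "tasks": [
        {
            "taskArn": "arn:aws:ecs:us-east-1:111122223333:task/demo/aaaa",
            "lastStatus": "STOPPED",
            "stopCode": "ServiceSchedulerInitiated",
            "stoppedReason": "Task failed ELB health checks (illustrative text)",
            "containers": [{"name": "web", "exitCode": 137, "reason": None}],
        },
        {
            "taskArn": "arn:aws:ecs:us-east-1:111122223333:task/demo/bbbb",
            "lastStatus": "RUNNING",
            "containers": [{"name": "web"}],
        },
    ]
}

for summary in summarize_stopped_tasks(sample_response):
    print(summary["taskArn"], summary["stopCode"], summary["stoppedReason"])
```

Against a real cluster, the input would come from something like `boto3.client("ecs").describe_tasks(cluster=..., tasks=[...])`. Stopped tasks are only retained for a limited time, so it is worth checking soon after the failure.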
Ashwini 10 days ago
I deliberately caused the failure by giving the wrong port. My concern is that it should never have gone from PENDING to RUNNING at all.
Ashwini 10 days ago
Doesn't saying a task is RUNNING mean that the ALB health check has passed? :disappointed::thinking_face:
Ladean Unser 10 days ago
Correct me if I'm misunderstanding, but I think what is happening is that the task is coming up successfully on a certain port, but the health check is failing because it's expecting a different port.
Ashwini 9 days ago
Yes, you are right @Ladean Unser
But my concern is that before showing a task as RUNNING, it should pass the ALB health check too.
Ladean Unser 9 days ago
As I understand how it works: ECS will bring up the task first, then attach the task to the autoscaling group after it is up and running. Then the LB does its health checks. So yes, you would see a RUNNING task prior to the LB failing the health check. Remember, these are different services. You can have an ECS service running with tasks that are not behind an LB.
Ashwini 8 days ago
@Ladean Unser now I'm wondering: is it just checking whether the task has started successfully, and following the rolling update (registering / deregistering) based on that, without minding container health?
Correct me if I'm wrong.
Ladean Unser 8 days ago
Hmm, so the task will get registered to the autoscaling group when the task is RUNNING. If the task does not enter the RUNNING state, the task in autoscaling may show up as pending,
but correct me if I'm wrong on this. However, after the task is registered successfully, the load balancer will do its health checks. If the health check fails, the task will get terminated and a new one will be brought up. This is the behavior that I'm seeing on our infrastructure. You can also set timeout values and the number of checks.
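To make the "number of checks" point concrete, here is a rough sketch of why a task with a wrong health check port still shows RUNNING for a while: the load balancer only marks a target unhealthy after a run of consecutive failed checks. As far as I know, the registration in question is with the load balancer's target group. The function names, the 30-second interval, and the threshold of 3 are illustrative assumptions, not values from this thread.

```python
def first_unhealthy_check(results, unhealthy_threshold):
    """Return the 1-based index of the check at which the number of
    consecutive failures reaches the threshold, or None if it never does."""
    consecutive_failures = 0
    for index, passed in enumerate(results, start=1):
        # A passing check resets the streak; a failing one extends it.
        consecutive_failures = 0 if passed else consecutive_failures + 1
        if consecutive_failures >= unhealthy_threshold:
            return index
    return None


def approx_seconds_until_unhealthy(results, unhealthy_threshold, interval_seconds):
    """Approximate elapsed time, assuming one check per interval."""
    check_index = first_unhealthy_check(results, unhealthy_threshold)
    if check_index is None:
        return None
    return check_index * interval_seconds


# Wrong port: every check fails (assumed interval 30 s, threshold 3).
always_failing = [False] * 5
print(first_unhealthy_check(always_failing, unhealthy_threshold=3))
print(approx_seconds_until_unhealthy(always_failing, 3, interval_seconds=30))
```

Under these assumed settings, the task would sit in RUNNING for roughly a minute and a half before the failed checks add up. An ECS service's health check grace period, if set, can lengthen that window further.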
Ashwini 8 days ago
Auto-scaling: I'm still debating that in my mind. :exploding_head:
However, when a task goes from RUNNING to STOPPED due to an LB health check failure,
the replacement tasks are brought up because of the desired count, which results in an infinite loop.
Whereas if a task goes from PENDING to STOPPED, new tasks are spawned according to the throttle logic (retry for a finite time period) --> https://docs.aws.amazon.com/AmazonECS/latest/developerguide/service-throttle-logic.html
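The two cases above can be told apart from a task description. This is a minimal sketch that assumes a stopped task which never reached RUNNING has no `startedAt` timestamp in its `describe_tasks` entry. The function name and the returned labels are hypothetical.

```python
def classify_stopped_task(task):
    """Label a task by whether it reached RUNNING before it stopped."""
    if task.get("lastStatus") != "STOPPED":
        return "not-stopped"
    if task.get("startedAt") is None:
        # PENDING --> STOPPED: the case the throttle logic link is about.
        return "never-started"
    # PENDING --> RUNNING --> STOPPED: e.g. stopped after failed LB health checks.
    return "ran-then-stopped"


# Made-up task entries for illustration.
tasks = [
    {"lastStatus": "STOPPED", "startedAt": "2024-01-01T10:00:00Z"},
    {"lastStatus": "STOPPED"},
    {"lastStatus": "RUNNING", "startedAt": "2024-01-01T10:05:00Z"},
]

for task in tasks:
    print(classify_stopped_task(task))
```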