We have auto-scaling groups for one of our cloud-forming stacks that have a processor-based signal that determines when to scale instances.
This is great, but we recently expanded it from one node to three, and one of these nodes could not boot via cfn-init. After reducing the workload and downsizing the group to one node, he killed two good instances and left the partially loaded node with the only remaining instance. This meant that we stopped processing the work until someone logged in and restarted the boot process.
Obviously, this is not perfect. What is the best way to notify the auto-scaling group that the node is not healthy when it is not behind the ELB?
Since this is only a bootstrap, I would really like it to return to the auto-scaling group that failed in this node and completed it, and the new node is deployed in its place.
amazon-web-services autoscaling
mjallday
source share