How to set up a managed instance group and autoscaling in Google Cloud Platform

Autoscaling lets you automatically add or remove Compute Engine instances based on load. The prerequisites for autoscaling in GCP are an instance template and a managed instance group.

This question is part of a larger one: how to build an autoscaled and load-balanced backend.

I wrote an answer below that contains steps for setting up autoscaling in GCP.

autoscaling google-compute-engine google-cloud-platform
1 answer

Autoscaling is a feature of managed instance groups in GCP. It handles high traffic by adding instances and removes instances when traffic drops, which saves a lot of money.

To configure automatic scaling, we need the following:

  • Instance template
  • Managed Instance Group
  • Autoscale policy
  • Health check
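The health check is listed above but not shown elsewhere in this answer, so here is a minimal sketch of how one might be created and attached. The health check name, port, path, and timing values are assumptions chosen for illustration, and attaching it to the group for autohealing is one common use, not necessarily the only setup that fits. The attach step only works once the managed instance group described below exists.

```shell
# Dry-run switch: set GCLOUD=echo to print the command instead of running it.
GCLOUD="${GCLOUD:-gcloud}"

# Create an HTTP health check (name, port, path, and timings are assumed values).
create_health_check() {
    "$GCLOUD" compute health-checks create http autoscale-health-check \
        --port 80 \
        --request-path / \
        --check-interval 10s \
        --timeout 5s \
        --healthy-threshold 2 \
        --unhealthy-threshold 3
}

# Attach the health check to the managed instance group (run after the group exists).
# --initial-delay gives new instances time to boot before they are checked.
attach_health_check() {
    "$GCLOUD" compute instance-groups managed update autoscale-managed-instance-group \
        --health-check autoscale-health-check \
        --initial-delay 300 \
        --region asia-northeast1
}
```

Call create_health_check first and attach_health_check after the group has been created.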

An instance template is a blueprint that defines the machine type, image, and disks of the homogeneous instances that will run in the autoscaled managed instance group. I wrote the steps to set up an instance template here.
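As a rough illustration of what such a template contains, the sketch below creates one named sample-template, the name used in the rest of this answer. The machine type, image family, image project, and disk size are assumed example values, not the ones from the linked steps.

```shell
# Dry-run switch: set GCLOUD=echo to print the command instead of running it.
GCLOUD="${GCLOUD:-gcloud}"

# Create an instance template named sample-template.
# Machine type, image, and boot disk size are assumed example values.
create_sample_template() {
    "$GCLOUD" compute instance-templates create sample-template \
        --machine-type n1-standard-1 \
        --image-family debian-12 \
        --image-project debian-cloud \
        --boot-disk-size 10GB
}
```

Call create_sample_template to run it against the currently configured project.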

A managed instance group maintains a group of homogeneous instances based on a single instance template. Assuming an instance template named sample-template, the group can be created by running the following gcloud command:

gcloud compute instance-groups managed create autoscale-managed-instance-group \
    --base-instance-name autoscaled-instance \
    --size 3 \
    --template sample-template \
    --region asia-northeast1

The above command creates a managed instance group containing 3 Compute Engine instances, spread across three different zones in the asia-northeast1 region and based on sample-template.

  • base-instance-name is the base name for all automatically created instances. Each instance name consists of the base name followed by a unique, randomly generated suffix.
  • size is the desired number of instances in the group. At this point, 3 instances run continuously regardless of the amount of traffic the application receives. Later the size can be scaled automatically by applying a policy to the group.
  • region (multi-zone) or single zone: a managed instance group can be regional (multi-zone), in which case the homogeneous instances are distributed evenly across the zones of the given region, or zonal, in which case all instances are deployed in the same zone within a region. It can also be deployed across regions, which is currently in alpha.
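To make the regional versus zonal choice concrete, here is a sketch of the zonal variant of the same command, plus a command that lists which instances and zones a group ended up with. The zonal group name and the zone asia-northeast1-a are assumed example values.

```shell
# Dry-run switch: set GCLOUD=echo to print the command instead of running it.
GCLOUD="${GCLOUD:-gcloud}"

# Zonal variant: all instances land in one zone.
# The group name and zone are assumed example values.
create_zonal_group() {
    "$GCLOUD" compute instance-groups managed create autoscale-zonal-instance-group \
        --base-instance-name autoscaled-instance \
        --size 3 \
        --template sample-template \
        --zone asia-northeast1-a
}

# List the instances of the regional group, including the zone of each one.
list_group_instances() {
    "$GCLOUD" compute instance-groups managed list-instances autoscale-managed-instance-group \
        --region asia-northeast1
}
```

The only difference from the regional command is --zone in place of --region.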

The autoscaling policy defines the behavior of the autoscaler. The autoscaler aggregates data from the instances, compares it with the target capacity specified in the policy, and determines the action to take. Several autoscaling policies are available, such as:

  • Average CPU utilization
  • HTTP load balancing serving capacity (requests per second)
  • Stackdriver standard and custom metrics
  • and more
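For comparison with the CPU-based policy used in the rest of this answer, a policy based on load balancing serving capacity might look like the sketch below. It assumes the group is already a backend of an HTTP(S) load balancer, and the 0.8 target is an arbitrary example value.

```shell
# Dry-run switch: set GCLOUD=echo to print the command instead of running it.
GCLOUD="${GCLOUD:-gcloud}"

# Scale on load balancer serving capacity instead of CPU.
# Assumes the group is attached to an HTTP(S) load balancer backend service;
# 0.8 (80% of the backend's configured capacity) is an assumed example target.
set_lb_autoscaling() {
    "$GCLOUD" compute instance-groups managed set-autoscaling autoscale-managed-instance-group \
        --max-num-replicas 6 \
        --min-num-replicas 2 \
        --target-load-balancing-utilization 0.8 \
        --region asia-northeast1
}
```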

To enable autoscaling on this managed instance group with a CPU-based policy, run the following gcloud command:

gcloud compute instance-groups managed set-autoscaling autoscale-managed-instance-group \
    --max-num-replicas 6 \
    --min-num-replicas 2 \
    --target-cpu-utilization 0.60 \
    --cool-down-period 120 \
    --region asia-northeast1

The above command configures the autoscaler to target 60% average CPU utilization, keeping the group between 2 instances (when there is no traffic) and 6 instances (under heavy traffic).

  • The cool-down-period flag is the number of seconds the autoscaler waits after an instance starts before it begins collecting information from it.
  • An autoscaler can be associated with a maximum of 5 different policies. When there are multiple policies, the autoscaler follows the policy that recommends the largest number of instances.
  • An interesting fact: when the autoscaler starts an instance, it keeps that instance running for at least 10 minutes regardless of traffic. This is because, at the time of writing, GCP bills for a minimum of ten minutes of Compute Engine usage. It also protects against instances being started and shut down in rapid succession.

Recommendation: in my view, it is better to create your own image with all the software preinstalled than to use a startup script, because the time required to start new instances in an autoscaled group should be as short as possible. A prebuilt image lets your web application scale faster.
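A minimal sketch of that workflow is shown below: build an image from the boot disk of an instance that already has the software installed, then create a template that uses it. The image name, builder disk name, zone, and template name are all hypothetical, and the builder instance should be stopped before its disk is imaged.

```shell
# Dry-run switch: set GCLOUD=echo to print the command instead of running it.
GCLOUD="${GCLOUD:-gcloud}"

# Create a custom image from the boot disk of a prepared, stopped instance.
# webapp-image, webapp-builder, and the zone are hypothetical example names.
create_custom_image() {
    "$GCLOUD" compute images create webapp-image \
        --source-disk webapp-builder \
        --source-disk-zone asia-northeast1-a
}

# Create a new instance template that boots from the custom image.
# sample-template-v2 and the machine type are assumed example values.
create_template_from_image() {
    "$GCLOUD" compute instance-templates create sample-template-v2 \
        --machine-type n1-standard-1 \
        --image webapp-image
}
```

Instances created from such a template skip the installation work a startup script would otherwise do on every boot.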

This is part 2 of a 3-part series on building an autoscaled and load-balanced backend.
