Auto Scaling

Auto Scaling Groups (ASGs) enable dynamic scaling of infrastructure in response to changing demand. Scaling policies, however, are not mandatory for an ASG to function. When policies are used, CloudWatch alarms watch metrics (e.g., CPU utilization, network traffic) against thresholds and trigger the scaling actions.

  1. ASG Without Scaling Policies:
    • An ASG can operate without any scaling policies.
    • In this case, the group keeps static values for minimum size, maximum size, and desired capacity.
    • Adjustments to these values must be made manually to scale the infrastructure.
  2. Dynamic Scaling with Policies:
    • To scale automatically based on demand, implement scaling policies such as Simple, Step, and Target Tracking Scaling.
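As a rough illustration of the static case in item 1, the sketch below models the three size values and a manual capacity change. The `GroupSize` class and `set_desired_capacity` function are hypothetical names for this example, not an AWS API. The bounds check reflects the rule that desired capacity must stay between the minimum and maximum size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupSize:
    """Hypothetical model of an ASG's three size settings."""
    min_size: int
    max_size: int
    desired_capacity: int


def set_desired_capacity(group: GroupSize, requested: int) -> GroupSize:
    """Return a new GroupSize with the requested capacity.

    Without scaling policies, a manual change like this is the only
    way the group's capacity moves.
    """
    if not group.min_size <= requested <= group.max_size:
        raise ValueError(
            f"desired capacity {requested} is outside "
            f"[{group.min_size}, {group.max_size}]"
        )
    return GroupSize(group.min_size, group.max_size, requested)
```

For example, a group with minimum 2 and maximum 6 accepts a manual change to 5 but rejects 7.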

Dynamic Scaling with Policies

  1. Simple Scaling Policy:
    1. Sets a fixed threshold; scaling actions occur when the threshold is breached.
    2. Example: Add 2 instances if CPU usage exceeds 80% for 5 minutes.
  2. Step Scaling Policy:
    1. Triggers scaling actions based on multiple thresholds for more granular control.
    2. Example:
      1. Add 2 instances if CPU exceeds 80% for 5 minutes.
      2. Add 4 instances if CPU exceeds 90%.
  3. Scheduled Scaling Policy:
    1. Scheduled scaling allows you to scale based on a specific time or a recurring schedule. Ideal for applications with predictable traffic patterns.
    2. Example: You could set a policy with MinSize, MaxSize and DesiredCapacity to scale up the number of instances during peak hours (e.g., 9 AM to 5 PM) and scale down after hours.
    3. Scheduled Scaling Policy example
          {
              "ScheduledActionName": "ScaleOutAction",
              "AutoScalingGroupName": "MyAutoScalingGroup",
              "StartTime": "2024-12-31T23:00:00Z",
              "DesiredCapacity": 10,
              "MinSize": 10,
              "MaxSize": 10
          }
      
  4. Target Tracking Scaling Policy:
    1. Automatically adjusts instances to maintain a specific target metric.
    2. Example: Maintain 50% average CPU utilization across the Auto Scaling group. If CPU usage rises above or falls below that target, the group scales out or in automatically to bring it back. Scaling out adds EC2 instances to reduce the load per instance; scaling in removes instances when the load drops.
    3. Supported Predefined Metrics for Target Tracking:
      1. ASGAverageCPUUtilization: Average CPU utilization of the Auto Scaling group.
      2. ASGAverageNetworkIn: Average bytes received per instance across all network interfaces.
      3. ASGAverageNetworkOut: Average bytes sent out per instance across all network interfaces.
      4. ALBRequestCountPerTarget: Average Application Load Balancer requests per target.
    4. Target Tracking Scaling Policy Example
      {
          "PolicyName": "TargetTrackingScalingPolicy",
          "AutoScalingGroupName": "MyAutoScalingGroup",
          "PolicyType": "TargetTrackingScaling",
          "TargetTrackingConfiguration": {
              "PredefinedMetricSpecification": {
                  "PredefinedMetricType": "ASGAverageCPUUtilization"
              },
              "TargetValue": 50.0,
              "DisableScaleIn": false
          }
      }
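
The step and target tracking examples above can be sketched in a few lines of Python. This is a simplified model, not the algorithm AWS runs: the `STEP_ADJUSTMENTS` table and both function names are hypothetical, and the proportional estimate in `target_tracking_capacity` assumes load spreads evenly across instances.

```python
import math

# Hypothetical step table mirroring the step scaling example:
# (CPU lower bound in percent, instances to add), highest bound first.
STEP_ADJUSTMENTS = [(90.0, 4), (80.0, 2)]


def step_adjustment(cpu_percent: float) -> int:
    """Return how many instances to add for a given CPU reading."""
    for lower_bound, instances_to_add in STEP_ADJUSTMENTS:
        if cpu_percent > lower_bound:
            return instances_to_add
    return 0  # no threshold breached, so no change


def target_tracking_capacity(current_capacity: int, metric_value: float,
                             target_value: float, min_size: int,
                             max_size: int) -> int:
    """Estimate the capacity that brings the average metric back to target.

    Assumption: the per-instance metric is inversely proportional to the
    number of instances. The result is clamped to the group's size limits.
    """
    ideal = math.ceil(current_capacity * metric_value / target_value)
    return max(min_size, min(max_size, ideal))
```

With 4 instances at 75% average CPU and a 50% target, the estimate is 6 instances; at 25% average CPU it is 2.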
      

Termination Policy

ASGs use termination policies to determine which instances to terminate first during a scale-in event.

  1. Default Termination Policy: When no specific policy is set, instances are chosen in this order:
    1. Terminate Instances from the AZ with Most Instances: Balances the number of instances across Availability Zones to ensure high availability.
    2. Oldest Launch Configuration or template: Selects instances launched from the oldest launch configuration or template.
    3. Closest to the Next Billing Hour: Prefers instances nearing the end of their billing cycle (for cost optimization).
    4. Random Selection: If multiple instances meet the above criteria, AWS terminates a random instance.
  2. Custom Termination Policies:
    1. Oldest Instance: Prioritizes terminating the instance that has been running the longest.
    2. Newest Instance: Targets the most recently launched instance.
    3. Oldest Launch Configuration: Terminates instances created with older launch configurations or templates.
    4. Closest to Billing Hour: Focuses on cost-saving by terminating instances closest to the next billing cycle.
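
The default policy's ordering can be read as a series of filters. The sketch below is a simplified model under stated assumptions: the `Instance` fields are hypothetical, a lower `template_version` stands for an older launch configuration or template, and real groups apply additional rules that are not modeled here.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """Hypothetical view of an instance for this example."""
    instance_id: str
    az: str
    template_version: int   # lower number = older configuration
    minutes_into_hour: int  # minutes elapsed in the current billing hour


def pick_instance_to_terminate(instances: list) -> Instance:
    """Apply the four default criteria in order and return one instance."""
    if not instances:
        raise ValueError("no instances to choose from")

    # 1. Keep only instances in the AZ(s) holding the most instances.
    counts = Counter(i.az for i in instances)
    most = max(counts.values())
    candidates = [i for i in instances if counts[i.az] == most]

    # 2. Keep only those on the oldest configuration or template.
    oldest = min(i.template_version for i in candidates)
    candidates = [i for i in candidates if i.template_version == oldest]

    # 3. Keep only those closest to the next billing hour.
    closest = max(i.minutes_into_hour for i in candidates)
    candidates = [i for i in candidates if i.minutes_into_hour == closest]

    # 4. Break any remaining tie at random.
    return random.choice(candidates)
```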

Question: Why terminate instances from the oldest Launch Configuration or template?

Answer: Instances launched from an older launch configuration or template may be:

  1. Running outdated AMIs
  2. Less optimized
  3. Not aligned with current infrastructure standards

Question: Why terminate instances closest to the next billing hour?

Answer: This criterion matters when instances are billed by the hour (or fraction of an hour), because the current hour is paid for whether or not the instance keeps running. Terminating the instance closest to its next billing hour wastes the least paid-for time and avoids starting a new billable hour. Example: If an instance has been running for 50 minutes, terminating it now forfeits only 10 paid minutes and avoids the charge for the next hour. Note that many EC2 instances, such as Linux On-Demand instances, are now billed per second, in which case this criterion saves little.
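
The arithmetic behind this answer can be sketched as follows, assuming strict hourly billing. The function names and instance IDs are hypothetical and only serve the example.

```python
def unused_paid_minutes(minutes_running: int) -> int:
    """Minutes already paid for but unused if terminated now.

    Assumes each started hour is billed in full.
    """
    return (60 - minutes_running % 60) % 60


def closest_to_billing_hour(running_minutes: dict) -> str:
    """Return the instance ID that wastes the least paid time if terminated."""
    return min(running_minutes,
               key=lambda iid: unused_paid_minutes(running_minutes[iid]))
```

An instance that has run for 50 minutes has 10 unused paid minutes, while one that has run for 130 minutes (10 minutes into its third hour) has 50, so the first is the cheaper one to terminate.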

Launch Configuration and Launch Template

In an Auto Scaling group (ASG), we define how EC2 instances should be launched using either a Launch Configuration or a Launch Template. This configuration tells the ASG how to create and manage EC2 instances.

What is a Launch Configuration?

It is a legacy resource used by an ASG to define how to launch EC2 instances. Without this information, the ASG does not know how to create EC2 instances.

It includes settings like:

  1. AMI ID (Amazon Machine Image)
  2. Instance type (e.g., t3.medium)
  3. Key pair
  4. Security groups
  5. EBS volumes
  6. User data script
  7. IAM role

Important:

  1. Immutable — once created, it cannot be edited. You must create a new one if you want changes.
  2. AWS recommends using Launch Templates instead.

What is a Launch Template?

It is a more flexible and modern version of a Launch Configuration.

It supports:

  1. All features of launch configurations
  2. Multiple versions (so you can update settings easily)
  3. Spot Instances, T2/T3 Unlimited, placement groups, and mixed instances policies
  4. Integration with services like EC2 Fleet and Spot Fleet
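
To make the contrast concrete, here is a minimal sketch of immutable settings versus a versioned template. `LaunchSettings` and `LaunchTemplate` are hypothetical classes for illustration, not AWS SDK types, and the sketch assumes each new version starts from the latest one.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LaunchSettings:
    """Immutable settings, similar in spirit to a launch configuration."""
    ami_id: str
    instance_type: str


class LaunchTemplate:
    """Keeps every version so older settings stay available."""

    def __init__(self, settings: LaunchSettings):
        self._versions = [settings]  # version 1 lives at index 0

    def create_version(self, **changes) -> int:
        """Copy the latest version with changes and return the new version number."""
        self._versions.append(replace(self._versions[-1], **changes))
        return len(self._versions)

    def get(self, version: int) -> LaunchSettings:
        return self._versions[version - 1]
```

Changing the AMI creates version 2, while version 1 remains unchanged and can still be referenced.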

Suspend and Resume Amazon EC2 Auto Scaling processes

The suspend-resume feature in Auto Scaling supports process types including the following eight:

  1. Launch: Triggers the creation of new instances to meet scaling requirements.
  2. Terminate: Triggers the termination of instances to scale down the group.
  3. HealthCheck: Performs health checks on instances to ensure they are functioning properly.
  4. ReplaceUnhealthy: Replaces instances that are deemed unhealthy with new ones.
  5. AZRebalance: Rebalances instances across Availability Zones to maintain even distribution.
  6. AlarmNotification: Accepts notifications from CloudWatch alarms that are associated with scaling policies. While it is suspended, those alarms do not trigger scaling actions.
  7. ScheduledActions: Executes scaling actions based on predefined schedules.
  8. AddToLoadBalancer: Adds instances to the associated load balancer to start receiving traffic.

When any of the above process types is suspended, any activity related to that process will be ignored or paused — even if a trigger occurs. The trigger may still fire, but the associated action will not be executed. Once the process is resumed, the Auto Scaling group regains the ability to act on those triggers going forward.

What is Process and Trigger in ASG?

  1. Processes are internal operations that the Auto Scaling group performs. These processes (Launch, Terminate, and so on) can be suspended or resumed.
  2. Triggers are external events or conditions (such as CloudWatch alarms, scheduled actions, or instance health changes) that initiate Auto Scaling actions. Triggers themselves cannot be suspended, but the actions they attempt to initiate can be paused by suspending the corresponding process.
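
The relationship between a trigger and a suspended process can be modeled with a small sketch. `ScalingGroup` is a hypothetical class and a deliberate simplification: it only shows that an alarm can still fire while the resulting action is skipped.

```python
PROCESS_TYPES = {
    "Launch", "Terminate", "HealthCheck", "ReplaceUnhealthy",
    "AZRebalance", "AlarmNotification", "ScheduledActions",
    "AddToLoadBalancer",
}


class ScalingGroup:
    """Hypothetical model of a group with suspendable processes."""

    def __init__(self, desired_capacity: int):
        self.desired_capacity = desired_capacity
        self.suspended = set()

    def suspend(self, *processes: str) -> None:
        unknown = set(processes) - PROCESS_TYPES
        if unknown:
            raise ValueError(f"unknown process types: {sorted(unknown)}")
        self.suspended.update(processes)

    def resume(self, *processes: str) -> None:
        self.suspended.difference_update(processes)

    def handle_alarm(self, instances_to_add: int) -> bool:
        """A CloudWatch alarm fired. Return True if the group acted on it."""
        if "AlarmNotification" in self.suspended:
            return False  # the trigger fired, but the action is skipped
        self.desired_capacity += instances_to_add
        return True
```

While AlarmNotification is suspended, `handle_alarm` leaves the capacity untouched; after `resume`, later alarms take effect again.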

Standby State for Instances in an Auto Scaling Group

You can put an instance that is in the InService state into the Standby state, update some software or troubleshoot the instance, and then return the instance to service. Instances that are on standby are still part of the Auto Scaling group, but they do not actively handle application traffic.

Use Case Example

If you need to patch the software on instances in your Auto Scaling group, you can:

  1. Place the existing instances in Standby.
  2. Apply the necessary updates (e.g., software patches).
  3. Return them to InService once completed.

Difference Between Detach and Standby

  1. Detach: It is an action we take on an instance within the ASG. It removes the instance from the ASG. The instance becomes independent and can be treated as a standalone EC2 instance or attached to another group.
  2. Standby: Keeps the instance within the group, but removes it temporarily from active traffic, allowing you to manage or troubleshoot without affecting scaling behavior.
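
The difference can be seen in a small state model. The `Group` class below is a hypothetical sketch: Standby changes an instance's state but keeps it in the group, while Detach removes it from the group entirely.

```python
class Group:
    """Hypothetical model of instance membership and state in an ASG."""

    def __init__(self, instance_ids):
        self.states = {iid: "InService" for iid in instance_ids}

    def enter_standby(self, instance_id: str) -> None:
        if self.states.get(instance_id) != "InService":
            raise ValueError("only InService instances can enter Standby")
        self.states[instance_id] = "Standby"

    def exit_standby(self, instance_id: str) -> None:
        if self.states.get(instance_id) != "Standby":
            raise ValueError("instance is not in Standby")
        self.states[instance_id] = "InService"

    def detach(self, instance_id: str) -> None:
        # The instance keeps running but no longer belongs to the group.
        del self.states[instance_id]

    def serving(self) -> list:
        """IDs of instances that currently handle traffic."""
        return sorted(i for i, s in self.states.items() if s == "InService")
```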

Suspend-Resume vs. Standby State

  1. Suspend-Resume: Affects Auto Scaling processes (like Launch, Terminate, HealthCheck) at the group or process level. It is used to pause scaling actions across the group.
  2. Standby: Affects a specific instance by removing it temporarily from traffic while keeping it associated with the group.

Suspend-Resume and Standby State in a Practical Example

To let the DevOps team apply production patches without triggering unnecessary scaling events:

  1. Put the instance into Standby: This stops the instance from serving traffic, but it remains part of the Auto Scaling group.
  2. Suspend the HealthCheck process: This prevents the Auto Scaling group from mistakenly replacing or terminating the instance during maintenance.
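
The two steps above, plus the matching cleanup, can be written down as an ordered plan. This sketch only builds the plan and calls nothing. The operation and parameter names follow the boto3 Auto Scaling client as I understand it and should be checked against the current API reference; `build_maintenance_plan` itself is a hypothetical helper.

```python
def build_maintenance_plan(group_name: str, instance_ids) -> list:
    """Return (operation, parameters) pairs in the order they would run."""
    ids = list(instance_ids)
    return [
        # Step 1: take the instances out of traffic but keep them in the group.
        # Decrementing desired capacity avoids launching replacements.
        ("enter_standby", {
            "AutoScalingGroupName": group_name,
            "InstanceIds": ids,
            "ShouldDecrementDesiredCapacity": True,
        }),
        # Step 2: pause health checks for the duration of the maintenance.
        ("suspend_processes", {
            "AutoScalingGroupName": group_name,
            "ScalingProcesses": ["HealthCheck"],
        }),
        # ... patching happens here, outside this plan ...
        # Cleanup: return the instances to service, then resume health checks.
        ("exit_standby", {
            "AutoScalingGroupName": group_name,
            "InstanceIds": ids,
        }),
        ("resume_processes", {
            "AutoScalingGroupName": group_name,
            "ScalingProcesses": ["HealthCheck"],
        }),
    ]
```

With credentials configured, each pair could be applied to a client with `getattr(client, operation)(**parameters)`; that call is left out here so the sketch runs without AWS access.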