AWS Auto Scaling

  • Configure automatic scaling for your AWS resources quickly through a scaling plan that uses dynamic scaling and predictive scaling.
  • Optimize for availability, for cost, or a balance of both.
  • Scaling in means decreasing the size of a group while scaling out means increasing the size of a group.
  • Useful for
    • Cyclical traffic such as high use of resources during regular business hours and low use of resources overnight
    • On and off traffic patterns, such as batch processing, testing, or periodic analysis
    • Variable traffic patterns, such as software for marketing campaigns with periods of spiky growth
  • It is a region-specific service.
  • Features

    • Launch or terminate EC2 instances in an Auto Scaling group.
    • Launch or terminate instances from an EC2 Spot Fleet request, or automatically replace instances that get interrupted for price or capacity reasons.
    • Adjust the ECS service desired count up or down in response to load variations.
    • Enable a DynamoDB table or a global secondary index to increase or decrease its provisioned read and write capacity to handle increases in traffic without throttling.
    • Dynamically adjust the number of Aurora read replicas provisioned for an Aurora DB cluster to handle changes in active connections or workload.
    • Use Dynamic Scaling to add and remove capacity for resources to maintain resource utilization at the specified target value.
    • Use Predictive Scaling to forecast your future load demands by analyzing your historical records for a metric. It also allows you to schedule scaling actions that proactively add and remove resource capacity to reflect the load forecast, and control maximum capacity behavior. Predictive Scaling is only available for EC2 Auto Scaling groups.
    • AWS Auto Scaling scans your environment and automatically discovers the scalable cloud resources underlying your application, so you don’t have to manually identify these resources one by one through individual service interfaces.
    • You can suspend and resume any of your AWS Application Auto Scaling actions.
  • Amazon EC2 Auto Scaling

    • Ensures that you have the correct number of EC2 instances available to handle your application load by using Auto Scaling groups.
    • An Auto Scaling group contains a collection of EC2 instances that share similar characteristics and are treated as a logical grouping for the purposes of instance scaling and management.
    • You specify the minimum, maximum and desired number of instances in each Auto Scaling group.
    • Key Components
      • Groups – Your EC2 instances are organized into groups so that they are treated as a logical unit for scaling and management. When you create a group, you can specify its minimum, maximum, and desired number of EC2 instances.
      • Launch configurations – Your group uses a launch configuration as a template for its EC2 instances. When you create a launch configuration, you can specify information such as the AMI ID, instance type, key pair, security groups, and block device mapping for your instances.
      • Scaling options – How to scale your Auto Scaling groups.

    • Auto Scaling Lifecycle


    • You can add a lifecycle hook to your Auto Scaling group to perform custom actions when instances launch or terminate.
    • Scaling Options
      • Scale to maintain current instance levels at all times
      • Manual Scaling
      • Scale based on a schedule
      • Scale based on demand
    • Scaling Policy Types
      • Target tracking scaling—Increase or decrease the current capacity of the group based on a target value for a specific metric.
      • Step scaling—Increase or decrease the current capacity of the group based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach.
      • Simple scaling—Increase or decrease the current capacity of the group based on a single scaling adjustment.
    • The cooldown period is a configurable setting that helps ensure that Auto Scaling does not launch or terminate additional instances before previous scaling activities take effect.
      • EC2 Auto Scaling supports cooldown periods when using simple scaling policies, but not when using target tracking policies, step scaling policies, or scheduled scaling.
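The interaction between a simple scaling policy and the cooldown period can be sketched in a few lines of Python. This is a toy model rather than the AWS API: the `Group` class, the `simple_scale` function, and the 300-second cooldown value are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Group:
    """Toy model of an Auto Scaling group (not the AWS API)."""
    min_size: int
    max_size: int
    desired: int
    cooldown_seconds: int = 300  # assumed value for this sketch
    last_scaling_time: Optional[float] = None


def simple_scale(group: Group, adjustment: int, now: float) -> bool:
    """Apply one scaling adjustment unless the group is still cooling down.

    Returns True if the desired capacity was changed.
    """
    in_cooldown = (
        group.last_scaling_time is not None
        and now - group.last_scaling_time < group.cooldown_seconds
    )
    if in_cooldown:
        # The previous scaling activity has not had time to take effect yet.
        return False
    # Keep the new desired capacity within the group's min/max bounds.
    new_desired = max(group.min_size, min(group.max_size, group.desired + adjustment))
    if new_desired == group.desired:
        return False
    group.desired = new_desired
    group.last_scaling_time = now
    return True
```

In this model, a second adjustment requested 100 seconds after the first is ignored, while one requested after the cooldown has elapsed is applied and capped at the maximum size.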
    • Amazon EC2 Auto Scaling marks an instance as unhealthy if the instance is in a state other than running, the system status is impaired, or Elastic Load Balancing reports that the instance failed the health checks.
    • Termination of Instances
      • When you configure automatic scale in, you must decide which instances should terminate first and set up a termination policy. You can also use instance protection to prevent specific instances from being terminated during automatic scale in.
      • Default Termination Policy


      • Custom Termination Policies
        • OldestInstance – Terminate the oldest instance in the group.
        • NewestInstance – Terminate the newest instance in the group.
        • OldestLaunchConfiguration – Terminate instances that have the oldest launch configuration.
        • ClosestToNextInstanceHour – Terminate instances that are closest to the next billing hour.
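The four custom termination policies listed above can be illustrated with a small selection function. This is a simplified sketch: the `Instance` class and its fields are hypothetical, and the model ignores Availability Zone balancing and instance protection, which the real service also takes into account.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Instance:
    """Hypothetical record describing one instance in the group."""
    instance_id: str
    launch_time: datetime
    launch_config_created: datetime  # creation time of the launch configuration used


def seconds_to_next_instance_hour(inst: Instance, now: datetime) -> float:
    """Seconds remaining until the instance completes its current running hour."""
    elapsed = (now - inst.launch_time).total_seconds()
    return 3600 - (elapsed % 3600)


def pick_instance_to_terminate(instances: List[Instance], policy: str, now: datetime) -> Instance:
    """Choose one instance according to the named termination policy."""
    if policy == "OldestInstance":
        return min(instances, key=lambda i: i.launch_time)
    if policy == "NewestInstance":
        return max(instances, key=lambda i: i.launch_time)
    if policy == "OldestLaunchConfiguration":
        return min(instances, key=lambda i: i.launch_config_created)
    if policy == "ClosestToNextInstanceHour":
        return min(instances, key=lambda i: seconds_to_next_instance_hour(i, now))
    raise ValueError(f"unknown termination policy: {policy}")
```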
  • You can create launch templates that specify instance configuration information for launching EC2 instances. Launch templates also allow you to have multiple versions of a template.
  • A launch configuration is an instance configuration template that an Auto Scaling group uses to launch EC2 instances. In it, you specify information for the instances, such as the AMI ID, instance type, key pair, security groups, and block device mapping.
    • You can use the same launch configuration with multiple Auto Scaling groups.
    • You can only specify one launch configuration for an Auto Scaling group at a time, and you can’t modify a launch configuration after you’ve created it.
    • When you create a VPC, by default its tenancy attribute is set to default. You can launch instances with a tenancy value of dedicated so that they run as single-tenancy instances. Otherwise, they run as shared-tenancy instances by default.
    • If you set the tenancy attribute of a VPC to dedicated, all instances launched in the VPC run as single-tenancy instances.
    • When you create a launch configuration, the default value for the instance placement tenancy is null and the instance tenancy is controlled by the tenancy attribute of the VPC.

Launch Configuration Tenancy | VPC Tenancy = default   | VPC Tenancy = dedicated
not specified                | shared-tenancy instance | Dedicated Instance
default                      | shared-tenancy instance | Dedicated Instance
dedicated                    | Dedicated Instance      | Dedicated Instance
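The tenancy rules described above reduce to a small lookup, sketched here in Python. The function name `effective_tenancy` is hypothetical, and `None` stands in for a launch configuration whose tenancy is not specified.

```python
from typing import Optional


def effective_tenancy(launch_config_tenancy: Optional[str], vpc_tenancy: str) -> str:
    """Return the resulting tenancy for an instance launched by the group.

    launch_config_tenancy: None (not specified), "default", or "dedicated".
    vpc_tenancy: "default" or "dedicated".
    """
    if vpc_tenancy not in ("default", "dedicated"):
        raise ValueError(f"unexpected VPC tenancy: {vpc_tenancy}")
    if launch_config_tenancy not in (None, "default", "dedicated"):
        raise ValueError(f"unexpected launch configuration tenancy: {launch_config_tenancy}")
    # A dedicated setting on either side results in a single-tenancy instance.
    if vpc_tenancy == "dedicated" or launch_config_tenancy == "dedicated":
        return "Dedicated Instance"
    return "shared-tenancy instance"
```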

    • If you are launching the instances in your Auto Scaling group in EC2-Classic, you can link them to a VPC using ClassicLink.
  • Application Auto Scaling

    • Allows you to configure automatic scaling for the following resources:
      • Amazon ECS services
      • Spot Fleet requests
      • Amazon EMR clusters
      • AppStream 2.0 fleets
      • DynamoDB tables and global secondary indexes
      • Aurora replicas
      • Amazon SageMaker endpoint variants
      • Custom resources provided by your own applications or services.
    • Features
      • Target tracking scaling – Scale a resource based on a target value for a specific CloudWatch metric.
      • Step scaling – Scale a resource based on a set of scaling adjustments that vary based on the size of the alarm breach.
      • Scheduled scaling – Scale a resource based on the date and time.
    • Target tracking scaling
      • You can have multiple target tracking scaling policies for a scalable target, provided that each of them uses a different metric.
      • You can also optionally disable the scale-in portion of a target tracking scaling policy.
    • Step scaling
      • Increase or decrease the current capacity of a scalable target based on a set of scaling adjustments, known as step adjustments, that vary based on the size of the alarm breach.
    • Scheduled scaling
      • Scale your application in response to predictable load changes by creating scheduled actions, which tell Application Auto Scaling to perform scaling activities at specific times.
    • The scale out cooldown period is the amount of time, in seconds, after a scale out activity completes before another scale out activity can start.
    • The scale in cooldown period is the amount of time, in seconds, after a scale in activity completes before another scale in activity can start.
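To make the idea of step adjustments more concrete, the sketch below picks a capacity change from the size of the alarm breach. It is a simplified model, not the service's implementation: the `StepAdjustment` class is hypothetical, bounds are expressed relative to the alarm threshold, and treating the lower bound as inclusive and the upper bound as exclusive is an assumption of this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StepAdjustment:
    """One step: applies when lower_bound <= breach < upper_bound."""
    lower_bound: Optional[float]  # relative to the threshold; None means no lower limit
    upper_bound: Optional[float]  # relative to the threshold; None means no upper limit
    scaling_adjustment: int       # capacity units to add (or remove, if negative)


def capacity_change(metric_value: float, threshold: float, steps: List[StepAdjustment]) -> int:
    """Return the adjustment of the step that matches the size of the breach."""
    breach = metric_value - threshold
    for step in steps:
        above_lower = step.lower_bound is None or breach >= step.lower_bound
        below_upper = step.upper_bound is None or breach < step.upper_bound
        if above_lower and below_upper:
            return step.scaling_adjustment
    return 0  # no step matches, so capacity is left unchanged
```

For example, with a threshold of 60 and steps covering breaches of 0 to 10, 10 to 20, and 20 and above, a larger breach selects a larger adjustment.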
  • You can attach one or more Classic Load Balancers to your existing Auto Scaling groups. The load balancers must be in the same region as the Auto Scaling group.
  • Auto Scaling rebalances by first launching new EC2 instances in the AZs that have fewer instances. Only then does it start terminating instances in the AZs that had more instances.
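The launch-before-terminate ordering of rebalancing can be sketched as a planning function. This is an illustrative model only: the function name `rebalance_plan` and the rule of aiming for an even split across zones are assumptions, and the real service considers more factors.

```python
from typing import Dict, List, Tuple


def rebalance_plan(counts: Dict[str, int]) -> List[Tuple[str, str]]:
    """Return ordered (action, az) steps that even out instance counts.

    All launches are listed before any terminations, so capacity never
    drops below its starting level while the group is rebalanced.
    """
    if not counts:
        return []
    total = sum(counts.values())
    zones = sorted(counts)  # deterministic ordering for the output
    base, extra = divmod(total, len(zones))
    # Zones that already hold the most instances keep the leftover slots,
    # so that as few instances as possible need to move.
    by_size = sorted(zones, key=lambda z: (-counts[z], z))
    target = {z: base + (1 if i < extra else 0) for i, z in enumerate(by_size)}
    launches = [("launch", z) for z in zones for _ in range(max(0, target[z] - counts[z]))]
    terminations = [("terminate", z) for z in zones for _ in range(max(0, counts[z] - target[z]))]
    return launches + terminations
```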
  • Monitoring

    • Health checks – identify any instances that are unhealthy
      • Amazon EC2 status checks (default)
      • Elastic Load Balancing health checks
      • Custom health checks
    • Auto Scaling does not perform health checks on instances in the standby state. The standby state can be used for performing updates, changes, or troubleshooting without health checks being performed or replacement instances being launched.
    • CloudWatch metrics – enable you to retrieve statistics about Auto Scaling-published data points as an ordered set of time-series data, known as metrics. You can use these metrics to verify that your system is performing as expected.
    • CloudWatch Events – Auto Scaling can submit events to CloudWatch Events when your Auto Scaling groups launch or terminate instances, or when a lifecycle action occurs.
    • SNS notifications – Auto Scaling can send Amazon SNS notifications when your Auto Scaling groups launch or terminate instances.
    • CloudTrail logs – enable you to keep track of the calls made to the Auto Scaling API by or on behalf of your AWS account, and store the information in log files in an S3 bucket that you specify.
  • Security

    • Use IAM to help secure your resources by controlling who can perform AWS Auto Scaling actions.
    • By default, a brand new IAM user has NO permissions to do anything. To grant permissions to call Auto Scaling actions, you attach an IAM policy to the IAM users or groups that require the permissions it grants.
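As a minimal sketch of such a policy, the snippet below builds an IAM policy document as a Python dictionary and prints it as JSON. The helper name `build_autoscaling_policy`, the two example actions, and the wildcard resource are assumptions for illustration; a real policy should be scoped to the actions and resources you actually need.

```python
import json
from typing import Any, Dict, Iterable


def build_autoscaling_policy(actions: Iterable[str], resource: str = "*") -> Dict[str, Any]:
    """Build a policy document that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": resource,
            }
        ],
    }


# Example: let a user inspect groups and change their desired capacity.
policy = build_autoscaling_policy(
    [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:SetDesiredCapacity",
    ]
)
print(json.dumps(policy, indent=2))
```

The resulting JSON could then be attached to an IAM user or group through the console, CLI, or an SDK.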
  • Limits

    • Scaling plans: 100
    • Target tracking configurations per scaling instruction: 10
    • Target tracking configurations per scaling plan: 500
    • Scalable targets: 500
    • Scaling policies per scalable target: 50
    • Scheduled actions per scalable target: 200
    • Step adjustments per scaling policy: 20
    • You can request a limit increase for all the limits mentioned above.

Validate Your Knowledge

Question 1

A large Philippine-based Business Process Outsourcing company is building a two-tier web application in their VPC to serve dynamic transaction-based content. The data tier is leveraging an Online Transactional Processing (OLTP) database but for the web tier, they are still deciding what service they will use.

What AWS services should you leverage to build an elastic and scalable web tier?

  1. Elastic Load Balancing, Amazon EC2, and Auto Scaling
  2. Elastic Load Balancing, Amazon RDS with Multi-AZ, and Amazon S3
  3. Amazon RDS with Multi-AZ and Auto Scaling
  4. Amazon EC2, Amazon DynamoDB, and Amazon S3

Correct Answer: 1

Amazon RDS is a suitable database service for online transaction processing (OLTP) applications. However, the question asks for a list of AWS services for the web tier and not the database tier. Also, when it comes to services providing scalability and elasticity for your web tier, Auto Scaling and Elastic Load Balancing should immediately come to mind. Therefore, Option 1 is the correct answer.

To build an elastic and highly available web tier, you can use Amazon EC2, Auto Scaling, and Elastic Load Balancing. You can deploy your web servers on a fleet of EC2 instances in an Auto Scaling group, which monitors your applications and automatically adjusts capacity to maintain steady, predictable performance at the lowest possible cost. Load balancing is an effective way to increase the availability of a system. Instances that fail can be replaced seamlessly behind the load balancer while other instances continue to operate. Elastic Load Balancing can be used to balance traffic across instances in multiple Availability Zones of a region.

Options 2, 3, and 4 are incorrect since they don’t include all of the services required to build a highly available and scalable web tier, namely EC2, Auto Scaling, and Elastic Load Balancing. Although Amazon RDS with Multi-AZ and DynamoDB are highly scalable databases, the scenario is focused on building the web tier and not the database tier.


Question 2

A tech company has a CRM application hosted on an Auto Scaling group of On-Demand EC2 instances. The application is extensively used during office hours from 9 in the morning till 5 in the afternoon. Their users are complaining that the performance of the application is slow during the start of the day but then works normally after a couple of hours.

Which of the following can be done to ensure that the application works properly at the beginning of the day?

  1. Configure a Dynamic scaling policy for the Auto Scaling group to launch new instances based on the CPU utilization.
  2. Configure a Dynamic scaling policy for the Auto Scaling group to launch new instances based on the Memory utilization.
  3. Configure a Scheduled scaling policy for the Auto Scaling group to launch new instances before the start of the day.
  4. Set up an Application Load Balancer (ALB) to your architecture to ensure that the traffic is properly distributed on the instances.

Correct Answer: 3

Scaling based on a schedule allows you to scale your application in response to predictable load changes. For example, every week the traffic to your web application starts to increase on Wednesday, remains high on Thursday, and starts to decrease on Friday. You can plan your scaling activities based on the predictable traffic patterns of your web application.

Figure: An illustration of a basic Auto Scaling group.

To configure your Auto Scaling group to scale based on a schedule, you create a scheduled action. The scheduled action tells Amazon EC2 Auto Scaling to perform a scaling action at specified times. To create a scheduled scaling action, you specify the start time when the scaling action should take effect, and the new minimum, maximum, and desired sizes for the scaling action. At the specified time, Amazon EC2 Auto Scaling updates the group with the values for minimum, maximum, and desired size specified by the scaling action. You can create scheduled actions for scaling one time only or for scaling on a recurring schedule.
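A scheduled action like the one described above could be prepared as follows. This sketch only builds the request parameters. The helper name, the group name `crm-web-asg`, the action name, the sizes, and the cron expression are all hypothetical values chosen for the example, while the dictionary keys follow the parameter names of the EC2 Auto Scaling `PutScheduledUpdateGroupAction` operation.

```python
from typing import Any, Dict


def scheduled_action_request(
    group_name: str,
    action_name: str,
    recurrence: str,
    min_size: int,
    max_size: int,
    desired: int,
) -> Dict[str, Any]:
    """Build the parameters for a recurring scheduled scaling action."""
    if not (min_size <= desired <= max_size):
        raise ValueError("desired capacity must lie between min_size and max_size")
    return {
        "AutoScalingGroupName": group_name,
        "ScheduledActionName": action_name,
        "Recurrence": recurrence,  # cron expression
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
    }


# Scale out at 08:30 on weekdays, shortly before office hours begin.
request = scheduled_action_request(
    "crm-web-asg", "scale-out-before-office-hours", "30 8 * * 1-5", 4, 10, 6
)

# With boto3 installed and AWS credentials configured, the request could be sent with:
#   import boto3
#   boto3.client("autoscaling").put_scheduled_update_group_action(**request)
```

A matching action with smaller sizes in the evening would scale the group back in after office hours.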

Option 3 is the correct answer. You need to configure a Scheduled scaling policy. This will ensure that the instances are already scaled out and ready before the start of the day, since this is when the application is used the most.

Options 1 and 2 are incorrect because, although dynamic scaling is a valid solution, it is still better to configure a Scheduled scaling policy since you already know the exact peak hours of your application. By the time either the CPU or memory utilization hits a peak, the application already has performance issues, so you need to ensure that scaling is done beforehand using a Scheduled scaling policy.

Option 4 is incorrect. Although the Application Load Balancer can balance the traffic, it cannot increase the number of instances based on demand.


