HashiCorp Terraform AWS Provider v3.4.0 now supports aws_emr_managed_scaling_policy

We had been closely watching HashiCorp Terraform AWS provider issue #13952 during a recent implementation of EMR v5.30.0. The requirements included auto scaling for EMR. We looked to the AWS EMR Managed Scaling feature, but were sad to see that the AWS provider did not support it yet.

As of 08/27/20, it is supported in Terraform AWS provider v3.4.0 through the aws_emr_managed_scaling_policy resource! In our case, we used:

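    resource "aws_emr_managed_scaling_policy" "emrautoscalingpolicy" {
      # Attach the policy to the EMR cluster defined elsewhere in the configuration
      cluster_id = aws_emr_cluster.emr_cluster.id

      compute_limits {
        unit_type              = "Instances"
        minimum_capacity_units = var.emr_scaling_minimum_capacity_units
        maximum_capacity_units = var.emr_scaling_maximum_capacity_units
      }

      # The cluster_id reference above already creates an implicit dependency;
      # the explicit depends_on is kept here only for clarity
      depends_on = [
        aws_emr_cluster.emr_cluster
      ]
    }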

(aws_emr_cluster.emr_cluster.id refers to an EMR cluster resource created elsewhere in the same configuration.)
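
The two var.emr_scaling_* inputs used in the policy need matching variable declarations. A minimal sketch of what those could look like follows; the descriptions, the number type, and the choice to leave out defaults are our assumptions for illustration, not the original configuration:

```hcl
# Hypothetical declarations for the inputs referenced in compute_limits.
# No defaults are set, so values must be supplied (e.g. via a .tfvars file).
variable "emr_scaling_minimum_capacity_units" {
  description = "Lower boundary of cluster capacity, in the units chosen by unit_type"
  type        = number
}

variable "emr_scaling_maximum_capacity_units" {
  description = "Upper boundary of cluster capacity, in the units chosen by unit_type"
  type        = number
}
```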

Insight into the compute_limits parameters:

You need to configure the following parameters for managed scaling. The limit only applies to the core and task nodes. The master node cannot be scaled after initial configuration.

  • Minimum (MinimumCapacityUnits) – The lower boundary of allowed EC2 capacity in a cluster. It is measured through virtual central processing unit (vCPU) cores or instances for instance groups. It is measured through units for instance fleets.

  • Maximum (MaximumCapacityUnits) – The upper boundary of allowed EC2 capacity in a cluster. It is measured through virtual central processing unit (vCPU) cores or instances for instance groups. It is measured through units for instance fleets.

  • On-Demand limit (MaximumOnDemandCapacityUnits) (Optional) – The upper boundary of allowed EC2 capacity for On-Demand market type in a cluster. If this parameter is not specified, it defaults to the value of MaximumCapacityUnits.

    This parameter is used to split capacity allocation between On-Demand and Spot Instances. For example, if you set the minimum parameter as 2 instances, the maximum parameter as 100 instances, the On-Demand limit as 10 instances, then EMR managed scaling scales up to 10 On-Demand Instances and allocates the remaining capacity to Spot Instances. For more information, see Node Allocation Scenarios.

  • Maximum core nodes (MaximumCoreCapacityUnits) (Optional) – The upper boundary of allowed EC2 capacity for core node type in a cluster. If this parameter is not specified, it defaults to the value of MaximumCapacityUnits.

    This parameter is used to split capacity allocation between core and task nodes. For example, if you set the minimum parameter as 2 instances, the maximum as 100 instances, the maximum core node as 17 instances, then EMR managed scaling scales up to 17 core nodes and allocates the remaining 83 instances to task nodes. For more information, see Node Allocation Scenarios.
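
In Terraform, the two optional limits map to the maximum_ondemand_capacity_units and maximum_core_capacity_units arguments of compute_limits. As a rough sketch, the block below puts the numbers from both examples above into a single policy purely to show where each argument goes; the resource name "example" is hypothetical, and a real policy would use values sized for its own workload:

```hcl
# Illustrative only: combines the figures from the two examples above.
resource "aws_emr_managed_scaling_policy" "example" {
  cluster_id = aws_emr_cluster.emr_cluster.id

  compute_limits {
    unit_type              = "Instances"
    minimum_capacity_units = 2   # MinimumCapacityUnits
    maximum_capacity_units = 100 # MaximumCapacityUnits

    # Optional: cap On-Demand capacity; the rest can be filled by Spot Instances
    maximum_ondemand_capacity_units = 10

    # Optional: cap core nodes; the remaining capacity goes to task nodes
    maximum_core_capacity_units = 17
  }
}
```

Leaving either optional argument out means it falls back to the value of maximum_capacity_units, as described above.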

Require Terraform AWS provider v3.4.0 in the provider block:

    provider "aws" {
      # "~> 3.4.0" accepts v3.4.0 and any later 3.4.x patch release
      version = "~> 3.4.0"
      profile = var.aws-profile
      region  = var.aws_region
    }
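
On Terraform 0.13 and later, the version argument inside a provider block is deprecated in favor of a required_providers entry. If you are on one of those releases, a sketch of the equivalent constraint could look like this, assuming the provider is pulled from the public registry as hashicorp/aws:

```hcl
# Terraform 0.13+ style: the version constraint moves to required_providers.
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.4.0"
    }
  }
}

# The provider block then carries only configuration, not the version.
provider "aws" {
  profile = var.aws-profile
  region  = var.aws_region
}
```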

After adding the above, along with the needed variable declarations, and running terraform apply, voilà:

[Screenshot: AWS EMR]