Terraform vs. AWS CDK

Tanja Nyberg
7 min read · Jul 11, 2023

Infrastructure as Code for Data Scientists.

An article about IaC technology for data scientists, with a hands-on lab.

In the world of AWS cloud infrastructure automation, two prominent tools have emerged: Terraform and AWS CDK (Cloud Development Kit). Both Terraform and AWS CDK provide Infrastructure as Code (IaC) capabilities, allowing developers and operations teams to define and provision cloud resources in a declarative manner.

Infrastructure as Code equips data scientists with the skills to efficiently manage, provision, and scale infrastructure resources. It enhances reproducibility, collaboration, and automation capabilities while enabling cost optimization and portability across different environments.

Python is a preferred language for many data scientists due to its extensive data analysis libraries and ease of use. AWS CDK recognizes the popularity of Python among data scientists and offers it as one of the supported languages for developing Infrastructure as Code, enabling data scientists to leverage their Python skills for managing cloud infrastructure.

The question is where to start exploring. I suggest we compare the deployment of an Ubuntu EC2 instance of type t2.micro with a 32 GiB root volume using Terraform and CDK.

PART I: Terraform

What is Terraform?

Terraform, developed by HashiCorp, is a widely adopted open-source tool that supports multi-cloud and on-premises infrastructure provisioning. It uses a domain-specific language (DSL) called HashiCorp Configuration Language (HCL) to describe infrastructure resources and their dependencies. Terraform manages resources through providers, which are plugins that interface with various cloud providers, including AWS, Azure, Google Cloud, and more. It maintains a state file to track the state of the infrastructure and supports a wide range of resource types and configurations.
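To build some intuition for why Terraform keeps a state file, here is a deliberately simplified Python sketch. It is a toy model, not Terraform's actual algorithm: the function `plan` and the example data are hypothetical, and real state handling is far more involved. The idea is only that comparing the desired configuration with the recorded state tells the tool what to create, update, or delete.

```python
# Toy model only: real Terraform state and planning are far more involved.

def plan(desired: dict, state: dict) -> dict:
    """Compare desired configuration with recorded state and list the actions needed."""
    actions = {"create": [], "update": [], "delete": []}
    for address, config in desired.items():
        if address not in state:
            actions["create"].append(address)
        elif state[address] != config:
            actions["update"].append(address)
    for address in state:
        if address not in desired:
            actions["delete"].append(address)
    return actions


# Hypothetical example data: resource addresses mapped to their settings.
desired = {
    "aws_instance.ec2_instance": {"instance_type": "t2.micro", "volume_size": 32},
}
state = {
    "aws_instance.ec2_instance": {"instance_type": "t2.micro", "volume_size": 8},
    "aws_instance.old_instance": {"instance_type": "t2.micro", "volume_size": 8},
}

result = plan(desired, state)
print(result)
```

In this made-up example the first instance would be updated because its volume size changed, and the second would be deleted because it no longer appears in the configuration.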

What you need:

- Terraform CLI from https://www.terraform.io/downloads.html. Make sure it’s properly installed and available in your system’s PATH.

- AWS CLI from https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html. Run `aws configure` to enter your credentials from AWS IAM.
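If you prefer to check the prerequisites from Python, a minimal sketch could look like this. The helper `missing_tools` is a hypothetical name; it only relies on `shutil.which` from the standard library and assumes the executables are called `terraform` and `aws`.

```python
import shutil


def missing_tools(tools):
    """Return the names from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# "terraform" and "aws" are the assumed executable names of the two CLIs.
missing = missing_tools(["terraform", "aws"])
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("Terraform CLI and AWS CLI are both available.")
```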

Documentation:

https://developer.hashicorp.com/terraform/tutorials/aws-get-started

Terraform Community:

Official Terraform Community Forum: https://discuss.hashicorp.com/c/terraform-core/13

HashiCorp Community Slack: https://slack.hashicorp.com/ (Join the #terraform channel)

Terraform GitHub Discussions: https://github.com/hashicorp/terraform/discussions

Project Plan:

1. Create a working directory, let’s say “Terraform”. In the terminal, change to this directory and create the configuration file `main.tf`, which will contain the declarative definition of infrastructure resources and their configuration settings.

2. Open the directory in the IDE of your choice, such as Visual Studio Code, and enter the following Terraform code in `main.tf`:


terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.7.0"
    }
  }
}

provider "aws" {
  region  = "eu-central-1"
  profile = "default"
}

resource "aws_instance" "ec2_instance" {
  ami           = "ami-04e601abe3e1a910f" # AMI IDs are region-specific; use one that exists in your region
  instance_type = "t2.micro"
  key_name      = "meinpaar" # Replace with your key pair name

  root_block_device {
    volume_size = 32 # Size in GiB
  }

  tags = {
    Name = "MyEC2Instance"
  }
}

3. In the Terminal run the command `terraform init` to initialize the Terraform working directory. This command performs backend initialization, provider plugin installation, module installation (if applicable), and initialization of local state (if you are using the default local backend). You will receive messages about what is being done.

$ terraform init

4. Run the command `terraform plan` to create an execution plan. The plan lists what Terraform would create, change, or destroy, but it does not change anything yet, so there is no running EC2 instance after this step.

$ terraform plan

5. Deploy the resource with the command `terraform apply`. Check the instance in AWS.

$ terraform apply

6. To clean up the instance, use the command `terraform destroy`.

$ terraform destroy
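Since this article targets Python users, the init, plan, and apply steps above can also be driven from a small Python script. This is only a sketch under stated assumptions: the function names `terraform_workflow` and `run_workflow` are hypothetical, and the working directory is assumed to be the “Terraform” directory from step 1. `-chdir` and `-auto-approve` are options of the Terraform CLI.

```python
import subprocess


def terraform_workflow(workdir, auto_approve=False):
    """Build the argument lists for init, plan, and apply in `workdir`."""
    # -chdir is a global option and has to come before the subcommand.
    base = ["terraform", f"-chdir={workdir}"]
    apply_cmd = base + ["apply"]
    if auto_approve:
        # Skips the interactive confirmation; use with care.
        apply_cmd.append("-auto-approve")
    return [base + ["init"], base + ["plan"], apply_cmd]


def run_workflow(workdir, auto_approve=False):
    """Run each command in order and stop at the first failure."""
    for cmd in terraform_workflow(workdir, auto_approve):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)


# Only print the commands here; call run_workflow("Terraform") to execute them.
for cmd in terraform_workflow("Terraform"):
    print(" ".join(cmd))
```

Building the argument lists separately from running them makes it easy to review what would be executed before anything touches your AWS account.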

Tips:

  • For a better understanding of what Terraform does, set the environment variable TF_LOG to DEBUG in the terminal session where you run Terraform. Use the first command on Linux or macOS and the second in the Windows Command Prompt:
$ export TF_LOG=DEBUG
$ set TF_LOG=DEBUG
  • Additionally, you can redirect the output of `terraform plan` to a file for further analysis. Terraform writes its log messages to stderr, so redirect stderr as well to capture them:
$ terraform plan > plan.log 2>&1
  • Ensure you have the correct AMI ID for your region. If the ID is not correct, you will receive the error message “Error: error collecting instance settings: couldn’t find resource,” which can be confusing. Instead of hard-coding the ID, you can add a data block that looks up the most recent matching AMI and reference it in the instance with `ami = data.aws_ami.ubuntu.id`:
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }
}

  • When you run `terraform destroy` without specifying a resource, it destroys all the resources that Terraform manages in the current working directory and workspace. The same applies to the other commands; use the `-target` option to limit a command to a specific resource (in the example above, `aws_instance.ec2_instance`):
$ terraform destroy -target=aws_instance.[name of instance]
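The AMI lookup tip above combines a wildcard filter on the image name with picking the most recent match. The following Python sketch mimics that selection logic on a local list. It does not call AWS: the function `most_recent_image` is a hypothetical name, and the image records, IDs, and dates are made up for illustration.

```python
from fnmatch import fnmatchcase


def most_recent_image(images, name_pattern):
    """Return the newest image whose name matches the wildcard pattern, or None."""
    matches = [img for img in images if fnmatchcase(img["name"], name_pattern)]
    if not matches:
        return None
    # ISO 8601 timestamps in the same format sort chronologically as strings.
    return max(matches, key=lambda img: img["creation_date"])


# Hypothetical image records; the IDs and dates are invented.
images = [
    {"id": "ami-aaa",
     "name": "ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20230101",
     "creation_date": "2023-01-01T00:00:00.000Z"},
    {"id": "ami-bbb",
     "name": "ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20230601",
     "creation_date": "2023-06-01T00:00:00.000Z"},
    {"id": "ami-ccc",
     "name": "ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-20230701",
     "creation_date": "2023-07-01T00:00:00.000Z"},
]

pattern = "ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"
newest = most_recent_image(images, pattern)
print(newest["id"] if newest else "no matching image")
```

The third record is newer but does not match the name pattern, so it is ignored, which is the point of filtering before selecting the most recent image.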

PART II: CDK

What is CDK?

AWS CDK (Cloud Development Kit) is a newer framework developed by Amazon Web Services (AWS) that enables developers to define cloud infrastructure using familiar programming languages such as Python, TypeScript, and Java. In CDK you define infrastructure as constructs, which represent reusable cloud components. It provides a higher level of abstraction than Terraform, allowing developers to leverage the power of their chosen programming language to build and configure resources. AWS CDK uses AWS CloudFormation under the hood to deploy the infrastructure and automatically manages the CloudFormation stacks.
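The relationship between constructs and the CloudFormation template can be illustrated with a toy model in plain Python. This is not the `aws_cdk` API: the classes `ToyInstance` and `ToyStack` are hypothetical stand-ins, and the real library generates far richer templates. The sketch only shows the principle that objects written in a programming language are synthesized into a declarative template.

```python
import json


class ToyInstance:
    """Toy stand-in for a construct: holds settings and renders one resource."""

    def __init__(self, logical_id, instance_type, image_id):
        self.logical_id = logical_id
        self.instance_type = instance_type
        self.image_id = image_id

    def to_resource(self):
        return {
            "Type": "AWS::EC2::Instance",
            "Properties": {
                "InstanceType": self.instance_type,
                "ImageId": self.image_id,
            },
        }


class ToyStack:
    """Toy stand-in for a stack: collects constructs and synthesizes a template."""

    def __init__(self):
        self.constructs = []

    def add(self, construct):
        self.constructs.append(construct)
        return construct

    def synth(self):
        return {"Resources": {c.logical_id: c.to_resource() for c in self.constructs}}


stack = ToyStack()
stack.add(ToyInstance("MyInstance", "t2.micro", "ami-04e601abe3e1a910f"))
template = stack.synth()
print(json.dumps(template, indent=2))
```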

Many developers and engineers prefer AWS CDK to Terraform because of its higher level of abstraction and its integration with programming languages like Python. Writing infrastructure as code in a familiar language makes CDK more accessible and intuitive for Python users, who can apply their existing skills and libraries to infrastructure provisioning and management. This abstraction layer simplifies the development and maintenance of infrastructure code, offering a smoother experience for Python developers working with AWS resources.

What you need:

- AWS CLI from https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html

- Node.js from https://nodejs.org/en. The CDK command-line tool is installed through npm with `npm install -g aws-cdk`.

- pip (I have pip inside conda) from https://pypi.org/project/pip/

Documentation:

https://docs.aws.amazon.com/cdk/v2/guide/getting_started.html

https://docs.aws.amazon.com/cdk/api/v2/python/index.html

CDK Community:

CDK Developer Guide: https://docs.aws.amazon.com/cdk/latest/guide/home.html

AWS CDK GitHub Repository: https://github.com/aws/aws-cdk

CDK Developer Gitter Chat: https://gitter.im/aws/aws-cdk

Project Plan

1. In the terminal, run the command `aws configure` and enter the credentials of your AWS IAM user.

2. In the terminal, create a directory and move to this directory.

$ mkdir hello-cdk
$ cd hello-cdk

3. Run the initialization command. You may run into the same kind of dependency and virtual-environment problems you know from other Python applications.

$ cdk init app --language python

4. Activate the environment and install dependencies.

$ source .venv/bin/activate
$ pip install -r requirements.txt

5. Your working directory now contains, among other files, `app.py`, `cdk.json`, `requirements.txt`, and the package directory `hello_cdk`. Open the `hello_cdk/hello_cdk_stack.py` file and enter the CDK code.

import aws_cdk as cdk
from aws_cdk import aws_ec2 as ec2


class EC2InstanceStack(cdk.Stack):
    def __init__(self, scope: cdk.App, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Create a new VPC
        vpc = ec2.Vpc(self, 'MyVPC', max_azs=2)

        # Create a new EC2 instance
        instance = ec2.Instance(
            self, 'MyInstance',
            vpc=vpc,
            instance_type=ec2.InstanceType.of(
                ec2.InstanceClass.BURSTABLE2, ec2.InstanceSize.MICRO),
            machine_image=ec2.MachineImage.generic_linux(
                {'eu-central-1': 'ami-04e601abe3e1a910f'}),
            block_devices=[
                # The device name must match the AMI's root device to resize the
                # root volume. Ubuntu AMIs typically use /dev/sda1 (assumption;
                # check the root device name of your AMI).
                ec2.BlockDevice(
                    device_name="/dev/sda1",
                    volume=ec2.BlockDeviceVolume.ebs(
                        32,
                        encrypted=True,
                        delete_on_termination=True),
                ),
            ],
        )

6. Change the code in the `app.py` file.


import aws_cdk as cdk

from hello_cdk.hello_cdk_stack import EC2InstanceStack


app = cdk.App()
EC2InstanceStack(app, "EC2InstanceStack")

app.synth()

7. Generate the CloudFormation template with the `synth` command.

$ cdk synth

8. Check the `EC2InstanceStack.template.json` file in the `cdk.out` directory, which is the CloudFormation template. It can be used later for creating a stack in CloudFormation Designer.

9. Deploy your stack and check that the instance is deployed and running:

$ cdk deploy

10. Clean up the resources you created with the command `cdk destroy`.

$ cdk destroy
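To get a quick overview of the synthesized template mentioned in the steps above, you can count its resources by type with a few lines of Python. This is a minimal sketch: the function `count_resource_types` is a hypothetical helper, and the path assumes the template was written to the `cdk.out` directory of the project. It is useful for seeing how a single VPC construct expands into subnets, route tables, and other resources.

```python
import json
from collections import Counter
from pathlib import Path


def count_resource_types(template_path):
    """Count the resources in a CloudFormation template by their Type."""
    template = json.loads(Path(template_path).read_text())
    resources = template.get("Resources", {})
    return Counter(res["Type"] for res in resources.values())


# Assumed location of the template produced by `cdk synth`.
path = Path("cdk.out") / "EC2InstanceStack.template.json"
if path.exists():
    for resource_type, count in sorted(count_resource_types(path).items()):
        print(f"{count:3d}  {resource_type}")
else:
    print(f"{path} not found; run `cdk synth` first.")
```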

Tips:

  • The command `cdk destroy` deletes the whole stack you created. Check the status of your stack in CloudFormation.

Conclusion:

In conclusion, both Terraform and CDK play important roles in the CI/CD (Continuous Integration/Continuous Deployment) process for infrastructure provisioning and management. Terraform is a versatile choice for multi-cloud environments and offers a broader ecosystem, while CDK is well-suited for AWS-centric projects and provides a tighter integration with AWS services. Assess your specific needs, preferences, and familiarity with programming languages to make an informed decision.

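Here is a summary of the substantial differences between Terraform and CDK covered above, which you should take into account:

  • Language: Terraform uses the declarative HashiCorp Configuration Language (HCL); CDK uses general-purpose languages such as Python, TypeScript, and Java.
  • Scope: Terraform supports multi-cloud and on-premises provisioning through providers; CDK targets AWS.
  • Deployment: Terraform tracks infrastructure in its own state file; CDK synthesizes CloudFormation templates and deploys them as CloudFormation stacks.
  • Abstraction: CDK offers a higher level of abstraction through reusable constructs.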

After completing this first project, you will have experienced the differences between Terraform and CDK first-hand. Terraform is relatively easy to learn and apply, providing a straightforward approach. On the other hand, CDK offers great potential if you are already using Python extensively in your daily work. Ultimately, the choice between Terraform and CDK for data projects in AWS depends on your specific requirements and preferences. Consider factors such as familiarity with programming languages, integration with existing workflows, and the level of abstraction you desire.

Tanja Nyberg

Passionate Data and Cloud specialist with a background in finance. Thriving in Germany, bridging innovation and tech expertise.