Practical Terraform: You're Doing it Wrong (Part 2)
This is Part 2 of the set of practical Terraform tips. If you missed Part 1, check it out here:
We'll explore several more Terraform pitfalls and how to avoid them so your infra teams can succeed long-term!
1. Too many conditional resources
Have you ever seen a module that creates too many conditional resources using syntax like `for_each`, `count`, and ternary operators? These are less readable and generate a lot of complexity for maintainers.
For example:
# file: modules/storage/main.tf
variable "create_s3_bucket" {
  type        = bool
  description = "Whether to create an S3 bucket as well"
  default     = false
}

resource "aws_s3_bucket" "this" {
  count = var.create_s3_bucket ? 1 : 0
  ...
}

# Other storage resources... EFS volumes, Backup configuration, etc.
...
Used in moderation, this can be a fine way to add flexibility to your module. However, take it too far, and you'll have a Frankenstein of faux-array references like `aws_s3_bucket.this[0]`, or worse: `var.create_s3_bucket ? aws_s3_bucket.this[0] : ""`. It also makes your configuration more difficult to read and reason about.
When it becomes too much, it's better to decouple these conditional components into a separate module. Instead of using ternary operators, you either instantiate the module or you don't.
2. Resources that don't belong in a Module
We will only travel a short distance from the previous example. Let's say we have the above module with the conditionally-created S3 bucket. We may also have a few folders for deploying to Dev, Test, and Prod environments:
# file: deployments/dev/main.tf
module "storage" {
  source           = "../../modules/storage"
  create_s3_bucket = true
  env              = "dev"
}

# file: deployments/test/main.tf
module "storage" {
  source           = "../../modules/storage"
  create_s3_bucket = false
}

# file: deployments/prod/main.tf
module "storage" {
  source           = "../../modules/storage"
  create_s3_bucket = false
}
In this scenario, we create the S3 bucket only in one environment. This is typical of infrastructure resources unique to one environment's purposes, such as Developer sandboxes, QA tooling, and ad-hoc troubleshooting devices.
Using conditional resources with variables is the wrong way to solve this use case; if a resource is specific to one environment, you should only create the resource in that environment's deployment. Either of the following refactors would work in this example:
1. Create a new module named `s3_storage` and move the S3 bucket and related resources inside. Remove the variable and `count` syntax, then instantiate the module only in the dev deployment.
2. Don't use a module for the S3 bucket and related resources. Simply put them in the `deployments/dev/` folder directly.
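As a quick sketch of the first option (the `s3_storage` module name is hypothetical), the dev deployment instantiates the new module directly, while test and prod simply omit it:

```hcl
# file: deployments/dev/main.tf
module "storage" {
  source = "../../modules/storage"
  env    = "dev"
}

# No conditional variable or count needed: the S3 resources live in their
# own module, instantiated only in the environment that needs them.
module "s3_storage" {
  source = "../../modules/s3_storage"
  env    = "dev"
}
```

The test and prod deployments keep only the `storage` module, and the ternary logic disappears entirely.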
3. Sandbox Environments
This next tip is a godsend for parallelization and will also increase your confidence in your ability to spin up and tear down the infrastructure.
We often write Terraform with our environments in mind, like Dev, Test, Stage, Prod, etc. We create resources with the environment keyword in the resource name, such as naming a Lambda function `sqs-ingestor-${var.env}-${var.region}`, which creates `sqs-ingestor-dev-us-east-1`. This may be fine for your team, especially on a smaller scale and when you work by yourselves; however, what do you do when your colleague needs to test their version of the Lambda function while you're testing your feature branch?
This is where Terraform workspaces come in very handy. Say the two developers are myself and Sara Hollis, working on different new features simultaneously. If we plan the Terraform with parallelized work-streams in mind, we can name the Lambda function using the built-in `${terraform.workspace}` value, which contains the name of the current workspace.
resource "aws_lambda_function" "sqs_ingestor" {
  function_name = "sqs-ingestor-${terraform.workspace}"
  ...
  tags = {
    Name = "sqs-ingestor-${terraform.workspace}"
  }
}
Then I proceed with my work:
zcking> git checkout feature/new-redundancy-options
zcking> terraform workspace new dev-zcking
zcking> terraform workspace select dev-zcking
zcking> terraform init && terraform apply
While Sara does the same for her work:
shollis> git checkout feature/json-schema-evolution
shollis> terraform workspace new dev-shollis
shollis> terraform workspace select dev-shollis
shollis> terraform init && terraform apply
Both will apply the resources successfully and in parallel; two Lambda functions will exist afterward. Terraform calls these workspaces, but I also like to refer to them as sandboxes because we each have our own isolated environment to play in.
Note: This is usually feasible, but every use case is different. Sometimes, you may prefer to keep certain resources global or shared, such as ECS/EKS clusters, databases, and others—usually for cost and data reasons.
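Workspaces can also drive configuration differences, which helps keep sandboxes cheap. As an illustrative sketch (the instance types and AMI here are arbitrary placeholders), you can branch on `terraform.workspace` so only the prod workspace gets production sizing:

```hcl
locals {
  # Sandbox workspaces get small, inexpensive instances; only the
  # "prod" workspace gets production sizing.
  instance_type = terraform.workspace == "prod" ? "m5.large" : "t3.micro"
}

resource "aws_instance" "worker" {
  ami           = "ami-12345678"
  instance_type = local.instance_type

  tags = {
    Name = "worker-${terraform.workspace}"
  }
}
```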
4. Remote States
My final tip for creating a more practical Terraform configuration is querying remote state files with a data source. This is where you programmatically query resource information stored in another remote Terraform state file, such as from another environment or team. By querying a remote state file, you can reuse outputs and modularize your project further without requiring the resources to be managed in the same deployable scope of Terraform code.
For example, we may deploy core networking infrastructure like our VPC with one state file:
terraform {
  backend "s3" {
    bucket = "my-terraform-state"
    key    = "vpc/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_vpc" "main" { ... }

output "vpc_id" {
  value = aws_vpc.main.id
}
Now in a separate Terraform project, we can query the remote state and access the output to deploy an EC2 instance into the VPC:
data "terraform_remote_state" "vpc" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"
    key    = "vpc/terraform.tfstate"
    region = "us-east-1"
  }
}

# The security group attaches to the VPC via the remote state output
resource "aws_security_group" "web_sg" {
  name   = "web-sg"
  vpc_id = data.terraform_remote_state.vpc.outputs.vpc_id
}

resource "aws_instance" "web" {
  ami                    = "ami-12345678"
  instance_type          = "t2.micro"
  vpc_security_group_ids = [aws_security_group.web_sg.id]
}
You may wonder, "Couldn't I just use a data source like `aws_vpc` rather than interrogate the Terraform state?" Technically, yes, and in your case, you may prefer that simplicity.
The tradeoff is that when using a traditional data source, you will need to include filters based on resource ID, tags, or other attributes; furthermore, some resources do not offer a data source to look up the infrastructure. Querying a remote state defers to the source of truth—the Terraform that deployed the dependency resources, like the VPC in this case.
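For comparison, here is a sketch of the traditional data-source approach (the `Name` tag value is an assumption about how the VPC was tagged by whoever deployed it):

```hcl
# Look up the VPC by tag instead of reading the remote state.
# This only works if the tag is applied consistently and uniquely.
data "aws_vpc" "main" {
  filter {
    name   = "tag:Name"
    values = ["core-networking-vpc"]
  }
}

resource "aws_security_group" "web_sg" {
  name   = "web-sg"
  vpc_id = data.aws_vpc.main.id
}
```

If the tag ever changes or is duplicated, this lookup breaks or becomes ambiguous, which is exactly the fragility the remote-state approach avoids.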
Conclusion
I hope you enjoyed this expansion of my tips for practical Terraform-ing!
Follow for more content like this!