CI/CD & Infrastructure as Code
Infrastructure as Code with Terraform
Terraform is one of the most widely used IaC tools, and DevOps/SRE interviews commonly test how well you know it.
Terraform Core Concepts
State Management
Terraform state maps the resources in your configuration to the real infrastructure they manage:
# State operations
terraform state list # List resources
terraform state show <resource> # Show details
terraform state mv <src> <dst> # Move/rename
terraform state rm <resource> # Remove from state
terraform state pull # Print the remote state to stdout
terraform state push <path> # Upload a local state file, overwriting remote state (dangerous!)
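Since Terraform 1.1, a rename can also be recorded in configuration with a moved block rather than by running terraform state mv by hand. A minimal sketch, assuming a hypothetical rename of aws_instance.web to aws_instance.frontend (both addresses and the AMI ID are illustrative placeholders):

```hcl
# Hypothetical rename: this resource used to be declared as "aws_instance" "web".
moved {
  from = aws_instance.web
  to   = aws_instance.frontend
}

resource "aws_instance" "frontend" {
  ami           = "ami-0123456789abcdef0" # placeholder AMI ID
  instance_type = "t3.small"
}
```

On the next plan, Terraform should report the address change instead of a destroy and create, and the rename goes through code review with the rest of the change.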
Remote State Backend
# backend.tf
terraform {
backend "s3" {
bucket = "my-terraform-state"
key = "prod/infrastructure.tfstate"
region = "us-east-1"
encrypt = true
dynamodb_table = "terraform-locks" # State locking
}
}
Interview question: "Why use remote state with locking?"
Answer: Remote state enables team collaboration because everyone works from the same state. Locking (via DynamoDB) prevents two runs from modifying state at the same time, which could corrupt it or cause conflicting changes.
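The backend above assumes the bucket and lock table already exist. One possible way to create them is a small bootstrap configuration like the sketch below. It reuses the names from backend.tf, assumes AWS provider v4 or later (for the separate versioning resource), and would normally live in its own configuration, since a backend has to exist before terraform init can use it.

```hcl
# Bootstrap sketch: resources the S3 backend depends on.
resource "aws_s3_bucket" "state" {
  bucket = "my-terraform-state"
}

# Versioning keeps earlier state files so a bad write can be rolled back.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# The S3 backend expects a string partition key named exactly "LockID".
resource "aws_dynamodb_table" "locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```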
Terraform Module Design
Module Structure
modules/
├── vpc/
│ ├── main.tf
│ ├── variables.tf
│ ├── outputs.tf
│ └── README.md
├── eks/
│ ├── main.tf
│ ├── variables.tf
│ └── outputs.tf
└── rds/
├── main.tf
├── variables.tf
└── outputs.tf
Reusable Module Example
# modules/vpc/main.tf
resource "aws_vpc" "main" {
cidr_block = var.cidr_block
enable_dns_hostnames = true
enable_dns_support = true
tags = merge(var.tags, {
Name = "${var.name}-vpc"
})
}
resource "aws_subnet" "private" {
count = length(var.private_subnets)
vpc_id = aws_vpc.main.id
cidr_block = var.private_subnets[count.index]
availability_zone = var.azs[count.index]
tags = merge(var.tags, {
Name = "${var.name}-private-${count.index + 1}"
Type = "private"
})
}
# modules/vpc/variables.tf
variable "name" {
type = string
description = "Name prefix for resources"
}
variable "cidr_block" {
type = string
description = "VPC CIDR block"
}
variable "private_subnets" {
type = list(string)
description = "Private subnet CIDR blocks"
}
variable "azs" {
type = list(string)
description = "Availability zones, one per private subnet"
}
variable "tags" {
type = map(string)
description = "Tags applied to all resources"
default = {}
}
# modules/vpc/outputs.tf
output "vpc_id" {
value = aws_vpc.main.id
description = "VPC ID"
}
output "private_subnet_ids" {
value = aws_subnet.private[*].id
description = "Private subnet IDs"
}
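Reusable modules often validate their inputs so that mistakes surface at plan time rather than as provider errors. A minimal sketch of how the cidr_block variable could be extended; the validation rule and message are illustrative additions, not part of the module above:

```hcl
variable "cidr_block" {
  type        = string
  description = "VPC CIDR block"

  validation {
    # cidrhost() fails on malformed CIDR input; can() turns that failure into false.
    condition     = can(cidrhost(var.cidr_block, 0))
    error_message = "The cidr_block value must be a valid CIDR block, such as 10.0.0.0/16."
  }
}
```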
Using Modules
# environments/prod/main.tf
module "vpc" {
source = "../../modules/vpc"
name = "prod"
cidr_block = "10.0.0.0/16"
private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
tags = {
Environment = "production"
ManagedBy = "terraform"
}
}
module "eks" {
source = "../../modules/eks"
cluster_name = "prod-cluster"
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnet_ids
}
Terraform Workspaces
# List workspaces
terraform workspace list
# Create workspace
terraform workspace new staging
# Switch workspace
terraform workspace select prod
# Current workspace in code
resource "aws_instance" "web" {
instance_type = terraform.workspace == "prod" ? "m5.large" : "t3.small"
}
Better alternative: For production systems, use separate directories per environment instead of workspaces. Each environment then has its own configuration, backend, and state file, which is easier to reason about and less error-prone.
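As a rough sketch of the directory-per-environment layout, a staging environment could sit next to environments/prod and call the same module with its own inputs and state key. The paths, CIDR ranges, and state key here are assumptions chosen to mirror the prod example:

```hcl
# environments/staging/backend.tf (hypothetical path)
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "staging/infrastructure.tfstate" # assumed key, separate from prod
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

# environments/staging/main.tf (hypothetical path)
module "vpc" {
  source = "../../modules/vpc"

  name            = "staging"
  cidr_block      = "10.1.0.0/16"
  private_subnets = ["10.1.1.0/24", "10.1.2.0/24"]
  azs             = ["us-east-1a", "us-east-1b"]

  tags = {
    Environment = "staging"
    ManagedBy   = "terraform"
  }
}
```

Because each directory has its own state, running terraform apply in staging cannot touch prod resources, and differences between environments are visible as ordinary file diffs.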
Advanced Terraform Patterns
Data Sources
# Fetch existing resources
data "aws_ami" "amazon_linux" {
most_recent = true
owners = ["amazon"]
filter {
name = "name"
values = ["amzn2-ami-hvm-*-x86_64-gp2"]
}
}
resource "aws_instance" "web" {
ami = data.aws_ami.amazon_linux.id
instance_type = "t3.small"
}
Dynamic Blocks
resource "aws_security_group" "web" {
name = "web-sg"
dynamic "ingress" {
for_each = var.ingress_rules
content {
from_port = ingress.value.port
to_port = ingress.value.port
protocol = "tcp"
cidr_blocks = ingress.value.cidrs
}
}
}
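The dynamic block reads var.ingress_rules, which is not declared in the snippet. One plausible declaration, inferred from the port and cidrs attributes used above (the default values are purely illustrative):

```hcl
variable "ingress_rules" {
  description = "Ingress rules rendered by the dynamic block"

  type = list(object({
    port  = number
    cidrs = list(string)
  }))

  # Illustrative defaults: HTTP and HTTPS open to any address.
  default = [
    { port = 80, cidrs = ["0.0.0.0/0"] },
    { port = 443, cidrs = ["0.0.0.0/0"] },
  ]
}
```

With these defaults, the security group would get two ingress blocks, one per list element.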
Import Existing Resources
# Import existing infrastructure
terraform import aws_instance.web i-1234567890abcdef0
# Generate config for resources declared in import blocks (Terraform 1.5+)
terraform plan -generate-config-out=generated.tf
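The -generate-config-out flag only writes configuration for resources targeted by import blocks in the configuration. A minimal sketch of such a block, reusing the resource address and instance ID from the CLI example above:

```hcl
# Declarative import (Terraform 1.5+): adopt an existing instance into state.
import {
  to = aws_instance.web
  id = "i-1234567890abcdef0"
}
```

The generated file is a starting point; it is worth reviewing and trimming before the first apply.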
Interview Questions
Q: "How do you handle secrets in Terraform?"
# 1. Environment variables
# export TF_VAR_db_password="secret"
variable "db_password" {
type = string
sensitive = true # Redacts the value in plan/apply output; it is still stored in plaintext in state
}
# 2. AWS Secrets Manager
data "aws_secretsmanager_secret_version" "db" {
secret_id = "prod/database/password"
}
resource "aws_db_instance" "main" {
password = data.aws_secretsmanager_secret_version.db.secret_string
}
# 3. Vault provider
data "vault_generic_secret" "db" {
path = "secret/database"
}
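Secrets Manager secrets are often stored as a JSON document rather than a bare string. A sketch of reading individual fields, assuming a hypothetical secret named prod/database/credentials whose JSON contains username and password keys:

```hcl
data "aws_secretsmanager_secret_version" "db_json" {
  secret_id = "prod/database/credentials" # hypothetical secret name
}

locals {
  # secret_string is marked sensitive by the provider, and the decoded value stays sensitive.
  db_credentials = jsondecode(data.aws_secretsmanager_secret_version.db_json.secret_string)
}

resource "aws_db_instance" "example" {
  username = local.db_credentials["username"]
  password = local.db_credentials["password"]
  # Other required arguments omitted for brevity.
}
```

Whichever option is used, the resolved secret still ends up in the state file, so the state backend needs encryption and tight access control.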
Q: "A terraform apply failed halfway. How do you recover?"
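- Check state: run terraform state list to see what was created
- Review the error: understand why it failed
- Fix and retry: terraform apply is idempotent, so re-running it only creates or changes what is still missing
- If state is corrupted: restore from a backup or repair it manually
- Force recreation if needed: run terraform apply -replace=<resource> (terraform taint <resource> still works but has been deprecated since Terraform 0.15.2)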
Q: "How do you prevent accidental destruction of critical resources?"
resource "aws_db_instance" "production" {
# A resource accepts only one lifecycle block, so both settings share it
lifecycle {
# Prevent destruction
prevent_destroy = true
# Optionally, also ignore changes to specific attributes
ignore_changes = [tags]
}
}
Next, we'll cover deployment strategies: blue-green, canary, and rolling deployments.