Building Production-Grade ECS Microservices with CI/CD - Part 6: Production Cleanup

Complete guide to safely tearing down production AWS infrastructure to avoid ongoing charges, covering automated cleanup, cost optimization, and data preservation strategies.

AWS DevOps cleanup

October 22, 2025

Welcome to the final part of our comprehensive series on building production-grade microservices on AWS ECS. In this installment, we’ll cover the essential process of safely tearing down your AWS infrastructure to avoid ongoing charges while preserving important data and configurations.

What We’ll Cover

In this cleanup guide, we’ll provide you with:

Cost Analysis - Understanding what resources cost and why cleanup matters
Safe Cleanup Procedures - Step-by-step resource deletion in the correct order
Data Preservation - Creating snapshots and backups before deletion
Automated Cleanup - Scripts and Terraform commands for efficient cleanup
Cost Optimization - Strategies for reducing costs while keeping infrastructure
Verification Procedures - Ensuring complete cleanup and cost elimination

Production Cost Analysis

Understanding the cost implications of your infrastructure is crucial for effective cleanup planning.

Monthly Cost Breakdown

Resource	Configuration	Monthly Cost	Priority
NAT Gateways	2x Multi-AZ	~$65	⚠️ Critical
RDS PostgreSQL	Multi-AZ, db.t3.micro	~$35	⚠️ High
ECS Fargate Tasks	2 Nginx + 2 Flask + 1 Redis	~$40	⚠️ High
Application Load Balancer	Internet-facing	~$20	⚠️ Medium
Elastic IPs	2x (attached to NAT Gateways)	~$7	⚠️ Medium
CloudWatch Logs	7-day retention	~$2	⚠️ Low
Data Transfer	ALB to ECS	~$5	⚠️ Low
ECR Storage	Container images	~$1	⚠️ Low
Total	Complete Infrastructure	~$175/month	🚨 Urgent
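The total can be checked with a few lines of shell. This is a minimal sketch that assumes the approximate monthly figures from the table above and a 30-day month; the function names monthly_total and daily_cost are illustrative and not part of any AWS tooling.

```shell
# Approximate monthly costs (USD) copied from the table above.
# Order: NAT, RDS, Fargate, ALB, Elastic IPs, Logs, Data Transfer, ECR
COSTS="65 35 40 20 7 2 5 1"

# Sum a space-separated list of whole-dollar amounts.
monthly_total() (
    total=0
    for cost in $1; do
        total=$((total + cost))
    done
    echo "$total"
)

# Convert a monthly figure to an approximate daily figure (30-day month).
daily_cost() {
    awk -v m="$1" 'BEGIN { printf "%.2f\n", m / 30 }'
}

echo "Monthly: \$$(monthly_total "$COSTS")"
echo "Daily:   \$$(daily_cost "$(monthly_total "$COSTS")")"
```

The daily figure is what you would expect to see in Cost Explorer's daily view while everything is still running.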

Cost Optimization Priority

Immediate Action Required:

NAT Gateways - Delete immediately (saves $65/month)
RDS Database - Stop or delete (saves $35/month)
ECS Services - Scale to zero (saves $40/month)
Load Balancer - Delete when not needed (saves $20/month)

Production Cleanup Strategy

Critical Cleanup Order

⚠️ IMPORTANT: Follow this exact order to prevent dependency conflicts and ensure complete cleanup:

Phase 1: Application Layer (5-10 minutes)

ECS Services - Scale down to 0 tasks, then delete the services
Application Load Balancer - Delete the ALB (this also removes its listeners)
Target Groups - Delete the target groups once no listener references them
ECS Task Definitions - Optional cleanup

Phase 2: Data Layer (10-15 minutes)

RDS Database - Create final snapshot, then delete
ECR Images - Optional cleanup (keep if needed)

Phase 3: Networking Layer (5-10 minutes)

NAT Gateways - ⚠️ CRITICAL - Delete immediately (saves $65/month)
Elastic IPs - Release unattached IPs
VPC and Networking - Delete subnets, route tables, security groups

Phase 4: Infrastructure Layer (5-10 minutes)

ECS Cluster - Delete cluster
Service Discovery - Delete Cloud Map namespace
CloudWatch Logs - Delete log groups
IAM Roles - Delete custom roles and policies

Cleanup Time Estimates

Automated (Terraform): 15-20 minutes
Manual (AWS Console): 30-40 minutes
Scripted (Bash): 20-25 minutes

Method 1: Automated Cleanup with Terraform (Recommended)

The safest and most efficient method for complete infrastructure cleanup.

Prerequisites

Terraform state file available
AWS credentials configured
All resources created with Terraform

Step 1: Navigate to Terraform Directory

cd terraform

Step 2: Review What Will Be Deleted

terraform plan -destroy

Review the output carefully. You should see all resources planned for destruction.
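To make the review less error-prone, you can pull the destroy count out of the plan summary and compare it with what you expect. This is a sketch that assumes Terraform's human-readable summary line ("Plan: X to add, Y to change, Z to destroy."); destroy_count is a hypothetical helper name.

```shell
# Print the number of resources a plan intends to destroy.
# Reads plan output on stdin and looks for Terraform's summary line,
# e.g. "Plan: 0 to add, 0 to change, 42 to destroy."
destroy_count() {
    sed -n 's/.*Plan: .* \([0-9][0-9]*\) to destroy.*/\1/p' | head -n 1
}

# Usage (run inside the Terraform directory):
#   terraform plan -destroy -no-color | destroy_count
```

If the helper prints nothing, the summary line was not found, which usually means there is nothing to destroy or the plan failed.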

Step 3: Create RDS Snapshot (Optional but Recommended)

Before destroying, create a final snapshot:

# Compute the identifier once so both commands use the same name
SNAPSHOT_ID="ecs-microservices-final-snapshot-$(date +%Y%m%d)"

aws rds create-db-snapshot \
  --db-instance-identifier ecs-microservices-postgres \
  --db-snapshot-identifier "$SNAPSHOT_ID" \
  --region ap-south-1

# Wait for snapshot to complete
aws rds wait db-snapshot-completed \
  --db-snapshot-identifier "$SNAPSHOT_ID" \
  --region ap-south-1
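Before destroying anything, it is worth confirming that the snapshot really is usable. The following is a minimal sketch; snapshot_status and snapshot_is_available are hypothetical helper names, and the region falls back to ap-south-1 unless AWS_REGION is set.

```shell
# Print the status of an RDS snapshot ("available" once it is usable).
snapshot_status() {
    aws rds describe-db-snapshots \
        --db-snapshot-identifier "$1" \
        --region "${AWS_REGION:-ap-south-1}" \
        --query 'DBSnapshots[0].Status' \
        --output text
}

# Succeed only if the snapshot exists and is available.
snapshot_is_available() {
    [ "$(snapshot_status "$1" 2>/dev/null)" = "available" ]
}

# Usage:
#   snapshot_is_available <snapshot-id> && echo "Snapshot ready"
```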

Step 4: Modify RDS to Skip Final Snapshot (if needed)

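If you don’t want Terraform to take a final snapshot during destroy, set skip_final_snapshot on the RDS instance. Without it (and without a final_snapshot_identifier), terraform destroy fails on the database.

Edit main.tf and change:

skip_final_snapshot = true

Then apply, so the new value is recorded in state before you destroy: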

terraform apply -auto-approve

Step 5: Destroy All Resources

terraform destroy

Type yes when prompted.

This will take 15-20 minutes due to:

RDS deletion (~10 minutes)
NAT Gateway deletion (~5 minutes)
VPC cleanup

Step 6: Verify Cleanup

# Check ECS clusters
aws ecs list-clusters --region ap-south-1

# Check RDS instances
aws rds describe-db-instances --region ap-south-1

# Check VPCs (should only show default VPC)
aws ec2 describe-vpcs --region ap-south-1

# Check NAT Gateways
aws ec2 describe-nat-gateways --region ap-south-1 \
  --filter "Name=state,Values=available"

# Check ALBs
aws elbv2 describe-load-balancers --region ap-south-1

All should return empty or only default resources.
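The checks above can be wrapped in a small helper that turns each result into a pass or fail line. This is a sketch under the assumption that you pass it the text output of a query; report_leftovers is a hypothetical name.

```shell
# Report whether a verification query came back empty.
# $1 = label, $2 = text output of an AWS CLI query.
# With --output text the AWS CLI prints nothing for an empty list
# and "None" for a null result.
report_leftovers() {
    if [ -z "$2" ] || [ "$2" = "None" ]; then
        echo "OK: no $1 found"
    else
        echo "LEFTOVER $1: $2"
        return 1
    fi
}

# Usage:
#   report_leftovers "ECS clusters" \
#       "$(aws ecs list-clusters --region ap-south-1 \
#            --query 'clusterArns' --output text)"
```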

Method 2: Manual Cleanup via AWS Console

For situations where Terraform is not available or you need granular control over the cleanup process.

Prerequisites

AWS Console access
Administrator permissions
Understanding of resource dependencies

Step 1: Stop ECS Services

Go to ECS → Clusters → ecs-microservices-cluster
Click Services tab
For each service (nginx, flask-app, redis):
- Select the service
- Click Update
- Set Desired tasks to 0
- Click Update
- Wait for tasks to stop
After all tasks have stopped, delete the services:
- Select each service
- Click Delete
- Confirm deletion

⏱️ Time: ~5 minutes

Step 2: Delete Application Load Balancer

Go to EC2 → Load Balancers
Select ecs-microservices-alb
Actions → Delete
Type “confirm” and delete

⏱️ Time: ~2 minutes

Step 3: Delete Target Groups

Go to EC2 → Target Groups
Select ecs-microservices-nginx-tg
Actions → Delete
Confirm

⏱️ Time: ~1 minute

Step 4: Delete RDS Database

⚠️ Critical: Create final snapshot first if you need the data!

Go to RDS → Databases
Select ecs-microservices-postgres
Actions → Delete
Choose one:
- ✅ Create final snapshot (recommended): Enter snapshot name
- ❌ Skip final snapshot: Check the box (data will be lost)
Type “delete me” to confirm
Click Delete

⏱️ Time: ~10-15 minutes

Step 5: Delete NAT Gateways (Important - Saves $65/month!)

⚠️ This is the most expensive resource! Delete immediately to stop charges.

Go to VPC → NAT Gateways
Select both NAT gateways (should be 2)
Actions → Delete NAT gateway
Type “delete” and confirm

⏱️ Time: ~5 minutes

Step 6: Release Elastic IPs

Go to VPC → Elastic IPs
Select the allocated IPs (should be 2)
Actions → Release Elastic IP addresses
Confirm

⚠️ Note: You can only release EIPs after NAT Gateways are deleted.

⏱️ Time: ~1 minute
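The same step can be done from the CLI. This is a sketch that assumes no other project in the account and region has idle Elastic IPs it wants to keep; release_unassociated_eips is a hypothetical helper name, so review the list before running it for real.

```shell
# Release every Elastic IP in the region that is not associated with anything.
# $1 = region (defaults to ap-south-1).
release_unassociated_eips() (
    region="${1:-ap-south-1}"
    ids=$(aws ec2 describe-addresses \
        --region "$region" \
        --query 'Addresses[?AssociationId==null].AllocationId' \
        --output text)
    for id in $ids; do
        echo "Releasing $id"
        aws ec2 release-address --allocation-id "$id" --region "$region"
    done
)

# Usage:
#   release_unassociated_eips ap-south-1
```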

Step 7: Delete ECS Cluster

Go to ECS → Clusters
Select ecs-microservices-cluster
Delete Cluster
Confirm

⏱️ Time: ~2 minutes

Step 8: Delete Service Discovery Namespace

Go to AWS Cloud Map → Namespaces
Select ecs-microservices.local
Delete
Confirm

⏱️ Time: ~1 minute

Step 9: Delete CloudWatch Log Groups

Go to CloudWatch → Logs → Log groups
Select these log groups:
- /ecs/ecs-microservices/flask-app
- /ecs/ecs-microservices/nginx
- /ecs/ecs-microservices/redis
- /ecs/ecs-microservices/exec
Actions → Delete log group(s)
Confirm

⏱️ Time: ~1 minute
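If you prefer the CLI, the log groups can be removed by prefix. This is a sketch assuming all four groups share the /ecs/ecs-microservices prefix shown above; delete_log_groups is a hypothetical helper name.

```shell
# Delete every CloudWatch log group whose name starts with the given prefix.
# $1 = prefix, $2 = region (defaults to ap-south-1).
delete_log_groups() (
    prefix="$1"
    region="${2:-ap-south-1}"
    groups=$(aws logs describe-log-groups \
        --log-group-name-prefix "$prefix" \
        --region "$region" \
        --query 'logGroups[].logGroupName' \
        --output text)
    for group in $groups; do
        echo "Deleting $group"
        aws logs delete-log-group --log-group-name "$group" --region "$region"
    done
)

# Usage:
#   delete_log_groups /ecs/ecs-microservices
```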

Step 10: Delete ECR Repositories (Optional)

⚠️ Only delete if you don’t need the images anymore!

Go to ECR → Repositories
Select repositories:
- ecs-microservices/flask-app
- ecs-microservices/nginx
Delete
Type “delete” and confirm

⏱️ Time: ~1 minute

Step 11: Delete VPC and Networking

Wait for NAT Gateways to finish deleting, then:

Go to VPC → Your VPCs
Select ecs-microservices-vpc
Actions → Delete VPC
Confirm

This will delete:

All subnets
Route tables
Internet gateway
Security groups
VPC

⚠️ If deletion fails, manually delete in this order:

Route table associations
Subnets
Route tables
Internet gateway (detach first)
Security groups (delete non-default)
VPC

⏱️ Time: ~5 minutes

Step 12: Delete IAM Roles

Go to IAM → Roles
Search for “ecs-microservices”
Select these roles:
- ecs-microservices-ecs-task-execution-role
- ecs-microservices-ecs-task-role
Delete
Confirm

⏱️ Time: ~2 minutes

Step 13: Delete Auto Scaling Policies (if created manually)

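Application Auto Scaling has no console page of its own; ECS scaling policies are managed from each ECS service's auto scaling settings or through the AWS CLI
Select policies related to ECS services
Delete them (if the services are already gone, check for leftover scalable targets with the AWS CLI)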

⏱️ Time: ~1 minute
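From the CLI, leftover scaling configuration can be found and removed by deregistering the scalable targets, which also deletes the policies attached to them. This is a sketch that assumes the targets use the usual service/<cluster>/<service> resource ID format; deregister_ecs_scaling is a hypothetical helper name.

```shell
# Deregister every ECS scalable target whose resource ID mentions the cluster.
# Deregistering a target also removes the scaling policies attached to it.
# $1 = cluster name, $2 = region (defaults to ap-south-1).
deregister_ecs_scaling() (
    cluster="$1"
    region="${2:-ap-south-1}"
    targets=$(aws application-autoscaling describe-scalable-targets \
        --service-namespace ecs \
        --region "$region" \
        --query "ScalableTargets[?contains(ResourceId, '$cluster')].ResourceId" \
        --output text)
    for target in $targets; do
        echo "Deregistering $target"
        aws application-autoscaling deregister-scalable-target \
            --service-namespace ecs \
            --resource-id "$target" \
            --scalable-dimension ecs:service:DesiredCount \
            --region "$region"
    done
)

# Usage:
#   deregister_ecs_scaling ecs-microservices-cluster
```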

Method 3: Automated Cleanup Script

For a scripted cleanup, use the script below. It removes the ECS services with the AWS CLI first and then hands the rest of the teardown to terraform destroy.

Prerequisites

AWS CLI configured
Bash shell available
Appropriate AWS permissions

#!/bin/bash

# Automated cleanup script
# Run from the Terraform directory so that "terraform destroy" can find the state.
set -euo pipefail

AWS_REGION="ap-south-1"
PROJECT_NAME="ecs-microservices"
CLUSTER="${PROJECT_NAME}-cluster"

echo "🧹 Starting cleanup process..."

# Step 1: Scale down ECS services to 0
echo "⏬ Scaling down ECS services..."
for service in nginx flask-app redis; do
    aws ecs update-service \
        --cluster "${CLUSTER}" \
        --service "${PROJECT_NAME}-${service}" \
        --desired-count 0 \
        --region "${AWS_REGION}" \
        > /dev/null 2>&1 || echo "Service ${service} not found or already deleted"
done

sleep 30

# Step 2: Delete ECS services
echo "🗑️ Deleting ECS services..."
for service in nginx flask-app redis; do
    aws ecs delete-service \
        --cluster "${CLUSTER}" \
        --service "${PROJECT_NAME}-${service}" \
        --force \
        --region "${AWS_REGION}" \
        > /dev/null 2>&1 || echo "Service ${service} not found or already deleted"
done

# Wait until the services report INACTIVE instead of sleeping a fixed time
echo "⏳ Waiting for services to be deleted..."
aws ecs wait services-inactive \
    --cluster "${CLUSTER}" \
    --services "${PROJECT_NAME}-nginx" "${PROJECT_NAME}-flask-app" "${PROJECT_NAME}-redis" \
    --region "${AWS_REGION}" \
    || echo "Could not confirm deletion (services may already be gone); continuing"

# Step 3: Use Terraform to destroy everything else
echo "🔥 Running terraform destroy..."
terraform destroy -auto-approve

echo "✅ Cleanup completed!"
echo "💰 Check AWS Console to verify all resources are deleted"

Save it as cleanup.sh in the Terraform directory, make it executable, and run:

chmod +x cleanup.sh
./cleanup.sh
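Because the script passes -auto-approve to Terraform, you may want a guard in front of it. The following is one possible sketch, not part of the original script: confirm_destroy is a hypothetical helper that only succeeds when the operator types the project name exactly.

```shell
# Ask the operator to type the project name before anything destructive runs.
# Reads one line from stdin; succeeds only on an exact match.
confirm_destroy() {
    printf 'Type "%s" to confirm destruction: ' "$1" >&2
    read -r CONFIRM_ANSWER || return 1
    [ "$CONFIRM_ANSWER" = "$1" ]
}

# Usage:
#   confirm_destroy "ecs-microservices" && ./cleanup.sh
```

Typing the project name rather than "yes" makes it harder to confirm the wrong environment out of habit.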

Production Verification Checklist

Critical Resources Verification

⚠️ HIGH PRIORITY - Verify these are deleted to avoid charges:

NAT Gateways deleted ⚠️ Critical for cost savings ($65/month)
RDS Database deleted ⚠️ High priority ($35/month)
ECS Services stopped ⚠️ High priority ($40/month)
Application Load Balancer deleted ⚠️ Medium priority ($20/month)
Elastic IPs released ⚠️ Medium priority ($7/month)

Infrastructure Resources Verification

ECS Cluster deleted
Target Groups deleted
VPC and Networking deleted
Security Groups deleted (except default)
Service Discovery Namespace deleted
CloudWatch Log Groups deleted
IAM Roles deleted (custom roles only)

Optional Resources Verification

ECR Repositories deleted (if desired)
ECS Task Definitions deleted (optional)
Auto Scaling Policies deleted
Route53 Records deleted (if created)

Production Cost Verification

AWS Cost Explorer Verification

Navigate to Cost Explorer:
- Go to AWS Cost Management → Cost Explorer
- Select Daily costs view
- Filter by region: ap-south-1
- Set date range to last 7 days
Expected Results:
- Before cleanup: ~$5-6/day (~$175/month)
- After cleanup: ~$0.10-0.50/day (~$3-15/month)
- Target: Near $0 within 24-48 hours

Cost Monitoring Commands

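# Check costs for one month (substitute the month you want).
# End is exclusive, so use the first day of the following month.
aws ce get-cost-and-usage \
  --time-period Start=2024-01-01,End=2024-02-01 \
  --granularity MONTHLY \
  --metrics BlendedCost \
  --region us-east-1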

# Check for running resources
aws ec2 describe-instances --region ap-south-1 --query 'Reservations[*].Instances[*].[InstanceId,State.Name]'
aws rds describe-db-instances --region ap-south-1 --query 'DBInstances[*].[DBInstanceIdentifier,DBInstanceStatus]'
aws ecs list-clusters --region ap-south-1
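To watch costs fall day by day after cleanup, a rolling window is more convenient than fixed dates. This is a sketch that assumes GNU date (as found on most Linux systems; BSD and macOS date use different flags); cost_window is a hypothetical helper name.

```shell
# Print a Cost Explorer --time-period value covering the last N days.
# Assumption: GNU date. The End date is exclusive in Cost Explorer,
# so today's partial data is not included.
cost_window() {
    echo "Start=$(date -d "${1:-7} days ago" +%Y-%m-%d),End=$(date +%Y-%m-%d)"
}

# Usage:
#   aws ce get-cost-and-usage \
#     --time-period "$(cost_window 7)" \
#     --granularity DAILY \
#     --metrics UnblendedCost \
#     --group-by Type=DIMENSION,Key=SERVICE \
#     --region us-east-1
```

Grouping by service makes it easy to spot which resource type is still generating charges.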

Production Troubleshooting

Issue: VPC Won’t Delete

Error: “The vpc has dependencies and cannot be deleted”

Root Cause: Dependencies still exist in the VPC

Solution: Delete resources in this exact order:

NAT Gateways (wait until fully deleted - 5-10 minutes)
Elastic IPs (only after NAT Gateways are deleted)
Network interfaces (ENIs from ECS tasks)
Subnets (all custom subnets)
Internet Gateway (detach first, then delete)
Route tables (custom route tables)
Security groups (non-default groups)
Finally, VPC

Issue: NAT Gateway Stuck in “Deleting” State

Symptoms: NAT Gateway shows “deleting” for more than 10 minutes

Solution:

This is normal behavior
Can take 5-15 minutes depending on AWS load
Do not delete again - wait for completion
Re-check the state with aws ec2 describe-nat-gateways; NAT Gateways publish metrics to CloudWatch, not error logs
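Rather than refreshing the console, you can poll the gateway until it reports the deleted state. This is a sketch with illustrative defaults (15 seconds between polls, 60 polls); wait_nat_deleted is a hypothetical helper name.

```shell
# Poll a NAT Gateway until it reports the "deleted" state.
# $1 = NAT Gateway ID, $2 = seconds between polls (default 15),
# $3 = maximum number of polls (default 60, roughly 15 minutes).
wait_nat_deleted() (
    id="$1"
    interval="${2:-15}"
    max="${3:-60}"
    n=0
    while [ "$n" -lt "$max" ]; do
        state=$(aws ec2 describe-nat-gateways \
            --nat-gateway-ids "$id" \
            --region "${AWS_REGION:-ap-south-1}" \
            --query 'NatGateways[0].State' \
            --output text)
        echo "$id: $state"
        if [ "$state" = "deleted" ]; then
            exit 0
        fi
        n=$((n + 1))
        sleep "$interval"
    done
    echo "Timed out waiting for $id" >&2
    exit 1
)

# Usage:
#   wait_nat_deleted <nat-gateway-id>
```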

Issue: Security Group Deletion Fails

Error: “has dependent objects”

Root Cause: Network interfaces or instances still using the security group

Solution:

Delete ECS services first (removes ENIs)
Wait 5-10 minutes for ENIs to be removed
Check for orphaned ENIs in EC2 console
Delete ENIs manually if they exist
Try security group deletion again
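To find the network interfaces that are blocking the deletion, you can filter ENIs by security group. This is a minimal sketch; enis_using_sg is a hypothetical helper name and the region falls back to ap-south-1 unless AWS_REGION is set.

```shell
# List network interfaces that still reference a security group.
# $1 = security group ID. Prints ID, status, and description per ENI.
enis_using_sg() {
    aws ec2 describe-network-interfaces \
        --filters "Name=group-id,Values=$1" \
        --region "${AWS_REGION:-ap-south-1}" \
        --query 'NetworkInterfaces[].[NetworkInterfaceId,Status,Description]' \
        --output text
}

# Usage:
#   enis_using_sg <security-group-id>
```

An empty result means nothing references the group any more and the deletion can be retried.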

Issue: RDS Deletion Hangs

Symptoms: RDS instance stuck in “deleting” state

Root Cause: Final snapshot creation or Multi-AZ cleanup

Solution:

Normal behavior for Multi-AZ instances (10-15 minutes)
Monitor RDS console for progress
Check the RDS Events page for any errors
Do not interrupt the deletion process

Issue: Terraform Destroy Fails

Common Causes: State drift, resource dependencies, permission issues

Solution:

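Refresh state: terraform apply -refresh-only (replaces the deprecated terraform refresh)
Retry destroy: terraform destroy
Manual cleanup: Delete any resources that still fail manually
Remove from state: terraform state rm <resource> for each resource you deleted by hand
Retry destroy: terraform destroy once more to clear the remainder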

Issue: ECS Services Won’t Scale Down

Symptoms: Services remain at desired count despite update

Solution:

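Check auto scaling: if the service has a scalable target with a minimum capacity above 0, a scale-out event can raise the desired count again
Force new deployment: --force-new-deployment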
Check task health: Ensure tasks are healthy
Wait for stability: Allow 5-10 minutes
Scale down gradually: Reduce desired count in steps
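Before trying any of these, it helps to see what ECS itself reports for the service. This is a sketch; service_counts and is_scaled_to_zero are hypothetical helper names.

```shell
# Print desired, running, and pending task counts for one service.
# $1 = cluster name, $2 = service name.
service_counts() {
    aws ecs describe-services \
        --cluster "$1" \
        --services "$2" \
        --region "${AWS_REGION:-ap-south-1}" \
        --query 'services[0].[desiredCount,runningCount,pendingCount]' \
        --output text
}

# Succeed only when all three counts are 0.
is_scaled_to_zero() {
    # Intentionally unquoted: split the three counts into $1 $2 $3.
    set -- $(service_counts "$1" "$2")
    [ "$1" = "0" ] && [ "$2" = "0" ] && [ "$3" = "0" ]
}

# Usage:
#   is_scaled_to_zero ecs-microservices-cluster ecs-microservices-nginx \
#       && echo "nginx is fully stopped"
```

A desired count of 0 with a non-zero running count usually just means tasks are still draining; a desired count that is not 0 means the update did not stick.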

Cost Optimization Strategies

Strategy 1: Partial Cleanup (Keep Infrastructure)

For temporary cost reduction while preserving infrastructure:

Quick Cost Reduction (~$140/month savings)

1. Stop ECS Services (saves ~$40/month):

# Scale down all services to 0
aws ecs update-service --cluster ecs-microservices-cluster \
  --service ecs-microservices-flask-app --desired-count 0 --region ap-south-1
aws ecs update-service --cluster ecs-microservices-cluster \
  --service ecs-microservices-nginx --desired-count 0 --region ap-south-1
aws ecs update-service --cluster ecs-microservices-cluster \
  --service ecs-microservices-redis --desired-count 0 --region ap-south-1

2. Stop RDS Database (saves ~$35/month):

aws rds stop-db-instance \
  --db-instance-identifier ecs-microservices-postgres \
  --region ap-south-1

⚠️ Note: A stopped RDS instance restarts automatically after 7 days, and its storage is still billed while it is stopped

3. Delete NAT Gateways (saves ~$65/month):

# Delete NAT Gateways (requires Terraform apply to recreate)
aws ec2 delete-nat-gateway --nat-gateway-id <nat-gateway-id> --region ap-south-1
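The command above needs the gateway IDs. One way to look them up is to filter by VPC; this is a sketch in which nat_gateway_ids is a hypothetical helper name and the VPC ID is something you supply.

```shell
# Print the IDs of all available NAT Gateways in a VPC.
# $1 = VPC ID.
nat_gateway_ids() {
    aws ec2 describe-nat-gateways \
        --region "${AWS_REGION:-ap-south-1}" \
        --filter "Name=vpc-id,Values=$1" "Name=state,Values=available" \
        --query 'NatGateways[].NatGatewayId' \
        --output text
}

# Usage:
#   for id in $(nat_gateway_ids <vpc-id>); do
#       aws ec2 delete-nat-gateway --nat-gateway-id "$id" --region ap-south-1
#   done
```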

Strategy 2: Development Environment Optimization

For long-term development use:

1. Use Single-AZ RDS (saves ~$17/month):

Change from Multi-AZ to Single-AZ
Accept reduced availability for dev environment

2. Use Single NAT Gateway (saves ~$32/month):

Delete one NAT Gateway
Accept reduced availability for dev environment

3. Reduce ECS Task Counts (saves ~$20/month):

Use 1 task per service instead of 2
Accept reduced capacity for dev environment

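Restart After Partial Cleanup (Strategy 1):

# If you deleted the NAT Gateways, recreate them first (terraform apply);
# tasks in private subnets typically need outbound access to pull images

# Start RDS and wait until it is available
aws rds start-db-instance \
  --db-instance-identifier ecs-microservices-postgres \
  --region ap-south-1
aws rds wait db-instance-available \
  --db-instance-identifier ecs-microservices-postgres \
  --region ap-south-1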

# Scale up ECS services
aws ecs update-service --cluster ecs-microservices-cluster \
  --service ecs-microservices-redis --desired-count 1 --region ap-south-1
aws ecs update-service --cluster ecs-microservices-cluster \
  --service ecs-microservices-flask-app --desired-count 2 --region ap-south-1
aws ecs update-service --cluster ecs-microservices-cluster \
  --service ecs-microservices-nginx --desired-count 2 --region ap-south-1

Production Best Practices

Critical Cleanup Reminders

⚠️ HIGH PRIORITY - These actions save the most money:

NAT Gateways - Delete immediately (saves $65/month)
RDS Database - Stop or delete (saves $35/month)
ECS Services - Scale to zero (saves $40/month)
Load Balancer - Delete when not needed (saves $20/month)
Elastic IPs - Release unattached IPs (saves $7/month)

Data Preservation Best Practices

Before Deletion:

Create RDS final snapshot if you need the data
Export ECR images if you want to keep them
Save Terraform state for future recreation
Document configurations for reference

Cost Monitoring Best Practices

After Cleanup:

Check Cost Explorer within 24-48 hours
Set up AWS Budgets for future cost alerts
Monitor for orphaned resources weekly
Review costs monthly to catch any surprises

Cleanup Time Estimates

Method	Time Required	Complexity	Recommended For
Terraform Destroy	15-20 minutes	Low	Production use
Manual Console	30-40 minutes	Medium	Learning/understanding
Automated Script	20-25 minutes	Low	Repetitive cleanup

What to Keep for Future Use

Low-Cost Resources to Preserve

✅ ECR Repositories (~$0.10/GB-month) - Keep container images
✅ S3 Bucket (~$0.023/GB-month) - Terraform state storage
✅ IAM Users and Policies (no cost) - Access management
✅ RDS Snapshots (~$0.095/GB-month) - Database backups
✅ Route53 Hosted Zones ($0.50/month) - DNS management

High-Cost Resources to Delete

❌ NAT Gateways ($65/month) - Delete immediately
❌ RDS Multi-AZ ($35/month) - Delete or use Single-AZ
❌ ECS Fargate Tasks ($40/month) - Scale to zero
❌ Application Load Balancer ($20/month) - Delete when not needed

Final Production Checklist

Pre-Cleanup Verification

Data backed up (RDS snapshots, ECR images)
Terraform state saved (if using Terraform)
Documentation updated (configurations, lessons learned)
Team notified (if shared infrastructure)

Post-Cleanup Verification

AWS Console verified (all resources deleted)
Cost Explorer checked (costs near $0)
Billing alerts set (prevent future surprises)
Cleanup documented (for future reference)

Series Completion

🎉 Congratulations! You’ve successfully completed the entire production-grade ECS microservices series:

Part 1: Infrastructure setup with Terraform
Part 2: Application containerization with Docker
Part 3: ECS deployment with auto-scaling
Part 4: Production deployment and monitoring
Part 5: CI/CD pipeline with GitHub Actions
Part 6: Production cleanup and cost optimization

Key Takeaways

Cost Management

NAT Gateways are the most expensive resource - prioritize deleting them
RDS Multi-AZ roughly doubles database costs - use Single-AZ for dev
ECS Fargate charges per vCPU and GB of memory for every running task - scale to zero when not needed
Load Balancers charge hourly - delete when not in use
Elastic IPs are billed hourly whether or not they are attached - release them once the NAT Gateways are gone

Cleanup Best Practices

Follow the exact order to prevent dependency conflicts
Create snapshots before deleting databases
Verify cleanup in AWS Console and Cost Explorer
Set up monitoring to prevent future cost surprises
Document everything for future reference

Production Readiness

Automated cleanup with Terraform is most reliable
Manual cleanup provides better understanding
Scripted cleanup balances automation and control
Cost monitoring prevents unexpected charges
Data preservation ensures no data loss

Remember: most of these resources are billed by the hour or by the second, so the sooner you delete them, the less you pay!

Questions or feedback? Feel free to reach out in the comments below!
