AWS Serverless Media Processing Pipeline - Part 1: Infrastructure Foundation & Worker Lambda

Build a production-ready serverless media processing pipeline with AWS Lambda, S3, DynamoDB, and SQS. Part 1 covers infrastructure setup and image processing worker.

Overview

In this comprehensive guide, we’ll build a production-ready serverless media processing pipeline that automatically watermarks images uploaded to S3. This is Part 1 of a 4-part series covering the complete infrastructure setup and core processing engine.

What we’ll build:

  • Event-driven architecture using S3, SQS, and Lambda
  • Automatic image watermarking with Pillow
  • Scalable job tracking with DynamoDB
  • Production-ready error handling and monitoring

Architecture Preview:

User Upload → S3 → SQS → Lambda Worker → Processed S3
     ↓
DynamoDB (Job Status)

Region: ap-south-1 (Mumbai)
Estimated Setup Time: 2-3 hours
Monthly Cost: ~$0.20 for 1000 operations


Table of Contents

  1. Prerequisites
  2. Phase 1: Core Infrastructure Setup
  3. Phase 2: Worker Lambda Development
  4. Testing & Verification
  5. Production Considerations
  6. Cost Analysis
  7. Troubleshooting

Prerequisites

Before starting, ensure you have:

  • AWS Account with appropriate permissions
  • Basic understanding of AWS services (S3, Lambda, DynamoDB, SQS)
  • Python knowledge (for Lambda functions)
  • Terminal access (for CloudShell or local development)

Required AWS Permissions:

  • S3: Create buckets, manage objects
  • Lambda: Create functions, layers, roles
  • DynamoDB: Create tables, manage items
  • SQS: Create queues, send/receive messages
  • IAM: Create roles and policies

Phase 1: Core Infrastructure Setup

Step 1: Create S3 Buckets

We’ll create two S3 buckets: one for original uploads and another for processed images.

1.1 Create Upload Bucket

  1. Navigate to S3:

    • Sign in to AWS Console
    • Search for “S3” in the services search bar
    • Click S3 to open the S3 console
  2. Create the uploads bucket:

    • Click Create bucket
    • Bucket name: amodhbh-media-uploads
    • AWS Region: Asia Pacific (Mumbai) ap-south-1
    • Object Ownership: ACLs disabled (recommended)
    • Block Public Access settings: Keep all boxes checked (block all public access)
    • Bucket Versioning: Disabled (to save costs, unless you need version history)
    • Tags (Optional):
      • Key: Project, Value: serverless-media
      • Key: Purpose, Value: uploads
      • Key: Environment, Value: production
    • Default encryption:
      • Encryption type: Server-side encryption with Amazon S3 managed keys (SSE-S3)
    • Click Create bucket
  3. Verify creation:

    • You should see amodhbh-media-uploads in your bucket list

1.2 Create Processed Bucket

  1. Create the processed bucket:

    • Click Create bucket again
    • Bucket name: amodhbh-media-processed
    • AWS Region: Asia Pacific (Mumbai) ap-south-1
    • Object Ownership: ACLs disabled (recommended)
    • Block Public Access settings: Keep all boxes checked
    • Bucket Versioning: Disabled
    • Tags (Optional):
      • Key: Project, Value: serverless-media
      • Key: Purpose, Value: processed
      • Key: Environment, Value: production
    • Default encryption:
      • Encryption type: Server-side encryption with Amazon S3 managed keys (SSE-S3)
    • Click Create bucket
  2. Verify creation:

    • You should now see both buckets: amodhbh-media-uploads and amodhbh-media-processed
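
If you prefer scripting these steps, a rough CLI equivalent (bucket names are globally unique, so substitute your own if these are taken):

# Create both buckets in ap-south-1 (LocationConstraint is required outside us-east-1)
aws s3api create-bucket --bucket amodhbh-media-uploads \
  --region ap-south-1 --create-bucket-configuration LocationConstraint=ap-south-1
aws s3api create-bucket --bucket amodhbh-media-processed \
  --region ap-south-1 --create-bucket-configuration LocationConstraint=ap-south-1

# Block all public access (repeat for the processed bucket)
aws s3api put-public-access-block --bucket amodhbh-media-uploads \
  --public-access-block-configuration BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

# Confirm both buckets exist
aws s3 ls | grep amodhbh-media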

Step 2: Create DynamoDB Table

We’ll create a DynamoDB table to track job status throughout the processing pipeline.

2.1 Create the Jobs Table

  1. Navigate to DynamoDB:

    • In the AWS Console, search for “DynamoDB”
    • Click DynamoDB to open the DynamoDB console
  2. Create table:

    • Click Create table
    • Table name: media-processing-jobs
    • Partition key: jobId (Type: String)
    • Sort key: Leave empty (not needed)
  3. Table settings:

    • Table class: DynamoDB Standard
    • Capacity mode: Select On-demand
      • This is cost-effective for unpredictable workloads
      • You only pay per request, no upfront capacity planning needed
  4. Encryption:

    • Encryption at rest: Owned by Amazon DynamoDB (default, no additional cost)
  5. Tags (Optional):

    • Key: Project, Value: serverless-media
    • Key: Environment, Value: production
  6. Create table:

    • Click Create table
    • Wait 20-30 seconds for the table to become Active

2.2 Verify Table Creation

  1. Click on the media-processing-jobs table
  2. Go to the Explore table items tab
  3. You should see an empty table with the jobId partition key
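
The same table can be created with a single CLI call; a sketch using the settings above (on-demand billing, jobId partition key):

aws dynamodb create-table \
  --table-name media-processing-jobs \
  --attribute-definitions AttributeName=jobId,AttributeType=S \
  --key-schema AttributeName=jobId,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST \
  --region ap-south-1

# Wait for Active status and confirm the key schema
aws dynamodb wait table-exists --table-name media-processing-jobs --region ap-south-1
aws dynamodb describe-table --table-name media-processing-jobs \
  --query "Table.{Status:TableStatus,Keys:KeySchema}" --region ap-south-1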

Step 3: Create SQS Queues

We’ll create two SQS queues: a main processing queue and a dead letter queue for failed messages.

3.1 Create Dead Letter Queue (DLQ)

We create the DLQ first, so we can reference it when creating the main queue.

  1. Navigate to SQS:

    • In the AWS Console, search for “SQS”
    • Click Simple Queue Service
  2. Create queue:

    • Click Create queue
    • Type: Standard
    • Name: media-processing-dlq
  3. Configuration:

    • Visibility timeout: 5 minutes (300 seconds)
    • Message retention period: 14 days (maximum, so you can investigate failures)
    • Delivery delay: 0 seconds
    • Maximum message size: 256 KB (default)
    • Receive message wait time: 0 seconds (short polling is fine for DLQ)
  4. Access policy:

    • Leave as default (only queue owner can send/receive)
  5. Encryption:

    • Server-side encryption: Disabled is fine here (SSE-SQS is free if you prefer to enable it; a customer-managed KMS key adds cost)
  6. Tags (Optional):

    • Key: Project, Value: serverless-media
    • Key: Environment, Value: production
  7. Create queue:

    • Click Create queue
    • Note the Queue ARN - you’ll need this in the next step
    • Example: arn:aws:sqs:ap-south-1:123456789012:media-processing-dlq

3.2 Create Main Processing Queue

  1. Create queue:

    • From the SQS console, click Create queue
    • Type: Standard
    • Name: media-processing-queue
  2. Configuration:

    • Visibility timeout: 5 minutes (300 seconds)
      • This must be longer than your Lambda timeout so a message isn’t redelivered while it’s still being processed; AWS’s guidance for SQS event sources is at least six times the function timeout
      • Our worker Lambda will have a 2-minute timeout, so 5 minutes covers each attempt for this walkthrough; for production, consider raising it toward the recommended 12 minutes
    • Message retention period: 4 days (default)
    • Delivery delay: 0 seconds
    • Maximum message size: 256 KB (default)
    • Receive message wait time: 20 seconds (enables long polling, more efficient)
  3. Dead-letter queue:

    • Enable: Check the box
    • Choose queue: Select media-processing-dlq from the dropdown
    • Maximum receives: 3
      • After 3 failed processing attempts, the message moves to the DLQ
  4. Access policy:

    • Leave as default
  5. Encryption:

    • Server-side encryption: Disabled, or the free SSE-SQS option (avoid customer-managed KMS keys to minimize costs for learning)
  6. Tags (Optional):

    • Key: Project, Value: serverless-media
    • Key: Environment, Value: production
  7. Create queue:

    • Click Create queue

3.3 Verify Queue Creation

  1. You should see both queues in the SQS console:

    • media-processing-queue
    • media-processing-dlq
  2. Click on media-processing-queue and verify:

    • Dead-letter queue is set to media-processing-dlq
    • Maximum receives is 3
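
For reference, a CLI sketch of both queues, including the redrive policy (the account ID in the DLQ ARN is illustrative; substitute your own):

# Dead letter queue first: 5-minute visibility timeout, 14-day retention
aws sqs create-queue --queue-name media-processing-dlq \
  --attributes VisibilityTimeout=300,MessageRetentionPeriod=1209600 \
  --region ap-south-1

# Main queue: long polling enabled, DLQ after 3 failed receives
aws sqs create-queue --queue-name media-processing-queue \
  --attributes '{
    "VisibilityTimeout": "300",
    "ReceiveMessageWaitTimeSeconds": "20",
    "RedrivePolicy": "{\"deadLetterTargetArn\":\"arn:aws:sqs:ap-south-1:123456789012:media-processing-dlq\",\"maxReceiveCount\":\"3\"}"
  }' \
  --region ap-south-1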

Phase 2: Worker Lambda Development

Step 1: Create IAM Role for Worker Lambda

1.1 Navigate to IAM

  1. Open the AWS Console
  2. Search for “IAM” and click IAM
  3. In the left sidebar, click Roles
  4. Click Create role

1.2 Configure Trust Policy

  1. Trusted entity type: AWS service
  2. Use case: Lambda
  3. Click Next

1.3 Add Permissions Policies

We’ll attach AWS managed policies first, then add a custom inline policy.

  1. Search and attach these managed policies:

    • AWSLambdaBasicExecutionRole (for CloudWatch Logs)
  2. Click Next

1.4 Name and Create Role

  1. Role name: media-worker-lambda-role
  2. Description: IAM role for worker lambda to process media files
  3. Click Create role

1.5 Add Custom Inline Policy

Now we’ll add permissions for S3, SQS, and DynamoDB.

  1. In the Roles list, search for and click media-worker-lambda-role
  2. Click the Permissions tab
  3. Click Add permissions → Create inline policy
  4. Click the JSON tab
  5. Replace the content with:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "SQSReceiveAndDelete",
      "Effect": "Allow",
      "Action": [
        "sqs:ReceiveMessage",
        "sqs:DeleteMessage",
        "sqs:GetQueueAttributes"
      ],
      "Resource": "arn:aws:sqs:ap-south-1:*:media-processing-queue"
    },
    {
      "Sid": "S3ReadUploads",
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::amodhbh-media-uploads/*"
    },
    {
      "Sid": "S3WriteProcessed",
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::amodhbh-media-processed/*"
    },
    {
      "Sid": "DynamoDBUpdateJobs",
      "Effect": "Allow",
      "Action": ["dynamodb:GetItem", "dynamodb:UpdateItem"],
      "Resource": "arn:aws:dynamodb:ap-south-1:*:table/media-processing-jobs"
    }
  ]
}
  6. Click Next
  7. Policy name: WorkerLambdaCustomPolicy
  8. Click Create policy
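
The same role can be scripted; a hedged sketch, assuming the inline policy JSON above is saved locally as worker-policy.json:

# Role with the standard Lambda trust policy
aws iam create-role --role-name media-worker-lambda-role \
  --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"lambda.amazonaws.com"},"Action":"sts:AssumeRole"}]}'

# CloudWatch Logs permissions
aws iam attach-role-policy --role-name media-worker-lambda-role \
  --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

# The custom S3/SQS/DynamoDB permissions
aws iam put-role-policy --role-name media-worker-lambda-role \
  --policy-name WorkerLambdaCustomPolicy \
  --policy-document file://worker-policy.json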

Step 2: Prepare Lambda Deployment Package

The Worker Lambda needs the Pillow library for image processing. We’ll use a Lambda Layer approach.

2.1 Create Lambda Layer for Pillow

Option A: Use AWS-Provided Layer (Recommended - Easiest)

Pre-built layers for common Python packages exist (from AWS and the community), but availability varies by region and runtime. If you can’t find a Pillow layer you trust for Python 3.11/3.12 in ap-south-1, proceed to Option B.

Option B: Create Custom Layer (Most Reliable)

You’ll need to create this on a Linux environment or use AWS CloudShell.

  1. Open AWS CloudShell:

    • In the AWS Console (ap-south-1 region), click the CloudShell icon (terminal icon) in the top navigation bar
    • Wait for the shell to initialize
  2. Create the layer directory structure:

    mkdir -p pillow-layer/python
    cd pillow-layer
    
  3. Install Pillow:

    pip3 install Pillow -t python/
    
  4. Create the zip file:

    zip -r pillow-layer.zip python/
    
  5. Download to your local machine:

    • In CloudShell, click Actions → Download file
    • Enter file path: pillow-layer/pillow-layer.zip
    • Save the file to your computer
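
Alternatively, you can skip the download/upload round trip and publish the layer straight from CloudShell; a sketch (run from the pillow-layer directory):

# If CloudShell's Python version differs from your Lambda runtime, build against a Lambda-compatible wheel first, e.g.:
# pip3 install Pillow -t python/ --python-version 3.12 --platform manylinux2014_x86_64 --only-binary=:all:
aws lambda publish-layer-version \
  --layer-name pillow-layer \
  --description "Pillow library for image processing" \
  --zip-file fileb://pillow-layer.zip \
  --compatible-runtimes python3.11 python3.12 \
  --region ap-south-1

If you publish this way, note the LayerVersionArn in the output and you can skip section 2.2.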

2.2 Create Lambda Layer in AWS Console

  1. Navigate to Lambda:

    • Search for “Lambda” in the AWS Console
    • Click Lambda
  2. Create layer:

    • In the left sidebar, click Layers
    • Click Create layer
    • Name: pillow-layer
    • Description: Pillow library for image processing
    • Upload: Click Upload a .zip file
    • Choose the pillow-layer.zip file you downloaded
    • Compatible runtimes: Select Python 3.11, Python 3.12
    • Click Create
  3. Note the Layer ARN (you’ll need this when creating the Lambda function)

    • Example: arn:aws:lambda:ap-south-1:123456789012:layer:pillow-layer:1

Step 3: Create Worker Lambda Function

3.1 Create the Function

  1. Navigate to Lambda:

    • In Lambda console, click Functions in the left sidebar
    • Click Create function
  2. Basic information:

    • Select Author from scratch
    • Function name: worker-lambda
    • Runtime: Python 3.11 (or Python 3.12)
    • Architecture: x86_64
  3. Permissions:

    • Expand Change default execution role
    • Select Use an existing role
    • Existing role: Select media-worker-lambda-role from the dropdown
  4. Advanced settings:

    • Leave defaults for now
  5. Click Create function

3.2 Configure Function Settings

  1. In the Configuration tab:

    • Click General configuration → Edit
    • Memory: 512 MB (image processing needs more than the 128 MB default)
    • Timeout: 2 minutes (120 seconds)
    • Ephemeral storage: 512 MB (default is fine)
    • Click Save
  2. Add the Pillow Layer:

    • Scroll down to Layers section
    • Click Add a layer
    • Select Custom layers
    • Choose pillow-layer and the latest version
    • Click Add

3.3 Add Environment Variables

  1. Click Configuration tab

  2. Click Environment variables → Edit

  3. Click Add environment variable for each:

    Key                 Value
    UPLOAD_BUCKET       amodhbh-media-uploads
    PROCESSED_BUCKET    amodhbh-media-processed
    DYNAMODB_TABLE      media-processing-jobs
  4. Click Save
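
All of the settings in 3.2 and 3.3 can also be applied in a single CLI call; a sketch (the layer ARN is illustrative):

aws lambda update-function-configuration \
  --function-name worker-lambda \
  --memory-size 512 \
  --timeout 120 \
  --layers arn:aws:lambda:ap-south-1:123456789012:layer:pillow-layer:1 \
  --environment "Variables={UPLOAD_BUCKET=amodhbh-media-uploads,PROCESSED_BUCKET=amodhbh-media-processed,DYNAMODB_TABLE=media-processing-jobs}" \
  --region ap-south-1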

3.4 Add Function Code

  1. Click the Code tab
  2. In the code editor, replace the contents of lambda_function.py with:
import json
import boto3
import os
from PIL import Image, ImageDraw, ImageFont
from io import BytesIO
from datetime import datetime
import logging

# Configure logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Initialize AWS clients
s3_client = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['DYNAMODB_TABLE'])

UPLOAD_BUCKET = os.environ['UPLOAD_BUCKET']
PROCESSED_BUCKET = os.environ['PROCESSED_BUCKET']

def lambda_handler(event, context):
    """
    Main Lambda handler. Processes SQS messages containing S3 upload events.
    """
    logger.info(f"Received event: {json.dumps(event)}")

    for record in event['Records']:
        job_id = None  # reset per record so a failure can't report the previous record's job
        try:
            # Parse the SQS message body
            message_body = json.loads(record['body'])

            job_id = message_body['jobId']
            s3_key = message_body['s3Key']

            logger.info(f"Processing job {job_id} for file {s3_key}")

            # Update job status to PROCESSING
            update_job_status(job_id, 'PROCESSING')

            # Download the image from S3
            image_data = download_image(UPLOAD_BUCKET, s3_key)

            # Process the image (add watermark)
            processed_image_data = add_watermark(image_data)

            # Generate output key
            output_key = f"processed/{s3_key}"

            # Upload processed image to S3
            upload_image(PROCESSED_BUCKET, output_key, processed_image_data)

            # Update job status to COMPLETED
            update_job_status(
                job_id,
                'COMPLETED',
                processed_url=f"s3://{PROCESSED_BUCKET}/{output_key}"
            )

            logger.info(f"Successfully processed job {job_id}")

        except Exception as e:
            logger.error(f"Error processing record: {str(e)}")

            # Update job status to FAILED
            if job_id:
                update_job_status(job_id, 'FAILED', error=str(e))

            # Re-raise the exception so SQS knows the message failed
            raise

    return {
        'statusCode': 200,
        'body': json.dumps('Processing complete')
    }

def download_image(bucket, key):
    """
    Download an image from S3 and return as bytes.
    """
    logger.info(f"Downloading s3://{bucket}/{key}")
    response = s3_client.get_object(Bucket=bucket, Key=key)
    image_data = response['Body'].read()
    return image_data

def add_watermark(image_data):
    """
    Add a watermark to the image with improved error handling and performance.
    """
    logger.info("Adding watermark to image")

    try:
        # Open the image
        image = Image.open(BytesIO(image_data))

        # Convert to RGBA if not already (for transparency support)
        if image.mode != 'RGBA':
            image = image.convert('RGBA')

        # Create a transparent overlay
        overlay = Image.new('RGBA', image.size, (255, 255, 255, 0))
        draw = ImageDraw.Draw(overlay)

        # Calculate watermark position (bottom-right corner)
        watermark_text = "© Amodhbh Media"

        # Try to use a better font, fall back to default if not available
        try:
            font = ImageFont.truetype("/usr/share/fonts/dejavu/DejaVuSans-Bold.ttf", 36)
        except OSError:
            # The DejaVu font file isn't bundled with the Lambda runtime; fall back to Pillow's built-in font
            font = ImageFont.load_default()

        # Get text size using textbbox
        bbox = draw.textbbox((0, 0), watermark_text, font=font)
        text_width = bbox[2] - bbox[0]
        text_height = bbox[3] - bbox[1]

        # Position text in bottom-right with 20px margin
        x = image.width - text_width - 20
        y = image.height - text_height - 20

        # Draw semi-transparent background rectangle
        padding = 10
        draw.rectangle(
            [x - padding, y - padding, x + text_width + padding, y + text_height + padding],
            fill=(0, 0, 0, 128)
        )

        # Draw the watermark text
        draw.text((x, y), watermark_text, fill=(255, 255, 255, 255), font=font)

        # Composite the overlay onto the original image
        watermarked = Image.alpha_composite(image, overlay)

        # Convert back to RGB (removes alpha channel)
        watermarked = watermarked.convert('RGB')

        # Save to bytes with optimized settings
        output = BytesIO()
        watermarked.save(output, format='JPEG', quality=95, optimize=True)
        output.seek(0)

        return output.getvalue()

    except Exception as e:
        logger.error(f"Error adding watermark: {str(e)}")
        raise

def upload_image(bucket, key, image_data):
    """
    Upload processed image to S3 with metadata.
    """
    logger.info(f"Uploading to s3://{bucket}/{key}")
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=image_data,
        ContentType='image/jpeg',
        Metadata={
            'processed-by': 'serverless-media-pipeline',
            'processing-timestamp': datetime.utcnow().isoformat()
        }
    )

def update_job_status(job_id, status, processed_url=None, error=None):
    """
    Update job status in DynamoDB with improved error handling.
    """
    logger.info(f"Updating job {job_id} to status {status}")

    try:
        update_expression = "SET #status = :status, updatedAt = :timestamp"
        expression_values = {
            ':status': status,
            ':timestamp': datetime.utcnow().isoformat()
        }
        expression_names = {
            '#status': 'status'
        }

        if processed_url:
            update_expression += ", processedUrl = :url"
            expression_values[':url'] = processed_url

        if error:
            update_expression += ", errorMessage = :error"
            expression_values[':error'] = error

        table.update_item(
            Key={'jobId': job_id},
            UpdateExpression=update_expression,
            ExpressionAttributeValues=expression_values,
            ExpressionAttributeNames=expression_names
        )

        logger.info(f"Successfully updated job {job_id} to {status}")

    except Exception as e:
        logger.error(f"Error updating job status: {str(e)}")
        raise
  3. Click Deploy to save the function code
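
If you edit the code locally rather than in the console editor, a hedged deploy sketch:

# Zip just the handler and push it to the function
zip function.zip lambda_function.py
aws lambda update-function-code \
  --function-name worker-lambda \
  --zip-file fileb://function.zip \
  --region ap-south-1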

Step 4: Add SQS Trigger

4.1 Configure the Trigger

  1. In the Lambda function, click Add trigger
  2. Select a source: SQS
  3. SQS queue: Select media-processing-queue
  4. Batch size: 1 (process one message at a time)
  5. Batch window: 0 seconds
  6. Enable trigger: Checked
  7. Click Add
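
The same trigger can be created from the CLI; a sketch (the queue ARN is illustrative):

aws lambda create-event-source-mapping \
  --function-name worker-lambda \
  --event-source-arn arn:aws:sqs:ap-south-1:123456789012:media-processing-queue \
  --batch-size 1 \
  --region ap-south-1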

4.2 Verify Trigger Configuration

  1. In the Configuration tab → Triggers
  2. You should see media-processing-queue listed
  3. Status should be Enabled

Testing & Verification

Step 1: Create Test Event (Manual Test)

We’ll simulate an SQS message manually to test the worker Lambda.

  1. Click the Test tab
  2. Event name: TestMediaProcessing
  3. Template: SQS
  4. Replace the event JSON with:
{
  "Records": [
    {
      "messageId": "test-message-id",
      "receiptHandle": "test-receipt-handle",
      "body": "{\"jobId\": \"test-job-123\", \"s3Key\": \"test-image.jpg\"}",
      "attributes": {
        "ApproximateReceiveCount": "1",
        "SentTimestamp": "1234567890000",
        "SenderId": "test-sender",
        "ApproximateFirstReceiveTimestamp": "1234567890000"
      },
      "messageAttributes": {},
      "md5OfBody": "test-md5",
      "eventSource": "aws:sqs",
      "eventSourceARN": "arn:aws:sqs:ap-south-1:123456789012:media-processing-queue",
      "awsRegion": "ap-south-1"
    }
  ]
}
  5. Click Save

Note: This test will fail because test-image.jpg doesn’t exist in your bucket. We’ll do a proper end-to-end test in Part 2.
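
If you want to see the worker succeed before Part 2, you can stage the object and message by hand; a sketch, assuming a local sample.jpg and your own account ID in the queue URL:

# Give the handler something to download
aws s3 cp sample.jpg s3://amodhbh-media-uploads/test-image.jpg

# Seed the job record the handler will update
aws dynamodb put-item --table-name media-processing-jobs \
  --item '{"jobId": {"S": "test-job-123"}, "status": {"S": "PENDING"}}' \
  --region ap-south-1

# Sending the message causes the SQS trigger to invoke worker-lambda
aws sqs send-message \
  --queue-url https://sqs.ap-south-1.amazonaws.com/123456789012/media-processing-queue \
  --message-body '{"jobId": "test-job-123", "s3Key": "test-image.jpg"}' \
  --region ap-south-1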

Step 2: Verify CloudWatch Logs

  1. Go to Monitor tab → View CloudWatch logs
  2. Click the latest log stream
  3. You should see log entries from your function execution
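
The same logs can be followed from the CLI (AWS CLI v2):

aws logs tail /aws/lambda/worker-lambda --since 15m --follow --region ap-south-1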

Step 3: Verification Checklist

Before proceeding to Part 2, verify:

S3 Buckets (2)

  • amodhbh-media-uploads exists in ap-south-1
  • amodhbh-media-processed exists in ap-south-1
  • Both buckets have “Block all public access” enabled
  • Both buckets have encryption enabled

DynamoDB Table (1)

  • media-processing-jobs table exists
  • Table status is Active
  • Partition key is jobId (String)
  • Capacity mode is On-demand

SQS Queues (2)

  • media-processing-queue exists
  • media-processing-dlq exists
  • Main queue has DLQ configured with max receives = 3
  • Visibility timeout is 300 seconds (5 minutes)

IAM Role

  • Role media-worker-lambda-role exists
  • Has AWSLambdaBasicExecutionRole managed policy
  • Has custom inline policy with SQS, S3, and DynamoDB permissions

Lambda Layer

  • Layer pillow-layer created
  • Layer contains Pillow library

Lambda Function

  • Function worker-lambda exists
  • Runtime is Python 3.11 or 3.12
  • Memory is 512 MB
  • Timeout is 120 seconds (2 minutes)
  • Pillow layer is attached
  • Environment variables are set (UPLOAD_BUCKET, PROCESSED_BUCKET, DYNAMODB_TABLE)
  • Function code is deployed
  • SQS trigger is configured and enabled

Production Considerations

Security Enhancements

  1. Enable S3 Server-Side Encryption:

    • Use AWS KMS for additional security
    • Implement bucket policies for access control
  2. IAM Least Privilege:

    • Review and minimize IAM permissions
    • Use resource-specific ARNs where possible
  3. VPC Configuration:

    • Consider placing Lambda in VPC for additional security
    • Configure VPC endpoints for AWS services

Monitoring & Alerting

  1. CloudWatch Alarms:

    • Set up alarms for Lambda errors (see the CLI sketch after this list)
    • Monitor SQS queue depth
    • Track DynamoDB throttling
  2. X-Ray Tracing:

    • Enable AWS X-Ray for distributed tracing
    • Monitor performance bottlenecks
  3. Custom Metrics:

    • Track processing times
    • Monitor image sizes and formats
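
As an example of the alarm suggestion above, a hedged sketch of an Errors alarm on the worker function (the SNS topic ARN is a placeholder for wherever you want notifications sent):

aws cloudwatch put-metric-alarm \
  --alarm-name worker-lambda-errors \
  --namespace AWS/Lambda \
  --metric-name Errors \
  --dimensions Name=FunctionName,Value=worker-lambda \
  --statistic Sum \
  --period 300 \
  --evaluation-periods 1 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions arn:aws:sns:ap-south-1:123456789012:ops-alerts \
  --region ap-south-1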

Performance Optimization

  1. Lambda Configuration:

    • Adjust memory based on image sizes
    • Use provisioned concurrency for consistent performance
    • Implement connection pooling for AWS services
  2. S3 Optimization:

    • Use S3 Transfer Acceleration for faster uploads
    • Implement lifecycle policies for cost optimization
  3. DynamoDB Optimization:

    • Use batch operations where possible
    • Implement caching for frequently accessed data

Cost Analysis

Current Monthly Cost for Idle Resources

  • S3 buckets (empty): $0.00
  • DynamoDB table (on-demand, no requests): $0.00
  • SQS queues (no messages): $0.00
  • Lambda (no invocations): $0.00

Total idle cost: $0.00/month

Cost When in Use

Example: 1000 image processing jobs per month

  • S3 storage: ~$0.023 per GB/month
  • DynamoDB: $0.285 per million read requests, $1.4225 per million write requests
  • SQS: $0.40 per million requests (first 1 million free each month)
  • Lambda requests: $0.20 per 1 million requests
  • Lambda duration: $0.0000166667 per GB-second

Estimated monthly cost for 1000 operations: ~$0.20
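
As a rough back-of-the-envelope check, assuming ~2 MB images and ~5 seconds of processing at 512 MB (both figures vary with your workload):

  • Lambda duration: 1,000 × 5 s × 0.5 GB = 2,500 GB-seconds × $0.0000166667 ≈ $0.04
  • Lambda requests: 1,000 × $0.20 per million ≈ $0.0002
  • DynamoDB writes: ~3,000 status updates × $1.4225 per million ≈ $0.004
  • SQS requests: a few thousand, covered by the 1-million-request free tier
  • S3 storage: ~4 GB (originals plus processed copies) × $0.023 ≈ $0.09

That lands around $0.14, comfortably within the ~$0.20 estimate above.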

Cost Optimization Tips

  1. S3 Lifecycle Policies:

    • Automatically delete old processed images
    • Move to cheaper storage classes
  2. DynamoDB Optimization:

    • Use on-demand billing for unpredictable workloads
    • Implement TTL for automatic cleanup (see the CLI sketch after this list)
  3. Lambda Optimization:

    • Right-size memory allocation
    • Use provisioned concurrency only when needed
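
For the TTL suggestion above, a sketch (assumes your job items carry a numeric epoch attribute such as expiresAt, which the Part 1 worker code does not set yet):

aws dynamodb update-time-to-live \
  --table-name media-processing-jobs \
  --time-to-live-specification "Enabled=true, AttributeName=expiresAt" \
  --region ap-south-1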

Troubleshooting

Common Issues and Solutions

“Unable to import module ’lambda_function’: No module named ‘PIL’”

  • Cause: The Pillow layer is not attached or incorrectly built
  • Solution:
    • Verify the layer is attached in the function’s Configuration → Layers
    • Rebuild the layer ensuring the directory structure is python/PIL/...

“Task timed out after 120.00 seconds”

  • Cause: Image is too large or processing is too slow
  • Solution:
    • Increase timeout to 3-5 minutes in Configuration → General configuration
    • Consider increasing memory to 1024 MB (more memory = faster CPU)

“Access Denied” errors

  • Cause: IAM permissions issue
  • Solution:
    • Check IAM role permissions
    • Verify bucket names match exactly in environment variables and IAM policy
    • Ensure the S3 objects exist in the correct bucket

Messages going to DLQ

  • Cause: Lambda function errors
  • Solution:
    • Check CloudWatch Logs for specific error messages
    • Verify all environment variables are set correctly
    • Check that the function code was deployed (click Deploy after editing)
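
To see what actually landed in the DLQ, you can pull a few messages without deleting them; a sketch (the queue URL is illustrative):

aws sqs receive-message \
  --queue-url https://sqs.ap-south-1.amazonaws.com/123456789012/media-processing-dlq \
  --max-number-of-messages 5 \
  --visibility-timeout 30 \
  --region ap-south-1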

Bucket name already exists

  • Cause: S3 bucket names are globally unique across all AWS accounts
  • Solution:
    • If amodhbh-media-uploads is taken, try: amodhbh-media-uploads-<random-number>
    • Update the bucket name consistently across all phases

Cannot create DynamoDB table

  • Cause: Region or naming conflict
  • Solution:
    • Verify you’re in the correct region (ap-south-1)
    • Check you don’t already have a table with the same name

SQS queue creation fails

  • Cause: DLQ not created first
  • Solution:
    • Ensure the DLQ was created first
    • Verify the DLQ ARN is correctly selected

Next Steps

Congratulations! You’ve successfully set up the core infrastructure and worker Lambda for your serverless media processing pipeline.

What’s Next:

  • Part 2: API Gateway and supporting Lambda functions
  • Part 3: S3 event triggers and end-to-end testing
  • Part 4: Production deployment and monitoring

Quick Reference:

Resource Type     Name                       Purpose
S3 Bucket         amodhbh-media-uploads      Stores original uploaded images
S3 Bucket         amodhbh-media-processed    Stores watermarked images
DynamoDB Table    media-processing-jobs      Tracks job status (PENDING/COMPLETED/FAILED)
SQS Queue         media-processing-queue     Decouples upload events from processing
SQS Queue         media-processing-dlq       Captures failed messages for investigation
IAM Role          media-worker-lambda-role   Worker Lambda execution role
Lambda Layer      pillow-layer               Provides Pillow library for image processing
Lambda Function   worker-lambda              Downloads, watermarks, and uploads images
SQS Trigger       media-processing-queue     Triggers worker Lambda when messages arrive

Summary

In this comprehensive guide, we’ve built the foundation of a production-ready serverless media processing pipeline:

Infrastructure Setup:

  • S3 buckets for uploads and processed images
  • DynamoDB table for job tracking
  • SQS queues with dead letter queue for reliability

Worker Lambda:

  • Image processing with Pillow library
  • Comprehensive error handling and logging
  • Production-ready code with optimizations

Security & Monitoring:

  • IAM roles with least privilege
  • CloudWatch logging and monitoring
  • Cost optimization strategies

Key Benefits:

  • Scalable: Handles thousands of concurrent requests
  • Cost-effective: Pay only for what you use
  • Reliable: Built-in error handling and retry logic
  • Maintainable: Clean, well-documented code

Ready for Part 2? We’ll add API Gateway, pre-signed URLs, and complete the end-to-end pipeline!


This is Part 1 of a 4-part series on building a production-ready serverless media processing pipeline. Stay tuned for Part 2, where we’ll add API Gateway and complete the user-facing components!
