AWS Serverless Media Processing Pipeline - Part 2: API Gateway & Complete Pipeline

Complete your serverless media processing pipeline with API Gateway, Lambda functions, and end-to-end testing. Part 2 covers the user-facing API and event-driven processing.

Overview

In this comprehensive guide, we’ll complete our serverless media processing pipeline by adding the user-facing API and event-driven processing components. This is Part 2 of our 3-part series, building on the infrastructure foundation from Part 1.

What we’ll build:

  • API Gateway HTTP API with secure endpoints
  • Lambda functions for upload URL generation and job status checking
  • S3 event-driven dispatcher Lambda
  • Complete end-to-end testing and monitoring

Architecture Flow:

User → API Gateway → Lambda → S3 → Event → Dispatcher → SQS → Worker → Processed S3

Region: ap-south-1 (Mumbai)
Estimated Setup Time: 1-2 hours
Prerequisites: Part 1 completed


Table of Contents

  1. Prerequisites
  2. Phase 3: API Gateway & Supporting Lambda Functions
  3. Phase 4: S3 Dispatcher & End-to-End Testing
  4. Production Monitoring & Optimization
  5. Cost Analysis & Scaling
  6. Troubleshooting Guide

Prerequisites

Before starting, ensure you have completed Part 1 and have:

  • ✅ S3 buckets (amodhbh-media-uploads, amodhbh-media-processed)
  • ✅ DynamoDB table (media-processing-jobs)
  • ✅ SQS queues (media-processing-queue, media-processing-dlq)
  • ✅ Worker Lambda function with Pillow layer
  • ✅ All IAM roles and permissions configured

Required AWS Permissions:

  • API Gateway: Create APIs, routes, and integrations
  • Lambda: Create additional functions and layers
  • IAM: Create roles and policies for new functions

Phase 3: API Gateway & Supporting Lambda Functions

Step 1: Create IAM Role for API Lambda Functions

1.1 Create the IAM Role

  1. Navigate to IAM:

    • AWS Console → Search “IAM” → Click IAM
    • Left sidebar → Roles
    • Click Create role
  2. Configure trust policy:

    • Trusted entity type: AWS service
    • Use case: Lambda
    • Click Next
  3. Add permissions:

    • Search and select: AWSLambdaBasicExecutionRole
    • Click Next
  4. Name and create:

    • Role name: media-api-lambda-role
    • Description: IAM role for API Lambda functions to interact with S3 and DynamoDB
    • Click Create role

1.2 Add Custom Inline Policy

  1. Search for and click on media-api-lambda-role
  2. Permissions tab → Add permissions → Create inline policy
  3. Click JSON tab
  4. Paste this policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DynamoDBReadWrite",
      "Effect": "Allow",
      "Action": ["dynamodb:PutItem", "dynamodb:GetItem", "dynamodb:UpdateItem"],
      "Resource": "arn:aws:dynamodb:ap-south-1:*:table/media-processing-jobs"
    },
    {
      "Sid": "S3PresignedURL",
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::amodhbh-media-uploads/*"
    },
    {
      "Sid": "S3GetProcessedObjects",
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::amodhbh-media-processed/*"
    }
  ]
}
  5. Click Next
  6. Policy name: APILambdaCustomPolicy
  7. Click Create policy

Step 2: Create Request Upload Lambda Function

This function generates a pre-signed S3 URL and creates a job record.

2.1 Create the Function

  1. Navigate to Lambda:

    • AWS Console → Search “Lambda” → Functions → Create function
  2. Basic information:

    • Select Author from scratch
    • Function name: request-upload-lambda
    • Runtime: Python 3.11
    • Architecture: x86_64
  3. Permissions:

    • Expand Change default execution role
    • Select Use an existing role
    • Existing role: media-api-lambda-role
  4. Click Create function

2.2 Configure Function Settings

  1. Configuration tab → General configuration → Edit

    • Memory: 128 MB (this is a lightweight function)
    • Timeout: 30 seconds
    • Click Save
  2. Add environment variables:

    • Configuration tab → Environment variables → Edit
    • Add these variables:
    Key              Value
    UPLOAD_BUCKET    amodhbh-media-uploads
    DYNAMODB_TABLE   media-processing-jobs
    • Click Save

2.3 Add Function Code

  1. Click the Code tab
  2. Replace lambda_function.py with:
import json
import logging
import os
import uuid
from datetime import datetime

import boto3

# Configure logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)

s3_client = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')

UPLOAD_BUCKET = os.environ['UPLOAD_BUCKET']
DYNAMODB_TABLE = os.environ['DYNAMODB_TABLE']

table = dynamodb.Table(DYNAMODB_TABLE)

def lambda_handler(event, context):
    """
    Generate a pre-signed S3 upload URL and create a job record in DynamoDB.
    """
    logger.info(f"Received event: {json.dumps(event)}")

    try:
        # Parse request body
        if event.get('body'):
            body = json.loads(event['body'])
        else:
            body = {}

        # Get filename from request (optional)
        filename = body.get('filename', f"{uuid.uuid4()}.jpg")

        # Generate unique job ID
        job_id = str(uuid.uuid4())

        # Generate S3 key (path in bucket)
        s3_key = f"uploads/{job_id}/{filename}"

        # Create job record in DynamoDB
        create_job_record(job_id, s3_key)

        # Generate pre-signed URL (valid for 5 minutes)
        upload_url = generate_presigned_url(UPLOAD_BUCKET, s3_key)

        # Return response
        response = {
            'jobId': job_id,
            'uploadUrl': upload_url,
            'expiresIn': 300  # 5 minutes in seconds
        }

        logger.info(f"Generated upload URL for job {job_id}")

        return {
            'statusCode': 200,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': '*',
                'Access-Control-Allow-Headers': 'Content-Type',
                'Access-Control-Allow-Methods': 'POST, OPTIONS'
            },
            'body': json.dumps(response)
        }

    except Exception as e:
        logger.error(f"Error: {str(e)}")
        return {
            'statusCode': 500,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': '*'
            },
            'body': json.dumps({
                'error': 'Internal server error',
                'message': str(e)
            })
        }

def create_job_record(job_id, s3_key):
    """
    Create a new job record in DynamoDB with PENDING status.
    """
    logger.info(f"Creating job record for {job_id}")

    try:
        table.put_item(
            Item={
                'jobId': job_id,
                'status': 'PENDING',
                's3Key': s3_key,
                'createdAt': datetime.utcnow().isoformat(),
                'updatedAt': datetime.utcnow().isoformat()
            }
        )
        logger.info(f"Successfully created job record for {job_id}")
    except Exception as e:
        logger.error(f"Error creating job record: {str(e)}")
        raise

def generate_presigned_url(bucket, key):
    """
    Generate a pre-signed URL for uploading to S3.
    """
    logger.info(f"Generating pre-signed URL for s3://{bucket}/{key}")

    try:
        url = s3_client.generate_presigned_url(
            'put_object',
            Params={
                'Bucket': bucket,
                'Key': key,
                'ContentType': 'image/jpeg'
            },
            ExpiresIn=300  # 5 minutes
        )
        return url
    except Exception as e:
        logger.error(f"Error generating pre-signed URL: {str(e)}")
        raise
  3. Click Deploy
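
Before wiring up API Gateway, you can sanity-check the function by invoking it directly with a minimal HTTP-API-style test event. Below is a quick sketch using boto3 from your own machine (it assumes your local AWS credentials point at the same account and region; the event contains only the body field this handler reads):

import json
import boto3

# Assumes local credentials for the same account and region (ap-south-1)
lambda_client = boto3.client('lambda', region_name='ap-south-1')

# Minimal test event: the handler only reads the "body" field
test_event = {'body': json.dumps({'filename': 'test-image.jpg'})}

response = lambda_client.invoke(
    FunctionName='request-upload-lambda',
    Payload=json.dumps(test_event).encode('utf-8')
)

payload = json.loads(response['Payload'].read())
print(payload['statusCode'])
print(json.loads(payload['body']))  # jobId, uploadUrl, expiresIn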

Step 3: Create Get Job Status Lambda Function

This function retrieves job status from DynamoDB.

3.1 Create the Function

  1. Lambda console → Functions → Create function

  2. Basic information:

    • Select Author from scratch
    • Function name: get-job-status-lambda
    • Runtime: Python 3.11
    • Architecture: x86_64
  3. Permissions:

    • Expand Change default execution role
    • Select Use an existing role
    • Existing role: media-api-lambda-role
  4. Click Create function

3.2 Configure Function Settings

  1. Configuration tab → General configuration → Edit

    • Memory: 128 MB
    • Timeout: 10 seconds
    • Click Save
  2. Add environment variables:

    • Configuration tab → Environment variables → Edit
    • Add these variables:
    Key               Value
    DYNAMODB_TABLE    media-processing-jobs
    PROCESSED_BUCKET  amodhbh-media-processed
    • Click Save

3.3 Add Function Code

  1. Click the Code tab
  2. Replace lambda_function.py with:
import json
import logging
import os

import boto3

# Configure logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)

dynamodb = boto3.resource('dynamodb')
s3_client = boto3.client('s3')

DYNAMODB_TABLE = os.environ['DYNAMODB_TABLE']
PROCESSED_BUCKET = os.environ['PROCESSED_BUCKET']

table = dynamodb.Table(DYNAMODB_TABLE)

def lambda_handler(event, context):
    """
    Retrieve job status from DynamoDB by job ID.
    """
    logger.info(f"Received event: {json.dumps(event)}")

    try:
        # Extract jobId from path parameters
        path_parameters = event.get('pathParameters') or {}
        job_id = path_parameters.get('jobId')

        if not job_id:
            logger.warning("Missing jobId parameter")
            return {
                'statusCode': 400,
                'headers': {
                    'Content-Type': 'application/json',
                    'Access-Control-Allow-Origin': '*'
                },
                'body': json.dumps({
                    'error': 'Missing jobId parameter'
                })
            }

        logger.info(f"Retrieving status for job {job_id}")

        # Get job from DynamoDB
        response = table.get_item(Key={'jobId': job_id})

        if 'Item' not in response:
            logger.warning(f"Job {job_id} not found")
            return {
                'statusCode': 404,
                'headers': {
                    'Content-Type': 'application/json',
                    'Access-Control-Allow-Origin': '*'
                },
                'body': json.dumps({
                    'error': 'Job not found'
                })
            }

        job = response['Item']

        # If job is completed, generate a pre-signed URL for the processed image
        if job['status'] == 'COMPLETED' and 'processedUrl' in job:
            download_url = generate_download_url(job['processedUrl'])
            if download_url:
                job['downloadUrl'] = download_url

        logger.info(f"Retrieved status for job {job_id}: {job['status']}")

        # Return job status
        return {
            'statusCode': 200,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': '*',
                'Access-Control-Allow-Headers': 'Content-Type',
                'Access-Control-Allow-Methods': 'GET, OPTIONS'
            },
            'body': json.dumps({
                'jobId': job['jobId'],
                'status': job['status'],
                'createdAt': job.get('createdAt'),
                'updatedAt': job.get('updatedAt'),
                'downloadUrl': job.get('downloadUrl'),
                'errorMessage': job.get('errorMessage')
            })
        }

    except Exception as e:
        logger.error(f"Error: {str(e)}")
        return {
            'statusCode': 500,
            'headers': {
                'Content-Type': 'application/json',
                'Access-Control-Allow-Origin': '*'
            },
            'body': json.dumps({
                'error': 'Internal server error',
                'message': str(e)
            })
        }

def generate_download_url(s3_url):
    """
    Generate a pre-signed URL for downloading the processed image.
    """
    try:
        if s3_url.startswith('s3://'):
            # Parse the S3 URL
            s3_path = s3_url.replace('s3://', '')
            parts = s3_path.split('/', 1)
            if len(parts) == 2:
                bucket, key = parts

                # Generate pre-signed URL for download (valid 1 hour)
                download_url = s3_client.generate_presigned_url(
                    'get_object',
                    Params={
                        'Bucket': bucket,
                        'Key': key
                    },
                    ExpiresIn=3600  # 1 hour
                )
                return download_url
    except Exception as e:
        logger.error(f"Error generating download URL: {str(e)}")

    return None
  3. Click Deploy

Step 4: Create API Gateway HTTP API

4.1 Create the API

  1. Navigate to API Gateway:

    • AWS Console → Search “API Gateway” → Click API Gateway
  2. Create API:

    • Click Create API
    • Under HTTP API, click Build
    • API name: media-processing-api
    • Click Next
  3. Configure routes (skip for now):

    • Click Next
  4. Configure stages:

    • Stage name: $default (auto-deploy)
    • Click Next
  5. Review and create:

    • Click Create
  6. Note your API endpoint URL:

    • Example: https://abc123xyz.execute-api.ap-south-1.amazonaws.com
    • You’ll use this to make API calls

4.2 Create Routes and Integrations

Route 1: POST /uploads
  1. In your API, click Routes in the left sidebar

  2. Click Create

  3. Method: POST

  4. Path: /uploads

  5. Click Create

  6. Attach integration:

    • Click on the POST /uploads route
    • Click Attach integration
    • Create and attach an integration
    • Integration type: Lambda function
    • Integration target: request-upload-lambda
    • Click Create
Route 2: GET /jobs/{jobId}
  1. Click Create (to create another route)

  2. Method: GET

  3. Path: /jobs/{jobId}

  4. Click Create

  5. Attach integration:

    • Click on the GET /jobs/{jobId} route
    • Click Attach integration
    • Create and attach an integration
    • Integration type: Lambda function
    • Integration target: get-job-status-lambda
    • Click Create

4.3 Enable CORS (Already handled in Lambda)

Our Lambda functions already return CORS headers, which is sufficient for the curl and Postman tests in this guide and for simple browser requests. If a browser client sends preflighted OPTIONS requests (for example, with custom headers), also configure CORS on the HTTP API itself or add an OPTIONS route, since preflight requests that don’t match a route never reach the Lambda integrations.

Step 5: Test the API

5.1 Test Request Upload Endpoint

Using curl (Linux/Mac/Git Bash on Windows):

curl -X POST https://YOUR_API_ID.execute-api.ap-south-1.amazonaws.com/uploads \
  -H "Content-Type: application/json" \
  -d '{"filename": "test-image.jpg"}'

Expected response:

{
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "uploadUrl": "https://amodhbh-media-uploads.s3.ap-south-1.amazonaws.com/uploads/...",
  "expiresIn": 300
}

Using Postman:

  1. Method: POST
  2. URL: https://YOUR_API_ID.execute-api.ap-south-1.amazonaws.com/uploads
  3. Headers: Content-Type: application/json
  4. Body (raw JSON):
    {
      "filename": "my-photo.jpg"
    }
    
  5. Click Send

5.2 Test File Upload to S3

After getting the uploadUrl from the previous request, upload an actual image:

Using curl:

curl -X PUT "PRESIGNED_URL_FROM_PREVIOUS_RESPONSE" \
  -H "Content-Type: image/jpeg" \
  --data-binary @path/to/your/image.jpg

Using Postman:

  1. Method: PUT
  2. URL: Paste the uploadUrl from the previous response
  3. Headers: Content-Type: image/jpeg
  4. Body: Select binary and choose an image file
  5. Click Send

5.3 Test Get Job Status Endpoint

Using curl:

curl https://YOUR_API_ID.execute-api.ap-south-1.amazonaws.com/jobs/YOUR_JOB_ID

Expected response (when status is PENDING):

{
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "status": "PENDING",
  "createdAt": "2025-10-17T12:00:00.000000",
  "updatedAt": "2025-10-17T12:00:00.000000"
}

Expected response (when status is COMPLETED):

{
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "status": "COMPLETED",
  "createdAt": "2025-10-17T12:00:00.000000",
  "updatedAt": "2025-10-17T12:00:15.000000",
  "downloadUrl": "https://amodhbh-media-processed.s3.ap-south-1.amazonaws.com/..."
}

Phase 4: S3 Dispatcher & End-to-End Testing

Step 1: Create IAM Role for Dispatcher Lambda

1.1 Create the IAM Role

  1. Navigate to IAM:

    • AWS Console → Search “IAM” → Click IAM
    • Left sidebar → Roles
    • Click Create role
  2. Configure trust policy:

    • Trusted entity type: AWS service
    • Use case: Lambda
    • Click Next
  3. Add permissions:

    • Search and select: AWSLambdaBasicExecutionRole
    • Click Next
  4. Name and create:

    • Role name: dispatcher-lambda-role
    • Description: IAM role for dispatcher lambda to send messages to SQS
    • Click Create role

1.2 Add Custom Inline Policy

  1. Search for and click on dispatcher-lambda-role
  2. Permissions tab → Add permissions → Create inline policy
  3. Click JSON tab
  4. Paste this policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "SQSSendMessage",
      "Effect": "Allow",
      "Action": ["sqs:SendMessage", "sqs:GetQueueUrl"],
      "Resource": "arn:aws:sqs:ap-south-1:*:media-processing-queue"
    }
  ]
}
  5. Click Next
  6. Policy name: DispatcherLambdaCustomPolicy
  7. Click Create policy

Step 2: Create Dispatcher Lambda Function

2.1 Create the Function

  1. Navigate to Lambda:

    • AWS Console → Search “Lambda” → Functions → Create function
  2. Basic information:

    • Select Author from scratch
    • Function name: dispatcher-lambda
    • Runtime: Python 3.11
    • Architecture: x86_64
  3. Permissions:

    • Expand Change default execution role
    • Select Use an existing role
    • Existing role: dispatcher-lambda-role
  4. Click Create function

2.2 Configure Function Settings

  1. Configuration tab → General configuration → Edit

    • Memory: 128 MB (this is a very lightweight function)
    • Timeout: 10 seconds
    • Click Save
  2. Add environment variables:

    • Configuration tab → Environment variables → Edit
    • Add this variable:
    Key             Value
    SQS_QUEUE_URL   (We’ll get this in the next step)
    • Click Save (we’ll update this after getting the queue URL)

2.3 Get SQS Queue URL

  1. Navigate to SQS:

    • AWS Console → Search “SQS” → Click Simple Queue Service
  2. Click on media-processing-queue

  3. Copy the Queue URL from the Details section

    • Example: https://sqs.ap-south-1.amazonaws.com/123456789012/media-processing-queue
  4. Go back to Lambda:

    • Return to dispatcher-lambda function
    • Configuration → Environment variables → Edit
    • Update SQS_QUEUE_URL with the copied URL
    • Click Save
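
If you prefer not to copy the URL from the console, the same value can be looked up programmatically. A small boto3 sketch, assuming the queue name from Part 1:

import boto3

sqs = boto3.client('sqs', region_name='ap-south-1')

# Look up the queue URL for the queue created in Part 1
queue_url = sqs.get_queue_url(QueueName='media-processing-queue')['QueueUrl']
print(queue_url)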

2.4 Add Function Code

  1. Click the Code tab
  2. Replace lambda_function.py with:
import json
import boto3
import os
from urllib.parse import unquote_plus
import logging

# Configure logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)

sqs_client = boto3.client('sqs')
SQS_QUEUE_URL = os.environ['SQS_QUEUE_URL']

def lambda_handler(event, context):
    """
    Triggered by S3 upload events. Extracts S3 object details and
    sends a message to SQS for processing by the worker Lambda.
    """
    logger.info(f"Received event: {json.dumps(event)}")

    try:
        # Process each S3 event record
        for record in event['Records']:
            # Extract S3 event information
            event_name = record['eventName']
            bucket_name = record['s3']['bucket']['name']
            object_key = unquote_plus(record['s3']['object']['key'])

            logger.info(f"Processing S3 event: {event_name}")
            logger.info(f"Bucket: {bucket_name}, Key: {object_key}")

            # Only process object creation events
            if not event_name.startswith('ObjectCreated'):
                logger.info(f"Ignoring event type: {event_name}")
                continue

            # Extract job ID from the S3 key
            # Expected format: uploads/{job_id}/{filename}
            job_id = extract_job_id(object_key)

            if not job_id:
                logger.warning(f"Could not extract job ID from key: {object_key}")
                continue

            # Create message for SQS
            message = {
                'jobId': job_id,
                's3Key': object_key,
                'bucket': bucket_name
            }

            # Send message to SQS queue
            send_to_sqs(message)

            logger.info(f"Successfully queued job {job_id} for processing")

        return {
            'statusCode': 200,
            'body': json.dumps('Successfully dispatched events to SQS')
        }

    except Exception as e:
        logger.error(f"Error processing S3 event: {str(e)}")
        # Re-raise to signal failure
        raise

def extract_job_id(s3_key):
    """
    Extract job ID from S3 key path.
    Expected format: uploads/{job_id}/{filename}
    """
    parts = s3_key.split('/')
    if len(parts) >= 2 and parts[0] == 'uploads':
        return parts[1]
    return None

def send_to_sqs(message):
    """
    Send a message to the SQS queue.
    """
    logger.info(f"Sending message to SQS: {json.dumps(message)}")

    try:
        response = sqs_client.send_message(
            QueueUrl=SQS_QUEUE_URL,
            MessageBody=json.dumps(message),
            MessageAttributes={
                'JobId': {
                    'StringValue': message['jobId'],
                    'DataType': 'String'
                }
            }
        )

        logger.info(f"Message sent with ID: {response['MessageId']}")
        return response
    except Exception as e:
        logger.error(f"Error sending message to SQS: {str(e)}")
        raise
  3. Click Deploy
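
Before configuring the real S3 trigger, you can exercise the dispatcher with a hand-built event. The sketch below invokes it via boto3 with a minimal S3 ObjectCreated record containing only the fields the handler reads (a real notification has many more). Note that this sends a real message to SQS with a fake job ID, so the worker will pick it up and fail to find a matching job record; treat it purely as a plumbing check.

import json
import boto3

lambda_client = boto3.client('lambda', region_name='ap-south-1')

# Minimal S3 ObjectCreated event: only the fields the dispatcher reads
fake_s3_event = {
    'Records': [
        {
            'eventName': 'ObjectCreated:Put',
            's3': {
                'bucket': {'name': 'amodhbh-media-uploads'},
                'object': {'key': 'uploads/test-job-id/test-image.jpg'}
            }
        }
    ]
}

response = lambda_client.invoke(
    FunctionName='dispatcher-lambda',
    Payload=json.dumps(fake_s3_event).encode('utf-8')
)
print(json.loads(response['Payload'].read()))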

Step 3: Configure S3 Event Notification

This is the crucial step that triggers the entire processing pipeline.

3.1 Add Lambda Permissions for S3

First, we need to allow S3 to invoke our Lambda function.

  1. In the dispatcher-lambda function:

    • Go to Configuration tab → Permissions
    • Scroll down to Resource-based policy statements
    • Click Add permissions
  2. Configure permissions:

    • Policy statement:
      • Statement ID: AllowS3Invocation
      • Principal: s3.amazonaws.com
      • Source ARN: arn:aws:s3:::amodhbh-media-uploads
      • Action: lambda:InvokeFunction
    • Click Save
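
If you script your setup instead of clicking through the console, the same permission can be added with boto3 (skip this if you already added it above, otherwise the duplicate statement ID will be rejected):

import boto3

lambda_client = boto3.client('lambda', region_name='ap-south-1')

# Allow S3 (for the uploads bucket only) to invoke the dispatcher
lambda_client.add_permission(
    FunctionName='dispatcher-lambda',
    StatementId='AllowS3Invocation',
    Action='lambda:InvokeFunction',
    Principal='s3.amazonaws.com',
    SourceArn='arn:aws:s3:::amodhbh-media-uploads'
)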

3.2 Create S3 Event Notification

  1. Navigate to S3:

    • AWS Console → Search “S3” → Click S3
  2. Open the uploads bucket:

    • Click on amodhbh-media-uploads
  3. Create event notification:

    • Go to the Properties tab
    • Scroll down to Event notifications
    • Click Create event notification
  4. Configure the event:

    • Event name: trigger-dispatcher-on-upload
    • Prefix: uploads/ (only trigger for files in the uploads/ folder)
    • Event types:
      • Check All object create events (or specifically: s3:ObjectCreated:*)
    • Destination:
      • Select Lambda function
      • Lambda function: Choose dispatcher-lambda from the dropdown
    • Click Save changes
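
The same notification can also be created with boto3. A sketch, assuming a placeholder account ID in the dispatcher ARN (substitute your own); note that this call replaces the bucket's entire notification configuration, so include any other notifications you need in the same request:

import boto3

s3 = boto3.client('s3', region_name='ap-south-1')

# Placeholder account ID -- substitute your own
dispatcher_arn = 'arn:aws:lambda:ap-south-1:123456789012:function:dispatcher-lambda'

# Warning: this replaces the bucket's whole notification configuration
s3.put_bucket_notification_configuration(
    Bucket='amodhbh-media-uploads',
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [
            {
                'Id': 'trigger-dispatcher-on-upload',
                'LambdaFunctionArn': dispatcher_arn,
                'Events': ['s3:ObjectCreated:*'],
                'Filter': {
                    'Key': {'FilterRules': [{'Name': 'prefix', 'Value': 'uploads/'}]}
                }
            }
        ]
    }
)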

3.3 Verify Event Notification

  1. In the S3 bucket Properties tab
  2. Under Event notifications, you should see:
    • Name: trigger-dispatcher-on-upload
    • Prefix: uploads/
    • Events: All object create events
    • Destination: dispatcher-lambda

Step 4: End-to-End Testing

Now we’ll test the complete pipeline from start to finish!

4.1 Prepare a Test Image

  • Find a sample image on your computer (JPG or PNG)
  • Keep it relatively small for faster testing (< 2 MB)

4.2 Step-by-Step Test

Step 1: Request Upload URL

curl -X POST https://YOUR_API_ID.execute-api.ap-south-1.amazonaws.com/uploads \
  -H "Content-Type: application/json" \
  -d '{"filename": "beach-sunset.jpg"}'

Expected response:

{
  "jobId": "abc-123-def-456",
  "uploadUrl": "https://amodhbh-media-uploads.s3.ap-south-1.amazonaws.com/...",
  "expiresIn": 300
}

Save the jobId and uploadUrl from the response!


Step 2: Upload Image to S3

curl -X PUT "PASTE_UPLOAD_URL_HERE" \
  -H "Content-Type: image/jpeg" \
  --data-binary @/path/to/your/beach-sunset.jpg

Expected response: Empty response with 200 OK status


Step 3: Immediately Check Job Status (Should be PENDING or PROCESSING)

curl https://YOUR_API_ID.execute-api.ap-south-1.amazonaws.com/jobs/YOUR_JOB_ID

Expected response:

{
  "jobId": "abc-123-def-456",
  "status": "PENDING",
  "createdAt": "2025-10-17T12:00:00.000000",
  "updatedAt": "2025-10-17T12:00:00.000000"
}

Step 4: Wait 10-30 seconds, then check status again

curl https://YOUR_API_ID.execute-api.ap-south-1.amazonaws.com/jobs/YOUR_JOB_ID

Expected response:

{
  "jobId": "abc-123-def-456",
  "status": "COMPLETED",
  "createdAt": "2025-10-17T12:00:00.000000",
  "updatedAt": "2025-10-17T12:00:15.000000",
  "downloadUrl": "https://amodhbh-media-processed.s3.ap-south-1.amazonaws.com/..."
}

Step 5: Download the Processed Image

Copy the downloadUrl from the response and paste it in your browser, or use curl:

curl -o processed-image.jpg "PASTE_DOWNLOAD_URL_HERE"

Verify: Open processed-image.jpg and confirm it has the watermark “© Amodhbh Media” in the bottom-right corner!
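
The manual steps above can also be rolled into a single script. Here is a sketch in Python using the requests library (an assumption -- install it with pip if needed), with API_URL and IMAGE_PATH as placeholders you fill in; the polling interval and timeout are illustrative, not tuned:

import time
import requests

API_URL = 'https://YOUR_API_ID.execute-api.ap-south-1.amazonaws.com'  # fill in
IMAGE_PATH = 'beach-sunset.jpg'                                       # fill in

# 1. Request an upload URL and job ID
resp = requests.post(f'{API_URL}/uploads', json={'filename': 'beach-sunset.jpg'})
resp.raise_for_status()
job = resp.json()
print('Job:', job['jobId'])

# 2. Upload the image to S3 using the pre-signed URL
with open(IMAGE_PATH, 'rb') as f:
    requests.put(job['uploadUrl'], data=f,
                 headers={'Content-Type': 'image/jpeg'}).raise_for_status()

# 3. Poll the status endpoint until the job finishes (up to ~2 minutes)
for _ in range(24):
    status = requests.get(f"{API_URL}/jobs/{job['jobId']}").json()
    print('Status:', status['status'])
    if status['status'] in ('COMPLETED', 'FAILED'):
        break
    time.sleep(5)

# 4. Download the watermarked result if processing succeeded
if status.get('downloadUrl'):
    with open('processed-image.jpg', 'wb') as out:
        out.write(requests.get(status['downloadUrl']).content)
    print('Saved processed-image.jpg')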


Production Monitoring & Optimization

Step 1: Monitor and Debug

1.1 View CloudWatch Logs

For dispatcher-lambda:

  1. Lambda console → dispatcher-lambda → Monitor tab
  2. Click View CloudWatch logs
  3. Check recent log streams for S3 events being received

For worker-lambda:

  1. Lambda console → worker-lambda → Monitor tab
  2. Click View CloudWatch logs
  3. Verify image processing logs

1.2 Check SQS Queue

  1. Go to SQS console
  2. Click on media-processing-queue
  3. Messages available should be 0 (all processed)
  4. If messages are stuck, check Messages in flight or the DLQ
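
For a quick look at queue depth without the console, a boto3 sketch (queue and DLQ names from Part 1):

import boto3

sqs = boto3.client('sqs', region_name='ap-south-1')

for queue_name in ('media-processing-queue', 'media-processing-dlq'):
    url = sqs.get_queue_url(QueueName=queue_name)['QueueUrl']
    attrs = sqs.get_queue_attributes(
        QueueUrl=url,
        AttributeNames=['ApproximateNumberOfMessages', 'ApproximateNumberOfMessagesNotVisible']
    )['Attributes']
    # "NotVisible" roughly corresponds to "Messages in flight" in the console
    print(queue_name, attrs)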

1.3 Check DynamoDB Table

  1. Go to DynamoDB console
  2. Click on media-processing-jobs
  3. Click Explore table items
  4. Find your job by jobId and verify status is COMPLETED
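
The same check can be done from code. A small sketch (substitute a real jobId from your upload response):

import boto3

table = boto3.resource('dynamodb', region_name='ap-south-1').Table('media-processing-jobs')

# Substitute a real jobId from your upload response
item = table.get_item(Key={'jobId': 'YOUR_JOB_ID'}).get('Item')
print(item['status'] if item else 'Job not found')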

1.4 Check S3 Buckets

Uploads bucket:

  1. Go to S3 → amodhbh-media-uploads
  2. Navigate to uploads/{job_id}/
  3. You should see your original uploaded image

Processed bucket:

  1. Go to S3 → amodhbh-media-processed
  2. Navigate to processed/uploads/{job_id}/
  3. You should see the watermarked image

Step 2: Performance and Scalability

2.1 Current Configuration

  • Maximum concurrent Lambda executions: 1000 (default account limit)
  • SQS visibility timeout: 5 minutes
  • Lambda timeout: 2 minutes
  • Expected processing time per image: 5-15 seconds

2.2 Handling High Load

If you upload 1000 images simultaneously:

  1. All 1000 uploads trigger dispatcher Lambda (1000 concurrent executions)
  2. All 1000 messages go to SQS instantly
  3. Worker Lambda auto-scales up to process messages in parallel
  4. Processing completes in ~15-30 seconds (depending on image sizes)

2.3 Production Enhancements

Security:

  • Enable S3 Server-Side Encryption with KMS
  • Implement API Gateway authentication (Cognito, API keys)
  • Use VPC endpoints for additional security

Monitoring:

  • Set up CloudWatch alarms for failures
  • Enable AWS X-Ray for distributed tracing
  • Create custom dashboards for key metrics

Performance:

  • Use provisioned concurrency for consistent performance
  • Implement connection pooling for AWS services
  • Add caching for frequently accessed data

Cost Analysis & Scaling

Current Cost Breakdown

Processing 10,000 images per month:

  • API Gateway: $0.01 (10K requests)
  • Lambda requests: $0.00 (within free tier)
  • Lambda compute: ~$0.50 (depends on processing time)
  • S3 storage: ~$0.25 (10GB stored)
  • DynamoDB: ~$0.01 (20K read/write operations)
  • SQS: $0.00 (within free tier)
  • Total: ~$0.77/month

Cost Optimization Strategies

  1. S3 Lifecycle Policies:

    • Automatically delete old processed images
    • Move to cheaper storage classes
  2. DynamoDB Optimization:

    • Use on-demand billing for unpredictable workloads
    • Implement TTL for automatic cleanup (see the sketch after this list)
  3. Lambda Optimization:

    • Right-size memory allocation
    • Use provisioned concurrency only when needed
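
To act on the TTL suggestion above, you enable TTL on the table once and have request-upload-lambda write a numeric expiry attribute on each job record. A sketch, assuming an attribute named expiresAt (hypothetical -- the table from Part 1 does not define one):

import time
import boto3

dynamodb = boto3.client('dynamodb', region_name='ap-south-1')

# One-time setup: tell DynamoDB which attribute holds the expiry timestamp (epoch seconds)
dynamodb.update_time_to_live(
    TableName='media-processing-jobs',
    TimeToLiveSpecification={'Enabled': True, 'AttributeName': 'expiresAt'}
)

# In request-upload-lambda, the put_item call would then also include, for example:
# 'expiresAt': int(time.time()) + 30 * 24 * 60 * 60  # delete the record after ~30 days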

Scaling Considerations

Current Limits:

  • Lambda: 1000 concurrent executions
  • SQS: Standard queues support virtually unlimited throughput (FIFO queues are limited to 300 messages per second, or 3,000 with batching)
  • DynamoDB: On-demand scales automatically

For Higher Load:

  • Request limit increases from AWS
  • Consider using Step Functions for complex workflows
  • Implement batch processing for multiple files

Troubleshooting Guide

Common Issues and Solutions

“Internal Server Error” from API

Cause: Lambda function errors
Solution:

  • Check Lambda function CloudWatch Logs for errors
  • Verify environment variables are set correctly
  • Ensure IAM role has necessary permissions

“403 Forbidden” when uploading to pre-signed URL

Cause: URL expired or incorrect headers
Solution:

  • Ensure Content-Type header matches (image/jpeg)
  • URL expires in 5 minutes - generate a new one if expired
  • Check S3 bucket permissions (should not block the upload)

“Job not found” when checking status

Cause: Incorrect job ID or DynamoDB issue
Solution:

  • Verify you’re using the correct jobId from the upload response
  • Check DynamoDB table to see if the item was created
  • Check CloudWatch Logs for request-upload-lambda for errors

No processing happens after upload

Cause: S3 event notification or dispatcher Lambda issue
Solution:

  1. Check dispatcher Lambda logs:

    • Go to CloudWatch Logs for dispatcher-lambda
    • Verify the function was triggered by S3
    • Look for any errors
  2. Check S3 event notification:

    • Verify it’s configured correctly
    • Make sure the prefix matches your upload path (uploads/)
  3. Check SQS queue:

    • Are messages appearing in the queue?
    • If messages are stuck, check the DLQ

Images not getting processed

Cause: Worker Lambda or SQS trigger issue
Solution:

  • Check worker Lambda logs in CloudWatch
  • Verify the SQS trigger is enabled on worker-lambda
  • Check IAM permissions for worker Lambda

Watermark not appearing

Cause: Pillow library or image processing issue
Solution:

  • Check worker Lambda logs for Pillow import errors
  • Verify the Pillow layer is attached to worker-lambda
  • Increase worker Lambda memory if needed

Job status stuck on PENDING

Cause: Worker Lambda failure
Solution:

  • Worker Lambda may have failed - check CloudWatch Logs
  • Check SQS DLQ for failed messages
  • Verify DynamoDB table permissions

Verification Checklist

Before considering the pipeline complete:

IAM Roles

  • Role media-api-lambda-role exists with correct permissions
  • Role dispatcher-lambda-role exists with SQS permissions
  • All roles have least privilege access

Lambda Functions

  • request-upload-lambda exists and is deployed
  • get-job-status-lambda exists and is deployed
  • dispatcher-lambda exists and is deployed
  • All functions have correct environment variables
  • All functions have appropriate IAM roles

API Gateway

  • HTTP API media-processing-api exists
  • Route POST /uploads is configured and working
  • Route GET /jobs/{jobId} is configured and working
  • Both routes are integrated with their respective Lambda functions

S3 Event Notification

  • Event notification trigger-dispatcher-on-upload exists on uploads bucket
  • Configured for uploads/ prefix
  • Set to trigger on all object create events
  • Destination is dispatcher-lambda

End-to-End Test

  • Can request upload URL via API
  • Can upload image to S3 using pre-signed URL
  • Image upload triggers dispatcher Lambda (check CloudWatch Logs)
  • Message is sent to SQS queue
  • Worker Lambda processes the image
  • Job status updates to COMPLETED in DynamoDB
  • Processed image appears in processed bucket with watermark
  • Can retrieve downloadUrl via status API
  • Downloaded image has watermark

Next Steps

Congratulations! You’ve successfully built a complete serverless media processing pipeline.

What’s Next:

  • Part 3: Cleanup guide and resource management
  • Production Deployment: Advanced monitoring and security
  • Enhancements: Additional image processing features

Optional Enhancements:

  • Add support for different image formats (PNG, GIF, WebP)
  • Implement multiple processing options (resize, filters, effects)
  • Add authentication to API Gateway (Cognito, API keys)
  • Set up CloudWatch alarms for failures
  • Add SNS notifications when jobs complete
  • Implement batch processing for multiple files

Production Considerations:

  • Enable S3 versioning for data durability
  • Set up S3 lifecycle policies to automatically delete old files
  • Enable Lambda reserved concurrency to control costs
  • Add CloudWatch dashboards for monitoring
  • Implement proper error handling and retry logic
  • Add request validation and input sanitization

Quick Reference

Complete Architecture

User Application
    ↓ (1) POST /uploads
API Gateway → request-upload-lambda
    ↓ (2) Returns uploadUrl + jobId
    ↓ (creates DynamoDB record: PENDING)
    ↓
User uploads image to S3
    ↓ (3) S3 Event
dispatcher-lambda
    ↓ (4) Sends message
SQS Queue (media-processing-queue)
    ↓ (5) Triggers
worker-lambda
    ↓ (6) Downloads, processes, uploads
    ↓ (updates DynamoDB: COMPLETED)
    ↓
User Application
    ↓ (7) GET /jobs/{jobId}
API Gateway → get-job-status-lambda
    ↓ (8) Returns status + downloadUrl

Resource Summary

Resource Type | Name                         | Purpose
IAM Role      | media-api-lambda-role        | Execution role for API Lambda functions
Lambda        | request-upload-lambda        | Generates upload URLs and creates job records
Lambda        | get-job-status-lambda        | Retrieves job status from DynamoDB
Lambda        | dispatcher-lambda            | Forwards S3 events to SQS queue
API Gateway   | media-processing-api         | HTTP API for client applications
API Route     | POST /uploads                | Request upload URL endpoint
API Route     | GET /jobs/{jobId}            | Check job status endpoint
S3 Event      | trigger-dispatcher-on-upload | Triggers dispatcher on file upload

API Endpoints

POST   https://{api-id}.execute-api.ap-south-1.amazonaws.com/uploads
GET    https://{api-id}.execute-api.ap-south-1.amazonaws.com/jobs/{jobId}

Summary

In this comprehensive guide, we’ve completed the serverless media processing pipeline:

API Gateway Setup:

  • HTTP API with secure endpoints
  • CORS configuration for web applications
  • Integration with Lambda functions

Lambda Functions:

  • Upload URL generation with pre-signed S3 URLs
  • Job status checking with DynamoDB integration
  • S3 event-driven dispatcher for automatic processing

End-to-End Testing:

  • Complete workflow testing
  • Monitoring and debugging tools
  • Performance optimization strategies

Production Readiness:

  • Comprehensive error handling
  • Cost optimization strategies
  • Scalability considerations

Key Benefits:

  • Event-driven: Automatic processing triggered by S3 uploads
  • Scalable: Handles thousands of concurrent requests
  • Cost-effective: Pay only for what you use
  • Reliable: Built-in error handling and retry logic
  • Maintainable: Clean, well-documented code

Ready for Part 3? We’ll cover cleanup procedures and resource management to ensure you can safely delete all resources when done testing!


This is Part 2 of a 3-part series on building a production-ready serverless media processing pipeline. Stay tuned for Part 3, where we’ll cover cleanup procedures and resource management!
