Learn how to migrate databases using AWS Database Migration Service (DMS). This comprehensive guide covers DMS replication instances, endpoint configuration, full load migration, and data validation for production-grade database migration.
In this comprehensive guide, we’ll use AWS Database Migration Service (DMS) to migrate the PostgreSQL database from the source environment to the production Multi-AZ RDS instance in the target environment. We’ll perform a Full Load migration, transferring all tables and data from the legacy database to the production database with complete data integrity validation.
Estimated Time: 60-90 minutes (including replication instance launch and data migration)
Region: ap-south-1 (Mumbai)
Difficulty: Intermediate
AWS Database Migration Service (DMS) is AWS's managed, enterprise-grade service for migrating databases, offering:
⚠️ Resources that will incur charges:
Estimated DMS Cost: ~$1 per day during active migration
💡 Cost Tip: DMS is pay-as-you-go. Delete the replication instance immediately after migration completes to stop charges.
┌────────────────────────────────────────────────────────────────┐
│ Source Database (on-prem-vpc)                                  │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ RDS: legacy-db (PostgreSQL)                                │ │
│ │ - Database: legacydb                                       │ │
│ │ - Tables: users, products, orders                          │ │
│ │ - Private subnet: 10.0.2.0/24                              │ │
│ └────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────┘
                                │
                                │ VPC Peering
                                ▼
┌────────────────────────────────────────────────────────────────┐
│ DMS Replication Instance (aws-target-vpc)                      │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ EC2-based DMS Instance (dms.t3.micro)                      │ │
│ │ - Connects to source database                              │ │
│ │ - Reads all table data                                     │ │
│ │ - Transforms data (if needed)                              │ │
│ │ - Writes to target database                                │ │
│ │ - Public subnet: 10.1.1.0/24                               │ │
│ └────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────┘
                                │
                                │ Private Connection
                                ▼
┌────────────────────────────────────────────────────────────────┐
│ Target Database (aws-target-vpc)                               │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ RDS: production-db (PostgreSQL Multi-AZ)                   │ │
│ │ - Database: productiondb                                   │ │
│ │ - Tables: (will receive migrated data)                     │ │
│ │ - Private subnets: 10.1.21.0/24, 10.1.22.0/24              │ │
│ └────────────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────┘
1. DMS Replication Instance: EC2-based instance that runs the migration software
2. Source Endpoint: Configuration that tells DMS how to connect to the source database
3. Target Endpoint: Configuration that tells DMS how to connect to the target database
4. Migration Task: The actual job that reads from source and writes to target
5. Full Load: One-time copy of all existing data (what we’ll use)
6. CDC (Change Data Capture): Continuous replication of ongoing changes (for zero-downtime migrations)
7. Table Mappings: Rules that define which tables/schemas to migrate
Before starting DMS, let’s ensure both databases are accessible.
ssh -i on-prem-web-key.pem ec2-user@YOUR_SOURCE_PUBLIC_IP
psql --host=YOUR_SOURCE_RDS_ENDPOINT --port=5432 --username=postgres --dbname=legacydb -c "\dt"
Expected output: List of tables (users, products, orders)
psql --host=YOUR_TARGET_RDS_ENDPOINT --port=5432 --username=postgres --dbname=productiondb -c "\dt"
Expected output: Did not find any relations. (empty database)
Document the baseline data to verify after migration:
# Count users
psql --host=YOUR_SOURCE_RDS_ENDPOINT --port=5432 --username=postgres --dbname=legacydb -c "SELECT COUNT(*) FROM users;"
# Count products
psql --host=YOUR_SOURCE_RDS_ENDPOINT --port=5432 --username=postgres --dbname=legacydb -c "SELECT COUNT(*) FROM products;"
# Count orders
psql --host=YOUR_SOURCE_RDS_ENDPOINT --port=5432 --username=postgres --dbname=legacydb -c "SELECT COUNT(*) FROM orders;"
📝 Note these counts:
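If you want to capture these baselines in one pass (and keep them for the post-migration comparison), a small shell loop works. This is a minimal sketch that assumes the same connection details as above and that the password is supplied via the PGPASSWORD environment variable.

```bash
# Capture baseline row counts from the source database into a file
# for comparison after the migration (PGPASSWORD assumed to be set).
export SRC_HOST=YOUR_SOURCE_RDS_ENDPOINT

for table in users products orders; do
  count=$(psql --host="$SRC_HOST" --port=5432 --username=postgres \
               --dbname=legacydb -t -A -c "SELECT COUNT(*) FROM ${table};")
  echo "${table}: ${count}" | tee -a source-baseline-counts.txt
done
```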
The replication instance is the compute resource that performs the actual data migration.
Instance configuration:
Name: legacy-to-production-replication
Description: Replication instance for database migration from legacy-db to production-db

High Availability:
Engine version:
Allocated storage:
Network and connectivity:
VPC: aws-target-vpc
Replication subnet group: dms-target-subnet-group (created in Phase 2)

Advanced security and network configuration:
VPC security groups: target-dms-sg

Maintenance:
Tags (optional):
Key: Project, Value: Database-Migration
Key: Environment, Value: Migration

Once available, click on the instance name and note:
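While the instance is provisioning, note that the same configuration can also be created from the AWS CLI. This is a hedged sketch rather than the exact console flow; the security group ID is a placeholder for target-dms-sg, and default engine version and storage are assumed.

```bash
# Create the DMS replication instance (values mirror the console settings above;
# sg-0123456789abcdef0 is a placeholder for the target-dms-sg security group ID)
aws dms create-replication-instance \
  --replication-instance-identifier legacy-to-production-replication \
  --replication-instance-class dms.t3.micro \
  --replication-subnet-group-identifier dms-target-subnet-group \
  --vpc-security-group-ids sg-0123456789abcdef0 \
  --no-multi-az \
  --region ap-south-1

# Wait until the instance status becomes "available"
aws dms describe-replication-instances \
  --query "ReplicationInstances[].{Id:ReplicationInstanceIdentifier,Status:ReplicationInstanceStatus}" \
  --region ap-south-1
```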
The source endpoint tells DMS how to connect to the legacy database.
Endpoint identifier: legacy-db-source-endpoint
Server name: Enter your source RDS endpoint (e.g., legacy-db.abc123xyz.ap-south-1.rds.amazonaws.com)
Port: 5432
SSL mode: require
Database name: legacydb
User name: postgres
Password: Enter the password you created for legacy-db in Phase 1
This verifies DMS can reach the source database through VPC peering.
Test endpoint connection:
VPC: aws-target-vpc
Replication instance: legacy-to-production-replication (automatically selected)

If the test fails, see the troubleshooting section below.
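For reference, the equivalent source endpoint and connection test can be scripted with the AWS CLI. In this sketch the server name, password, and ARNs are placeholders you would copy from your own environment.

```bash
# Create the source endpoint (server name and password are placeholders)
aws dms create-endpoint \
  --endpoint-identifier legacy-db-source-endpoint \
  --endpoint-type source \
  --engine-name postgres \
  --server-name legacy-db.abc123xyz.ap-south-1.rds.amazonaws.com \
  --port 5432 \
  --database-name legacydb \
  --username postgres \
  --password 'YOUR_LEGACY_DB_PASSWORD' \
  --ssl-mode require \
  --region ap-south-1

# Test connectivity from the replication instance to the endpoint
aws dms test-connection \
  --replication-instance-arn arn:aws:dms:ap-south-1:111122223333:rep:EXAMPLE \
  --endpoint-arn arn:aws:dms:ap-south-1:111122223333:endpoint:EXAMPLE \
  --region ap-south-1

# Check the result ("successful" means DMS can reach the database)
aws dms describe-connections --region ap-south-1
```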
The target endpoint tells DMS how to connect to the production database.
Endpoint identifier: production-db-target-endpoint
Server name: Enter your target RDS endpoint (e.g., production-db.xyz789abc.ap-south-1.rds.amazonaws.com)
Port: 5432
SSL mode: require
Database name: productiondb
User name: postgres
Password: Enter the password you created for production-db in Phase 2
Click Endpoint settings to expand.
Add these important settings:
captureDDLs=Y (captures DDL changes)
maxFileSize=512000 (for large objects)
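From the CLI these endpoint settings are typically passed as extra connection attributes. A minimal sketch for the target endpoint (server name and password are placeholders) might look like this:

```bash
# Create the target endpoint with the extra connection attributes shown above
aws dms create-endpoint \
  --endpoint-identifier production-db-target-endpoint \
  --endpoint-type target \
  --engine-name postgres \
  --server-name production-db.xyz789abc.ap-south-1.rds.amazonaws.com \
  --port 5432 \
  --database-name productiondb \
  --username postgres \
  --password 'YOUR_PRODUCTION_DB_PASSWORD' \
  --ssl-mode require \
  --extra-connection-attributes "captureDDLs=Y;maxFileSize=512000" \
  --region ap-south-1
```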
Test endpoint connection:

VPC: aws-target-vpc
Replication instance: legacy-to-production-replication

The migration task is the actual job that moves data from source to target.
Task identifier: migrate-legacy-to-production
Replication instance: Select legacy-to-production-replication
Source database endpoint: Select legacy-db-source-endpoint
Target database endpoint: Select production-db-target-endpoint
Migration type:
Task settings:
Editing mode: Wizard (easier) or JSON (for advanced users)
Target table preparation mode:
Include LOB columns in replication:
Enable validation: ☑ Check this (highly recommended)
Enable CloudWatch logs: ☑ Check this
This defines which tables/schemas to migrate.
Editing mode: Wizard (we’ll use the guided approach)
Click “Add new selection rule”:
Selection rule 1:
Schema name: public
Table name: % (wildcard - all tables)

This will migrate all tables in the public schema (users, products, orders).
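If you switch the editing mode to JSON, the same selection rule is expressed in DMS's table-mapping format. The sketch below writes that JSON to a file and shows how it would be passed when creating the task from the CLI; the endpoint and instance ARNs are placeholders.

```bash
# Table mapping: include every table in the public schema
cat > table-mappings.json <<'EOF'
{
  "rules": [
    {
      "rule-type": "selection",
      "rule-id": "1",
      "rule-name": "include-public-schema",
      "object-locator": {
        "schema-name": "public",
        "table-name": "%"
      },
      "rule-action": "include"
    }
  ]
}
EOF

# Create the full-load migration task (ARNs are placeholders)
aws dms create-replication-task \
  --replication-task-identifier migrate-legacy-to-production \
  --source-endpoint-arn arn:aws:dms:ap-south-1:111122223333:endpoint:SOURCE \
  --target-endpoint-arn arn:aws:dms:ap-south-1:111122223333:endpoint:TARGET \
  --replication-instance-arn arn:aws:dms:ap-south-1:111122223333:rep:EXAMPLE \
  --migration-type full-load \
  --table-mappings file://table-mappings.json \
  --region ap-south-1
```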
Preview of what will be migrated: the preview should show the three tables in the public schema (users, products, orders).
Migration task startup configuration:
After you create the task, migrate-legacy-to-production will appear in the task list and begin the full load.

Task Status Progression:
Scroll down to the Table statistics tab:
You should see a table showing:
| Schema | Table Name | Full Load Rows | Inserts | Deletes | Updates | DDLs | Validation State |
|---|---|---|---|---|---|---|---|
| public | users | 4 | 0 | 0 | 0 | 0 | Validated |
| public | products | 5 | 0 | 0 | 0 | 0 | Validated |
| public | orders | 5 | 0 | 0 | 0 | 0 | Validated |
What these columns mean:
The progress updates every few seconds. For our small dataset:
Starting migration task
Loading table public.users
Finished loading table public.users, loaded 4 rows
All tables loaded successfully

Status will change to “Load complete” when finished.
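You can also watch the same progress from the CLI if you prefer; in this sketch the task ARN is a placeholder copied from the DMS console.

```bash
# Check the overall task status and full-load progress percentage
aws dms describe-replication-tasks \
  --filters Name=replication-task-id,Values=migrate-legacy-to-production \
  --query "ReplicationTasks[].{Status:Status,Progress:ReplicationTaskStats.FullLoadProgressPercent}" \
  --region ap-south-1

# Per-table row counts and validation state (mirrors the Table statistics tab)
aws dms describe-table-statistics \
  --replication-task-arn arn:aws:dms:ap-south-1:111122223333:task:EXAMPLE \
  --region ap-south-1
```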
✅ Success indicators:
Now let’s verify the data was migrated correctly to the production database.
SSH into your source web server (or bastion host):
ssh -i on-prem-web-key.pem ec2-user@YOUR_SOURCE_OR_BASTION_IP
Connect to the production database:
psql --host=YOUR_TARGET_RDS_ENDPOINT --port=5432 --username=postgres --dbname=productiondb
-- List all tables
\dt
-- Expected output: users, products, orders
-- Count users
SELECT COUNT(*) FROM users;
-- Expected: 4
-- Count products
SELECT COUNT(*) FROM products;
-- Expected: 5
-- Count orders
SELECT COUNT(*) FROM orders;
-- Expected: 5
-- Check users table
SELECT * FROM users ORDER BY id;
-- Should see: amodh, john_doe, jane_smith, admin
-- Check products table
SELECT product_name, price FROM products ORDER BY id;
-- Should see: Laptop, Mouse, Keyboard, Monitor, Headphones
-- Check orders with JOIN
SELECT
o.id,
u.username,
p.product_name,
o.quantity
FROM orders o
JOIN users u ON o.user_id = u.id
JOIN products p ON o.product_id = p.id
ORDER BY o.id;
-- Should see all 5 orders with correct joins
-- Check primary keys
SELECT
tc.table_name,
kcu.column_name
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
ON tc.constraint_name = kcu.constraint_name
WHERE tc.constraint_type = 'PRIMARY KEY'
ORDER BY tc.table_name;
-- Check foreign keys
SELECT
tc.table_name,
kcu.column_name,
ccu.table_name AS foreign_table_name,
ccu.column_name AS foreign_column_name
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
ON tc.constraint_name = kcu.constraint_name
JOIN information_schema.constraint_column_usage ccu
ON ccu.constraint_name = tc.constraint_name
WHERE tc.constraint_type = 'FOREIGN KEY'
ORDER BY tc.table_name;
-- Exit psql
\q
✅ If all data matches, the migration was successful!
Now that the database is migrated, update the application to use the new production database.
You need to connect to the migrated server (from Phase 3), not the source server.
Use one of these methods:
SSH via the bastion: ssh to the bastion host, then ssh to 10.1.11.x (the migrated server's private IP)

# Backup the original configuration
sudo cp /var/www/cgi-bin/db_app.py /var/www/cgi-bin/db_app.py.backup
# Update the database endpoint
sudo sed -i 's|DB_HOST = ".*"|DB_HOST = "YOUR_PRODUCTION_DB_ENDPOINT"|g' /var/www/cgi-bin/db_app.py
# Update the database name
sudo sed -i 's|DB_NAME = "legacydb"|DB_NAME = "productiondb"|g' /var/www/cgi-bin/db_app.py
# Update the password if it changed
sudo sed -i 's|DB_PASS = ".*"|DB_PASS = "YOUR_PRODUCTION_DB_PASSWORD"|g' /var/www/cgi-bin/db_app.py
Or edit manually:
sudo vi /var/www/cgi-bin/db_app.py
Change these lines:
DB_HOST = "production-db.xyz789abc.ap-south-1.rds.amazonaws.com"
DB_NAME = "productiondb"
DB_USER = "postgres"
DB_PASS = "ProductionDB2024!" # Your production password
sudo systemctl restart httpd
# Test locally on the server
curl http://localhost/cgi-bin/db_app.py
Expected: HTML page showing all users, products, and orders from the production database
If you created an ALB in Phase 3:
# From your local machine
curl http://YOUR_ALB_DNS_NAME/cgi-bin/db_app.py
# Or open in browser
http://YOUR_ALB_DNS_NAME/cgi-bin/db_app.py
✅ Application is now fully migrated and using the production database!
After a large data load, update database statistics for query optimization:
psql --host=YOUR_PRODUCTION_DB_ENDPOINT --port=5432 --username=postgres --dbname=productiondb
-- Analyze all tables to update statistics
ANALYZE users;
ANALYZE products;
ANALYZE orders;
-- Or analyze all tables at once
ANALYZE;
-- Vacuum to reclaim space (optional)
VACUUM ANALYZE;
\q
-- Connect to production database
psql --host=YOUR_PRODUCTION_DB_ENDPOINT --port=5432 --username=postgres --dbname=productiondb
-- Create indexes for common queries
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_products_name ON products(product_name);
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_date ON orders(order_date);
-- Verify indexes
\di
\q
The production database should already have automated backups (configured in Phase 2), but verify:
Database: production-db
Snapshot name: production-db-post-migration-2025-10-17
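The same checks can be done from the CLI; this is a hedged sketch that assumes the instance identifier is production-db.

```bash
# Confirm automated backups are enabled (retention period > 0 days)
aws rds describe-db-instances \
  --db-instance-identifier production-db \
  --query "DBInstances[].BackupRetentionPeriod" \
  --region ap-south-1

# Take a manual post-migration snapshot
aws rds create-db-snapshot \
  --db-instance-identifier production-db \
  --db-snapshot-identifier production-db-post-migration-2025-10-17 \
  --region ap-south-1
```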
Compare query performance between source and target:

On Source Database:
psql --host=SOURCE_ENDPOINT --port=5432 --username=postgres --dbname=legacydb -c "\timing" -c "SELECT COUNT(*) FROM orders o JOIN users u ON o.user_id = u.id JOIN products p ON o.product_id = p.id;"
On Target Database:
psql --host=TARGET_ENDPOINT --port=5432 --username=postgres --dbname=productiondb -c "\timing" -c "SELECT COUNT(*) FROM orders o JOIN users u ON o.user_id = u.id JOIN products p ON o.product_id = p.id;"
Target query times should be comparable to or faster than the source (note that Multi-AZ improves availability rather than query performance).
# Install Apache Bench on your local machine or bastion
# For Amazon Linux:
sudo yum install -y httpd-tools
# Run load test (100 requests, 10 concurrent)
ab -n 100 -c 10 http://YOUR_ALB_DNS_NAME/cgi-bin/db_app.py
Monitor:
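Typical things to watch during the load test are ALB request counts and latency, plus RDS CPU, connections, and IOPS. If you prefer pulling metrics from the CLI instead of the console, a minimal CloudWatch sketch (timestamps are placeholders) looks like this:

```bash
# Average RDS CPU utilization over the load-test window (adjust the timestamps)
aws cloudwatch get-metric-statistics \
  --namespace AWS/RDS \
  --metric-name CPUUtilization \
  --dimensions Name=DBInstanceIdentifier,Value=production-db \
  --start-time 2025-10-17T10:00:00Z \
  --end-time 2025-10-17T10:30:00Z \
  --period 60 \
  --statistics Average \
  --region ap-south-1
```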
Run a complete data comparison (for critical migrations):
-- On both source and target, run:
SELECT
'users' as table_name,
COUNT(*) as row_count,
SUM(id) as id_sum,
MD5(string_agg(username::text, ',' ORDER BY id)) as username_hash
FROM users
UNION ALL
SELECT
'products',
COUNT(*),
SUM(id),
MD5(string_agg(product_name::text, ',' ORDER BY id))
FROM products
UNION ALL
SELECT
'orders',
COUNT(*),
SUM(id),
MD5(string_agg(quantity::text, ',' ORDER BY id))
FROM orders;
Compare the results between source and target - they should be identical.
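One way to automate that comparison: save the query above to a file, run it against both databases, and diff the output. A minimal sketch, assuming the query is saved as checksums.sql and passwords are supplied via PGPASSWORD:

```bash
# Run the same checksum query against source and target, then compare
psql --host=YOUR_SOURCE_RDS_ENDPOINT --port=5432 --username=postgres \
     --dbname=legacydb -f checksums.sql > source-checksums.txt
psql --host=YOUR_TARGET_RDS_ENDPOINT --port=5432 --username=postgres \
     --dbname=productiondb -f checksums.sql > target-checksums.txt

# No output from diff means the tables match exactly
diff source-checksums.txt target-checksums.txt && echo "Checksums match"
```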
After successful migration, clean up DMS resources to stop charges.
Delete the migration task: migrate-legacy-to-production
Delete the source endpoint: legacy-db-source-endpoint
Delete the target endpoint: production-db-target-endpoint

⚠️ This is the most important cost-saving step!
Delete the replication instance: legacy-to-production-replication

💰 After deletion, you stop paying for the DMS replication instance!
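The same cleanup can be scripted with the AWS CLI; in this sketch the ARNs are placeholders copied from the DMS console, and each deletion should finish before you start the next.

```bash
# Delete DMS resources in order: task -> endpoints -> replication instance
aws dms delete-replication-task \
  --replication-task-arn arn:aws:dms:ap-south-1:111122223333:task:EXAMPLE

aws dms delete-endpoint \
  --endpoint-arn arn:aws:dms:ap-south-1:111122223333:endpoint:SOURCE

aws dms delete-endpoint \
  --endpoint-arn arn:aws:dms:ap-south-1:111122223333:endpoint:TARGET

aws dms delete-replication-instance \
  --replication-instance-arn arn:aws:dms:ap-south-1:111122223333:rep:EXAMPLE
```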
If you plan to run continuous replication (CDC):
📝 Important values for cleanup:
| Resource | Value | Status |
|---|---|---|
| DMS Replication Instance | legacy-to-production-replication | DELETE after migration |
| Source Endpoint | legacy-db-source-endpoint | DELETE after migration |
| Target Endpoint | production-db-target-endpoint | DELETE after migration |
| Migration Task | migrate-legacy-to-production | DELETE after migration |
| Production Database | production-db.xxx.rds.amazonaws.com | KEEP - this is your production DB |
Solution:
Solution:
Solution:
Verify the tables with \dt in psql

Solution:
Solution:
Solution:
Increase ParallelLoadThreads to 4-8
Tune PostgreSQL memory settings such as shared_buffers and work_mem

✅ Phase 4 Complete!
Your database has been successfully migrated from the legacy single-AZ database to a production-grade Multi-AZ RDS instance!
What you’ve accomplished:
Current state:
Proceed to: CLEANUP.md to safely remove all source resources and temporary infrastructure.
# Connect to production database
psql --host=PRODUCTION_ENDPOINT --port=5432 --username=postgres --dbname=productiondb
# Quick data verification
psql --host=PRODUCTION_ENDPOINT --port=5432 --username=postgres --dbname=productiondb -c "SELECT 'users' as tbl, COUNT(*) FROM users UNION ALL SELECT 'products', COUNT(*) FROM products UNION ALL SELECT 'orders', COUNT(*) FROM orders;"
# Test application
curl http://YOUR_ALB_DNS_NAME/cgi-bin/db_app.py
# Monitor RDS performance
# Go to RDS Console → production-db → Monitoring tab
# View DMS logs
# Go to CloudWatch → Log groups → /aws/dms/tasks/...
In this comprehensive guide, we’ve successfully migrated a production PostgreSQL database using AWS Database Migration Service (DMS), completing the final phase of our cloud migration journey. This hands-on approach demonstrates enterprise-grade database migration strategies and best practices.
What we’ve accomplished:
Key Learning Outcomes:
This completes our comprehensive AWS migration series! You now have a fully migrated, production-grade application running in AWS with both server and database successfully migrated using modern AWS migration services.
This is Part 4 of a comprehensive AWS migration series, and the core migration journey is now complete! Continue to Part 5, where we’ll clean up the source environment and temporary infrastructure.