Alerts: Email/SMS for downtime, high CPU, free disk space < 20%
Logs: Centralized syslog for all servers
Sample SOP: Morning Health Check (SOP-101)
Purpose: Verify all customer databases are healthy before the business day starts
Time Required: 10-15 minutes
Frequency: Every weekday (M-F, 8am ET)
Checklist
Check Betterstack dashboard for any downtime alerts
SSH into each VPS and run health check script
Verify last night's backups completed successfully
Check disk space on all servers (alert if free space < 20%, i.e. more than 80% used)
Review PostgreSQL logs for errors
Check support inbox for any overnight tickets
Document any issues in operations log
Expected Output
All 3 VPS servers responding
15/15 customer databases online
Backups completed: 15/15 successful
Disk space: 45-80% used (healthy range)
No errors in PostgreSQL logs
0 open support tickets
If any check fails, escalate to the appropriate incident SOP (for example, SOP-201 for a database outage).
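The Expected Output list can be turned into an automated pass/fail check. This is a minimal Python sketch, not the actual health check script (which isn't shown here). `HealthSnapshot` and `failed_checks` are hypothetical names, and the 80% disk ceiling is an assumption taken from the healthy range listed above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HealthSnapshot:
    """Hypothetical summary of one morning health check run."""
    servers_up: int
    servers_total: int
    databases_online: int
    databases_total: int
    backups_ok: int
    backups_total: int
    disk_used_percent: List[float]  # one reading per server
    postgres_log_errors: int
    open_tickets: int


def failed_checks(s: HealthSnapshot, max_disk_used: float = 80.0) -> List[str]:
    """Return one message per check that misses the expected output."""
    failures = []
    if s.servers_up < s.servers_total:
        failures.append(f"servers: {s.servers_up}/{s.servers_total} responding")
    if s.databases_online < s.databases_total:
        failures.append(f"databases: {s.databases_online}/{s.databases_total} online")
    if s.backups_ok < s.backups_total:
        failures.append(f"backups: {s.backups_ok}/{s.backups_total} successful")
    for used in s.disk_used_percent:
        # Assumed ceiling: anything above the healthy range counts as a failure.
        if used > max_disk_used:
            failures.append(f"disk: {used:.0f}% used exceeds {max_disk_used:.0f}%")
    if s.postgres_log_errors > 0:
        failures.append(f"postgres logs: {s.postgres_log_errors} error(s)")
    if s.open_tickets > 0:
        failures.append(f"support: {s.open_tickets} open ticket(s)")
    return failures
```

An empty list means every check passed. Anything else gets documented in the operations log and escalated.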
Sample SOP: Database Down (SOP-201)
Priority: P0 (Critical - Drop everything)
Response Time: Immediate
Triage Steps
Confirm the database is actually down (not a network issue)
Check whether just one database or the entire VPS is affected
Email customer immediately: "Investigating database outage, will update in 15 minutes"
Check PostgreSQL logs for crash reason
Attempt to restart PostgreSQL service
If restart fails, check disk space / memory
If VPS is unresponsive, contact provider support
If database is corrupted, restore from latest backup
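The first two triage steps (real outage vs. network issue, one database vs. the whole VPS) can be sketched as a small decision helper. This is an illustrative Python sketch under assumptions: `tcp_port_open`, `classify_outage`, and the returned labels are hypothetical and not part of the actual tooling.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def classify_outage(ssh_reachable: bool, postgres_port_open: bool,
                    failing_databases: int, total_databases: int) -> str:
    """Map probe results to a (hypothetical) triage label."""
    if not ssh_reachable:
        # Whole VPS unresponsive, or a network issue on the checking side.
        return "vps-unreachable"
    if not postgres_port_open:
        return "postgres-service-down"
    if failing_databases == 0:
        return "no-outage-confirmed"
    if failing_databases < total_databases:
        return "single-database"
    return "all-databases"
```

For example, `classify_outage(True, True, 1, 5)` returns `"single-database"`, which points at the PostgreSQL logs rather than the provider.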
Communication Template
Subject: [INCIDENT] Database outage - investigating
Hi [Customer],
We detected that your database is currently down.
Status: Investigating
Started: [Time]
ETA: Restoring within 30 minutes
We're working on it right now and will update you
every 15 minutes until resolved.
- Jeremy
Resolution Steps
Once the database is back online, verify that connections work
Check data integrity (run sample queries)
Email customer with root cause analysis
Document incident in operations log
If downtime > 1 hour, apply SLA credit (pro-rated refund)
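The SLA credit step comes down to simple arithmetic. The exact formula isn't spelled out in this SOP, so this Python sketch assumes the credit is the monthly fee pro-rated by minutes of downtime over a 30-day month. `sla_credit` and its parameters are hypothetical names.

```python
from decimal import Decimal, ROUND_HALF_UP


def sla_credit(monthly_fee: Decimal, downtime_minutes: int,
               days_in_month: int = 30) -> Decimal:
    """Pro-rated credit for downtime longer than 1 hour (assumed formula)."""
    if downtime_minutes <= 60:
        # At or below the 1-hour threshold: no credit.
        return Decimal("0.00")
    minutes_in_month = Decimal(days_in_month * 24 * 60)
    credit = monthly_fee * Decimal(downtime_minutes) / minutes_in_month
    # Round to whole cents.
    return credit.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Under these assumptions, a full day of downtime on a $30.00 plan gives `sla_credit(Decimal("30.00"), 1440)`, which is a credit of `Decimal("1.00")`.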
Verify VPS capacity exists (or provision new VPS if needed)
Add customer to operations spreadsheet
Database Provisioning (SOP-103)
SSH into appropriate VPS
Run provisioning script: SUDO_PASS='pass' ./provision-customer-database.sh customer_name
Script automatically:
Creates PostgreSQL database and user
Generates strong random password
Adds user to pgBouncer (if installed)
Configures SSL-only access
Saves credentials securely
Add to backup schedule (if not auto-included)
Add to monitoring alerts
Test connection via pgBouncer: psql -h server_ip -p 6432 -U customer_user -d customer_db
Send welcome email with connection details
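The provisioning script itself isn't reproduced here. As a rough illustration of the "generates strong random password" step, this Python sketch uses the standard `secrets` module. `generate_password`, the alphanumeric alphabet, and the length limits are assumptions, not the script's actual behavior.

```python
import secrets
import string

# Letters and digits only, so the password needs no escaping in a
# connection string (an assumption for this sketch).
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 32) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    if length < 16:
        raise ValueError("length must be at least 16")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

`secrets` draws from the operating system's secure random source, which is why it is preferred over `random` for credentials.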
Customer Welcome Email
Subject: Welcome to CostPlusDB - Your database is ready!
Hi [Customer],
Your PostgreSQL database is provisioned and ready to use.
Connection Details:
Host: [hostname].costplusdb.dev
Port: 5432
Database: [dbname]
User: [username]
Password: [secure_password]
SSL: Required (enforce in connection string)
Example connection string:
postgresql://[user]:[pass]@[host]:5432/[db]?sslmode=require
Support: jeremy@intentsolutions.io (4-hour SLA, M-F 9-6 ET)
Monitoring: You'll be added to uptime notifications
Questions? Reply to this email.
Thanks for choosing transparent pricing!
- Jeremy
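The example connection string in the welcome email can be built programmatically. This Python sketch percent-encodes the user, password, and database name so that special characters don't break the URI. `build_connection_string` is a hypothetical helper, and the host in the usage note is a made-up placeholder.

```python
from urllib.parse import quote


def build_connection_string(user: str, password: str, host: str,
                            dbname: str, port: int = 5432) -> str:
    """Build a postgresql:// URI that always sets sslmode=require."""
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(dbname, safe='')}?sslmode=require"
    )
```

For instance, `build_connection_string("acme_user", "p@ss/word", "example-host.costplusdb.dev", "acme_db")` encodes the password as `p%40ss%2Fword`, so the `@` and `/` are not misread as URI delimiters.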
Transparency Commitment
These SOPs are the actual procedures I follow to run CostPlusDB. They're not marketing fluff - they're living documents I update as I learn.
Why publish these?
You deserve to know how your database is managed
Transparency builds trust
Accountability - if I publish my SOPs, I have to follow them