Troubleshooting guides

Hands-on Infrastructure Troubleshooting Guides

Explore hands-on infrastructure troubleshooting guides from an independent engineering team focused on hardened, maintainable and dependable production platforms.

Problem-aware guides

Start with the issue you are seeing

Each guide links to the relevant solution if you need help fixing the problem safely.

Docker Compose Port Already Allocated

Troubleshoot Docker Compose port conflicts, bind errors, duplicate services, reverse proxy issues and host processes already using a port.

Read guide →

SSL Certificate Not Renewing

Troubleshoot SSL renewal failures involving Certbot, DNS challenges, HTTP challenges, NGINX, Cloudflare, firewalls and expired certificates.

Read guide →

AWS S3 Access Denied

Troubleshoot S3 AccessDenied errors involving IAM policies, bucket policies, ACLs, object ownership, KMS, Block Public Access and wrong accounts.

Read guide →

Cloudflare SSL Redirect Loop

Troubleshoot Cloudflare SSL redirect loops caused by Flexible SSL, origin HTTPS redirects, WordPress settings, NGINX config and proxy modes.

Read guide →

Ubuntu Server High Load

Troubleshoot high load on Ubuntu servers by checking CPU, disk I/O, memory, swap, processes, databases, PHP-FPM and Docker workloads.

Read guide →

Backup Restore Failed

Troubleshoot failed restores involving missing database dumps, corrupt archives, wrong permissions, incomplete file backups and untested disaster recovery.

Read guide →

NGINX 502 Bad Gateway with PHP-FPM

Find common causes of NGINX 502 Bad Gateway errors with PHP-FPM, including socket paths, service failures, permissions, timeouts and high load.

Read guide →

MySQL Using Too Much Memory

A hands-on guide to MySQL memory pressure, configuration, connections, buffers, slow queries and production database stability.

Read guide →

Gunicorn Worker Timeout and NGINX 502 Errors

Troubleshoot Gunicorn worker timeouts, NGINX 502 errors, failed systemd services, socket issues, worker counts and Python app deployment problems.

Read guide →

Website Migration Planning DNS Cutover

Plan safe DNS cutovers for website migrations with TTLs, backups, staging, database freeze windows, SSL, email records and rollback checks.

Read guide →

Docker Container Keeps Restarting

A hands-on guide to Docker containers that keep restarting, including logs, restart policies, environment variables, health checks, volumes and dependency failures.

Read guide →

PHP-FPM High Memory Usage

Troubleshoot PHP-FPM high memory usage, worker settings, WordPress plugins, slow scripts, memory limits and server capacity.

Read guide →

Redis Connection Refused

Troubleshoot Redis connection refused errors caused by stopped services, socket/TCP mismatches, bind settings, firewall rules and app configuration.

Read guide →

DNS Changes Not Propagating

Understand DNS propagation, TTLs, nameservers, resolver cache, wrong zones and common causes of delayed or inconsistent DNS results.

Read guide →

AWS Bill Too High

A hands-on AWS cost troubleshooting guide covering EC2, EBS, snapshots, S3, NAT Gateway, CloudWatch, data transfer and idle resources.

Read guide →

WordPress High CPU Usage

Diagnose WordPress high CPU usage from PHP-FPM, plugins, cron, bots, database queries, WordPress infrastructure and caching problems.

Read guide →

MariaDB High CPU Usage

Troubleshoot MariaDB high CPU caused by slow queries, missing indexes, table scans, connection spikes, cron jobs and application workload.

Read guide →

Linux Server Disk Full

Find why a Linux server disk is full, including logs, Docker volumes, databases, backups, deleted open files and inode exhaustion.

Read guide →