Troubleshooting Guide

Common issues and solutions for the Nextcloud Stack deployment.


DNS Issues

Problem: DNS records not resolving

Symptoms:

  • Let's Encrypt fails to issue certificates
  • Caddy shows certificate errors
  • Services inaccessible via domain

Diagnosis:

dig +short cloud.yourdomain.com @8.8.8.8

Solution:

  1. Ensure all required A records point to your server IP
  2. Wait for DNS propagation (up to 48 hours, usually minutes)
  3. Use DNSChecker.org to verify global propagation

Required DNS Records:

cloud.yourdomain.com    → YOUR_SERVER_IP
office.yourdomain.com   → YOUR_SERVER_IP
draw.yourdomain.com     → YOUR_SERVER_IP
notes.yourdomain.com    → YOUR_SERVER_IP
home.yourdomain.com     → YOUR_SERVER_IP
manage.yourdomain.com   → YOUR_SERVER_IP
uptime.yourdomain.com   → YOUR_SERVER_IP
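The record list above can be checked in one pass. The following is a minimal sketch, assuming `dig` is installed; `resolve` and `check_records` are hypothetical helper names, and `yourdomain.com` / `YOUR_SERVER_IP` are the same placeholders used in this guide.

```shell
# Look up the A record for one hostname via a public resolver.
# For CNAME chains, dig prints several lines; the last one is the address.
resolve() {
    dig +short A "$1" @8.8.8.8 | tail -n 1
}

# check_records DOMAIN EXPECTED_IP
# Compares every required subdomain against the expected server IP.
check_records() {
    local domain="$1" expected="$2" sub actual rc=0
    for sub in cloud office draw notes home manage uptime; do
        actual="$(resolve "${sub}.${domain}")"
        if [ "$actual" = "$expected" ]; then
            echo "OK    ${sub}.${domain}"
        else
            echo "FAIL  ${sub}.${domain} (got '${actual:-nothing}', expected ${expected})"
            rc=1
        fi
    done
    return "$rc"
}
```

Run it as `check_records yourdomain.com YOUR_SERVER_IP`; a non-zero exit status means at least one record is missing or points elsewhere.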

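Temporary Workaround: Add an entry to /etc/hosts on your local machine. This only affects that machine; Let's Encrypt still needs public DNS to issue certificates: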

YOUR_SERVER_IP cloud.yourdomain.com

SSL Certificate Problems

Problem: Let's Encrypt rate limit exceeded

Symptoms:

  • Error: "too many certificates already issued"

Solution:

  1. Use Let's Encrypt staging server for testing
  2. Edit Caddyfile (add to global options):
    {
        email {{ user_email }}
        acme_ca https://acme-staging-v02.api.letsencrypt.org/directory
    }
    
  3. Reload Caddy: docker exec -w /etc/caddy caddy caddy reload (assumes the Caddyfile is mounted at /etc/caddy/Caddyfile, the image default)
  4. After testing, remove staging server line
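To confirm which CA Caddy actually used after a reload, the issuer of the served certificate can be inspected. This is a sketch, assuming `openssl` is available on the machine running the check; `cert_issuer` and `is_staging_issuer` are illustrative names, not part of the stack. Let's Encrypt staging certificates carry "(STAGING)" in their issuer name.

```shell
# Print the issuer line of the certificate served for a host.
# Needs openssl and network access to port 443 of the host.
cert_issuer() {
    echo | openssl s_client -connect "$1:443" -servername "$1" 2>/dev/null \
        | openssl x509 -noout -issuer
}

# Succeeds if the given issuer string belongs to the staging environment.
is_staging_issuer() {
    case "$1" in
        *"(STAGING)"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, `is_staging_issuer "$(cert_issuer cloud.yourdomain.com)" && echo "still on staging"` flags a host that has not yet been switched back to the production CA.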

Rate Limits:

  • 50 certificates per registered domain per week
  • 5 duplicate certificates (identical set of hostnames) per week

Problem: Certificate validation failed

Symptoms:

  • "Failed to verify" errors in Caddy logs

Diagnosis:

docker logs caddy

Common Causes:

  1. DNS not pointing to server
  2. Firewall blocking port 80/443
  3. Another service using port 80/443

Solution:

# Check firewall
sudo ufw status

# Check port usage
sudo ss -tlnp | grep -E ':(80|443)[[:space:]]'

# Check DNS
dig +short yourdomain.com

Docker Issues

Problem: Docker daemon won't start

Symptoms:

  • docker ps fails
  • Error: "Cannot connect to Docker daemon"

Diagnosis:

sudo systemctl status docker
sudo journalctl -xu docker

Solution:

sudo systemctl restart docker

Problem: Containers keep restarting

Diagnosis:

cd /opt/nextcloud-stack
docker compose logs [service-name]

Common Causes:

  1. Configuration errors
  2. Port conflicts
  3. Missing dependencies

Solution:

# Check specific container
docker logs next-db
docker logs next
docker logs caddy

# Restart specific service
docker compose restart next
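To see which containers are looping, Docker's `RestartCount` field can be read for every container and filtered. A small sketch; `restart_offenders` is a hypothetical helper that only post-processes the text fed to it.

```shell
# Reads "name restart_count" lines on stdin and prints the containers
# that have restarted at least once, without the leading "/" in the name.
restart_offenders() {
    awk '$2 + 0 > 0 { sub(/^\//, "", $1); print $1, $2 }'
}

# Usage (requires Docker):
#   docker inspect --format '{{.Name}} {{.RestartCount}}' $(docker ps -aq) | restart_offenders
```

Containers that appear in the output are the ones whose logs are worth reading first.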

LXC Container Issues

Problem: Docker fails to start in LXC

Symptoms:

  • Error: "cgroups: cgroup mountpoint does not exist"
  • Docker daemon fails to start

Diagnosis:

systemd-detect-virt  # Should show "lxc"

Solution on LXC Host:

# Set security nesting
lxc config set CONTAINER_NAME security.nesting true

# May also need privileged mode (weakens isolation; try nesting alone first)
lxc config set CONTAINER_NAME security.privileged true

# Restart container
lxc restart CONTAINER_NAME
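To confirm the settings took effect, `lxc config get` can read them back on the host. A minimal sketch; `show_lxc_flags` is a hypothetical helper name and `CONTAINER_NAME` is the same placeholder as above.

```shell
# show_lxc_flags CONTAINER
# Prints the two security keys used above, or "unset" when a key has no value.
show_lxc_flags() {
    local container="$1" key value
    for key in security.nesting security.privileged; do
        value="$(lxc config get "$container" "$key")"
        printf '%s=%s\n' "$key" "${value:-unset}"
    done
}

# Usage: show_lxc_flags CONTAINER_NAME
```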

Inside LXC Container:

# Verify cgroups
mount | grep cgroup

# Check Docker status
sudo systemctl status docker

Problem: AppArmor denials in LXC

Solution on LXC Host:

lxc config set CONTAINER_NAME raw.lxc "lxc.apparmor.profile=unconfined"
lxc restart CONTAINER_NAME

Nextcloud Issues

Problem: Nextcloud stuck in maintenance mode

Symptoms:

  • Web interface shows "System in maintenance mode"

Solution:

docker exec -u www-data next php occ maintenance:mode --off

Problem: Trusted domain error

Symptoms:

  • "Access through untrusted domain" error

Solution:

docker exec -u www-data next php occ config:system:set trusted_domains 1 --value=cloud.yourdomain.com
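Index 1 in the command above overwrites whatever is already stored at that position. A hedged sketch for picking the next free index instead, assuming the existing entries are numbered contiguously from 0 and that `occ config:system:get trusted_domains` prints one domain per line; `next_trusted_index` is a hypothetical helper.

```shell
# Counts the non-empty lines on stdin; with entries numbered 0..n-1,
# that count is the next unused index.
next_trusted_index() {
    grep -c . || true
}

# Usage (requires the running stack):
#   idx="$(docker exec -u www-data next php occ config:system:get trusted_domains | next_trusted_index)"
#   docker exec -u www-data next php occ config:system:set trusted_domains "$idx" --value=cloud.yourdomain.com
```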

Problem: Redis connection failed

Diagnosis:

docker logs next-redis
docker exec next-redis redis-cli ping

Solution:

# Reconfigure Redis in Nextcloud
docker exec -u www-data next php occ config:system:set redis host --value=next-redis
docker exec -u www-data next php occ config:system:set redis port --value=6379 --type=integer

Problem: File uploads fail

Symptoms:

  • Large files won't upload
  • Error 413 (Payload Too Large)

Solution: The Caddyfile is already configured for 10 GB uploads. If large uploads still fail, check the PHP limits inside the Nextcloud container:

docker exec next php -i | grep -E 'upload_max_filesize|post_max_size'

Problem: OnlyOffice integration not working

Solution:

# Install OnlyOffice app
docker exec -u www-data next php occ app:install onlyoffice

# Configure document server URL
docker exec -u www-data next php occ config:app:set onlyoffice DocumentServerUrl --value="https://office.yourdomain.com/"

# Set the JWT secret to match the Document Server (an empty value only works if JWT is disabled there)
docker exec -u www-data next php occ config:app:set onlyoffice jwt_secret --value=""

Database Connection Issues

Problem: Nextcloud can't connect to database

Symptoms:

  • Error: "SQLSTATE[08006]"
  • Nextcloud shows database error

Diagnosis:

# Check if PostgreSQL is running
docker ps | grep next-db

# Check PostgreSQL logs
docker logs next-db

# Test connection
docker exec next-db pg_isready -U nextcloud

Solution:

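# Restart database (run from the stack directory)
cd /opt/nextcloud-stack
docker compose restart next-db

# Check that it accepts connections (repeat until it reports "accepting connections")
docker exec next-db pg_isready -U nextcloud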

# Restart Nextcloud
docker compose restart next
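A single `pg_isready` call only reports the current state; it does not wait. The sketch below retries a command until it succeeds, so Nextcloud is restarted only once the database is ready. `wait_for` is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary.

```shell
# wait_for ATTEMPTS DELAY COMMAND...
# Runs COMMAND up to ATTEMPTS times, sleeping DELAY seconds between tries.
# Returns 0 as soon as COMMAND succeeds, 1 if it never does.
wait_for() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Usage (requires the running stack):
#   wait_for 30 2 docker exec next-db pg_isready -U nextcloud && docker compose restart next
```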

Problem: Database initialization failed

Symptoms:

  • PostgreSQL container keeps restarting
  • Empty database

Solution:

# Remove volumes and recreate
cd /opt/nextcloud-stack
docker compose down -v
docker compose up -d

⚠️ WARNING: docker compose down -v removes the stack's volumes, which deletes all data! Only use this on a fresh installation.


Tailscale Issues

Problem: Can't access Tailscale-only services

Symptoms:

  • Homarr, Dockhand, Uptime Kuma return 403 Forbidden

Diagnosis:

# Check if Tailscale is running
sudo tailscale status

# Get Tailscale IP
tailscale ip -4

Solution:

# Activate Tailscale (if not done)
sudo tailscale up

# Verify connection
tailscale status

Access via:

  • Tailscale IP: https://100.x.y.z:PORT (the address printed by tailscale ip -4)
  • MagicDNS: https://hostname.tailnet-name.ts.net
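If the reverse proxy restricts these services by source address (an assumption about this stack's Caddyfile), a 403 usually means the request did not arrive from a Tailscale address. Tailscale assigns IPv4 addresses from 100.64.0.0/10, which can be checked with a short pattern match; `is_tailscale_ip` is a hypothetical helper.

```shell
# Succeeds if the IPv4 address lies in 100.64.0.0/10
# (second octet between 64 and 127).
is_tailscale_ip() {
    case "$1" in
        100.6[4-9].*|100.[7-9][0-9].*|100.1[01][0-9].*|100.12[0-7].*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the client: is_tailscale_ip "$(tailscale ip -4)" && echo "Tailscale address in use"
```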

Problem: Tailscale not installed

Solution:

# Re-run Tailscale playbook
ansible-playbook playbooks/04-tailscale-setup.yml --ask-vault-pass

Port Conflicts

Problem: Port 80 or 443 already in use

Symptoms:

  • Error: "bind: address already in use"
  • Caddy won't start

Diagnosis:

sudo ss -tlnp | grep -E ':(80|443)[[:space:]]'

Common Culprits:

  • Apache2
  • Nginx
  • Another Caddy instance
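The culprit can be read directly from the `ss` output. The sketch below matches the port exactly, so listeners on 8080 or 4430 are not reported by mistake; `port_holders` is a hypothetical helper and expects the column layout of `ss -tlnp` run with sudo, where the last column names the process.

```shell
# Reads `ss -tlnp` output on stdin and prints "<port> <process column>"
# for sockets listening on exactly port 80 or 443.
port_holders() {
    awk '$1 == "LISTEN" {
        n = split($4, parts, ":")      # local address; the port is the last field
        port = parts[n]
        if (port == "80" || port == "443") print port, $NF
    }'
}

# Usage: sudo ss -tlnp | port_holders
```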

Solution:

# Stop conflicting service
sudo systemctl stop apache2
sudo systemctl disable apache2

# OR
sudo systemctl stop nginx
sudo systemctl disable nginx

# Restart Caddy (run from the stack directory)
cd /opt/nextcloud-stack
docker compose restart caddy

Permission Issues

Problem: Permission denied errors in Nextcloud

Symptoms:

  • Can't upload files
  • Can't install apps

Diagnosis:

# Check file permissions
docker exec next ls -la /var/www/html

Solution:

# Fix ownership (docker exec runs the command inside the container, as root by default)
docker exec next chown -R www-data:www-data /var/www/html

Problem: Docker socket permission denied

Symptoms:

  • Homarr or Dockhand can't see containers

Solution: Docker socket is mounted read-only by design for security. This is normal and expected.


Emergency Commands

Completely restart the stack

cd /opt/nextcloud-stack
docker compose down
docker compose up -d

View all logs in real-time

cd /opt/nextcloud-stack
docker compose logs -f

Check container health

cd /opt/nextcloud-stack
docker compose ps
# Prints a status only if the container defines a healthcheck
docker inspect --format='{{.State.Health.Status}}' next
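The inspect template fails with an error on containers that define no healthcheck. A sketch that handles that case and loops over several containers; `stack_health` is a hypothetical helper, and the container names in the usage line are the ones that appear elsewhere in this guide.

```shell
# stack_health NAME...
# Prints "<name>: <health status>" for each container, "no healthcheck"
# when none is defined, and "not found" when inspect fails.
stack_health() {
    local name
    for name in "$@"; do
        printf '%s: ' "$name"
        docker inspect \
            --format '{{if .State.Health}}{{.State.Health.Status}}{{else}}no healthcheck{{end}}' \
            "$name" 2>/dev/null || echo "not found"
    done
}

# Usage: stack_health next next-db next-redis caddy
```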

Rebuild a specific container

cd /opt/nextcloud-stack
docker compose up -d --force-recreate --no-deps next

Emergency backup

/opt/nextcloud-stack/backup.sh

Reset Nextcloud admin password

docker exec -it -u www-data next php occ user:resetpassword admin

Getting Help

If none of these solutions work:

  1. Check logs:

    docker compose logs [service-name]
    
  2. Check system logs:

    sudo journalctl -xe
    
  3. Verify configuration:

    cat /opt/nextcloud-stack/docker-compose.yml
    cat /opt/nextcloud-stack/.env
    
  4. Test connectivity:

    curl -I https://cloud.yourdomain.com
    docker exec caddy caddy validate --config /etc/caddy/Caddyfile
    
  5. Deployment report:

    cat /opt/nextcloud-stack/DEPLOYMENT.txt
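The .env file from step 3 holds passwords, so its values should be masked before the output is pasted into an issue or chat. A minimal sketch that masks every value rather than guessing which keys are sensitive; `redact_env` is a hypothetical helper.

```shell
# Reads KEY=value lines on stdin and replaces each value with ***.
# Comments and blank lines pass through unchanged.
redact_env() {
    sed -E 's/^([A-Za-z_][A-Za-z0-9_]*)=.*/\1=***/'
}

# Usage: redact_env < /opt/nextcloud-stack/.env
```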
    

Recovery Procedures

Restore from backup

See BACKUP_RESTORE.md

Complete reinstallation

# 1. Backup first!
/opt/nextcloud-stack/backup.sh

# 2. Remove deployment
ansible-playbook playbooks/99-rollback.yml --ask-vault-pass

# 3. Redeploy
ansible-playbook playbooks/site.yml --ask-vault-pass

Last Updated: 2026-02-16