added ansible script
ansible/TROUBLESHOOTING.md
# Troubleshooting Guide

Common issues and solutions for the Nextcloud Stack deployment.

## Table of Contents

- [DNS Issues](#dns-issues)
- [SSL Certificate Problems](#ssl-certificate-problems)
- [Docker Issues](#docker-issues)
- [LXC Container Issues](#lxc-container-issues)
- [Nextcloud Issues](#nextcloud-issues)
- [Database Connection Issues](#database-connection-issues)
- [Tailscale Issues](#tailscale-issues)
- [Port Conflicts](#port-conflicts)
- [Permission Issues](#permission-issues)
- [Emergency Commands](#emergency-commands)
- [Getting Help](#getting-help)
- [Recovery Procedures](#recovery-procedures)

---

## DNS Issues

### Problem: DNS records not resolving

**Symptoms:**
- Let's Encrypt fails to issue certificates
- Caddy shows certificate errors
- Services inaccessible via domain

**Diagnosis:**
```bash
dig +short cloud.yourdomain.com @8.8.8.8
```

**Solution:**
1. Ensure all required A records point to your server IP
2. Wait for DNS propagation (up to 48 hours, usually minutes)
3. Use [DNSChecker.org](https://dnschecker.org) to verify global propagation

**Required DNS Records:**
```
cloud.yourdomain.com → YOUR_SERVER_IP
office.yourdomain.com → YOUR_SERVER_IP
draw.yourdomain.com → YOUR_SERVER_IP
notes.yourdomain.com → YOUR_SERVER_IP
home.yourdomain.com → YOUR_SERVER_IP
manage.yourdomain.com → YOUR_SERVER_IP
uptime.yourdomain.com → YOUR_SERVER_IP
```

**Temporary Workaround:**
Edit `/etc/hosts` on your local machine:
```
YOUR_SERVER_IP cloud.yourdomain.com
```
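
The `dig` check above can be repeated for all seven required hostnames in one pass. The following is a minimal sketch, not part of the deployment: `check_dns` is a hypothetical helper name, and `203.0.113.10` in the usage comment is a placeholder for your server IP. It queries 8.8.8.8, as in the diagnosis step.

```bash
# Hypothetical helper: compare each required A record with the expected IP.
# Usage: check_dns yourdomain.com 203.0.113.10
check_dns() {
  local domain="$1" expected="$2" sub resolved status=0
  for sub in cloud office draw notes home manage uptime; do
    # Take the last line so a CNAME chain still ends in the address record.
    resolved="$(dig +short "${sub}.${domain}" @8.8.8.8 | tail -n 1)"
    if [ "$resolved" = "$expected" ]; then
      echo "OK   ${sub}.${domain} -> ${resolved}"
    else
      echo "FAIL ${sub}.${domain} -> ${resolved:-no answer} (expected ${expected})"
      status=1
    fi
  done
  # Non-zero exit status if at least one record is missing or wrong.
  return "$status"
}
```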

---

## SSL Certificate Problems

### Problem: Let's Encrypt rate limit exceeded

**Symptoms:**
- Error: "too many certificates already issued"

**Solution:**
1. Use the Let's Encrypt staging server for testing
2. Edit the Caddyfile (add to the global options block):
```caddy
{
    email {{ user_email }}
    acme_ca https://acme-staging-v02.api.letsencrypt.org/directory
}
```
3. Reload Caddy: `docker exec -w /etc/caddy caddy caddy reload` (assumes the Caddyfile is mounted at `/etc/caddy/Caddyfile`, the default location in the official image)
4. After testing, remove the `acme_ca` line and reload Caddy again

**Rate Limits:**
- 50 certificates per registered domain per week
- 5 duplicate certificates (identical set of hostnames) per week
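
Before removing the staging line, it can help to confirm which CA issued the certificate that is currently being served. This is a minimal sketch, assuming `openssl` is installed on the machine running the check. `cert_issuer` and `is_staging_issuer` are hypothetical helper names. At the time of writing, Let's Encrypt staging certificates carry `(STAGING)` in the issuer name and are not trusted by browsers.

```bash
# Print the issuer line of the certificate served for a hostname.
cert_issuer() {
  local host="$1"
  echo | openssl s_client -connect "${host}:443" -servername "$host" 2>/dev/null \
    | openssl x509 -noout -issuer
}

# Succeed if an issuer string looks like a Let's Encrypt staging CA.
is_staging_issuer() {
  case "$1" in
    *STAGING*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (hostname is a placeholder):
#   if is_staging_issuer "$(cert_issuer cloud.yourdomain.com)"; then
#     echo "Still on staging: remove the acme_ca line and reload Caddy"
#   fi
```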

### Problem: Certificate validation failed

**Symptoms:**
- "Failed to verify" errors in Caddy logs

**Diagnosis:**
```bash
docker logs caddy
```

**Common Causes:**
1. DNS not pointing to server
2. Firewall blocking port 80/443
3. Another service using port 80/443

**Solution:**
```bash
# Check firewall
sudo ufw status

# Check port usage (exact matches only, so 8080 or 4430 are not reported)
sudo ss -tlnp | grep -E ':(80|443)[[:space:]]'

# Check DNS
dig +short cloud.yourdomain.com
```

---

## Docker Issues

### Problem: Docker daemon won't start

**Symptoms:**
- `docker ps` fails
- Error: "Cannot connect to Docker daemon"

**Diagnosis:**
```bash
sudo systemctl status docker
sudo journalctl -xu docker
```

**Solution:**
```bash
sudo systemctl restart docker
```

### Problem: Containers keep restarting

**Diagnosis:**
```bash
cd /opt/nextcloud-stack
docker compose logs [service-name]
```

**Common Causes:**
1. Configuration errors
2. Port conflicts
3. Missing dependencies

**Solution:**
```bash
# Check specific container
docker logs next-db
docker logs next
docker logs caddy

# Restart specific service
docker compose restart next
```
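
To find out which containers are stuck in a restart loop before reading individual logs, the restart counter that Docker keeps per container can be listed. This is a minimal sketch: `restart_counts` is a hypothetical helper name, and the default threshold of 1 is an arbitrary choice.

```bash
# Hypothetical helper: list containers restarted at least N times (default 1).
# Usage: restart_counts        # everything restarted at least once
#        restart_counts 5      # only containers with 5 or more restarts
restart_counts() {
  local min="${1:-1}" ids
  ids="$(docker ps -aq)"
  [ -n "$ids" ] || return 0
  # .Name carries a leading slash, which awk strips before printing.
  # $ids is left unquoted on purpose so each ID becomes its own argument.
  docker inspect --format '{{.Name}} {{.RestartCount}}' $ids \
    | awk -v min="$min" '$2 >= min { sub(/^\//, "", $1); print $1, $2 }'
}
```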

---

## LXC Container Issues

### Problem: Docker fails to start in LXC

**Symptoms:**
- Error: "cgroups: cgroup mountpoint does not exist"
- Docker daemon fails to start

**Diagnosis:**
```bash
systemd-detect-virt  # Should show "lxc"
```

**Solution on LXC Host:**
```bash
# Set security nesting
lxc config set CONTAINER_NAME security.nesting true

# May also need privileged mode (reduces isolation; use only if nesting alone is not enough)
lxc config set CONTAINER_NAME security.privileged true

# Restart container
lxc restart CONTAINER_NAME
```

**Inside LXC Container:**
```bash
# Verify cgroups
mount | grep cgroup

# Check Docker status
sudo systemctl status docker
```
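
Before changing settings on the LXC host, it is worth confirming what is currently configured. This is a minimal sketch, assuming the same `lxc` client used in the host commands above. `nesting_enabled` is a hypothetical helper name, and the list of accepted truthy values is an assumption.

```bash
# Hypothetical helper: succeed if security.nesting is enabled for a container.
nesting_enabled() {
  local name="$1" value
  value="$(lxc config get "$name" security.nesting)"
  # An unset key prints an empty line, which counts as disabled.
  case "$value" in
    true | 1 | yes | on) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (run on the LXC host):
#   nesting_enabled CONTAINER_NAME || lxc config set CONTAINER_NAME security.nesting true
```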

### Problem: AppArmor denials in LXC

**Solution on LXC Host:**
```bash
lxc config set CONTAINER_NAME raw.lxc "lxc.apparmor.profile=unconfined"
lxc restart CONTAINER_NAME
```

---

## Nextcloud Issues

### Problem: Nextcloud stuck in maintenance mode

**Symptoms:**
- Web interface shows "System in maintenance mode"

**Solution:**
```bash
docker exec -u www-data next php occ maintenance:mode --off
```

### Problem: Trusted domain error

**Symptoms:**
- "Access through untrusted domain" error

**Solution:**
```bash
docker exec -u www-data next php occ config:system:set trusted_domains 1 --value=cloud.yourdomain.com
```
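
Setting index `1` overwrites whatever is already stored at that position. The following minimal sketch appends the domain instead. It assumes that `occ config:system:get trusted_domains` prints one domain per line and that the existing indices are contiguous from 0. `add_trusted_domain` is a hypothetical helper name.

```bash
# Hypothetical helper: append a domain to trusted_domains if it is missing.
# Usage: add_trusted_domain cloud.yourdomain.com
add_trusted_domain() {
  local domain="$1" current index
  current="$(docker exec -u www-data next php occ config:system:get trusted_domains)"
  if printf '%s\n' "$current" | grep -qxF "$domain"; then
    echo "already trusted: $domain"
    return 0
  fi
  # Assumption: indices are contiguous, so the entry count is the next free index.
  index="$(printf '%s\n' "$current" | grep -c . || true)"
  docker exec -u www-data next php occ config:system:set \
    trusted_domains "$index" --value="$domain"
}
```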

### Problem: Redis connection failed

**Diagnosis:**
```bash
docker logs next-redis
docker exec next-redis redis-cli ping
```

**Solution:**
```bash
# Reconfigure Redis in Nextcloud
docker exec -u www-data next php occ config:system:set redis host --value=next-redis
docker exec -u www-data next php occ config:system:set redis port --value=6379
```

### Problem: File uploads fail

**Symptoms:**
- Large files won't upload
- Error 413 (Payload Too Large)

**Solution:**
The Caddyfile is already configured for 10 GB uploads. Check the PHP upload limits inside the Nextcloud container:
```bash
docker exec next php -i | grep -E 'upload_max_filesize|post_max_size'
```

### Problem: OnlyOffice integration not working

**Solution:**
```bash
# Install OnlyOffice app
docker exec -u www-data next php occ app:install onlyoffice

# Configure document server URL
docker exec -u www-data next php occ config:app:set onlyoffice DocumentServerUrl --value="https://office.yourdomain.com/"

# Disable JWT (or configure if needed)
docker exec -u www-data next php occ config:app:set onlyoffice jwt_secret --value=""
```

---

## Database Connection Issues

### Problem: Nextcloud can't connect to database

**Symptoms:**
- Error: "SQLSTATE[08006]"
- Nextcloud shows database error

**Diagnosis:**
```bash
# Check if PostgreSQL is running
docker ps | grep next-db

# Check PostgreSQL logs
docker logs next-db

# Test connection
docker exec next-db pg_isready -U nextcloud
```

**Solution:**
```bash
cd /opt/nextcloud-stack

# Restart database
docker compose restart next-db

# Wait for it to be healthy (repeat until it reports "accepting connections")
docker exec next-db pg_isready -U nextcloud

# Restart Nextcloud
docker compose restart next
```
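
The "wait for it to be healthy" step can be automated with a small polling loop. This is a minimal sketch: `wait_for_db` is a hypothetical helper name, and the defaults of 30 attempts and 2 seconds between attempts are arbitrary choices.

```bash
# Hypothetical helper: poll pg_isready until PostgreSQL accepts connections.
# Usage: wait_for_db [max_attempts] [seconds_between_attempts]
wait_for_db() {
  local attempts="${1:-30}" delay="${2:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    if docker exec next-db pg_isready -U nextcloud >/dev/null 2>&1; then
      echo "database ready after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "database not ready after ${attempts} attempts" >&2
  return 1
}

# Usage (from /opt/nextcloud-stack):
#   docker compose restart next-db && wait_for_db && docker compose restart next
```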

### Problem: Database initialization failed

**Symptoms:**
- PostgreSQL container keeps restarting
- Empty database

**⚠️ WARNING:** The solution below deletes all volumes and therefore all data! Only use it for fresh installations.

**Solution:**
```bash
# Remove volumes and recreate
cd /opt/nextcloud-stack
docker compose down -v
docker compose up -d
```

---

## Tailscale Issues

### Problem: Can't access Tailscale-only services

**Symptoms:**
- Homarr, Dockhand, Uptime Kuma return 403 Forbidden

**Diagnosis:**
```bash
# Check if Tailscale is running
sudo tailscale status

# Get Tailscale IP
tailscale ip -4
```

**Solution:**
```bash
# Activate Tailscale (if not done)
sudo tailscale up

# Verify connection
tailscale status
```

**Access via:**
- Tailscale IP: `https://100.x.y.z:PORT` (the address printed by `tailscale ip -4`)
- MagicDNS: `https://hostname.tailnet-name.ts.net`
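
Tailscale assigns IPv4 addresses from the 100.64.0.0/10 range, so the first octet is always 100 and the second lies between 64 and 127. The following is a minimal sketch for sanity-checking the output of `tailscale ip -4`. `is_tailscale_ip` is a hypothetical helper name.

```bash
# Hypothetical helper: succeed if an IPv4 address lies in 100.64.0.0/10.
is_tailscale_ip() {
  local ip="$1" a b c d rest
  # Reject empty input, stray characters, and malformed dot placement.
  case "$ip" in
    "" | *[!0-9.]* | .* | *. | *..*) return 1 ;;
  esac
  IFS=. read -r a b c d rest <<EOF
$ip
EOF
  # Exactly four octets are required.
  [ -n "$d" ] && [ -z "$rest" ] || return 1
  [ "$a" -eq 100 ] && [ "$b" -ge 64 ] && [ "$b" -le 127 ] \
    && [ "$c" -le 255 ] && [ "$d" -le 255 ]
}

# Usage:
#   is_tailscale_ip "$(tailscale ip -4)" && echo "address is in the Tailscale range"
```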

### Problem: Tailscale not installed

**Solution:**
```bash
# Re-run Tailscale playbook
ansible-playbook playbooks/04-tailscale-setup.yml --ask-vault-pass
```

---

## Port Conflicts

### Problem: Port 80 or 443 already in use

**Symptoms:**
- Error: "bind: address already in use"
- Caddy won't start

**Diagnosis:**
```bash
# Exact matches only, so 8080 or 4430 are not reported
sudo ss -tlnp | grep -E ':(80|443)[[:space:]]'
```

**Common Culprits:**
- Apache2
- Nginx
- Another Caddy instance

**Solution:**
```bash
# Stop conflicting service
sudo systemctl stop apache2
sudo systemctl disable apache2

# OR
sudo systemctl stop nginx
sudo systemctl disable nginx

# Restart Caddy
cd /opt/nextcloud-stack
docker compose restart caddy
```
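
To see at a glance which process holds each web port, the `ss` output can be reduced to the listening address and the owning process. This is a minimal sketch: `web_port_listeners` is a hypothetical helper name, and the column positions assume the default `ss -tlnp` layout.

```bash
# Hypothetical helper: show only listeners bound to exactly port 80 or 443.
web_port_listeners() {
  # Column 4 is the local address:port; the last column names the process
  # (it is only populated when ss runs with root privileges, hence sudo).
  sudo ss -tlnp | awk '$4 ~ /:(80|443)$/ { print $4, $NF }'
}

# Usage:
#   web_port_listeners
```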

---

## Permission Issues

### Problem: Permission denied errors in Nextcloud

**Symptoms:**
- Can't upload files
- Can't install apps

**Diagnosis:**
```bash
# Check file permissions
docker exec next ls -la /var/www/html
```

**Solution:**
```bash
# Fix ownership inside the container (run from the host)
docker exec next chown -R www-data:www-data /var/www/html
```

### Problem: Docker socket permission denied

**Symptoms:**
- Homarr or Dockhand can't see containers

**Solution:**
The Docker socket is mounted read-only by design for security. This is normal and expected.

---

## Emergency Commands

### Completely restart the stack
```bash
cd /opt/nextcloud-stack
docker compose down
docker compose up -d
```

### View all logs in real-time
```bash
cd /opt/nextcloud-stack
docker compose logs -f
```

### Check container health
```bash
docker compose ps
# Errors if the container defines no healthcheck
docker inspect --format='{{.State.Health.Status}}' next
```
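
The single-container `docker inspect` call can be extended to every container, including those without a healthcheck. This is a minimal sketch: `health_report` is a hypothetical helper name, and `no-healthcheck` is a label chosen here for illustration.

```bash
# Hypothetical helper: print "<name> <health>" for every container.
health_report() {
  local ids
  ids="$(docker ps -aq)"
  [ -n "$ids" ] || return 0
  # Containers without a healthcheck have no .State.Health, hence the {{if}}.
  # $ids is left unquoted on purpose so each ID becomes its own argument.
  docker inspect --format \
    '{{.Name}} {{if .State.Health}}{{.State.Health.Status}}{{else}}no-healthcheck{{end}}' \
    $ids | sed 's|^/||'
}

# Usage: list everything that is not reporting healthy
#   health_report | grep -v ' healthy$'
```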

### Rebuild a specific container
```bash
docker compose up -d --force-recreate --no-deps next
```

### Emergency backup
```bash
/opt/nextcloud-stack/backup.sh
```

### Reset Nextcloud admin password
```bash
docker exec -u www-data next php occ user:resetpassword admin
```

---

## Getting Help

If none of these solutions work:

1. **Check logs:**
   ```bash
   docker compose logs [service-name]
   ```

2. **Check system logs:**
   ```bash
   sudo journalctl -xe
   ```

3. **Verify configuration:**
   ```bash
   cat /opt/nextcloud-stack/docker-compose.yml
   # .env contains secrets; redact them before sharing the output
   cat /opt/nextcloud-stack/.env
   ```

4. **Test connectivity:**
   ```bash
   curl -I https://cloud.yourdomain.com
   # Assumes the Caddyfile is mounted at /etc/caddy/Caddyfile
   docker exec caddy caddy validate --config /etc/caddy/Caddyfile
   ```

5. **Deployment report:**
   ```bash
   cat /opt/nextcloud-stack/DEPLOYMENT.txt
   ```

---

## Recovery Procedures

### Restore from backup

See [BACKUP_RESTORE.md](BACKUP_RESTORE.md)

### Complete reinstallation

```bash
# 1. Backup first!
/opt/nextcloud-stack/backup.sh

# 2. Remove deployment
ansible-playbook playbooks/99-rollback.yml --ask-vault-pass

# 3. Redeploy
ansible-playbook playbooks/site.yml --ask-vault-pass
```

---

**Last Updated:** 2026-02-16