This section is only relevant to Enterprise customers who acquired an on-prem license.
Troubleshooting Guide
Solutions for common issues you might encounter with Permit Platform.
Installation Issues
Frontend Domain Configuration Error
❌ Problem: "Frontend domain not configured in values.yaml"
Error message:
[ERROR] ❌ Frontend domain not configured in values.yaml
Please edit charts/permit-platform/values.yaml and replace:
frontendDomain: "CHANGEME_FRONTEND_DOMAIN"
Solution:
- Edit charts/permit-platform/values.yaml
- Replace CHANGEME_FRONTEND_DOMAIN with your actual domain:
global:
  frontendDomain: "permit.yourcompany.com" # Your domain here
- Re-run the installation script
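The placeholder check above can also be run by hand before re-running the installer. The following is a minimal sketch, not part of the Permit installer: the function name `frontend_domain_configured` is hypothetical, and it assumes the placeholder string appears nowhere else in the file (for example, in a comment).

```shell
# Hypothetical pre-flight helper; not shipped with the installer.
# Returns 0 when the values file no longer contains the placeholder,
# 1 when the placeholder is still present, 2 when the file is unreadable.
frontend_domain_configured() {
  values_file="$1"
  if [ ! -r "$values_file" ]; then
    echo "Cannot read $values_file" >&2
    return 2
  fi
  if grep -q 'CHANGEME_FRONTEND_DOMAIN' "$values_file"; then
    echo "Placeholder still present in $values_file" >&2
    return 1
  fi
  return 0
}

# Usage:
#   frontend_domain_configured charts/permit-platform/values.yaml \
#     && ./scripts/install-permit-platform.sh
```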
Docker Image Loading Failures
❌ Problem: Images fail to load during installation
Error symptoms:
[ERROR] Failed to load permit-backend-v2.tar after 3 attempts
Error loading image: docker: Error response from daemon
Diagnostic steps:
# Check Docker service status
systemctl status docker
# Check available disk space
df -h
# Check Docker daemon logs
journalctl -u docker -n 20 --no-pager
# Test Docker manually
docker pull hello-world
Solutions:
- Insufficient disk space:
# Free up space
docker system prune -f
sudo apt-get clean
- Docker service issues:
# Restart Docker
sudo systemctl restart docker
# Check Docker is running
sudo systemctl status docker
- Permission issues:
# Add user to docker group
sudo usermod -aG docker $USER
# Log out and back in, then retry
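Because image loading often fails when the disk is nearly full, it can help to check free space before retrying. The helper below is a sketch under stated assumptions: `has_free_kb` is a hypothetical name, and the `/var/lib/docker` path and 20 GiB threshold in the usage comment are illustrative values, not requirements published by Permit.

```shell
# Hypothetical helper: succeed if the filesystem holding PATH has at
# least MIN_KB kilobytes available.
has_free_kb() {
  path="$1"
  min_kb="$2"
  # -P forces the portable one-line-per-filesystem format; -k reports KiB.
  avail_kb=$(df -Pk "$path" 2>/dev/null | awk 'NR == 2 { print $4 }')
  [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$min_kb" ]
}

# Usage (path and threshold are assumptions):
#   has_free_kb /var/lib/docker $((20 * 1024 * 1024)) \
#     || echo "Less than 20 GiB free; prune before loading images" >&2
```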
Kubernetes Access Issues
❌ Problem: "Unable to connect to Kubernetes cluster"
Error symptoms:
[ERROR] Kubernetes cluster not accessible
error: couldn't get current server API group list
Diagnostic steps:
# Check kubectl configuration
kubectl cluster-info
# Check kubeconfig
echo $KUBECONFIG
ls -la ~/.kube/config
# Test basic connectivity
kubectl get nodes
Solutions:
- Configure kubectl:
# Set kubeconfig if not configured
export KUBECONFIG=/path/to/your/kubeconfig
# Or copy config to default location
mkdir -p ~/.kube
cp /path/to/kubeconfig ~/.kube/config
- Check cluster status:
# For managed clusters (EKS, GKE, AKS)
# Ensure cluster is running and accessible
# For on-premise clusters
systemctl status kubelet
# kube-apiserver is a systemd unit only on some installs; on kubeadm clusters it runs as a static pod
systemctl status kube-apiserver
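When it is unclear which kubeconfig kubectl is reading, listing the candidate files can narrow things down. This sketch relies on kubectl's documented behavior of treating `KUBECONFIG` as a colon-separated list and falling back to `~/.kube/config`. The function name `kubeconfig_candidates` is hypothetical.

```shell
# Hypothetical helper: print each kubeconfig path kubectl would consider,
# one per line. KUBECONFIG may hold several colon-separated paths.
kubeconfig_candidates() {
  if [ -n "${KUBECONFIG:-}" ]; then
    printf '%s\n' "$KUBECONFIG" | tr ':' '\n'
  else
    printf '%s\n' "$HOME/.kube/config"
  fi
}

# Usage: report which candidate files actually exist.
#   kubeconfig_candidates | while IFS= read -r f; do
#     [ -r "$f" ] && echo "ok       $f" || echo "missing  $f"
#   done
```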
Helm Installation Issues
❌ Problem: Helm deployment fails
Error symptoms:
Error: failed to install chart: context deadline exceeded
Error: UPGRADE FAILED: another operation is in progress
Diagnostic steps:
# Check Helm status
helm list -n permit-platform
# Check pending releases
helm list --pending -n permit-platform
# Check namespace
kubectl get all -n permit-platform
Solutions:
- Rollback failed release:
# If upgrade failed
helm rollback permit-platform -n permit-platform
# Or uninstall and retry
helm uninstall permit-platform -n permit-platform
./scripts/install-permit-platform.sh
- Check resource constraints:
# Check cluster resources
kubectl top nodes
kubectl describe nodes
# Check for failed pods
kubectl get pods -n permit-platform --field-selector=status.phase=Failed
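The "another operation is in progress" error is typically reported when a release has been left in a `pending-install`, `pending-upgrade`, or `pending-rollback` state, for example after an interrupted run. The filter below is a rough sketch that picks such releases out of `helm list -a` output. `pending_releases` is a hypothetical name, and the filter assumes the release name is the first column.

```shell
# Hypothetical filter: read `helm list -a` output on stdin and print the
# name (first column) of every release whose line shows a pending-* status.
pending_releases() {
  awk '$0 ~ /pending-(install|upgrade|rollback)/ { print $1 }'
}

# Usage:
#   helm list -a -n permit-platform | pending_releases
```

Any release it prints is a candidate for the rollback or uninstall steps described above.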
Post-Installation Issues
Cannot Access Web Interface
❌ Problem: Cannot reach the web interface
Diagnostic steps:
- Check pod status:
kubectl get pods -n permit-platform
kubectl get ingress -n permit-platform
- Test internal connectivity:
# Check if services are responding
kubectl port-forward -n permit-platform svc/permit-frontend 3000:3000 &
curl http://localhost:3000
- Check ingress controller:
kubectl get pods -n ingress-nginx
kubectl logs -n ingress-nginx deployment/ingress-nginx-controller
- Verify DNS resolution:
# Test domain resolution
nslookup [your-frontend-domain]
# Check hosts file (for .local domains)
cat /etc/hosts
Solutions:
- DNS issues:
# For development (.local domains), add to hosts file
echo "127.0.0.1 permit-frontend.local" | sudo tee -a /etc/hosts
# For production, ensure DNS points to server IP
- Ingress issues:
# Check ingress status
kubectl describe ingress permit-frontend -n permit-platform
# Restart ingress controller if needed
kubectl rollout restart deployment/ingress-nginx-controller -n ingress-nginx
- Service issues:
# Check if frontend pod is running
kubectl get pods -n permit-platform -l app=permit-frontend
# Check frontend logs
kubectl logs -n permit-platform deployment/permit-frontend
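Appending to the hosts file more than once leaves duplicate entries, so it can be worth checking for an existing mapping first. The sketch below is one way to do that. `hosts_has_entry` is a hypothetical helper that ignores commented-out lines and inline comments.

```shell
# Hypothetical helper: succeed if HOSTS_FILE maps HOSTNAME on an
# uncommented line. Field 1 is the address; later fields are names.
hosts_has_entry() {
  hosts_file="$1"
  hostname="$2"
  awk -v h="$hostname" '
    /^[[:space:]]*#/ { next }            # skip fully commented lines
    {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break             # stop at an inline comment
        if ($i == h) found = 1
      }
    }
    END { exit found ? 0 : 1 }
  ' "$hosts_file"
}

# Usage:
#   hosts_has_entry /etc/hosts permit-frontend.local \
#     || echo "127.0.0.1 permit-frontend.local" | sudo tee -a /etc/hosts
```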
Service Startup Issues
❌ Problem: Services failing to start or crashing
Check service health:
# Check all pods status
kubectl get pods -n permit-platform
# Check specific service logs
kubectl logs -n permit-platform deployment/permit-backend-v2
kubectl logs -n permit-platform deployment/celery-general
kubectl logs -n permit-platform deployment/opal-server
# Check events for issues
kubectl get events -n permit-platform --sort-by='.lastTimestamp'
Common solutions:
- Restart failing services:
# Restart specific deployment
kubectl rollout restart deployment/permit-backend-v2 -n permit-platform
# Restart all deployments
kubectl rollout restart deployment -n permit-platform
- Check resource constraints:
# Check node resources
kubectl top nodes
kubectl describe nodes
# Check for OOMKilled pods
kubectl get pods -n permit-platform -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}' | grep OOMKilled
- Scale resources if needed:
# Scale backend replicas
kubectl scale deployment permit-backend-v2 -n permit-platform --replicas=2
# Scale celery workers
kubectl scale deployment celery-general -n permit-platform --replicas=1
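With many services in the namespace, it can be quicker to filter the pod list down to the ones that need attention. The sketch below assumes the default `kubectl get pods` column order (NAME, READY, STATUS, ...). `unhealthy_pods` is a hypothetical name.

```shell
# Hypothetical filter: read `kubectl get pods` output on stdin and print
# name, status, and ready count for pods that are not healthy.
# Completed pods (finished jobs) are treated as healthy.
unhealthy_pods() {
  awk '
    NR == 1 && $1 == "NAME" { next }     # skip the header row if present
    {
      split($2, ready, "/")              # READY looks like "1/1"
      if ($3 == "Completed") next
      if ($3 != "Running" || ready[1] != ready[2])
        print $1 "\t" $3 "\t" $2
    }'
}

# Usage:
#   kubectl get pods -n permit-platform | unhealthy_pods
```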
Authentication and Login Issues
❌ Problem: Cannot login with admin credentials
Error symptoms:
- "Invalid username or password"
- Login form keeps reloading
- Authentication redirect loops
Diagnostic steps:
# Check Keycloak status
kubectl get pods -n permit-platform -l app=keycloak
kubectl logs -n permit-platform deployment/keycloak
# Check backend authentication logs
kubectl logs -n permit-platform deployment/permit-backend-v2 | grep -i auth
# Verify admin password
kubectl get secret keycloak-admin-secret -n permit-platform -o jsonpath='{.data.password}' | base64 -d
Solutions:
- Verify admin password:
# Get current admin password from secret
kubectl get secret keycloak-admin-secret -n permit-platform -o jsonpath='{.data.password}' | base64 -d
# If password doesn't work, check if Keycloak initialized properly
kubectl logs -n permit-platform deployment/keycloak | grep -i "admin user"
- Check Keycloak configuration:
# Verify Keycloak is accessible
kubectl port-forward -n permit-platform svc/keycloak 8080:8080 &
curl http://localhost:8080/auth/realms/permit-platform
- Authentication configuration issues:
# Check backend authentication environment variables
kubectl describe deployment permit-backend-v2 -n permit-platform | grep -A 20 Environment
# Verify cookie configuration
kubectl logs -n permit-platform deployment/permit-backend-v2 | grep -i cookie
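Decoding the secret straight to the terminal leaves the admin password in scrollback. One alternative, sketched here, is to compare the stored value against the password being tried without printing either. `secret_matches` is a hypothetical helper, and `$typed_password` in the usage comment is an assumed variable holding the candidate password.

```shell
# Hypothetical helper: succeed if the base64-encoded secret value decodes
# to CANDIDATE. Returns 1 on mismatch, 2 if the value is not valid base64.
# Trailing newlines in the decoded value are ignored by the comparison.
secret_matches() {
  encoded="$1"
  candidate="$2"
  decoded=$(printf '%s' "$encoded" | base64 --decode 2>/dev/null) || return 2
  [ "$decoded" = "$candidate" ]
}

# Usage:
#   encoded=$(kubectl get secret keycloak-admin-secret -n permit-platform \
#     -o jsonpath='{.data.password}')
#   secret_matches "$encoded" "$typed_password" && echo "password matches secret"
```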
Database Connection Issues
❌ Problem: Services cannot connect to database
Check database status:
# Check PostgreSQL pod
kubectl get pods -n permit-platform -l app=postgres
kubectl logs -n permit-platform deployment/postgres
# Test database connectivity from backend
kubectl exec -n permit-platform deployment/permit-backend-v2 -- psql -h postgres -U permit -d permit -c "SELECT 1;"
Solutions:
- Restart database:
kubectl rollout restart deployment/postgres -n permit-platform
# Wait for database to be ready
kubectl wait --for=condition=ready pod -l app=postgres -n permit-platform --timeout=300s
- Check database initialization:
# Check if database initialized properly
kubectl logs -n permit-platform deployment/postgres | grep -i "database system is ready"
# List databases to confirm the server accepts connections
kubectl exec -n permit-platform deployment/postgres -- psql -U permit -c "\l"
- Verify database secrets:
# Check database password in secret
kubectl get secret postgres-secret -n permit-platform -o jsonpath='{.data.password}' | base64 -d
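After a database restart, the first connectivity test may fail simply because PostgreSQL is still starting. A small retry wrapper, sketched below, avoids misreading that as a persistent fault. `retry_until` is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary choices.

```shell
# Hypothetical helper: run a command up to ATTEMPTS times, sleeping DELAY
# seconds between tries. Succeeds as soon as the command succeeds.
retry_until() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage (10 attempts, 5 seconds apart; both values are assumptions):
#   retry_until 10 5 kubectl exec -n permit-platform deployment/permit-backend-v2 -- \
#     psql -h postgres -U permit -d permit -c "SELECT 1;"
```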
Policy Sync Issues
⚠️ Problem: Policy sync failing from Git repository
Policy Sync is required for platform operation.
Test Git connectivity:
# Check Policy Sync pod status
kubectl get pods -n permit-platform -l app=permit-policy-sync-v2
# Check Policy Sync logs
kubectl logs -n permit-platform deployment/permit-policy-sync-v2
# Verify SSH key secret
kubectl get secret policy-sync-ssh-key -n permit-platform -o jsonpath='{.data.private-key}' | base64 -d | head -1
Solutions:
- Verify Git credentials in secret:
# Check if SSH key is properly configured
kubectl describe secret policy-sync-ssh-key -n permit-platform
# Test SSH connection manually (if possible)
ssh -T git@github.com -i /path/to/permit-policy-key
- Restart Policy Sync service:
kubectl rollout restart deployment/permit-policy-sync-v2 -n permit-platform
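Printing the first line of the key with `head -1` still requires judging by eye whether it looks right. The sketch below automates that check for the common PEM-style headers (OpenSSH, RSA, EC, PKCS#8). `first_line_is_private_key` is a hypothetical name. It also rejects keys saved with Windows line endings, which tend to break SSH authentication.

```shell
# Hypothetical check: read a key on stdin and succeed only if its first
# line is a PEM-style private key header with no trailing carriage return.
first_line_is_private_key() {
  IFS= read -r first_line || [ -n "$first_line" ] || return 1
  case "$first_line" in
    "-----BEGIN "*"PRIVATE KEY-----") return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   kubectl get secret policy-sync-ssh-key -n permit-platform \
#     -o jsonpath='{.data.private-key}' | base64 -d | first_line_is_private_key \
#     || echo "Secret does not look like a private key" >&2
```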
Log Collection and Monitoring
Collecting Diagnostic Information
For troubleshooting or support requests:
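# Get all pod logs (with a selector, kubectl returns only the last 10 lines per pod unless --tail=-1 is set)
kubectl logs -n permit-platform --all-containers=true --selector='app!=postgres' --tail=-1 > permit-platform-logs.txt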
# Get pod status and descriptions
kubectl get pods -n permit-platform -o wide > pod-status.txt
kubectl describe pods -n permit-platform > pod-descriptions.txt
# Get events
kubectl get events -n permit-platform --sort-by='.lastTimestamp' > events.txt
# Get service and ingress info
kubectl get svc,ingress -n permit-platform -o yaml > networking.yaml
# Check resource usage
kubectl top pods -n permit-platform > resource-usage.txt
Performance Monitoring
Check resource consumption:
# Monitor pod resources
kubectl top pods -n permit-platform
# Monitor node resources
kubectl top nodes
# Check for resource limits being hit
kubectl describe pods -n permit-platform | grep -i -A 5 -B 5 "resource\|limit\|request"
Scale services if needed:
# Scale backend for more capacity
kubectl scale deployment permit-backend-v2 -n permit-platform --replicas=3
# Scale celery workers
kubectl scale deployment celery-general -n permit-platform --replicas=2
# Check horizontal pod autoscaler (if configured)
kubectl get hpa -n permit-platform
Advanced Troubleshooting
Clean Installation Reset
If you need to completely reset the installation:
# 1. Uninstall all 3 Helm releases (in reverse order)
helm uninstall permit-platform -n permit-platform 2>/dev/null || true # Platform services (35 services)
helm uninstall migrations -n permit-platform 2>/dev/null || true # Database migrations
helm uninstall third-party-services -n permit-platform 2>/dev/null || true # Infrastructure (PostgreSQL, Redis, etc.)
# 2. Delete namespace (removes all resources)
kubectl delete namespace permit-platform
# 3. Clean up any persistent volumes (careful!)
kubectl get pv | grep permit-platform # Check before deleting
kubectl delete pv $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.namespace=="permit-platform")].metadata.name}')
# 4. Re-run installation
./scripts/install-permit-platform.sh
Manual Service Recovery
If specific services are failing:
# Force delete stuck pods
kubectl delete pod <pod-name> -n permit-platform --force --grace-period=0
# Patch deployment to fix issues
kubectl patch deployment permit-backend-v2 -n permit-platform -p '{"spec":{"template":{"spec":{"containers":[{"name":"permit-backend-v2","image":"new-image-tag"}]}}}}'
# Check and fix persistent volume claims
kubectl get pvc -n permit-platform
kubectl describe pvc <pvc-name> -n permit-platform
Getting Support
When contacting support, please provide:
Required Information:
- Installation method: Kubernetes cluster type (EKS, GKE, AKS, on-premise)
- Platform version: From installer package filename
- Error description: What you were trying to do and what happened
- Timeline: When did the issue start?
Helpful Diagnostics:
# Create comprehensive diagnostic bundle
{
echo "=== CLUSTER INFO ==="
kubectl cluster-info
echo "=== NODES ==="
kubectl get nodes -o wide
echo "=== PERMIT PLATFORM PODS ==="
kubectl get pods -n permit-platform -o wide
echo "=== RECENT EVENTS ==="
kubectl get events -n permit-platform --sort-by='.lastTimestamp' | tail -20
echo "=== INGRESS STATUS ==="
kubectl get ingress -n permit-platform
echo "=== STORAGE ==="
kubectl get pv,pvc -n permit-platform
} > permit-support-info.txt
Contact Information:
- 📧 Email: support@permit.io
- 💬 Slack: Join our community
Need more help? Our support team has experience with all major Kubernetes platforms and can assist with advanced troubleshooting scenarios.