Enterprise Only

This section is only relevant to Enterprise customers who acquired an on-prem license.

Troubleshooting Guide

Solutions for common issues you might encounter with Permit Platform.

Installation Issues

Frontend Domain Configuration Error

❌ Problem: "Frontend domain not configured in values.yaml"

Error message:

[ERROR] ❌ Frontend domain not configured in values.yaml
Please edit charts/permit-platform/values.yaml and replace:
frontendDomain: "CHANGEME_FRONTEND_DOMAIN"

Solution:

  1. Edit charts/permit-platform/values.yaml
  2. Replace CHANGEME_FRONTEND_DOMAIN with your actual domain:
    global:
      frontendDomain: "permit.yourcompany.com" # Your domain here
  3. Re-run the installation script
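If you prefer to make the change non-interactively, a sed one-liner along these lines should also work (the domain is a placeholder, as in step 2; GNU sed assumed):

# Replace the placeholder with your real frontend domain
sed -i 's/CHANGEME_FRONTEND_DOMAIN/permit.yourcompany.com/' charts/permit-platform/values.yaml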

Docker Image Loading Failures

❌ Problem: Images fail to load during installation

Error symptoms:

[ERROR] Failed to load permit-backend-v2.tar after 3 attempts
Error loading image: docker: Error response from daemon

Diagnostic steps:

# Check Docker service status
systemctl status docker

# Check available disk space
df -h

# Check Docker daemon logs
journalctl -u docker --tail=20

# Test Docker manually
docker pull hello-world

Solutions:

  1. Insufficient disk space:

    # Free up space
    docker system prune -f
    sudo apt-get clean
  2. Docker service issues:

    # Restart Docker
    sudo systemctl restart docker

    # Check Docker is running
    sudo systemctl status docker
  3. Permission issues:

    # Add user to docker group
    sudo usermod -aG docker $USER
    # Log out and back in, then retry
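If logging out is inconvenient, you can usually pick up the new docker group membership in the current session instead (this assumes the group was just added as in solution 3):

# Start a subshell with the docker group active
newgrp docker
docker ps # should now work without sudo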

Kubernetes Access Issues

❌ Problem: "Unable to connect to Kubernetes cluster"

Error symptoms:

[ERROR] Kubernetes cluster not accessible
error: couldn't get current server API group list

Diagnostic steps:

# Check kubectl configuration
kubectl cluster-info

# Check kubeconfig
echo $KUBECONFIG
ls -la ~/.kube/config

# Test basic connectivity
kubectl get nodes

Solutions:

  1. Configure kubectl:

    # Set kubeconfig if not configured
    export KUBECONFIG=/path/to/your/kubeconfig

    # Or copy config to default location
    mkdir -p ~/.kube
    cp /path/to/kubeconfig ~/.kube/config
  2. Check cluster status:

    # For managed clusters (EKS, GKE, AKS)
    # Ensure cluster is running and accessible

    # For on-premise clusters
    systemctl status kubelet
    # If the control plane runs as systemd services rather than static pods:
    systemctl status kube-apiserver
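Whichever fix applies, a quick way to confirm which cluster and context kubectl is actually pointing at:

# Show the active context and the cluster/user it maps to
kubectl config current-context
kubectl config get-contexts
kubectl config view --minify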

Helm Installation Issues

❌ Problem: Helm deployment fails

Error symptoms:

Error: failed to install chart: context deadline exceeded
Error: UPGRADE FAILED: another operation is in progress

Diagnostic steps:

# Check Helm status
helm list -n permit-platform

# Check pending releases
helm list --pending -n permit-platform

# Check namespace
kubectl get all -n permit-platform

Solutions:

  1. Rollback failed release:

    # If upgrade failed
    helm rollback permit-platform -n permit-platform

    # Or uninstall and retry
    helm uninstall permit-platform -n permit-platform
    ./scripts/install-permit-platform.sh
  2. Check resource constraints:

    # Check cluster resources
    kubectl top nodes
    kubectl describe nodes

    # Check for failed pods
    kubectl get pods -n permit-platform --field-selector=status.phase=Failed

Post-Installation Issues

Cannot Access Web Interface

❌ Problem: Cannot reach the web interface

Diagnostic steps:

  1. Check pod status:

    kubectl get pods -n permit-platform
    kubectl get ingress -n permit-platform
  2. Test internal connectivity:

    # Check if services are responding
    kubectl port-forward -n permit-platform svc/permit-frontend 3000:3000 &
    curl http://localhost:3000
  3. Check ingress controller:

    kubectl get pods -n ingress-nginx
    kubectl logs -n ingress-nginx deployment/ingress-nginx-controller
  4. Verify DNS resolution:

    # Test domain resolution
    nslookup [your-frontend-domain]

    # Check hosts file (for .local domains)
    cat /etc/hosts

Solutions:

  1. DNS issues:

    # For development (.local domains), add to hosts file
    echo "127.0.0.1 permit-frontend.local" | sudo tee -a /etc/hosts

    # For production, ensure DNS points to server IP
  2. Ingress issues:

    # Check ingress status
    kubectl describe ingress permit-frontend -n permit-platform

    # Restart ingress controller if needed
    kubectl rollout restart deployment/ingress-nginx-controller -n ingress-nginx
  3. Service issues:

    # Check if frontend pod is running
    kubectl get pods -n permit-platform -l app=permit-frontend

    # Check frontend logs
    kubectl logs -n permit-platform deployment/permit-frontend
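If DNS is not yet pointing at the cluster, you can still exercise the ingress directly by supplying the expected Host header (the domain and IP below are placeholders for your configured frontendDomain and the ingress controller's external address; the service name may differ in your ingress installation):

# Find the external address of the ingress controller
kubectl get svc -n ingress-nginx ingress-nginx-controller

# Request the frontend through the ingress, bypassing DNS with an explicit Host header
curl -k -H "Host: permit.yourcompany.com" https://<INGRESS_EXTERNAL_IP>/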

Service Startup Issues

❌ Problem: Services failing to start or crashing

Check service health:

# Check all pods status
kubectl get pods -n permit-platform

# Check specific service logs
kubectl logs -n permit-platform deployment/permit-backend-v2
kubectl logs -n permit-platform deployment/celery-general
kubectl logs -n permit-platform deployment/opal-server

# Check events for issues
kubectl get events -n permit-platform --sort-by='.lastTimestamp'

Common solutions:

  1. Restart failing services:

    # Restart specific deployment
    kubectl rollout restart deployment/permit-backend-v2 -n permit-platform

    # Restart all deployments
    kubectl rollout restart deployment -n permit-platform
  2. Check resource constraints:

    # Check node resources
    kubectl top nodes
    kubectl describe nodes

    # Check for OOMKilled pods
    kubectl get pods -n permit-platform -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}' | grep OOMKilled
  3. Scale resources if needed:

    # Scale backend replicas
    kubectl scale deployment permit-backend-v2 -n permit-platform --replicas=2

    # Scale celery workers
    kubectl scale deployment celery-general -n permit-platform --replicas=1

ImagePullBackOff Errors

❌ Problem: Pods stuck in ImagePullBackOff state - cannot pull container images

Error symptoms:

kubectl get pods -n permit-platform
NAME                    READY   STATUS             RESTARTS   AGE
permit-backend-v2-xxx   0/1     ImagePullBackOff   0          2m
permit-frontend-xxx     0/1     ImagePullBackOff   0          2m

Diagnostic steps:

# Check pod events for detailed error message
kubectl describe pod permit-backend-v2-xxx -n permit-platform

# Common error messages you'll see:
# - "Failed to pull image... unauthorized: authentication required"
# → Missing or incorrect imagePullSecrets
#
# - "Failed to pull image... not found" or "manifest unknown"
# → Images not pushed to registry, or wrong imageRegistry in values.yaml
#
# - "Failed to pull image... denied: Permission denied"
# → GKE/EKS/AKS node doesn't have permission to pull from registry

Solutions by Root Cause:

1. Missing imagePullSecrets (Artifactory/Harbor/Private Registries)

If using Artifactory, Harbor, or private Docker registry:

# Step 1: Create the Kubernetes secret
kubectl create secret docker-registry registry-credentials \
--docker-server=artifactory.company.com \
--docker-username=YOUR_USERNAME \
--docker-password=YOUR_TOKEN \
--namespace=permit-platform

# Step 2: Verify secret was created
kubectl get secret registry-credentials -n permit-platform

# Step 3: Update values.yaml
vi charts/permit-platform/values.yaml

# Add this to global section:
# global:
#   imageRegistry: "artifactory.company.com/permit-platform"
#   imagePullSecrets:
#     - registry-credentials

# Step 4: Upgrade the Helm release
helm upgrade permit-platform charts/permit-platform -n permit-platform
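To confirm the release picked up the new settings, you can inspect the values applied to the release and spot-check the images the pods reference (release and namespace names as used above):

# Show the user-supplied values for the release
helm get values permit-platform -n permit-platform

# List the images referenced by the running pods
kubectl get pods -n permit-platform -o jsonpath='{range .items[*]}{.spec.containers[0].image}{"\n"}{end}' | sort -u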

2. Forgot to Use --skip-images Flag

If you pushed images to a private registry but ran the installer WITHOUT --skip-images:

# Problem: Installer loaded wrong images from local tar files!
# Solution: Re-run installer with --skip-images flag

# For GKE:
./scripts/install-permit-platform.sh --gke --skip-images

# For other Kubernetes with private registry:
./scripts/install-permit-platform.sh --skip-images

# For OpenShift with private registry:
./scripts/install-permit-platform.sh --openshift --skip-images

3. Wrong imageRegistry Configuration

Check your values.yaml has the correct registry URL:

# View current configuration
kubectl get cm -n permit-platform

# Verify imageRegistry matches where you pushed images
cat charts/permit-platform/values.yaml | grep imageRegistry

# Should match your push command:
# If you ran: ./scripts/push-images-to-registry.sh us-central1-docker.pkg.dev/project/repo
# Then values.yaml must have: imageRegistry: "us-central1-docker.pkg.dev/project/repo"

4. Images Not Pushed to Registry

Verify images actually exist in your registry:

# For GKE/Google Artifact Registry:
gcloud artifacts docker images list us-central1-docker.pkg.dev/PROJECT/REPO

# For Artifactory:
curl -u username:password https://artifactory.company.com/v2/_catalog

# For Harbor:
curl -u username:password https://harbor.company.com/v2/_catalog

# For AWS ECR:
aws ecr describe-repositories --region us-east-1
aws ecr list-images --repository-name permit-platform --region us-east-1

If images are missing, push them:

cd permit-platform-installer
./scripts/push-images-to-registry.sh YOUR_REGISTRY_URL

5. GKE Workload Identity / IAM Permissions

For GKE with Google Artifact Registry, ensure nodes have pull permissions:

# Grant Artifact Registry Reader role to GKE service account
gcloud projects add-iam-policy-binding PROJECT_ID \
--member="serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com" \
--role="roles/artifactregistry.reader"

# Or for specific node pool service account:
gcloud projects add-iam-policy-binding PROJECT_ID \
--member="serviceAccount:SA_NAME@PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/artifactregistry.reader"

# Verify permissions
gcloud projects get-iam-policy PROJECT_ID \
--flatten="bindings[].members" \
--filter="bindings.role:roles/artifactregistry.reader"

6. EKS with ECR - Missing IAM Role

For EKS with AWS ECR:

# Verify node IAM role has ECR pull permissions
aws iam get-role --role-name YOUR_NODE_ROLE_NAME

# Add ECR read policy if missing
aws iam attach-role-policy \
--role-name YOUR_NODE_ROLE_NAME \
--policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly

7. AKS with ACR - Missing Role Assignment

For AKS with Azure Container Registry:

# Get AKS cluster identity
az aks show -g RESOURCE_GROUP -n CLUSTER_NAME --query identityProfile

# Grant AcrPull role to AKS
az aks update -g RESOURCE_GROUP -n CLUSTER_NAME --attach-acr ACR_NAME

# Verify access
az acr check-access --name ACR_NAME

Quick Verification Commands:

# Check which images are failing
kubectl get pods -n permit-platform -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].image}{"\t"}{.status.containerStatuses[*].state.waiting.reason}{"\n"}{end}' | grep ImagePull

# Check all imagePullSecrets are configured
kubectl get deployment -n permit-platform -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.template.spec.imagePullSecrets[*].name}{"\n"}{end}'

# Test pulling an image manually with a one-off pod (--dry-run=client would only validate locally without pulling)
kubectl run test-pull --image=YOUR_REGISTRY/permit-backend-v2:TAG --namespace=permit-platform --restart=Never
kubectl get pod test-pull -n permit-platform # should get past ImagePullBackOff (it may still crash without configuration)
kubectl delete pod test-pull -n permit-platform

💡 Prevention Tip: When using private registries, always follow this order:

  1. Push images to registry (push-images-to-registry.sh)
  2. Create imagePullSecrets if needed (Artifactory/Harbor only)
  3. Configure values.yaml with imageRegistry and imagePullSecrets
  4. Run installer with --skip-images flag

See Installation Guide Step 3.5 for complete workflow.
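Putting those four steps together for an Artifactory-style private registry, the workflow looks roughly like this (registry URL, credentials, and secret name are illustrative; adapt them to your environment):

# 1. Push the bundled images to your registry
./scripts/push-images-to-registry.sh artifactory.company.com/permit-platform

# 2. Create the pull secret (Artifactory/Harbor only; create the namespace first if it does not exist yet)
kubectl create secret docker-registry registry-credentials \
--docker-server=artifactory.company.com \
--docker-username=YOUR_USERNAME \
--docker-password=YOUR_TOKEN \
--namespace=permit-platform

# 3. Set imageRegistry and imagePullSecrets in charts/permit-platform/values.yaml
# (see "Missing imagePullSecrets" above for the exact keys)

# 4. Run the installer without loading the local image tarballs
./scripts/install-permit-platform.sh --skip-images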

Authentication and Login Issues

❌ Problem: Cannot login with admin credentials

Error symptoms:

  • "Invalid username or password"
  • Login form keeps reloading
  • Authentication redirect loops

Diagnostic steps:

# Check Keycloak status
kubectl get pods -n permit-platform -l app=keycloak
kubectl logs -n permit-platform deployment/keycloak

# Check backend authentication logs
kubectl logs -n permit-platform deployment/permit-backend-v2 | grep -i auth

# Verify admin password
kubectl get secret keycloak-admin-secret -n permit-platform -o jsonpath='{.data.password}' | base64 -d

Solutions:

  1. Reset admin password:

    # Get current admin password from secret
    kubectl get secret keycloak-admin-secret -n permit-platform -o jsonpath='{.data.password}' | base64 -d

    # If password doesn't work, check if Keycloak initialized properly
    kubectl logs -n permit-platform deployment/keycloak | grep -i "admin user"
  2. Check Keycloak configuration:

    # Verify Keycloak is accessible
    kubectl port-forward -n permit-platform svc/keycloak 8080:8080 &
    curl http://localhost:8080/auth/realms/permit-platform
  3. Authentication configuration issues:

    # Check backend authentication environment variables
    kubectl describe deployment permit-backend-v2 -n permit-platform | grep -A 20 Environment

    # Verify cookie configuration
    kubectl logs -n permit-platform deployment/permit-backend-v2 | grep -i cookie

Database Connection Issues

❌ Problem: Services cannot connect to database

Check database status:

# Check PostgreSQL pod
kubectl get pods -n permit-platform -l app=postgres
kubectl logs -n permit-platform deployment/postgres

# Test database connectivity from backend
kubectl exec -n permit-platform deployment/permit-backend-v2 -- psql -h postgres -U permit -d permit -c "SELECT 1;"

Solutions:

  1. Restart database:

    kubectl rollout restart deployment/postgres -n permit-platform

    # Wait for database to be ready
    kubectl wait --for=condition=ready pod -l app=postgres -n permit-platform --timeout=300s
  2. Check database initialization:

    # Check if database initialized properly
    kubectl logs -n permit-platform deployment/postgres | grep -i "database system is ready"

    # Check database size and connections
    kubectl exec -n permit-platform deployment/postgres -- psql -U permit -c "\l"
  3. Verify database secrets:

    # Check database password in secret
    kubectl get secret postgres-secret -n permit-platform -o jsonpath='{.data.password}' | base64 -d
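For a connectivity check that does not depend on the backend pod, you can run a throwaway psql client pod (this assumes the cluster can pull a public postgres image; the service, user, database, and secret names match those used above):

# Read the database password from the secret and run a one-off psql pod against the postgres service
PGPASSWORD=$(kubectl get secret postgres-secret -n permit-platform -o jsonpath='{.data.password}' | base64 -d)
kubectl run psql-test -it --rm -n permit-platform --image=postgres:16 --env="PGPASSWORD=$PGPASSWORD" -- psql -h postgres -U permit -d permit -c "SELECT version();"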

Policy Sync Issues

⚠️ Problem: Policy sync failing from Git repository

Policy Sync is required for platform operation.

Test Git connectivity:

# Check Policy Sync pod status
kubectl get pods -n permit-platform -l app=permit-policy-sync-v2

# Check Policy Sync logs
kubectl logs -n permit-platform deployment/permit-policy-sync-v2

# Verify SSH key secret
kubectl get secret policy-sync-ssh-key -n permit-platform -o jsonpath='{.data.private-key}' | base64 -d | head -1

Solutions:

  1. Verify Git credentials in secret:

    # Check if SSH key is properly configured
    kubectl describe secret policy-sync-ssh-key -n permit-platform

    # Test SSH connection manually (if possible)
    ssh -T git@github.com -i /path/to/permit-policy-key
  2. Restart Policy Sync service:

    kubectl rollout restart deployment/permit-policy-sync-v2 -n permit-platform

Log Collection and Monitoring

Collecting Diagnostic Information

For troubleshooting or support requests:

# Get all pod logs
kubectl logs -n permit-platform --all-containers=true --selector=app!=postgres > permit-platform-logs.txt

# Get pod status and descriptions
kubectl get pods -n permit-platform -o wide > pod-status.txt
kubectl describe pods -n permit-platform > pod-descriptions.txt

# Get events
kubectl get events -n permit-platform --sort-by='.lastTimestamp' > events.txt

# Get service and ingress info
kubectl get svc,ingress -n permit-platform -o yaml > networking.yaml

# Check resource usage
kubectl top pods -n permit-platform > resource-usage.txt

Performance Monitoring

Check resource consumption:

# Monitor pod resources
kubectl top pods -n permit-platform

# Monitor node resources
kubectl top nodes

# Check for resource limits being hit
kubectl describe pods -n permit-platform | grep -A 5 -B 5 "resource\|limit\|request"

Scale services if needed:

# Scale backend for more capacity
kubectl scale deployment permit-backend-v2 -n permit-platform --replicas=3

# Scale celery workers
kubectl scale deployment celery-general -n permit-platform --replicas=2

# Check horizontal pod autoscaler (if configured)
kubectl get hpa -n permit-platform

Advanced Troubleshooting

Clean Installation Reset

If you need to completely reset the installation:

# 1. Uninstall all 3 Helm releases (in reverse order)
helm uninstall permit-platform -n permit-platform 2>/dev/null || true # Platform services (35 services)
helm uninstall migrations -n permit-platform 2>/dev/null || true # Database migrations
helm uninstall third-party-services -n permit-platform 2>/dev/null || true # Infrastructure (PostgreSQL, Redis, etc.)

# 2. Delete namespace (removes all resources)
kubectl delete namespace permit-platform

# 3. Clean up any persistent volumes (careful!)
kubectl get pv | grep permit-platform # Check before deleting
kubectl delete pv $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.namespace=="permit-platform")].metadata.name}')

# 4. Re-run installation
./scripts/install-permit-platform.sh

Manual Service Recovery

If specific services are failing:

# Force delete stuck pods
kubectl delete pod <pod-name> -n permit-platform --force --grace-period=0

# Patch deployment to fix issues
kubectl patch deployment permit-backend-v2 -n permit-platform -p '{"spec":{"template":{"spec":{"containers":[{"name":"permit-backend-v2","image":"new-image-tag"}]}}}}'

# Check and fix persistent volume claims
kubectl get pvc -n permit-platform
kubectl describe pvc <pvc-name> -n permit-platform

Getting Support

When contacting support, please provide:

Required Information:

  • Installation method: Kubernetes cluster type (EKS, GKE, AKS, on-premise)
  • Platform version: From installer package filename
  • Error description: What you were trying to do and what happened
  • Timeline: When did the issue start?

Helpful Diagnostics:

# Create comprehensive diagnostic bundle
{
echo "=== CLUSTER INFO ==="
kubectl cluster-info
echo "=== NODES ==="
kubectl get nodes -o wide
echo "=== PERMIT PLATFORM PODS ==="
kubectl get pods -n permit-platform -o wide
echo "=== RECENT EVENTS ==="
kubectl get events -n permit-platform --sort-by='.lastTimestamp' | tail -20
echo "=== INGRESS STATUS ==="
kubectl get ingress -n permit-platform
echo "=== STORAGE ==="
kubectl get pv,pvc -n permit-platform
} > permit-support-info.txt

Contact Information:


Need more help? Our support team has experience with all major Kubernetes platforms and can assist with advanced troubleshooting scenarios.