Enterprise Only

This section is only relevant to Enterprise customers who acquired an on-prem license.

Troubleshooting Guide

Solutions for common issues you might encounter with Permit Platform.

Installation Issues

Frontend Domain Configuration Error

❌ Problem: "Frontend domain not configured in values.yaml"

Error message:

[ERROR] ❌ Frontend domain not configured in values.yaml
Please edit charts/permit-platform/values.yaml and replace:
frontendDomain: "CHANGEME_FRONTEND_DOMAIN"

Solution:

  1. Edit charts/permit-platform/values.yaml
  2. Replace CHANGEME_FRONTEND_DOMAIN with your actual domain:
    global:
      frontendDomain: "permit.yourcompany.com" # Your domain here
  3. Re-run the installation script
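
Before re-running the installer, it can help to confirm that no placeholder is left in the file. The following is a minimal sketch, assuming that unconfigured values all start with `CHANGEME_` (only `CHANGEME_FRONTEND_DOMAIN` appears in the error above; the function name `find_placeholders` is hypothetical and not part of the installer):

```shell
# Print every line of a values file that still contains a CHANGEME_ placeholder.
# Returns 0 when the file is clean, 1 when at least one placeholder remains.
find_placeholders() {
  file="$1"
  if grep -n 'CHANGEME_' "$file"; then
    return 1
  fi
  return 0
}
```

For example, `find_placeholders charts/permit-platform/values.yaml && ./scripts/install-permit-platform.sh` would start the installer only when the check passes.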

Docker Image Loading Failures

❌ Problem: Images fail to load during installation

Error symptoms:

[ERROR] Failed to load permit-backend-v2.tar after 3 attempts
Error loading image: docker: Error response from daemon

Diagnostic steps:

# Check Docker service status
systemctl status docker

# Check available disk space
df -h

# Check Docker daemon logs
journalctl -u docker -n 20 --no-pager

# Test Docker manually
docker pull hello-world

Solutions:

  1. Insufficient disk space:

    # Free up space
    docker system prune -f
    sudo apt-get clean
  2. Docker service issues:

    # Restart Docker
    sudo systemctl restart docker

    # Check Docker is running
    sudo systemctl status docker
  3. Permission issues:

    # Add user to docker group
    sudo usermod -aG docker $USER
    # Log out and back in, then retry
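
Once the underlying cause is fixed, a single image can be reloaded by hand. The sketch below is a generic retry wrapper; it is not the installer's own retry logic, which this guide does not show, and the function name `retry` is an assumption made for illustration:

```shell
# Run a command up to $1 times, pausing $2 seconds between attempts.
# Returns 0 as soon as the command succeeds, 1 once all attempts have failed.
retry() {
  attempts="$1"; delay="$2"; shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "[ERROR] '$*' failed after $attempts attempts" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}
```

A possible invocation is `retry 3 5 docker load -i permit-backend-v2.tar`, using the archive name from the error message above.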

Kubernetes Access Issues

❌ Problem: "Unable to connect to Kubernetes cluster"

Error symptoms:

[ERROR] Kubernetes cluster not accessible
error: couldn't get current server API group list

Diagnostic steps:

# Check kubectl configuration
kubectl cluster-info

# Check kubeconfig
echo $KUBECONFIG
ls -la ~/.kube/config

# Test basic connectivity
kubectl get nodes

Solutions:

  1. Configure kubectl:

    # Set kubeconfig if not configured
    export KUBECONFIG=/path/to/your/kubeconfig

    # Or copy config to default location
    mkdir -p ~/.kube
    cp /path/to/kubeconfig ~/.kube/config
  2. Check cluster status:

    # For managed clusters (EKS, GKE, AKS)
    # Ensure cluster is running and accessible

    # For on-premise clusters
    systemctl status kubelet
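    # Only applies when the API server runs as a systemd unit;
    # on kubeadm-based clusters it runs as a static pod instead
    systemctl status kube-apiserver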
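
kubectl reads every colon-separated path in `KUBECONFIG`, and falls back to `~/.kube/config` when the variable is unset or empty. A small sketch that reports which of those files actually exist (the function name `kubeconfig_report` is hypothetical):

```shell
# List the kubeconfig files kubectl would consult and mark each one
# "ok" or "missing". Runs in a subshell so the IFS change does not leak.
kubeconfig_report() (
  paths="${KUBECONFIG:-$HOME/.kube/config}"
  IFS=':'
  set -f  # no glob expansion while splitting the path list
  for p in $paths; do
    if [ -f "$p" ]; then
      echo "ok $p"
    else
      echo "missing $p"
    fi
  done
)
```

A `missing` line for the only configured path usually explains a "cluster not accessible" error before any network troubleshooting is needed.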

Helm Installation Issues

❌ Problem: Helm deployment fails

Error symptoms:

Error: failed to install chart: context deadline exceeded
Error: UPGRADE FAILED: another operation is in progress

Diagnostic steps:

# Check Helm status
helm list -n permit-platform

# Check pending releases
helm list --pending -n permit-platform

# Check namespace
kubectl get all -n permit-platform

Solutions:

  1. Rollback failed release:

    # If upgrade failed
    helm rollback permit-platform -n permit-platform

    # Or uninstall and retry
    helm uninstall permit-platform -n permit-platform
    ./scripts/install-permit-platform.sh
  2. Check resource constraints:

    # Check cluster resources
    kubectl top nodes
    kubectl describe nodes

    # Check for failed pods
    kubectl get pods -n permit-platform --field-selector=status.phase=Failed
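
The release state reported by `helm status` largely determines which of the steps above applies. The following sketch reads the text output of `helm status` and prints a suggestion; the mapping from state to action is an assumption based on the solutions listed here, and `suggest_helm_action` is a hypothetical name:

```shell
# Read `helm status <release> -n <namespace>` text output on stdin
# and print the release state together with a suggested next step.
suggest_helm_action() {
  status=$(awk '/^STATUS:/ { print $2; exit }')
  case "$status" in
    deployed)  echo "deployed: no action needed" ;;
    pending-*) echo "$status: an earlier operation did not finish; roll back or uninstall" ;;
    failed)    echo "failed: roll back, or uninstall and re-run the installer" ;;
    "")        echo "unknown: no STATUS line found" ;;
    *)         echo "$status: inspect the release with helm history" ;;
  esac
}
```

It could be used as `helm status permit-platform -n permit-platform | suggest_helm_action`.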

Post-Installation Issues

Cannot Access Web Interface

❌ Problem: Cannot reach the web interface

Diagnostic steps:

  1. Check pod status:

    kubectl get pods -n permit-platform
    kubectl get ingress -n permit-platform
  2. Test internal connectivity:

    # Check if services are responding
    kubectl port-forward -n permit-platform svc/permit-frontend 3000:3000 &
    curl http://localhost:3000
  3. Check ingress controller:

    kubectl get pods -n ingress-nginx
    kubectl logs -n ingress-nginx deployment/ingress-nginx-controller
  4. Verify DNS resolution:

    # Test domain resolution
    nslookup [your-frontend-domain]

    # Check hosts file (for .local domains)
    cat /etc/hosts

Solutions:

  1. DNS issues:

    # For development (.local domains), add to hosts file
    echo "127.0.0.1 permit-frontend.local" | sudo tee -a /etc/hosts

    # For production, ensure DNS points to server IP
  2. Ingress issues:

    # Check ingress status
    kubectl describe ingress permit-frontend -n permit-platform

    # Restart ingress controller if needed
    kubectl rollout restart deployment/ingress-nginx-controller -n ingress-nginx
  3. Service issues:

    # Check if frontend pod is running
    kubectl get pods -n permit-platform -l app=permit-frontend

    # Check frontend logs
    kubectl logs -n permit-platform deployment/permit-frontend
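
Running the `tee -a` command from the DNS solution more than once leaves duplicate lines in the hosts file. A minimal sketch of an idempotent variant follows; the function name `ensure_hosts_entry` is hypothetical, and the third argument exists only so the function can be tried on a scratch file first:

```shell
# Append "IP HOST" to a hosts file unless HOST is already mapped on a
# non-comment line. The file defaults to /etc/hosts.
ensure_hosts_entry() {
  ip="$1"; host="$2"; file="${3:-/etc/hosts}"
  if awk -v h="$host" '
       !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
       END { exit !found }' "$file"; then
    echo "$host already present in $file"
  else
    printf '%s %s\n' "$ip" "$host" >> "$file"
  fi
}
```

Writing to `/etc/hosts` itself requires root, so the function would have to run in a root shell for the real file.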

Service Startup Issues

❌ Problem: Services failing to start or crashing

Check service health:

# Check all pods status
kubectl get pods -n permit-platform

# Check specific service logs
kubectl logs -n permit-platform deployment/permit-backend-v2
kubectl logs -n permit-platform deployment/celery-general
kubectl logs -n permit-platform deployment/opal-server

# Check events for issues
kubectl get events -n permit-platform --sort-by='.lastTimestamp'

Common solutions:

  1. Restart failing services:

    # Restart specific deployment
    kubectl rollout restart deployment/permit-backend-v2 -n permit-platform

    # Restart all deployments
    kubectl rollout restart deployment -n permit-platform
  2. Check resource constraints:

    # Check node resources
    kubectl top nodes
    kubectl describe nodes

    # Check for OOMKilled pods
    kubectl get pods -n permit-platform -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}' | grep OOMKilled
  3. Scale resources if needed:

    # Scale backend replicas
    kubectl scale deployment permit-backend-v2 -n permit-platform --replicas=2

    # Scale celery workers
    kubectl scale deployment celery-general -n permit-platform --replicas=1
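
With many pods in the namespace, the full `kubectl get pods` listing is hard to scan. The sketch below filters it down to pods that need attention, assuming the default column layout (NAME, READY, STATUS, ...); `unhealthy_pods` is a hypothetical helper name:

```shell
# Read `kubectl get pods --no-headers` output on stdin and print
# name, status and ready count for every pod that is not Completed
# and is either not Running or has fewer ready containers than expected.
unhealthy_pods() {
  awk '$3 == "Completed" { next }
       { split($2, r, "/") }
       $3 != "Running" || r[1] != r[2] { print $1 "\t" $3 "\t" $2 }'
}
```

For example: `kubectl get pods -n permit-platform --no-headers | unhealthy_pods`. Empty output means every pod is either fully ready or has completed.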

Authentication and Login Issues

❌ Problem: Cannot login with admin credentials

Error symptoms:

  • "Invalid username or password"
  • Login form keeps reloading
  • Authentication redirect loops

Diagnostic steps:

# Check Keycloak status
kubectl get pods -n permit-platform -l app=keycloak
kubectl logs -n permit-platform deployment/keycloak

# Check backend authentication logs
kubectl logs -n permit-platform deployment/permit-backend-v2 | grep -i auth

# Verify admin password
kubectl get secret keycloak-admin-secret -n permit-platform -o jsonpath='{.data.password}' | base64 -d

Solutions:

  1. Verify the admin password:

    # Get current admin password from secret
    kubectl get secret keycloak-admin-secret -n permit-platform -o jsonpath='{.data.password}' | base64 -d

    # If password doesn't work, check if Keycloak initialized properly
    kubectl logs -n permit-platform deployment/keycloak | grep -i "admin user"
  2. Check Keycloak configuration:

    # Verify Keycloak is accessible
    kubectl port-forward -n permit-platform svc/keycloak 8080:8080 &
    curl http://localhost:8080/auth/realms/permit-platform
  3. Authentication configuration issues:

    # Check backend authentication environment variables
    kubectl describe deployment permit-backend-v2 -n permit-platform | grep -A 20 Environment

    # Verify cookie configuration
    kubectl logs -n permit-platform deployment/permit-backend-v2 | grep -i cookie
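
The realm URL in step 2 uses the `/auth` prefix. Newer Keycloak releases (the Quarkus-based distribution) serve realms without that prefix by default, and this guide does not state which Keycloak version the platform ships. As a hedged sketch, the helper below prints both candidate URLs so each can be probed; `keycloak_realm_urls` is a hypothetical name:

```shell
# Print both realm URL variants for a Keycloak base URL:
# the legacy /auth-prefixed path and the unprefixed path.
keycloak_realm_urls() {
  base="${1%/}"; realm="$2"   # strip one trailing slash from the base URL
  echo "$base/auth/realms/$realm"
  echo "$base/realms/$realm"
}
```

Each printed URL can then be checked with `curl -s -o /dev/null -w '%{http_code}\n' "$url"` while the port-forward from step 2 is active; a 200 response identifies the path that the deployment actually serves.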

Database Connection Issues

❌ Problem: Services cannot connect to database

Check database status:

# Check PostgreSQL pod
kubectl get pods -n permit-platform -l app=postgres
kubectl logs -n permit-platform deployment/postgres

# Test database connectivity from backend
kubectl exec -n permit-platform deployment/permit-backend-v2 -- psql -h postgres -U permit -d permit -c "SELECT 1;"

Solutions:

  1. Restart database:

    kubectl rollout restart deployment/postgres -n permit-platform

    # Wait for database to be ready
    kubectl wait --for=condition=ready pod -l app=postgres -n permit-platform --timeout=300s
  2. Check database initialization:

    # Check if database initialized properly
    kubectl logs -n permit-platform deployment/postgres | grep -i "database system is ready"

    # List databases (use "\l+" to include sizes)
    kubectl exec -n permit-platform deployment/postgres -- psql -U permit -c "\l"
  3. Verify database secrets:

    # Check database password in secret
    kubectl get secret postgres-secret -n permit-platform -o jsonpath='{.data.password}' | base64 -d
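
Decoding a secret prints the password in clear text, which may end up in terminal scrollback or shared screens. When the goal is only to confirm that the secret is populated, a sketch like the following reports the decoded length instead (`secret_field_length` is a hypothetical helper, not part of the platform tooling):

```shell
# Print the decoded byte length of one field of a Kubernetes secret
# without writing the value itself to the terminal.
secret_field_length() {
  ns="$1"; secret="$2"; field="$3"
  kubectl get secret "$secret" -n "$ns" -o "jsonpath={.data.$field}" \
    | base64 -d | wc -c | tr -d ' '
}
```

For example, `secret_field_length permit-platform postgres-secret password` prints `0` when the field is empty or absent.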

Policy Sync Issues

⚠️ Problem: Policy sync failing from Git repository

Policy Sync is required for platform operation.

Test Git connectivity:

# Check Policy Sync pod status
kubectl get pods -n permit-platform -l app=permit-policy-sync-v2

# Check Policy Sync logs
kubectl logs -n permit-platform deployment/permit-policy-sync-v2

# Verify SSH key secret
kubectl get secret policy-sync-ssh-key -n permit-platform -o jsonpath='{.data.private-key}' | base64 -d | head -1

Solutions:

  1. Verify Git credentials in secret:

    # Check if SSH key is properly configured
    kubectl describe secret policy-sync-ssh-key -n permit-platform

    # Test SSH connection manually (if possible)
    ssh -i /path/to/permit-policy-key -T git@github.com
  2. Restart Policy Sync service:

    kubectl rollout restart deployment/permit-policy-sync-v2 -n permit-platform
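
A common cause of sync failures is a secret that holds the public key, or a key with damaged line endings, instead of the private key. The sketch below checks only the first line of the key material; it assumes a PEM or OpenSSH formatted key, and `is_private_key` is a hypothetical name:

```shell
# Read key material on stdin and succeed only if the first line is a
# private-key header such as "-----BEGIN OPENSSH PRIVATE KEY-----".
is_private_key() {
  first_line=$(head -n 1)
  case "$first_line" in
    "-----BEGIN "*"PRIVATE KEY-----") return 0 ;;
    *) return 1 ;;
  esac
}
```

It can be fed from the secret shown above: `kubectl get secret policy-sync-ssh-key -n permit-platform -o jsonpath='{.data.private-key}' | base64 -d | is_private_key && echo "header looks valid"`.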

Log Collection and Monitoring

Collecting Diagnostic Information

For troubleshooting or support requests:

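# Get all pod logs except PostgreSQL
# (with a selector, kubectl returns only the last 10 lines per pod unless --tail is set)
kubectl logs -n permit-platform --all-containers=true --selector='app!=postgres' --tail=-1 > permit-platform-logs.txt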

# Get pod status and descriptions
kubectl get pods -n permit-platform -o wide > pod-status.txt
kubectl describe pods -n permit-platform > pod-descriptions.txt

# Get events
kubectl get events -n permit-platform --sort-by='.lastTimestamp' > events.txt

# Get service and ingress info
kubectl get svc,ingress -n permit-platform -o yaml > networking.yaml

# Check resource usage
kubectl top pods -n permit-platform > resource-usage.txt
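
A single combined log file interleaves output from every service. As an alternative, the following sketch writes one file per pod; the function name `collect_pod_logs` and the output directory layout are assumptions made for illustration:

```shell
# Write the full log of every pod in a namespace to <outdir>/<pod>.log.
# --prefix tags each line with its pod and container name.
collect_pod_logs() {
  ns="$1"; outdir="$2"
  mkdir -p "$outdir"
  for pod in $(kubectl get pods -n "$ns" -o name); do
    kubectl logs -n "$ns" "$pod" --all-containers=true --prefix \
      > "$outdir/${pod#pod/}.log" 2>&1
  done
}
```

Calling `collect_pod_logs permit-platform ./permit-logs` produces a directory that can be archived and attached to a support request.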

Performance Monitoring

Check resource consumption:

# Monitor pod resources
kubectl top pods -n permit-platform

# Monitor node resources
kubectl top nodes

# Check for resource limits being hit
kubectl describe pods -n permit-platform | grep -A 5 -B 5 "resource\|limit\|request"

Scale services if needed:

# Scale backend for more capacity
kubectl scale deployment permit-backend-v2 -n permit-platform --replicas=3

# Scale celery workers
kubectl scale deployment celery-general -n permit-platform --replicas=2

# Check horizontal pod autoscaler (if configured)
kubectl get hpa -n permit-platform

Advanced Troubleshooting

Clean Installation Reset

If you need to completely reset the installation:

# 1. Uninstall all 3 Helm releases (in reverse order)
helm uninstall permit-platform -n permit-platform 2>/dev/null || true # Platform services (35 services)
helm uninstall migrations -n permit-platform 2>/dev/null || true # Database migrations
helm uninstall third-party-services -n permit-platform 2>/dev/null || true # Infrastructure (PostgreSQL, Redis, etc.)

# 2. Delete namespace (removes all resources)
kubectl delete namespace permit-platform

# 3. Clean up any persistent volumes (careful!)
kubectl get pv | grep permit-platform # Check before deleting
kubectl delete pv $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.namespace=="permit-platform")].metadata.name}')

# 4. Re-run installation
./scripts/install-permit-platform.sh
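
Step 3 deletes volumes in one command and fails with an error when no matching volume exists. A more cautious sketch is shown below: it previews by default and deletes only when `CONFIRM=yes` is set. Both the function name `cleanup_pvs` and the `CONFIRM` variable are hypothetical conventions, not part of the installer:

```shell
# List persistent volumes whose claim belongs to the given namespace.
# Deletes them only when CONFIRM=yes; otherwise prints what would be deleted.
cleanup_pvs() {
  ns="$1"
  pvs=$(kubectl get pv -o jsonpath="{.items[?(@.spec.claimRef.namespace==\"$ns\")].metadata.name}")
  if [ -z "$pvs" ]; then
    echo "no persistent volumes reference namespace $ns"
    return 0
  fi
  for pv in $pvs; do
    if [ "${CONFIRM:-no}" = "yes" ]; then
      kubectl delete pv "$pv"
    else
      echo "would delete pv/$pv"
    fi
  done
}
```

Running `cleanup_pvs permit-platform` first shows the affected volumes; `CONFIRM=yes cleanup_pvs permit-platform` then removes them. Deleting a volume is irreversible, so a database backup is worth taking beforehand if the data matters.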

Manual Service Recovery

If specific services are failing:

# Force delete stuck pods
kubectl delete pod <pod-name> -n permit-platform --force --grace-period=0

# Point a deployment at a different image (replace new-image-tag with the full image reference)
kubectl patch deployment permit-backend-v2 -n permit-platform -p '{"spec":{"template":{"spec":{"containers":[{"name":"permit-backend-v2","image":"new-image-tag"}]}}}}'

# Inspect persistent volume claims (look for Pending status or binding errors)
kubectl get pvc -n permit-platform
kubectl describe pvc <pvc-name> -n permit-platform

Getting Support

When contacting support, please provide:

Required Information:

  • Installation method: Kubernetes cluster type (EKS, GKE, AKS, on-premise)
  • Platform version: From installer package filename
  • Error description: What you were trying to do and what happened
  • Timeline: When did the issue start?

Helpful Diagnostics:

# Create comprehensive diagnostic bundle
{
echo "=== CLUSTER INFO ==="
kubectl cluster-info
echo "=== NODES ==="
kubectl get nodes -o wide
echo "=== PERMIT PLATFORM PODS ==="
kubectl get pods -n permit-platform -o wide
echo "=== RECENT EVENTS ==="
kubectl get events -n permit-platform --sort-by='.lastTimestamp' | tail -20
echo "=== INGRESS STATUS ==="
kubectl get ingress -n permit-platform
echo "=== STORAGE ==="
kubectl get pv,pvc -n permit-platform
} > permit-support-info.txt

Contact Information:


Need more help? Our support team has experience with all major Kubernetes platforms and can assist with advanced troubleshooting scenarios.