This section is only relevant to Enterprise customers who acquired an on-prem license.
Troubleshooting Guide
Solutions for common issues you might encounter with Permit Platform.
Installation Issues
Frontend Domain Configuration Error
❌ Problem: "Frontend domain not configured in values.yaml"
Error message:
[ERROR] ❌ Frontend domain not configured in values.yaml
Please edit charts/permit-platform/values.yaml and replace:
frontendDomain: "CHANGEME_FRONTEND_DOMAIN"
Solution:
- Edit charts/permit-platform/values.yaml
- Replace CHANGEME_FRONTEND_DOMAIN with your actual domain:
global:
  frontendDomain: "permit.yourcompany.com"  # Your domain here
- Re-run the installation script
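Before re-running the installer, you can quickly confirm the placeholder was actually replaced; a minimal check, assuming the default chart path shown above:
# Should print your real domain, not the CHANGEME placeholder
grep -n "frontendDomain" charts/permit-platform/values.yaml
# If this still finds a match, the file was not updated
grep -c "CHANGEME_FRONTEND_DOMAIN" charts/permit-platform/values.yaml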
Docker Image Loading Failures
❌ Problem: Images fail to load during installation
Error symptoms:
[ERROR] Failed to load permit-backend-v2.tar after 3 attempts
Error loading image: docker: Error response from daemon
Diagnostic steps:
# Check Docker service status
systemctl status docker
# Check available disk space
df -h
# Check Docker daemon logs
journalctl -u docker --tail=20
# Test Docker manually
docker pull hello-world
Solutions:
- Insufficient disk space:
# Free up space
docker system prune -f
sudo apt-get clean
- Docker service issues:
# Restart Docker
sudo systemctl restart docker
# Check Docker is running
sudo systemctl status docker
- Permission issues:
# Add user to docker group
sudo usermod -aG docker $USER
# Log out and back in, then retry
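If one specific archive keeps failing, loading it manually can surface the exact daemon error; a short sketch, assuming the image tarballs sit in an images/ directory of the installer package (adjust the path to your layout):
# Verify the archive is present and not truncated
ls -lh images/permit-backend-v2.tar
# Load the image manually and watch the error output directly
docker load -i images/permit-backend-v2.tar
# Confirm the image is now available locally
docker images | grep permit-backend-v2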
Kubernetes Access Issues
❌ Problem: "Unable to connect to Kubernetes cluster"
Error symptoms:
[ERROR] Kubernetes cluster not accessible
error: couldn't get current server API group list
Diagnostic steps:
# Check kubectl configuration
kubectl cluster-info
# Check kubeconfig
echo $KUBECONFIG
ls -la ~/.kube/config
# Test basic connectivity
kubectl get nodes
Solutions:
- Configure kubectl:
# Set kubeconfig if not configured
export KUBECONFIG=/path/to/your/kubeconfig
# Or copy config to default location
mkdir -p ~/.kube
cp /path/to/kubeconfig ~/.kube/config
- Check cluster status:
# For managed clusters (EKS, GKE, AKS)
# Ensure cluster is running and accessible
# For on-premise clusters
systemctl status kubelet
systemctl status kube-apiserver
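If kubectl is configured but pointing at the wrong cluster, checking and switching contexts is often enough; a minimal sketch using standard kubectl commands (the context name is an example):
# List available contexts and see which one is active
kubectl config get-contexts
# Switch to the context for the target cluster
kubectl config use-context my-permit-cluster
# Re-test connectivity
kubectl get nodes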
Helm Installation Issues
❌ Problem: Helm deployment fails
Error symptoms:
Error: failed to install chart: context deadline exceeded
Error: UPGRADE FAILED: another operation is in progress
Diagnostic steps:
# Check Helm status
helm list -n permit-platform
# Check pending releases
helm list --pending -n permit-platform
# Check namespace
kubectl get all -n permit-platform
Solutions:
- Rollback failed release:
# If upgrade failed
helm rollback permit-platform -n permit-platform
# Or uninstall and retry
helm uninstall permit-platform -n permit-platform
./scripts/install-permit-platform.sh
- Check resource constraints:
# Check cluster resources
kubectl top nodes
kubectl describe nodes
# Check for failed pods
kubectl get pods -n permit-platform --field-selector=status.phase=Failed
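For the "another operation is in progress" error specifically, inspecting the release history and rolling back to the last successfully deployed revision usually clears the stuck state; a minimal sketch using standard Helm commands (the revision number is an example):
# Show release history and note the last revision marked "deployed"
helm history permit-platform -n permit-platform
# Roll back to that revision (replace 3 with the revision you found)
helm rollback permit-platform 3 -n permit-platform
# Confirm the release is no longer pending
helm list -n permit-platform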
Post-Installation Issues
Cannot Access Web Interface
❌ Problem: Cannot reach the web interface
Diagnostic steps:
- Check pod status:
kubectl get pods -n permit-platform
kubectl get ingress -n permit-platform
- Test internal connectivity:
# Check if services are responding
kubectl port-forward -n permit-platform svc/permit-frontend 3000:3000 &
curl http://localhost:3000
- Check ingress controller:
kubectl get pods -n ingress-nginx
kubectl logs -n ingress-nginx deployment/ingress-nginx-controller
- Verify DNS resolution:
# Test domain resolution
nslookup [your-frontend-domain]
# Check hosts file (for .local domains)
cat /etc/hosts
Solutions:
- DNS issues:
# For development (.local domains), add to hosts file
echo "127.0.0.1 permit-frontend.local" | sudo tee -a /etc/hosts
# For production, ensure DNS points to server IP
- Ingress issues:
# Check ingress status
kubectl describe ingress permit-frontend -n permit-platform
# Restart ingress controller if needed
kubectl rollout restart deployment/ingress-nginx-controller -n ingress-nginx
- Service issues:
# Check if frontend pod is running
kubectl get pods -n permit-platform -l app=permit-frontend
# Check frontend logs
kubectl logs -n permit-platform deployment/permit-frontend
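To separate DNS problems from ingress problems, you can point curl directly at the ingress controller's external IP while keeping the expected Host header; a sketch, assuming permit.yourcompany.com as the configured frontend domain (replace the IP and domain with your own values):
# Find the ingress controller's external IP
kubectl get svc -n ingress-nginx ingress-nginx-controller
# Hit the ingress directly, bypassing DNS
curl -vk --resolve permit.yourcompany.com:443:203.0.113.10 https://permit.yourcompany.com/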
Service Startup Issues
❌ Problem: Services failing to start or crashing
Check service health:
# Check all pods status
kubectl get pods -n permit-platform
# Check specific service logs
kubectl logs -n permit-platform deployment/permit-backend-v2
kubectl logs -n permit-platform deployment/celery-general
kubectl logs -n permit-platform deployment/opal-server
# Check events for issues
kubectl get events -n permit-platform --sort-by='.lastTimestamp'
Common solutions:
- Restart failing services:
# Restart specific deployment
kubectl rollout restart deployment/permit-backend-v2 -n permit-platform
# Restart all deployments
kubectl rollout restart deployment -n permit-platform
- Check resource constraints:
# Check node resources
kubectl top nodes
kubectl describe nodes
# Check for OOMKilled pods
kubectl get pods -n permit-platform -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}' | grep OOMKilled
- Scale resources if needed:
# Scale backend replicas
kubectl scale deployment permit-backend-v2 -n permit-platform --replicas=2
# Scale celery workers
kubectl scale deployment celery-general -n permit-platform --replicas=1
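If pods are being OOMKilled, raising the memory limit on the affected deployment is usually the fix; a hedged example, where the 1Gi/2Gi values are placeholders and should follow your sizing guidance:
# Inspect the current requests/limits on the backend deployment
kubectl get deployment permit-backend-v2 -n permit-platform -o jsonpath='{.spec.template.spec.containers[0].resources}'
# Raise the memory limit (values shown are only examples)
kubectl set resources deployment permit-backend-v2 -n permit-platform --requests=memory=1Gi --limits=memory=2Gi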
ImagePullBackOff Errors
❌ Problem: Pods stuck in ImagePullBackOff state - cannot pull container images
Error symptoms:
kubectl get pods -n permit-platform
NAME READY STATUS RESTARTS AGE
permit-backend-v2-xxx 0/1 ImagePullBackOff 0 2m
permit-frontend-xxx 0/1 ImagePullBackOff 0 2m
Diagnostic steps:
# Check pod events for detailed error message
kubectl describe pod permit-backend-v2-xxx -n permit-platform
# Common error messages you'll see:
# - "Failed to pull image... unauthorized: authentication required"
# → Missing or incorrect imagePullSecrets
#
# - "Failed to pull image... not found" or "manifest unknown"
# → Images not pushed to registry, or wrong imageRegistry in values.yaml
#
# - "Failed to pull image... denied: Permission denied"
# → GKE/EKS/AKS node doesn't have permission to pull from registry
Solutions by Root Cause:
1. Missing imagePullSecrets (Artifactory/Harbor/Private Registries)
If you are using Artifactory, Harbor, or another private Docker registry:
# Step 1: Create the Kubernetes secret
kubectl create secret docker-registry registry-credentials \
--docker-server=artifactory.company.com \
--docker-username=YOUR_USERNAME \
--docker-password=YOUR_TOKEN \
--namespace=permit-platform
# Step 2: Verify secret was created
kubectl get secret registry-credentials -n permit-platform
# Step 3: Update values.yaml
vi charts/permit-platform/values.yaml
# Add this to global section:
# global:
# imageRegistry: "artifactory.company.com/permit-platform"
# imagePullSecrets:
# - registry-credentials
# Step 4: Upgrade the Helm release
helm upgrade permit-platform charts/permit-platform -n permit-platform
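As an alternative to listing imagePullSecrets in values.yaml, the secret can be attached to the namespace's default service account so that every pod in the namespace picks it up; a sketch, assuming the platform pods run under the default service account:
# Attach the pull secret to the default service account in the namespace
kubectl patch serviceaccount default -n permit-platform -p '{"imagePullSecrets": [{"name": "registry-credentials"}]}'
# Restart deployments so new pods are created with the secret attached
kubectl rollout restart deployment -n permit-platform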
2. Forgot to Use --skip-images Flag
If you pushed images to a private registry but ran the installer WITHOUT --skip-images:
# Problem: Installer loaded wrong images from local tar files!
# Solution: Re-run installer with --skip-images flag
# For GKE:
./scripts/install-permit-platform.sh --gke --skip-images
# For other Kubernetes with private registry:
./scripts/install-permit-platform.sh --skip-images
# For OpenShift with private registry:
./scripts/install-permit-platform.sh --openshift --skip-images
3. Wrong imageRegistry Configuration
Check your values.yaml has the correct registry URL:
# View current configuration
kubectl get cm -n permit-platform
# Verify imageRegistry matches where you pushed images
cat charts/permit-platform/values.yaml | grep imageRegistry
# Should match your push command:
# If you ran: ./scripts/push-images-to-registry.sh us-central1-docker.pkg.dev/project/repo
# Then values.yaml must have: imageRegistry: "us-central1-docker.pkg.dev/project/repo"
4. Images Not Pushed to Registry
Verify images actually exist in your registry:
# For GKE/Google Artifact Registry:
gcloud artifacts docker images list us-central1-docker.pkg.dev/PROJECT/REPO
# For Artifactory:
curl -u username:password https://artifactory.company.com/v2/_catalog
# For Harbor:
curl -u username:password https://harbor.company.com/v2/_catalog
# For AWS ECR:
aws ecr describe-repositories --region us-east-1
aws ecr list-images --repository-name permit-platform --region us-east-1
If images are missing, push them:
cd permit-platform-installer
./scripts/push-images-to-registry.sh YOUR_REGISTRY_URL
5. GKE Workload Identity / IAM Permissions
For GKE with Google Artifact Registry, ensure nodes have pull permissions:
# Grant Artifact Registry Reader role to GKE service account
gcloud projects add-iam-policy-binding PROJECT_ID \
--member="serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com" \
--role="roles/artifactregistry.reader"
# Or for specific node pool service account:
gcloud projects add-iam-policy-binding PROJECT_ID \
--member="serviceAccount:SA_NAME@PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/artifactregistry.reader"
# Verify permissions
gcloud projects get-iam-policy PROJECT_ID \
--flatten="bindings[].members" \
--filter="bindings.role:roles/artifactregistry.reader"
6. EKS with ECR - Missing IAM Role
For EKS with AWS ECR:
# Verify node IAM role has ECR pull permissions
aws iam get-role --role-name YOUR_NODE_ROLE_NAME
# Add ECR read policy if missing
aws iam attach-role-policy \
--role-name YOUR_NODE_ROLE_NAME \
--policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
7. AKS with ACR - Missing Role Assignment
For AKS with Azure Container Registry:
# Get AKS cluster identity
az aks show -g RESOURCE_GROUP -n CLUSTER_NAME --query identityProfile
# Grant AcrPull role to AKS
az aks update -g RESOURCE_GROUP -n CLUSTER_NAME --attach-acr ACR_NAME
# Verify access
az acr check-access --name ACR_NAME
Quick Verification Commands:
# Check which images are failing
kubectl get pods -n permit-platform -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].image}{"\t"}{.status.containerStatuses[*].state.waiting.reason}{"\n"}{end}' | grep ImagePull
# Check all imagePullSecrets are configured
kubectl get deployment -n permit-platform -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.template.spec.imagePullSecrets[*].name}{"\n"}{end}'
# Test pulling an image by starting a throwaway pod (note: --dry-run=client only validates locally and does not pull)
kubectl run test-pull --image=YOUR_REGISTRY/permit-backend-v2:TAG --namespace=permit-platform --restart=Never
# Clean up the test pod afterwards
kubectl delete pod test-pull -n permit-platform
💡 Prevention Tip: When using private registries, always follow this order:
- 1. Push images to registry (push-images-to-registry.sh)
- 2. Create imagePullSecrets if needed (Artifactory/Harbor only)
- 3. Configure values.yaml with imageRegistry and imagePullSecrets
- 4. Run installer with the --skip-images flag
See Installation Guide Step 3.5 for the complete workflow.
Authentication and Login Issues
❌ Problem: Cannot login with admin credentials
Error symptoms:
- "Invalid username or password"
- Login form keeps reloading
- Authentication redirect loops
Diagnostic steps:
# Check Keycloak status
kubectl get pods -n permit-platform -l app=keycloak
kubectl logs -n permit-platform deployment/keycloak
# Check backend authentication logs
kubectl logs -n permit-platform deployment/permit-backend-v2 | grep -i auth
# Verify admin password
kubectl get secret keycloak-admin-secret -n permit-platform -o jsonpath='{.data.password}' | base64 -d
Solutions:
- Reset admin password:
# Get current admin password from secret
kubectl get secret keycloak-admin-secret -n permit-platform -o jsonpath='{.data.password}' | base64 -d
# If password doesn't work, check if Keycloak initialized properly
kubectl logs -n permit-platform deployment/keycloak | grep -i "admin user"
- Check Keycloak configuration:
# Verify Keycloak is accessible
kubectl port-forward -n permit-platform svc/keycloak 8080:8080 &
curl http://localhost:8080/auth/realms/permit-platform
- Authentication configuration issues:
# Check backend authentication environment variables
kubectl describe deployment permit-backend-v2 -n permit-platform | grep -A 20 Environment
# Verify cookie configuration
kubectl logs -n permit-platform deployment/permit-backend-v2 | grep -i cookie
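It can also help to confirm the backend is pointed at the expected Keycloak URL and realm; a rough check, assuming the relevant settings are exposed as environment variables containing "KEYCLOAK" (variable names may differ between releases):
# Dump Keycloak-related environment variables from the backend container
kubectl exec -n permit-platform deployment/permit-backend-v2 -- env | grep -i keycloak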
Database Connection Issues
❌ Problem: Services cannot connect to database
Check database status:
# Check PostgreSQL pod
kubectl get pods -n permit-platform -l app=postgres
kubectl logs -n permit-platform deployment/postgres
# Test database connectivity from backend
kubectl exec -n permit-platform deployment/permit-backend-v2 -- psql -h postgres -U permit -d permit -c "SELECT 1;"
Solutions:
- Restart database:
kubectl rollout restart deployment/postgres -n permit-platform
# Wait for database to be ready
kubectl wait --for=condition=ready pod -l app=postgres -n permit-platform --timeout=300s
- Check database initialization:
# Check if database initialized properly
kubectl logs -n permit-platform deployment/postgres | grep -i "database system is ready"
# Check database size and connections
kubectl exec -n permit-platform deployment/postgres -- psql -U permit -c "\l"
- Verify database secrets:
# Check database password in secret
kubectl get secret postgres-secret -n permit-platform -o jsonpath='{.data.password}' | base64 -d
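If the database is up but connections still fail intermittently, it is worth checking whether the connection limit is being exhausted; a short sketch using plain psql queries inside the postgres pod:
# Count current connections
kubectl exec -n permit-platform deployment/postgres -- psql -U permit -c "SELECT count(*) FROM pg_stat_activity;"
# Compare against the configured maximum
kubectl exec -n permit-platform deployment/postgres -- psql -U permit -c "SHOW max_connections;"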
Policy Sync Issues
⚠️ Problem: Policy sync failing from Git repository
Policy Sync is required for platform operation.
Test Git connectivity:
# Check Policy Sync pod status
kubectl get pods -n permit-platform -l app=permit-policy-sync-v2
# Check Policy Sync logs
kubectl logs -n permit-platform deployment/permit-policy-sync-v2
# Verify SSH key secret
kubectl get secret policy-sync-ssh-key -n permit-platform -o jsonpath='{.data.private-key}' | base64 -d | head -1
Solutions:
- Verify Git credentials in secret:
# Check if SSH key is properly configured
kubectl describe secret policy-sync-ssh-key -n permit-platform
# Test SSH connection manually (if possible)
ssh -T git@github.com -i /path/to/permit-policy-key
- Restart Policy Sync service:
kubectl rollout restart deployment/permit-policy-sync-v2 -n permit-platform
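If the SSH key in the secret is wrong or has been rotated, recreating the secret and restarting the service is the simplest recovery; a sketch, assuming the secret stores the key under the private-key field as shown above (adjust the local key path):
# Recreate the secret from a local key file (idempotent apply)
kubectl create secret generic policy-sync-ssh-key --from-file=private-key=/path/to/permit-policy-key -n permit-platform --dry-run=client -o yaml | kubectl apply -f -
# Restart Policy Sync to pick up the new key
kubectl rollout restart deployment/permit-policy-sync-v2 -n permit-platform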
Log Collection and Monitoring
Collecting Diagnostic Information
For troubleshooting or support requests:
# Get all pod logs
kubectl logs -n permit-platform --all-containers=true --selector=app!=postgres > permit-platform-logs.txt
# Get pod status and descriptions
kubectl get pods -n permit-platform -o wide > pod-status.txt
kubectl describe pods -n permit-platform > pod-descriptions.txt
# Get events
kubectl get events -n permit-platform --sort-by='.lastTimestamp' > events.txt
# Get service and ingress info
kubectl get svc,ingress -n permit-platform -o yaml > networking.yaml
# Check resource usage
kubectl top pods -n permit-platform > resource-usage.txt
Performance Monitoring
Check resource consumption:
# Monitor pod resources
kubectl top pods -n permit-platform
# Monitor node resources
kubectl top nodes
# Check for resource limits being hit
kubectl describe pods -n permit-platform | grep -A 5 -B 5 "resource\|limit\|request"
Scale services if needed:
# Scale backend for more capacity
kubectl scale deployment permit-backend-v2 -n permit-platform --replicas=3
# Scale celery workers
kubectl scale deployment celery-general -n permit-platform --replicas=2
# Check horizontal pod autoscaler (if configured)
kubectl get hpa -n permit-platform
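If load is variable, a simple CPU-based autoscaler can replace manual scaling; a minimal example where the threshold and replica counts are placeholders (requires the metrics server and CPU requests on the deployment):
# Create an HPA for the backend (values are examples only)
kubectl autoscale deployment permit-backend-v2 -n permit-platform --cpu-percent=70 --min=2 --max=5
# Watch it react to load
kubectl get hpa -n permit-platform -w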
Advanced Troubleshooting
Clean Installation Reset
If you need to completely reset the installation:
# 1. Uninstall all 3 Helm releases (in reverse order)
helm uninstall permit-platform -n permit-platform 2>/dev/null || true # Platform services (35 services)
helm uninstall migrations -n permit-platform 2>/dev/null || true # Database migrations
helm uninstall third-party-services -n permit-platform 2>/dev/null || true # Infrastructure (PostgreSQL, Redis, etc.)
# 2. Delete namespace (removes all resources)
kubectl delete namespace permit-platform
# 3. Clean up any persistent volumes (careful!)
kubectl get pv | grep permit-platform # Check before deleting
kubectl delete pv $(kubectl get pv -o jsonpath='{.items[?(@.spec.claimRef.namespace=="permit-platform")].metadata.name}')
# 4. Re-run installation
./scripts/install-permit-platform.sh
Manual Service Recovery
If specific services are failing:
# Force delete stuck pods
kubectl delete pod <pod-name> -n permit-platform --force --grace-period=0
# Patch deployment to fix issues
kubectl patch deployment permit-backend-v2 -n permit-platform -p '{"spec":{"template":{"spec":{"containers":[{"name":"permit-backend-v2","image":"new-image-tag"}]}}}}'
# Check and fix persistent volume claims
kubectl get pvc -n permit-platform
kubectl describe pvc <pvc-name> -n permit-platform
Getting Support
When contacting support, please provide:
Required Information:
- Installation method: Kubernetes cluster type (EKS, GKE, AKS, on-premise)
- Platform version: From installer package filename
- Error description: What you were trying to do and what happened
- Timeline: When did the issue start?
Helpful Diagnostics:
# Create comprehensive diagnostic bundle
{
echo "=== CLUSTER INFO ==="
kubectl cluster-info
echo "=== NODES ==="
kubectl get nodes -o wide
echo "=== PERMIT PLATFORM PODS ==="
kubectl get pods -n permit-platform -o wide
echo "=== RECENT EVENTS ==="
kubectl get events -n permit-platform --sort-by='.lastTimestamp' | tail -20
echo "=== INGRESS STATUS ==="
kubectl get ingress -n permit-platform
echo "=== STORAGE ==="
kubectl get pv,pvc -n permit-platform
} > permit-support-info.txt
Contact Information:
- 📧 Email: support@permit.io
- 💬 Slack: Join our community
Need more help? Our support team has experience with all major Kubernetes platforms and can assist with advanced troubleshooting scenarios.