Application Debugging¶

A guide to debugging application-specific issues in your k3s cluster.

Quick Diagnostics¶

# Check pod status
kubectl get pods -n <namespace>

# View pod logs
kubectl logs <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace> --previous  # Logs from the previous (crashed) container

# Describe pod for events
kubectl describe pod <pod-name> -n <namespace>

# Execute shell in pod
kubectl exec -it <pod-name> -n <namespace> -- /bin/sh

# Check resource usage (requires metrics-server, which k3s deploys by default)
kubectl top pod <pod-name> -n <namespace>

Pod Startup Issues¶

Image Pull Errors¶

Symptoms: ImagePullBackOff or ErrImagePull

Diagnosis:

kubectl describe pod <pod-name> -n <namespace>
# Look for: Failed to pull image

Common Causes:

Wrong image name or tag: Verify that the image and tag exist in the registry

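Private registry auth missing: Create a pull secret, then reference it through imagePullSecrets on the pod spec or the service account: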

kubectl create secret docker-registry regcred \
  --docker-server=<registry> \
  --docker-username=<username> \
  --docker-password=<password> \
  -n <namespace>

Rate limit (Docker Hub): Anonymous pulls are rate limited; authenticate with a pull secret to raise the limit
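
Creating the secret does not by itself make pods use it. The sketch below attaches the secret to a service account so that new pods running under that account pick it up. It assumes the secret is named regcred as in the command above and that the pods use the namespace's default service account; attach_pull_secret is a hypothetical helper name, not part of kubectl.

```shell
# Hypothetical helper: attach an image pull secret to a service account so
# that new pods using that account can pull from the private registry.
# Usage: attach_pull_secret <namespace> [secret-name] [service-account]
attach_pull_secret() {
  kubectl patch serviceaccount "${3:-default}" -n "$1" \
    -p "{\"imagePullSecrets\": [{\"name\": \"${2:-regcred}\"}]}"
}

# Example:
#   attach_pull_secret <namespace>
```

Pods created before the patch keep failing until they are recreated, for example by deleting the pod and letting its controller start a new one.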

Application Crashes on Startup¶

See Common Issues - CrashLoopBackOff

Check exit code:

kubectl describe pod <pod-name> -n <namespace>
# Look for: Exit Code

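# Common exit codes:
# 0   - Success (clean exit; unexpected for a long-running service)
# 1   - General application error
# 137 - SIGKILL (128 + 9); often OOMKilled, check the Reason field
# 143 - SIGTERM (128 + 15); termination was requested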
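
The codes above follow a simple rule: a value greater than 128 means the process was killed by signal number (code - 128). The following sketch reads the last exit code straight from the pod status and explains it. Both function names are hypothetical, and last_exit_code only looks at the first container in the pod.

```shell
# Print the exit code of the last terminated instance of the first container.
# Usage: last_exit_code <pod-name> <namespace>
last_exit_code() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
}

# Explain an exit code: values above 128 mean "killed by signal (code - 128)".
# Usage: explain_exit_code <code>
explain_exit_code() {
  if [ "$1" -eq 0 ]; then
    echo "exit 0: completed successfully"
  elif [ "$1" -gt 128 ]; then
    echo "exit $1: terminated by signal $(( $1 - 128 ))"
  else
    echo "exit $1: application error"
  fi
}

# Example:
#   explain_exit_code "$(last_exit_code <pod-name> <namespace>)"
```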

Configuration Issues¶

Check ConfigMaps and Secrets:

# Verify ConfigMap exists and has correct data
kubectl get configmap <name> -n <namespace> -o yaml

# Verify Secret exists
kubectl get secret <name> -n <namespace> -o yaml

# Decode secret
kubectl get secret <name> -n <namespace> -o jsonpath='{.data.<key>}' | base64 -d

# Check environment variables in pod
kubectl exec <pod-name> -n <namespace> -- env

Runtime Issues¶

Out of Memory (OOMKilled)¶

Symptoms: Pod restarts frequently, exit code 137

Diagnosis:

# Check events
kubectl describe pod <pod-name> -n <namespace>
# Look for: Last State: Terminated, Reason: OOMKilled

# Monitor memory usage
kubectl top pod <pod-name> -n <namespace>

Fix:

# Increase memory limit
resources:
  limits:
    memory: "512Mi"  # Increase this
  requests:
    memory: "256Mi"
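
The check and the fix can also be scripted. This is a minimal sketch, assuming the workload is a Deployment; was_oom_killed and raise_memory are hypothetical helper names, and the sizes passed in are placeholders to tune against observed usage.

```shell
# Return success (0) if any container in the pod was last terminated with
# reason OOMKilled. Usage: was_oom_killed <pod-name> <namespace>
was_oom_killed() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}' |
    grep -q 'OOMKilled'
}

# Raise the memory request and limit on a Deployment (triggers a rollout).
# Usage: raise_memory <deployment> <namespace> <request> <limit>
raise_memory() {
  kubectl set resources deployment "$1" -n "$2" \
    --requests="memory=$3" --limits="memory=$4"
}

# Example:
#   was_oom_killed <pod-name> <namespace> && raise_memory <deployment> <namespace> 256Mi 512Mi
```

A change made with kubectl set resources is overwritten the next time the manifest is applied, so the same values should be updated in the manifest as well.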

CPU Throttling¶

Symptoms: Application is slow and CPU usage sits at or near the configured limit

Diagnosis:

# Check CPU usage
kubectl top pod <pod-name> -n <namespace>

Fix:

resources:
  limits:
    cpu: "1000m"  # Increase (1 core = 1000m)
  requests:
    cpu: "500m"
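
kubectl top reports usage, not throttling. To confirm throttling directly, the container's cgroup counters can be read. The sketch below assumes the node uses cgroup v2 (on cgroup v1 the file is usually /sys/fs/cgroup/cpu/cpu.stat) and that the image provides cat; throttle_percent is a hypothetical helper name.

```shell
# Read cpu.stat contents on stdin and print the share of CPU enforcement
# periods in which the container was throttled.
throttle_percent() {
  awk '
    $1 == "nr_periods"   { periods = $2 }
    $1 == "nr_throttled" { throttled = $2 }
    END {
      if (periods > 0) printf "%.1f%%\n", 100 * throttled / periods
      else print "no CPU limit periods recorded"
    }'
}

# Example:
#   kubectl exec <pod-name> -n <namespace> -- cat /sys/fs/cgroup/cpu.stat | throttle_percent
```

A percentage that keeps climbing while the application is under load suggests the CPU limit is too low for the workload.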

Database Connection Issues¶

Symptoms: Application can't connect to database

Check:

Database pod is running:

kubectl get pods -n <namespace> -l app=postgres  # Adjust the label to match your database pods

Service exists:

kubectl get svc <db-service> -n <namespace>

Test connectivity from app pod (requires nc in the image; 5432 is the PostgreSQL default port):

kubectl exec -it <app-pod> -n <namespace> -- nc -zv <db-service> 5432

Check credentials: Verify that the Secret exists and that the app's environment variables or mounted files reference it
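
A Service can exist and still route nowhere if its selector matches no ready pods. As a sketch of one extra check, the helper below (a hypothetical name) succeeds only when the Service has at least one endpoint address.

```shell
# Succeed if the Service has at least one ready endpoint address.
# Usage: service_has_endpoints <service> <namespace>
service_has_endpoints() {
  ips="$(kubectl get endpoints "$1" -n "$2" \
    -o jsonpath='{.subsets[*].addresses[*].ip}')"
  [ -n "$ips" ]
}

# Example:
#   service_has_endpoints <db-service> <namespace> || echo "no ready database pods behind the Service"
```

If this fails while the database pod is Running, compare the Service selector with the pod labels and check the pod's readiness.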

Liveness/Readiness Probe Failures¶

Symptoms: Pod keeps restarting (liveness probe failing), or stays not ready and receives no Service traffic (readiness probe failing)

Diagnosis:

kubectl describe pod <pod-name> -n <namespace>
# Look for: Liveness probe failed / Readiness probe failed

Fix: Confirm that the probe path and port match the application, then adjust probe timing:

livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 60  # Give app time to start
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3
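
To judge whether the timing is generous enough, it helps to estimate how long a container has to become healthy before it is restarted. The sketch below uses the common approximation initialDelaySeconds + failureThreshold * periodSeconds; the exact moment depends on probe timeouts and kubelet scheduling, so treat the result as a rough budget. probe_budget is a hypothetical helper name.

```shell
# Approximate seconds a container has to become healthy before the kubelet
# restarts it: initialDelaySeconds + failureThreshold * periodSeconds.
# Usage: probe_budget <initialDelaySeconds> <periodSeconds> <failureThreshold>
probe_budget() {
  echo $(( $1 + $3 * $2 ))
}

probe_budget 60 10 3   # values from the probe above; prints 90
```

If the application regularly needs longer than this to start, raise initialDelaySeconds or failureThreshold rather than shortening the probe period.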

Debugging Techniques¶

Interactive Debugging¶

Enter shell in running pod:

kubectl exec -it <pod-name> -n <namespace> -- /bin/sh
# or /bin/bash if available

Run an ephemeral debug container in the same pod (useful when the image has no shell or tools):

kubectl debug <pod-name> -n <namespace> -it --image=nicolaka/netshoot

Log Analysis¶

Follow logs:

kubectl logs -f <pod-name> -n <namespace>

Multiple containers in pod:

# List containers
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.containers[*].name}'

# View specific container
kubectl logs <pod-name> -c <container-name> -n <namespace>

Logs from all pods matching the deployment's label:

kubectl logs -n <namespace> -l app=<app-label> --tail=100
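
When several pods match the label, it is hard to tell which pod wrote which line. The sketch below prefixes each line with its pod and container and keeps only lines that look like errors. The keyword list is an assumption to adapt to the application's log format, and label_errors is a hypothetical helper name.

```shell
# Collect recent logs from every pod matching a label, prefix each line with
# its pod and container, and keep only lines that look like errors.
# Usage: label_errors <namespace> <label-selector>
label_errors() {
  kubectl logs -n "$1" -l "$2" --all-containers --prefix --tail=100 |
    grep -iE 'error|fatal|panic|exception'
}

# Example:
#   label_errors <namespace> app=<app-label>
```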

Events¶

Check recent events:

# Namespace-specific
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | tail -20

# Watch events
kubectl get events -n <namespace> --watch
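
In a busy namespace the event list is noisy. As a sketch, events can be narrowed with field selectors to the warnings for a single pod; pod_warnings is a hypothetical helper name.

```shell
# Show only Warning events for one pod, oldest first.
# Usage: pod_warnings <pod-name> <namespace>
pod_warnings() {
  kubectl get events -n "$2" \
    --field-selector "type=Warning,involvedObject.name=$1" \
    --sort-by='.lastTimestamp'
}

# Example:
#   pod_warnings <pod-name> <namespace>
```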

Port Forwarding for Testing¶

Forward pod port to localhost:

kubectl port-forward <pod-name> -n <namespace> 8080:80
# Access at http://localhost:8080

Application-Specific Debugging¶

Web Applications¶

Test endpoints:

# HTTP test from inside the pod (requires curl in the image)
kubectl exec -it <pod-name> -n <namespace> -- curl localhost:8080/health

# With port-forward
kubectl port-forward <pod-name> -n <namespace> 8080:8080
curl http://localhost:8080

Database Applications¶

PostgreSQL:

# Enter psql
kubectl exec -it <postgres-pod> -n <namespace> -- psql -U <username> -d <database>

# Check connections
kubectl exec -it <postgres-pod> -n <namespace> -- psql -U postgres -c "SELECT * FROM pg_stat_activity;"

Background Jobs/Workers¶

Check job status:

kubectl get jobs -n <namespace>
kubectl describe job <job-name> -n <namespace>

# Check job pod logs
kubectl logs job/<job-name> -n <namespace>

CronJob debugging:

kubectl get cronjobs -n <namespace>
kubectl get jobs -n <namespace> --sort-by=.status.startTime

# Manually trigger CronJob
kubectl create job --from=cronjob/<cronjob-name> test-run -n <namespace>
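
A manual trigger is most useful when the run is followed through to its result. The sketch below creates the Job, waits for it to complete, prints its logs, and removes it again. The 120s timeout is an arbitrary assumption, a failed Job shows up as a timeout here, and run_cronjob_once is a hypothetical helper name.

```shell
# Trigger a CronJob once, wait for the Job to finish, print its logs, clean up.
# Usage: run_cronjob_once <cronjob-name> <namespace> [job-name]
run_cronjob_once() {
  job="${3:-test-run}"
  kubectl create job --from="cronjob/$1" "$job" -n "$2" || return 1
  kubectl wait --for=condition=complete "job/$job" -n "$2" --timeout=120s
  status=$?                      # remember whether the Job completed in time
  kubectl logs "job/$job" -n "$2"
  kubectl delete job "$job" -n "$2"
  return "$status"
}

# Example:
#   run_cronjob_once <cronjob-name> <namespace>
```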

Performance Debugging¶

Slow Application Response¶

Check resource usage:

kubectl top pod <pod-name> -n <namespace>
kubectl describe pod <pod-name> -n <namespace>

Check node resources:

kubectl top nodes
kubectl describe node <node-name>

Common Patterns¶

Init Container Failures¶

Symptoms: Pod stuck in an Init:N/M status such as Init:0/1, or in Init:CrashLoopBackOff

Check init container logs:

kubectl logs <pod-name> -c <init-container> -n <namespace>
kubectl describe pod <pod-name> -n <namespace>
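
When the init container names are not known in advance, they can be read from the pod spec. This is a minimal sketch with hypothetical helper names that lists the init containers in run order and prints the logs of each one.

```shell
# List a pod's init containers in the order they run.
# Usage: init_containers <pod-name> <namespace>
init_containers() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{range .spec.initContainers[*]}{.name}{"\n"}{end}'
}

# Print the logs of every init container, labelled by name.
# Usage: init_logs <pod-name> <namespace>
init_logs() {
  for c in $(init_containers "$1" "$2"); do
    echo "== init container: $c =="
    kubectl logs "$1" -c "$c" -n "$2"
  done
}

# Example:
#   init_logs <pod-name> <namespace>
```

Init containers run one after another, so the first one with errors in its logs is usually the one blocking the pod.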

Quick Reference¶

# Pod status
kubectl get pods -n <namespace>

# Pod logs
kubectl logs <pod> -n <namespace>
kubectl logs <pod> -n <namespace> --previous
kubectl logs -f <pod> -n <namespace>

# Pod description
kubectl describe pod <pod> -n <namespace>

# Execute commands
kubectl exec <pod> -n <namespace> -- <command>
kubectl exec -it <pod> -n <namespace> -- /bin/sh

# Port forwarding
kubectl port-forward <pod> -n <namespace> 8080:80

# Resource usage
kubectl top pod <pod> -n <namespace>

# Events
kubectl get events -n <namespace> --sort-by='.lastTimestamp'

# Debug container
kubectl debug <pod> -n <namespace> -it --image=nicolaka/netshoot
