Kubernetes Pod Debugging Cheat Sheet

Troubleshoot Pending, CrashLoopBackOff, image pull, readiness, liveness, and init container problems.

## Pending and Unschedulable Pods
List Pending pods
kubectl get pods -A --field-selector=status.phase=Pending

# Find pods that have not been scheduled yet or whose containers have not started.

Describe a Pending pod
kubectl describe pod <pod> -n <namespace>

# Look for FailedScheduling, taints, or PVC issues.

List PersistentVolumeClaims
kubectl get pvc -A

# Check if pod scheduling is blocked on volume binding.

Describe a PVC
kubectl describe pvc <claim> -n <namespace>

# Inspect storage class, events, and binding failures.
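
When a cluster has many claims, it can help to filter the list down to the ones that are not bound. The following is a minimal sketch, not part of the original sheet: `unbound_pvcs` is a hypothetical helper name, and it assumes the default table output of `kubectl get pvc -A`, where STATUS is the third column.

```shell
# Hypothetical helper: list PVCs in all namespaces whose STATUS is not "Bound".
# Assumes the default column order: NAMESPACE NAME STATUS ...
unbound_pvcs() {
  kubectl get pvc -A --no-headers | awk '$3 != "Bound"'
}
```

Call `unbound_pvcs` with no arguments, then run `kubectl describe pvc` on any claim it prints.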

Inspect node taints
kubectl describe node <node> | sed -n '/Taints:/,/Unschedulable:/p'

# Compare the node's taints against the pod's tolerations.

View pod node selectors and affinity
kubectl get pod <pod> -n <namespace> -o jsonpath='{.spec.nodeSelector}{"\n"}{.spec.affinity}{"\n"}'

# Inspect scheduling constraints directly from the pod spec.
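
Scheduler failures can also be read straight from the event stream instead of scanning the full `describe` output. This is a sketch under assumptions: `scheduling_events` is a hypothetical helper name, and it relies on the event field selectors `involvedObject.kind`, `involvedObject.name`, and `reason`.

```shell
# Hypothetical helper: print FailedScheduling events for one pod, oldest first.
# Usage: scheduling_events <pod> <namespace>
scheduling_events() {
  kubectl get events -n "$2" \
    --field-selector "involvedObject.kind=Pod,involvedObject.name=$1,reason=FailedScheduling" \
    --sort-by=.metadata.creationTimestamp
}
```

Events are only retained for a limited time, so an empty result on an old Pending pod does not prove the scheduler never reported a failure.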

## CrashLoopBackOff and Image Pull Errors
Find CrashLoopBackOff pods
kubectl get pods -A | grep CrashLoopBackOff

# List pods that are restarting and failing to stay up.

Read logs from the previous failed container
kubectl logs <pod> -n <namespace> -c <container> --previous

# Shows output from the previous container instance, which the restarted container's logs no longer include.

Describe image pull failures
kubectl describe pod <pod> -n <namespace> | sed -n '/Events:/,$p'

# Inspect ErrImagePull and ImagePullBackOff reasons.

Inspect image pull secrets referenced by a pod
kubectl get pod <pod> -n <namespace> -o jsonpath='{.spec.imagePullSecrets[*].name}{"\n"}'

# Verify that imagePullSecrets are wired correctly.

Describe a Docker registry secret
kubectl describe secret <secret> -n <namespace>

# Check that the secret type is kubernetes.io/dockerconfigjson and that it lives in the pod's namespace.

Show waiting reason for first container
kubectl get pod <pod> -n <namespace> -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}{"\n"}'

# Compact way to inspect wait state reasons.
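
The command above only looks at the first container. For multi-container pods, a JSONPath `range` can print one line per container. This is a minimal sketch, and `waiting_reasons` is a hypothetical helper name.

```shell
# Hypothetical helper: print "<container><TAB><waiting reason>" for every container.
# The reason column is empty for containers that are not in the waiting state.
# Usage: waiting_reasons <pod> <namespace>
waiting_reasons() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.state.waiting.reason}{"\n"}{end}'
}
```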

## Readiness, Liveness, and Startup Probes
Inspect configured probes
kubectl get pod <pod> -n <namespace> -o jsonpath='{.spec.containers[*].readinessProbe}{"\n"}{.spec.containers[*].livenessProbe}{"\n"}{.spec.containers[*].startupProbe}{"\n"}'

# View readiness, liveness, and startup probe definitions.

See probe failure events
kubectl describe pod <pod> -n <namespace> | grep -A5 -i probe

# Describe a pod and inspect event messages for failing probes.

Run the probe command manually
kubectl exec -it <pod> -n <namespace> -c <container> -- <probe-command>

# Validate probe behavior inside the container context.

Port-forward and test HTTP probe locally
kubectl port-forward pod/<pod> -n <namespace> 8080:<container-port>

# Forward local port 8080 to the container port so the probe endpoint can be tested from your machine.

Call a probe endpoint after port-forward
curl -i http://127.0.0.1:8080/healthz

# Use curl locally against the forwarded health endpoint.
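
When reading the curl result, keep in mind that an HTTP probe counts any status code from 200 up to, but not including, 400 as a success. The sketch below applies that rule to a status code. `probe_result` is a hypothetical helper name, and the URL in the usage note assumes the port-forward and `/healthz` path shown above.

```shell
# Hypothetical helper: classify an HTTP status code by the HTTP probe success rule
# (200 <= code < 400 is success, anything else is failure).
# Usage: probe_result <status-code>
probe_result() {
  if [ "$1" -ge 200 ] && [ "$1" -lt 400 ]; then
    echo "success"
  else
    echo "failure"
  fi
}
```

One way to use it is `probe_result "$(curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:8080/healthz)"`. curl reports `000` when it cannot connect, which this helper classifies as a failure.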

## Init Container Debugging
Inspect init container statuses
kubectl get pod <pod> -n <namespace> -o jsonpath='{.status.initContainerStatuses}{"\n"}'

# See waiting, terminated, and exit details for init containers.

Read logs from an init container
kubectl logs <pod> -n <namespace> -c <init-container>

# Inspect the failing setup step before app containers start.
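
If the pod has several init containers and it is not yet clear which one failed, the logs can be collected for each in turn. This is a sketch rather than a standard command: `init_logs` is a hypothetical helper name, and it reads the container names from `.spec.initContainers`.

```shell
# Hypothetical helper: print logs for every init container of a pod, in spec order.
# Usage: init_logs <pod> <namespace>
init_logs() {
  # Container names cannot contain whitespace, so word splitting is safe here.
  for c in $(kubectl get pod "$1" -n "$2" -o jsonpath='{.spec.initContainers[*].name}'); do
    echo "=== init container: $c ==="
    kubectl logs "$1" -n "$2" -c "$c"
  done
}
```

Init containers that have not started yet have no logs, so `kubectl logs` may report an error for the ones after the failing step.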

Describe init container failure
kubectl describe pod <pod> -n <namespace>

# Read events and termination details from the pod description.

Copy a pod and replace its container images
kubectl debug <pod> -n <namespace> --copy-to=<pod>-copy --set-image='*=busybox:1.36' -it

# Use kubectl debug to clone the pod with every container image swapped for busybox, leaving the original pod untouched.

## Resources and OOMKilled
Check resource usage for a pod
kubectl top pod <pod> -n <namespace>

# Compare live usage against requests and limits. Requires metrics-server in the cluster.

Show pod resource requests and limits
kubectl get pod <pod> -n <namespace> -o jsonpath='{.spec.containers[*].resources}{"\n"}'

# Inspect CPU and memory resource settings quickly.

Show termination reason
kubectl get pod <pod> -n <namespace> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}'

# Check whether the last exit reason was OOMKilled.

Show last exit code
kubectl get pod <pod> -n <namespace> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}{"\n"}'

# Inspect the numeric exit code for the container.
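
Exit codes above 128 conventionally mean the process was terminated by a signal, with the signal number equal to the code minus 128. An OOM kill sends SIGKILL (signal 9), so it usually shows up as exit code 137. The sketch below decodes a code along those lines. `explain_exit_code` is a hypothetical helper name, and the wording of its messages is illustrative.

```shell
# Hypothetical helper: explain a container exit code using the 128+N signal convention.
# Usage: explain_exit_code <exit-code>
explain_exit_code() {
  if [ "$1" -eq 0 ]; then
    echo "exit 0: completed successfully"
  elif [ "$1" -gt 128 ]; then
    echo "exit $1: terminated by signal $(($1 - 128))"
  else
    echo "exit $1: application error"
  fi
}
```

For example, `explain_exit_code 137` reports signal 9. A SIGKILL is not always an OOM kill, so confirm with the termination reason before raising memory limits.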
