Kubernetes Node Debugging Cheat Sheet

Diagnose node readiness, kubelet issues, pressure conditions, cordon/drain, and node-level debugging with kubectl debug.

## Node Health and Conditions
List nodes
kubectl get nodes

# Start by verifying Ready status across the cluster.

Describe a node
kubectl describe node <node>

# Inspect conditions, capacity, allocatable, taints, and events.

Get raw node YAML
kubectl get node <node> -o yaml

# Inspect status addresses, conditions, daemon endpoints, and node info (kubelet and runtime versions).

Show live node usage
kubectl top node <node>

# Show current CPU and memory utilization for one node (requires a metrics source such as metrics-server).

Show all node metrics
kubectl top nodes

# Find hot nodes during an incident.

List key node conditions
kubectl get nodes -o custom-columns='NAME:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status,MEM:.status.conditions[?(@.type=="MemoryPressure")].status,DISK:.status.conditions[?(@.type=="DiskPressure")].status,PID:.status.conditions[?(@.type=="PIDPressure")].status'

# Compact view for Ready and pressure-related conditions.
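During an incident it can help to reduce that table to only the nodes that need attention. The following is a minimal sketch, assuming the five-column layout produced by the command above (NAME, READY, MEM, DISK, PID). The function name `unhealthy_nodes` is hypothetical, not part of kubectl.

```shell
# Hypothetical helper: reads the custom-columns table on stdin and prints
# the names of nodes that are not Ready or that report a pressure condition.
unhealthy_nodes() {
  awk 'NR > 1 && ($2 != "True" || $3 == "True" || $4 == "True" || $5 == "True") { print $1 }'
}

# Usage (requires cluster access):
# kubectl get nodes -o custom-columns='NAME:.metadata.name,READY:.status.conditions[?(@.type=="Ready")].status,MEM:.status.conditions[?(@.type=="MemoryPressure")].status,DISK:.status.conditions[?(@.type=="DiskPressure")].status,PID:.status.conditions[?(@.type=="PIDPressure")].status' | unhealthy_nodes
```

A Ready value of `Unknown` is treated as unhealthy here, since it usually means the kubelet has stopped reporting.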

## Node Debug Shell
Open a debug shell on a node
kubectl debug node/<node> -it --image=busybox:1.36

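# Create a debug pod on the node that shares the host IPC, network, and PID namespaces, with the host filesystem mounted at /host. The pod is not privileged by default.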

Open a richer debug shell on a node
kubectl debug node/<node> -it --image=ubuntu:24.04

# Use a fuller image when you need a package manager to install networking tools; the base image ships with few of them.

Chroot into the host filesystem
chroot /host

# After opening a node debug shell, enter the host rootfs. If chroot fails for lack of privileges, reopen the shell with --profile=sysadmin.

Read kubelet logs from the node
journalctl -u kubelet --no-pager -n 200

# Inside a node shell on systemd-based hosts (after chroot /host), show the last 200 kubelet log lines.
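Kubelet output is usually dominated by informational lines, so a filter for errors and warnings can shorten the search. This is a minimal sketch, assuming the kubelet writes the default klog text format (severity letter followed by the date, e.g. `E0101`) rather than JSON logging. The function name `kubelet_problems` is hypothetical.

```shell
# Hypothetical helper: keeps only klog error (E) and warning (W) lines
# from journal output read on stdin.
kubelet_problems() {
  grep -E 'kubelet\[[0-9]+\]: [EW][0-9]{4} '
}

# Usage on the node (systemd-based hosts):
# journalctl -u kubelet --no-pager -n 200 | kubelet_problems
```

If the filter prints nothing, widen the window with a larger `-n` value before concluding the kubelet is healthy.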

List containers with crictl
crictl ps -a

# Inspect CRI-level state from the node.

List pod sandboxes with crictl
crictl pods

# View pod sandboxes at the CRI layer.
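Node debug pods are not removed when the shell exits, so it is worth checking for leftovers after an investigation. The sketch below assumes the pods keep the `node-debugger-` name prefix that `kubectl debug node/<node>` assigns, and that they were created in the current namespace. The function name `node_debug_pods` is hypothetical.

```shell
# Hypothetical helper: reads `kubectl get pods -o name` output on stdin
# and prints only the pods left behind by `kubectl debug node/<node>`.
node_debug_pods() {
  grep '^pod/node-debugger-' || true   # no match is not an error here
}

# Usage (requires cluster access):
# kubectl get pods -o name | node_debug_pods
# kubectl delete <name-printed-above>
```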

## Scheduling and Maintenance
Cordon a node
kubectl cordon <node>

# Prevent new pods from being scheduled onto a node.

Uncordon a node
kubectl uncordon <node>

# Return a node to normal scheduling.

Drain a node
kubectl drain <node> --ignore-daemonsets --delete-emptydir-data

# Cordon the node and evict its pods before maintenance or replacement. DaemonSet pods are left in place, and emptyDir data on evicted pods is deleted.

List pods on a specific node
kubectl get pods -A --field-selector spec.nodeName=<node> -o wide

# See which workloads are affected by a node problem.
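On a busy node the pod list can be long, so a per-status count gives a quicker read on the impact. This is a minimal sketch, assuming the column layout of `kubectl get pods -A --no-headers`, where STATUS is the fourth field. The function name `pods_by_status` is hypothetical.

```shell
# Hypothetical helper: counts pods per STATUS from
# `kubectl get pods -A --no-headers` output read on stdin.
pods_by_status() {
  awk '{ count[$4]++ } END { for (s in count) print count[s], s }' | sort -k2
}

# Usage (requires cluster access):
# kubectl get pods -A --field-selector spec.nodeName=<node> --no-headers | pods_by_status
```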

List DaemonSets
kubectl get daemonsets -A

# Verify whether node-level agents are missing or unhealthy.

## Control Plane and System Pods
List kube-system pods
kubectl get pods -n kube-system -o wide

# Inspect addon health quickly, plus control plane pods on clusters that run them as pods (managed clusters usually hide them).

Describe kube-proxy pod
kubectl describe pod -n kube-system <kube-proxy-pod>

# Inspect node networking agent events and status.

Read kube-proxy logs
kubectl logs -n kube-system <kube-proxy-pod>

# Check proxy sync failures and iptables/ipvs issues.

Find node problem detector pods
kubectl get pods -n kube-system -o wide | grep -i node-problem-detector

# If node-problem-detector is deployed in kube-system, confirm a pod is running on each node.
