Expected Behavior and the CrashLoopBackOff Error
When deploying an application to Kubernetes (K8s), the expected behavior is that the Pod will transition from Pending directly into Running, and the containers within will remain up. However, one of the most common and notorious errors encountered is CrashLoopBackOff.
This status means that a container inside the Pod is starting, instantly crashing, and Kubernetes is continuously trying to restart it—with increasing delays (backoffs) between each attempt (10s, 20s, 40s, up to 5 minutes).
Unlike a Pending state (which implies an infrastructure or scheduling issue), a CrashLoopBackOff explicitly tells you that the node has scheduled the Pod, pulled the image, and executed the container, but the application inside it voluntarily exited or was forcefully killed.
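You can confirm the state at a glance with `kubectl get pods`; the `STATUS` column and a climbing `RESTARTS` count are the telltale signs. The output below is illustrative, with a hypothetical pod name:

```shell
kubectl get pods -n <namespace>

# NAME                     READY   STATUS             RESTARTS      AGE
# my-app-7d9c6b5f4-x2k8p   0/1     CrashLoopBackOff   6 (25s ago)   8m
```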
Prerequisites
Before diving into the troubleshooting steps, ensure you have:
- A running Kubernetes Cluster (Minikube, EKS, GKE, AKS, or bare metal).
- The `kubectl` command-line tool installed and configured to point to your cluster.
- The exact name of the Pod exhibiting the error and the namespace it resides in.
Root Causes of CrashLoopBackOff
Since CrashLoopBackOff is a symptom rather than the exact cause, we have to look for the underlying reason. The most common culprits include:
- Application Panic/Fatal Error: The code encounters an unhandled exception or missing dependency immediately upon startup (e.g., failing to connect to a database) and explicitly exits.
- Missing Configuration/Secrets: The application expects an environment variable mapped from a `ConfigMap` or `Secret` that does not exist or has a typo.
- Liveness Probe Failures: Your `livenessProbe` is failing consecutively, causing Kubernetes to interpret the container as unhealthy and kill it repeatedly to restart it.
- OOMKilled (Out of Memory): The container tries to allocate more memory than its assigned `limits.memory` allows, prompting the Linux kernel to terminate the process.
- Invalid Entrypoint/Command: The command specified in the YAML or the Dockerfile's `CMD`/`ENTRYPOINT` is syntactically incorrect, lacks permissions, or exits immediately because it's not a foreground process.
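As an illustration of the configuration-related causes above, consider a container that reads a required connection string from a `ConfigMap`. All names below are hypothetical; note that `optional: true` lets the container start even when the key is missing, so the app boots with an empty variable and crashes at runtime, producing exactly this loop:

```yaml
containers:
  - name: my-app
    image: my-registry/my-app:v1
    env:
      - name: DATABASE_URL
        valueFrom:
          configMapKeyRef:
            name: app-config
            key: databse_url   # <-- typo: the actual key is "database_url"
            optional: true     # container starts anyway; app exits on empty value
```

(Without `optional: true`, a missing key surfaces as `CreateContainerConfigError` instead, which is a different failure mode.)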
Step-by-Step Solution
1. View the Pod Description
The first command you should run is describe. This will give you the exact Exit Code of the crash.
kubectl describe pod <pod-name> -n <namespace>
Scroll down to the Containers section and look at the Last State block:
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Fri, 27 Feb 2026 10:00:00 GMT
Finished: Fri, 27 Feb 2026 10:00:02 GMT
Common Exit Codes:
- Exit Code 1: General application error (look at your code logs).
- Exit Code 2: Misuse of shell built-ins (check your Dockerfile command).
- Exit Code 126: Command invoked cannot execute (permissions error).
- Exit Code 128: Invalid exit argument.
- Exit Code 137: OOMKilled (Out of Memory - the container hit its limit).
- Exit Code 255: Exit status out of range (usually a fatal initialization failure).
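If you prefer not to scroll through the `describe` output, the same exit code can be pulled directly with a JSONPath query (this assumes a single-container Pod; adjust the index for multi-container Pods):

```shell
kubectl get pod <pod-name> -n <namespace> \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
```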
2. Check the Previous Logs
Because the container is actively crashing and restarting, normal kubectl logs might return empty if you hit the container just as it spins up. Instead, use the --previous flag to fetch the logs from the dead container just before it was killed:
kubectl logs <pod-name> -n <namespace> --previous
This is highly effective for catching stack traces, missing database bindings, or “File Not Found” errors that caused the panic.
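For multi-container Pods, also name the crashing container with the `-c` flag so kubectl knows which log stream to fetch:

```shell
kubectl logs <pod-name> -n <namespace> -c <container-name> --previous
```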
3. Check for OOMKilled
If the Exit Code is 137 or the describe output literally says OOMKilled, the solution is to increase your container’s memory limits.
Edit your deployment YAML to bump the memory limit:
resources:
requests:
memory: "256Mi"
cpu: "250m"
limits:
memory: "512Mi" # <-- Increase this value
cpu: "500m"
Apply the changes using kubectl apply -f deployment.yaml.
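If you want to test a higher limit quickly without editing the file, `kubectl set resources` can patch the live Deployment in place (the deployment and container names below are placeholders). Remember to fold the change back into your YAML afterwards so it isn't reverted on the next apply:

```shell
kubectl set resources deployment/<deployment-name> -n <namespace> \
  -c <container-name> --limits=memory=512Mi
```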
4. Overriding the Entrypoint to Debug (Alternative Solution)
If the log output is cryptic or non-existent (e.g., immediate script exit due to permissions), a great way to debug safely is to override the command with a sleeping shell. This forces the container to stay alive long enough for you to enter it.
Temporarily modify your Deployment manifest:
containers:
- name: my-crashing-app
image: my-registry/my-app:v1
command: ["sleep", "3600"] # Forces the pod to stay alive
Apply the manifest. The Pod will now transition to Running and stay there. Then, safely exec into it:
kubectl exec -it <pod-name> -- /bin/sh
Once inside, you can manually run your application script (npm start, python app.py, etc.) and observe the error in real-time, inspect files, or verify environment variables.
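On reasonably recent clusters (ephemeral containers became generally available in Kubernetes 1.25), `kubectl debug` achieves a similar result without modifying the Deployment at all: it attaches a throwaway debug container to the crashing Pod. The `busybox:1.36` image is just an assumption here; any image with a shell works:

```shell
kubectl debug -it <pod-name> -n <namespace> \
  --image=busybox:1.36 --target=<container-name>
```

The `--target` flag shares the process namespace with the named container, letting you inspect its processes and filesystem from the debug shell.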
Prevention
To reduce the occurrence of CrashLoopBackOff in your production clusters, implement the following best practices:
- Implement Readiness and Liveness Probes Properly: Make sure your `livenessProbe` gives the application enough `initialDelaySeconds` to boot before it starts pinging it.
- Use Init Containers: If your app depends on a database or external service being available, use an `initContainer` to wait for that service before starting the main application, effectively preventing the panic.
- Validate Configuration Dependencies: Ensure your CI/CD pipeline validates that `ConfigMaps` and `Secrets` referenced in your Deployments actually exist before applying the manifests.
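The init-container pattern from the list above can look like the following sketch, which blocks the main container until a (hypothetical) `postgres` Service answers on its port:

```yaml
initContainers:
  - name: wait-for-db
    image: busybox:1.36
    command:
      - sh
      - -c
      - until nc -z postgres 5432; do echo "waiting for db"; sleep 2; done
containers:
  - name: my-app
    image: my-registry/my-app:v1
```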
Summary
- `CrashLoopBackOff` indicates a container started but exited or died repeatedly.
- Use `kubectl describe pod` to locate the `Exit Code`.
- Use `kubectl logs --previous` to read the stack trace of the last crashed iteration.
- Exit Code `137` means memory limits were hit (OOMKilled).
- Override the deployment `command` to `sleep` if you need to exec into the pod manually to debug filesystem or permission issues.