Failed Deployment

Deploying an inference service can fail for various reasons. This guide helps you troubleshoot common issues that may cause a deployment to fail.

Before You Start

Use kubectl to describe the revision and the deployment to get more information about the failure.

kubectl get revisions -l serving.knative.dev/service=<SERVICE_NAME>
kubectl describe revision.serving.knative.dev/<REVISION_NAME>
kubectl describe deployment <DEPLOYMENT_NAME>
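The `describe` output surfaces the revision's status conditions, which usually name the failure. The sketch below is illustrative only: it shows how the `conditions` array from `kubectl get revision <REVISION_NAME> -o json` can be read, using the condition types defined by the Knative Revision API (`Ready`, `ResourcesAvailable`, `ContainerHealthy`); the sample status and the `failed_conditions` helper are assumptions for demonstration, not part of any tooling.

```python
# Sketch: interpreting the status conditions of a Knative Revision,
# as returned by `kubectl get revision <REVISION_NAME> -o json`.
# The sample data is illustrative; the condition types follow the
# Knative Revision API (Ready, ResourcesAvailable, ContainerHealthy).

def failed_conditions(status: dict) -> list[str]:
    """Return a readable line for every condition whose status is not True."""
    lines = []
    for cond in status.get("conditions", []):
        if cond.get("status") != "True":
            lines.append(
                f"{cond.get('type')}: {cond.get('reason', '?')} - "
                f"{cond.get('message', '')}"
            )
    return lines

# Example status from a revision whose pods cannot be scheduled
# because no node has a free GPU.
sample_status = {
    "conditions": [
        {"type": "ContainerHealthy", "status": "True"},
        {
            "type": "ResourcesAvailable",
            "status": "False",
            "reason": "Insufficient nvidia.com/gpu",
            "message": "0/3 nodes are available.",
        },
        {
            "type": "Ready",
            "status": "False",
            "reason": "Insufficient nvidia.com/gpu",
        },
    ]
}

for line in failed_conditions(sample_status):
    print(line)
```

A healthy revision reports `True` for all three conditions; the first condition that is `False` or `Unknown`, together with its `reason`, is normally the fastest pointer to the underlying error.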
Ephemeral Storage Considerations

Default disk sizes on cloud providers may not be sufficient for large models, which often require significant ephemeral storage. This can cause the inference service to fail to start serving, in some cases without any error message indicating that the node is out of disk space.

Ephemeral storage is normally provided by the boot disk of the compute nodes. You can inspect the amount of ephemeral storage on your nodes using the kubectl describe node <NODE_NAME> command.
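If a model needs more disk than a default node provides, one option is to request ephemeral storage explicitly so that the scheduler only places the pod on nodes with enough free disk. The stanza below is a minimal sketch using the standard Kubernetes `resources` fields; the `100Gi` figure is an assumed value, and the exact location of the stanza depends on your service spec (for example, the predictor container of an InferenceService).

```yaml
# Illustrative sketch: request and cap ephemeral storage for the
# serving container. Adjust the amount and the path in your spec.
resources:
  requests:
    ephemeral-storage: "100Gi"
  limits:
    ephemeral-storage: "100Gi"
```

With a request set, pods that would otherwise run out of disk mid-download instead stay `Pending` with a clear scheduling message, which is much easier to diagnose.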


Errors

The following sections cover specific error messages:

Insufficient nvidia.com/gpu

Internal Error

Model Load Failed

RevisionMissing, RevisionFailed