Manage Packaged Models

A packaged model describes the model and user code that you want to deploy as an inference service.

Model Packaging Options

The following model types and registry options are supported:

| Model Type | Registries |
|---|---|
| Bento Archive | S3 |
| Custom | OpenLLM (HuggingFace), PVC, S3, None |
| NIM | NGC, PVC |
| OpenLLM | OpenLLM (HuggingFace), S3 |
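
The combinations listed above can be captured as a small lookup, which may be handy for validating a model definition before you submit it. This is a minimal sketch: the names `SUPPORTED_REGISTRIES` and `is_supported` are hypothetical and are not part of the product's API or CLI. The data is transcribed directly from the list above.

```python
# Hypothetical lookup transcribed from the supported combinations above.
# These names are illustrative only and do not come from the product's API.
SUPPORTED_REGISTRIES = {
    "Bento Archive": {"S3"},
    "Custom": {"OpenLLM (HuggingFace)", "PVC", "S3", "None"},
    "NIM": {"NGC", "PVC"},
    "OpenLLM": {"OpenLLM (HuggingFace)", "S3"},
}


def is_supported(model_type: str, registry: str) -> bool:
    """Return True if the registry option is listed for the model type."""
    # Unknown model types map to an empty set, so they are never supported.
    return registry in SUPPORTED_REGISTRIES.get(model_type, set())


if __name__ == "__main__":
    for model_type, registry in [("NIM", "NGC"), ("NIM", "S3")]:
        status = "supported" if is_supported(model_type, registry) else "not supported"
        print(f"{model_type} with {registry}: {status}")
```

Keeping the matrix as plain data makes it easy to update if the supported combinations change.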

Service Deployment Journey

Adding a packaged model is the third step in the service deployment journey. Before you add one, a compatible image must already be available.

graph LR;
    A(Set Up Registry) --> B(Add Registry);
    B --> C(Add Model);
    C --> D(Create Deployment);
    style A fill:#fff,stroke:#333,stroke-width:2px;
    style B fill:#fff,stroke:#333,stroke-width:2px;
    style C fill:#7FF9E2,stroke:#333,stroke-width:4px;
    style D fill:#fff,stroke:#333,stroke-width:2px;