Manage Packaged Models

A packaged model describes the model and user code that you want to deploy as an inference service.

Model Packaging Options

The following model types and registry options are supported:

| Model Type    | Registries                                  |
|---------------|---------------------------------------------|
| Bento Archive | S3, PFS                                     |
| Custom        | OpenLLM (HuggingFace), PVC, S3, PFS, None   |
| NIM           | NGC, PVC                                    |
| OpenLLM       | OpenLLM (HuggingFace), S3, PFS              |
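
As a quick illustration, the support matrix above can be written as a lookup table and used to check a model type and registry pairing before adding a packaged model. This is only a sketch: the names `SUPPORTED_REGISTRIES` and `is_supported` are hypothetical and not part of any product API, and the values are transcribed from the table above.

```python
# Hypothetical lookup table transcribed from the support matrix above.
# These names are illustrative only and do not come from the product.
SUPPORTED_REGISTRIES = {
    "Bento Archive": {"S3", "PFS"},
    "Custom": {"OpenLLM (HuggingFace)", "PVC", "S3", "PFS", "None"},
    "NIM": {"NGC", "PVC"},
    "OpenLLM": {"OpenLLM (HuggingFace)", "S3", "PFS"},
}


def is_supported(model_type: str, registry: str) -> bool:
    """Return True if the registry is listed for the given model type."""
    # Unknown model types map to an empty set, so they are never supported.
    return registry in SUPPORTED_REGISTRIES.get(model_type, set())


if __name__ == "__main__":
    print(is_supported("NIM", "NGC"))  # listed pairing
    print(is_supported("NIM", "S3"))   # pairing not listed for NIM
```

A check like this could run before the model is submitted, so that an unsupported pairing is caught early rather than at deployment time.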

Service Deployment Journey

Adding a packaged model is the third step in the service deployment journey. It requires that a compatible image has already been created and is available.

graph LR;
    A(Set Up Registry) --> B(Add Registry);
    B --> C(Add Model);
    C --> D(Create Deployment);
    style A fill:#fff,stroke:#333,stroke-width:2px;
    style B fill:#fff,stroke:#333,stroke-width:2px;
    style C fill:#7FF9E2,stroke:#333,stroke-width:4px;
    style D fill:#fff,stroke:#333,stroke-width:2px;