Manage Packaged Models
A packaged model describes the model and user code that you want to deploy as an inference service.
Model Packaging Options
The following model types and registry options are supported:
Model Type | Registries |
---|---|
Bento Archive | S3, PFS |
Custom | OpenLLM (HuggingFace), PVC, S3, PFS, None |
NIM | NGC, PVC |
OpenLLM | OpenLLM (HuggingFace), S3, PFS |
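As a quick way to check a combination before adding a model, the table above can be expressed as a lookup. This is a minimal illustrative sketch, not part of the product's API or CLI: the names `SUPPORTED_REGISTRIES` and `is_supported` are hypothetical, and the data is copied directly from the table.

```python
# Hypothetical helper: mirrors the support table above, not a product API.
# Keys are model types; values are the registry options listed for each.
SUPPORTED_REGISTRIES = {
    "Bento Archive": {"S3", "PFS"},
    "Custom": {"OpenLLM (HuggingFace)", "PVC", "S3", "PFS", "None"},
    "NIM": {"NGC", "PVC"},
    "OpenLLM": {"OpenLLM (HuggingFace)", "S3", "PFS"},
}


def is_supported(model_type: str, registry: str) -> bool:
    """Return True if the table lists `registry` for `model_type`."""
    # Unknown model types map to an empty set, so they are never supported.
    return registry in SUPPORTED_REGISTRIES.get(model_type, set())


if __name__ == "__main__":
    for model_type, registry in [("NIM", "NGC"), ("NIM", "S3")]:
        status = "supported" if is_supported(model_type, registry) else "not supported"
        print(f"{model_type} + {registry}: {status}")
```

Here a NIM model with an NGC registry is reported as supported, while NIM with S3 is not, matching the table.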
Service Deployment Journey
Adding a packaged model is the third step in the service deployment journey. Before you add a model, make sure that a compatible image is already available.
Set Up Registry → Add Registry → Add Model (current step) → Create Deployment