1.0.0
Welcome to the first Generally Available (GA) release of HPE Machine Learning Inferencing Software (MLIS)! We recommend that you follow the steps outlined in our Get Started section to install and configure this platform.
You can also refer to the object model, environment variable, and Helm chart references to learn more about the platform’s architecture and configuration options. The REST API reference is served from your installation at http://<your-mlis-url>/docs/rest-api/.
Highlights
This release includes the following features:
Registries
Create registries to reference your models from various sources. MLIS supports `s3`, `OpenLLM`, and `NGC` registries.
- Feature: Perform registry operations via the UI, API, or CLI.
- Feature: `s3` registries include any S3-compatible storage service (e.g., AWS S3, MinIO); a sketch of creating one via the REST API follows this list.
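As a rough sketch of the API flow, creating an `s3` registry that points at a MinIO service might look like the following. The endpoint path, JSON field names, and token variable are illustrative assumptions, not the confirmed schema; consult the REST API reference at http://<your-mlis-url>/docs/rest-api/ for the actual request shape.

```bash
# Hypothetical sketch: create an s3-compatible registry (here, MinIO).
# The endpoint path and JSON field names are illustrative placeholders.
curl -X POST "http://<your-mlis-url>/api/v1/registries" \
  -H "Authorization: Bearer $MLIS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "name": "team-minio",
        "type": "s3",
        "endpointUrl": "https://minio.example.com",
        "accessKey": "<access-key>",
        "secretKey": "<secret-key>"
      }'
```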
Packaged Models
Register packaged models to be used in deployments of inference services. These can be either models you’ve trained and uploaded to your registry or pre-existing models provided by your registry. Supported model types include Bento Archive, Custom (`openllm` or `bentoml`), `bentoml`, NIM, and OpenLLM.
- Feature: Perform packaged model operations via the UI, API, or CLI.
- Feature: MLIS provides default images to execute `bentoml` and `openllm` models from `openllm://` and `s3://` URLs.
- Feature: MLIS enables you to pull and execute NIM models directly from the NGC catalog.
- Feature: You can provide entirely custom `bentoml` or `openllm` container images or build a new image off of our default base container images.
- Feature: Specify environment variables and arguments for the model container during packaged model creation (illustrated in the sketch after this list).
- Feature: Specify resource templates for the model container during packaged model creation.
- Feature: Select GPU types for the model container to use during packaged model creation; this must be enabled by an admin.
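To make the packaged-model options above concrete, here is a hedged sketch of registering an `openllm` model from an `openllm://` URL with environment variables, arguments, and a resource template. The endpoint path, field names, and template name are illustrative assumptions; the REST API reference documents the real schema.

```bash
# Hypothetical sketch: register an openllm packaged model with environment
# variables, arguments, and a resource template. The endpoint path, field
# names, and template name are placeholders, not the documented contract.
curl -X POST "http://<your-mlis-url>/api/v1/models" \
  -H "Authorization: Bearer $MLIS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "name": "llama-chat",
        "url": "openllm://<model-name>",
        "environment": {"EXAMPLE_VAR": "value"},
        "arguments": ["--example-arg", "value"],
        "resourceTemplate": "<gpu-template-name>"
      }'
```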
Deployments
Create deployments to launch inference services. Deployments are created from packaged models and can be scaled horizontally. You can provide users access to the deployment via a generated URL.
- Feature: Perform deployment operations via the UI, API, or CLI.
- Feature: Choose a default autoscaling target template or define custom autoscaling targets for your deployment (see the sketch after this list).
- Feature: Provide environment variables and arguments to the deployment instance during deployment creation.
- Feature: Require authentication for using a deployed inference service.
- Feature: Initiate canary rollouts to test new model version performance before full deployment.
- Feature: Monitor deployment performance and resource usage using pre-built Grafana dashboards. These dashboards include logs (via Loki) and metrics (via Prometheus).
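Putting the deployment features above together, a hedged sketch of creating a deployment with custom autoscaling bounds and required authentication might look like this. The endpoint path, field names, token variables, and inference route are illustrative assumptions; see the REST API reference for the actual schema.

```bash
# Hypothetical sketch: deploy a packaged model with custom autoscaling
# bounds and authentication enabled. Paths and fields are placeholders.
curl -X POST "http://<your-mlis-url>/api/v1/deployments" \
  -H "Authorization: Bearer $MLIS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "name": "llama-chat-prod",
        "model": "llama-chat",
        "autoscaling": {"minReplicas": 1, "maxReplicas": 4},
        "authenticationRequired": true
      }'

# Clients then call the generated URL; with authentication required,
# requests must carry a token (URL and route are placeholders):
curl -X POST "https://<generated-deployment-url>/<inference-route>" \
  -H "Authorization: Bearer $INFERENCE_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello"}'
```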
Admin
- Feature: Set up external authentication for MLIS using your favorite identity provider.
- See the GitHub Identity Provider guide for an example.
- Feature: Manage users and user roles (RBAC) via the UI, API, or CLI (see the sketch after this list).
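As with the other resources, user management can be scripted against the API. The sketch below grants a role to a user; the endpoint, HTTP method, and role name are illustrative assumptions rather than the documented contract.

```bash
# Hypothetical sketch: assign a role to a user via the REST API.
# Endpoint, method, and role name are placeholders.
curl -X PATCH "http://<your-mlis-url>/api/v1/users/<username>" \
  -H "Authorization: Bearer $MLIS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"roles": ["<role-name>"]}'
```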
Known Issues
- Custom Grafana Base URLs: Deployments come with pre-built Grafana dashboard quicklinks. However, if you manually configure `grafana.deployment_dashboard_baseurl` in `values.yaml`, the link provided on a deployment from the UI will not work (see the Helm snippet after this list).
- Unsaved Changes: Changes made in a modal (e.g., editing a packaged model) are lost if you click outside the modal without saving.
- SSO Sign-in Button: You must click on the text of the SSO sign-in button to sign in with SSO; clicking anywhere else on the button itself results in a “user not found” error.
- NIM Model Dropdown: If you are adding a NIM packaged model for NGC, the dropdown for selecting a NIM model does not display any models due to a NIM v1.0.0 change. You can still manually enter the NIM model name from the NGC catalog in the `image` field.
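For reference, the Grafana base-URL setting called out in the first known issue is the Helm value `grafana.deployment_dashboard_baseurl`. It would typically be applied during install or upgrade as shown below (the release name and chart reference are placeholders); until the issue is fixed, leaving it unset keeps the UI quicklinks working.

```bash
# Setting the affected Helm value; release name and chart reference are
# placeholders. Note: configuring this currently breaks the deployment
# dashboard quicklink in the UI (see Known Issues above).
helm upgrade <release-name> <mlis-chart> \
  --set grafana.deployment_dashboard_baseurl="https://grafana.example.com"
```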