Release Notes Highlights for MLIS

1.1.0

August 6, 2024

Welcome to the 1.1.0 release of HPE Machine Learning Inferencing Software (MLIS).

REST API Docs
Reminder: You can access the REST API documentation from your MLIS instance by navigating to http://<your-mlis-url>/docs/rest-api/.

Highlights

This release includes the following features:

Deployments

Deployment Tokens

Create deployment tokens to control access to your inference service endpoint.

  • UI Feature: Create and manage deployment tokens via the UI.
  • CLI Feature: Set the expiration date of an access token using a simple date-time or simple date format, in addition to the existing timestamp format.
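
To illustrate how the three kinds of expiration value could be told apart, here is a minimal Python sketch. It is not the MLIS CLI implementation: the exact layouts in `_FORMATS`, the UTC default for values without an offset, and the function name `parse_expiration` are all assumptions made for this example.

```python
from datetime import datetime, timezone

# Hypothetical layouts; the formats MLIS actually accepts may differ.
_FORMATS = (
    "%Y-%m-%dT%H:%M:%S%z",  # full timestamp with UTC offset
    "%Y-%m-%d %H:%M:%S",    # simple date-time
    "%Y-%m-%d",             # simple date
)


def parse_expiration(value: str) -> datetime:
    """Return a timezone-aware UTC datetime for an expiration string."""
    text = value.strip()
    for fmt in _FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next layout
        if parsed.tzinfo is None:
            # Assumption: values without an offset are interpreted as UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc)
    raise ValueError(f"unrecognized expiration format: {value!r}")
```

Under these assumptions, a simple date such as `2024-12-31` resolves to midnight UTC on that day, and a value that matches none of the layouts raises `ValueError`.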

Model Management

Admin

  • Feature: Automatically collect and report anonymous customer telemetry data to improve product quality and support.

Known Issues

  • DB Connections: TLS/SSL connections are not supported for either built-in or external databases. This issue will be addressed in a future release.
  • LLM Streaming Response Payload Received All at Once: Streaming responses are received all at once instead of as a continuous stream. See the No Streamed Responses Troubleshooting article for more information and a workaround.
  • UI Mislabels Deployment on Errors Tab: While a deployment is in progress, the Errors tab in the UI may display error messages for a “Deployment B”; this is a mislabeling of the deployment. In this case, ignore the first column and focus on the “Type” and “Message” columns during troubleshooting.
  • Custom Grafana Base URLs: Deployments come with pre-built Grafana dashboard quicklinks. However, if you manually configure grafana.deployment_dashboard_baseurl in values.yaml, the link cannot point to a completely different Grafana instance with a full host path; it must point to a different path on the same Grafana instance.
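
As a rough way to check a custom base URL against the Grafana constraint above before editing values.yaml, the following Python sketch compares the scheme, host, and port of two URLs. The function name `same_grafana_instance` and the example hostnames are hypothetical, and MLIS itself may apply a different rule.

```python
from urllib.parse import urlsplit


def same_grafana_instance(default_url: str, custom_url: str) -> bool:
    """Return True if both URLs share the same scheme, host, and port.

    Only the path may differ for the custom URL to count as pointing
    at the same Grafana instance.
    """
    a, b = urlsplit(default_url), urlsplit(custom_url)
    return (a.scheme, a.hostname, a.port) == (b.scheme, b.hostname, b.port)


# Hypothetical URLs for illustration only.
default = "http://grafana.example.com/d/default-dashboard"
print(same_grafana_instance(default, "http://grafana.example.com/custom/d/team"))
print(same_grafana_instance(default, "http://other-grafana.example.com/d/team"))
```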

1.0.0

May 5, 2024

Welcome to the first Generally Available (GA) release of HPE Machine Learning Inferencing Software (MLIS)! We recommend that you follow the steps outlined in our Get Started section to install and configure this platform.

You can also refer to the object model, environment variable, and Helm chart references to learn more about the platform’s architecture and configuration options.

REST API Docs
You can access the REST API documentation from your MLIS instance by navigating to http://<your-mlis-url>/docs/rest-api/.

Highlights

This release includes the following features:

Registries

Create registries to reference your models from various sources. MLIS supports s3, OpenLLM, and NGC registries.

  • Feature: Perform registry operations via the UI, API, or CLI.
  • Feature: s3 registries include any s3-compatible storage service (e.g., AWS S3 or MinIO).

Packaged Models

Register packaged models to be used in deployments of inference services. These can be either models you’ve trained and uploaded to your registry or pre-existing models provided by your registry. Supported model types include Bento Archive, Custom (openllm or bentoml), bentoml, NIM, and OpenLLM.

  • Feature: Perform packaged model operations via the UI, API, or CLI.
  • Feature: MLIS provides default images to execute bentoml and openllm models from openllm:// and s3:// URLs.
  • Feature: MLIS enables you to pull and execute NIM models directly from the NGC catalog.
  • Feature: You can provide entirely custom bentoml or openllm container images or build a new image on top of our default base container images.
  • Feature: Specify environment variables and arguments to the model container during packaged model creation.
  • Feature: Specify resource templates for the model container during packaged model creation.
  • Feature: Select GPU types for the model container to use during packaged model creation; this must be enabled by an admin.

Deployments

Create deployments to launch inference services. Deployments are created from packaged models and can be scaled horizontally. You can provide users access to the deployment via a generated URL.

  • Feature: Perform deployment operations via the UI, API, or CLI.
  • Feature: Choose a default autoscaling target template or define custom autoscaling targets for your deployment.
  • Feature: Provide environment variables and arguments to the deployment instance during deployment creation.
  • Feature: Require authentication for using a deployed inference service.
  • Feature: Initiate canary rollouts to test new model version performance before full deployment.
  • Feature: Monitor deployment performance and resource usage using pre-built Grafana dashboards. These dashboards include logs (via Loki) and metrics (via Prometheus).
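
When authentication is required for a deployed inference service, a client has to send a credential with each request. The Python sketch below only builds such a request and does not send it. It assumes the token is passed as an HTTP bearer token and that the service accepts a JSON body; the endpoint URL, token value, payload fields, and the helper name `build_inference_request` are all placeholders, so consult your deployment's details for the actual scheme.

```python
import json
from urllib.request import Request


def build_inference_request(endpoint_url: str, token: str, payload: dict) -> Request:
    """Build (but do not send) an authenticated POST request.

    Assumption: the deployment expects an "Authorization: Bearer" header.
    """
    body = json.dumps(payload).encode("utf-8")
    return Request(
        endpoint_url,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
request = build_inference_request(
    "http://deployment.example.com/predict",
    "example-token",
    {"prompt": "Hello"},
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it; that step is left out here because the endpoint path and payload shape depend on the packaged model.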

Admin


Known Issues

  • Custom Grafana Base URLs: Deployments come with pre-built Grafana dashboard quicklinks. However, if you manually configure grafana.deployment_dashboard_baseurl in values.yaml, the link provided on a deployment from the UI will not work.
  • Unsaved Changes: Changes made in a modal (e.g., editing a packaged model) are lost if you click outside the modal without saving.
  • SSO Sign-in Button: You must click on the text of the SSO sign-in button to sign in with SSO; clicking anywhere else on the button itself results in a “user not found” error.
  • NIM Model Dropdown: If you are adding a NIM packaged model for NGC, the dropdown for selecting a NIM model does not display any models due to a NIM v1.0.0 change. You can still manually enter the NIM model name from the NGC catalog in the image field.