Platform Setup Quickstart
This guide will walk you through the steps to deploy the HPE Machine Learning Inferencing Software platform on a Kubernetes cluster using the Helm chart. If your administrator has already deployed the HPE Machine Learning Inferencing Software platform, you can skip this guide and proceed to the Service Deployment Quickstart.
Before You Start #
- Complete the HPE MSC Docker Registry setup
- Create secrets for:
  - Private Docker Registry (to pull private images)
  - TLS (for HTTPS access to API)
- Ensure the availability of prerequisites for HPE Machine Learning Inferencing Software installation:
Component | Minimum Version | Latest Version Validated | Dependency |
---|---|---|---|
Kubernetes | 1.20 | 1.30 | Core |
Docker | 2.6.0 | 2.6.0 | Core |
Helm | 3.0 | 3.13.2 | Core |
KServe | 0.11 | 0.14 | Core |
Istio | 1.18 | 1.20.4 | KServe |
Istio Client | | 1.20.1 | KServe |
Istio Control Plane | | 1.20.4 | KServe |
Istio Data Plane | | 1.20.4 | KServe |
Knative | 1.10 | 1.14.5 | KServe |
Knative Operator | | 1.14.5 | KServe |
Knative Serving | | 1.13.1 | KServe |
Cert Manager | 1.9.0 | 1.15.1 | KServe |
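A minimal sketch of checking installed versions against the minimums in the table. The helper names version_ge and check_min are hypothetical, the sample version strings are placeholders rather than real tool output, and the comparison assumes a sort that supports -V (GNU coreutils does).

```shell
#!/bin/sh
# version_ge ACTUAL MINIMUM
# Succeeds when ACTUAL is at least MINIMUM, using version-aware sorting.
# Assumption: sort(1) supports -V, as GNU coreutils does.
version_ge() {
  actual="${1#v}"     # tolerate a leading "v", as in "v1.30.2"
  minimum="${2#v}"
  lowest=$(printf '%s\n%s\n' "$actual" "$minimum" | sort -V | head -n 1)
  [ "$lowest" = "$minimum" ]
}

# check_min NAME ACTUAL MINIMUM prints a one-line verdict for a component.
check_min() {
  if version_ge "$2" "$3"; then
    printf '%s %s: ok (minimum %s)\n' "$1" "$2" "$3"
  else
    printf '%s %s: too old (minimum %s)\n' "$1" "$2" "$3"
  fi
}

# Placeholder versions; replace them with what your own tools report.
check_min Kubernetes v1.30.2 1.20
check_min Helm 3.13.2 3.0
check_min KServe 0.10.1 0.11
```

Write both versions with the same number of components where possible, because version sorting places 1.20 before 1.20.0 and the check would then report 1.20 as older than a 1.20.0 minimum.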
MLIS has been installed successfully on:
- On-premises:
- Kubernetes
- SLES 15 Rancher (RKE2)
- MicroK8s
- GKE
- Kind
- Docker Desktop Kubernetes
EKS is known to not work out-of-the-box due to a need for custom service accounts. You may need to regenerate the service accounts with the required annotations to deploy successfully on EKS.
How to Deploy the Platform #
1. Obtain the Helm Chart #
The token required for the helm registry login options can be retrieved by clicking the Access Your Products link in the HPE Software Delivery Receipt Email. It is displayed as the key value under the Product Info column.
2. Install KServe #
Refer to the official KServe serverless installation guide for instructions on how to install KServe on your Kubernetes cluster.
3. Configure DNS #
Access to inference service deployments is best achieved by configuring your Kubernetes cluster with a DNS domain name so that your inference services are accessible via a hostname. Otherwise, you must set up port forwarding in Part 6 to access the services (see the Pending IP tab).
If you do not have a DNS domain name for your Kubernetes cluster and do not wish to use port forwarding, you could instead use a magic DNS service such as nip.io to map an external IP to a hostname.
To configure the magic DNS service, create a ConfigMap in the knative-serving namespace with the following command:
kubectl create configmap config-domain \
--namespace knative-serving \
--from-literal <CLUSTER-EXTERNAL-IP>.nip.io="" \
--dry-run=client -o yaml | kubectl replace -f -
This will enable you to access the cluster via the hostname <CLUSTER-EXTERNAL-IP>.nip.io and an inference service via http://<INFERENCE-SERVICE-NAME>.<NAMESPACE>.<CLUSTER-EXTERNAL-IP>.nip.io. See the Configure Magic DNS Service guide for more detailed steps.
Alternative magic DNS solutions include localtls, sslip.io, and local.gd.
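As an illustration of how a nip.io hostname is assembled, the sketch below builds an inference service URL from its parts. It assumes Knative's default domain template ({{.Name}}.{{.Namespace}}.{{.Domain}}) is unchanged; the function name isvc_url and the sample service and namespace names are hypothetical.

```shell
#!/bin/sh
# isvc_url SERVICE NAMESPACE EXTERNAL_IP
# Builds the URL an inference service would get when <EXTERNAL_IP>.nip.io is
# the cluster domain. Assumption: Knative's default domain template,
# {{.Name}}.{{.Namespace}}.{{.Domain}}, has not been customized.
isvc_url() {
  printf 'http://%s.%s.%s.nip.io\n' "$1" "$2" "$3"
}

# Hypothetical service "my-model" in the "default" namespace.
isvc_url my-model default 35.224.128.245
```

Because nip.io resolves any name ending in <IP>.nip.io to that IP, no DNS records need to be created for individual services.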
4. Evaluate Deployment Needs #
- Evaluate your deployment needs for the following optional configurations that involve the Helm chart:
  - TLS/HTTPS: Enable HTTPS access to the REST API.
  - External Auth: Enable external authentication services using OIDC and other identity providers.
  - Observability: Disable or enable observability features.
  - Node Selectors: Control which nodes HPE Machine Learning Inferencing Software is deployed on.
  - Rancher Engine: Configure Loki Service to work with CoreDNS if applicable.
  - Namespace In/Exclusions: Filter out namespaces not to be used for deployment.
  - GPU Selection: Enable selecting GPU types when defining a packaged model.
- Optionally update your values.yaml file with the necessary configurations for any of the previous options.
- Add it to the helm install mlis command in the next step using --values values.yaml.
Configure Garbage Collection #
Unless otherwise configured, inactive inference service replicas shut down after 15 hours. Knative provides garbage collection configuration options to customize the cleanup behavior of inactive replicas. GPUs allocated to a replica are not released until the replica is shut down.
We recommend that you configure Knative for immediate cleanup of inactive replicas (Knative Revisions). The following command will configure Knative to immediately clean up inactive replicas:
kubectl patch cm -n knative-serving config-gc --type=strategic \
-p '{"data":{"min-non-active-revisions":"0", "max-non-active-revisions": "0", "retain-since-create-time": "disabled","retain-since-last-active-time": "disabled"}}'
5. Install the Controller via Helm #
- Download the Helm chart from your HPE MSC account.
- Run the helm install command, passing in:
  - Your required secrets
  - An optional values.yaml file for configurations (TLS, external auth)
  - A default password for the admin user
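The sketch below assembles a helm install command line from the pieces listed above so it can be reviewed before it is run. It only prints the command. The function name build_install_cmd and the chart path are hypothetical, and the settings that pass your required secrets are omitted because their names are not given in this guide.

```shell
#!/bin/sh
# build_install_cmd RELEASE CHART [VALUES_FILE] [DEFAULT_PASSWORD]
# Prints a helm install command for review; it does not execute anything.
# Assumption: CHART is whatever chart reference you downloaded; the options
# for your required secrets are not shown here and must be added by you.
build_install_cmd() {
  release="$1"
  chart="$2"
  values="${3:-}"
  password="${4:-}"
  cmd="helm install $release $chart"
  if [ -n "$values" ]; then
    cmd="$cmd --values $values"
  fi
  if [ -n "$password" ]; then
    cmd="$cmd --set defaultPassword=$password"
  fi
  printf '%s\n' "$cmd"
}

# Hypothetical chart path and password, for illustration only.
build_install_cmd mlis ./chart.tgz values.yaml 'ChangeMe123'
```

The printed line is not shell-quoted, so a password containing spaces or shell metacharacters should be quoted by hand when you run the real command.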
If you skip setting the admin password during installation (using --set defaultPassword), the system generates one automatically. You can retrieve this generated password by using the following command and replacing <RELEASE_NAME> with the name of the Helm release (e.g., mlis):
kubectl get secrets aioli-master-config-<RELEASE_NAME> \
--template='{{ index .data "aioli-master.yaml" | base64decode }}' | grep defaultPassword
If you use this command after updating the admin password, it will still only return the original default password.
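To get only the password value rather than the whole matching line, the output of the command above could be piped through a small filter such as the one sketched here. The function name extract_default_password is hypothetical, and the filter assumes the decoded file contains a plain "defaultPassword: value" line; the sample YAML is made up for illustration.

```shell
#!/bin/sh
# extract_default_password reads YAML text on stdin and prints the value of
# the first defaultPassword key, with surrounding quotes removed.
# Assumption: the key appears on a single line as "defaultPassword: value".
extract_default_password() {
  sed -n 's/^[[:space:]]*defaultPassword:[[:space:]]*//p' \
    | head -n 1 \
    | sed -e 's/^"\(.*\)"$/\1/' -e "s/^'\(.*\)'\$/\1/"
}

# Made-up sample input standing in for the decoded secret.
printf 'port: 8080\ndefaultPassword: "abc123"\n' | extract_default_password

# With a real cluster, pipe the kubectl command from this section into it:
#   kubectl get secrets aioli-master-config-<RELEASE_NAME> \
#     --template='{{ index .data "aioli-master.yaml" | base64decode }}' \
#     | extract_default_password
```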
6. Obtain Controller Address #
Once installed, identify the host/IP to be used to communicate with the Controller. Typically this is via the EXTERNAL-IP of the aioli-proxy service.
- Enter the following command to obtain the external IP address of the aioli-proxy service.
kubectl get svc/aioli-proxy
NAME          TYPE           CLUSTER-IP    EXTERNAL-IP      PORT(S)        AGE
aioli-proxy   LoadBalancer   10.80.7.201   35.224.128.245   80:30819/TCP   17m
- Once you’ve identified how to access the HPE Machine Learning Inferencing Software controller, set up the AIOLI_CONTROLLER environment variable to reference the proper address by default, or specify it via the AIOLI command option -c/--controller.
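The two steps above can be combined in a short script. This sketch parses the tabular output of kubectl get svc/aioli-proxy and exports AIOLI_CONTROLLER from the EXTERNAL-IP column. The function name external_ip is hypothetical, the sample text is the output shown above, and the http:// prefix is an assumption for a deployment without TLS.

```shell
#!/bin/sh
# external_ip reads `kubectl get svc/aioli-proxy` output on stdin and prints
# the EXTERNAL-IP column of the aioli-proxy row. It fails when the row is
# missing or the address has not been assigned yet.
external_ip() {
  ip=$(awk '$1 == "aioli-proxy" { print $4 }')
  case "$ip" in
    ''|'<pending>'|'<none>') return 1 ;;
    *) printf '%s\n' "$ip" ;;
  esac
}

# Sample output copied from this guide; with a real cluster, replace the
# printf with: kubectl get svc/aioli-proxy
sample='NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
aioli-proxy LoadBalancer 10.80.7.201 35.224.128.245 80:30819/TCP 17m'

if addr=$(printf '%s\n' "$sample" | external_ip); then
  # Assumption: plain HTTP; use https:// when authentication is enabled.
  export AIOLI_CONTROLLER="http://$addr"
  echo "AIOLI_CONTROLLER=$AIOLI_CONTROLLER"
else
  echo "aioli-proxy has no external IP yet" >&2
fi
```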
7. Sign in #
If authentication has been enabled, the AIOLI_CONTROLLER address should be defined to use https://....
The CLI can be configured with the environment variables AIOLI_CONTROLLER_CERT_FILE and AIOLI_CONTROLLER_CERT_NAME to point to the full certificate chain. If the certificate is not signed by a well-known Certificate Authority (CA), the user can adopt a trust-on-first-use approach by trusting the SHA256 fingerprint of the certificate chain.
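A minimal sketch of setting these variables together. The function name use_tls_controller and the host and file names in the comments are hypothetical. The meaning of AIOLI_CONTROLLER_CERT_NAME is not described in this guide, so the sketch only sets it when a value is supplied.

```shell
#!/bin/sh
# use_tls_controller HOST CERT_FILE [CERT_NAME]
# Exports the controller address with an https:// scheme and points the CLI
# at a certificate chain file. Refuses to continue if the file is unreadable.
use_tls_controller() {
  if [ ! -r "$2" ]; then
    echo "certificate file not readable: $2" >&2
    return 1
  fi
  export AIOLI_CONTROLLER="https://$1"
  export AIOLI_CONTROLLER_CERT_FILE="$2"
  # Only set when provided; its exact semantics are not covered here.
  if [ -n "${3:-}" ]; then
    export AIOLI_CONTROLLER_CERT_NAME="$3"
  fi
}

# Example call (hypothetical host and path):
#   use_tls_controller mlis.example.com /etc/ssl/mlis/chain.pem
```

If you want to compare a SHA256 fingerprint before trusting a certificate, openssl x509 -in <FILE> -noout -fingerprint -sha256 prints one for the first certificate in a PEM file.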
How to Configure Users #
1. Change the Default Password #
If you did not set the defaultPassword during installation, you should change the default password for the admin user at this point.
aioli user login admin
aioli user change-password admin
2. Add User Accounts & Roles #
Provision additional user accounts as needed using the command aioli user create --<ROLE_NAME>.
Roles | Permissions |
---|---|
admin | Can update permissions for other users |
viewer | Can view AIOLI resources and their own deployment tokens |
maintainer | Can perform CRUD operations on AIOLI resources (registries, models, deployments, tokens, and resource/autoscaling templates) |
robot | Can perform read operations on models and view their own deployment tokens; requires one-time authentication/authorization |
The viewer role is automatically assigned if no role is specified during user creation.
You can also manage roles using the aioli rbac command:
- Add Role: aioli rbac assign-role <ROLE_NAME> -u <USER_NAME>
- Remove Role: aioli rbac unassign-role <ROLE_NAME> -u <USER_NAME>
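When several accounts need roles, the assign-role command can be generated from a list. The sketch below is a dry run: it reads user:role pairs, checks each role against the four roles in the table above, and prints the commands without running them. The function names and the sample users are hypothetical.

```shell
#!/bin/sh
# is_known_role ROLE succeeds for the roles listed in this guide's table.
is_known_role() {
  case "$1" in
    admin|viewer|maintainer|robot) return 0 ;;
    *) return 1 ;;
  esac
}

# plan_role_assignments reads "user:role" lines on stdin and prints one
# `aioli rbac assign-role` command per valid line. Nothing is executed.
# Lines with a missing or unknown role are reported on stderr and skipped.
plan_role_assignments() {
  while IFS=: read -r user role; do
    if [ -z "$user" ]; then
      continue
    fi
    if is_known_role "$role"; then
      printf 'aioli rbac assign-role %s -u %s\n' "$role" "$user"
    else
      printf 'skipping %s: unknown role "%s"\n' "$user" "$role" >&2
    fi
  done
}

# Hypothetical users; the third line is rejected because "owner" is not a role.
printf '%s\n' 'alice:maintainer' 'ci-bot:robot' 'bob:owner' | plan_role_assignments
```

After reviewing the printed commands, they can be run one by one or piped to sh once you are signed in as an admin user.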
Next Steps #
You are now ready to continue to the Service Deployment Quickstart guide.