Air-Gapped Deployment Considerations

Deploying HPE Machine Learning Inferencing Software (MLIS) in an air-gapped environment presents unique challenges and limitations because models are typically pulled from external repositories, such as HuggingFace or NGC, and may require the installation of additional Python libraries when a deployment is launched. In an air-gapped environment, neither these models nor their libraries can be accessed directly.

This document provides a high-level guide to the deployment requirements, available functionality, and limitations of running HPE Machine Learning Inferencing Software in such environments.

Deployment Requirements

Local Object Store

An S3-compliant object store is required to host models and dependencies. Options include MinIO, Ceph, and OpenIO.
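For example, once the object store is running, you can verify connectivity and stage a model bucket with any S3 client. The following is a minimal sketch using boto3; the endpoint URL, credentials, and bucket name are placeholders for illustration.

```python
import boto3

# Point the S3 client at the local object store instead of AWS.
# The endpoint, credentials, and bucket name below are placeholders.
s3 = boto3.client(
    "s3",
    endpoint_url="http://minio.internal:9000",
    aws_access_key_id="LOCAL_ACCESS_KEY",
    aws_secret_access_key="LOCAL_SECRET_KEY",
)

# Create a bucket to hold model artifacts, then confirm it is reachable.
s3.create_bucket(Bucket="mlis-models")
print(s3.list_buckets()["Buckets"])
```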

Local Python Package Registry

A local Python package registry is required to host all necessary Python packages, including those required by MLIS and its dependencies. See Hosting your own simple repository and solutions such as devpi for guidance on setting up a local Python package registry.
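As a quick sanity check, you can confirm that a required package is published to the local index by querying its PEP 503 "simple" page. This is a minimal sketch assuming a hypothetical local index at pypi.internal; adjust the URL and package names for your registry.

```python
import requests

# Hypothetical local package index; replace with your registry's URL.
INDEX_URL = "http://pypi.internal/simple"

def package_available(name: str) -> bool:
    """Return True if the local index serves a PEP 503 page for `name`."""
    resp = requests.get(f"{INDEX_URL}/{name}/")
    return resp.status_code == 200

for pkg in ("torch", "transformers"):
    print(pkg, "available" if package_available(pkg) else "missing")
```

Clients inside the environment can then install from this index, for example with pip's --index-url option.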

Local Docker Registry

A local Docker registry is required to host mirrors of all necessary images. Which images are needed depends on the exact models and model libraries your team uses.

In addition, images for all of the following components must be available locally: KServe, Knative, cert-manager, and Istio.
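One common way to mirror images is to pull them on a connected host, retag them for the local registry, and push them across. Below is a minimal sketch using the Docker SDK for Python; the registry hostname and image list are illustrative placeholders, not an exhaustive set.

```python
import docker

# Hypothetical local registry; replace with your registry's hostname.
LOCAL_REGISTRY = "registry.internal:5000"

# Illustrative upstream images; mirror whatever your deployment needs.
IMAGES = [
    "kserve/kserve-controller:v0.11.0",
    "istio/pilot:1.19.0",
]

client = docker.from_env()

for ref in IMAGES:
    repo, tag = ref.rsplit(":", 1)
    image = client.images.pull(repo, tag=tag)   # pull from the upstream registry
    target_repo = f"{LOCAL_REGISTRY}/{repo}"    # retag under the local registry
    image.tag(target_repo, tag=tag)
    client.images.push(target_repo, tag=tag)    # push into the air-gapped mirror
    print("mirrored", f"{target_repo}:{tag}")
```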


Available Functionality

All of HPE Machine Learning Inferencing Software’s core functionality is available in an air-gapped environment.

Limitations are primarily related to the availability of models and their dependencies. See the Limitations section for more information.


Considerations

Local API Replication

S3 and HuggingFace endpoints are configurable via the registry object’s endpointURL attribute. If local replicas of these APIs are available, they function fully in an air-gapped environment, meaning that you can mirror these model registries locally.
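For example, the huggingface_hub client honors the HF_ENDPOINT environment variable, so a local HuggingFace-compatible mirror can serve model downloads. This is a minimal sketch; the mirror URL and repository ID are placeholders for illustration.

```python
import os

# Point the HuggingFace client at a local mirror. The variable must be set
# before huggingface_hub is imported; the URL and repo ID are placeholders.
os.environ["HF_ENDPOINT"] = "http://hf-mirror.internal"

from huggingface_hub import snapshot_download

# Download the full model snapshot from the local mirror.
local_dir = snapshot_download(repo_id="bert-base-uncased")
print("model files in", local_dir)
```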

Containers & Models

  • HuggingFace Models: First build a container that bundles the necessary model so it can run standalone, then host that container in your local Docker registry. Alternatively, if your model does not require downloading additional Python libraries during deployment (openllm start), you can host the model in the local S3-compatible object store; see the sketch below.
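One way to stage such a model is to download its snapshot on a connected machine and then upload the files to the local object store. Below is a minimal sketch combining huggingface_hub and boto3; the model ID, bucket, and endpoint are illustrative placeholders, and credentials are assumed to come from the environment.

```python
from pathlib import Path

import boto3
from huggingface_hub import snapshot_download

# Placeholders for illustration; substitute your own values.
MODEL_ID = "bert-base-uncased"
BUCKET = "mlis-models"
ENDPOINT = "http://minio.internal:9000"

# Step 1 (connected machine): download the model snapshot from HuggingFace.
local_dir = Path(snapshot_download(repo_id=MODEL_ID))

# Step 2 (air-gapped side, after transferring the files): upload every file
# to the local S3-compatible object store, preserving relative paths.
s3 = boto3.client("s3", endpoint_url=ENDPOINT)
for path in local_dir.rglob("*"):
    if path.is_file():
        key = f"{MODEL_ID}/{path.relative_to(local_dir)}"
        s3.upload_file(str(path), BUCKET, key)
        print("uploaded", key)
```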

Limitations

  • NGC/NIM: Not supported in air-gapped environments because these models require a validation process against the NGC registry.