Environment Variables

CLI

The following environment variables can be set to configure the CLI:

| Variable | Description |
| --- | --- |
| AIOLI_USER | The username used to authenticate the user. |
| AIOLI_PASS | The password used to authenticate the user. |
| AIOLI_USER_TOKEN | The token used to authenticate the user. |
| AIOLI_CONTROLLER | The protocol, IP, and port of the controller (e.g., http://mycontrollerhostname:80). |
| AIOLI_CONTROLLER_CERT_FILE | The path to the certificate file used by the controller, which is required for setting up external authentication. See our Obtaining a CA Signed Certificate (GKE) and GitHub Authentication guides for example usage. |
| AIOLI_CONTROLLER_CERT_NAME | The name of the controller’s certificate file. |
| AIOLI_DEBUG_CONFIG_PATH | The path to the debug configuration file. |
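For example, you might export these variables in your shell before running CLI commands. The controller address, username, and token below are placeholder values for illustration:

```shell
# Placeholder values for illustration; substitute your own controller
# address and credentials.
export AIOLI_CONTROLLER=http://mycontrollerhostname:80
export AIOLI_USER=admin
export AIOLI_USER_TOKEN=my-token   # placeholder; use a real token

echo "Authenticating $AIOLI_USER against $AIOLI_CONTROLLER"
```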

Deployment & Packaged Model

The following environment variables can be set while adding a packaged model or creating a deployment to modify the default settings:

| Variable | Description |
| --- | --- |
| AIOLI_LOGGER_PORT | The port that the logger service listens on; default is 49160. |
| AIOLI_PROGRESS_DEADLINE | The deadline for downloading the model; default is 1500s. |
| AIOLI_READINESS_FAILURE_THRESHOLD | The number of readiness probe failures before the deployment is considered unhealthy; default is 100. |
| AIOLI_COMMAND_OVERRIDE | A customized deployment command that overrides the default deployment command within a predefined runtime (e.g., for NIM containers). Useful if you want to switch to the nim_llm runtime for running vLLM models. |

Environment variables set on a deployment will override the values set on its packaged model.
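The precedence rule can be pictured with a small shell sketch. This is an illustration of the rule only, not how MLIS implements the merge internally, and the port values are hypothetical:

```shell
# Hypothetical values: the packaged model sets one logger port and the
# deployment sets another.
model_logger_port=49160        # value set on the packaged model
deployment_logger_port=49200   # value set on the deployment

# The deployment value wins whenever it is set; otherwise the packaged
# model's value applies.
effective_port=${deployment_logger_port:-$model_logger_port}
echo "AIOLI_LOGGER_PORT=$effective_port"
```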

Command Override Argument Options

MLIS executes a default command for your container runtime based on the type of packaged model you have selected. However, you can modify this command using the AIOLI_COMMAND_OVERRIDE environment variable. Any arguments from the packaged model are then appended to the end of this command, followed by any arguments from the deployment (AIOLI_COMMAND_OVERRIDE = [CLI_COMMAND] [MODEL_ARGS] [DEPLOYMENT_ARGS]).
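The assembly order above can be sketched in shell. The individual argument values here are hypothetical; MLIS builds the real command from your override string and the stored model and deployment arguments:

```shell
# Sketch of the [CLI_COMMAND] [MODEL_ARGS] [DEPLOYMENT_ARGS] pattern.
CLI_COMMAND="openllm start --port 8080 /mnt/models"   # from AIOLI_COMMAND_OVERRIDE
MODEL_ARGS="--max-total-tokens 4096"                  # from the packaged model
DEPLOYMENT_ARGS="--gpu-memory-utilization 0.9"        # from the deployment

FINAL_COMMAND="$CLI_COMMAND $MODEL_ARGS $DEPLOYMENT_ARGS"
echo "$FINAL_COMMAND"
```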

The following table shows the default command for each packaged model’s framework type:

| Framework | Command | Description |
| --- | --- | --- |
| OpenLLM | openllm start --port {{.containerPort}} {{.modelDir}} | You can add any options from OpenLLM version 0.4.44 to your command (see openllm start -h). |
| Bento Archive | bentoml serve ... | You can add any options from BentoML version 1.1.11 to your command (see bentoml serve -h). |
| Custom | none | For custom models, the default entrypoint for the container is executed. |
| NVIDIA NIM | none | For NIM models, the default entrypoint for the container is executed. You must use environment variables; NIM containers do not honor CLI arguments. |

You can also use the following variables to modify the command’s arguments:

| Named Argument | Description |
| --- | --- |
| {{.numGpus}} | The number of GPUs the model is requesting. |
| {{.modelName}} | The MLIS model name being deployed. |
| {{.modelDir}} | The directory into which the model will be downloaded. This is typically /mnt/models. This applies to NIM, OpenLLM, and S3 models. |
| {{.containerPort}} | The HTTP port that the container must listen on for inference requests and readiness checks. |
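To illustrate how the placeholders behave, the sed sketch below substitutes sample values into an override string the way MLIS fills them in at deployment time. The port and directory values are assumptions for this sketch:

```shell
# Go-template-style placeholders in the override string; MLIS supplies
# the real values at deployment time. 8080 and /mnt/models are sample
# values for this illustration.
TEMPLATE='openllm start --port {{.containerPort}} {{.modelDir}}'

RESOLVED=$(printf '%s' "$TEMPLATE" \
  | sed -e 's|{{\.containerPort}}|8080|' \
        -e 's|{{\.modelDir}}|/mnt/models|')
echo "$RESOLVED"
```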

Examples

```shell
AIOLI_COMMAND_OVERRIDE="nim_llm --model_name {{.modelName}} --model_path {{.modelDir}} --port {{.containerPort}} --health_port {{.healthPort}} --num_gpus {{.numGpus}}"
AIOLI_COMMAND_OVERRIDE="openllm start {{.modelName}} --port {{.containerPort}} --gpu-memory-utilization 0.9 --max-total-tokens 4096"
AIOLI_COMMAND_OVERRIDE="bentoml serve {{.modelDir}}/bentofile.yaml --production --port {{.containerPort}} --host 0.0.0.0"
```