# Environment Variables
## CLI
The following environment variables can be set to configure the CLI:
| Variable | Description |
|---|---|
| `AIOLI_USER` | The username used to authenticate the user. |
| `AIOLI_PASS` | The password used to authenticate the user. |
| `AIOLI_USER_TOKEN` | The token used to authenticate the user. |
| `AIOLI_CONTROLLER` | The protocol, IP, and port of the controller (e.g., `http://mycontrollerhostname:80`). |
| `AIOLI_CONTROLLER_CERT_FILE` | The path to the certificate file used by the controller, which is required for setting up external authentication. See our Obtaining a CA Signed Certificate (GKE) and GitHub Authentication guides for example usage. |
| `AIOLI_CONTROLLER_CERT_NAME` | The name of the controller’s certificate file. |
| `AIOLI_DEBUG_CONFIG_PATH` | The path to the debug configuration file. |
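These variables are typically exported in the shell before invoking the CLI. A minimal sketch, assuming the CLI binary is `aioli` and token-based authentication (the controller URL, token placeholder, and subcommand shown are illustrative):

```bash
# Point the CLI at the controller and authenticate with a token.
export AIOLI_CONTROLLER=http://mycontrollerhostname:80
export AIOLI_USER_TOKEN=<your-token>

# Subsequent CLI invocations read these values from the environment.
aioli model list
```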
## Deployment & Packaged Model
The following environment variables can be set while adding a packaged model or creating a deployment to modify the default settings:
| Variable | Description |
|---|---|
| `AIOLI_LOGGER_PORT` | The port that the logger service listens on; default is `49160`. |
| `AIOLI_PROGRESS_DEADLINE` | The deadline for downloading the model; default is `1500s`. |
| `AIOLI_READINESS_FAILURE_THRESHOLD` | The number of readiness probe failures before the deployment is considered unhealthy; default is `100`. |
| `AIOLI_COMMAND_OVERRIDE` | A customized deployment command that overrides the default deployment command within a predefined runtime. |
| `AIOLI_SERVICE_PORT` | The inference service container port used for communication; default is `8080`, except for NIMs, which use `8000`. |
| `AIOLI_DISABLE_MODEL_CACHE` | Disables automatic model caching for a deployment, even if caching is enabled for the model; default is `false`. |
| `AIOLI_DISABLE_LOGGER` | Disables the logger as a workaround for the KServe defect concerning streamed responses. |
Environment variables set on a deployment will override the values set on its packaged model.
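For instance, if both the packaged model and the deployment set `AIOLI_SERVICE_PORT`, the deployment’s value takes effect. A minimal sketch (values are illustrative):

```bash
# Set on the packaged model:
AIOLI_SERVICE_PORT=8080

# Set on the deployment:
AIOLI_SERVICE_PORT=9090

# Effective value in the inference service container: 9090
```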
## Command Override Argument Options
MLIS executes a default command for your container runtime based on the type of packaged model you have selected. You can override this command with the `AIOLI_COMMAND_OVERRIDE` environment variable. Any arguments from the packaged model are appended to the end of the override command, followed by any arguments from the deployment, so the final command is `[AIOLI_COMMAND_OVERRIDE] [MODEL_ARGS] [DEPLOYMENT_ARGS]`.
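To make that composition concrete, here is a hypothetical expansion (the command and arguments below are invented for illustration):

```bash
# AIOLI_COMMAND_OVERRIDE:    "python serve.py"
# Packaged-model arguments:  --max-batch-size 8
# Deployment arguments:      --log-level debug

# Resulting container command:
python serve.py --max-batch-size 8 --log-level debug
```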
The following table shows the default command for each packaged model’s framework type:
| Framework | Command | Description |
|---|---|---|
| OpenLLM | `openllm start --port {{.containerPort}} {{.modelDir}}` | You can add any options from OpenLLM version 0.4.44 to your command (see `openllm start -h`). |
| Bento Archive | `bentoml serve ...` | You can add any options from BentoML version 1.1.11 to your command (see `bentoml serve -h`). |
| Custom | none | For custom models, the default entrypoint for the container is executed. |
| NVIDIA NIM | none | For NIM models, the default entrypoint for the container is executed. You must use environment variables; NIM containers do not honor CLI arguments. |
You can also use the following named arguments within your command:
| Named Argument | Description |
|---|---|
| `{{.numGpus}}` | The number of GPUs the model is requesting. |
| `{{.modelName}}` | The MLIS model name being deployed. |
| `{{.modelDir}}` | The directory into which the model is downloaded, typically `/mnt/models`. This applies to NIM, OpenLLM, and S3 models. |
| `{{.containerPort}}` | The HTTP port that the container must listen on for inference requests and readiness checks. |
### Examples
AIOLI_COMMAND_OVERRIDE="nim_llm --model_name {{.modelName}} --model_path {{.modelDir}} --port {{.containerPort}} --health_port {{.healthPort}} --num_gpus {{.numGpus}}"
AIOLI_COMMAND_OVERRIDE="openllm start {{.modelName}} --port {{.containerPort}} --gpu-memory-utilization 0.9 --max-total-tokens 4096"
AIOLI_COMMAND_OVERRIDE="bentoml serve {{.modelDir}}/bentofile.yaml --production --port {{.containerPort}} --host 0.0.0.0"