No Streamed Responses

Scenario

Responses that are configured for streaming are not streamed to the caller. Instead, the caller receives the whole response at once, after it has been fully generated.
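One way to confirm this symptom from the client side is to timestamp each chunk as it arrives and compare the time to the first chunk with the total request time. The sketch below is only a heuristic: the helper names (record_arrivals, looks_buffered) and the 0.9 threshold are assumptions chosen for illustration, not part of KServe or any product API.

```python
import time
from typing import Callable, Iterable, List, Tuple


def record_arrivals(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[float, List[float]]:
    """Consume an iterable of chunks and return (start time, arrival times)."""
    start = clock()
    # The clock is read once per chunk, right after that chunk is yielded.
    arrivals = [clock() for _ in chunks]
    return start, arrivals


def looks_buffered(start: float, arrivals: List[float], ratio: float = 0.9) -> bool:
    """Heuristic: True if the first chunk only arrived near the end of the request.

    A streamed response usually delivers its first chunk early; a buffered one
    delivers everything at (or very near) the end. The 0.9 ratio is an
    arbitrary, hypothetical threshold.
    """
    if not arrivals:
        return False  # nothing was received, so nothing can be concluded
    total = arrivals[-1] - start
    if total <= 0:
        return False  # no measurable duration
    time_to_first = arrivals[0] - start
    return time_to_first / total >= ratio
```

The chunks can come from any HTTP client that exposes the body incrementally. With the requests library, for example, they could be taken from response.iter_content(chunk_size=None) on a request made with stream=True. A response that arrives as a single chunk is always reported as buffered, which matches the behavior described here.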

Triage

A known KServe defect prevents streamed responses when logging is enabled. The agent sidecar added for logging disrupts the streaming process, so the caller receives the response only after it has been fully generated.

A bug ticket has been filed with the KServe team to address this issue.

Resolution

There is currently no resolution for this issue.

As a workaround, set AIOLI_DISABLE_LOGGING=1 in the environment variables of your packaged model or deployment. This disables KServe request/response logging but allows streaming to work as expected.