
Increase LLM Request Timeout

Requests to an LLM can take a long time to finish, especially when using reasoning models. This means we may need to increase the request timeout.

In Spring AI, most model clients use RestClient for non-streaming requests and WebClient for streaming requests. To increase the request timeout, we need to configure both RestClient and WebClient.

The actual HTTP client library behind RestClient and WebClient depends on which libraries are available on the classpath. The request timeout is configured on that underlying HTTP client library.

RestClient

To configure RestClient, we can provide a bean of type RestClient.Builder with a custom ClientHttpRequestFactory. Each ClientHttpRequestFactory implementation has its own way to configure the request timeout.

Implementation                           | Method
JdkClientHttpRequestFactory              | setReadTimeout
JettyClientHttpRequestFactory            | setReadTimeout
HttpComponentsClientHttpRequestFactory   | setReadTimeout

WebClient

To configure WebClient, we can provide a bean of type WebClient.Builder with a custom ClientHttpConnector. Each ClientHttpConnector implementation has its own way to configure the request timeout.

Only JdkClientHttpConnector provides a setReadTimeout method to set the read timeout. For other ClientHttpConnector implementations, we need to configure the underlying HTTP client object directly.
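As an illustration of configuring the underlying HTTP client object directly, the fragment below sketches a WebClient.Builder bean backed by Reactor Netty. It assumes reactor-netty is on the classpath and that Reactor Netty's responseTimeout is the setting we want; the class name ReactorNettyWebClientConfiguration is hypothetical, and the three-minute value is only an example.

```java
import java.time.Duration;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.http.client.reactive.ReactorClientHttpConnector;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.netty.http.client.HttpClient;

// Hypothetical configuration class; an alternative to a JDK-based WebClient.Builder bean.
@Configuration
public class ReactorNettyWebClientConfiguration {

  // Example value (assumption): three minutes.
  private static final Duration API_TIMEOUT = Duration.ofMinutes(3);

  @Bean
  public WebClient.Builder webClientBuilder() {
    // ReactorClientHttpConnector has no setReadTimeout, so the timeout is set
    // on the Reactor Netty HttpClient itself before wrapping it.
    HttpClient httpClient = HttpClient.create().responseTimeout(API_TIMEOUT);
    return WebClient.builder().clientConnector(new ReactorClientHttpConnector(httpClient));
  }
}
```

Only one WebClient.Builder bean should be defined, so this would replace, not complement, a JDK-based one.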

JDK HTTP Client

It's recommended to use the JDK HTTP client to avoid introducing extra third-party libraries.

Of course, if your application already uses another HTTP client library, it's better to keep using that library.

The code below shows an example of configuring the JDK HTTP client. The timeout is set to three minutes. This configuration provides beans of HttpClient, RestClient.Builder and WebClient.Builder.

Spring configuration
package com.javaaidev.agent;

import java.net.http.HttpClient;
import java.time.Duration;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.SimpleAsyncTaskExecutor;
import org.springframework.http.client.JdkClientHttpRequestFactory;
import org.springframework.http.client.reactive.JdkClientHttpConnector;
import org.springframework.web.client.RestClient;
import org.springframework.web.reactive.function.client.WebClient;

@Configuration
public class AppConfiguration {

  // Single timeout value shared by all beans below.
  private static final Duration API_TIMEOUT = Duration.ofMinutes(3);

  // Used for non-streaming requests.
  @Bean
  public RestClient.Builder restClientBuilder(HttpClient httpClient) {
    JdkClientHttpRequestFactory requestFactory = new JdkClientHttpRequestFactory(httpClient);
    requestFactory.setReadTimeout(API_TIMEOUT);
    return RestClient.builder().requestFactory(requestFactory);
  }

  // Used for streaming requests.
  @Bean
  public WebClient.Builder webClientBuilder(HttpClient httpClient) {
    var connector = new JdkClientHttpConnector(httpClient);
    connector.setReadTimeout(API_TIMEOUT);
    return WebClient.builder().clientConnector(connector);
  }

  // Shared JDK HTTP client used by both builders.
  @Bean
  public HttpClient httpClient() {
    var executor = new SimpleAsyncTaskExecutor();
    // Virtual threads require Java 21 or later.
    executor.setVirtualThreads(true);
    return HttpClient.newBuilder()
        .executor(executor)
        // The connect timeout reuses the same three-minute value.
        .connectTimeout(API_TIMEOUT)
        .build();
  }
}
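
The configuration above sets two different timeouts: a connect timeout on the HttpClient and a read timeout on the Spring wrappers. The JDK-only sketch below shows the same distinction at the java.net.http level, without Spring. It assumes that the read timeout roughly corresponds to the per-request timeout on HttpRequest. The class name JdkTimeoutSketch, the endpoint URL and the 10-second connect timeout are all hypothetical values chosen for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

// Hypothetical demo class; it builds the client and request but sends nothing.
public class JdkTimeoutSketch {

  // Assumption: connecting should fail fast, so this is much shorter than the request timeout.
  static final Duration CONNECT_TIMEOUT = Duration.ofSeconds(10);
  // Time allowed for the whole request, matching the three minutes used above.
  static final Duration REQUEST_TIMEOUT = Duration.ofMinutes(3);

  // The connect timeout is a property of the client and applies to every request it sends.
  static HttpClient newClient() {
    return HttpClient.newBuilder()
        .connectTimeout(CONNECT_TIMEOUT)
        .build();
  }

  // The request timeout is set per request on the HttpRequest builder.
  static HttpRequest newRequest(String url) {
    return HttpRequest.newBuilder(URI.create(url))
        .timeout(REQUEST_TIMEOUT)
        .GET()
        .build();
  }

  public static void main(String[] args) {
    HttpClient client = newClient();
    // Hypothetical endpoint, used only to build the request object.
    HttpRequest request = newRequest("https://example.com/v1/chat/completions");
    System.out.println("connect timeout: " + client.connectTimeout().orElseThrow()); // PT10S
    System.out.println("request timeout: " + request.timeout().orElseThrow());       // PT3M
  }
}
```

Keeping the connect timeout short while allowing a long request timeout is one possible design choice: an unreachable server is reported quickly, while a slow reasoning model still gets enough time to respond.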