Spring AI — Getting Started & Fundamentals — Avaneesh Yadav

Spring AI brings LLMs into Spring Boot with the same ergonomics you already know: starters, auto-configuration, and beans you inject. This page gets you from zero to your first working calls.

The mental model

Spring AI is built on provider-agnostic abstractions:

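ChatModel / ChatClient: chat completions. ChatModel is the low-level provider interface; the fluent ChatClient wraps it and is what you'll use day to day.
EmbeddingModel: turns text into vectors (for RAG).
VectorStore: stores and queries embeddings.
ImageModel and the audio (transcription and text-to-speech) models: multimodal where the provider supports it.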

You depend on the abstraction; a starter wires in the concrete provider. Switching from OpenAI to Anthropic is mostly a dependency + config change, not a code rewrite.
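
To make "depend on the abstraction" concrete, here is a plain-Java sketch that runs without Spring AI on the classpath. TextModel, FakeOpenAi, FakeAnthropic, and Summarizer are hypothetical names invented for this illustration; they stand in for ChatModel and for the provider implementations a starter would register as beans.

```java
// Hypothetical stand-in for a provider-agnostic interface such as ChatModel.
interface TextModel {
    String complete(String prompt);
}

// Two fake "providers"; real ones would call a remote API.
final class FakeOpenAi implements TextModel {
    public String complete(String prompt) { return "[openai] " + prompt; }
}

final class FakeAnthropic implements TextModel {
    public String complete(String prompt) { return "[anthropic] " + prompt; }
}

// Application code sees only the interface, so swapping providers changes
// what gets injected, not this class.
final class Summarizer {
    private final TextModel model;

    Summarizer(TextModel model) { this.model = model; }

    String summarize(String text) {
        return model.complete("Summarize: " + text);
    }
}
```

Constructing Summarizer with either fake gives the same application behavior with a different backend, which is the property the starters rely on.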

1. Add a model starter

Pick a provider starter (you can have more than one). For Anthropic Claude:

<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-starter-model-anthropic</artifactId>
</dependency>

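The dependency above has no version, which assumes the Spring AI BOM (spring-ai-bom) is imported in your dependencyManagement section. Then set the key and the default chat options in application.properties:

spring.ai.anthropic.api-key=${ANTHROPIC_API_KEY}
spring.ai.anthropic.chat.options.model=claude-sonnet-4-6
spring.ai.anthropic.chat.options.temperature=0.4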

Keep the API key in an environment variable — never commit it.
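
One way to catch a missing key early is a fail-fast check at startup. The sketch below is plain Java and is not part of Spring AI; ApiKeys and require are hypothetical names, and the environment is passed in as a map so the check is easy to test.

```java
import java.util.Map;

// Hypothetical helper: fails fast when a required variable is absent or blank.
final class ApiKeys {
    private ApiKeys() {}

    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }
}
```

In an application you would call it as ApiKeys.require(System.getenv(), "ANTHROPIC_API_KEY").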

2. Build a ChatClient

ChatClient is the fluent entry point. Configure shared defaults once via the builder:

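import org.springframework.ai.chat.client.ChatClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class AiConfig {
    // The starter auto-configures ChatClient.Builder; defaults set here apply to every call.
    @Bean
    ChatClient chatClient(ChatClient.Builder builder) {
        return builder
            .defaultSystem("You are a concise, accurate enterprise assistant.")
            .build();
    }
}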

3. Your first call

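import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
class AskController {
    private final ChatClient chat;
    AskController(ChatClient chat) { this.chat = chat; }

    // GET /ask?q=... sends q as the user message and returns the reply text.
    @GetMapping("/ask")
    String ask(@RequestParam String q) {
        return chat.prompt()
            .user(q)
            .call()      // blocking call: waits for the complete response
            .content();
    }
}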

.call().content() returns only the response text. For the full response, including token usage and other metadata, use .call().chatResponse().

4. Streaming

For chat UIs, stream tokens as they arrive:

import reactor.core.publisher.Flux;

// Emits chunks of the reply as the model produces them.
Flux<String> stream(String q) {
    return chat.prompt().user(q).stream().content();
}

Return the Flux from a WebFlux endpoint, or bridge it to Server-Sent Events (SSE), so users see output as it is generated rather than after the full response completes.
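
If you bridge to SSE yourself, it helps to know the wire format. The sketch below is plain Java with no Spring dependency; SseFrames is a hypothetical helper, and it assumes chunks use \n line endings. A WebFlux endpoint that produces text/event-stream does this framing for you, so treat this as an illustration of the format rather than code you need to ship.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper that formats streamed text chunks as SSE events.
final class SseFrames {
    private SseFrames() {}

    // One event per chunk: each payload line gets its own "data: " field,
    // and an empty line terminates the event.
    static String frame(String chunk) {
        StringBuilder sb = new StringBuilder();
        for (String line : chunk.split("\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        return sb.append('\n').toString();
    }

    // Concatenates the events for a sequence of chunks.
    static String frames(List<String> chunks) {
        return chunks.stream().map(SseFrames::frame).collect(Collectors.joining());
    }
}
```

The single space after "data:" matters: SSE clients strip exactly one leading space from the value, so a chunk that itself starts with a space (common for tokens such as " world") survives intact.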

5. Per-call options

Override model/temperature per request without touching config:

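import org.springframework.ai.anthropic.AnthropicChatOptions;

// AnthropicChatOptions ties this call to the Anthropic starter.
// Only the options set here override the configured defaults.
String answer = chat.prompt()
    .user(q)
    .options(AnthropicChatOptions.builder().temperature(0.0).build())
    .call()
    .content();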
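
Conceptually, per-call options are layered over the configured defaults: whatever the request sets wins, and everything else falls through. A plain-Java sketch of that layering follows. CallOptions is a hypothetical class written for this example, not a Spring AI type, and Spring AI's own merge logic may differ in detail.

```java
// Hypothetical options holder for illustration; not a Spring AI type.
// A null field means "not set at this level".
final class CallOptions {
    final String model;
    final Double temperature;

    CallOptions(String model, Double temperature) {
        this.model = model;
        this.temperature = temperature;
    }

    // Values set on this (per-call) instance win; nulls fall back to defaults.
    CallOptions overriding(CallOptions defaults) {
        return new CallOptions(
            model != null ? model : defaults.model,
            temperature != null ? temperature : defaults.temperature);
    }
}
```

With defaults of claude-sonnet-4-6 at temperature 0.4, a per-call CallOptions(null, 0.0) keeps the model and changes only the temperature.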

Best practices from day one

Set a clear default system prompt — role, tone, and the boundaries of what the assistant should do.
Externalize keys and model names to config/env; don't hardcode.
Pick temperature by task: ~0 for extraction/classification, higher only when you want variety.
Don't trust output blindly — you'll add structure and validation next.
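
As a small illustration of not trusting output blindly, the sketch below checks a classification reply against a fixed set of labels before the rest of the application sees it. It is plain Java; LabelValidator and the label names are hypothetical, and it is a stopgap for simple cases, not a replacement for structured output.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Hypothetical validator: accepts a model reply only if it is exactly
// one of the allowed labels, ignoring case and surrounding whitespace.
final class LabelValidator {
    private LabelValidator() {}

    static Optional<String> parse(String raw, Set<String> allowed) {
        if (raw == null) {
            return Optional.empty();
        }
        String label = raw.trim().toLowerCase(Locale.ROOT);
        return allowed.contains(label) ? Optional.of(label) : Optional.empty();
    }
}
```

An empty result is the signal to retry, fall back, or surface an error instead of passing free-form text downstream.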

Next: make outputs reliable and let the model call your code in Prompting, Structured Output & Tool Calling.
