    June 10, 2026

    Spring AI — Prompting, Structured Output & Tool Calling

    Get reliable, typed output from Spring AI and let the model call your Java services with @Tool — the request loop and production guardrails.

    A chat string is a demo. Production needs typed output and the ability for the model to call your code. Spring AI makes both first-class.

    Prompt templating

    Parameterise prompts instead of string-concatenating:

    chat.prompt()
        .user(u -> u.text("Summarise this ticket for a {role}: {body}")
                   .param("role", "support lead")
                   .param("body", ticket))
        .call()
        .content();

    Put stable instructions in the system message and the variable task in the user message. The split is easier to read, and providers that cache prompt prefixes can reuse the unchanged system text across calls.
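
    To make the placeholder idea concrete, here is a minimal plain-Java sketch of what filling {role} and {body} amounts to. It is not Spring AI's template engine; PromptTemplateSketch and render are hypothetical names, and failing fast on a missing parameter is a design choice of this sketch.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: a tiny {name} placeholder renderer, not Spring AI's template engine. */
final class PromptTemplateSketch {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    /** Replaces each {name} with its value and fails fast when a parameter is missing. */
    static String render(String template, Map<String, String> params) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = params.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("Missing prompt parameter: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in user data from being read as regex syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

    Keeping the template fixed and passing values separately means a typo in a parameter name surfaces as an error instead of a silently malformed prompt.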

    Structured output — stop parsing prose

    Map the reply straight onto a record. Spring AI instructs the model to conform and deserializes for you:

    record Triage(String summary, String severity, boolean escalate) {}
    
    Triage t = chat.prompt()
        .user("Triage: " + message)
        .call()
        .entity(Triage.class);

    This replaces brittle hand-written JSON parsing and regex cleanup. Use enums for fixed-value fields, keep temperature low, and validate business rules after deserialization: a reply that matches the schema can still be semantically wrong.
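
    The enum and post-deserialization checks might look like the sketch below. It assumes a variant of the Triage record with severity narrowed to an enum. Severity, TypedTriage, TriageRules and the "critical tickets must escalate" rule are all hypothetical, chosen only to illustrate the pattern.

```java
import java.util.ArrayList;
import java.util.List;

/** Fixed-value field modelled as an enum, so an unexpected value cannot slip through as free text. */
enum Severity { LOW, MEDIUM, HIGH, CRITICAL }

/** Same shape as the Triage record, with severity typed as an enum (an assumption of this sketch). */
record TypedTriage(String summary, Severity severity, boolean escalate) {}

final class TriageRules {
    /** Returns the violated business rules; an empty list means the result is usable. */
    static List<String> violations(TypedTriage t) {
        List<String> problems = new ArrayList<>();
        if (t.summary() == null || t.summary().isBlank()) {
            problems.add("summary must not be blank");
        }
        if (t.severity() == null) {
            problems.add("severity is required");
        } else if (t.severity() == Severity.CRITICAL && !t.escalate()) {
            // Hypothetical rule: the schema allows this combination, the business does not.
            problems.add("critical tickets must be escalated");
        }
        return problems;
    }
}
```

    A caller would run TriageRules.violations(...) on the value returned by entity(...) and retry or reject when the list is non-empty.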

    Tool calling — let the model use your services

    A tool is a Spring bean method annotated with @Tool. The model decides when to call it; Spring runs the whole request/response loop.

    @Component
    class OrderTools {
        private final OrderService orders;
        OrderTools(OrderService orders) { this.orders = orders; }
    
        @Tool(description = "Look up the current status and ETA of an order by its ID")
        OrderStatus orderStatus(
            @ToolParam(description = "Order ID, e.g. ORD-1234") String orderId) {
            return orders.status(orderId);
        }
    }

    Register the tools on the call (or as defaults):

    chat.prompt()
        .user("Where is order ORD-1234?")
        .tools(orderTools)
        .call()
        .content();

    The loop, under the hood

    flowchart TD
        A[User question + tool list] --> B{Model decides}
        B -->|needs data| C[Spring runs orderStatus]
        C --> D[Result returned to model]
        D --> B
        B -->|has enough| E[Final answer]

    The model never runs your code — it requests a call, Spring executes it, feeds the result back, and the model continues until it can answer.
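
    The cycle can be simulated in plain Java to show who does what. This is a simplified sketch, not Spring AI's internal implementation: FakeModel stands in for the LLM, tools are plain functions keyed by name, and the maxTurns cap is an assumption added here so the loop cannot run forever.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** What a model turn can produce: a request to call a tool, or a final answer. */
interface ModelTurn {}
record ToolRequest(String toolName, String argument) implements ModelTurn {}
record FinalAnswer(String text) implements ModelTurn {}

/** Stand-in for the LLM: sees the conversation so far and decides the next turn. */
interface FakeModel {
    ModelTurn next(List<String> conversation);
}

final class ToolLoopSketch {
    /** Repeats request, execute, feed back until the model answers or the turn budget runs out. */
    static String run(FakeModel model, Map<String, Function<String, String>> tools,
                      String question, int maxTurns) {
        List<String> conversation = new ArrayList<>(List.of("user: " + question));
        for (int turn = 0; turn < maxTurns; turn++) {
            ModelTurn next = model.next(conversation);
            if (next instanceof FinalAnswer answer) {
                return answer.text();
            }
            ToolRequest request = (ToolRequest) next;
            Function<String, String> tool = tools.get(request.toolName());
            // The framework, not the model, executes the code and records the result.
            String result = (tool == null) ? "error: unknown_tool" : tool.apply(request.argument());
            conversation.add("tool " + request.toolName() + ": " + result);
        }
        throw new IllegalStateException("No final answer after " + maxTurns + " turns");
    }
}
```

    The model only ever produces a ToolRequest value. Execution and the decision to stop after a bounded number of turns stay in application code.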

    Best practices

    • Descriptions are the routing signal. Be specific about when to use a tool and what it returns. Vague descriptions → wrong-tool selection.
    • Keep the toolset small per assistant — a handful, not dozens.
    • Validate tool arguments — the model chooses them; treat as untrusted. Guard money-moving actions and make writes idempotent.
    • Return structured errors ("not_found" vs "unavailable") so the model recovers gracefully.
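
    Argument validation and structured errors can be combined in one guarded lookup, sketched below without any Spring dependency. The ORD-digits format is inferred from the ORD-1234 example, the backend function is a hypothetical stand-in for OrderService, and the "invalid_argument" code is an addition alongside the "not_found" and "unavailable" codes named above.

```java
import java.util.function.Function;
import java.util.regex.Pattern;

/** Structured tool result: either a status or a machine-readable error code, never both. */
record ToolResult(String status, String error) {
    static ToolResult ok(String status) { return new ToolResult(status, null); }
    static ToolResult error(String code) { return new ToolResult(null, code); }
}

final class GuardedOrderLookup {
    // Assumed ID format, based on the "ORD-1234" example.
    private static final Pattern ORDER_ID = Pattern.compile("ORD-\\d{1,10}");

    // Hypothetical stand-in for OrderService: returns a status, or null for an unknown order.
    private final Function<String, String> backend;

    GuardedOrderLookup(Function<String, String> backend) { this.backend = backend; }

    ToolResult orderStatus(String orderId) {
        // The model chose this argument, so treat it like any other untrusted input.
        if (orderId == null || !ORDER_ID.matcher(orderId).matches()) {
            return ToolResult.error("invalid_argument");
        }
        try {
            String status = backend.apply(orderId);
            return status == null ? ToolResult.error("not_found") : ToolResult.ok(status);
        } catch (RuntimeException e) {
            // A backend failure differs from a missing order; the model can suggest retrying later.
            return ToolResult.error("unavailable");
        }
    }
}
```

    Distinct error codes give the model something to act on: ask the user to recheck the ID for not_found, or suggest retrying later for unavailable.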

    Anti-patterns

    • Asking for JSON in prose and parsing by hand → use entity().
    • One mega-tool with 15 parameters → split into focused tools.
    • Letting the model "decide" security → enforce authorization in code, not the prompt.
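
    Enforcing authorization in code can be as small as the check below. It is a sketch under stated assumptions: Caller, RefundTool, the SUPPORT_AGENT role and the in-memory owner map are hypothetical, and the caller identity is assumed to come from the application (for example the authenticated session), never from model output.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical caller identity, resolved by the application rather than by the model. */
record Caller(String userId, Set<String> roles) {}

final class RefundTool {
    // orderId -> owning userId; an in-memory stand-in for a real order store.
    private final Map<String, String> orderOwners;

    RefundTool(Map<String, String> orderOwners) { this.orderOwners = orderOwners; }

    /** Returns a result code; the check runs no matter what the prompt or the model says. */
    String refund(Caller caller, String orderId) {
        String owner = orderOwners.get(orderId);
        if (owner == null) {
            return "not_found";
        }
        boolean allowed = owner.equals(caller.userId())
                || caller.roles().contains("SUPPORT_AGENT");
        return allowed ? "refund_requested" : "forbidden";
    }
}
```

    Even if a prompt injection convinces the model to request a refund for someone else's order, the tool answers "forbidden" because the decision never depended on the prompt.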

    Next: ground answers in your own data with RAG & Vector Stores.
