- Spring Framework: The core Spring Framework provides comprehensive infrastructure support for developing Java applications. It focuses on providing a wide range of functionalities, such as dependency injection, aspect-oriented programming, transaction management, and more. It is modular, meaning you can use only the parts you need for your application.
- Spring Boot: Spring Boot is built on top of the Spring Framework and is designed to simplify the process of creating stand-alone, production-grade Spring applications. It aims to minimize configuration and setup time by offering default configurations and embedded servers.
- Spring Framework: Requires extensive configuration, usually involving XML or Java-based configuration. Developers need to manually define beans and configure application settings.
- Spring Boot: Reduces the need for manual configuration through auto-configuration and convention over configuration. It uses sensible defaults and annotations to automatically configure the application based on the dependencies present in the classpath.
- Spring Framework: Setting up a Spring application means writing a fair amount of boilerplate code and configuration files. You need to manually set up the application context and configure dependencies.
- Spring Boot: Simplifies the setup process by providing starter dependencies (starter POMs) and a simplified project structure. It also includes embedded servers, so you can run your application as a stand-alone Java application.
- Spring Framework: Typically requires an external application server (like Tomcat, Jetty, or JBoss) to run the application. Developers need to package and deploy their application to the server.
- Spring Boot: Comes with embedded servers (Tomcat, Jetty, or Undertow), allowing you to run your application directly from the command line without needing to deploy it to an external server. This makes development, testing, and deployment easier and faster.
- Spring Framework: Does not include built-in production-ready features. Developers need to add and configure additional tools and libraries for monitoring, health checks, and metrics.
- Spring Boot: Provides production-ready features such as health checks, metrics, and application monitoring through Spring Boot Actuator, along with sensible logging defaults. Once the Actuator starter is on the classpath, these features require minimal configuration.
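The auto-configuration idea in the list above (configuring the application based on what is present on the classpath) can be illustrated with a small plain-Java sketch. This is not Spring Boot's implementation, which expresses such conditions with annotations like @ConditionalOnClass; the names `ClasspathDefaults`, `isPresent`, and `firstPresent` are hypothetical and exist only for this example.

```java
import java.util.List;
import java.util.Optional;

/** Conceptual sketch only: picks a default based on which classes are loadable. */
final class ClasspathDefaults {

    private ClasspathDefaults() {}

    /** Returns true if the named class can be loaded from the current classpath. */
    static boolean isPresent(String className) {
        try {
            // initialize=false: we only check that the class exists, without running its static init
            Class.forName(className, false, ClasspathDefaults.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the first candidate class name that is present, if any. */
    static Optional<String> firstPresent(List<String> candidates) {
        return candidates.stream().filter(ClasspathDefaults::isPresent).findFirst();
    }
}
```

A caller would pass an ordered list of marker class names (one per supported library) and configure whichever comes back first, falling back to doing nothing when the Optional is empty.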
In summary, while the core Spring Framework provides the foundational tools and infrastructure for building applications, Spring Boot streamlines the process, offering default configurations and embedded servers to create stand-alone, production-ready applications quickly and easily.
Spring AI Tutorial: Add AI to Your Spring Boot App (2026)
2026 · 6 min read
If you've been putting off "learning AI" because it feels like a different world — Python, notebooks, a dozen new frameworks — here's the good news: you don't have to leave Spring Boot to do it.
Spring AI gives you a client that behaves exactly like the ones you already use every day. If you know RestClient, WebClient, or JdbcClient, you already know how ChatClient works. Same fluent builder, same mental model. The only new thing is what sits on the other end of the call — an LLM instead of a REST API or a database.
We'll go from an empty project to a working AI endpoint, then to typed Java objects coming back from the model, and finish with the one production gotcha that trips up most people. All code runs on Java 21+ against Spring AI 1.1.x (the current GA line).
What is Spring AI, really?
Spring AI is a portability layer. It sits above the individual model providers (OpenAI, Anthropic, Ollama, and others) and gives you one consistent API on top of all of them. The same idea Spring has always followed: you code against an abstraction, and the provider becomes a configuration detail.
In practice that means three things you'll care about:
- ChatClient: the fluent API for talking to a model. This is the RestClient of LLMs.
- Advisors: interceptors that wrap your calls to add behavior like memory or retrieval (we'll cover these in the RAG post).
- Auto-configuration: add a starter, set a key, inject the client. Standard Spring Boot.
You don't need to understand transformers or embeddings to ship your first feature. You need to know how to inject a bean.
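To make "interceptors that wrap your calls" concrete, here is a minimal plain-Java sketch of the wrapping idea. It does not use Spring AI's real advisor interfaces; `SketchModel`, `SketchAdvisor`, and `AdvisorSketch` are hypothetical names, and the "model" is whatever function you pass in.

```java
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for "something that answers a prompt". */
interface SketchModel {
    String call(String prompt);
}

/** Hypothetical interceptor: takes a model and returns a wrapped model. */
interface SketchAdvisor extends UnaryOperator<SketchModel> {}

final class AdvisorSketch {

    private AdvisorSketch() {}

    /** Records every prompt it sees, then delegates to the next model in the chain. */
    static SketchAdvisor logging(List<String> log) {
        return next -> prompt -> {
            log.add(prompt);
            return next.call(prompt);
        };
    }

    /** Prepends a fixed instruction to every prompt before delegating. */
    static SketchAdvisor withInstruction(String instruction) {
        return next -> prompt -> next.call(instruction + "\n" + prompt);
    }

    /** Wraps the model so that the first advisor in the list is the outermost one. */
    static SketchModel wrap(SketchModel model, List<SketchAdvisor> advisors) {
        SketchModel wrapped = model;
        for (int i = advisors.size() - 1; i >= 0; i--) {
            wrapped = advisors.get(i).apply(wrapped);
        }
        return wrapped;
    }
}
```

The calling code only ever sees a `SketchModel`, so behavior can be layered on without changing the call site. That is the property the advisor concept relies on.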
Step 1: Project setup
Head to start.spring.io and create a normal Spring Boot project. Then add the Spring AI BOM and the OpenAI starter. We'll use OpenAI here because it's the model most readers already have a key for. Remember, though, that the starter dependency and its matching config block in Step 2 are the only provider-specific things in this entire tutorial.
Maven:
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>1.1.5</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-starter-model-openai</artifactId>
    </dependency>
</dependencies>
Gradle:
dependencies {
    implementation platform("org.springframework.ai:spring-ai-bom:1.1.5")
    implementation "org.springframework.ai:spring-ai-starter-model-openai"
}
One thing worth knowing so you don't waste an afternoon: the stable GA releases live in Maven Central, so you don't need to add any Spring snapshot repository. You only need that if you go chasing 2.0 milestones — which you shouldn't for a real project. Pin to 1.1.x and move on.
Step 2: Configure your key
Add this to application.yml:
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: gpt-4o
          temperature: 0.7
Never hardcode the key. The ${OPENAI_API_KEY} placeholder reads it from an environment variable, which is exactly how you'd handle any other secret in Spring Boot. That's the point — none of your existing instincts go out the window here.
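If the placeholder syntax is new to you, the following plain-Java sketch shows roughly what resolving it involves: look the name up, fall back to a default if one was given with the `${NAME:default}` form, and fail fast otherwise. Spring performs its own, far more capable resolution at startup; `PlaceholderSketch` and `resolve` are hypothetical names used only here.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Conceptual sketch: resolves ${NAME} and ${NAME:default} against a map of variables. */
final class PlaceholderSketch {

    // group 1 = variable name, group 2 = optional default after ':'
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([^}:]+)(?::([^}]*))?\\}");

    private PlaceholderSketch() {}

    static String resolve(String value, Map<String, String> env) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(1);
            String resolved = env.containsKey(name) ? env.get(name) : m.group(2);
            if (resolved == null) {
                // Fail fast rather than sending an empty key to the provider
                throw new IllegalStateException("Missing required variable: " + name);
            }
            // quoteReplacement keeps '$' and '\' in the value from being treated specially
            m.appendReplacement(out, Matcher.quoteReplacement(resolved));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Calling `PlaceholderSketch.resolve("${OPENAI_API_KEY}", System.getenv())` mirrors what the config above asks for: the value comes from the environment, and a missing variable is an error instead of a silent empty string.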
Step 3: Your first call (three lines that matter)
Spring Boot auto-configures a ChatClient.Builder for you. Inject it, build a client, and make a call. Here's a complete REST controller — if you've written a @RestController before (and if you haven't, start here with the core annotations), nothing on this screen is new except the chatClient line:
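import org.springframework.ai.chat.client.ChatClient;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ChatController {

    private final ChatClient chatClient;

    // Spring Boot auto-configures the builder; build the client once and reuse it
    public ChatController(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    @GetMapping("/ask")
    public String ask(@RequestParam String question) {
        return chatClient.prompt()
                .user(question)   // set the user message
                .call()           // blocking call to the model
                .content();       // response text as a String
    }
}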
Run it, hit /ask?question=Explain a database index in one line, and you have a working AI endpoint. Read that fluent chain out loud: build a prompt, set the user message, call the model, get the content. It's the same shape as restClient.get().uri(...).retrieve().body(...). You already knew this.
Step 4: The part that should make you smile — provider portability
Here's the line that sells Spring AI to anyone who's been burned by vendor lock-in. Say you want to move off OpenAI — maybe to Anthropic's Claude, maybe to a local Ollama model so nothing leaves your network. What changes in the controller above?
Nothing.
You swap the starter dependency and one config block. For a local Ollama model:
Gradle:
implementation "org.springframework.ai:spring-ai-starter-model-ollama"

application.yml:
spring:
  ai:
    ollama:
      chat:
        options:
          model: llama3.2
The ChatController doesn't get touched. Your service code is written against the Spring AI abstraction, not against OpenAI. This is the same payoff you get from coding against JpaRepository instead of raw JDBC — the implementation underneath becomes a decision you can change later, cheaply. For anyone shipping AI features into production, that optionality is worth a lot.
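The design principle behind that claim can be shown without any framework. In the sketch below, the application class depends only on an interface, and a configuration value picks the implementation. All names (`ChatProvider`, `AskService`, `ProviderRegistry`) are hypothetical, and the providers are stubs that make no network calls; Spring AI does the equivalent wiring through starters and auto-configuration.

```java
import java.util.Map;

/** Hypothetical abstraction the application codes against. */
interface ChatProvider {
    String complete(String prompt);
}

/** Application code: knows only the interface, never a concrete provider. */
final class AskService {

    private final ChatProvider provider;

    AskService(ChatProvider provider) {
        this.provider = provider;
    }

    String ask(String question) {
        return provider.complete(question);
    }
}

final class ProviderRegistry {

    // Stub providers standing in for real integrations; nothing here talks to a model.
    private static final Map<String, ChatProvider> PROVIDERS = Map.of(
            "openai", prompt -> "[openai stub] " + prompt,
            "ollama", prompt -> "[ollama stub] " + prompt);

    private ProviderRegistry() {}

    /** Selects a provider by a configuration value, failing on unknown names. */
    static ChatProvider fromConfig(String name) {
        ChatProvider provider = PROVIDERS.get(name);
        if (provider == null) {
            throw new IllegalArgumentException("Unknown provider: " + name);
        }
        return provider;
    }
}
```

Switching from `fromConfig("openai")` to `fromConfig("ollama")` changes the answer's source while `AskService` stays byte-for-byte identical, which is the same trade the tutorial makes with the starter and config block.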
Step 5: Get typed Java objects back, not strings
A raw string from an LLM is hard to work with in a real backend. You want a Java object you can validate, persist, and pass around. Spring AI does this with .entity() — you hand it a class, it handles the prompting and parsing for you.
Since we're Hungry Coders, let's make the model generate interview questions as structured data:
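import java.util.List;
import org.springframework.core.ParameterizedTypeReference;

record InterviewQuestion(String question, String topic, String difficulty) {}

// Add this method to the ChatController from Step 3
@GetMapping("/questions")
public List<InterviewQuestion> questions(@RequestParam String topic) {
    return chatClient.prompt()
            .user("Generate 3 backend interview questions about " + topic)
            .call()
            // The type reference preserves the generic List<InterviewQuestion> type
            .entity(new ParameterizedTypeReference<List<InterviewQuestion>>() {});
}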
Call /questions?topic=concurrency and you get back a clean List<InterviewQuestion> — typed, ready to serialize, ready to drop into a database. No manual JSON parsing, no fragile string splitting. This is the moment most backend engineers realize AI actually fits their world: it speaks your type system.
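Typed does not mean trusted: the model can still return an empty field or a value you did not expect. Since the point of typed objects is that you can validate them, here is a minimal sketch of a check you might run before persisting. The allowed difficulty values are an assumption made for this example (the tutorial does not constrain them), and `QuestionValidator` is a hypothetical helper.

```java
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Same shape as the record in the tutorial, repeated so the sketch is self-contained. */
record InterviewQuestion(String question, String topic, String difficulty) {}

final class QuestionValidator {

    // Assumed vocabulary for this sketch; adjust to whatever your prompt asks for.
    private static final Set<String> DIFFICULTIES = Set.of("easy", "medium", "hard");

    private QuestionValidator() {}

    /** A question is valid if no field is blank and the difficulty is a known value. */
    static boolean isValid(InterviewQuestion q) {
        return q != null
                && q.question() != null && !q.question().isBlank()
                && q.topic() != null && !q.topic().isBlank()
                && q.difficulty() != null
                && DIFFICULTIES.contains(q.difficulty().toLowerCase(Locale.ROOT));
    }

    /** Returns only the valid questions, preserving their order. */
    static List<InterviewQuestion> keepValid(List<InterviewQuestion> questions) {
        return questions.stream().filter(QuestionValidator::isValid).toList();
    }
}
```

In the endpoint above, you could pass the result of `.entity(...)` through `QuestionValidator.keepValid(...)` before returning or saving it.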
The one production gotcha: conversation memory needs an ID
Everything above is stateless — each call is independent, the model remembers nothing. The moment you want a real conversation (a chatbot that remembers the last message), you reach for chat memory advisors.
Here's the trap people hit after copying an old tutorial: as of the current Spring AI line, chat memory advisors require an explicit conversation ID. The older auto-magical behavior was deprecated. If you wire up memory without supplying a conversation ID per user, you'll either get an error or — worse — everyone's chat history bleeding into one shared context.
The fix is simple once you know it: pass a conversation ID (a user ID, a session ID — whatever scopes the conversation) on each call. The mechanics of advisors and memory are a post of their own, but file this away now so it doesn't bite you later. Most "my chatbot is mixing up users" bugs trace back to exactly this.
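To see why the ID matters, consider this plain-Java sketch of a history store keyed by conversation ID. It is not Spring AI's chat memory implementation; `ConversationStore` is a hypothetical class that only demonstrates the scoping rule: every read and write names its conversation, and a missing ID is rejected instead of falling into a shared bucket.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

/** Conceptual sketch of chat history scoped by conversation ID. */
final class ConversationStore {

    private final Map<String, List<String>> histories = new ConcurrentHashMap<>();

    /** Appends a message to exactly one conversation. */
    void add(String conversationId, String message) {
        requireId(conversationId);
        histories.computeIfAbsent(conversationId, id -> new CopyOnWriteArrayList<>())
                .add(message);
    }

    /** Returns an immutable snapshot of one conversation's messages. */
    List<String> history(String conversationId) {
        requireId(conversationId);
        return List.copyOf(histories.getOrDefault(conversationId, List.of()));
    }

    // Rejecting a missing ID is what prevents users from sharing one context by accident.
    private static void requireId(String conversationId) {
        if (conversationId == null || conversationId.isBlank()) {
            throw new IllegalArgumentException("conversationId is required");
        }
    }
}
```

With two users writing to `"user-1"` and `"user-2"`, each `history(...)` call returns only that user's messages. Drop the ID check and key everything under one default value, and you get exactly the mixed-up chatbot described above.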
Where this fits in your roadmap
If you're mapping out where AI sits in your backend skill set, it's not a separate track — it's a layer on top of the Spring Boot you already know. We laid out the full picture in the 2026 Backend Engineer Roadmap, and AI integration is now a standard stage in it, not an optional bonus.
This post is your foundation. From here, the natural next steps — each one building on this same ChatClient — are:
- RAG (Retrieval-Augmented Generation): make the model answer using your documents and data. This is where caching strategies you already know start mattering again, this time for embeddings. Check it out here: RAG Blog.
- Tools & MCP: let the model call your code — query a database, hit an API — using the Model Context Protocol that landed in Spring AI 1.1.
- Guardrails & LLM security: what to lock down before any of this goes to production.
We'll cover each of these in upcoming posts. Bookmark this one — it's the piece they'll all link back to.
Go Deeper
Want the complete, guided path?
RAG pipelines, tool calling, multi-agent systems, and LLM security — built specifically for Java & Spring Boot engineers, not Python tutorials rewritten in Java. Production-experience-driven. No hype, just depth.
