Trying Out Spring AI

Introduction
Hello. I am Yamada, and I develop and operate in-house tools in the Platform Engineering Team of KINTO Technologies' (KTC) Platform Group. As an application engineer, I primarily develop a CMDB (Configuration Management Database), working on both the backend using Java and Spring Boot, and the frontend with JavaScript and React. Over the past year or so, I’ve also been riding the wave of generative AI, working on a chatbot for our CMDB that uses technologies like RAG and Text-to-SQL.
I previously wrote an article about Text-to-SQL, so if you’re interested, feel free to check it out!
This time, I’d like to share my experience exploring Spring AI, a topic related to generative AI. Spring AI reached General Availability (GA) on May 20, 2025, and I had the opportunity to try it out firsthand.
Overview of Spring AI
Spring AI is a generative AI framework that reached GA on May 20, 2025, as part of the Spring ecosystem. In the generative AI space, Python-based frameworks like LlamaIndex and LangChain are widely known. But if you’re looking to add generative AI capabilities to an existing Spring application, or want to experiment with generative AI in Java, Spring AI could be a great option.
What I Tried
This time, using Spring AI, I implemented the following two functions:
- Chat Function (LLM interaction): a conversational function powered by AWS Bedrock, using the Claude 3.7 Sonnet model.
- Embedding Function: a document embedding and similarity search function that uses Cohere's embedding model together with Chroma, a vector database.
Technology Stack
- Spring Boot 3.5.0
- Java 21
- Spring AI 1.0.0
- Gradle
- Chroma 1.0.0
- AWS Bedrock
  - LLM model: Anthropic Claude 3.7 Sonnet
  - Embedding model: Cohere Embed Multilingual v3
Environment Setup
Setting Up Dependencies
First, add the necessary dependencies for using Spring AI to your build.gradle file.
plugins {
    id 'java'
    id 'org.springframework.boot' version '3.5.0'
    id 'io.spring.dependency-management' version '1.1.7'
}

java {
    toolchain {
        languageVersion = JavaLanguageVersion.of(21)
    }
}

ext {
    set('springAiVersion', "1.0.0")
}

dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-web'
    implementation 'org.springframework.boot:spring-boot-starter-validation'
    implementation 'org.springframework.ai:spring-ai-starter-model-bedrock-converse'
    implementation 'org.springframework.ai:spring-ai-starter-model-bedrock'
    implementation 'org.springframework.ai:spring-ai-bedrock'
    implementation 'org.springframework.ai:spring-ai-starter-vector-store-chroma'
    testImplementation 'org.springframework.boot:spring-boot-starter-test'
}

dependencyManagement {
    imports {
        mavenBom "org.springframework.ai:spring-ai-bom:${springAiVersion}"
    }
}
Application Configuration
In application.yml, configure your AWS Bedrock credentials, model selection, and vector store connection settings.
spring:
  application:
    name: spring-ai-sample
  ai:
    bedrock:
      aws:
        region: ap-northeast-1
        access-key: ${AWS_ACCESS_KEY_ID}
        secret-key: ${AWS_SECRET_ACCESS_KEY}
        session-token: ${AWS_SESSION_TOKEN}
      converse:
        chat:
          options:
            model: arn:aws:bedrock:ap-northeast-1:{account_id}:inference-profile/apac.anthropic.claude-3-7-sonnet-20250219-v1:0
      cohere:
        embedding:
          model: cohere.embed-multilingual-v3
    model:
      embedding: bedrock-cohere
    vectorstore:
      chroma:
        client:
          host: http://localhost
          port: 8000
        initialize-schema: true
That's all for the configuration. From here, you can easily inject the necessary classes into your application using dependency injection (DI).
Implementation Example
1. Conversational Function with a Generative AI (LLM)
With Spring AI, you can easily implement conversations with an LLM using the ChatClient interface.
Service Layer Implementation
Implement the conversational function with an LLM in a class called ChatService.
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;

@Service
public class ChatService {

    private final ChatClient chatClient;

    public ChatService(ChatClient.Builder builder) {
        // The auto-configured builder is already wired to the Bedrock Converse model from application.yml
        this.chatClient = builder.build();
    }

    public String generate(String message) {
        // Send the user message as a prompt and return the model's text response
        return this.chatClient.prompt(message).call().content();
    }
}
By injecting the ChatClient.Builder, you can automatically build a client based on the configuration file.
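If you want the same behavior applied to every request, the builder can also be customized before calling build(). Here is a minimal sketch of an alternative constructor that sets a default system prompt; the prompt text is just an illustration:

public ChatService(ChatClient.Builder builder) {
    // defaultSystem() attaches a system prompt to every request made through this client
    this.chatClient = builder
            .defaultSystem("You are a concise assistant for in-house tools.")
            .build();
}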
Controller Implementation
Create a REST controller to provide the API endpoint.
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ChatController {

    private final ChatService chatService;
    private final EmbeddingService embeddingService;

    public ChatController(ChatService chatService, EmbeddingService embeddingService) {
        this.chatService = chatService;
        this.embeddingService = embeddingService;
    }

    // ChatRequest is a simple request DTO with a single "message" field (definition omitted)
    @PostMapping("/generate")
    public ResponseEntity<String> generate(@RequestBody ChatRequest request) {
        return ResponseEntity.ok(chatService.generate(request.getMessage()));
    }
}
Once this is set up, sending a POST request to the /generate endpoint returns a response generated by Claude 3.7 Sonnet.
Here’s an example (the request sends a Japanese greeting, and the model replies in Japanese with "Hello! Is there anything I can help you with?"):
curl -X POST http://localhost:8080/generate -H "Content-Type: application/json" -d '{"message": "こんにちわわ"}'
こんにちは! 何かお手伝いできることはありますか?
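Incidentally, ChatClient can also stream the response instead of returning it in one piece, by using stream() in place of call(). A minimal sketch of an alternative service method, assuming Project Reactor's Flux (which Spring AI pulls in transitively) is imported:

// An alternative to generate() that streams the reply as it is produced
public Flux<String> generateStream(String message) {
    return this.chatClient.prompt(message).stream().content();
}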
2. Implementing the Embedding Search Function
Next, implement a function that vectorizes documents and enables semantic search.
Embedding Service Implementation
In this service, vectorize sample texts as Document objects and store them in Chroma. Then, use "Spring" as the query to search for similar documents. Behind the scenes, the Cohere Embed Multilingual v3 model converts the text into vectors.
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

@Service
public class EmbeddingService {

    private final VectorStore vectorStore;

    public EmbeddingService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    public List<Document> embed() {
        List<Document> documents = List.of(
                new Document("Spring AI is a framework for building AI applications with the familiar Spring ecosystem and programming model."),
                new Document("LlamaIndex is a data framework for LLM applications to ingest, structure, and access private or domain-specific data."),
                new Document("LangChain is a framework for developing applications powered by language models through composability.")
        );

        // Embed the documents with the Cohere model and store the vectors in Chroma
        vectorStore.add(documents);

        // Return the documents most similar to the query "Spring"
        return vectorStore.similaritySearch(SearchRequest.builder().query("Spring").topK(5).build());
    }
}
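Documents can also carry metadata, and SearchRequest accepts a similarity threshold to drop weak matches. Here is a minimal sketch of another method for the same service; the document text, metadata key, and threshold value are illustrative, and java.util.Map is needed in addition to the imports above:

// Stores a document with metadata, then searches and discards results below a similarity threshold
public List<Document> embedWithMetadata() {
    Document doc = new Document(
            "Spring AI provides a VectorStore abstraction over vector databases such as Chroma.",
            Map.of("source", "notes"));
    vectorStore.add(List.of(doc));

    return vectorStore.similaritySearch(SearchRequest.builder()
            .query("vector database")
            .topK(3)
            .similarityThreshold(0.3)
            .build());
}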
Implementing the API Endpoint
@GetMapping("/embedding")
public ResponseEntity<List<Document>> embedding() {
    return ResponseEntity.ok(embeddingService.embed());
}
This implementation defines the /embedding endpoint, which vectorizes the sample documents, performs a similarity search, and returns the results.
Here’s an example. As expected, the document containing the word "Spring" has the highest similarity score (in this output, the score is simply 1 minus the distance reported by Chroma).
curl http://localhost:8080/embedding
[
  {
    "id": "af885f07-20c9-4db4-913b-95406f1cb0cb",
    "text": "Spring AI is a framework for building AI applications with the familiar Spring ecosystem and programming model.",
    "media": null,
    "metadata": {
      "distance": 0.5593532
    },
    "score": 0.44064682722091675
  },
  {
    "id": "5a8b8071-b8d6-491e-b542-611d33e16159",
    "text": "LlamaIndex is a data framework for LLM applications to ingest, structure, and access private or domain-specific data.",
    "media": null,
    "metadata": {
      "distance": 0.6968217
    },
    "score": 0.3031783103942871
  },
  {
    "id": "336b3e07-1a70-4546-920d-c4869e77e4bb",
    "text": "LangChain is a framework for developing applications powered by language models through composability.",
    "media": null,
    "metadata": {
      "distance": 0.71094555
    },
    "score": 0.2890544533729553
  }
]
Conclusion
This was a brief introduction to implementing generative AI functions using Spring AI. After trying it out, Spring AI seems like a good option when integrating generative AI capabilities into Java-based systems. Also, I feel that combining the chat and embedding functions introduced here makes it relatively easy to build RAG functionality.
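As a rough illustration of that last point, here is a minimal sketch of how the ChatClient and VectorStore beans shown above might be combined into a simple RAG-style service. The RagService class, prompt wording, and topK value are my own assumptions rather than part of the sample above:

import java.util.List;
import java.util.stream.Collectors;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

@Service
public class RagService {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    public RagService(ChatClient.Builder builder, VectorStore vectorStore) {
        this.chatClient = builder.build();
        this.vectorStore = vectorStore;
    }

    public String answer(String question) {
        // 1. Retrieve the documents most similar to the question from Chroma
        List<Document> docs = vectorStore.similaritySearch(
                SearchRequest.builder().query(question).topK(3).build());

        // 2. Concatenate the retrieved text into a context block
        String context = docs.stream()
                .map(Document::getText)
                .collect(Collectors.joining("\n"));

        // 3. Ask the LLM to answer using only the retrieved context
        return chatClient.prompt()
                .system("Answer the question using only the following context:\n" + context)
                .user(question)
                .call()
                .content();
    }
}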
At this point, Python-based frameworks like LlamaIndex and LangChain still provide more advanced features and a richer ecosystem for generative AI. However, since Spring AI has only recently been released, there's a lot of potential for growth, and I’m excited to see how it develops.