Trying Out Spring AI

Introduction
Hello. I am Yamada, and I develop and operate in-house tools in the Platform Engineering Team of KINTO Technologies' (KTC) Platform Group. As an application engineer, I primarily develop a CMDB (Configuration Management Database), working on both the backend using Java and Spring Boot, and the frontend with JavaScript and React. Over the past year or so, I’ve also been riding the wave of generative AI, working on a chatbot for our CMDB that uses technologies like RAG and Text-to-SQL.
I previously wrote an article about Text-to-SQL, so if you’re interested, feel free to check it out!
This time, I’d like to share my experience exploring Spring AI, a topic related to generative AI. Spring AI reached General Availability (GA) on May 20, 2025, and I had the opportunity to try it out firsthand.
Overview of Spring AI
Spring AI is a generative AI framework that reached GA on May 20, 2025, as part of the Spring ecosystem. In the generative AI space, Python-based frameworks like LlamaIndex and LangChain are widely known. But if you’re looking to add generative AI capabilities to an existing Spring application, or want to experiment with generative AI in Java, Spring AI could be a great option.
What I Tried
This time, using Spring AI, I implemented the following two functions:
- Chat Function (LLM interaction): a conversational function powered by AWS Bedrock, using the Claude 3.7 Sonnet model.
- Embedding Function: a document embedding and similarity search function that uses Cohere's embedding model together with Chroma, a vector database.
Technology Stack
- Spring Boot 3.5.0
- Java 21
- Spring AI 1.0.0
- Gradle
- Chroma 1.0.0
- AWS Bedrock
  - LLM model: Anthropic Claude 3.7 Sonnet
  - Embedding model: Cohere Embed Multilingual v3
Environment Setup
Setting Up Dependencies
First, add the necessary dependencies for using Spring AI to your build.gradle file.
plugins {
    id 'java'
    id 'org.springframework.boot' version '3.5.0'
    id 'io.spring.dependency-management' version '1.1.7'
}

java {
    toolchain {
        languageVersion = JavaLanguageVersion.of(21)
    }
}

ext {
    set('springAiVersion', "1.0.0")
}

dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-web'
    implementation 'org.springframework.boot:spring-boot-starter-validation'
    implementation 'org.springframework.ai:spring-ai-starter-model-bedrock-converse'
    implementation 'org.springframework.ai:spring-ai-starter-model-bedrock'
    implementation 'org.springframework.ai:spring-ai-bedrock'
    implementation 'org.springframework.ai:spring-ai-starter-vector-store-chroma'
    testImplementation 'org.springframework.boot:spring-boot-starter-test'
}

dependencyManagement {
    imports {
        mavenBom "org.springframework.ai:spring-ai-bom:${springAiVersion}"
    }
}
Application Configuration
In application.yml, configure your AWS Bedrock credentials, model selection, and vector store connection settings.
spring:
  application:
    name: spring-ai-sample
  ai:
    bedrock:
      aws:
        region: ap-northeast-1
        access-key: ${AWS_ACCESS_KEY_ID}
        secret-key: ${AWS_SECRET_ACCESS_KEY}
        session-token: ${AWS_SESSION_TOKEN}
      converse:
        chat:
          options:
            model: arn:aws:bedrock:ap-northeast-1:{account_id}:inference-profile/apac.anthropic.claude-3-7-sonnet-20250219-v1:0
      cohere:
        embedding:
          model: cohere.embed-multilingual-v3
    model:
      embedding: bedrock-cohere
    vectorstore:
      chroma:
        client:
          host: http://localhost
          port: 8000
        initialize-schema: true
That's all for the configuration. From here, you can easily inject the necessary classes into your application using dependency injection (DI).
Implementation Example
1. Conversational Function with a Generative AI (LLM)
With Spring AI, you can easily implement conversations with an LLM using the ChatClient interface.
Service Layer Implementation
Implement the conversational function with an LLM in a class called ChatService.
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;

@Service
public class ChatService {

    private final ChatClient chatClient;

    public ChatService(ChatClient.Builder builder) {
        // The auto-configured builder is already wired to the Bedrock Converse model from application.yml
        this.chatClient = builder.build();
    }

    public String generate(String message) {
        // Send the user message as a prompt and return the model's text response
        return this.chatClient.prompt(message).call().content();
    }
}
By injecting the ChatClient.Builder, you can automatically build a client based on the configuration file.
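If you want the same behavior applied to every request, the builder can also be customized before calling build(). Here is a minimal sketch of an alternative constructor that sets a default system prompt; the prompt text is just an illustration:

public ChatService(ChatClient.Builder builder) {
    // defaultSystem() attaches a system prompt to every request made through this client
    this.chatClient = builder
            .defaultSystem("You are a concise assistant for in-house tools.")
            .build();
}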
Controller Implementation
Create a REST controller to provide the API endpoint.
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class ChatController {

    private final ChatService chatService;
    private final EmbeddingService embeddingService;

    public ChatController(ChatService chatService, EmbeddingService embeddingService) {
        this.chatService = chatService;
        this.embeddingService = embeddingService;
    }

    // ChatRequest is a simple request DTO with a single "message" field (definition omitted)
    @PostMapping("/generate")
    public ResponseEntity<String> generate(@RequestBody ChatRequest request) {
        return ResponseEntity.ok(chatService.generate(request.getMessage()));
    }
}
Once this is set up, sending a POST request to the /generate endpoint returns a response generated by Claude 3.7 Sonnet.
Here’s an example (the request sends a Japanese greeting, and the model replies in Japanese with "Hello! Is there anything I can help you with?"):
curl -X POST http://localhost:8080/generate -H "Content-Type: application/json" -d '{"message": "こんにちわわ"}'
こんにちは! 何かお手伝いできることはありますか?
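Incidentally, ChatClient can also stream the response instead of returning it in one piece, by using stream() in place of call(). A minimal sketch of an alternative service method, assuming Project Reactor's Flux (which Spring AI pulls in transitively) is imported:

// An alternative to generate() that streams the reply as it is produced
public Flux<String> generateStream(String message) {
    return this.chatClient.prompt(message).stream().content();
}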
2. Implementing the Embedding Search Function
Next, implement a function that vectorizes documents and enables semantic search.
Embedding Service Implementation
In this service, vectorize sample texts as Document objects and store them in Chroma. Then, use "Spring" as the query to search for similar documents. Behind the scenes, the Cohere Embed Multilingual v3 model converts the text into vectors.
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

@Service
public class EmbeddingService {

    private final VectorStore vectorStore;

    public EmbeddingService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    public List<Document> embed() {
        List<Document> documents = List.of(
                new Document("Spring AI is a framework for building AI applications with the familiar Spring ecosystem and programming model."),
                new Document("LlamaIndex is a data framework for LLM applications to ingest, structure, and access private or domain-specific data."),
                new Document("LangChain is a framework for developing applications powered by language models through composability.")
        );

        // Embed the documents with the Cohere model and store the vectors in Chroma
        vectorStore.add(documents);

        // Return the documents most similar to the query "Spring"
        return vectorStore.similaritySearch(SearchRequest.builder().query("Spring").topK(5).build());
    }
}
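Documents can also carry metadata, and SearchRequest accepts a similarity threshold to drop weak matches. Here is a minimal sketch of another method for the same service; the document text, metadata key, and threshold value are illustrative, and java.util.Map is needed in addition to the imports above:

// Stores a document with metadata, then searches and discards results below a similarity threshold
public List<Document> embedWithMetadata() {
    Document doc = new Document(
            "Spring AI provides a VectorStore abstraction over vector databases such as Chroma.",
            Map.of("source", "notes"));
    vectorStore.add(List.of(doc));

    return vectorStore.similaritySearch(SearchRequest.builder()
            .query("vector database")
            .topK(3)
            .similarityThreshold(0.3)
            .build());
}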
Implementing the API Endpoint
@GetMapping("/embedding")
public ResponseEntity<List<Document>> embedding() {
    return ResponseEntity.ok(embeddingService.embed());
}
This implementation defines the /embedding endpoint, which vectorizes the sample documents, performs a similarity search, and returns the results.
Here’s an example. As expected, the document containing the word "Spring" has the highest similarity score (in this output, the score is simply 1 minus the distance reported by Chroma).
curl http://localhost:8080/embedding
[
  {
    "id": "af885f07-20c9-4db4-913b-95406f1cb0cb",
    "text": "Spring AI is a framework for building AI applications with the familiar Spring ecosystem and programming model.",
    "media": null,
    "metadata": {
      "distance": 0.5593532
    },
    "score": 0.44064682722091675
  },
  {
    "id": "5a8b8071-b8d6-491e-b542-611d33e16159",
    "text": "LlamaIndex is a data framework for LLM applications to ingest, structure, and access private or domain-specific data.",
    "media": null,
    "metadata": {
      "distance": 0.6968217
    },
    "score": 0.3031783103942871
  },
  {
    "id": "336b3e07-1a70-4546-920d-c4869e77e4bb",
    "text": "LangChain is a framework for developing applications powered by language models through composability.",
    "media": null,
    "metadata": {
      "distance": 0.71094555
    },
    "score": 0.2890544533729553
  }
]
Conclusion
This was a brief introduction to implementing generative AI functions using Spring AI. After trying it out, Spring AI seems like a good option when integrating generative AI capabilities into Java-based systems. Also, I feel that combining the chat and embedding functions introduced here makes it relatively easy to build RAG functionality.
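As a rough illustration of that last point, here is a minimal sketch of how the ChatClient and VectorStore beans shown above might be combined into a simple RAG-style service. The RagService class, prompt wording, and topK value are my own assumptions rather than part of the sample above:

import java.util.List;
import java.util.stream.Collectors;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.stereotype.Service;

@Service
public class RagService {

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    public RagService(ChatClient.Builder builder, VectorStore vectorStore) {
        this.chatClient = builder.build();
        this.vectorStore = vectorStore;
    }

    public String answer(String question) {
        // 1. Retrieve the documents most similar to the question from Chroma
        List<Document> docs = vectorStore.similaritySearch(
                SearchRequest.builder().query(question).topK(3).build());

        // 2. Concatenate the retrieved text into a context block
        String context = docs.stream()
                .map(Document::getText)
                .collect(Collectors.joining("\n"));

        // 3. Ask the LLM to answer using only the retrieved context
        return chatClient.prompt()
                .system("Answer the question using only the following context:\n" + context)
                .user(question)
                .call()
                .content();
    }
}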
At this point, Python-based frameworks like LlamaIndex and LangChain still provide more advanced features and a richer ecosystem for generative AI. However, since Spring AI has only recently been released, there's a lot of potential for growth, and I’m excited to see how it develops.