KINTO Tech Blog
Generative AI

Trying Out Spring AI


Introduction

Hello. I am Yamada, and I develop and operate in-house tools in the Platform Engineering Team of KINTO Technologies' (KTC) Platform Group. As an application engineer, I primarily develop a CMDB (Configuration Management Database), working on both the backend using Java and Spring Boot, and the frontend with JavaScript and React. Over the past year or so, I’ve also been riding the wave of generative AI, working on a chatbot for our CMDB that uses technologies like RAG and Text-to-SQL.

This is my previous article about Text-to-SQL. If you’re interested, feel free to check it out!
https://blog.kinto-technologies.com/posts/2025-01-16-generativeAI_and_Text-to-SQL/

This time, I’d like to share my experience exploring Spring AI, a topic related to generative AI. Spring AI reached General Availability (GA) on May 20, 2025, and I had the opportunity to try it out firsthand.

Overview of Spring AI

Spring AI is a generative AI framework that is part of the Spring ecosystem. In the generative AI space, Python-based frameworks like LlamaIndex and LangChain are widely known. But if you're looking to add generative AI capabilities to an existing Spring application, or want to experiment with generative AI in Java, Spring AI could be a great option.

https://spring.pleiades.io/spring-ai/reference/

What I Tried

This time, using Spring AI, I implemented the following two functions:

  1. Chat function (LLM interaction): a conversational function powered by the Claude 3.7 Sonnet model on AWS Bedrock.
  2. Embedding function: a document embedding and similarity search function that uses Cohere's embedding model together with Chroma, a vector database.

Technology Stack

  • Spring Boot 3.5.0
  • Java 21
  • Spring AI 1.0.0
  • Gradle
  • Chroma 1.0.0
  • AWS Bedrock
    • LLM Model: Anthropic Claude 3.7 Sonnet
    • Embedding model: Cohere Embed Multilingual v3

Environment Setup

Setting Up Dependencies

First, add the necessary dependencies for using Spring AI in your build.gradle file.

plugins {
    id 'java'
    id 'org.springframework.boot' version '3.5.0'
    id 'io.spring.dependency-management' version '1.1.7'
}

java {
    toolchain {
        languageVersion = JavaLanguageVersion.of(21)
    }
}

ext {
    set('springAiVersion', "1.0.0")
}

dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-web'
    implementation 'org.springframework.boot:spring-boot-starter-validation'
    implementation 'org.springframework.ai:spring-ai-starter-model-bedrock-converse'
    implementation 'org.springframework.ai:spring-ai-starter-model-bedrock'
    implementation 'org.springframework.ai:spring-ai-bedrock'
    implementation 'org.springframework.ai:spring-ai-starter-vector-store-chroma'
    testImplementation 'org.springframework.boot:spring-boot-starter-test'
}

dependencyManagement {
    imports {
        mavenBom "org.springframework.ai:spring-ai-bom:${springAiVersion}"
    }
}

Application Configuration

In application.yml, configure your AWS Bedrock credentials, model selection, and vector store connection settings.

spring:
  application:
    name: spring-ai-sample
  
  ai:
    bedrock:
      aws:
        region: ap-northeast-1
        access-key: ${AWS_ACCESS_KEY_ID}
        secret-key: ${AWS_SECRET_ACCESS_KEY}
        session-token: ${AWS_SESSION_TOKEN}
      converse:
        chat:
          options:
            model: arn:aws:bedrock:ap-northeast-1:{account_id}:inference-profile/apac.anthropic.claude-3-7-sonnet-20250219-v1:0
      cohere:
        embedding:
          model: cohere.embed-multilingual-v3
    model:
      embedding: bedrock-cohere
    vectorstore:
      chroma:
        client:
          host: http://localhost
          port: 8000
        initialize-schema: true

That's all for the configuration. From here, you can easily inject the necessary classes into your application using dependency injection (DI).
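Note that the Chroma settings above assume a Chroma server is already listening on localhost:8000. One simple way to start one locally, assuming Docker and the official chromadb/chroma image, is:

```shell
# Start a local Chroma server on port 8000 (detached); the image tag to use
# would depend on the Chroma version you are targeting
docker run -d -p 8000:8000 chromadb/chroma
```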

Implementation Example

1. Conversational Function with a Generative AI (LLM)

With Spring AI, you can easily implement conversations with an LLM using the ChatClient interface.

Service Layer Implementation

Implement the conversational function with an LLM in a class called ChatService.

@Service
public class ChatService {
    private final ChatClient chatClient;

    public ChatService(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    public String generate(String message) {
        return this.chatClient.prompt(message).call().content();
    }
}

By injecting the ChatClient.Builder, you can automatically build a client based on the configuration file.

Controller Implementation

Create a REST controller to provide the API endpoint.

@RestController
public class ChatController {
    private final ChatService chatService;
    private final EmbeddingService embeddingService;

    public ChatController(ChatService chatService, EmbeddingService embeddingService) {
        this.chatService = chatService;
        this.embeddingService = embeddingService;
    }

    @PostMapping("/generate")
    public ResponseEntity<String> generate(@RequestBody ChatRequest request) {
        return ResponseEntity.ok(chatService.generate(request.getMessage()));
    }

}
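The ChatRequest class bound by @RequestBody isn't shown above. A minimal sketch might look like the following, where the field name message is an assumption taken from the curl example below:

```java
// Minimal request DTO for the /generate endpoint.
// (Declared public in its own file in the real project.)
class ChatRequest {
    private String message;

    public String getMessage() {
        return message;
    }

    public void setMessage(String message) {
        this.message = message;
    }
}
```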

Once this is set up, when you send a POST request to the /generate endpoint, you’ll receive a response generated by Claude 3.7 Sonnet.

Here’s an example:

curl -X POST http://localhost:8080/generate -H "Content-Type: application/json" -d '{"message": "こんにちわわ"}'

こんにちは! 何かお手伝いできることはありますか?

2. Implementing the Embedding Search Function

Next, implement a function that vectorizes documents and enables semantic search.

Embedding Service Implementation

In this service, vectorize a sample text as a Document object and store it in Chroma. Then, use "Spring" as the query to search for similar documents. Behind the scenes, the Cohere Embed Multilingual v3 model converts the text into vectors.

@Service
public class EmbeddingService {
    private final VectorStore vectorStore;

    public EmbeddingService(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    public List<Document> embed() {
        List<Document> documents = List.of(
            new Document("Spring AI is a framework for building AI applications with the familiar Spring ecosystem and programming model."),
            new Document("LlamaIndex is a data framework for LLM applications to ingest, structure, and access private or domain-specific data."),
            new Document("LangChain is a framework for developing applications powered by language models through composability.")
        );

        vectorStore.add(documents);
        return vectorStore.similaritySearch(SearchRequest.builder().query("Spring").topK(5).build());
    }
}

Implementing the API Endpoint

@GetMapping("/embedding")
public ResponseEntity<List<Document>> embedding() {
    return ResponseEntity.ok(embeddingService.embed());
}

This implementation defines the /embedding endpoint, which vectorizes a sample document, performs a similarity search, and returns the results.

Here’s an example. As expected, the document containing the word "Spring" has the highest similarity score.

curl http://localhost:8080/embedding

[
  {
    "id": "af885f07-20c9-4db4-913b-95406f1cb0cb",
    "text": "Spring AI is a framework for building AI applications with the familiar Spring ecosystem and programming model.",
    "media": null,
    "metadata": {
      "distance": 0.5593532
    },
    "score": 0.44064682722091675
  },
  {
    "id": "5a8b8071-b8d6-491e-b542-611d33e16159",
    "text": "LlamaIndex is a data framework for LLM applications to ingest, structure, and access private or domain-specific data.",
    "media": null,
    "metadata": {
      "distance": 0.6968217
    },
    "score": 0.3031783103942871
  },
  {
    "id": "336b3e07-1a70-4546-920d-c4869e77e4bb",
    "text": "LangChain is a framework for developing applications powered by language models through composability.",
    "media": null,
    "metadata": {
      "distance": 0.71094555
    },
    "score": 0.2890544533729553
  }
]
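Incidentally, the score in each result appears to simply be 1 - distance, i.e. Spring AI seems to convert Chroma's cosine distance into a similarity score. This is an observation from the response above rather than something I verified in the source:

```java
class ScoreCheck {
    public static void main(String[] args) {
        // Cosine distances copied from the Chroma response above
        double[] distances = {0.5593532, 0.6968217, 0.71094555};
        for (double distance : distances) {
            // score = 1 - distance reproduces the "score" field of each result
            System.out.println(1.0 - distance);
        }
    }
}
```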

Conclusion

This was a brief introduction to implementing generative AI functions using Spring AI. Having tried it out, I think Spring AI is a good option when integrating generative AI capabilities into Java-based systems. I also feel that combining the chat and embedding functions introduced here makes it relatively easy to build RAG functionality.
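As a rough idea of that combination: retrieve documents with vectorStore.similaritySearch(...), stuff their text into the prompt, and pass the result to the chat client. The prompt-assembly step might look like this, where buildRagPrompt is a hypothetical helper of my own, not a Spring AI API:

```java
import java.util.List;

class RagPromptSketch {
    // Hypothetical helper: embed retrieved document texts into the user prompt
    // before passing it to something like ChatService.generate(...)
    static String buildRagPrompt(List<String> contexts, String question) {
        return "Answer the question using only the context below.\n\n"
                + "Context:\n" + String.join("\n---\n", contexts)
                + "\n\nQuestion: " + question;
    }

    public static void main(String[] args) {
        List<String> hits = List.of(
                "Spring AI is a framework for building AI applications with the familiar Spring ecosystem and programming model.");
        System.out.println(buildRagPrompt(hits, "What is Spring AI?"));
    }
}
```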

At this point, Python-based frameworks like LlamaIndex and LangChain still provide more advanced features and a richer ecosystem for generative AI. However, since Spring AI has only recently been released, there's a lot of potential for growth, and I’m excited to see how it develops.
