Introducing the Internal Slack Chatbot Leveraging LLM
<h2 id="1.-introduction" data-line="1" class="code-line"><a class="header-anchor-link" href="#1.-introduction" aria-hidden="true"></a> 1. Introduction</h2>
<p data-line="3" class="code-line">Hello, this is Torii (<a href="https://x.com/yu_torii" target="_blank" rel="nofollow noopener noreferrer">@yu_torii</a>) from the Common Service Development Group. I'm a fullstack software engineer, primarily working on both backend and frontend development. As part of the KINTO Member Platform Development Team, I focus on frontend engineering while also contributing to internal initiatives involving generative AI.</p>
<p data-line="5" class="code-line">This article introduces our internal generative AI tool, a chatbot powered by an LLM and integrated into Slack. We’ll explore its RAG capabilities and its translation function, which works through Slack reactions.</p>
<hr data-line="7" class="code-line" />
<p data-line="9" class="code-line">The internal generative AI tool is designed to facilitate generative AI adoption within the company by enabling employees to interact with AI naturally in their daily Slack conversations. By building on Slack, we avoid frontend development and can deploy the tool quickly within the company. It aims to be a dependable assistant that improves work efficiency, collaboration, and information sharing.</p>
<p data-line="11" class="code-line">By the way, the character was generated by team members of the Creative Group as a surprise reveal for the KINTO Technologies General Meeting (KTC CHO All-Hands Meeting). A big thank you to them!</p>
<hr data-line="13" class="code-line" />
<p data-line="15" class="code-line">This initiative was carried out in collaboration with Wada-san (<a href="https://x.com/cognac_n" target="_blank" rel="nofollow noopener noreferrer">@cognac_n</a>), who is leading the Generative AI Development Project. We have been driving the adoption of generative AI within the company through various improvements, such as introducing a RAG pipeline, setting up a local development environment, and implementing a translation and summarization feature using Slack emoji reactions.</p>
<details><summary>Click here for Wada-san's article</summary><div class="details-content"><span class="embed-block zenn-embedded zenn-embedded-card"><iframe id="zenn-embedded__91613a4c1005c" src="https://embed.zenn.studio/card#zenn-embedded__91613a4c1005c" data-content="https%3A%2F%2Fblog.kinto-technologies.com%2Fposts%2F2024-01-26-GenerativeAIDevelopProject-en%2F" frameborder="0" scrolling="no" loading="lazy"></iframe></span>
<span class="embed-block zenn-embedded zenn-embedded-card"><iframe id="zenn-embedded__4178200940d6f" src="https://embed.zenn.studio/card#zenn-embedded__4178200940d6f" data-content="https%3A%2F%2Fblog.kinto-technologies.com%2Fposts%2F2024-08-20-prompty-review-en%2F" frameborder="0" scrolling="no" loading="lazy"></iframe></span>
<span class="embed-block zenn-embedded zenn-embedded-card"><iframe id="zenn-embedded__0541c9aab177e" src="https://embed.zenn.studio/card#zenn-embedded__0541c9aab177e" data-content="https%3A%2F%2Fwww.wantedly.com%2Fcompanies%2Fcompany_7864825%2Fpost_articles%2F878277" frameborder="0" scrolling="no" loading="lazy"></iframe></span>
</div></details>
<p data-line="25" class="code-line">Note that this article does not go into detail about RAG or generative AI technology itself. Instead, it focuses on the implementation process and feature enhancements. Additionally, the internal generative AI tool's LLM runs on Azure OpenAI.</p>
<h2 id="what-you-will-gain-from-this-article" data-line="27" class="code-line"><a class="header-anchor-link" href="#what-you-will-gain-from-this-article" aria-hidden="true"></a> What You Will Gain from This Article</h2>
<ul data-line="29" class="code-line">
<li data-line="29" class="code-line">
<p data-line="29" class="code-line"><strong>How to use chat functions powered by generative AI in a Slack Bot</strong>
This article introduces an implementation example of a chatbot that combines Slack and LLM, allowing users to trigger translations and summaries using emoji reactions or natural message posts.</p>
</li>
<li data-line="32" class="code-line">
<p data-line="32" class="code-line"><strong>Techniques for retrieving Confluence data, including HTML sanitization</strong>
Learn how to fetch Confluence documents using Go and sanitize HTML to prepare text for summarization and embedding.</p>
</li>
<li data-line="35" class="code-line">
<p data-line="35" class="code-line"><strong>Implementing a simple RAG pipeline using FAISS and S3</strong>
This article explains the steps and considerations for setting up a simple RAG pipeline using FAISS indexing and S3. While the response speed may not be very fast, this approach provides a cost-effective way to integrate a basic RAG system.</p>
</li>
</ul>
<aside class="msg message"><span class="msg-symbol">!</span><div class="msg-content"><ul data-line="40" class="code-line">
<li data-line="40" class="code-line">This article is based on the implementation status at the time of writing. Please note that there are still many areas for improvement.</li>
<li data-line="41" class="code-line">Since this is an internal tool, it is not designed for production-level performance.</li>
<li data-line="42" class="code-line">Additionally, this article does not cover details on setting up the development environment, such as deployment with AWS SAM, building a pipeline with Step Functions, or configuring CI/CD with GitHub Actions.</li>
</ul>
</div></aside>
<details><summary>Table of Contents (Expanded)</summary><div class="details-content"><ul data-line="48" class="code-line">
<li data-line="48" class="code-line"><a href="#1.-introduction">1. Introduction</a></li>
<li data-line="49" class="code-line"><a href="#what-you-will-gain-from-this-article">What You Will Gain from This Article</a></li>
<li data-line="50" class="code-line"><a href="#2.-overview-of-the-overall-architecture">2. Overview of the Overall Architecture</a></li>
<li data-line="51" class="code-line"><a href="#3.-ai-powered-chat-function-using-slack-bot">3. AI-Powered Chat Function Using Slack Bot</a>
<ul data-line="52" class="code-line">
<li data-line="52" class="code-line"><a href="#chat-function-and-reaction-function">Chat Function and Reaction Function</a></li>
<li data-line="53" class="code-line"><a href="#chat-function%3A-usage-scenario">Chat Function: Usage Scenario</a></li>
<li data-line="54" class="code-line"><a href="#reaction-feature%3A-usage-scenario">Reaction Feature: Usage Scenario</a></li>
<li data-line="55" class="code-line"><a href="#benefits-of-this-configuration">Benefits of This Configuration</a></li>
<li data-line="56" class="code-line"><a href="#3.5-actual-use-cases">3.5 Actual Use Cases</a></li>
<li data-line="57" class="code-line"><a href="#summary-of-chapter-3">Summary of Chapter 3</a></li>
</ul>
</li>
<li data-line="58" class="code-line"><a href="#4.-implementation-policy-and-internal-design">4. Implementation Policy and Internal Design</a>
<ul data-line="59" class="code-line">
<li data-line="59" class="code-line"><a href="#overall-process-flow">Overall Process Flow</a></li>
<li data-line="60" class="code-line"><a href="#function-determination-by-conditional-branching">Function Determination by Conditional Branching</a></li>
<li data-line="61" class="code-line"><a href="#the-role-of-sanitization">The Role of Sanitization</a></li>
<li data-line="62" class="code-line"><a href="#limiting-rag-usage-to-confluence-references">Limiting RAG Usage to Confluence References</a></li>
<li data-line="63" class="code-line"><a href="#considerations-for-scalability-and-maintainability">Considerations for Scalability and Maintainability</a></li>
</ul>
</li>
<li data-line="64" class="code-line"><a href="#5.-introduction-to-code-examples-and-configuration-files">5. Introduction to Code Examples and Configuration Files</a>
<ul data-line="65" class="code-line">
<li data-line="65" class="code-line"><a href="#5.1-%5Bgo%5D-receiving-and-parsing-slack-events">5.1 [Go] Receiving and Parsing Slack Events</a></li>
<li data-line="66" class="code-line"><a href="#52-go-html%E3%83%86%E3%82%AD%E3%82%B9%E3%83%88%E3%81%AE%E3%82%B5%E3%83%8B%E3%82%BF%E3%82%A4%E3%82%BA">5.2 [Go] HTML Text Sanitization</a></li>
<li data-line="67" class="code-line"><a href="#53-python-llm%E5%95%8F%E3%81%84%E5%90%88%E3%82%8F%E3%81%9B%E3%81%AE%E5%85%B7%E4%BD%93%E4%BE%8B">5.3 [Python] Example of an LLM Query</a></li>
<li data-line="68" class="code-line"><a href="#54-python-rag%E6%A4%9C%E7%B4%A2%E5%91%BC%E3%81%B3%E5%87%BA%E3%81%97%E4%BE%8B">5.4 [Python] Example of a RAG Search Call</a></li>
<li data-line="69" class="code-line"><a href="#55-python-embedding%E3%81%A8faiss%E3%82%A4%E3%83%B3%E3%83%87%E3%83%83%E3%82%AF%E3%82%B9%E5%8C%96">5.5 [Python] Embedding and FAISS Indexing</a></li>
</ul>
</li>
</ul>
</div></details>
<hr data-line="73" class="code-line" />
<h2 id="2.-overview-of-the-overall-architecture" data-line="75" class="code-line"><a class="header-anchor-link" href="#2.-overview-of-the-overall-architecture" aria-hidden="true"></a> 2. Overview of the Overall Architecture</h2>
<p data-line="77" class="code-line">Here, we explain how the internal generative AI tool returns answers using generative AI, from two perspectives:</p>
<ul class="code-line">
<li class="code-line">The generative AI chat function with users</li>
<li class="code-line">The process of indexing Confluence documents</li>
</ul>
<h3 id="processing-generative-ai-chat-functions-with-users" data-line="79" class="code-line"><a class="header-anchor-link" href="#processing-generative-ai-chat-functions-with-users" aria-hidden="true"></a> Processing Generative AI Chat Functions with Users</h3>
<ol data-line="81" class="code-line">
<li data-line="81" class="code-line">
<p data-line="81" class="code-line"><strong>Calling the internal generative AI tool on Slack</strong>
There are two ways to call the tool: via chat or with a reaction emoji. For a chat call, users mention the bot in a channel or send it a direct message to request a generative AI response. For a reaction call, users add a translation or summary emoji reaction to a message to request a translation or summary of it.</p>
</li>
<li data-line="83" class="code-line">
<p data-line="83" class="code-line"><strong>Go Slack Bot Lambda</strong>
The bot that handles Slack events is built with Go and runs on AWS Lambda. It receives questions and reactions, then determines processing based on the request type. When a request requires an LLM query via Azure OpenAI or a lookup of embedded data, this component builds the request and forwards it to the Python Lambda.</p>
</li>
<li data-line="86" class="code-line">
<p data-line="86" class="code-line"><strong>Request to LLM and RAG reference in Python Lambda</strong>
The Python side is divided into Lambda functions responsible for LLM queries, RAG references, and related processes. These functions receive requests from the Go Lambda, query the LLM, and, when a RAG reference is requested, use the retrieved documents to generate the answer.</p>
</li>
<li data-line="89" class="code-line">
<p data-line="89" class="code-line"><strong>Returning an answer to Slack</strong> The generated answer is sent back to Slack via Go Slack Bot Lambda. The translation and summary functions that can be invoked via emoji reactions are also integrated into this flow.</p>
</li>
</ol>
<details><summary>Architecture diagram (processing user requests)</summary><div class="details-content"><span class="embed-block zenn-embedded zenn-embedded-mermaid"><iframe id="zenn-embedded__8ca50d91398c5" src="https://embed.zenn.studio/mermaid#zenn-embedded__8ca50d91398c5" data-content="flowchart%20LR%0A%20%20%20%20subgraph%20%22Processing%20user%20requests%22%0A%20%20%20%20A%5B%22User(Slack)%22%5D%20--%3E%7C%22Question%E3%83%BBEmoji%22%7C%20B%5B%22Go%20Slack%20Bot(Lambda)%22%5D%0A%20%20%20%20B%20--%3E%7C%22Request%22%7C%20C%5B%22Python%20Lambda%20RAG%2FLLM%22%5D%0A%20%20%20%20C%20--%3E%7C%22Answer%22%7C%20B%0A%20%20%20%20B%20--%3E%7C%22Answer%22%7C%20A%0A%20%20%20%20end" frameborder="0" scrolling="no" loading="lazy"></iframe></span></div></details>
<h3 id="the-process-of-indexing-confluence-documents" data-line="105" class="code-line"><a class="header-anchor-link" href="#the-process-of-indexing-confluence-documents" aria-hidden="true"></a> The Process of Indexing Confluence Documents</h3>
<p data-line="107" class="code-line">To use the RAG pipeline, Confluence documents must first be prepared in a form that is easy to summarize and embed. These preprocessing steps are organized into a workflow with AWS Step Functions and run automatically on a schedule.</p>
<ul data-line="109" class="code-line">
<li data-line="109" class="code-line">
<p data-line="109" class="code-line"><strong>Document retrieval and HTML sanitization (Go implementation)</strong>
A Lambda function implemented in Go retrieves documents from the Confluence API, cleans up HTML tags, and makes the text more manageable.
The sanitized text is output as JSON.</p>
</li>
<li data-line="113" class="code-line">
<p data-line="113" class="code-line"><strong>Summary processing (Go + Python Lambda invocation)</strong>
Summarization condenses the text so that it is easier to handle in the embedding and RAG steps.
A Go Lambda invokes a Python Lambda, which sends the text to the Azure OpenAI Chat API to shorten it; the shortened text is then written back as JSON.</p>
</li>
<li data-line="117" class="code-line">
<p data-line="117" class="code-line"><strong>FAISS indexing and S3 storage (Indexer Lambda)</strong>
The Indexer Lambda embeds the summarized text, builds a FAISS index, and stores the index and its metadata in S3.
Because the index is prepared in advance, indexed data can be retrieved promptly when a query arrives, which keeps the RAG pipeline running smoothly.</p>
</li>
</ul>
<details><summary>Architecture diagram (Confluence document indexing flow)</summary><div class="details-content"><span class="embed-block zenn-embedded zenn-embedded-mermaid"><iframe id="zenn-embedded__9783e63ccecda" src="https://embed.zenn.studio/mermaid#zenn-embedded__9783e63ccecda" data-content="flowchart%20LR%0A%20%20%20%20subgraph%20%22Indexing%20Preparation%20StepFunctions%22%0A%20%20%20%20D%5B%22Go%20Lambda%20(Document%20Retrieval%2FHTML%20Sanitization)%22%5D%0A%20%20%20%20D%20--%3E%20E%5B%22Go%20Lambda%20(summary%2FPython%20call)%22%5D%0A%20%20%20%20E%20--%3E%20F%5B%22Indexer%20Lambda(Embedding%2FFAISS%2FS3)%22%5D%0A%20%20%20%20end" frameborder="0" scrolling="no" loading="lazy"></iframe></span></div></details>
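<p class="code-line">The last stage above can be pictured with a small pure-Python sketch. It is not the Indexer Lambda described in this article and does not call FAISS, Azure OpenAI, or S3. It only illustrates, under simplifying assumptions, what an exact (flat) vector index does: keep one embedding and one metadata record per summarized document, serialize both so they can be stored as a file, and return the nearest entries for a query vector. The class name <code>TinyFlatIndex</code>, the toy three-dimensional vectors, and the metadata fields are hypothetical.</p>

```python
import json
import math
from dataclasses import dataclass, field


@dataclass
class TinyFlatIndex:
    """Brute-force L2 index: a conceptual stand-in for a flat vector index."""

    dim: int
    vectors: list = field(default_factory=list)
    metadata: list = field(default_factory=list)

    def add(self, vector, meta):
        """Store one embedding together with its metadata record."""
        if len(vector) != self.dim:
            raise ValueError(f"expected {self.dim} dimensions, got {len(vector)}")
        self.vectors.append([float(x) for x in vector])
        self.metadata.append(meta)

    def search(self, query, k=3):
        """Return up to k (distance, metadata) pairs, nearest first."""
        scored = [
            (math.dist(query, vec), meta)
            for vec, meta in zip(self.vectors, self.metadata)
        ]
        scored.sort(key=lambda pair: pair[0])  # sort on distance only
        return scored[:k]

    def to_json(self):
        """Serialize index and metadata, e.g. before uploading to object storage."""
        return json.dumps(
            {"dim": self.dim, "vectors": self.vectors, "metadata": self.metadata}
        )

    @classmethod
    def from_json(cls, payload):
        """Rebuild the index from a previously serialized payload."""
        data = json.loads(payload)
        return cls(dim=data["dim"], vectors=data["vectors"], metadata=data["metadata"])


# Usage with made-up vectors; real embeddings would come from an embedding model.
index = TinyFlatIndex(dim=3)
index.add([1.0, 0.0, 0.0], {"title": "Expense rules"})
index.add([0.0, 1.0, 0.0], {"title": "Onboarding guide"})
restored = TinyFlatIndex.from_json(index.to_json())
hits = restored.search([0.9, 0.1, 0.0], k=1)
```

<p class="code-line">Keeping the metadata next to the vectors is what lets a search hit be mapped back to a page title or URL. A dedicated library such as FAISS replaces the brute-force loop once the number of documents grows.</p>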
<p data-line="135" class="code-line">By combining this preprocessing with request-time processing, the internal generative AI tool provides generative AI responses that incorporate company-specific knowledge through simple operations in Slack. In the following chapters, we'll dive deeper into these components.</p>
<hr data-line="137" class="code-line" />
<h2 id="3.-ai-powered-chat-function-using-slack-bot" data-line="139" class="code-line"><a class="header-anchor-link" href="#3.-ai-powered-chat-function-using-slack-bot" aria-hidden="true"></a> 3. AI-Powered Chat Function Using Slack Bot</h2>
<p data-line="141" class="code-line">The previous chapter provided an overview of the architecture. In this chapter, we’ll focus on how users can use the internal generative AI tool’s features through simple, natural interactions in Slack, and how these features benefit them.
Here, we illustrate what the tool can do and in which scenarios it is useful; the following chapters explain the implementation details systematically.</p>
<h3 id="chat-function-and-reaction-function" data-line="144" class="code-line"><a class="header-anchor-link" href="#chat-function-and-reaction-function" aria-hidden="true"></a> Chat Function and Reaction Function</h3>
<p data-line="146" class="code-line">The internal generative AI tool offers a variety of generative AI capabilities, all accessed through natural interactions within Slack.</p>
<ul data-line="148" class="code-line">
<li data-line="148" class="code-line">
<p data-line="148" class="code-line"><strong>Chat function</strong>:
By posting a question, attaching files or images, including external links, or sending a message in a specific format, users can trigger LLM queries or, in certain cases, RAG searches to obtain relevant answers.
By integrating AI into Slack, an everyday tool, users can seamlessly adopt generative AI without the need to learn new environments or commands.</p>
</li>
<li data-line="152" class="code-line">
<p data-line="152" class="code-line"><strong>Reaction function</strong>:
By simply adding specific emoji reactions to a message, users can trigger translations, delete messages, and perform other actions—enabling additional operations without requiring commands.</p>
</li>
</ul>
<h3 id="chat-function%3A-usage-scenario" data-line="155" class="code-line"><a class="header-anchor-link" href="#chat-function%3A-usage-scenario" aria-hidden="true"></a> Chat Function: Usage Scenario</h3>
<ol data-line="157" class="code-line">
<li data-line="157" class="code-line">
<p data-line="157" class="code-line"><strong>Basic Questions and Answers</strong>
Simply posting a question returns an AI-generated response from the LLM.</p>
<ul data-line="159" class="code-line">
<li data-line="159" class="code-line">Example scenario:
Asking "What are the steps for this project?" provides an instant answer, taking into account thread history and speaker context for a more accurate response.</li>
</ul>
</li>
<li data-line="162" class="code-line">
<p data-line="162" class="code-line"><strong>File, image, and external link processing</strong>
Attaching a file and asking "Summarize this" will generate a summary of the document.
Uploading an image allows the internal generative AI tool to extract text and provide relevant answers.
Sharing an external link enables the bot to analyze and summarize webpage content, incorporating it into the LLM-generated response.</p>
<ul data-line="166" class="code-line">
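<li data-line="166" class="code-line">Example scenarios:
<ul class="code-line">
<li class="code-line">Get a short summary of meeting notes from a text file.</li>
<li class="code-line">Extract text information from an image.</li>
<li class="code-line">Generate a concise summary of an external article.</li>
</ul>
</li>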
</ul>
</li>
<li data-line="169" class="code-line">
<p data-line="169" class="code-line"><strong>Confluence page lookup (RAG integration)</strong>
By using <code>:confluence: index:Index name</code>, users can perform a RAG-based search on Confluence pages containing company documentation, such as internal rules and application procedures.</p>
<ul data-line="171" class="code-line">
<li data-line="171" class="code-line">Example scenario:
If company policies and procedures are documented in Confluence, users can instantly access relevant information tailored to internal workflows.</li>
<li data-line="173" class="code-line">Example scenario:
Easily retrieve project-specific settings and instructions that would otherwise be difficult to search for.</li>
</ul>
</li>
</ol>
<h3 id="reaction-feature%3A-usage-scenario" data-line="176" class="code-line"><a class="header-anchor-link" href="#reaction-feature%3A-usage-scenario" aria-hidden="true"></a> Reaction Feature: Usage Scenario</h3>
<p data-line="178" class="code-line">Adding specific emoji reactions makes calling up functions even more intuitive.</p>
<ul data-line="180" class="code-line">
<li data-line="180" class="code-line">
<p data-line="180" class="code-line"><strong>Translation</strong>:
Adding a translation emoji automatically translates the message into the specified language, helping break down language barriers and improve communication.</p>
</li>
<li data-line="183" class="code-line">
<p data-line="183" class="code-line"><strong>Message deletion</strong>
Unnecessary responses from the internal generative AI tool can be deleted instantly by adding a single emoji, keeping Slack channels organized.</p>
</li>
</ul>
<h3 id="benefits-of-this-configuration" data-line="186" class="code-line"><a class="header-anchor-link" href="#benefits-of-this-configuration" aria-hidden="true"></a> Benefits of This Configuration</h3>
<ul data-line="188" class="code-line">
<li data-line="188" class="code-line">
<p data-line="188" class="code-line"><strong>Seamless and intuitive AI usage</strong>:
Users can use AI without learning new commands. They simply interact with Slack as they normally do (e.g., posting messages, adding emoji reactions), which reduces learning costs.</p>
</li>
<li data-line="191" class="code-line">
<p data-line="191" class="code-line"><strong>Embedding AI into an everyday tool (Slack)</strong>:
By integrating generative AI directly into Slack, AI can be naturally incorporated into daily workflows without friction. Additionally, reaction-based interactions enable users to perform actions like translation and deletion without typing commands, making AI even more accessible.</p>
</li>
<li data-line="194" class="code-line">
<p data-line="194" class="code-line"><strong>Scalable and easily extendable</strong>:
If new models or additional features need to be introduced, the existing flow (chat-based queries and emoji interactions) can be easily expanded by adding conditions.</p>
</li>
</ul>
<h3 id="3.5-actual-use-cases" data-line="197" class="code-line"><a class="header-anchor-link" href="#3.5-actual-use-cases" aria-hidden="true"></a> 3.5 Actual Use Cases</h3>
<p data-line="199" class="code-line">Below are some real-world examples of how the internal generative AI tool is used within Slack.</p>
<h4 id="example-1%3A-image-recognition" data-line="201" class="code-line"><a class="header-anchor-link" href="#example-1%3A-image-recognition" aria-hidden="true"></a> Example 1: Image Recognition</h4>
<p data-line="203" class="code-line">When you attach an image to a message, the internal generative AI tool recognizes the text within the image and responds with its content.</p>
<p data-line="205" class="code-line"><img src="/assets/blog/authors/torii/2024-12-23_ai_tool_slack/images/image-context.png" alt="Image Recognition" width="500" /> <em>Image Recognition</em></p>
<h4 id="example-2%3A-answers-based-on-confluence-documentation" data-line="207" class="code-line"><a class="header-anchor-link" href="#example-2%3A-answers-based-on-confluence-documentation" aria-hidden="true"></a> Example 2: Answers Based on Confluence Documentation</h4>
<p data-line="209" class="code-line">By using the <code>:confluence:</code> emoji, users can retrieve answers based on the Confluence documentation.</p>
<p data-line="211" class="code-line"><img src="/assets/blog/authors/torii/2024-12-23_ai_tool_slack/images/image-confluence-rag.png" alt="Answer based on Confluence documentation" width="500" /> <em>Answer based on Confluence documentation</em></p>
<p data-line="213" class="code-line">As shown above, the internal generative AI tool can also be triggered from workflows, which allows it to function similarly to a prompt store.</p>
<h4 id="example-3%3A-requesting-english-translations-via-the-translation-reaction" data-line="215" class="code-line"><a class="header-anchor-link" href="#example-3%3A-requesting-english-translations-via-the-translation-reaction" aria-hidden="true"></a> Example 3: Requesting English Translations via the Translation Reaction</h4>
<p data-line="217" class="code-line"><strong>Add a translation reaction to a message you want translated into English</strong></p>
<p data-line="219" class="code-line">To share a Japanese message with an English speaker, simply add the <code>:ai_tool_translate_to_english:</code> reaction emoji. This will automatically translate the message into English. <img src="/assets/blog/authors/torii/2024-12-23_ai_tool_slack/images/image-translation.png" alt="English translation reaction" width="500" /> <em>English translation reaction</em></p>
<p data-line="221" class="code-line">To prevent misunderstandings and discourage over-reliance on automatic translation, a notice in multiple languages states that the translation was AI-generated. We also provide instructions on how to use the feature to encourage adoption. Besides English, the internal generative AI tool supports translation into multiple other languages.</p>
<h3 id="summary-of-chapter-3" data-line="223" class="code-line"><a class="header-anchor-link" href="#summary-of-chapter-3" aria-hidden="true"></a> Summary of Chapter 3</h3>
<p data-line="225" class="code-line">In this chapter, we focused on the user perspective: which operations make which things possible. Building on the usage scenarios discussed here, the following chapters systematically explain the implementation details. We will cover:</p>
<ul class="code-line">
<li class="code-line">How the Go-based bot integrates with the Python Lambdas.</li>
<li class="code-line">How Slack events are processed and how RAG search works.</li>
<li class="code-line">The details of the file extraction and sanitization processes.</li>
</ul>
<hr data-line="227" class="code-line" />
<h2 id="4.-implementation-policy-and-internal-design" data-line="229" class="code-line"><a class="header-anchor-link" href="#4.-implementation-policy-and-internal-design" aria-hidden="true"></a> 4. Implementation Policy and Internal Design</h2>
<p data-line="231" class="code-line">In this chapter, we explain the implementation policy and internal design that let the internal generative AI tool provide a variety of generative AI functions through Slack messages and reactions.
This section focuses on the overall concept, role distribution, and scalability considerations. Specific code snippets and configuration files will be provided in the next chapter.</p>
<h3 id="overall-process-flow" data-line="234" class="code-line"><a class="header-anchor-link" href="#overall-process-flow" aria-hidden="true"></a> Overall Process Flow</h3>
<ol data-line="236" class="code-line">
<li data-line="236" class="code-line">
<p data-line="236" class="code-line"><strong>Slack event reception (Go Lambda)</strong>
Events occurring in Slack—such as message posts, file attachments, image uploads, external link insertions, and emoji reactions—are sent to an AWS Lambda function written in Go via the Slack API.
The Go function then analyzes these events and determines the appropriate action based on user intent (e.g., normal chat, translation, Confluence reference).</p>
</li>
<li data-line="240" class="code-line">
<p data-line="240" class="code-line"><strong>Text processing and sanitization on the Go side</strong>
Text is extracted from external links or files and added to the prompt as context. When referencing external links, meaningful tags such as <code>table</code>, <code>ol</code>, <code>ul</code> are preserved while unnecessary tags are removed to optimize token usage.</p>
</li>
<li data-line="243" class="code-line">
<p data-line="243" class="code-line"><strong>LLM queries and RAG searches on the Python side</strong>
If necessary, the Go function invokes a Python Lambda for LLM queries or a separate Python Lambda for RAG searches (e.g., for Confluence references).
For example, if <code>:confluence:</code> is included in the request, the Go function calls the RAG search Lambda. If no index is specified, it defaults to the primary index.
Otherwise, the text is passed to the LLM Lambda for standard query processing.</p>
</li>
<li data-line="248" class="code-line">
<p data-line="248" class="code-line"><strong>Reply and display in Slack</strong>
The Python Lambda returns the generated response to the Go function, which then posts it back to Slack.
This enables users to access advanced features through familiar Slack interactions—such as emojis, keywords, and file attachments—without needing to memorize commands.</p>
</li>
</ol>
<h3 id="function-determination-by-conditional-branching" data-line="252" class="code-line"><a class="header-anchor-link" href="#function-determination-by-conditional-branching" aria-hidden="true"></a> Function Determination by Conditional Branching</h3>
<p data-line="254" class="code-line">Processing is routed based on specific emojis (e.g., <code>:confluence:</code>), keywords, the presence of files or images, and whether an external link is included.
To add new features, simply introduce new conditions on the Go side and, if necessary, extend the logic for invoking the corresponding Python Lambdas (e.g., for LLM or RAG tasks).</p>
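<p class="code-line">As a rough illustration of this kind of conditional routing, the sketch below decides which handler a request should go to. It is written in Python to stay consistent with the other added examples, whereas the article's routing lives in the Go Lambda. The <code>SlackRequest</code> fields, the <code>Route</code> names, the delete reaction name, and the order of the checks are all assumptions, not the actual implementation.</p>

```python
from dataclasses import dataclass, field
from enum import Enum


class Route(Enum):
    TRANSLATE = "translate"
    DELETE = "delete"
    RAG_SEARCH = "rag_search"
    LLM_WITH_CONTEXT = "llm_with_context"
    LLM_CHAT = "llm_chat"


@dataclass
class SlackRequest:
    """Simplified view of an incoming Slack event (hypothetical shape)."""

    text: str = ""
    reaction: str = ""
    file_names: list = field(default_factory=list)
    links: list = field(default_factory=list)


TRANSLATE_PREFIX = "ai_tool_translate_to_"  # matches the reaction shown earlier
DELETE_REACTION = "ai_tool_delete"  # hypothetical reaction name
CONFLUENCE_MARKER = ":confluence:"


def decide_route(req: SlackRequest) -> Route:
    """Pick a route; each new feature would add one more condition here."""
    if req.reaction.startswith(TRANSLATE_PREFIX):
        return Route.TRANSLATE
    if req.reaction == DELETE_REACTION:
        return Route.DELETE
    if CONFLUENCE_MARKER in req.text:
        return Route.RAG_SEARCH
    if req.file_names or req.links:
        return Route.LLM_WITH_CONTEXT
    return Route.LLM_CHAT


route = decide_route(SlackRequest(text=":confluence: How do I apply for leave?"))
```

<p class="code-line">Because the decision is a single ordered chain of checks, the precedence between features (for example, a reaction taking priority over message text) stays visible in one place.</p>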
<h3 id="the-role-of-sanitization" data-line="257" class="code-line"><a class="header-anchor-link" href="#the-role-of-sanitization" aria-hidden="true"></a> The Role of Sanitization</h3>
<p data-line="259" class="code-line">Sanitizing on the Go side removes unnecessary HTML tags to improve token efficiency and ensure clean input for the model.
Key structural elements such as <code>table</code>, <code>ol</code>, and <code>ul</code> are retained to preserve the information structure and maintain useful context for the model.</p>
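<p class="code-line">The idea of dropping most tags while keeping list and table structure can be sketched with Python's standard <code>html.parser</code> module. This is only an illustration of the approach; the article's sanitizer is implemented in Go, and the exact tag sets below are assumptions. In particular, keeping child tags such as <code>li</code>, <code>tr</code>, and <code>td</code>, discarding <code>script</code> and <code>style</code> content, and dropping all attributes are choices made for this sketch. The output is meant as compact prompt text, not as HTML that is safe to render.</p>

```python
import re
from html.parser import HTMLParser

# Structural tags to keep (the article names table, ol, ul; the child tags
# are an assumption so that kept lists and tables remain meaningful).
KEEP_TAGS = {"table", "thead", "tbody", "tr", "th", "td", "ol", "ul", "li"}
# Elements whose text content is discarded entirely (assumption).
SKIP_CONTENT = {"script", "style"}
# Dropped block-level tags are replaced by a space so words do not run together.
BLOCK_TAGS = {"p", "div", "br", "section", "article",
              "h1", "h2", "h3", "h4", "h5", "h6"}


class StructurePreservingSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def _emit_tag(self, tag, closing):
        if tag in KEEP_TAGS:
            # Attributes are dropped on purpose to save tokens.
            self.parts.append(f"</{tag}>" if closing else f"<{tag}>")
        elif tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self.skip_depth += 1
        elif not self.skip_depth:
            self._emit_tag(tag, closing=False)

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth:
            self._emit_tag(tag, closing=True)

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def sanitize_html(html: str) -> str:
    """Strip non-structural tags and collapse whitespace."""
    parser = StructurePreservingSanitizer()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", "".join(parser.parts)).strip()


sample = ('<div class="x"><p>Hello <b>world</b></p><script>alert(1)</script>'
          '<ul><li><a href="#">One</a></li><li>Two</li></ul></div>')
cleaned = sanitize_html(sample)
```

<p class="code-line">Removing attributes and wrapper tags is where most of the token savings come from, while the retained list and table tags still tell the model which items belong together.</p>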
<h3 id="limiting-rag-usage-to-confluence-references" data-line="262" class="code-line"><a class="header-anchor-link" href="#limiting-rag-usage-to-confluence-references" aria-hidden="true"></a> Limiting RAG Usage to Confluence References</h3>
<p data-line="264" class="code-line">RAG search is only performed when explicitly specified with <code>:confluence:</code>.
By default, summarization, translation, and Q&A tasks are handled via direct LLM queries, ensuring RAG logic is triggered only for Confluence references.
Embedding generation for Confluence documents and FAISS index updates run periodically via AWS Step Functions, so that a recently refreshed index is available for queries.</p>
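<p class="code-line">To illustrate how a message might be split into an index name and a question before the RAG search Lambda is called, here is a minimal Python sketch. It assumes the <code>:confluence: index:Index name</code> form described earlier, assumes that index names contain no spaces, and uses a hypothetical default index name <code>primary</code>. The function name is also an assumption; the real parsing happens in the Go Lambda.</p>

```python
import re
from typing import Optional, Tuple

CONFLUENCE_MARKER = ":confluence:"
DEFAULT_INDEX = "primary"  # hypothetical name for the default index
INDEX_PATTERN = re.compile(r"\bindex:(\S+)")  # assumes no spaces in index names


def parse_confluence_request(text: str) -> Optional[Tuple[str, str]]:
    """Return (index_name, question), or None when RAG was not requested."""
    if CONFLUENCE_MARKER not in text:
        return None  # handled as a direct LLM query instead
    remainder = text.replace(CONFLUENCE_MARKER, " ")
    match = INDEX_PATTERN.search(remainder)
    index_name = match.group(1) if match else DEFAULT_INDEX
    question = INDEX_PATTERN.sub(" ", remainder)
    return index_name, " ".join(question.split())


with_index = parse_confluence_request(
    ":confluence: index:hr-rules How do I apply for leave?"
)
without_index = parse_confluence_request(":confluence: What is the expense policy?")
```

<p class="code-line">Returning <code>None</code> for messages without the marker keeps the default path (a direct LLM query) separate from the RAG path.</p>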
<h3 id="considerations-for-scalability-and-maintainability" data-line="268" class="code-line"><a class="header-anchor-link" href="#considerations-for-scalability-and-maintainability" aria-hidden="true"></a> Considerations for Scalability and Maintainability</h3>
<p data-line="270" class="code-line">Conditional branching based on emojis, keywords, or the presence of files/images minimizes the number of code modifications required when introducing new features, enhancing maintainability.
The separation of concerns—where text formatting and sanitization are handled on the Go side, while LLM queries and RAG searches are managed on the Python side—improves code clarity and facilitates future model replacements or additional processing logic.</p>
<p data-line="273" class="code-line">In the next chapter, we will introduce specific code snippets and configuration examples based on these design principles.</p>
<hr data-line="275" class="code-line" />
<h2 id="5.-introduction-to-code-examples-and-configuration-files" data-line="277" class="code-line"><a class="header-anchor-link" href="#5.-introduction-to-code-examples-and-configuration-files" aria-hidden="true"></a> 5. Introduction to Code Examples and Configuration Files</h2>
<p data-line="279" class="code-line">This chapter presents brief implementation examples based on the implementation policy and design concepts explained in Chapter 4.</p>
<p data-line="281" class="code-line">This chapter contains the following sections:</p>
<ul data-line="283" class="code-line">
<li data-line="283" class="code-line">
<p data-line="283" class="code-line">5.1 [Go] Receiving and parsing Slack events
Explains how to use Slack's Events API to receive and process events such as messages and emoji reactions.</p>
</li>
<li data-line="286" class="code-line">
<p data-line="286" class="code-line">5.2 [Go] HTML text sanitization
Provides an example of sanitizing HTML when referencing external links.</p>
</li>
<li data-line="289" class="code-line">
<p data-line="289" class="code-line">5.3 [Python] Example of an LLM query
Shows how to query an LLM using a Python-based Lambda function.</p>
</li>
<li data-line="292" class="code-line">
<p data-line="292" class="code-line">5.4 [Python] Example of a RAG search call
Demonstrates how to perform a RAG search call, such as for Confluence lookups.</p>
</li>
<li data-line="295" class="code-line">
<p data-line="295" class="code-line">5.5 [Python] Embedding and FAISS indexing
Provides an example of Lambda code that periodically embeds Confluence documents and updates the FAISS index.</p>
</li>
</ul>
<h3 id="5.1-%5Bgo%5D-receiving-and-parsing-slack-events" data-line="298" class="code-line"><a class="header-anchor-link" href="#5.1-%5Bgo%5D-receiving-and-parsing-slack-events" aria-hidden="true"></a> 5.1 [Go] Receiving and Parsing Slack Events</h3>
<p data-line="300" class="code-line">This section explains the basic steps for using the Slack Events API to receive and analyze events with Go code on AWS Lambda.
We will also cover the settings on the Slack side (OAuth & Permissions, event subscription) and how to check the scopes required when using the <code>chat.postMessage</code> method (such as <code>chat:write</code>), to clarify the necessary preparations before implementation.</p>
<h4 id="configuration-steps-on-slack-side" data-line="303" class="code-line"><a class="header-anchor-link" href="#configuration-steps-on-slack-side" aria-hidden="true"></a> Configuration Steps on Slack Side</h4>
<ul data-line="305" class="code-line">
<li data-line="305" class="code-line">
<p data-line="305" class="code-line"><strong>Creating an app and checking the App ID</strong>:
Create a new app at <a href="https://api.slack.com/apps" target="_blank" rel="nofollow noopener noreferrer">https://api.slack.com/apps</a>.
Once created, find your App ID (a string starting with <code>A</code>) on the Basic Information page (<code>https://api.slack.com/apps/APP_ID/general</code>, where <code>APP_ID</code> is a unique ID for your app).
This App ID identifies your Slack App and can be used to access the URLs for <code>OAuth & Permissions</code> and <code>Event Subscriptions</code> pages described below.</p>
</li>
<li data-line="310" class="code-line">
<p data-line="310" class="code-line"><strong>Granting scopes via OAuth & Permissions</strong>:
Visit the OAuth & Permissions page (<code>https://api.slack.com/apps/APP_ID/oauth</code>) and add the necessary scopes to Bot Token Scopes.
For example, to post messages to a channel with the <code>chat.postMessage</code> method, check the "Required scopes" entry on its reference page (<a href="https://api.slack.com/methods/chat.postMessage" target="_blank" rel="nofollow noopener noreferrer">https://api.slack.com/methods/chat.postMessage</a>), which shows that <code>chat:write</code> is required.
After granting the scope, click "reinstall your app" to apply the changes to your workspace.</p>
<p data-line="315" class="code-line"><img src="/assets/blog/authors/torii/2024-12-23_ai_tool_slack/images/image.png" alt="Checking required scopes" width="500" /> <em>Checking required scopes</em> <img src="/assets/blog/authors/torii/2024-12-23_ai_tool_slack/images/image-1.png" alt="Setting scope" width="500" /> <em>Setting scope</em></p>
</li>
<li data-line="317" class="code-line">
<p data-line="317" class="code-line"><strong>Enabling the Events API and subscribing to events</strong>:
Enable the Events API on the "Event Subscriptions" page (<code>https://api.slack.com/apps/APP_ID/event-subscriptions</code>) and set the AWS Lambda endpoint described below in "Request URL".
Add the events you want to subscribe to, such as <code>message.channels</code> or <code>reaction_added</code>. This allows Slack to send a notification to the specified URL whenever a subscribed event occurs.</p>
<p data-line="321" class="code-line"><img src="/assets/blog/authors/torii/2024-12-23_ai_tool_slack/images/image-3.png" alt="Event Subscriptions Explanation" width="500" /></p>
</li>
</ul>
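<p>To make the role of the <code>chat:write</code> scope concrete, here is a minimal sketch of how a <code>chat.postMessage</code> request could be assembled with only the Go standard library. The function name <code>newPostMessageRequest</code> is hypothetical, and the sketch only builds the request without sending it. The actual bot may well use a client library instead.</p>

```go
package main

import (
	"bytes"
	"encoding/json"
	"net/http"
)

// postMessageURL is the Web API endpoint for the chat.postMessage method.
const postMessageURL = "https://slack.com/api/chat.postMessage"

// newPostMessageRequest builds, but does not send, a chat.postMessage call.
// botToken is the Bot User OAuth Token issued on the OAuth & Permissions page.
func newPostMessageRequest(botToken, channel, text string) (*http.Request, error) {
	payload, err := json.Marshal(map[string]string{
		"channel": channel,
		"text":    text,
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, postMessageURL, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	// The bot token is sent as a bearer token; Slack checks its scopes per method.
	req.Header.Set("Authorization", "Bearer "+botToken)
	req.Header.Set("Content-Type", "application/json; charset=utf-8")
	return req, nil
}
```

<p>If the token lacks <code>chat:write</code>, Slack answers such a call with <code>"ok": false</code> and the error <code>missing_scope</code>, which is a quick way to notice that the app still needs to be reinstalled after a scope change.</p>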
<h4 id="event-reception-and-analysis-on-aws-lambda" data-line="323" class="code-line"><a class="header-anchor-link" href="#event-reception-and-analysis-on-aws-lambda" aria-hidden="true"></a> Event Reception and Analysis on AWS Lambda</h4>
<p data-line="325" class="code-line">Once the configuration is complete on the Slack side, Slack will send a POST request to AWS Lambda via API Gateway whenever a subscribed event occurs.</p>
<h5 data-line="327" class="code-line">Step 1: Parsing Slack Events</h5>
<p data-line="329" class="code-line">Use the <code>slack-go/slackevents</code> package to parse the received JSON into an <code>EventsAPIEvent</code> structure.
This makes it easier to identify event types, such as URL verification and <code>CallbackEvent</code>.</p>
<div class="code-block-container"><pre class="language-go"><code class="language-go code-line" data-line="332"><span class="token keyword">func</span> <span class="token function">parseSlackEvent</span><span class="token punctuation">(</span>body <span class="token builtin">string</span><span class="token punctuation">)</span> <span class="token punctuation">(</span><span class="token operator">*</span>slackevents<span class="token punctuation">.</span>EventsAPIEvent<span class="token punctuation">,</span> <span class="token builtin">error</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
event<span class="token punctuation">,</span> err <span class="token operator">:=</span> slackevents<span class="token punctuation">.</span><span class="token function">ParseEvent</span><span class="token punctuation">(</span>json<span class="token punctuation">.</span><span class="token function">RawMessage</span><span class="token punctuation">(</span>body<span class="token punctuation">)</span><span class="token punctuation">,</span> slackevents<span class="token punctuation">.</span><span class="token function">OptionNoVerifyToken</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token keyword">if</span> err <span class="token operator">!=</span> <span class="token boolean">nil</span> <span class="token punctuation">{</span>
<span class="token keyword">return</span> <span class="token boolean">nil</span><span class="token punctuation">,</span> fmt<span class="token punctuation">.</span><span class="token function">Errorf</span><span class="token punctuation">(</span><span class="token string">"Failed to parse Slack event: %w"</span><span class="token punctuation">,</span> err<span class="token punctuation">)</span>
<span class="token punctuation">}</span>
<span class="token keyword">return</span> <span class="token operator">&</span>event<span class="token punctuation">,</span> <span class="token boolean">nil</span>
<span class="token punctuation">}</span>
</code></pre></div><span class="embed-block zenn-embedded zenn-embedded-card"><iframe id="zenn-embedded__6567bcfc82b5c" src="https://embed.zenn.studio/card#zenn-embedded__6567bcfc82b5c" data-content="https%3A%2F%2Fgithub.com%2Fslack-go%2Fslack" frameborder="0" scrolling="no" loading="lazy"></iframe></span>
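<p>For readers who want to see what the parser distinguishes, the outer envelope of an Events API request can also be inspected with the standard library alone. The following is a simplified sketch, not a replacement for <code>slackevents.ParseEvent</code>: the <code>envelope</code> type and <code>describeEvent</code> function are hypothetical, and only the fields needed for routing are decoded.</p>

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope holds only the outer fields needed to route an Events API request.
type envelope struct {
	Type  string `json:"type"` // "url_verification" or "event_callback"
	Event struct {
		Type string `json:"type"` // e.g. "app_mention", "message", "reaction_added"
	} `json:"event"`
}

// describeEvent returns a short label that identifies the kind of request.
func describeEvent(body string) (string, error) {
	var env envelope
	if err := json.Unmarshal([]byte(body), &env); err != nil {
		return "", fmt.Errorf("decode Slack envelope: %w", err)
	}
	switch env.Type {
	case "url_verification":
		return "url_verification", nil
	case "event_callback":
		// The inner event type decides which handler branch runs later.
		return "event_callback/" + env.Event.Type, nil
	default:
		return "", fmt.Errorf("unsupported envelope type %q", env.Type)
	}
}
```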
<h5 data-line="344" class="code-line">Step 2: Handling URL Verification Requests</h5>
<p data-line="346" class="code-line">When setting up the integration, Slack will initially send an event with <code>type=url_verification</code>. To verify the URL, simply return the <code>challenge</code> value as received. Once verified, Slack will continue sending event notifications.</p>
<div class="code-block-container"><pre class="language-go"><code class="language-go code-line" data-line="348"><span class="token keyword">func</span> <span class="token function">handleURLVerification</span><span class="token punctuation">(</span>body <span class="token builtin">string</span><span class="token punctuation">)</span> <span class="token punctuation">(</span>events<span class="token punctuation">.</span>APIGatewayProxyResponse<span class="token punctuation">,</span> <span class="token builtin">error</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
<span class="token keyword">var</span> r <span class="token keyword">struct</span> <span class="token punctuation">{</span>
Challenge <span class="token builtin">string</span> <span class="token string">`json:"challenge"`</span>
<span class="token punctuation">}</span>
<span class="token keyword">if</span> err <span class="token operator">:=</span> json<span class="token punctuation">.</span><span class="token function">Unmarshal</span><span class="token punctuation">(</span><span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token function">byte</span><span class="token punctuation">(</span>body<span class="token punctuation">)</span><span class="token punctuation">,</span> <span class="token operator">&</span>r<span class="token punctuation">)</span><span class="token punctuation">;</span> err <span class="token operator">!=</span> <span class="token boolean">nil</span> <span class="token punctuation">{</span>
<span class="token keyword">return</span> <span class="token function">createErrorResponse</span><span class="token punctuation">(</span><span class="token number">400</span><span class="token punctuation">,</span> err<span class="token punctuation">)</span>
<span class="token punctuation">}</span>
<span class="token keyword">return</span> events<span class="token punctuation">.</span>APIGatewayProxyResponse<span class="token punctuation">{</span>
StatusCode<span class="token punctuation">:</span> <span class="token number">200</span><span class="token punctuation">,</span>
Body<span class="token punctuation">:</span> r<span class="token punctuation">.</span>Challenge<span class="token punctuation">,</span>
<span class="token punctuation">}</span><span class="token punctuation">,</span> <span class="token boolean">nil</span>
<span class="token punctuation">}</span>
</code></pre></div><span class="embed-block zenn-embedded zenn-embedded-card"><iframe id="zenn-embedded__90c2c2c44dc2d" src="https://embed.zenn.studio/card#zenn-embedded__90c2c2c44dc2d" data-content="https%3A%2F%2Fapi.slack.com%2Fauthentication%2Fverifying-requests-from-slack" frameborder="0" scrolling="no" loading="lazy"></iframe></span>
<h5 data-line="365" class="code-line">Step 3: Verifying Signatures and Ignoring Retry Requests</h5>
<p data-line="367" class="code-line">Slack signs every request (the <code>X-Slack-Signature</code> header), which allows the receiver to verify its authenticity (implementation omitted).
Additionally, in case of a failure or outage, Slack may resend the request as a retry. The <code>X-Slack-Retry-Num</code> header can be used to identify retry attempts and prevent processing the same event multiple times.</p>
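<p>Since the signature check itself is left out of this article, here is a minimal sketch of what it could look like, following Slack's documented scheme: an HMAC-SHA256 over <code>v0:{timestamp}:{body}</code> keyed with the app's signing secret, compared against the <code>X-Slack-Signature</code> header. The function names and the decision to pass header values and the current time as plain arguments are assumptions made for this example, not the implementation used in the actual bot.</p>

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"strconv"
)

// maxSkewSeconds rejects requests older than five minutes to limit replay attacks.
const maxSkewSeconds = 5 * 60

// computeSlackSignature returns "v0=" followed by the hex-encoded
// HMAC-SHA256 of "v0:<timestamp>:<body>", keyed with the signing secret.
func computeSlackSignature(signingSecret, timestamp, body string) string {
	mac := hmac.New(sha256.New, []byte(signingSecret))
	mac.Write([]byte("v0:" + timestamp + ":" + body))
	return "v0=" + hex.EncodeToString(mac.Sum(nil))
}

// verifySlackSignature checks the values of the X-Slack-Request-Timestamp and
// X-Slack-Signature headers against the raw request body.
// nowUnix is passed in (instead of calling time.Now) to keep the check testable.
func verifySlackSignature(signingSecret, timestamp, body, signature string, nowUnix int64) error {
	ts, err := strconv.ParseInt(timestamp, 10, 64)
	if err != nil {
		return errors.New("invalid timestamp header")
	}
	if diff := nowUnix - ts; diff > maxSkewSeconds || diff < -maxSkewSeconds {
		return errors.New("timestamp outside the allowed window")
	}
	expected := computeSlackSignature(signingSecret, timestamp, body)
	// hmac.Equal compares in constant time, avoiding timing side channels.
	if !hmac.Equal([]byte(expected), []byte(signature)) {
		return errors.New("signature mismatch")
	}
	return nil
}
```

<p>The body passed in must be the raw, unmodified request body, because any re-serialization of the JSON would change the computed signature.</p>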
<div class="code-block-container"><pre class="language-go"><code class="language-go code-line" data-line="370"><span class="token keyword">func</span> <span class="token function">verifySlackRequest</span><span class="token punctuation">(</span>body <span class="token builtin">string</span><span class="token punctuation">,</span> headers http<span class="token punctuation">.</span>Header<span class="token punctuation">)</span> <span class="token builtin">error</span> <span class="token punctuation">{</span>
<span class="token comment">// Signature verification process (omitted)</span>
<span class="token keyword">return</span> <span class="token boolean">nil</span>
<span class="token punctuation">}</span>
<span class="token keyword">func</span> <span class="token function">isSlackRetry</span><span class="token punctuation">(</span>headers http<span class="token punctuation">.</span>Header<span class="token punctuation">)</span> <span class="token builtin">bool</span> <span class="token punctuation">{</span>
<span class="token keyword">return</span> headers<span class="token punctuation">.</span><span class="token function">Get</span><span class="token punctuation">(</span><span class="token string">"X-Slack-Retry-Num"</span><span class="token punctuation">)</span> <span class="token operator">!=</span> <span class="token string">""</span>
<span class="token punctuation">}</span>
<span class="token keyword">func</span> <span class="token function">createIgnoredRetryResponse</span><span class="token punctuation">(</span><span class="token punctuation">)</span> <span class="token punctuation">(</span>events<span class="token punctuation">.</span>APIGatewayProxyResponse<span class="token punctuation">,</span> <span class="token builtin">error</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
responseBody<span class="token punctuation">,</span> <span class="token boolean">_</span> <span class="token operator">:=</span> json<span class="token punctuation">.</span><span class="token function">Marshal</span><span class="token punctuation">(</span><span class="token keyword">map</span><span class="token punctuation">[</span><span class="token builtin">string</span><span class="token punctuation">]</span><span class="token builtin">string</span><span class="token punctuation">{</span><span class="token string">"message"</span><span class="token punctuation">:</span> <span class="token string">"Ignoring Slack retry request"</span><span class="token punctuation">}</span><span class="token punctuation">)</span>
<span class="token keyword">return</span> events<span class="token punctuation">.</span>APIGatewayProxyResponse<span class="token punctuation">{</span>
StatusCode<span class="token punctuation">:</span> <span class="token number">200</span><span class="token punctuation">,</span>
Headers<span class="token punctuation">:</span> <span class="token keyword">map</span><span class="token punctuation">[</span><span class="token builtin">string</span><span class="token punctuation">]</span><span class="token builtin">string</span><span class="token punctuation">{</span><span class="token string">"Content-Type"</span><span class="token punctuation">:</span> <span class="token string">"application/json"</span><span class="token punctuation">}</span><span class="token punctuation">,</span>
Body<span class="token punctuation">:</span> <span class="token function">string</span><span class="token punctuation">(</span>responseBody<span class="token punctuation">)</span><span class="token punctuation">,</span>
<span class="token punctuation">}</span><span class="token punctuation">,</span> <span class="token boolean">nil</span>
<span class="token punctuation">}</span>
</code></pre></div><h5 data-line="390" class="code-line">Step 4: Handling CallbackEvent</h5>
<p data-line="392" class="code-line">A <code>CallbackEvent</code> wraps user actions such as posting a message or adding a reaction. At this stage, the system checks whether <code>:confluence:</code> is included in the message, whether a file is attached, and whether a translation-related emoji is present. Based on this assessment, it proceeds to text processing and Python Lambda invocation, as described in section 5.2 and beyond.</p>
<div class="code-block-container"><pre class="language-go"><code class="language-go code-line" data-line="394"><span class="token comment">// handleCallbackEvent processes callback events (covered in Section 5.1).</span>
<span class="token keyword">func</span> <span class="token function">handleCallbackEvent</span><span class="token punctuation">(</span>ctx context<span class="token punctuation">.</span>Context<span class="token punctuation">,</span> event <span class="token operator">*</span>slackevents<span class="token punctuation">.</span>EventsAPIEvent<span class="token punctuation">)</span> <span class="token punctuation">(</span>events<span class="token punctuation">.</span>APIGatewayProxyResponse<span class="token punctuation">,</span> <span class="token builtin">error</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
innerEvent <span class="token operator">:=</span> event<span class="token punctuation">.</span>InnerEvent
<span class="token keyword">switch</span> innerEvent<span class="token punctuation">.</span>Data<span class="token punctuation">.</span><span class="token punctuation">(</span><span class="token keyword">type</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
<span class="token keyword">case</span> <span class="token operator">*</span>slackevents<span class="token punctuation">.</span>AppMentionEvent<span class="token punctuation">:</span>
<span class="token comment">// Processing for AppMentionEvent (details explained in 5.2)</span>
<span class="token keyword">case</span> <span class="token operator">*</span>slackevents<span class="token punctuation">.</span>MessageEvent<span class="token punctuation">:</span>
<span class="token comment">// Processing for MessageEvent (details explained in 5.2)</span>
<span class="token keyword">case</span> <span class="token operator">*</span>slackevents<span class="token punctuation">.</span>ReactionAddedEvent<span class="token punctuation">:</span>
<span class="token comment">// Processing for ReactionAddedEvent (details explained in 5.2)</span>
<span class="token punctuation">}</span>
<span class="token keyword">return</span> events<span class="token punctuation">.</span>APIGatewayProxyResponse<span class="token punctuation">{</span>Body<span class="token punctuation">:</span> <span class="token string">"OK"</span><span class="token punctuation">,</span> StatusCode<span class="token punctuation">:</span> http<span class="token punctuation">.</span>StatusOK<span class="token punctuation">}</span><span class="token punctuation">,</span> <span class="token boolean">nil</span>
<span class="token punctuation">}</span>
</code></pre></div><h5 data-line="412" class="code-line">Complete Handler Code Example</h5>
<p data-line="414" class="code-line">These steps combine to define an AWS Lambda handler.</p>
<details><summary>Complete code example of the handler</summary><div class="details-content"><div class="code-block-container"><pre class="language-go"><code class="language-go code-line" data-line="418"><span class="token keyword">func</span> <span class="token function">handler</span><span class="token punctuation">(</span>ctx context<span class="token punctuation">.</span>Context<span class="token punctuation">,</span> request events<span class="token punctuation">.</span>APIGatewayProxyRequest<span class="token punctuation">)</span> <span class="token punctuation">(</span>events<span class="token punctuation">.</span>APIGatewayProxyResponse<span class="token punctuation">,</span> <span class="token builtin">error</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
event<span class="token punctuation">,</span> err <span class="token operator">:=</span> <span class="token function">parseSlackEvent</span><span class="token punctuation">(</span>request<span class="token punctuation">.</span>Body<span class="token punctuation">)</span>
<span class="token keyword">if</span> err <span class="token operator">!=</span> <span class="token boolean">nil</span> <span class="token punctuation">{</span>
<span class="token keyword">return</span> <span class="token function">createErrorResponse</span><span class="token punctuation">(</span><span class="token number">400</span><span class="token punctuation">,</span> err<span class="token punctuation">)</span>
<span class="token punctuation">}</span>
<span class="token keyword">if</span> event<span class="token punctuation">.</span>Type <span class="token operator">==</span> slackevents<span class="token punctuation">.</span>URLVerification <span class="token punctuation">{</span>
<span class="token keyword">return</span> <span class="token function">handleURLVerification</span><span class="token punctuation">(</span>request<span class="token punctuation">.</span>Body<span class="token punctuation">)</span>
<span class="token punctuation">}</span>
headers <span class="token operator">:=</span> <span class="token function">convertToHTTPHeader</span><span class="token punctuation">(</span>request<span class="token punctuation">.</span>Headers<span class="token punctuation">)</span>
err <span class="token operator">=</span> <span class="token function">verifySlackRequest</span><span class="token punctuation">(</span>request<span class="token punctuation">.</span>Body<span class="token punctuation">,</span> headers<span class="token punctuation">)</span>
<span class="token keyword">if</span> err <span class="token operator">!=</span> <span class="token boolean">nil</span> <span class="token punctuation">{</span>
<span class="token keyword">return</span> <span class="token function">createErrorResponse</span><span class="token punctuation">(</span>http<span class="token punctuation">.</span>StatusUnauthorized<span class="token punctuation">,</span> fmt<span class="token punctuation">.</span><span class="token function">Errorf</span><span class="token punctuation">(</span><span class="token string">"Failed to validate request: %w"</span><span class="token punctuation">,</span> err<span class="token punctuation">)</span><span class="token punctuation">)</span>
<span class="token punctuation">}</span>
<span class="token keyword">if</span> <span class="token function">isSlackRetry</span><span class="token punctuation">(</span>headers<span class="token punctuation">)</span> <span class="token punctuation">{</span>
<span class="token keyword">return</span> <span class="token function">createIgnoredRetryResponse</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token punctuation">}</span>
<span class="token keyword">if</span> event<span class="token punctuation">.</span>Type <span class="token operator">==</span> slackevents<span class="token punctuation">.</span>CallbackEvent <span class="token punctuation">{</span>
<span class="token keyword">return</span> <span class="token function">handleCallbackEvent</span><span class="token punctuation">(</span>ctx<span class="token punctuation">,</span> event<span class="token punctuation">)</span>
<span class="token punctuation">}</span>
<span class="token keyword">return</span> events<span class="token punctuation">.</span>APIGatewayProxyResponse<span class="token punctuation">{</span>Body<span class="token punctuation">:</span> <span class="token string">"OK"</span><span class="token punctuation">,</span> StatusCode<span class="token punctuation">:</span> <span class="token number">200</span><span class="token punctuation">}</span><span class="token punctuation">,</span> <span class="token boolean">nil</span>
<span class="token punctuation">}</span>
<span class="token keyword">func</span> <span class="token function">convertToHTTPHeader</span><span class="token punctuation">(</span>headers <span class="token keyword">map</span><span class="token punctuation">[</span><span class="token builtin">string</span><span class="token punctuation">]</span><span class="token builtin">string</span><span class="token punctuation">)</span> http<span class="token punctuation">.</span>Header <span class="token punctuation">{</span>
httpHeaders <span class="token operator">:=</span> http<span class="token punctuation">.</span>Header<span class="token punctuation">{</span><span class="token punctuation">}</span>
<span class="token keyword">for</span> key<span class="token punctuation">,</span> value <span class="token operator">:=</span> <span class="token keyword">range</span> headers <span class="token punctuation">{</span>
httpHeaders<span class="token punctuation">.</span><span class="token function">Set</span><span class="token punctuation">(</span>key<span class="token punctuation">,</span> value<span class="token punctuation">)</span>
<span class="token punctuation">}</span>
<span class="token keyword">return</span> httpHeaders
<span class="token punctuation">}</span>
<span class="token keyword">func</span> <span class="token function">createErrorResponse</span><span class="token punctuation">(</span>statusCode <span class="token builtin">int</span><span class="token punctuation">,</span> err <span class="token builtin">error</span><span class="token punctuation">)</span> <span class="token punctuation">(</span>events<span class="token punctuation">.</span>APIGatewayProxyResponse<span class="token punctuation">,</span> <span class="token builtin">error</span><span class="token punctuation">)</span> <span class="token punctuation">{</span>
responseBody<span class="token punctuation">,</span> <span class="token boolean">_</span> <span class="token operator">:=</span> json<span class="token punctuation">.</span><span class="token function">Marshal</span><span class="token punctuation">(</span><span class="token keyword">map</span><span class="token punctuation">[</span><span class="token builtin">string</span><span class="token punctuation">]</span><span class="token builtin">string</span><span class="token punctuation">{</span><span class="token string">"error"</span><span class="token punctuation">:</span> err<span class="token punctuation">.</span><span class="token function">Error</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">}</span><span class="token punctuation">)</span>
<span class="token keyword">return</span> events<span class="token punctuation">.</span>APIGatewayProxyResponse<span class="token punctuation">{</span>
StatusCode<span class="token punctuation">:</span> statusCode<span class="token punctuation">,</span>
Headers<span class="token punctuation">:</span> <span class="token keyword">map</span><span class="token punctuation">[</span><span class="token builtin">string</span><span class="token punctuation">]</span><span class="token builtin">string</span><span class="token punctuation">{</span><span class="token string">"Content-Type"</span><span class="token punctuation">:</span> <span class="token string">"application/json"</span><span class="token punctuation">}</span><span class="token punctuation">,</span>
Body<span class="token punctuation">:</span> <span class="token function">string</span><span class="token punctuation">(</span>responseBody<span class="token punctuation">)</span><span class="token punctuation">,</span>
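<span class="token punctuation">}</span><span class="token punctuation">,</span> <span class="token boolean">nil</span> <span class="token comment">// Return a nil error so that API Gateway relays this response; a non-nil error would surface as a 502</span>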
<span class="token punctuation">}</span>
</code></pre></div></div></details>
<h4 id="summary-of-5.1" data-line="466" class="code-line"><a class="header-anchor-link" href="#summary-of-5.1" aria-hidden="true"></a> Summary of 5.1</h4>
<p data-line="468" class="code-line">In this section, we explained how to find the Slack app's App ID, grant scopes under OAuth & Permissions, and subscribe to events under Event Subscriptions. We also covered receiving and parsing Slack events, including URL verification, signature verification, ignoring retry requests, and processing CallbackEvents.
From section 5.2 onwards, we will introduce specific examples of CallbackEvent processing, text handling in Go, and sending queries to Python Lambda.</p>
<h3 id="5.2-%5Bgo%5D-html-text-sanitization" data-line="471" class="code-line"><a class="header-anchor-link" href="#5.2-%5Bgo%5D-html-text-sanitization" aria-hidden="true"></a> 5.2 [Go] HTML Text Sanitization</h3>
<h4 id="sanitizing-external-link-references" data-line="473" class="code-line"><a class="header-anchor-link" href="#sanitizing-external-link-references" aria-hidden="true"></a> Sanitizing External Link References</h4>
<p data-line="475" class="code-line">HTML text retrieved from external links may contain unnecessary tags such as <code>script</code> and <code>style</code>, which are not needed for generating responses. Passing this directly to the LLM increases the token count, leading to higher model costs and potentially reducing response accuracy. The following code uses the <code>bluemonday</code> package for basic sanitization. It removes unnecessary tags while preserving important ones like <code>table</code>, <code>ol</code>, and <code>ul</code>, ensuring the text remains well-structured and readable. Additionally, the <code>addNewlinesForTags</code> function inserts line breaks after specific tags, improving text formatting. This helps optimize queries to the model by ensuring that only the necessary information is passed in a structured and efficient format.</p>
<span class="embed-block zenn-embedded zenn-embedded-card"><iframe id="zenn-embedded__b53cd48133a09" src="https://embed.zenn.studio/card#zenn-embedded__b53cd48133a09" data-content="https%3A%2F%2Fgithub.com%2Fmicrocosm-cc%2Fbluemonday" frameborder="0" scrolling="no" loading="lazy"></iframe></span>
<div class="code-block-container"><pre class="language-go"><code class="language-go code-line" data-line="479"><span class="token keyword">func</span> <span class="token function">sanitizeContent</span><span class="token punctuation">(</span>htmlContent <span class="token builtin">string</span><span class="token punctuation">)</span> <span class="token builtin">string</span> <span class="token punctuation">{</span>
<span class="token comment">// Basic sanitization</span>
ugcPolicy <span class="token operator">:=</span> bluemonday<span class="token punctuation">.</span><span class="token function">UGCPolicy</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
sanitized <span class="token operator">:=</span> ugcPolicy<span class="token punctuation">.</span><span class="token function">Sanitize</span><span class="token punctuation">(</span>htmlContent<span class="token punctuation">)</span>
<span class="token comment">// Allow specific tags in a custom policy</span>
customPolicy <span class="token operator">:=</span> bluemonday<span class="token punctuation">.</span><span class="token function">NewPolicy</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
customPolicy<span class="token punctuation">.</span><span class="token function">AllowLists</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
customPolicy<span class="token punctuation">.</span><span class="token function">AllowTables</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
customPolicy<span class="token punctuation">.</span><span class="token function">AllowAttrs</span><span class="token punctuation">(</span><span class="token string">"href"</span><span class="token punctuation">)</span><span class="token punctuation">.</span><span class="token function">OnElements</span><span class="token punctuation">(</span><span class="token string">"a"</span><span class="token punctuation">)</span>
<span class="token comment">// Add line breaks after specific tags to improve readability</span>
formattedContent <span class="token operator">:=</span> <span class="token function">addNewlinesForTags</span><span class="token punctuation">(</span>sanitized<span class="token punctuation">,</span> <span class="token string">"p"</span><span class="token punctuation">)</span>
<span class="token comment">// Apply final sanitization after enforcing the custom policy</span>
finalContent <span class="token operator">:=</span> customPolicy<span class="token punctuation">.</span><span class="token function">Sanitize</span><span class="token punctuation">(</span>formattedContent<span class="token punctuation">)</span>
<span class="token keyword">return</span> finalContent
<span class="token punctuation">}</span>
<span class="token keyword">func</span> <span class="token function">addNewlinesForTags</span><span class="token punctuation">(</span>htmlStr <span class="token builtin">string</span><span class="token punctuation">,</span> tags <span class="token operator">...</span><span class="token builtin">string</span><span class="token punctuation">)</span> <span class="token builtin">string</span> <span class="token punctuation">{</span>
    <span class="token keyword">for</span> <span class="token boolean">_</span><span class="token punctuation">,</span> tag <span class="token operator">:=</span> <span class="token keyword">range</span> tags <span class="token punctuation">{</span>
        closeTag <span class="token operator">:=</span> fmt<span class="token punctuation">.</span><span class="token function">Sprintf</span><span class="token punctuation">(</span><span class="token string">"&lt;/%s&gt;"</span><span class="token punctuation">,</span> tag<span class="token punctuation">)</span>
        htmlStr <span class="token operator">=</span> strings<span class="token punctuation">.</span><span class="token function">ReplaceAll</span><span class="token punctuation">(</span>htmlStr<span class="token punctuation">,</span> closeTag<span class="token punctuation">,</span> closeTag<span class="token operator">+</span><span class="token string">"\n"</span><span class="token punctuation">)</span>
    <span class="token punctuation">}</span>
    <span class="token keyword">return</span> htmlStr
<span class="token punctuation">}</span>
</code></pre></div><p data-line="509" class="code-line">This process strips unnecessary tags before the text reaches the model, which reduces token usage and improves response accuracy. Essential structures such as tables and bullet points are preserved, and line breaks are inserted after specific tags so that the model can better interpret the provided context.</p>
<h3 id="5.3-%5Bpython%5D-example-of-an-llm-query" data-line="511" class="code-line"><a class="header-anchor-link" href="#5.3-%5Bpython%5D-example-of-an-llm-query" aria-hidden="true"></a> 5.3 [Python] Example of an LLM Query</h3>
<p data-line="513" class="code-line">Below is an example of querying an LLM (e.g., Azure OpenAI) in Python. <code>OpenAIClientFactory</code> lets you switch models and endpoints dynamically, so multiple Lambda handlers can share a single client creation routine.</p>
<h4 id="client-creation-process" data-line="515" class="code-line"><a class="header-anchor-link" href="#client-creation-process" aria-hidden="true"></a> Client Creation Process</h4>
<p data-line="517" class="code-line"><code>OpenAIClientFactory</code> generates a client for either Azure OpenAI or OpenAI, depending on the <code>api_type</code> stored in the secret. In the excerpt below, the <code>region</code> argument selects which Azure API key, endpoint, and API version are read.
Because API keys and endpoints are retrieved from environment variables and secret management services, updating models or configurations requires few code changes.</p>
<div class="code-block-container"><pre class="language-python"><code class="language-python code-line" data-line="520"><span class="token keyword">import</span> openai
<span class="token keyword">from</span> shared<span class="token punctuation">.</span>secrets <span class="token keyword">import</span> get_secret
<span class="token keyword">class</span> <span class="token class-name">OpenAIClientFactory</span><span class="token punctuation">:</span>
    <span class="token decorator annotation punctuation">@staticmethod</span>
    <span class="token keyword">def</span> <span class="token function">create_client</span><span class="token punctuation">(</span>region<span class="token operator">=</span><span class="token string">"eastus2"</span><span class="token punctuation">,</span> model<span class="token operator">=</span><span class="token string">"gpt-4o"</span><span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">></span> openai<span class="token punctuation">.</span>OpenAI<span class="token punctuation">:</span>
        secret <span class="token operator">=</span> get_secret<span class="token punctuation">(</span><span class="token punctuation">)</span>
        api_type <span class="token operator">=</span> secret<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"openai_api_type"</span><span class="token punctuation">,</span> <span class="token string">"azure"</span><span class="token punctuation">)</span>
        <span class="token keyword">if</span> api_type <span class="token operator">==</span> <span class="token string">"azure"</span><span class="token punctuation">:</span>
            <span class="token keyword">return</span> openai<span class="token punctuation">.</span>AzureOpenAI<span class="token punctuation">(</span>
                api_key<span class="token operator">=</span>secret<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string-interpolation"><span class="token string">f"azure_openai_api_key_</span><span class="token interpolation"><span class="token punctuation">{</span>region<span class="token punctuation">}</span></span><span class="token string">"</span></span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                azure_endpoint<span class="token operator">=</span>secret<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string-interpolation"><span class="token string">f"azure_openai_endpoint_</span><span class="token interpolation"><span class="token punctuation">{</span>region<span class="token punctuation">}</span></span><span class="token string">"</span></span><span class="token punctuation">)</span><span class="token punctuation">,</span>
                api_version<span class="token operator">=</span>secret<span class="token punctuation">.</span>get<span class="token punctuation">(</span>
                    <span class="token string-interpolation"><span class="token string">f"azure_openai_api_version_</span><span class="token interpolation"><span class="token punctuation">{</span>region<span class="token punctuation">}</span></span><span class="token string">"</span></span><span class="token punctuation">,</span> <span class="token string">"2024-07-01-preview"</span>
                <span class="token punctuation">)</span><span class="token punctuation">,</span>
            <span class="token punctuation">)</span>
        <span class="token keyword">elif</span> api_type <span class="token operator">==</span> <span class="token string">"openai"</span><span class="token punctuation">:</span>
            <span class="token keyword">return</span> openai<span class="token punctuation">.</span>OpenAI<span class="token punctuation">(</span>api_key<span class="token operator">=</span>secret<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"openai_api_key"</span><span class="token punctuation">)</span><span class="token punctuation">)</span>
        <span class="token keyword">raise</span> ValueError<span class="token punctuation">(</span><span class="token string-interpolation"><span class="token string">f"Invalid api_type: </span><span class="token interpolation"><span class="token punctuation">{</span>api_type<span class="token punctuation">}</span></span><span class="token string">"</span></span><span class="token punctuation">)</span>
</code></pre></div><h4 id="llm-query-processing" data-line="544" class="code-line"><a class="header-anchor-link" href="#llm-query-processing" aria-hidden="true"></a> LLM Query Processing</h4>
<p data-line="546" class="code-line">The <code>chatCompletionHandler</code> function extracts <code>messages</code>, <code>model</code>, <code>temperature</code>, and other parameters from the JSON body of the HTTP request. It then queries the LLM using the client generated by <code>OpenAIClientFactory</code>.
Responses are returned in JSON format. If an error occurs, a common error handling function generates a properly formatted error response; that part is omitted from the excerpt below.</p>
<div class="code-block-container"><pre class="language-python"><code class="language-python code-line" data-line="549"><span class="token keyword">import</span> json
<span class="token keyword">from</span> typing <span class="token keyword">import</span> Any<span class="token punctuation">,</span> Dict<span class="token punctuation">,</span> List
<span class="token keyword">import</span> openai
<span class="token keyword">from</span> openai<span class="token punctuation">.</span>types<span class="token punctuation">.</span>chat <span class="token keyword">import</span> ChatCompletionMessageParam
<span class="token keyword">from</span> shared<span class="token punctuation">.</span>openai_client <span class="token keyword">import</span> OpenAIClientFactory
<span class="token keyword">def</span> <span class="token function">chatCompletionHandler</span><span class="token punctuation">(</span>event<span class="token punctuation">:</span> Dict<span class="token punctuation">[</span><span class="token builtin">str</span><span class="token punctuation">,</span> Any<span class="token punctuation">]</span><span class="token punctuation">,</span> context<span class="token punctuation">:</span> Any<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">></span> Dict<span class="token punctuation">[</span><span class="token builtin">str</span><span class="token punctuation">,</span> Any<span class="token punctuation">]</span><span class="token punctuation">:</span>
    request_body <span class="token operator">=</span> json<span class="token punctuation">.</span>loads<span class="token punctuation">(</span>event<span class="token punctuation">[</span><span class="token string">"body"</span><span class="token punctuation">]</span><span class="token punctuation">)</span>
    messages<span class="token punctuation">:</span> List<span class="token punctuation">[</span>ChatCompletionMessageParam<span class="token punctuation">]</span> <span class="token operator">=</span> request_body<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"messages"</span><span class="token punctuation">,</span> <span class="token punctuation">[</span><span class="token punctuation">]</span><span class="token punctuation">)</span>
    model <span class="token operator">=</span> request_body<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"model"</span><span class="token punctuation">,</span> <span class="token string">"gpt-4o"</span><span class="token punctuation">)</span>
    client <span class="token operator">=</span> OpenAIClientFactory<span class="token punctuation">.</span>create_client<span class="token punctuation">(</span>model<span class="token operator">=</span>model<span class="token punctuation">)</span>
    temperature <span class="token operator">=</span> request_body<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"temperature"</span><span class="token punctuation">,</span> <span class="token number">0.7</span><span class="token punctuation">)</span>
    max_tokens <span class="token operator">=</span> request_body<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"max_tokens"</span><span class="token punctuation">,</span> <span class="token number">4000</span><span class="token punctuation">)</span>
    response_format <span class="token operator">=</span> request_body<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"response_format"</span><span class="token punctuation">,</span> <span class="token boolean">None</span><span class="token punctuation">)</span>
    completion <span class="token operator">=</span> client<span class="token punctuation">.</span>chat<span class="token punctuation">.</span>completions<span class="token punctuation">.</span>create<span class="token punctuation">(</span>
        model<span class="token operator">=</span>model<span class="token punctuation">,</span>
        stream<span class="token operator">=</span><span class="token boolean">False</span><span class="token punctuation">,</span>
        messages<span class="token operator">=</span>messages<span class="token punctuation">,</span>
        max_tokens<span class="token operator">=</span>max_tokens<span class="token punctuation">,</span>
        frequency_penalty<span class="token operator">=</span><span class="token number">0</span><span class="token punctuation">,</span>
        presence_penalty<span class="token operator">=</span><span class="token number">0</span><span class="token punctuation">,</span>
        temperature<span class="token operator">=</span>temperature<span class="token punctuation">,</span>
        response_format<span class="token operator">=</span>response_format<span class="token punctuation">,</span>
    <span class="token punctuation">)</span>
    <span class="token keyword">return</span> <span class="token punctuation">{</span>
        <span class="token string">"statusCode"</span><span class="token punctuation">:</span> <span class="token number">200</span><span class="token punctuation">,</span>
        <span class="token string">"body"</span><span class="token punctuation">:</span> json<span class="token punctuation">.</span>dumps<span class="token punctuation">(</span>completion<span class="token punctuation">.</span>to_dict<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
        <span class="token string">"headers"</span><span class="token punctuation">:</span> <span class="token punctuation">{</span>
            <span class="token string">"Content-Type"</span><span class="token punctuation">:</span> <span class="token string">"application/json"</span><span class="token punctuation">,</span>
            <span class="token string">"Access-Control-Allow-Origin"</span><span class="token punctuation">:</span> <span class="token string">"*"</span><span class="token punctuation">,</span>
            <span class="token string">"Access-Control-Allow-Methods"</span><span class="token punctuation">:</span> <span class="token string">"OPTIONS,POST"</span><span class="token punctuation">,</span>
            <span class="token string">"Access-Control-Allow-Headers"</span><span class="token punctuation">:</span> <span class="token string">"Content-Type,X-Amz-Date,Authorization,X-Api-Key,X-Amz-Security-Token"</span><span class="token punctuation">,</span>
        <span class="token punctuation">}</span><span class="token punctuation">,</span>
    <span class="token punctuation">}</span>
</code></pre></div><p data-line="590" class="code-line">This mechanism allows different Lambda handlers to make LLM queries using the same procedure, which keeps the code flexible when models or endpoints change.</p>
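<p class="code-line">The common error handling function is not shown in this article. As a rough illustration only, the sketch below wraps a Lambda handler in a decorator that turns exceptions into JSON error responses. The names <code>with_error_handling</code>, <code>error_response</code>, and <code>echo_handler</code>, the choice of status codes, and the error body shape are all assumptions made for this example, not the actual implementation.</p>

```python
import json
from functools import wraps
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any], Any], Dict[str, Any]]

# Hypothetical shared headers; the real handlers may send a different set.
JSON_HEADERS = {
    "Content-Type": "application/json",
    "Access-Control-Allow-Origin": "*",
}


def error_response(status_code: int, message: str) -> Dict[str, Any]:
    """Build an API Gateway style error response with a JSON body."""
    return {
        "statusCode": status_code,
        "body": json.dumps({"error": message}),
        "headers": JSON_HEADERS,
    }


def with_error_handling(handler: Handler) -> Handler:
    """Wrap a handler so that exceptions become JSON error responses."""

    @wraps(handler)
    def wrapper(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
        try:
            return handler(event, context)
        except (KeyError, TypeError, json.JSONDecodeError) as exc:
            # Missing or malformed request body: treat as a client error.
            return error_response(400, f"Invalid request: {exc}")
        except Exception:
            # Anything else: hide details from the caller.
            return error_response(500, "Internal server error")

    return wrapper


@with_error_handling
def echo_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Stand-in handler that parses the body and echoes it back."""
    body = json.loads(event["body"])
    return {"statusCode": 200, "body": json.dumps(body), "headers": JSON_HEADERS}
```

<p class="code-line">Applying the same decorator to every handler keeps the response shape consistent, whichever handler fails.</p>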
<h3 id="5.4-%5Bpython%5D-example-of-a-rag-search-call" data-line="592" class="code-line"><a class="header-anchor-link" href="#5.4-%5Bpython%5D-example-of-a-rag-search-call" aria-hidden="true"></a> 5.4 [Python] Example of a RAG Search Call</h3>
<p data-line="594" class="code-line">This section shows how to perform a Retrieval-Augmented Generation (RAG) search in Python.
By vectorizing internal knowledge, such as Confluence documents, and running similarity searches against a FAISS index, highly relevant information can be incorporated into LLM responses.</p>
<p data-line="597" class="code-line">A key consideration is the handling of the <code>faiss</code> library. <code>faiss</code> is a large package and can exceed the size limit of Lambda layers. Common workarounds are to use EFS or to containerize the Lambda function.
To keep deployment simple, the <code>setup_faiss</code> function instead downloads and extracts <code>faiss</code> from S3 at runtime, then adds it to <code>sys.path</code>, making <code>faiss</code> importable.</p>
<h4 id="what-is-faiss%3F" data-line="600" class="code-line"><a class="header-anchor-link" href="#what-is-faiss%3F" aria-hidden="true"></a> What is FAISS?</h4>
<p data-line="602" class="code-line">FAISS (Facebook AI Similarity Search) is an approximate nearest neighbor search library developed by Meta (Facebook). It provides tools for creating indexes to efficiently search for similar images and text.</p>
<span class="embed-block zenn-embedded zenn-embedded-card"><iframe id="zenn-embedded__5014caebd59cf" src="https://embed.zenn.studio/card#zenn-embedded__5014caebd59cf" data-content="https%3A%2F%2Fnote.com%2Fnpaka%2Fn%2Fnb766e344a4fc" frameborder="0" scrolling="no" loading="lazy"></iframe></span>
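<p class="code-line">To make the idea of similarity search concrete, here is a brute-force nearest neighbor search in plain Python. It does not use the FAISS API. It only shows the operation that a FAISS index performs far more efficiently on large collections. The function names and the toy two-dimensional vectors are made up for this example; real embeddings have hundreds or thousands of dimensions.</p>

```python
import math
from typing import List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def brute_force_search(
    query: Sequence[float], vectors: Sequence[Sequence[float]], k: int
) -> List[Tuple[int, float]]:
    """Return (index, distance) pairs for the k vectors closest to the query."""
    scored = [(i, l2_distance(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


# Toy "document embeddings" (hypothetical values).
docs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
hits = brute_force_search([0.9, 0.1], docs, k=2)
nearest_ids = [index for index, _ in hits]
```

<p class="code-line">This version compares the query against every stored vector, so its cost grows linearly with the number of documents. Index structures such as those in FAISS exist to avoid that full scan.</p>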
<h4 id="faiss-setup-using-the-setup_faiss-function" data-line="606" class="code-line"><a class="header-anchor-link" href="#faiss-setup-using-the-setup_faiss-function" aria-hidden="true"></a> FAISS Setup Using the <code>setup_faiss</code> Function</h4>
<p data-line="608" class="code-line">To use FAISS in the Lambda environment, the package is built, stored, and loaded in the following steps:</p>
<ol data-line="610" class="code-line">
<li data-line="610" class="code-line">
<p data-line="610" class="code-line"><strong>Build and archive the faiss package in a local/CI environment</strong>
Developers install the <code>faiss-cpu</code> package in a CI environment such as GitHub Actions and package the necessary binaries into a <code>tar.gz</code> archive.</p>
</li>
<li data-line="613" class="code-line">
<p data-line="613" class="code-line"><strong>Upload to S3</strong>
The archived <code>faiss_package.tar.gz</code> is uploaded to an S3 bucket.
By storing the package in an appropriate bucket and path (e.g., for staging or production), the Lambda function can dynamically retrieve it during execution.</p>
</li>
<li data-line="617" class="code-line">
<p data-line="617" class="code-line"><strong>Dynamic loading with <code>setup_faiss</code> when running Lambda</strong>
In the Lambda execution environment, the <code>setup_faiss</code> function downloads and extracts <code>faiss_package.tar.gz</code> from S3 at startup and adds it to <code>sys.path</code>.
This enables the Lambda function to run <code>import faiss</code>, allowing for efficient vector searches using embeddings.</p>
</li>
</ol>
<h5 data-line="621" class="code-line">Example: Uploading the FAISS Package to S3 Using GitHub Actions</h5>
<p data-line="623" class="code-line">The following GitHub Actions workflow demonstrates how to install <code>faiss-cpu</code>, package it for Lambda use, and upload it to S3.</p>
<p data-line="625" class="code-line">This setup uses GitHub Actions Secrets and Environment Variables to manage AWS credentials and S3 bucket names securely, avoiding hardcoded values.</p>
<span class="embed-block zenn-embedded zenn-embedded-card"><iframe id="zenn-embedded__7d84913159fbb" src="https://embed.zenn.studio/card#zenn-embedded__7d84913159fbb" data-content="https%3A%2F%2Fdocs.github.com%2Fja%2Factions%2Fsecurity-for-github-actions%2Fsecurity-guides%2Fusing-secrets-in-github-actions%23creating-secrets-for-an-environment" frameborder="0" scrolling="no" loading="lazy"></iframe></span>
<div class="code-block-container"><pre class="language-yaml"><code class="language-yaml code-line" data-line="629"><span class="token key atrule">name</span><span class="token punctuation">:</span> Build and Upload FAISS
<span class="token key atrule">on</span><span class="token punctuation">:</span>
  <span class="token key atrule">workflow_dispatch</span><span class="token punctuation">:</span>
    <span class="token key atrule">inputs</span><span class="token punctuation">:</span>
      <span class="token key atrule">environment</span><span class="token punctuation">:</span>
        <span class="token key atrule">description</span><span class="token punctuation">:</span> Deployment Environment
        <span class="token key atrule">type</span><span class="token punctuation">:</span> environment
        <span class="token key atrule">default</span><span class="token punctuation">:</span> dev
<span class="token key atrule">jobs</span><span class="token punctuation">:</span>
  <span class="token key atrule">build-and-upload-faiss</span><span class="token punctuation">:</span>
    <span class="token key atrule">environment</span><span class="token punctuation">:</span> $<span class="token punctuation">{</span><span class="token punctuation">{</span> inputs.environment <span class="token punctuation">}</span><span class="token punctuation">}</span>
    <span class="token key atrule">runs-on</span><span class="token punctuation">:</span> ubuntu<span class="token punctuation">-</span>latest
    <span class="token key atrule">steps</span><span class="token punctuation">:</span>
      <span class="token punctuation">-</span> <span class="token key atrule">uses</span><span class="token punctuation">:</span> actions/checkout@v4
      <span class="token punctuation">-</span> <span class="token key atrule">uses</span><span class="token punctuation">:</span> actions/setup<span class="token punctuation">-</span>python@v5
        <span class="token key atrule">with</span><span class="token punctuation">:</span>
          <span class="token key atrule">python-version</span><span class="token punctuation">:</span> <span class="token string">"3.11"</span>
      <span class="token comment"># Install required packages (faiss-cpu)</span>
      <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Install faiss<span class="token punctuation">-</span>cpu
        <span class="token key atrule">run</span><span class="token punctuation">:</span> <span class="token punctuation">|</span><span class="token scalar string">
          set -e
          echo "Installing faiss-cpu..."
          pip install faiss-cpu --no-deps</span>
      <span class="token comment"># Archive the faiss binary</span>
      <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Archive faiss binaries
        <span class="token key atrule">run</span><span class="token punctuation">:</span> <span class="token punctuation">|</span><span class="token scalar string">
          mkdir -p faiss_package
          pip install --target=faiss_package faiss-cpu
          tar -czvf faiss_package.tar.gz faiss_package</span>
      <span class="token comment"># Set AWS credentials (configure Secrets or Roles based on your environment)</span>
      <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Configure AWS credentials
        <span class="token key atrule">uses</span><span class="token punctuation">:</span> aws<span class="token punctuation">-</span>actions/configure<span class="token punctuation">-</span>aws<span class="token punctuation">-</span>credentials@v3
        <span class="token key atrule">with</span><span class="token punctuation">:</span>
          <span class="token key atrule">aws-access-key-id</span><span class="token punctuation">:</span> $<span class="token punctuation">{</span><span class="token punctuation">{</span> secrets.CICD_AWS_ACCESS_KEY_ID <span class="token punctuation">}</span><span class="token punctuation">}</span>
          <span class="token key atrule">aws-secret-access-key</span><span class="token punctuation">:</span> $<span class="token punctuation">{</span><span class="token punctuation">{</span> secrets.CICD_AWS_SECRET_ACCESS_KEY <span class="token punctuation">}</span><span class="token punctuation">}</span>
          <span class="token key atrule">aws-region</span><span class="token punctuation">:</span> ap<span class="token punctuation">-</span>northeast<span class="token punctuation">-</span><span class="token number">1</span>
      <span class="token comment"># Upload to S3</span>
      <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> Upload faiss binaries to S3
        <span class="token key atrule">run</span><span class="token punctuation">:</span> <span class="token punctuation">|</span><span class="token scalar string">
          echo "Uploading faiss_package.tar.gz to S3..."
          aws s3 cp faiss_package.tar.gz s3://${{ secrets.AWS_S3_BUCKET }}/lambda/faiss_package.tar.gz
          echo "Upload complete."</span>
</code></pre></div><p data-line="680" class="code-line">In the above example, <code>faiss_package.tar.gz</code> is uploaded to S3 with the key <code>lambda/faiss_package.tar.gz</code>.</p>
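<p class="code-line">One detail of the archive matters when it is later unpacked: <code>tar -czvf faiss_package.tar.gz faiss_package</code> stores every file under a top-level <code>faiss_package/</code> directory. The sketch below reproduces that layout locally with a dummy module standing in for the real faiss binaries, then extracts it with the leading directory stripped and imports the module from the extracted path. The module name <code>dummy_faiss</code> and all function names are hypothetical. No S3 access or faiss installation is involved.</p>

```python
import importlib
import os
import sys
import tarfile
import tempfile
from types import ModuleType


def build_archive(work_dir: str) -> str:
    """Mimic the CI step: create faiss_package/ and archive it as tar.gz."""
    package_dir = os.path.join(work_dir, "faiss_package")
    os.makedirs(package_dir)
    # Stand-in for the real faiss files (hypothetical module).
    with open(os.path.join(package_dir, "dummy_faiss.py"), "w") as f:
        f.write("VERSION = 'dummy'\n")
    archive_path = os.path.join(work_dir, "faiss_package.tar.gz")
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(package_dir, arcname="faiss_package")
    return archive_path


def extract_without_top_dir(archive_path: str, extract_path: str) -> None:
    """Extract the archive, dropping the leading faiss_package/ component."""
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            parts = member.name.split("/", 1)
            if len(parts) == 1:
                continue  # skip the top-level directory entry itself
            member.name = parts[1]
            tar.extract(member, extract_path)


def load_dummy_module() -> ModuleType:
    """Round trip: archive, extract, extend sys.path, then import."""
    work_dir = tempfile.mkdtemp()
    archive_path = build_archive(work_dir)
    extract_path = os.path.join(work_dir, "extracted")
    extract_without_top_dir(archive_path, extract_path)
    sys.path.insert(0, extract_path)
    importlib.invalidate_caches()
    return importlib.import_module("dummy_faiss")
```

<p class="code-line">Because the prefix is stripped during extraction, the module sits directly under the extract path, which is why adding that single directory to <code>sys.path</code> is enough.</p>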
<h4 id="dynamic-loading-process-on-lambda-side-(setup_faiss-function)" data-line="682" class="code-line"><a class="header-anchor-link" href="#dynamic-loading-process-on-lambda-side-(setup_faiss-function)" aria-hidden="true"></a> Dynamic Loading Process on the Lambda Side (<code>setup_faiss</code> Function)</h4>
<p data-line="684" class="code-line">The <code>setup_faiss</code> function handles the dynamic loading of FAISS at runtime. If <code>faiss</code> cannot be imported, it downloads <code>faiss_package.tar.gz</code> from S3, extracts it to the <code>/tmp</code> directory, and inserts the package path at the front of <code>sys.path</code>. This enables <code>import faiss</code> to be executed within Lambda, allowing FAISS index lookups to be performed.</p>
<div class="code-block-container"><pre class="language-python"><code class="language-python code-line" data-line="686"><span class="token comment"># setup_faiss example: Download the FAISS package from S3 and add it to sys.path</span>
<span class="token keyword">import</span> os
<span class="token keyword">import</span> sys
<span class="token keyword">import</span> tarfile
<span class="token keyword">from</span> shared<span class="token punctuation">.</span>logger <span class="token keyword">import</span> getLogger
<span class="token keyword">from</span> shared<span class="token punctuation">.</span>s3_client <span class="token keyword">import</span> S3Client
logger <span class="token operator">=</span> getLogger<span class="token punctuation">(</span>__name__<span class="token punctuation">)</span>
<span class="token keyword">def</span> <span class="token function">setup_faiss</span><span class="token punctuation">(</span>s3_client<span class="token punctuation">:</span> S3Client<span class="token punctuation">,</span> s3_bucket<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">></span> <span class="token boolean">None</span><span class="token punctuation">:</span>
    <span class="token keyword">try</span><span class="token punctuation">:</span>
        <span class="token keyword">import</span> faiss
        logger<span class="token punctuation">.</span>info<span class="token punctuation">(</span><span class="token string">"faiss has already been imported."</span><span class="token punctuation">)</span>
    <span class="token keyword">except</span> ImportError<span class="token punctuation">:</span>
        logger<span class="token punctuation">.</span>info<span class="token punctuation">(</span><span class="token string">"faiss not found. Downloading from S3."</span><span class="token punctuation">)</span>
        faiss_package_key <span class="token operator">=</span> <span class="token string">"lambda/faiss_package.tar.gz"</span>
        faiss_package_path <span class="token operator">=</span> <span class="token string">"/tmp/faiss_package.tar.gz"</span>
        faiss_extract_path <span class="token operator">=</span> <span class="token string">"/tmp/faiss_package"</span>
        <span class="token comment"># Download the package from S3 and extract it</span>
        s3_client<span class="token punctuation">.</span>download_file<span class="token punctuation">(</span>bucket_name<span class="token operator">=</span>s3_bucket<span class="token punctuation">,</span> key<span class="token operator">=</span>faiss_package_key<span class="token punctuation">,</span> file_path<span class="token operator">=</span>faiss_package_path<span class="token punctuation">)</span>
        <span class="token keyword">with</span> tarfile<span class="token punctuation">.</span><span class="token builtin">open</span><span class="token punctuation">(</span>faiss_package_path<span class="token punctuation">,</span> <span class="token string">"r:gz"</span><span class="token punctuation">)</span> <span class="token keyword">as</span> tar<span class="token punctuation">:</span>
            <span class="token keyword">for</span> member <span class="token keyword">in</span> tar<span class="token punctuation">.</span>getmembers<span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">:</span>
                member<span class="token punctuation">.</span>name <span class="token operator">=</span> os<span class="token punctuation">.</span>path<span class="token punctuation">.</span>relpath<span class="token punctuation">(</span>member<span class="token punctuation">.</span>name<span class="token punctuation">,</span> start<span class="token operator">=</span>member<span class="token punctuation">.</span>name<span class="token punctuation">.</span>split<span class="token punctuation">(</span><span class="token string">"/"</span><span class="token punctuation">)</span><span class="token punctuation">[</span><span class="token number">0</span><span class="token punctuation">]</span><span class="token punctuation">)</span>
                tar<span class="token punctuation">.</span>extract<span class="token punctuation">(</span>member<span class="token punctuation">,</span> faiss_extract_path<span class="token punctuation">)</span>
        sys<span class="token punctuation">.</span>path<span class="token punctuation">.</span>insert<span class="token punctuation">(</span><span class="token number">0</span><span class="token punctuation">,</span> faiss_extract_path<span class="token punctuation">)</span>
        <span class="token keyword">import</span> faiss
        logger<span class="token punctuation">.</span>info<span class="token punctuation">(</span><span class="token string">"faiss was imported successfully."</span><span class="token punctuation">)</span>
</code></pre></div><h4 id="rag-search-using-embeddings-and-faiss-indexes" data-line="718" class="code-line"><a class="header-anchor-link" href="#rag-search-using-embeddings-and-faiss-indexes" aria-hidden="true"></a> RAG Search Using Embeddings and FAISS Indexes</h4>
<p data-line="720" class="code-line">The <code>search_data</code> function loads the FAISS index downloaded from S3 and searches it for the documents that best match the query. Documents are vectorized using the Embeddings client (Azure OpenAI or OpenAI) generated by the <code>get_embeddings</code> function, enabling fast searches using <code>faiss</code>.</p>
<div class="code-block-container"><pre class="language-python"><code class="language-python code-line" data-line="722"><span class="token keyword">from</span> typing <span class="token keyword">import</span> Any<span class="token punctuation">,</span> Dict<span class="token punctuation">,</span> List<span class="token punctuation">,</span> Optional
<span class="token keyword">from</span> langchain_community<span class="token punctuation">.</span>vectorstores <span class="token keyword">import</span> FAISS
<span class="token keyword">from</span> langchain_core<span class="token punctuation">.</span>documents<span class="token punctuation">.</span>base <span class="token keyword">import</span> Document
<span class="token keyword">from</span> langchain_core<span class="token punctuation">.</span>vectorstores<span class="token punctuation">.</span>base <span class="token keyword">import</span> VectorStoreRetriever
<span class="token keyword">from</span> shared<span class="token punctuation">.</span>secrets <span class="token keyword">import</span> get_secret
<span class="token keyword">from</span> shared<span class="token punctuation">.</span>logger <span class="token keyword">import</span> getLogger
<span class="token keyword">from</span> langchain_openai <span class="token keyword">import</span> AzureOpenAIEmbeddings<span class="token punctuation">,</span> OpenAIEmbeddings
logger <span class="token operator">=</span> getLogger<span class="token punctuation">(</span>__name__<span class="token punctuation">)</span>

<span class="token keyword">def</span> <span class="token function">get_embeddings</span><span class="token punctuation">(</span>secrets<span class="token punctuation">:</span> Dict<span class="token punctuation">[</span><span class="token builtin">str</span><span class="token punctuation">,</span> <span class="token builtin">str</span><span class="token punctuation">]</span><span class="token punctuation">)</span><span class="token punctuation">:</span>
    api_type<span class="token punctuation">:</span> <span class="token builtin">str</span> <span class="token operator">=</span> secrets<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"openai_api_type"</span><span class="token punctuation">,</span> <span class="token string">"azure"</span><span class="token punctuation">)</span>
    <span class="token keyword">if</span> api_type <span class="token operator">==</span> <span class="token string">"azure"</span><span class="token punctuation">:</span>
        <span class="token keyword">return</span> AzureOpenAIEmbeddings<span class="token punctuation">(</span>
            openai_api_key<span class="token operator">=</span>secrets<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"azure_openai_api_key_eastus2"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            azure_endpoint<span class="token operator">=</span>secrets<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"azure_openai_endpoint_eastus2"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            model<span class="token operator">=</span><span class="token string">"text-embedding-3-large"</span><span class="token punctuation">,</span>
            api_version<span class="token operator">=</span>secrets<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"azure_openai_api_version_eastus2"</span><span class="token punctuation">,</span> <span class="token string">"2023-07-01-preview"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
        <span class="token punctuation">)</span>
    <span class="token keyword">elif</span> api_type <span class="token operator">==</span> <span class="token string">"openai"</span><span class="token punctuation">:</span>
        <span class="token keyword">return</span> OpenAIEmbeddings<span class="token punctuation">(</span>
            openai_api_key<span class="token operator">=</span>secrets<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"openai_api_key"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            model<span class="token operator">=</span><span class="token string">"text-embedding-3-large"</span><span class="token punctuation">,</span>
        <span class="token punctuation">)</span>
    <span class="token keyword">else</span><span class="token punctuation">:</span>
        logger<span class="token punctuation">.</span>error<span class="token punctuation">(</span><span class="token string">"An invalid API type specified."</span><span class="token punctuation">)</span>
        <span class="token keyword">raise</span> ValueError<span class="token punctuation">(</span><span class="token string">"Invalid api_type"</span><span class="token punctuation">)</span>

<span class="token keyword">def</span> <span class="token function">search_data</span><span class="token punctuation">(</span>
    query<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">,</span>
    index_folder_path<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">,</span>
    search_type<span class="token punctuation">:</span> <span class="token builtin">str</span> <span class="token operator">=</span> <span class="token string">"similarity"</span><span class="token punctuation">,</span>
    score_threshold<span class="token punctuation">:</span> Optional<span class="token punctuation">[</span><span class="token builtin">float</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token boolean">None</span><span class="token punctuation">,</span>
    k<span class="token punctuation">:</span> Optional<span class="token punctuation">[</span><span class="token builtin">int</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token boolean">None</span><span class="token punctuation">,</span>
    fetch_k<span class="token punctuation">:</span> Optional<span class="token punctuation">[</span><span class="token builtin">int</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token boolean">None</span><span class="token punctuation">,</span>
    lambda_mult<span class="token punctuation">:</span> Optional<span class="token punctuation">[</span><span class="token builtin">float</span><span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token boolean">None</span><span class="token punctuation">,</span>
<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">></span> List<span class="token punctuation">[</span>Dict<span class="token punctuation">]</span><span class="token punctuation">:</span>
    secrets<span class="token punctuation">:</span> Dict<span class="token punctuation">[</span><span class="token builtin">str</span><span class="token punctuation">,</span> <span class="token builtin">str</span><span class="token punctuation">]</span> <span class="token operator">=</span> get_secret<span class="token punctuation">(</span><span class="token punctuation">)</span>
    embeddings <span class="token operator">=</span> get_embeddings<span class="token punctuation">(</span>secrets<span class="token punctuation">)</span>
    db<span class="token punctuation">:</span> FAISS <span class="token operator">=</span> FAISS<span class="token punctuation">.</span>load_local<span class="token punctuation">(</span>
        folder_path<span class="token operator">=</span>index_folder_path<span class="token punctuation">,</span>
        embeddings<span class="token operator">=</span>embeddings<span class="token punctuation">,</span>
        allow_dangerous_deserialization<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">,</span>
    <span class="token punctuation">)</span>
    <span class="token comment"># Pass only the parameters that were provided; a None value for k would otherwise break the search</span>
    search_kwargs<span class="token punctuation">:</span> Dict<span class="token punctuation">[</span><span class="token builtin">str</span><span class="token punctuation">,</span> Any<span class="token punctuation">]</span> <span class="token operator">=</span> <span class="token punctuation">{</span><span class="token punctuation">}</span>
    <span class="token keyword">if</span> k <span class="token keyword">is</span> <span class="token keyword">not</span> <span class="token boolean">None</span><span class="token punctuation">:</span>
        search_kwargs<span class="token punctuation">[</span><span class="token string">"k"</span><span class="token punctuation">]</span> <span class="token operator">=</span> k
    <span class="token keyword">if</span> search_type <span class="token operator">==</span> <span class="token string">"similarity_score_threshold"</span> <span class="token keyword">and</span> score_threshold <span class="token keyword">is</span> <span class="token keyword">not</span> <span class="token boolean">None</span><span class="token punctuation">:</span>
        search_kwargs<span class="token punctuation">[</span><span class="token string">"score_threshold"</span><span class="token punctuation">]</span> <span class="token operator">=</span> score_threshold
    <span class="token keyword">elif</span> search_type <span class="token operator">==</span> <span class="token string">"mmr"</span><span class="token punctuation">:</span>
        <span class="token comment"># Default to four times k candidates when fetch_k is not given</span>
        <span class="token keyword">if</span> fetch_k <span class="token keyword">is</span> <span class="token keyword">not</span> <span class="token boolean">None</span><span class="token punctuation">:</span>
            search_kwargs<span class="token punctuation">[</span><span class="token string">"fetch_k"</span><span class="token punctuation">]</span> <span class="token operator">=</span> fetch_k
        <span class="token keyword">elif</span> k <span class="token keyword">is</span> <span class="token keyword">not</span> <span class="token boolean">None</span><span class="token punctuation">:</span>
            search_kwargs<span class="token punctuation">[</span><span class="token string">"fetch_k"</span><span class="token punctuation">]</span> <span class="token operator">=</span> k <span class="token operator">*</span> <span class="token number">4</span>
        <span class="token keyword">if</span> lambda_mult <span class="token keyword">is</span> <span class="token keyword">not</span> <span class="token boolean">None</span><span class="token punctuation">:</span>
            search_kwargs<span class="token punctuation">[</span><span class="token string">"lambda_mult"</span><span class="token punctuation">]</span> <span class="token operator">=</span> lambda_mult
    retriever<span class="token punctuation">:</span> VectorStoreRetriever <span class="token operator">=</span> db<span class="token punctuation">.</span>as_retriever<span class="token punctuation">(</span>
        search_type<span class="token operator">=</span>search_type<span class="token punctuation">,</span>
        search_kwargs<span class="token operator">=</span>search_kwargs<span class="token punctuation">,</span>
    <span class="token punctuation">)</span>
    results<span class="token punctuation">:</span> List<span class="token punctuation">[</span>Document<span class="token punctuation">]</span> <span class="token operator">=</span> retriever<span class="token punctuation">.</span>invoke<span class="token punctuation">(</span><span class="token builtin">input</span><span class="token operator">=</span>query<span class="token punctuation">)</span>
    <span class="token keyword">return</span> <span class="token punctuation">[</span><span class="token punctuation">{</span><span class="token string">"content"</span><span class="token punctuation">:</span> doc<span class="token punctuation">.</span>page_content<span class="token punctuation">,</span> <span class="token string">"metadata"</span><span class="token punctuation">:</span> doc<span class="token punctuation">.</span>metadata<span class="token punctuation">}</span> <span class="token keyword">for</span> doc <span class="token keyword">in</span> results<span class="token punctuation">]</span>
</code></pre></div><h4 id="asynchronous-downloads-and-lambda-handlers" data-line="786" class="code-line"><a class="header-anchor-link" href="#asynchronous-downloads-and-lambda-handlers" aria-hidden="true"></a> Asynchronous Downloads and Lambda Handlers</h4>
<p data-line="788" class="code-line">Within <code>async_handler</code>, <code>setup_faiss</code> is executed, and the FAISS index file is retrieved from S3 using <code>download_files</code>. Afterward, <code>search_data</code> performs a RAG search, and the results are returned in JSON format.</p>
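<p class="code-line">To make the request shape concrete, here is a minimal sketch of how the search parameters could be pulled out of the event body with the same keys and defaults the handler uses. The helper name <code>parse_search_request</code> and the sample values (query text, index path) are hypothetical and only serve as an illustration.</p>

```python
import json
from typing import Any, Dict

RESULT_NUM = 5  # same default number of results as the handler in this section


def parse_search_request(event: Dict[str, Any]) -> Dict[str, Any]:
    """Extract search parameters from the event body (illustrative helper, not part of the original code)."""
    body = json.loads(event.get("body") or "{}")
    return {
        "query": body.get("query"),
        "index_path": body.get("index_path"),
        "search_type": body.get("search_type", "similarity"),
        "score_threshold": body.get("score_threshold"),
        "k": body.get("k", RESULT_NUM),
        "fetch_k": body.get("fetch_k"),
        "lambda_mult": body.get("lambda_mult"),
    }


# Hypothetical event: the query and index path are made-up values
sample_event = {
    "body": json.dumps(
        {
            "query": "How do I apply for paid leave?",
            "index_path": "indexes/confluence/",
            "search_type": "mmr",
            "fetch_k": 20,
            "lambda_mult": 0.5,
        }
    )
}

params = parse_search_request(sample_event)
print(params["search_type"], params["k"], params["fetch_k"])
```

<p class="code-line">Keys that are omitted from the body fall back to the defaults, so a request containing only <code>query</code> and <code>index_path</code> runs a plain similarity search for five results.</p>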
<div class="code-block-container"><pre class="language-python"><code class="language-python code-line" data-line="790"><span class="token keyword">import</span> asyncio
<span class="token keyword">import</span> json
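<span class="token keyword">import</span> os
<span class="token keyword">from</span> typing <span class="token keyword">import</span> Any<span class="token punctuation">,</span> Dict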
<span class="token keyword">from</span> shared<span class="token punctuation">.</span>s3_client <span class="token keyword">import</span> S3Client
<span class="token keyword">from</span> shared<span class="token punctuation">.</span>logger <span class="token keyword">import</span> getLogger
<span class="token keyword">from</span> shared<span class="token punctuation">.</span>token_verifier <span class="token keyword">import</span> with_token_verification
logger <span class="token operator">=</span> getLogger<span class="token punctuation">(</span>__name__<span class="token punctuation">)</span>
RESULT_NUM <span class="token operator">=</span> <span class="token number">5</span>

<span class="token decorator annotation punctuation">@with_token_verification</span>
<span class="token keyword">async</span> <span class="token keyword">def</span> <span class="token function">async_handler</span><span class="token punctuation">(</span>event<span class="token punctuation">:</span> Dict<span class="token punctuation">[</span><span class="token builtin">str</span><span class="token punctuation">,</span> Any<span class="token punctuation">]</span><span class="token punctuation">,</span> context<span class="token punctuation">:</span> Any<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">></span> Dict<span class="token punctuation">[</span><span class="token builtin">str</span><span class="token punctuation">,</span> Any<span class="token punctuation">]</span><span class="token punctuation">:</span>
    env <span class="token operator">=</span> os<span class="token punctuation">.</span>getenv<span class="token punctuation">(</span><span class="token string">"ENV"</span><span class="token punctuation">)</span>
    s3_client <span class="token operator">=</span> S3Client<span class="token punctuation">(</span><span class="token punctuation">)</span>
    s3_bucket <span class="token operator">=</span> <span class="token string">"bucket-name"</span>
    setup_faiss<span class="token punctuation">(</span>s3_client<span class="token punctuation">,</span> s3_bucket<span class="token punctuation">)</span>
    request_body_str <span class="token operator">=</span> event<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"body"</span><span class="token punctuation">,</span> <span class="token string">"{}"</span><span class="token punctuation">)</span>
    request_body <span class="token operator">=</span> json<span class="token punctuation">.</span>loads<span class="token punctuation">(</span>request_body_str<span class="token punctuation">)</span>
    query <span class="token operator">=</span> request_body<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"query"</span><span class="token punctuation">)</span>
    index_path <span class="token operator">=</span> request_body<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"index_path"</span><span class="token punctuation">)</span>
    local_index_dir <span class="token operator">=</span> <span class="token string">"/tmp/index_faiss"</span>
    <span class="token keyword">await</span> download_files<span class="token punctuation">(</span>s3_client<span class="token punctuation">,</span> s3_bucket<span class="token punctuation">,</span> index_path<span class="token punctuation">,</span> local_index_dir<span class="token punctuation">)</span>
    results <span class="token operator">=</span> search_data<span class="token punctuation">(</span>
        query<span class="token punctuation">,</span>
        local_index_dir<span class="token punctuation">,</span>
        search_type<span class="token operator">=</span>request_body<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"search_type"</span><span class="token punctuation">,</span> <span class="token string">"similarity"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
        score_threshold<span class="token operator">=</span>request_body<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"score_threshold"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
        k<span class="token operator">=</span>request_body<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"k"</span><span class="token punctuation">,</span> RESULT_NUM<span class="token punctuation">)</span><span class="token punctuation">,</span>
        fetch_k<span class="token operator">=</span>request_body<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"fetch_k"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
        lambda_mult<span class="token operator">=</span>request_body<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"lambda_mult"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
    <span class="token punctuation">)</span>
    <span class="token keyword">return</span> create_response<span class="token punctuation">(</span><span class="token number">200</span><span class="token punctuation">,</span> results<span class="token punctuation">)</span>

<span class="token keyword">def</span> <span class="token function">retrieverHandler</span><span class="token punctuation">(</span>event<span class="token punctuation">:</span> Dict<span class="token punctuation">[</span><span class="token builtin">str</span><span class="token punctuation">,</span> Any<span class="token punctuation">]</span><span class="token punctuation">,</span> context<span class="token punctuation">:</span> Any<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">></span> Dict<span class="token punctuation">[</span><span class="token builtin">str</span><span class="token punctuation">,</span> Any<span class="token punctuation">]</span><span class="token punctuation">:</span>
    <span class="token keyword">return</span> asyncio<span class="token punctuation">.</span>run<span class="token punctuation">(</span>async_handler<span class="token punctuation">(</span>event<span class="token punctuation">,</span> context<span class="token punctuation">)</span><span class="token punctuation">)</span>

<span class="token keyword">def</span> <span class="token function">create_response</span><span class="token punctuation">(</span>status_code<span class="token punctuation">:</span> <span class="token builtin">int</span><span class="token punctuation">,</span> body<span class="token punctuation">:</span> Any<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">></span> Dict<span class="token punctuation">[</span><span class="token builtin">str</span><span class="token punctuation">,</span> Any<span class="token punctuation">]</span><span class="token punctuation">:</span>
    <span class="token keyword">return</span> <span class="token punctuation">{</span>
        <span class="token string">"statusCode"</span><span class="token punctuation">:</span> status_code<span class="token punctuation">,</span>
        <span class="token string">"body"</span><span class="token punctuation">:</span> json<span class="token punctuation">.</span>dumps<span class="token punctuation">(</span>body<span class="token punctuation">,</span> ensure_ascii<span class="token operator">=</span><span class="token boolean">False</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
        <span class="token string">"headers"</span><span class="token punctuation">:</span> <span class="token punctuation">{</span>
            <span class="token string">"Content-Type"</span><span class="token punctuation">:</span> <span class="token string">"application/json"</span><span class="token punctuation">,</span>
        <span class="token punctuation">}</span><span class="token punctuation">,</span>
    <span class="token punctuation">}</span>

<span class="token keyword">async</span> <span class="token keyword">def</span> <span class="token function">download_files</span><span class="token punctuation">(</span>s3_client<span class="token punctuation">:</span> S3Client<span class="token punctuation">,</span> bucket<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">,</span> key<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">,</span> file_path<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">></span> <span class="token boolean">None</span><span class="token punctuation">:</span>
    loop <span class="token operator">=</span> asyncio<span class="token punctuation">.</span>get_running_loop<span class="token punctuation">(</span><span class="token punctuation">)</span>
    <span class="token keyword">await</span> loop<span class="token punctuation">.</span>run_in_executor<span class="token punctuation">(</span><span class="token boolean">None</span><span class="token punctuation">,</span> download_files_from_s3<span class="token punctuation">,</span> s3_client<span class="token punctuation">,</span> bucket<span class="token punctuation">,</span> key<span class="token punctuation">,</span> file_path<span class="token punctuation">)</span>

<span class="token keyword">def</span> <span class="token function">download_files_from_s3</span><span class="token punctuation">(</span>s3_client<span class="token punctuation">:</span> S3Client<span class="token punctuation">,</span> s3_bucket<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">,</span> prefix<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">,</span> local_dir<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">></span> <span class="token boolean">None</span><span class="token punctuation">:</span>
    keys <span class="token operator">=</span> s3_client<span class="token punctuation">.</span>list_objects<span class="token punctuation">(</span>bucket_name<span class="token operator">=</span>s3_bucket<span class="token punctuation">,</span> prefix<span class="token operator">=</span>prefix<span class="token punctuation">)</span>
    <span class="token keyword">if</span> <span class="token keyword">not</span> keys<span class="token punctuation">:</span>
        logger<span class="token punctuation">.</span>info<span class="token punctuation">(</span><span class="token string-interpolation"><span class="token string">f"No file found in '</span><span class="token interpolation"><span class="token punctuation">{</span>prefix<span class="token punctuation">}</span></span><span class="token string">'"</span></span><span class="token punctuation">)</span>
        <span class="token keyword">return</span>
    <span class="token keyword">for</span> key <span class="token keyword">in</span> keys<span class="token punctuation">:</span>
        relative_path <span class="token operator">=</span> os<span class="token punctuation">.</span>path<span class="token punctuation">.</span>relpath<span class="token punctuation">(</span>key<span class="token punctuation">,</span> prefix<span class="token punctuation">)</span>
        local_file_path <span class="token operator">=</span> os<span class="token punctuation">.</span>path<span class="token punctuation">.</span>join<span class="token punctuation">(</span>local_dir<span class="token punctuation">,</span> relative_path<span class="token punctuation">)</span>
        os<span class="token punctuation">.</span>makedirs<span class="token punctuation">(</span>os<span class="token punctuation">.</span>path<span class="token punctuation">.</span>dirname<span class="token punctuation">(</span>local_file_path<span class="token punctuation">)</span><span class="token punctuation">,</span> exist_ok<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">)</span>
        s3_client<span class="token punctuation">.</span>download_file<span class="token punctuation">(</span>bucket_name<span class="token operator">=</span>s3_bucket<span class="token punctuation">,</span> key<span class="token operator">=</span>key<span class="token punctuation">,</span> file_path<span class="token operator">=</span>local_file_path<span class="token punctuation">)</span>
</code></pre></div><h4 id="summary-of-5.4" data-line="857" class="code-line"><a class="header-anchor-link" href="#summary-of-5.4" aria-hidden="true"></a> Summary of 5.4</h4>
<ul data-line="859" class="code-line">
<li data-line="859" class="code-line">Dynamically loading <code>faiss</code> with <code>setup_faiss</code> avoids Lambda layer size limits.</li>
<li data-line="860" class="code-line">Asynchronous I/O combined with S3 lets the FAISS index be loaded without containerization or an EFS connection.</li>
<li data-line="861" class="code-line"><code>search_data</code> searches the embedded index so that RAG can quickly return similar documents.</li>
</ul>
<p data-line="863" class="code-line">Together, these enable fast knowledge search with RAG, so the LLM can answer with company-specific information.</p>
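<p class="code-line">The <code>mmr</code> search type accepted by <code>search_data</code> takes <code>fetch_k</code> and <code>lambda_mult</code>. The following is a minimal pure-Python sketch of how maximal marginal relevance uses those two parameters. It is not the LangChain or FAISS implementation; the helper names and the toy two-dimensional vectors are assumptions made for illustration.</p>

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 2,
    fetch_k: int = 4,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Return the indexes of k documents chosen by maximal marginal relevance."""
    # Shortlist the fetch_k documents most similar to the query
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    candidates = ranked[:fetch_k]
    selected: List[int] = []

    def score(i: int) -> float:
        # Reward similarity to the query, penalize similarity to already selected documents
        relevance = cosine(query_vec, doc_vecs[i])
        redundancy = max((cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0)
        return lambda_mult * relevance - (1 - lambda_mult) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: documents 0 and 1 are near-duplicates, document 2 is different
docs = [[1.0, 0.0], [0.99, 0.14], [0.6, 0.8]]
query = [1.0, 0.0]
relevance_only = mmr_select(query, docs, k=2, fetch_k=3, lambda_mult=1.0)
diverse = mmr_select(query, docs, k=2, fetch_k=3, lambda_mult=0.3)
print(relevance_only, diverse)
```

<p class="code-line">With <code>lambda_mult=1.0</code> the selection is driven by relevance alone and keeps both near-duplicates, while a lower value such as <code>0.3</code> swaps the second near-duplicate for the more distinct document.</p>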
<h3 id="5.5-%5Bpython%5D-embedding-and-faiss-indexing" data-line="865" class="code-line"><a class="header-anchor-link" href="#5.5-%5Bpython%5D-embedding-and-faiss-indexing" aria-hidden="true"></a> 5.5 [Python] Embedding and FAISS Indexing</h3>
<p data-line="867" class="code-line">This section provides an example of periodic batch processing that embeds internal company documents (such as Confluence pages) and creates or updates the FAISS index.
The index used in the RAG pipeline is essential for generative AI to incorporate company-specific knowledge into its responses. To maintain accuracy, we regularly update embeddings and rebuild the FAISS index, ensuring that the latest information is always accessible.</p>
<h4 id="process-overview" data-line="870" class="code-line"><a class="header-anchor-link" href="#process-overview" aria-hidden="true"></a> Process Overview</h4>
<ol data-line="872" class="code-line">
<li data-line="872" class="code-line">Retrieve JSON-formatted documents from S3.</li>
<li data-line="873" class="code-line">Generate embeddings for the retrieved documents (using the Embeddings API from OpenAI or Azure OpenAI).</li>
<li data-line="874" class="code-line">Index the embedded text using FAISS.</li>
<li data-line="875" class="code-line">Upload the FAISS index to S3.</li>
</ol>
<p data-line="877" class="code-line">Running these steps periodically, via Lambda batch processing or a Step Functions workflow, ensures that RAG searches always use the latest index.</p>
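<p class="code-line">The four steps above can be sketched as a single orchestration function. This is only an outline under stated assumptions: the function name <code>run_index_update</code>, its callable parameters, and the decision to skip the upload when no documents are loaded are all hypothetical, and the lambdas below are stand-ins for the real S3, embedding, and FAISS calls.</p>

```python
import os
from typing import Any, Callable, List


def run_index_update(
    download_documents: Callable[[str], str],
    load_documents: Callable[[str], List[Any]],
    build_index: Callable[[List[Any], str], None],
    upload_index: Callable[[str], None],
    work_dir: str,
) -> int:
    """Run the batch steps in order and return the number of indexed documents."""
    json_path = download_documents(work_dir)  # Step 1: fetch the JSON file from S3
    documents = load_documents(json_path)     # Step 1: parse it into documents
    if not documents:
        # Assumption: keep the existing index rather than uploading an empty one
        return 0
    index_dir = os.path.join(work_dir, "index_faiss")
    build_index(documents, index_dir)         # Steps 2 and 3: embed and index
    upload_index(index_dir)                   # Step 4: push the index to S3
    return len(documents)


# Stand-in callables that only record the order in which they are called
calls: List[str] = []
count = run_index_update(
    download_documents=lambda d: calls.append("download") or os.path.join(d, "docs.json"),
    load_documents=lambda p: calls.append("load") or ["doc-1", "doc-2"],
    build_index=lambda docs, out: calls.append("build"),
    upload_index=lambda out: calls.append("upload"),
    work_dir="/tmp/work",
)
print(count, calls)
```

<p class="code-line">Passing the steps in as callables keeps the sequence easy to test without AWS access, and the same function could be invoked from a scheduled Lambda or from a Step Functions task.</p>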
<h4 id="step-1%3A-loading-a-json-document" data-line="879" class="code-line"><a class="header-anchor-link" href="#step-1%3A-loading-a-json-document" aria-hidden="true"></a> Step 1: Loading a JSON document</h4>
<p data-line="881" class="code-line">Download and parse a JSON file from S3 (e.g., summarized Confluence pages) and convert it into a list of <code>Document</code> objects.</p>
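<p class="code-line">For reference, an input file in the expected shape might look like the sample below. The record values and the helper <code>to_page_content</code> are made up for illustration; only the keys (<code>id</code>, <code>title</code>, <code>content</code>, <code>url</code>) and the "Title / Content" text layout follow the loader in this section.</p>

```python
import json
from typing import Any, Dict

# Hypothetical records: one entry per summarized page
sample_records = [
    {
        "id": "12345",
        "title": "Expense policy",
        "content": "Submit receipts within 30 days.",
        "url": "https://example.com/pages/12345",
    }
]

# Round-trip through JSON text, as it would be stored in the file
json_text = json.dumps(sample_records, ensure_ascii=False, indent=2)
parsed = json.loads(json_text)


def to_page_content(record: Dict[str, Any]) -> str:
    """Build the text that gets embedded: title and content combined."""
    return f"Title: {record.get('title', '')}\nContent: {record.get('content', '')}"


print(to_page_content(parsed[0]))
```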
<div class="code-block-container"><pre class="language-python"><code class="language-python code-line" data-line="883"><span class="token keyword">import</span> json
<span class="token keyword">from</span> typing <span class="token keyword">import</span> Any<span class="token punctuation">,</span> Dict<span class="token punctuation">,</span> List
<span class="token keyword">from</span> langchain_core<span class="token punctuation">.</span>documents<span class="token punctuation">.</span>base <span class="token keyword">import</span> Document
<span class="token keyword">from</span> shared<span class="token punctuation">.</span>logger <span class="token keyword">import</span> getLogger
logger <span class="token operator">=</span> getLogger<span class="token punctuation">(</span>__name__<span class="token punctuation">)</span>

<span class="token keyword">def</span> <span class="token function">load_json</span><span class="token punctuation">(</span>file_path<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">></span> List<span class="token punctuation">[</span>Document<span class="token punctuation">]</span><span class="token punctuation">:</span>
    <span class="token triple-quoted-string string">"""
    Reads a JSON file and returns a list of Document objects.
    The JSON format is expected to be: [{"title": "...", "content": "...", "id": "...", "url": "..."}]
    """</span>
    <span class="token keyword">with</span> <span class="token builtin">open</span><span class="token punctuation">(</span>file_path<span class="token punctuation">,</span> <span class="token string">"r"</span><span class="token punctuation">,</span> encoding<span class="token operator">=</span><span class="token string">"utf-8"</span><span class="token punctuation">)</span> <span class="token keyword">as</span> f<span class="token punctuation">:</span>
        data <span class="token operator">=</span> json<span class="token punctuation">.</span>load<span class="token punctuation">(</span>f<span class="token punctuation">)</span>
    <span class="token keyword">if</span> <span class="token keyword">not</span> <span class="token builtin">isinstance</span><span class="token punctuation">(</span>data<span class="token punctuation">,</span> <span class="token builtin">list</span><span class="token punctuation">)</span><span class="token punctuation">:</span>
        <span class="token keyword">raise</span> ValueError<span class="token punctuation">(</span><span class="token string">"The top-level JSON structure is not a list."</span><span class="token punctuation">)</span>
    documents <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token punctuation">]</span>
    <span class="token keyword">for</span> record <span class="token keyword">in</span> data<span class="token punctuation">:</span>
        <span class="token keyword">if</span> <span class="token keyword">not</span> <span class="token builtin">isinstance</span><span class="token punctuation">(</span>record<span class="token punctuation">,</span> <span class="token builtin">dict</span><span class="token punctuation">)</span><span class="token punctuation">:</span>
            logger<span class="token punctuation">.</span>warning<span class="token punctuation">(</span><span class="token string-interpolation"><span class="token string">f"Skipped record (not a dictionary): </span><span class="token interpolation"><span class="token punctuation">{</span>record<span class="token punctuation">}</span></span><span class="token string">"</span></span><span class="token punctuation">)</span>
            <span class="token keyword">continue</span>
        title <span class="token operator">=</span> record<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"title"</span><span class="token punctuation">,</span> <span class="token string">""</span><span class="token punctuation">)</span>
        content <span class="token operator">=</span> record<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"content"</span><span class="token punctuation">,</span> <span class="token string">""</span><span class="token punctuation">)</span>
        metadata <span class="token operator">=</span> <span class="token punctuation">{</span>
            <span class="token string">"id"</span><span class="token punctuation">:</span> record<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"id"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
            <span class="token string">"title"</span><span class="token punctuation">:</span> title<span class="token punctuation">,</span>
            <span class="token string">"url"</span><span class="token punctuation">:</span> record<span class="token punctuation">.</span>get<span class="token punctuation">(</span><span class="token string">"url"</span><span class="token punctuation">)</span><span class="token punctuation">,</span>
        <span class="token punctuation">}</span>
        <span class="token comment"># Create a Document object combining the title and content</span>
        doc <span class="token operator">=</span> Document<span class="token punctuation">(</span>page_content<span class="token operator">=</span><span class="token string-interpolation"><span class="token string">f"Title: </span><span class="token interpolation"><span class="token punctuation">{</span>title<span class="token punctuation">}</span></span><span class="token string">\nContent: </span><span class="token interpolation"><span class="token punctuation">{</span>content<span class="token punctuation">}</span></span><span class="token string">"</span></span><span class="token punctuation">,</span> metadata<span class="token operator">=</span>metadata<span class="token punctuation">)</span>
        documents<span class="token punctuation">.</span>append<span class="token punctuation">(</span>doc<span class="token punctuation">)</span>
    logger<span class="token punctuation">.</span>info<span class="token punctuation">(</span><span class="token string-interpolation"><span class="token string">f"Loaded </span><span class="token interpolation"><span class="token punctuation">{</span><span class="token builtin">len</span><span class="token punctuation">(</span>documents<span class="token punctuation">)</span><span class="token punctuation">}</span></span><span class="token string"> documents."</span></span><span class="token punctuation">)</span>
    <span class="token keyword">return</span> documents
</code></pre></div><h4 id="step-2%3A-embedding-and-faiss-indexing" data-line="922" class="code-line"><a class="header-anchor-link" href="#step-2%3A-embedding-and-faiss-indexing" aria-hidden="true"></a> Step 2: Embedding and FAISS indexing</h4>
<p data-line="924" class="code-line">The <code>vectorize_and_save</code> function splits the documents into chunks, embeds them using the Embeddings client obtained from <code>get_embeddings</code>, and builds a <code>FAISS</code> index, which it then saves locally.</p>
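<p>To make the chunking parameters used in the function below more concrete (<code>chunk_size=1024</code>, <code>chunk_overlap=128</code>), here is a minimal sketch of overlapping, fixed-size character windows. It is a simplification for illustration only: <code>split_with_overlap</code> is a hypothetical helper, and <code>RecursiveCharacterTextSplitter</code> additionally prefers to cut at separators such as paragraph and line breaks, so its real chunk boundaries will differ.</p>

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int = 1024, chunk_overlap: int = 128) -> List[str]:
    """Split text into fixed-size character windows that overlap by chunk_overlap.

    Simplified illustration only (hypothetical helper): the real
    RecursiveCharacterTextSplitter also tries to cut at natural separators.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each new window starts this far after the previous one
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


# A 2000-character text yields three windows; neighbours share 128 characters
chunks = split_with_overlap("a" * 2000)
print([len(c) for c in chunks])  # prints [1024, 1024, 208]
```

<p>The overlap means that a sentence falling on a chunk boundary still appears in full in at least one chunk, at the cost of embedding some text twice.</p>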
<div class="code-block-container"><pre class="language-python"><code class="language-python code-line" data-line="926"><span class="token keyword">import</span> os
<span class="token keyword">from</span> typing <span class="token keyword">import</span> List

<span class="token keyword">from</span> langchain_community<span class="token punctuation">.</span>vectorstores <span class="token keyword">import</span> FAISS
<span class="token keyword">from</span> langchain_core<span class="token punctuation">.</span>documents <span class="token keyword">import</span> Document
<span class="token keyword">from</span> langchain_text_splitters <span class="token keyword">import</span> RecursiveCharacterTextSplitter
<span class="token keyword">from</span> shared<span class="token punctuation">.</span>logger <span class="token keyword">import</span> getLogger

logger <span class="token operator">=</span> getLogger<span class="token punctuation">(</span>__name__<span class="token punctuation">)</span>

<span class="token keyword">def</span> <span class="token function">vectorize_and_save</span><span class="token punctuation">(</span>documents<span class="token punctuation">:</span> List<span class="token punctuation">[</span>Document<span class="token punctuation">]</span><span class="token punctuation">,</span> output_dir<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">,</span> embeddings<span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">></span> <span class="token boolean">None</span><span class="token punctuation">:</span>
    <span class="token triple-quoted-string string">"""
    Embed the documents, create a FAISS index, and save it locally.
    """</span>
    <span class="token comment"># Split the document into smaller chunks using a text splitter</span>
    text_splitter <span class="token operator">=</span> RecursiveCharacterTextSplitter<span class="token punctuation">(</span>chunk_size<span class="token operator">=</span><span class="token number">1024</span><span class="token punctuation">,</span> chunk_overlap<span class="token operator">=</span><span class="token number">128</span><span class="token punctuation">)</span>
    split_docs <span class="token operator">=</span> text_splitter<span class="token punctuation">.</span>split_documents<span class="token punctuation">(</span>documents<span class="token punctuation">)</span>
    logger<span class="token punctuation">.</span>info<span class="token punctuation">(</span><span class="token string-interpolation"><span class="token string">f"</span><span class="token interpolation"><span class="token punctuation">{</span><span class="token builtin">len</span><span class="token punctuation">(</span>split_docs<span class="token punctuation">)</span><span class="token punctuation">}</span></span><span class="token string"> split documents"</span></span><span class="token punctuation">)</span>
    <span class="token comment"># Vectorize using embeddings and build FAISS index</span>
    db<span class="token punctuation">:</span> FAISS <span class="token operator">=</span> FAISS<span class="token punctuation">.</span>from_documents<span class="token punctuation">(</span>split_docs<span class="token punctuation">,</span> embeddings<span class="token punctuation">)</span>
    logger<span class="token punctuation">.</span>info<span class="token punctuation">(</span><span class="token string">"Vector DB construction completed."</span><span class="token punctuation">)</span>
    os<span class="token punctuation">.</span>makedirs<span class="token punctuation">(</span>output_dir<span class="token punctuation">,</span> exist_ok<span class="token operator">=</span><span class="token boolean">True</span><span class="token punctuation">)</span>
    db<span class="token punctuation">.</span>save_local<span class="token punctuation">(</span>output_dir<span class="token punctuation">)</span>
    logger<span class="token punctuation">.</span>info<span class="token punctuation">(</span><span class="token string-interpolation"><span class="token string">f"Vector DB saved to </span><span class="token interpolation"><span class="token punctuation">{</span>output_dir<span class="token punctuation">}</span></span><span class="token string">"</span></span><span class="token punctuation">)</span>
</code></pre></div><h4 id="step-3%3A-uploading-the-index-to-s3" data-line="952" class="code-line"><a class="header-anchor-link" href="#step-3%3A-uploading-the-index-to-s3" aria-hidden="true"></a> Step 3: Uploading the Index to S3</h4>
<p data-line="954" class="code-line">Uploading the locally created FAISS index to S3 makes it easily accessible to the RAG search Lambda.</p>
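<p>One detail worth noting about the upload function below: it builds S3 keys with <code>os.path.join</code>, which yields forward slashes on Lambda (Linux) but would yield backslashes if the same code ran on Windows, for example in a local development environment. A minimal sketch of an OS-independent alternative follows; <code>build_s3_key</code> and the <code>rag/index</code> prefix are hypothetical names used only for illustration.</p>

```python
import posixpath


def build_s3_key(prefix: str, file_name: str) -> str:
    """Join an S3 key prefix and a file name with forward slashes on every OS."""
    return posixpath.join(prefix, file_name)


# "rag/index" is a made-up prefix; the article's real index_s3_path is not shown
for name in ["index.faiss", "index.pkl"]:
    print(build_s3_key("rag/index", name))
```

<p>Because S3 keys are plain strings in which <code>/</code> is only a naming convention, keeping the separator fixed avoids objects ending up under unexpected keys.</p>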
<div class="code-block-container"><pre class="language-python"><code class="language-python code-line" data-line="956"><span class="token keyword">import</span> os

<span class="token keyword">from</span> shared<span class="token punctuation">.</span>s3_client <span class="token keyword">import</span> S3Client
<span class="token keyword">from</span> shared<span class="token punctuation">.</span>logger <span class="token keyword">import</span> getLogger

logger <span class="token operator">=</span> getLogger<span class="token punctuation">(</span>__name__<span class="token punctuation">)</span>

<span class="token keyword">def</span> <span class="token function">upload_faiss_to_s3</span><span class="token punctuation">(</span>s3_client<span class="token punctuation">:</span> S3Client<span class="token punctuation">,</span> s3_bucket<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">,</span> local_index_dir<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">,</span> index_s3_path<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">></span> <span class="token boolean">None</span><span class="token punctuation">:</span>
    <span class="token triple-quoted-string string">"""
    Upload the FAISS index to S3.
    """</span>
    index_files <span class="token operator">=</span> <span class="token punctuation">[</span><span class="token string">"index.faiss"</span><span class="token punctuation">,</span> <span class="token string">"index.pkl"</span><span class="token punctuation">]</span>
    <span class="token keyword">for</span> file_name <span class="token keyword">in</span> index_files<span class="token punctuation">:</span>
        local_file_path <span class="token operator">=</span> os<span class="token punctuation">.</span>path<span class="token punctuation">.</span>join<span class="token punctuation">(</span>local_index_dir<span class="token punctuation">,</span> file_name<span class="token punctuation">)</span>
        s3_index_key <span class="token operator">=</span> os<span class="token punctuation">.</span>path<span class="token punctuation">.</span>join<span class="token punctuation">(</span>index_s3_path<span class="token punctuation">,</span> file_name<span class="token punctuation">)</span>
        s3_client<span class="token punctuation">.</span>upload_file<span class="token punctuation">(</span>local_file_path<span class="token punctuation">,</span> s3_bucket<span class="token punctuation">,</span> s3_index_key<span class="token punctuation">)</span>
        logger<span class="token punctuation">.</span>info<span class="token punctuation">(</span><span class="token string-interpolation"><span class="token string">f"FAISS index file uploaded to s3://</span><span class="token interpolation"><span class="token punctuation">{</span>s3_bucket<span class="token punctuation">}</span></span><span class="token string">/</span><span class="token interpolation"><span class="token punctuation">{</span>s3_index_key<span class="token punctuation">}</span></span><span class="token string">"</span></span><span class="token punctuation">)</span>
</code></pre></div><h4 id="step-4%3A-running-the-entire-flow-in-lambda" data-line="974" class="code-line"><a class="header-anchor-link" href="#step-4%3A-running-the-entire-flow-in-lambda" aria-hidden="true"></a> Step 4: Running the entire flow in Lambda</h4>
<p data-line="976" class="code-line">The <code>index_to_s3</code> function encapsulates the entire process. It downloads JSON from S3, generates embeddings, creates a FAISS index, and uploads the index to S3. This process can be executed periodically using a workflow such as Step Functions, ensuring that the index remains up to date.</p>
<div class="code-block-container"><pre class="language-python"><code class="language-python code-line" data-line="978"><span class="token keyword">import</span> os
<span class="token keyword">from</span> typing <span class="token keyword">import</span> Any<span class="token punctuation">,</span> Dict

<span class="token keyword">from</span> shared<span class="token punctuation">.</span>faiss <span class="token keyword">import</span> setup_faiss
<span class="token keyword">from</span> shared<span class="token punctuation">.</span>logger <span class="token keyword">import</span> getLogger
<span class="token keyword">from</span> shared<span class="token punctuation">.</span>s3_client <span class="token keyword">import</span> S3Client
<span class="token keyword">from</span> shared<span class="token punctuation">.</span>secrets <span class="token keyword">import</span> get_secret

logger <span class="token operator">=</span> getLogger<span class="token punctuation">(</span>__name__<span class="token punctuation">)</span>

<span class="token keyword">def</span> <span class="token function">index_to_s3</span><span class="token punctuation">(</span>json_s3_key<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">,</span> index_s3_path<span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">)</span> <span class="token operator">-</span><span class="token operator">></span> Dict<span class="token punctuation">[</span><span class="token builtin">str</span><span class="token punctuation">,</span> Any<span class="token punctuation">]</span><span class="token punctuation">:</span>
    <span class="token triple-quoted-string string">"""
    Download JSON from S3, generate embeddings, create a FAISS index, and upload the index to S3.
    """</span>
    env <span class="token operator">=</span> os<span class="token punctuation">.</span>getenv<span class="token punctuation">(</span><span class="token string">"ENV"</span><span class="token punctuation">)</span>
    <span class="token keyword">if</span> env <span class="token keyword">is</span> <span class="token boolean">None</span><span class="token punctuation">:</span>
        error_msg <span class="token operator">=</span> <span class="token string">"ENV environment variable not set."</span>
        logger<span class="token punctuation">.</span>error<span class="token punctuation">(</span>error_msg<span class="token punctuation">)</span>
        <span class="token keyword">return</span> <span class="token punctuation">{</span><span class="token string">"status"</span><span class="token punctuation">:</span> <span class="token string">"error"</span><span class="token punctuation">,</span> <span class="token string">"message"</span><span class="token punctuation">:</span> error_msg<span class="token punctuation">}</span>
    <span class="token keyword">try</span><span class="token punctuation">:</span>
        s3_client <span class="token operator">=</span> S3Client<span class="token punctuation">(</span><span class="token punctuation">)</span>
        s3_bucket <span class="token operator">=</span> <span class="token string">"bucket-name"</span>
        local_json_path <span class="token operator">=</span> <span class="token string">"/tmp/json_file.json"</span>
        local_index_dir <span class="token operator">=</span> <span class="token string">"/tmp/index"</span>
        <span class="token comment"># Set up faiss if necessary (download from S3)</span>
        setup_faiss<span class="token punctuation">(</span>s3_client<span class="token punctuation">,</span> s3_bucket<span class="token punctuation">)</span>
        <span class="token comment"># Download the JSON file from S3</span>
        s3_client<span class="token punctuation">.</span>download_file<span class="token punctuation">(</span>s3_bucket<span class="token punctuation">,</span> json_s3_key<span class="token punctuation">,</span> local_json_path<span class="token punctuation">)</span>
        documents <span class="token operator">=</span> load_json<span class="token punctuation">(</span>local_json_path<span class="token punctuation">)</span>
        <span class="token comment"># Get Embeddings client</span>
        secrets <span class="token operator">=</span> get_secret<span class="token punctuation">(</span><span class="token punctuation">)</span>
        embeddings <span class="token operator">=</span> get_embeddings<span class="token punctuation">(</span>secrets<span class="token punctuation">)</span>
        <span class="token comment"># Vectorization and FAISS indexing</span>
        vectorize_and_save<span class="token punctuation">(</span>documents<span class="token punctuation">,</span> local_index_dir<span class="token punctuation">,</span> embeddings<span class="token punctuation">)</span>
        <span class="token comment"># Upload the index file to S3</span>
        upload_faiss_to_s3<span class="token punctuation">(</span>s3_client<span class="token punctuation">,</span> s3_bucket<span class="token punctuation">,</span> local_index_dir<span class="token punctuation">,</span> index_s3_path<span class="token punctuation">)</span>
        <span class="token keyword">return</span> <span class="token punctuation">{</span>
            <span class="token string">"status"</span><span class="token punctuation">:</span> <span class="token string">"success"</span><span class="token punctuation">,</span>
            <span class="token string">"message"</span><span class="token punctuation">:</span> <span class="token string">"FAISS index created and uploaded to S3."</span><span class="token punctuation">,</span>
            <span class="token string">"output"</span><span class="token punctuation">:</span> <span class="token punctuation">{</span>
                <span class="token string">"bucket"</span><span class="token punctuation">:</span> s3_bucket<span class="token punctuation">,</span>
                <span class="token string">"index_key"</span><span class="token punctuation">:</span> index_s3_path<span class="token punctuation">,</span>
            <span class="token punctuation">}</span><span class="token punctuation">,</span>
        <span class="token punctuation">}</span>
    <span class="token keyword">except</span> Exception <span class="token keyword">as</span> e<span class="token punctuation">:</span>
        logger<span class="token punctuation">.</span>error<span class="token punctuation">(</span><span class="token string-interpolation"><span class="token string">f"An error occurred during the indexing process: </span><span class="token interpolation"><span class="token punctuation">{</span>e<span class="token punctuation">}</span></span><span class="token string">"</span></span><span class="token punctuation">)</span>
        <span class="token keyword">return</span> <span class="token punctuation">{</span><span class="token string">"status"</span><span class="token punctuation">:</span> <span class="token string">"error"</span><span class="token punctuation">,</span> <span class="token string">"message"</span><span class="token punctuation">:</span> <span class="token builtin">str</span><span class="token punctuation">(</span>e<span class="token punctuation">)</span><span class="token punctuation">}</span>
</code></pre></div><h4 id="summary-of-5.5" data-line="1034" class="code-line"><a class="header-anchor-link" href="#summary-of-5.5" aria-hidden="true"></a> Summary of 5.5</h4>
<ul data-line="1036" class="code-line">
<li data-line="1036" class="code-line"><code>load_json</code> loads a JSON file, and <code>vectorize_and_save</code> generates embeddings and creates a FAISS index.</li>
<li data-line="1037" class="code-line"><code>upload_faiss_to_s3</code> uploads the local index to S3.</li>
<li data-line="1038" class="code-line"><code>index_to_s3</code> consolidates the entire process, ensuring that the latest index is created and updated through regular batch processing.</li>
</ul>
<p data-line="1040" class="code-line">This enables automated batch processing to embed internal documents and maintain FAISS indexes for RAG searches.</p>
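<p>As a rough sketch of how <code>index_to_s3</code> might be exposed to a periodic workflow such as Step Functions, the handler below reads the two arguments from the invocation event and raises on failure so that the workflow sees the task as failed. The event field names (<code>json_s3_key</code>, <code>index_s3_path</code>) and the handler itself are assumptions for illustration, not the actual implementation, and <code>index_to_s3</code> is replaced by a stand-in so that the snippet runs on its own.</p>

```python
from typing import Any, Dict


def index_to_s3(json_s3_key: str, index_s3_path: str) -> Dict[str, Any]:
    """Stand-in for the article's index_to_s3 so that this sketch is runnable."""
    return {"status": "success", "output": {"index_key": index_s3_path}}


def lambda_handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Validate the (assumed) event fields and delegate to index_to_s3."""
    json_s3_key = event.get("json_s3_key")
    index_s3_path = event.get("index_s3_path")
    if not json_s3_key or not index_s3_path:
        raise ValueError("json_s3_key and index_s3_path are required.")

    result = index_to_s3(json_s3_key, index_s3_path)
    if result.get("status") != "success":
        # An unhandled exception fails the invocation, so the calling
        # workflow can apply its own retry or error-handling rules.
        raise RuntimeError(result.get("message", "Indexing failed."))
    return result


# Hypothetical event, as a scheduled workflow might pass it
print(lambda_handler({"json_s3_key": "docs.json", "index_s3_path": "rag/index"}, None))
```

<p>Raising instead of returning the error dictionary is a design choice: a returned dictionary would count as a successful invocation, and a failed indexing run could go unnoticed.</p>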
<h2 id="6.-summary" data-line="1042" class="code-line"><a class="header-anchor-link" href="#6.-summary" aria-hidden="true"></a> 6. Summary</h2>
<p data-line="1044" class="code-line">In this article, we covered the development background and technical implementation of our internal generative AI tool, an internal chatbot powered by an LLM and integrated into Slack. We also outlined the steps for implementing the RAG pipeline, sanitizing Confluence documents, building a search infrastructure using Embeddings and FAISS indexes, and extending functionality with features like translation and summarization.
This system enables employees to seamlessly integrate generative AI into their Slack workflow, allowing them to access advanced information capabilities without needing to learn new tools or commands.</p>
<h2 id="7.-future-outlook" data-line="1047" class="code-line"><a class="header-anchor-link" href="#7.-future-outlook" aria-hidden="true"></a> 7. Future Outlook</h2>
<p data-line="1049" class="code-line">We will actively work on the following improvements and expansions to further enhance our internal generative AI tool.</p>
<ul data-line="1051" class="code-line">
<li data-line="1051" class="code-line">
<p data-line="1051" class="code-line"><strong>Strengthening Azure-based deployment</strong>
We will fully integrate with Azure services such as Azure Functions and Azure Cosmos DB, significantly improving the performance and scalability of the RAG pipeline.</p>
<ul data-line="1053" class="code-line">
<li data-line="1053" class="code-line"><strong>Introducing Azure Cosmos DB Vector Search</strong>
We will implement vector search functionality on Azure Cosmos DB for NoSQL, enabling more advanced search capabilities.<span class="embed-block zenn-embedded zenn-embedded-card"><iframe id="zenn-embedded__d2adaffe7bc5e" src="https://embed.zenn.studio/card#zenn-embedded__d2adaffe7bc5e" data-content="https%3A%2F%2Flearn.microsoft.com%2Fja-jp%2Fazure%2Fcosmos-db%2Fnosql%2Fvector-search" frameborder="0" scrolling="no" loading="lazy"></iframe></span>
</li>
<li data-line="1056" class="code-line"><strong>Utilizing AI Document Intelligence</strong>
By actively incorporating AI Document Intelligence, we aim to expand the knowledge scope of RAG and enhance information utilization across a broader range of use cases.<span class="embed-block zenn-embedded zenn-embedded-card"><iframe id="zenn-embedded__94f21877b6e23" src="https://embed.zenn.studio/card#zenn-embedded__94f21877b6e23" data-content="https%3A%2F%2Flearn.microsoft.com%2Fja-jp%2Fazure%2Fai-services%2Fdocument-intelligence%2Foverview%3Fview%3Ddoc-intel-4.0.0" frameborder="0" scrolling="no" loading="lazy"></iframe></span>
</li>
</ul>
</li>
<li data-line="1060" class="code-line">
<p data-line="1060" class="code-line"><strong>Diversification and sophistication of models</strong>
We will continue integrating cutting-edge models by expanding support beyond GPT-4o to include OpenAI o1, Google Gemini, and other state-of-the-art AI models.</p>
</li>
<li data-line="1063" class="code-line">
<p data-line="1063" class="code-line"><strong>Implementing Web UI</strong>
To overcome the expression and interaction limitations imposed by Slack, we will develop a Web UI, allowing for more diverse interactions and the flexible deployment of new features.</p>
</li>
<li data-line="1066" class="code-line">
<p data-line="1066" class="code-line"><strong>Enhancing prompt management</strong>
We will template existing prompts, making them easily reusable across different use cases. Additionally, we will enhance the prompt-sharing functionality to further promote the adoption of generative AI across the company.</p>
</li>
<li data-line="1069" class="code-line">
<p data-line="1069" class="code-line"><strong>Realizing multi-agent capabilities</strong>
By deploying specialized agents dedicated to tasks such as summarization, translation, and RAG search, and allowing flexible combinations through an Agent Builder, we will enable more advanced and adaptable information processing.</p>
</li>
<li data-line="1072" class="code-line">
<p data-line="1072" class="code-line"><strong>Evaluating and improving RAG accuracy</strong>
We will build test sets and conduct automated answer evaluations to quantitatively measure accuracy and continuously improve quality.</p>
</li>
<li data-line="1075" class="code-line">
<p data-line="1075" class="code-line"><strong>Enhancements based on user feedback</strong>
By incorporating real-world usage data and feedback, we will optimize dialogue flows, fine-tune prompts, and strengthen external service integrations, ensuring that our internal generative AI tool remains highly convenient and useful.</p>
</li>
</ul>
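<p>To give a sense of what a quantitative check on RAG accuracy could look like, here is a minimal sketch of one retrieval-side metric, the hit rate at k: the share of test questions for which the expected document appears among the top-k search results. It is only one possible building block and does not evaluate the generated answers themselves. The test set, the document IDs, and the toy keyword retriever standing in for the FAISS search are all made up for illustration.</p>

```python
from typing import Callable, Dict, List


def hit_rate_at_k(
    test_set: List[Dict[str, str]],
    retrieve: Callable[[str, int], List[str]],
    k: int = 3,
) -> float:
    """Return the fraction of questions whose expected document ID is in the top-k results."""
    if not test_set:
        return 0.0
    hits = sum(1 for case in test_set if case["expected_id"] in retrieve(case["question"], k))
    return hits / len(test_set)


# Hypothetical test set and corpus (not real internal documents)
test_set = [
    {"question": "How do I request paid leave?", "expected_id": "doc-1"},
    {"question": "Where is the VPN setup guide?", "expected_id": "doc-2"},
]
corpus = {
    "doc-1": "paid leave request procedure",
    "doc-2": "expense report rules",
    "doc-3": "vpn setup guide",
}


def toy_retrieve(question: str, k: int) -> List[str]:
    """Rank documents by shared words; a toy stand-in for a vector search."""
    words = set(question.lower().replace("?", "").split())
    ranked = sorted(corpus, key=lambda doc_id: -len(words & set(corpus[doc_id].split())))
    return ranked[:k]


print(hit_rate_at_k(test_set, toy_retrieve, k=1))  # prints 0.5
```

<p>Tracking such a number across index rebuilds or chunking changes would show whether a change actually improved retrieval before it reaches users.</p>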
<p data-line="1078" class="code-line">Through these efforts, we will continue evolving our internal generative AI tool, growing it into a powerful internal support tool that meets a wide range of business needs.</p>