KINTO Tech Blog

Deployment Process in CloudFront Functions and Operational Kaizen


Introduction

Nice to meet you. I am Shirai, a Cloud Infrastructure Engineer in the Cloud Infrastructure Team of the Platform Group at KINTO Technologies Corporation. I usually design and build infrastructure for systems running on AWS. My hobbies are table tennis and video games. Recently, I bought the remake of Super Mario RPG and played it while soaking in nostalgia. In this article, I will introduce the deployment process for CloudFront Functions that we built at KINTO Technologies, along with the operational kaizen behind it and its background!

KINTO Technologies' Cloud Infrastructure Team

Before getting into the main topic, I'd like to share a bit about our team. At KINTO Technologies, infrastructure is built and managed as code (IaC) with Terraform. For the historical background, please refer to the article by Mr. Shimakawa of the same team, "How to Abstract Terraform and Reduce the Man-hours Required to Build an Environment."

Current Challenges

KINTO Technologies currently uses CloudFront Functions (hereafter, CF2) for redirect processing in some systems. For more details on CF2, see the post by Mr. Iki, another team member, titled "Edge Functions Available with CloudFront."

While using CF2 at KINTO Technologies, the following three challenges have been raised:

  1. High communication costs between the Application Team and the Cloud Infrastructure Team
  2. The Application Team is not authorized to view logs output to CloudWatch Logs
  3. Logs output to CloudWatch Logs remain unexpired

This article describes how we solved these three challenges.

Digging Deeper Into Challenges

1. High communication costs

Until now, CF2 has been applied through the following process.
Deployment process to date

Because deployment relies on the Cloud Infrastructure Team, whenever there is a problem with the CF2 source code, the Cloud Infrastructure Team must re-execute steps (2) to (4) in the above diagram. The problems with this flow include:

  • The update of CF2 depends on the Cloud Infrastructure Team
  • When CF2 is updated, the Cloud Infrastructure Team must also review the scope of impact and coordinate with the Application Team

The above two points have resulted in high communication costs.

2. The Application Team cannot view logs

KINTO Technologies (KTC) restricts the permissions granted to the Application Team. As a result, the team does not have permission to view CF2 logs and cannot investigate when a problem occurs in CF2.

3. CF2 logs are kept permanently

Until now, CF2 has been built without setting up a CloudWatch Logs log group in advance. By the CF2 specification, a log group named /aws/cloudfront/function/${FunctionName} is automatically created in CloudWatch Logs in the us-east-1 region when CF2 outputs logs. A log group created this way has no retention period, so the logs persist indefinitely and drive up costs.
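To make the naming rule and the missing retention period concrete, here is a minimal sketch. It assumes log group records shaped like the output of the CloudWatch Logs DescribeLogGroups operation, where `retentionInDays` is absent when the group never expires. The function names and sample data are hypothetical and not part of our actual tooling.

```javascript
// Prefix that CF2 uses for its automatically created log groups.
var CF2_LOG_GROUP_PREFIX = "/aws/cloudfront/function/";

// Build the log group name for a given CF2 function name.
function cf2LogGroupName(functionName) {
    return CF2_LOG_GROUP_PREFIX + functionName;
}

// Return the names of CF2 log groups that never expire.
// Assumption: each record has logGroupName and, only when a retention
// period is set, retentionInDays (as in DescribeLogGroups output).
function findNeverExpiringCf2LogGroups(logGroups) {
    return logGroups
        .filter(function (group) {
            return group.logGroupName.indexOf(CF2_LOG_GROUP_PREFIX) === 0 &&
                group.retentionInDays == null;
        })
        .map(function (group) {
            return group.logGroupName;
        });
}

// Hypothetical sample data for illustration only.
var sampleLogGroups = [
    { logGroupName: cf2LogGroupName("dev-shop-redirect-cf2") },
    { logGroupName: cf2LogGroupName("dev-shop-other-cf2"), retentionInDays: 30 },
    { logGroupName: "/aws/lambda/unrelated" }
];

console.log(findNeverExpiringCf2LogGroups(sampleLogGroups));
```

Only the first sample entry is reported, because it is a CF2 log group with no retention period.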

Solutions

The problems and solutions are summarized below.

No. | Issue | Solution
1 | High communication cost | Grant additional permissions to the Application Team to deploy at any time
2 | Application Team not being able to view logs | Add log view permission to the Application Team
3 | CF2 logs remain permanent | Create log group with expiration period first

Now, I would like to dig deeper into each solution.

Issue 1: High communication costs

As mentioned above, the policy is to allow the Application Team to deploy at any time. To achieve this, I decided to revamp the deployment process.

First, I will show you an example of the configuration before CF2 was built and an example of the configuration after the process was revamped.

Example of the configuration before CF2 was built

Example of final configuration

I would like to add more details about the DEVELOPMENT stage and LIVE stage of CF2. The LIVE stage holds the version of CF2 that is actually associated with CloudFront and processes requests. The DEVELOPMENT stage, by contrast, is mainly used for development, and lets you test the function against sample requests before it is published to the LIVE stage.

Next, I would like to briefly explain the maintenance role and CICD user listed in red text.

Maintenance role and CICD user

Each role is as follows.

  • Duty of the maintenance role
    • Monitoring and updating various AWS services on the AWS Management Console.
    • At KINTO Technologies, we log in to the AWS Management Console through SSO, using an account provided for each environment. After the SSO login, switching to an appropriately authorized maintenance role lets you view and update the necessary AWS services manually.
      • Because various products share the same account, we restrict view and update permissions to prevent operational mistakes.
  • Duty of the CICD user
    • Updating various AWS services using CI/CD tools such as GitHub Actions.
      • Its permissions are those needed to deploy applications, and they depend on the AWS resources each product uses. For example, a product that deploys Lambda and ECS is granted permission to deploy both, while a product that deploys only ECS is granted permission to deploy ECS only.

Existing maintenance roles and CICD users were not granted CF2 permission, so the following permissions were added.

Additional permission
        {
            "Action": [
                "cloudfront:UpdateFunction",
                "cloudfront:TestFunction",
                "cloudfront:PublishFunction",
                "cloudfront:ListFunctionTestEvent",
                "cloudfront:GetFunction",
                "cloudfront:DescribeFunction",
                "cloudfront:DeleteFunctionTestEvent",
                "cloudfront:CreateFunctionTestEvent"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:cloudfront::{AccountID}:function/${aws:PrincipalTag/environment}-${aws:PrincipalTag/sid}-*",
            "Sid": ""
        },
        {
            "Action": [
                "cloudfront:ListFunctions"
            ],
            "Effect": "Allow",
            "Resource": "*",
            "Sid": ""
        }
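The Resource pattern in the first statement scopes each team to its own functions through principal tags. The sketch below is a simplified illustration of that idea, not IAM's actual evaluation engine: it substitutes `${aws:PrincipalTag/...}` variables and then applies wildcard matching. The tag values `dev` and `shop` and the function names are hypothetical.

```javascript
// Replace ${aws:PrincipalTag/<key>} placeholders with the caller's tag values.
function resolvePrincipalTags(pattern, principalTags) {
    return pattern.replace(/\$\{aws:PrincipalTag\/([^}]+)\}/g, function (match, key) {
        // Leave the placeholder as-is when the tag is missing,
        // so the pattern cannot match a real ARN.
        return Object.prototype.hasOwnProperty.call(principalTags, key)
            ? principalTags[key]
            : match;
    });
}

// Convert a wildcard pattern (* and ?) into an anchored RegExp.
function wildcardToRegExp(pattern) {
    var escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
    return new RegExp("^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$");
}

// Simplified check: does the resolved Resource pattern cover this ARN?
function isFunctionAllowed(resourcePattern, principalTags, functionArn) {
    var resolved = resolvePrincipalTags(resourcePattern, principalTags);
    return wildcardToRegExp(resolved).test(functionArn);
}

var resourcePattern =
    "arn:aws:cloudfront::111111111111:function/" +
    "${aws:PrincipalTag/environment}-${aws:PrincipalTag/sid}-*";
var principalTags = { environment: "dev", sid: "shop" }; // hypothetical tags

console.log(isFunctionAllowed(resourcePattern, principalTags,
    "arn:aws:cloudfront::111111111111:function/dev-shop-redirect-cf2")); // true
console.log(isFunctionAllowed(resourcePattern, principalTags,
    "arn:aws:cloudfront::111111111111:function/dev-other-redirect-cf2")); // false
```

With this naming scheme, a role tagged for one product can update only functions whose names start with its own environment and product identifier.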

As a side note, like Lambda, CF2 lets you submit test requests at the DEVELOPMENT stage. This required the *TestEvent permissions, but those actions were not listed in the official documentation, so I identified the necessary permissions from CloudTrail and added them. It was a good reminder that the official documentation isn't everything.

Next, I'll talk about the division of responsibilities between the Cloud Infrastructure Team and the Application Team.

The Roles of the Cloud Infrastructure Team and the Application Team

Task | Cloud Infrastructure Team | Application Team
Grant CF2 permissions | ✓ | -
Create sample app and link to CloudFront | ✓ | -
Develop CloudFront Functions, publish to LIVE stage | - | ✓
Operate and monitor CF2 | - | ✓

Let's take a look at the actual process of deploying (publishing) to the LIVE stage.

Deployment process

1. The Application Team asks the Cloud Infrastructure Team to build the resources

The request is made by issuing a Jira ticket based on the following template.


CF2 naming: hogehoge
e.g.) redirect-cf2
List of environments to build: xxx
CloudFront ARN to associate: arn:aws:cloudfront::{AccountID}:distribution/{DistributionID}
e.g.) arn:aws:cloudfront::111111111111:distribution/EXXXXXXXXXXXXX

Associate cache behaviors Viewer request Viewer response
hogehoge -

2. The Cloud Infrastructure Team builds the resources

  • Associate the sample CF2 (a pass-through function that returns the request unchanged), created by the Cloud Infrastructure Team, with the CloudFront behavior.
  • Grant the Application Team's maintenance role and CICD user the permissions for development and deployment.
Sample app's CF2
// Pass-through sample: returns the viewer request unchanged.
function handler(event) {
    var request = event.request;
    return request;
}


The Cloud Infrastructure Team updates and creates the necessary resources. The red frame is the target.

3. The Application Team updates the CF2 code in the DEVELOPMENT stage

The source code in the DEVELOPMENT stage can be updated in either of the following ways.

  1. Manually from the AWS Management Console, using the maintenance role
  2. With CI/CD tools such as GitHub Actions, using the CICD user credentials

You can run tests on the AWS Management Console or with CI/CD tools.
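As an illustration of what the Application Team might develop and test at this step, here is a minimal sketch of a redirect function with a sample viewer-request event. The paths `/old-path` and `/new-path` and the event values are hypothetical. The code uses `var` and plain functions so that it stays within the syntax that the CloudFront Functions JavaScript runtime accepts.

```javascript
// Hypothetical redirect rule: send requests for /old-path to /new-path.
function handler(event) {
    var request = event.request;
    if (request.uri === "/old-path") {
        // Returning a response object from a viewer-request function makes
        // CloudFront answer directly without forwarding the request.
        return {
            statusCode: 301,
            statusDescription: "Moved Permanently",
            headers: { location: { value: "/new-path" } }
        };
    }
    // All other requests pass through unchanged.
    return request;
}

// Sample viewer-request test event (hypothetical values).
var testEvent = {
    version: "1.0",
    context: { eventType: "viewer-request" },
    viewer: { ip: "198.51.100.1" },
    request: {
        method: "GET",
        uri: "/old-path",
        querystring: {},
        headers: { host: { value: "example.com" } },
        cookies: {}
    }
};

// Local check before uploading the code to the DEVELOPMENT stage.
console.log(JSON.stringify(handler(testEvent)));
```

The same event object can be reused as a test event for the function in the DEVELOPMENT stage, so that local and in-stage results can be compared.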


Development and testing

4. The Application Team publishes the CF2 code to the LIVE stage

As with updating the DEVELOPMENT stage, publishing to the LIVE stage can be performed from the AWS Management Console or with CI/CD tools such as GitHub Actions.
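Steps 3 and 4 can be sketched as a single flow for a CI/CD job. This is a hedged sketch, not our actual pipeline: it assumes a client object `cf` whose `describeFunction`, `updateFunction`, `testFunction`, and `publishFunction` methods take the same parameters as the CloudFront API operations of the same names (for example, the aggregated `CloudFront` client in AWS SDK for JavaScript v3). The name `deployFunction` is hypothetical.

```javascript
// Update the DEVELOPMENT stage, test it, then publish to the LIVE stage.
// Assumption: `cf` mirrors the CloudFront API (DescribeFunction, UpdateFunction,
// TestFunction, PublishFunction) and returns promises.
async function deployFunction(cf, name, code, testEvent) {
    var encoder = new TextEncoder();

    // 1. Read the current DEVELOPMENT version to get its ETag and config.
    var current = await cf.describeFunction({ Name: name, Stage: "DEVELOPMENT" });

    // 2. Upload the new code to the DEVELOPMENT stage.
    var updated = await cf.updateFunction({
        Name: name,
        IfMatch: current.ETag,
        FunctionConfig: current.FunctionSummary.FunctionConfig,
        FunctionCode: encoder.encode(code)
    });

    // 3. Test the DEVELOPMENT stage and stop if the function reports an error.
    var tested = await cf.testFunction({
        Name: name,
        IfMatch: updated.ETag,
        Stage: "DEVELOPMENT",
        EventObject: encoder.encode(JSON.stringify(testEvent))
    });
    if (tested.TestResult.FunctionErrorMessage) {
        throw new Error("Test failed: " + tested.TestResult.FunctionErrorMessage);
    }

    // 4. Publish: copy the tested DEVELOPMENT version to the LIVE stage.
    return cf.publishFunction({ Name: name, IfMatch: updated.ETag });
}
```

Passing the client in as an argument keeps the flow easy to dry-run with a stand-in object. The four operations it calls are covered by the permissions added to the maintenance role and CICD user earlier.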


Final Configuration

Issue 2: The Application Team cannot view logs

We granted view permission on the log group.

Permissions to view Logs Insights and log groups
        {
            "Action": [
                "logs:StartQuery",
                "logs:GetLogGroupFields",
                "logs:GetLogEvents"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:logs:us-east-1:{AccountID}:log-group:/aws/cloudfront/function/${aws:PrincipalTag/environment}-${aws:PrincipalTag/sid}-*-cloudfront-function:log-stream:*",
            "Sid": ""
        }

Because the maintenance role now has the permissions above to view the log group and use Logs Insights, the Application Team can see the CF2 logs. As a result, I believe the Application Team can now take the lead in addressing problems as they occur.
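As an example of how these permissions might be used, the sketch below builds the parameters for a CloudWatch Logs Insights query (the StartQuery operation) against a CF2 log group. The helper name, the function name, and the query string are hypothetical. Because the log group lives in us-east-1, the request would need to target that region.

```javascript
// Build StartQuery parameters for the most recent log lines of a CF2 function.
// startTime and endTime are expressed in seconds since the Unix epoch.
function buildStartQueryParams(functionName, minutesBack, nowMs) {
    var endTime = Math.floor(nowMs / 1000);
    return {
        logGroupName: "/aws/cloudfront/function/" + functionName,
        startTime: endTime - minutesBack * 60,
        endTime: endTime,
        queryString: "fields @timestamp, @message | sort @timestamp desc | limit 50"
    };
}

// Hypothetical function name that follows the suffix used in the policy above.
var params = buildStartQueryParams(
    "dev-shop-redirect-cloudfront-function", 30, Date.now());
console.log(params);
```

The resulting object can be entered in the Logs Insights console by hand or passed to a CloudWatch Logs client in a script.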

Issue 3: CF2 logs remain unexpired

A CloudWatch Logs log group with a retention period is now created when CF2 is built. We achieved this by adding the log group to the Terraform module used to create CF2.

resource "aws_cloudwatch_log_group" "this" {
  name              = "/aws/cloudfront/function/${local.function_name}"
  retention_in_days = var.cwlogs_retention_in_days == null ? var.env.log_retention_in_days : var.cwlogs_retention_in_days
}

Summary/Conclusion

This time, we implemented three improvements for CF2. Let me summarize them in bullet points.

  • Issue 1: High communication costs
    • Solution: Organize permissions and processes to enable the Application Team to deploy on their own
    • Effect: The Application Team can now deploy at any time, so communication is limited to what is actually needed.
  • Issue 2: The Application Team cannot view logs
    • Solution: Grant the Application Team permission to view logs
    • Effect: Even when problems occur, they can check the logs and respond by themselves
  • Issue 3: CF2 logs remain permanent
    • Solution: Create the destination log group with a retention period in advance
    • Effect: Logs now have a defined retention period, which contributes to cost optimization.

Thank you for reading my article all the way to the end!
