KINTO Tech Blog

A Day in the Life of a KTC Cloud Security Engineer


Introduction

Hello, I'm Kuwahara from the SCoE Group at the Osaka Tech Lab in KINTO Technologies (KTC).

SCoE stands for Security Center of Excellence, a term that might not be widely recognized yet. In April 2024, KTC restructured the CCoE team into the SCoE group. To learn more about the SCoE Group, check out the article SCoE Group: Leading the Evolution of Cloud Security.

For more details about the Osaka Tech Lab, KTC's Kansai base, visit Introduction to Osaka Tech Lab.

The mission of the SCoE Group is to "implement real-time guardrail monitoring and improvement activities" across AWS, Google Cloud, and Azure environments. These activities focus on three key areas:

  • Preventing security risks
  • Continuously monitoring and analyzing security risks
  • Responding promptly when a security risk arises

In this post, I’ll provide a closer look at the work of KTC’s cloud security engineers.

A Day in the Life of a Cloud Security Engineer

To give a clearer picture, I'd like to walk you through a typical day for a cloud security engineer. (Please note that, due to the sensitive nature of the field, some aspects cannot be shared in detail.)

Checking Alerts

The first thing we do in the morning is check for high-risk alerts. We use CSPM (Cloud Security Posture Management) and threat detection services to understand the security status of the entire cloud environment and to see whether any alerts require immediate action. For this purpose, KTC uses services such as AWS Security Hub, Amazon GuardDuty, and Sysdig Secure.

In checking alerts, the following are considered:

  • Alert prioritization: Alerts are classified and prioritized based on their severity and scope of impact.
  • Alert triage: The cause of each alert is identified and the necessary action is taken.
  • Management of false positives ("over-detection"): Security tools sometimes produce false positives, reporting activity that is not actually a problem as an alert. A cloud security engineer also manages these as part of alert handling.
  • Identification of operations required for work: This is related to managing false positives, but some alerts may be triggered by operations required for work. For example, this includes maintenance tasks regularly performed by the person in charge of each product. A cloud security engineer identifies these activities and responds to them appropriately.

Working through alerts in this way lets us keep an accurate view of the security status of the entire cloud environment and separate the alerts that need immediate action from those that do not.
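As a rough illustration of the considerations listed above, the following Python sketch orders alerts by severity and scope of impact while setting aside known false positives and expected operations. The `Alert` fields, the severity weights, and the suppression sets are all hypothetical and are not taken from Security Hub, GuardDuty, or Sysdig Secure; real tools define their own schemas.

```python
from dataclasses import dataclass

# Hypothetical severity weights; real tools define their own scales.
SEVERITY_WEIGHT = {"CRITICAL": 4, "HIGH": 3, "MEDIUM": 2, "LOW": 1}


@dataclass(frozen=True)
class Alert:
    rule_id: str
    severity: str           # one of the keys in SEVERITY_WEIGHT
    affected_accounts: int  # rough proxy for scope of impact
    principal: str          # who or what triggered the alert


def triage(alerts, known_false_positives, expected_operations):
    """Split alerts into an ordered action queue and a suppressed list.

    known_false_positives: set of rule IDs already judged to be over-detection.
    expected_operations: set of (rule_id, principal) pairs for routine work,
    such as regular maintenance by a product team.
    """
    queue, suppressed = [], []
    for alert in alerts:
        if alert.rule_id in known_false_positives:
            suppressed.append(alert)
        elif (alert.rule_id, alert.principal) in expected_operations:
            suppressed.append(alert)
        else:
            queue.append(alert)
    # Highest severity first, then widest scope of impact.
    queue.sort(
        key=lambda a: (SEVERITY_WEIGHT[a.severity], a.affected_accounts),
        reverse=True,
    )
    return queue, suppressed
```

Suppressed alerts are returned rather than dropped, so that false positives and expected operations can still be reviewed later.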

Information Gathering and Catch-up

Next, a cloud security engineer catches up on cybersecurity trends and the latest information on cloud services such as AWS. The following are used as information sources.

  • X (formerly Twitter): Cloud security engineers follow cybersecurity experts and industry leaders on X. These accounts share the latest threat information and countermeasures, which allows for real-time information gathering.
  • Official news and blogs from AWS and Google Cloud: Official information from cloud service providers is an important source of information about new feature releases and security updates. This helps cloud security engineers stay informed about new service launches, the latest technological trends, and best practices.
  • Other news sites: By regularly checking news sites and blogs focused on cybersecurity, cloud security engineers can understand trends across the industry and catch up with the latest threats and attack methods.

Threat Detection with SIEM

KTC uses Splunk Cloud Platform as its SIEM (Security Information and Event Management). We aggregate security-related logs in Splunk, which gives us an environment for cross-sectional analysis and monitoring of those logs.

That day, I discovered a suspicious log on the Splunk dashboard. The log stated: "Attempted to create a resource for a service restricted by Google Cloud's organizational policy, but the operation failed."

The dashboard we built in Splunk for Google Cloud audit logs was enough to show the general activity, but we decided to investigate in more detail.

First, we identified the users who were repeatedly retrying to create resources for services restricted by Google Cloud organizational policies. User information is masked before it is written to Google Cloud's audit log (policy_denied), so users cannot be identified from this log alone. Instead, we identify them by analyzing it across other logs, such as device logs. We created a query for this analysis and identified the users in question.
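The article does not describe the actual query, so the following Python sketch only illustrates the general idea of this kind of cross-log correlation. It assumes, purely for illustration, that the denied events and the device logs share a source IP address and can be matched within a short time window; every field name here is hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def repeated_denials(denied_events, min_attempts=3):
    """Return (project, service, source_ip) keys with at least min_attempts denials."""
    counts = Counter(
        (e["project"], e["service"], e["source_ip"]) for e in denied_events
    )
    return {key: n for key, n in counts.items() if n >= min_attempts}


def candidate_users(denied_events, device_events, window=timedelta(minutes=5)):
    """Map each denied event's source IP to users seen on that IP near the same time.

    Assumption: the join key is the source IP plus a time window. The real
    analysis may rely on entirely different fields.
    """
    users = {}
    for denied in denied_events:
        for device in device_events:
            same_ip = device["ip"] == denied["source_ip"]
            close_in_time = abs(device["time"] - denied["time"]) <= window
            if same_ip and close_in_time:
                users.setdefault(denied["source_ip"], set()).add(device["user"])
    return users
```

The result is a set of candidates rather than a confirmed identity, which is why further behavioral analysis is still needed.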

Next, we create queries to further analyze the behavior of the identified users and analyze the logs.

The identified user appeared to be trying to use Vertex AI, Google Cloud's AI/ML service. No request to use Compute services had been made for the project in question, so Compute services were restricted there by the organizational policy. However, using a Notebook with Vertex AI launches a Compute Engine (GCE) instance, and that launch is what violated the organizational policy.
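The reasoning in this step can be sketched as a simple set difference between the services a feature needs and the services a project was approved for. The dependency map below is a hypothetical, hand-maintained example and is not a complete list of what Vertex AI requires; the function name is also made up for illustration.

```python
# Hypothetical, hand-maintained map of features to the services they rely on.
# It is illustrative only and not a complete dependency list.
UNDERLYING_SERVICES = {
    "vertex-ai-notebook": {"aiplatform.googleapis.com", "compute.googleapis.com"},
}


def missing_approvals(feature, approved_services):
    """Return the services a feature needs that the project was not approved for."""
    required = UNDERLYING_SERVICES.get(feature, set())
    return required - set(approved_services)
```

For a project approved only for `aiplatform.googleapis.com`, the function returns `compute.googleapis.com`, which corresponds to the gap found in this investigation.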

Ultimately, we determined that the activity was harmless: the services to be used had not been fully listed when the new Google Cloud project was applied for.

Cost Optimization for Cloud Vendor-Native Security Services

Security services provided by cloud vendors are charged on a pay-as-you-go billing basis, so as cloud resources increase, the security service charge also increases.

Our idea of "security" is "security for business," and "security that hinders business" is unacceptable. The balance between security and cost is therefore an important point, and cost optimization of security services is part of the SCoE Group's mission.

On that day, I investigated the potential for cost optimization of several security services that accounted for a high proportion of the overall cost.

The graph above shows the services that were targeted in the analysis this time. Among them, I paid particular attention to AWS Config. (Specific item names have been masked.)

AWS Config is a service for auditing, evaluating, and recording the configuration of AWS resources. Until November 2023, the only recording method for AWS Config was “a method that records every time a change in resource configuration has occurred." This method is called "recording frequency: continuous recording." In other words, if the frequency of resource changes is high, the number of records in AWS Config increases, and the usage fee increases proportionally.

As an example, let's look at network-related events. The graph below shows the number of VPC and network-related configuration changes in an AWS account over a one-week period.

You can see that CreateNetworkInterface and DeleteNetworkInterface, which correspond to the creation and deletion of Elastic Network Interfaces (ENIs), occur approximately 17,000 times per day. KTC runs Amazon Elastic Container Service (ECS) on AWS Fargate, so an ENI is created and deleted each time an ECS task (container) starts and stops. Under these circumstances, if AWS Config is set to "Recording frequency: Continuous recording," the number of AWS Config records associated with these changes becomes huge, and the charge increases accordingly.
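To get a feel for the scale, here is a back-of-the-envelope estimate in Python. It makes two simplifying assumptions that are not from the article: that each ENI event results in exactly one recorded configuration item, and that the unit price is the placeholder value below. Check the AWS Config pricing page for actual figures.

```python
from decimal import Decimal

ENI_EVENTS_PER_DAY = 17_000  # approximate figure quoted in this article

# Placeholder unit price per recorded configuration item (USD).
# This is an assumption for illustration, not an official figure.
ASSUMED_PRICE_PER_ITEM = Decimal("0.003")


def monthly_recording_cost(items_per_day, price_per_item, days=30):
    """Estimate a monthly charge as items recorded x unit price x days.

    Simplification: one configuration item per change event.
    """
    return Decimal(items_per_day) * price_per_item * days


estimate = monthly_recording_cost(ENI_EVENTS_PER_DAY, ASSUMED_PRICE_PER_ITEM)
print(f"Estimated monthly cost for ENI changes alone: ${estimate}")
```

Because the charge is linear in the number of recorded items, short-lived resources such as the ENIs of Fargate tasks tend to dominate this kind of estimate.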

However, in November 2023, AWS Config added a new option, "Recording frequency: Daily recording." With this feature, the recording frequency can be adjusted for each resource type, which provides flexibility in balancing security and cost. In general, this setting is considered helpful for optimizing the cost of AWS Config.
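As a minimal sketch of what a per-resource-type setting could look like, the function below builds a configuration recorder definition in the shape accepted by boto3's `put_configuration_recorder` (as its `ConfigurationRecorder` argument). The function name, recorder name, and role ARN are hypothetical, and `recordingGroup` is omitted for brevity. The function only builds a dictionary and calls no AWS API. As this article goes on to explain, applying such a change in an account managed by AWS Control Tower is not recommended.

```python
def build_recorder_with_daily_override(recorder_name, role_arn, daily_resource_types):
    """Build a recorder definition: daily for the given types, continuous otherwise.

    The returned dict is intended for boto3's
    config_client.put_configuration_recorder(ConfigurationRecorder=...).
    recorder_name and role_arn are supplied by the caller (hypothetical here).
    """
    return {
        "name": recorder_name,
        "roleARN": role_arn,
        "recordingMode": {
            # Default frequency for every resource type not listed below.
            "recordingFrequency": "CONTINUOUS",
            "recordingModeOverrides": [
                {
                    "description": "Record frequently changing resources once a day",
                    "resourceTypes": list(daily_resource_types),
                    "recordingFrequency": "DAILY",
                }
            ],
        },
    }


recorder = build_recorder_with_daily_override(
    "default",
    "arn:aws:iam::123456789012:role/example-config-role",  # hypothetical ARN
    ["AWS::EC2::NetworkInterface"],
)
```

Keeping the default at continuous recording and overriding only the noisy resource types limits the loss of change history to the resources where it matters least.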

However, this is only the case if you are not using AWS Control Tower. AWS Control Tower is a service for centrally managing the governance of multiple AWS accounts.

If you use AWS Control Tower to manage AWS Config in your AWS account, check Guidance for creating and modifying AWS Control Tower resources.

Please pay attention to the following sentence at the beginning of the guidance:

Do not modify or delete any resources created by AWS Control Tower, including resources in the management account, in the shared accounts, and in member accounts. If you modify these resources, you may be required to update your landing zone or re-register an OU, and modification can result in inaccurate compliance reporting.

As this statement indicates, modifying or deleting resources created by AWS Control Tower by any means other than AWS Control Tower is not recommended.

Specifically, as of December 2024, AWS Control Tower does not provide a feature for changing the AWS Config recording frequency. Changing the recording frequency of AWS Config resources managed by AWS Control Tower is therefore not recommended, and the official documentation states that doing so may cause problems.

Taking into account the content of the official documentation, I also contacted AWS Support just to be sure and received the same opinion.

In this way, when “a setting itself is possible but poses the risk of problems or is not recommended,” it becomes difficult to maintain stable cloud security and governance. The result could be "security that hinders business".

In light of the above, we decided to postpone changing the recording frequency of AWS Config for now and submitted an improvement request to AWS Support. I believe that proposing such an improvement request to enhance the convenience of cloud services is a modest yet very important initiative.

Preparation for a Security Study Session

Finally, I created presentation materials for our regularly held in-house security and privacy study sessions.

The SCoE Group has formulated "Cloud Security Guidelines" that summarize the key points of cloud security for the "requirements definition," "design," and "development" phases of product development, and has made them available in-house. This set of guidelines is an important resource for ensuring compliance with the security policies of the group companies to which KTC belongs, minimizing security risks, and supporting efficient development.

I host study sessions to raise awareness and enhance understanding of the Cloud Security Guidelines. In the study sessions, I provide detailed explanations of each item in the guidelines, while also incorporating specific cases and practical advice.

On that day, I created presentation materials on IAM (Identity and Access Management) best practices, ensuring the materials were concise enough to fit within a 20-minute timeframe.

Conclusion

I’ve shared a glimpse into a day in the life of a cloud security engineer at KTC. While this is just a snapshot, I hope it helped you gain a better understanding of what the role entails.

The SCoE Group is currently looking for new team members. Whether you have hands-on experience in cloud security or are simply passionate about the field, we’d love to hear from you. Feel free to reach out to us.

For more information, please check here.
