When NotFound Errors Abound in AWS CloudTrail: Exploring Solutions and Best Practices
Hey, I found a bunch of NotFound error events in AWS CloudTrail!
Hello. I am Kurihara from the Cloud Center of Excellence (CCoE) team at KINTO Technologies, and I still can't bring myself to dislike alcohol, even after watching the Japanese series "Drinking Habit 50." As Tada from my team previously described in CCoE Activities and Providing Google Cloud Security Preset Environments, we work every day to keep our cloud environments secure. While analyzing AWS CloudTrail logs to check the health of our AWS accounts, I noticed that a large number of NotFound-type errors were being recorded on a regular basis. This may sound mundane, but if you use AWS, chances are the same events are occurring in your accounts. I searched Google extensively and could not find any relevant information, so I decided to document my investigation in this blog post.
Conclusion
In short, when analyzing AWS CloudTrail logs, exclude the NotFound-type errors generated by the AWS Config recorder as it crawls resources through its service-linked role.
These error events are an unavoidable consequence of how AWS Config works, so filter them out to reduce noise in your analysis.
Details of Investigation
KINTO Technologies uses a multi-account configuration in which the landing zone is managed with AWS Control Tower, in line with best practices for AWS multi-account management. As a result, AWS Config manages configuration information and AWS CloudTrail manages audit logs.
While analyzing AWS CloudTrail logs to check the health of our AWS accounts, I found that NotFound-type error events were occurring in large numbers and at regular intervals.
Here are the results of analyzing about one month of CloudTrail logs from one AWS account with Amazon Athena. This account was provisioned with only minimal security settings, and no workload has been built in it.
-- Rank errorCode values by number of events
WITH filtered AS (
SELECT
*
FROM
cloudtrail_logs
WHERE
errorCode IS NOT NULL
)
SELECT
errorCode,
count(errorCode) AS eventCount,
-- integer percentage of all error events
count(errorCode) * 100 / (SELECT count(*) FROM filtered) AS errorRate
FROM
filtered
GROUP BY
errorCode
-- most frequent error codes first
ORDER BY
eventCount DESC
errorCode | eventCount | errorRate (%) |
---|---|---|
ResourceNotFoundException | 1,515 | 18 |
ReplicationConfigurationNotFoundError | 1,112 | 13 |
ObjectLockConfigurationNotFoundError | 958 | 11 |
NoSuchWebsiteConfiguration | 954 | 11 |
NoSuchCORSConfiguration | 952 | 11 |
InvalidRequestException | 627 | 7 |
Client.RequestLimitExceeded | 609 | 7 |
-- Check how often a specific errorCode occurs per day
SELECT
date(from_iso8601_timestamp(eventtime)) as "date",
count(*) as count
FROM
cloudtrail_logs
WHERE
errorcode = 'ResourceNotFoundException'
GROUP BY
date(from_iso8601_timestamp(eventtime))
ORDER BY
"date" ASC
LIMIT 5
date | count |
---|---|
2023-10-19 | 52 |
2023-10-20 | 80 |
2023-10-21 | 80 |
2023-10-22 | 80 |
2023-10-23 | 80 |
I picked a few error codes and looked at the corresponding AWS CloudTrail records (the actual records are listed at the end of this article). In every case, the arn field of userIdentity, which identifies the caller, was arn:aws:sts::${AWS_ACCOUNT_ID}:assumed-role/AWSServiceRoleForConfig/${SESSION_NAME}. This is the service-linked role attached to AWS Config. At first I could not understand why a NotFound error would occur when the target resource exists, but the eventName field showed that the failing calls were not the APIs that fetch the configuration of the resource itself. They were the APIs that fetch each of its dependent resources.
Resource | errorCode | API that was called (eventName) |
---|---|---|
Lambda | ResourceNotFoundException | GetPolicy20150331v2 |
S3 | ReplicationConfigurationNotFoundError | GetBucketReplication |
S3 | NoSuchCORSConfiguration | GetBucketCors |
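The mapping in the table above can also be reproduced outside Athena. The following is a minimal Python sketch, assuming the CloudTrail records have already been loaded as dictionaries; the function name `summarize_config_errors`, the account ID, and the trimmed sample records are hypothetical and only mirror the fields shown in the logs at the end of this article.

```python
from collections import Counter

# Marker taken from the userIdentity.arn values observed in this investigation
CONFIG_ROLE_MARKER = ":assumed-role/AWSServiceRoleForConfig/"


def summarize_config_errors(records):
    """Count (eventSource, eventName, errorCode) for failed calls made by the AWS Config role."""
    counts = Counter()
    for record in records:
        # userIdentity or arn can be absent, so fall back to an empty string
        arn = (record.get("userIdentity") or {}).get("arn") or ""
        error_code = record.get("errorCode")
        if error_code and CONFIG_ROLE_MARKER in arn:
            key = (record.get("eventSource"), record.get("eventName"), error_code)
            counts[key] += 1
    return counts


# Hypothetical, heavily trimmed records for illustration only
CONFIG_ARN = "arn:aws:sts::111122223333:assumed-role/AWSServiceRoleForConfig/AWSConfig-Describe"
sample_records = [
    {"userIdentity": {"arn": CONFIG_ARN}, "eventSource": "lambda.amazonaws.com",
     "eventName": "GetPolicy20150331v2", "errorCode": "ResourceNotFoundException"},
    {"userIdentity": {"arn": CONFIG_ARN}, "eventSource": "s3.amazonaws.com",
     "eventName": "GetBucketCors", "errorCode": "NoSuchCORSConfiguration"},
    {"userIdentity": {"arn": CONFIG_ARN}, "eventSource": "s3.amazonaws.com",
     "eventName": "GetBucketCors", "errorCode": "NoSuchCORSConfiguration"},
    # A successful call by the same role is not counted
    {"userIdentity": {"arn": CONFIG_ARN}, "eventSource": "s3.amazonaws.com",
     "eventName": "GetBucketAcl"},
    # An error caused by a different principal is not counted either
    {"userIdentity": {"arn": "arn:aws:sts::111122223333:assumed-role/SomeOtherRole/session"},
     "eventSource": "s3.amazonaws.com", "eventName": "GetObject", "errorCode": "AccessDenied"},
]

summary = summarize_config_errors(sample_records)
for (source, name, code), count in summary.most_common():
    print(source, name, code, count)
```

Grouping on the three fields together makes it easy to see that each error code comes from a "get the optional sub-configuration" style API rather than from the API that describes the resource itself.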
These errors do not affect workloads, but we would like to eliminate them because they add noise to general monitoring and troubleshooting. Eliminating them at the source, however, would require non-essential changes such as configuring something for each dependent resource (for example, adding a Lambda resource-based policy that allows the InvokeFunction action only from the function's own account).
Our CCoE team therefore decided to exclude access by the AWS Config service-linked role when analyzing AWS CloudTrail. With Amazon Athena, the idea looks like the following query.
SELECT
*
FROM
cloudtrail_logs
WHERE
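-- keep events that have no ARN; NOT LIKE alone would drop rows where arn is NULL
userIdentity.arn IS NULL
OR userIdentity.arn NOT LIKE '%AWSServiceRoleForConfig%'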
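The same exclusion can be applied to records that are processed outside Athena. This is a minimal Python sketch under the assumption that the records are available as dictionaries; `exclude_config_role_events`, the account ID, and the sample data are hypothetical names and values introduced for illustration.

```python
def is_config_role_event(record):
    """Return True when the caller is a session of the AWS Config service-linked role."""
    arn = (record.get("userIdentity") or {}).get("arn") or ""
    return ":assumed-role/AWSServiceRoleForConfig/" in arn


def exclude_config_role_events(records):
    """Drop every event issued by AWS Config; everything else is kept unchanged."""
    return [record for record in records if not is_config_role_event(record)]


# Hypothetical, trimmed records for illustration only
sample_events = [
    {"eventName": "GetBucketReplication",
     "userIdentity": {"arn": "arn:aws:sts::111122223333:assumed-role/AWSServiceRoleForConfig/AWSConfig-Describe"}},
    {"eventName": "ConsoleLogin",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"}},
    # Events where an AWS service is the caller carry no arn at all
    {"eventName": "Decrypt",
     "userIdentity": {"type": "AWSService", "invokedBy": "cloudtrail.amazonaws.com"}},
]

remaining = exclude_config_role_events(sample_events)
print([event["eventName"] for event in remaining])
```

Matching on the `:assumed-role/AWSServiceRoleForConfig/` segment is slightly stricter than a plain substring match on the role name, and records without an ARN are kept rather than silently discarded.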
A Brief Deep Dive
I will delve a bit further into how AWS Config records configuration information, based on insights gained during this investigation. Two points came up that are not explicitly stated in the official documentation.
- Recording behavior for dependent (supplemental) resources (my own term)
- Recording frequency for dependent (supplemental) resources
Dependent (supplemental) resource recording behavior
AWS Config records not only the configuration of the resource itself, but also its relationships to other resources. These are called direct relationships and indirect relationships.
AWS Config derives the relationships for most resource types from the configuration field, which are called "direct" relationships. A direct relationship is a one-way connection (A→B) between a resource (A) and another resource (B), typically obtained from the describe API response of resource (A). In the past, for some resource types that AWS Config initially supported, it also captured relationships from the configurations of other resources, creating "indirect" relationships that are bidirectional (B→A). For example, the relationship between an Amazon EC2 instance and its security group is direct because the security groups are included in the describe API response for the Amazon EC2 instance. On the other hand, the relationship between a security group and an Amazon EC2 instance is indirect because describing a security group does not return any information about the instances it is associated with. As a result, when a resource configuration change is detected, AWS Config not only creates a CI for that resource, but also generates CIs for any related resources, including those with indirect relationships. For example, when AWS Config detects changes in an Amazon EC2 instance, it creates a CI for the instance and a CI for the security group that is associated with the instance.
-- https://docs.aws.amazon.com/config/latest/developerguide/faq.html#faq-1
Separately from these related resources, there are what I call dependent (supplemental) resources: they look like settings of the resource itself, but they have their own retrieval APIs. In the case of Lambda, the function itself is a resource retrieved with GetFunction, whereas its resource-based policy is a separate resource retrieved with GetPolicy. In the Configuration Item (CI), the resource-based policy, as a dependent (supplemental) resource, is recorded in the supplementaryConfiguration field as follows:
{
"version": "1.3",
"accountId": "<$AWS_ACCOUNT_ID>",
"configurationItemCaptureTime": "2023-12-15T09:52:19.238Z",
"configurationItemStatus": "OK",
"configurationStateId": "************",
"configurationItemMD5Hash": "",
"arn": "arn:aws:lambda:ap-northeast-1:<$AWS_ACCOUNT_ID>:function:check-config-behavior",
"resourceType": "AWS::Lambda::Function",
"resourceId": "check-config-behavior",
"resourceName": "check-config-behavior",
"awsRegion": "ap-northeast-1",
"availabilityZone": "Not Applicable",
"tags": {
"Purpose": "investigate"
},
"relatedEvents": [],
# Related resources
"relationships": [
{
"resourceType": "AWS::IAM::Role",
"resourceName": "check-config-behavior-role-nkmqq3sh",
"relationshipName": "Is associated with "
}
],
... Omitted
# Dependent (supplemental) resources
"supplementaryConfiguration": {
"Policy": "{\"Version\":\"2012-10-17\",\"Id\":\"default\",\"Statement\":[{\"Sid\":\"test-poilcy\",\"Effect\":\"Allow\",\"Principal\":{\"AWS\":\"arn:aws:iam::<$AWS_ACCOUNT_ID>:root\"},\"Action\":\"lambda:InvokeFunction\",\"Resource\":\"arn:aws:lambda:ap-northeast-1:<$AWS_ACCOUNT_ID>:function:check-config-behavior\"}]}",
"Tags": {
"Purpose": "investigate"
}
}
}
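To read the dependent (supplemental) resource back out of a CI like the one above, note that the Policy value is itself a JSON document stored as a string. The following Python sketch shows one way to decode it; `extract_resource_policy` and the trimmed sample CI are hypothetical, and treating a missing Policy key as "no policy recorded" is my assumption rather than documented behavior.

```python
import json


def extract_resource_policy(configuration_item):
    """Return the resource-based policy of a CI as a dict, or None when none is present."""
    supplementary = configuration_item.get("supplementaryConfiguration") or {}
    policy = supplementary.get("Policy")
    if policy is None:
        # Assumption: no Policy entry means no resource-based policy was recorded
        return None
    # In the CI shown above the policy is a JSON document encoded as a string
    return json.loads(policy) if isinstance(policy, str) else policy


# Hypothetical CI trimmed to the fields used here
sample_ci = {
    "resourceType": "AWS::Lambda::Function",
    "resourceId": "check-config-behavior",
    "supplementaryConfiguration": {
        "Policy": json.dumps({
            "Version": "2012-10-17",
            "Id": "default",
            "Statement": [{
                "Sid": "test-policy",
                "Effect": "Allow",
                "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
                "Action": "lambda:InvokeFunction",
                "Resource": "arn:aws:lambda:ap-northeast-1:111122223333:function:check-config-behavior",
            }],
        }),
        "Tags": {"Purpose": "investigate"},
    },
}

policy = extract_resource_policy(sample_ci)
allowed_actions = [statement["Action"] for statement in policy["Statement"]]
print(allowed_actions)
```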
Frequency of recording dependent (supplemental) resources
How often AWS Config records CIs depends on the RecordingMode setting, but this does not appear to apply to dependent (supplemental) resources. I first suspected that the NotFound-type errors were being repeated by retry attempts. However, the observed behavior was that recording was attempted once every 12 or 24 hours, and the interval did not appear to follow any rule tied to the type of dependent (supplemental) resource. This is as far as my investigation got, as the behavior is largely a black box.
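One way to observe this kind of periodicity yourself is to compute the gaps between consecutive error events for a single resource. The sketch below is a minimal Python example, assuming the eventTime values have already been extracted as the ISO 8601 UTC strings CloudTrail records; `intervals_in_hours` and the timestamps are hypothetical and are not the actual measurements from this investigation.

```python
from datetime import datetime

# CloudTrail eventTime format as seen in the records in this article
EVENT_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def intervals_in_hours(event_times):
    """Return the gaps, in hours, between consecutive events after sorting by time."""
    parsed = sorted(datetime.strptime(value, EVENT_TIME_FORMAT) for value in event_times)
    return [
        (later - earlier).total_seconds() / 3600
        for earlier, later in zip(parsed, parsed[1:])
    ]


# Hypothetical timestamps for one function's GetPolicy errors
sample_times = [
    "2023-12-03T09:09:19Z",
    "2023-12-03T21:09:19Z",
    "2023-12-04T21:09:19Z",
]

gaps = intervals_in_hours(sample_times)
print(gaps)
```

Gaps that cluster around a few fixed values suggest a scheduled crawl, whereas retries would typically show up as much shorter gaps immediately after a failure.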
Summary
This article has explained what the mysterious NotFound-type error events recorded in AWS CloudTrail are and how to deal with them. I will investigate the details further, but I have already confirmed that similar error events are generated by the service-linked role for Amazon Macie. AWS CloudTrail analysis is tedious work, but it is also an opportunity to gain a deeper understanding of how AWS services behave, so let's do it proactively! If you are an engineer who wants to use AWS to its fullest, and who thinks Keisuke Koide is a talented actor, the Platform Group is hiring!
Finally, I will conclude this article by listing each AWS CloudTrail error event. Thank you for reading.
Lambda: ResourceNotFoundException
{
"eventVersion": "1.08",
"userIdentity": {
"type": "AssumedRole",
"principalId": "************:LambdaDescribeHandlerSession",
"arn": "arn:aws:sts::<$AWS_ACCOUNT_ID>:assumed-role/AWSServiceRoleForConfig/LambdaDescribeHandlerSession",
"accountId": "<$AWS_ACCOUNT_ID>",
"accessKeyId": "*********",
"sessionContext": {
"sessionIssuer": {
"type": "Role",
"principalId": "*********",
"arn": "arn:aws:iam::<$AWS_ACCOUNT_ID>:role/aws-service-role/config.amazonaws.com/AWSServiceRoleForConfig",
"accountId": "<$AWS_ACCOUNT_ID>",
"userName": "AWSServiceRoleForConfig"
},
"webIdFederationData": {},
"attributes": {
"creationDate": "2023-12-03T09:09:17Z",
"mfaAuthenticated": "false"
}
},
"invokedBy": "config.amazonaws.com"
},
"eventTime": "2023-12-03T09:09:19Z",
"eventSource": "lambda.amazonaws.com",
"eventName": "GetPolicy20150331v2",
"awsRegion": "ap-northeast-1",
"sourceIPAddress": "config.amazonaws.com",
"userAgent": "config.amazonaws.com",
"errorCode": "ResourceNotFoundException",
"errorMessage": "The resource you requested does not exist.",
"requestParameters": {
"functionName": "**************"
},
"responseElements": null,
"requestID": "******************",
"eventID": "******************",
"readOnly": true,
"eventType": "AwsApiCall",
"managementEvent": true,
"recipientAccountId": "<$AWS_ACCOUNT_ID>",
"eventCategory": "Management"
}
S3: ReplicationConfigurationNotFoundError
{
"eventVersion": "1.09",
"userIdentity": {
"type": "AssumedRole",
"principalId": "**********:AWSConfig-Describe",
"arn": "arn:aws:sts::<$AWS_ACCOUNT_ID>:assumed-role/AWSServiceRoleForConfig/AWSConfig-Describe",
"accountId": "<$AWS_ACCOUNT_ID>",
"accessKeyId": "*************",
"sessionContext": {
"sessionIssuer": {
"type": "Role",
"principalId": "*************",
"arn": "arn:aws:iam::<$AWS_ACCOUNT_ID>:role/aws-service-role/config.amazonaws.com/AWSServiceRoleForConfig",
"accountId": "<$AWS_ACCOUNT_ID>",
"userName": "AWSServiceRoleForConfig"
},
"attributes": {
"creationDate": "2023-12-03T13:09:16Z",
"mfaAuthenticated": "false"
}
},
"invokedBy": "config.amazonaws.com"
},
"eventTime": "2023-12-03T13:09:55Z",
"eventSource": "s3.amazonaws.com",
"eventName": "GetBucketReplication",
"awsRegion": "ap-northeast-1",
"sourceIPAddress": "config.amazonaws.com",
"userAgent": "config.amazonaws.com",
"errorCode": "ReplicationConfigurationNotFoundError",
"errorMessage": "The replication configuration was not found",
"requestParameters": {
"replication": "",
"bucketName": "*********",
"Host": "*************"
},
"responseElements": null,
"additionalEventData": {
"SignatureVersion": "SigV4",
"CipherSuite": "ECDHE-RSA-AES128-GCM-SHA256",
"bytesTransferredIn": 0,
"AuthenticationMethod": "AuthHeader",
"x-amz-id-2": "**************",
"bytesTransferredOut": 338
},
"requestID": "**********",
"eventID": "*************",
"readOnly": true,
"resources": [
{
"accountId": "<$AWS_ACCOUNT_ID>",
"type": "AWS::S3::Bucket",
"ARN": "arn:aws:s3:::***********"
}
],
"eventType": "AwsApiCall",
"managementEvent": true,
"recipientAccountId": "<$AWS_ACCOUNT_ID>",
"vpcEndpointId": "vpce-***********",
"eventCategory": "Management"
}
S3: NoSuchCORSConfiguration
{
"eventVersion": "1.09",
"userIdentity": {
"type": "AssumedRole",
"principalId": "***********:AWSConfig-Describe",
"arn": "arn:aws:sts::<$AWS_ACCOUNT_ID>:assumed-role/AWSServiceRoleForConfig/AWSConfig-Describe",
"accountId": "<$AWS_ACCOUNT_ID>",
"accessKeyId": "***************",
"sessionContext": {
"sessionIssuer": {
"type": "Role",
"principalId": "*************",
"arn": "arn:aws:iam::<$AWS_ACCOUNT_ID>:role/aws-service-role/config.amazonaws.com/AWSServiceRoleForConfig",
"accountId": "<$AWS_ACCOUNT_ID>",
"userName": "AWSServiceRoleForConfig"
},
"attributes": {
"creationDate": "2023-12-03T13:09:16Z",
"mfaAuthenticated": "false"
}
},
"invokedBy": "config.amazonaws.com"
},
"eventTime": "2023-12-03T13:09:55Z",
"eventSource": "s3.amazonaws.com",
"eventName": "GetBucketCors",
"awsRegion": "ap-northeast-1",
"sourceIPAddress": "config.amazonaws.com",
"userAgent": "config.amazonaws.com",
"errorCode": "NoSuchCORSConfiguration",
"errorMessage": "The CORS configuration does not exist",
"requestParameters": {
"bucketName": "********",
"Host": "*************************8",
"cors": ""
},
"responseElements": null,
"additionalEventData": {
"SignatureVersion": "SigV4",
"CipherSuite": "ECDHE-RSA-AES128-GCM-SHA256",
"bytesTransferredIn": 0,
"AuthenticationMethod": "AuthHeader",
"x-amz-id-2": "*********************",
"bytesTransferredOut": 339
},
"requestID": "***********",
"eventID": "*****************",
"readOnly": true,
"resources": [
{
"accountId": "<$AWS_ACCOUNT_ID>",
"type": "AWS::S3::Bucket",
"ARN": "arn:aws:s3:::*************"
}
],
"eventType": "AwsApiCall",
"managementEvent": true,
"recipientAccountId": "<$AWS_ACCOUNT_ID>",
"vpcEndpointId": "vpce-********",
"eventCategory": "Management"
}