AWS S3 via SQS
Description
The AWS S3 Source enables you to use log data stored in Amazon S3 as a Pipeline source via Amazon's Simple Queue Service (SQS). You would typically use Amazon S3 as a Pipeline source when you need to re-hydrate data for further analysis, for example after an incident or for compliance. Once S3 data begins to flow through the Pipeline, you can use pipeline Processors to drop data you don't want to collect and to parse specific fields, then route the parsed data to a metric analysis tool like Prometheus, or to Mezmo Log Analysis. By re-hydrating, parsing, and routing only the Amazon S3 data you need to your analytical tools, you can minimize the cost of re-ingesting your data.
Configure SQS and S3
Because you will need some values from your S3 bucket configuration to complete the Access Policy configuration for SQS, you should set up your S3 bucket before you configure SQS. When you set up SQS, make sure that the queue is in the same region as your S3 bucket, or you will not be able to connect it to S3. You only need a basic S3 configuration to start; you will modify it to send event notifications through SQS as part of these instructions.
Configure SQS
- Log in to the AWS console and navigate to Simple Queue Service.
- On the Amazon SQS landing page, click Create queue.
- For Type, make sure Standard is selected.
- Enter a Name for the service, for example `mezmo-pipeline`.
- Leave all the other settings at the default, and at the bottom of the page click Create queue.
- On the Queue details page, select the Access policy tab.
- Click Edit for the Access policy, and enter the JSON definition at the end of these instructions.
- In the JSON definition, enter these values, then click Save.
  - For `Resource`, enter the value for ARN in the Details section for SQS.
  - For `aws:SourceArn`, navigate to your S3 bucket, open the Properties tab, and enter the value for Amazon Resource Name (ARN).
  - For `aws:SourceAccount`, enter the ID of your AWS account. You can find your Account ID by selecting your account name in the upper-right corner of the AWS console.
Configure S3
- Navigate to your S3 bucket.
- Select the Properties tab.
- Scroll to the Event notifications section, and click Create event notification.
- Enter an Event name, for example `mezmo-sqs-bucket-notify`.
- Under Event types, for Object creation, select Put. This will send an event notification only when a new file or folder is added to the bucket. If you want to trigger notifications for other types of events, select those events.
- Under Destination, choose `SQS queue` and then choose the queue you created above.
- Click Save changes.
This completes the configuration of SQS and your S3 bucket. Now you can configure the Mezmo S3 source to receive S3 event notifications via SQS.
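For readers who prefer to script the bucket side of this setup, the event notification created above can be expressed as a plain data structure. The following is a minimal sketch, not an official Mezmo or AWS procedure: the helper name `build_queue_notification` is hypothetical, and the queue ARN, account number, and bucket name are placeholders.

```python
def build_queue_notification(queue_arn, event_name="mezmo-sqs-bucket-notify"):
    """Return an S3 notification configuration that sends Put events to one SQS queue."""
    return {
        "QueueConfigurations": [
            {
                "Id": event_name,                    # the Event name from the console
                "QueueArn": queue_arn,               # the queue created in Configure SQS
                "Events": ["s3:ObjectCreated:Put"],  # Object creation > Put
            }
        ]
    }


# Placeholder ARN, for illustration only.
config = build_queue_notification("arn:aws:sqs:us-east-1:123456789012:mezmo-pipeline")
print(config)

# With boto3 installed and credentials configured, the configuration could be applied with:
#   import boto3
#   boto3.client("s3").put_bucket_notification_configuration(
#       Bucket="awsexamplebucket1", NotificationConfiguration=config)
```

Note that `put_bucket_notification_configuration` replaces the bucket's entire notification configuration, so any existing notifications would need to be included in the same call.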
SQS Access Policy JSON Definition
{
  "Version": "2012-10-17",
  "Id": "example-ID",
  "Statement": [
    {
      "Sid": "example-statement-ID",
      "Effect": "Allow",
      "Principal": {
        "Service": "s3.amazonaws.com"
      },
      "Action": [
        "SQS:SendMessage"
      ],
      "Resource": "SQS-queue-ARN",
      "Condition": {
        "ArnLike": {
          "aws:SourceArn": "arn:aws:s3:*:*:awsexamplebucket1"
        },
        "StringEquals": {
          "aws:SourceAccount": "bucket-owner-account-id"
        }
      }
    }
  ]
}
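To avoid hand-editing the three placeholder values, the policy can be generated from the queue ARN, bucket name, and account ID. This is a minimal sketch under stated assumptions: the function name `build_sqs_access_policy` is hypothetical, and the ARN and account ID shown are placeholders rather than real resources.

```python
import json


def build_sqs_access_policy(queue_arn, bucket_name, account_id):
    """Fill in the SQS access policy template with values from your own account."""
    return {
        "Version": "2012-10-17",
        "Id": "example-ID",
        "Statement": [
            {
                "Sid": "example-statement-ID",
                "Effect": "Allow",
                "Principal": {"Service": "s3.amazonaws.com"},
                "Action": ["SQS:SendMessage"],
                "Resource": queue_arn,
                "Condition": {
                    # Only notifications from this bucket may send messages.
                    "ArnLike": {"aws:SourceArn": f"arn:aws:s3:*:*:{bucket_name}"},
                    # Only when the bucket is owned by this account.
                    "StringEquals": {"aws:SourceAccount": account_id},
                },
            }
        ],
    }


# Placeholder values, for illustration only.
policy = build_sqs_access_policy(
    queue_arn="arn:aws:sqs:us-east-1:123456789012:mezmo-pipeline",
    bucket_name="awsexamplebucket1",
    account_id="123456789012",
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the Access policy editor on the Queue details page.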
AWS Access Key Required IAM Permissions
Name | Purpose |
---|---|
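s3:GetObject | Operation: read the objects referenced by the event notifications |
sqs:ReceiveMessage | Operation: receive event notifications from the queue |
sqs:DeleteMessage | Operation: delete notifications from the queue after they are handled |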
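The three permissions in the table could be granted to the access key's IAM user with a policy along these lines. This is a sketch only: the function name `build_source_iam_policy` is hypothetical, the ARNs are placeholders, and scoping each statement to a single bucket and queue is an assumption about how you may want to restrict access, not a stated Mezmo requirement.

```python
import json


def build_source_iam_policy(bucket_name, queue_arn):
    """Build an IAM policy granting the three permissions listed in the table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                # Assumption: limit reads to objects in a single bucket.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {
                "Effect": "Allow",
                "Action": ["sqs:ReceiveMessage", "sqs:DeleteMessage"],
                # Assumption: limit queue access to the notification queue.
                "Resource": queue_arn,
            },
        ],
    }


# Placeholder values, for illustration only.
iam_policy = build_source_iam_policy(
    "awsexamplebucket1",
    "arn:aws:sqs:us-east-1:123456789012:mezmo-pipeline",
)
print(json.dumps(iam_policy, indent=2))
```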
Mezmo Source Configuration
- Select the Amazon S3 Source in your Mezmo Telemetry Pipeline and select Edit config.
- Navigate to your SQS queue and copy the URL in the Details section.
- In the Amazon S3 Source configuration, copy the URL into the SQS Queue URL field.
- Enter the information for your AWS Authentication, and the Region for your SQS Queue and S3 bucket.
- Select the Compression method used for your S3 objects. The `Compression` setting of `auto` will attempt to use the `Content-Encoding`, `Content-Type`, and `key` suffix to determine which encoding is in use. If the compression cannot be determined, it will default to `none`.
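To illustrate what the `auto` setting describes, the sketch below checks the three hints in the order the text lists them and falls back to `none`. It is not Mezmo's implementation: the function name `detect_compression`, the set of recognized formats (gzip and zstd), and the specific header values and suffixes are all assumptions chosen for the example.

```python
def detect_compression(content_encoding=None, content_type=None, key=""):
    """Guess an object's compression from its headers, then from its key suffix."""
    # 1. Content-Encoding header (assumed values).
    encoding = (content_encoding or "").strip().lower()
    if encoding in ("gzip", "zstd"):
        return encoding

    # 2. Content-Type header (assumed media types).
    media_type = (content_type or "").strip().lower()
    if media_type in ("application/gzip", "application/x-gzip"):
        return "gzip"
    if media_type == "application/zstd":
        return "zstd"

    # 3. Object key suffix (assumed extensions).
    lowered_key = key.lower()
    if lowered_key.endswith(".gz"):
        return "gzip"
    if lowered_key.endswith(".zst"):
        return "zstd"

    # Nothing matched, so treat the object as uncompressed.
    return "none"


print(detect_compression(key="logs/app.log.gz"))
print(detect_compression(content_type="text/plain", key="logs/app.log"))
```

If your objects carry misleading headers or suffixes, selecting the compression method explicitly avoids relying on this kind of guesswork.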
Mezmo Configuration Options
Option | Description |
---|---|
SQS Queue URL | The URL of an AWS SQS queue configured to receive S3 bucket notifications for the S3 buckets where your source data is stored. |
AWS Authentication - Access Key ID | The access key ID for your AWS account. |
AWS Authentication - Secret Access Key | The secret access key for your AWS account. |
Compression | The compression format for the S3 objects. |
Region | The name of the region for the S3 bucket. |
Included Metadata
Field | Format | Description |
---|---|---|
file_name | String | The name of the file being pulled |
size | Number | The number of bytes to be pulled |
bucket | String | The S3 bucket the file came from |
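These fields correspond to values that S3 places in each event notification it sends to SQS. The sketch below shows one way they could be read from a notification body. The function name `extract_metadata` and the sample message are hypothetical, and mapping `file_name` to the object key is an assumption; the `Records[].s3.bucket.name`, `s3.object.key`, and `s3.object.size` paths follow the standard S3 event message structure.

```python
import json
from urllib.parse import unquote_plus


def extract_metadata(message_body):
    """Pull file_name, size, and bucket out of an S3 event notification body."""
    event = json.loads(message_body)
    results = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        results.append(
            {
                # Object keys are URL-encoded in notifications ("+" means space).
                "file_name": unquote_plus(s3["object"]["key"]),
                "size": s3["object"].get("size"),
                "bucket": s3["bucket"]["name"],
            }
        )
    return results


# Hypothetical, trimmed-down notification body for illustration only.
sample_body = json.dumps(
    {
        "Records": [
            {
                "eventName": "ObjectCreated:Put",
                "s3": {
                    "bucket": {"name": "awsexamplebucket1"},
                    "object": {"key": "logs/app+2024.log.gz", "size": 1024},
                },
            }
        ]
    }
)
metadata = extract_metadata(sample_body)
print(metadata)
```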