RemoteFile Output Cannot Handle SSE-KMS Encrypted S3 Buckets
#16744 opened on Apr 9, 2025
Description
Relevant telegraf.conf
[[outputs.remotefile]]
data_format = "json"
files = [
"{{.Name}}-{{.Time.Format \"2006-01-02\"}}"
]
remote = "s3,env_auth=true,server_side_encryption=aws:kms,sse_kms_key_id=arn:aws:kms:ap-southeast-2:123456789:key/123456789,provider=AWS,region=ap-southeast-2:my-bucket"
Logs from Telegraf
2025-04-09T06:36:57Z I! Loading config: /etc/telegraf/telegraf.conf
2025-04-09T06:36:57Z I! Starting Telegraf 1.34.1 brought to you by InfluxData the makers of InfluxDB
2025-04-09T06:36:57Z I! Available plugins: 239 inputs, 9 aggregators, 33 processors, 26 parsers, 63 outputs, 6 secret-stores
2025-04-09T06:36:57Z I! Loaded inputs: netflow
2025-04-09T06:36:57Z I! Loaded aggregators:
2025-04-09T06:36:57Z I! Loaded processors: date
2025-04-09T06:36:57Z I! Loaded secretstores:
2025-04-09T06:36:57Z I! Loaded outputs: file health remotefile
2025-04-09T06:36:57Z I! Tags enabled: host=telegraf-polling-service
2025-04-09T06:36:57Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"telegraf-polling-service", Flush Interval:10s
2025-04-09T06:36:57Z W! [agent] The default value of 'skip_processors_after_aggregators' will change to 'true' with Telegraf v1.40.0! If you need the current default behavior, please explicitly set the option to 'false'!
2025-04-09T06:36:57Z D! [agent] Initializing plugins
2025-04-09T06:36:57Z I! [inputs.netflow] Loaded 0 PEN mappings...
2025-04-09T06:36:57Z D! [agent] Connecting outputs
2025-04-09T06:36:57Z D! [agent] Attempting connection to [outputs.health]
2025-04-09T06:36:57Z I! [outputs.health] Listening on http://[::]:8888
2025-04-09T06:36:57Z I! Config watcher started for /etc/telegraf/telegraf.conf
2025-04-09T06:36:57Z D! [agent] Successfully connected to outputs.health
2025-04-09T06:36:57Z D! [agent] Attempting connection to [outputs.remotefile]
Unfortunately, there are no specific errors or anything else to help me dig into what is going wrong. I have already enabled both the debug argument and the agent debug setting through the following Helm chart values.yaml configuration:
args: ["--watch-config", "notify", "--debug"]
config:
agent:
debug: true
The exit code is 127.
System info
Telegraf v1.34.1, Linux x86, EKS v1.30
Docker
N/A
Steps to reproduce
- Create an S3 bucket with KMS encryption (SSE-KMS).
- Configure the Telegraf remote setting to use the KMS options described in the rclone documentation: https://rclone.org/s3/#key-management-system-kms `remote = "s3,env_auth=true,server_side_encryption=aws:kms,sse_kms_key_id=arn:aws:kms:ap-southeast-2:123456789:key/123456789,provider=AWS,region=ap-southeast-2:my-bucket"`
- Generate input netflow data that should be outputted to remotefile.
Expected behavior
Telegraf should be able to connect to S3 with the KMS key and write files.
Actual behavior
Telegraf fails to connect to S3 and write files. It keeps crashing with exit code 127.
Additional info
I believe this is caused by the way the Telegraf code expects the bucket name to follow a colon (":"). The KMS configuration also contains colons, both in the KMS key ARN and in the "aws:kms" value. I suspect the parsing logic does not handle multiple colons in the configuration.
When I disable SSE-KMS and move back to SSE-S3, which is less secure, and remove the extra server_side_encryption=aws:kms,sse_kms_key_id=arn:aws:kms:ap-southeast-2:123456789:key/123456789 settings, files are uploaded to S3 as expected.
I need to be able to write to an SSE-KMS encrypted bucket for security and organisational reasons.
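To illustrate the multiple-colon hypothesis, here is a minimal Python sketch. Both helper functions are hypothetical and do not reproduce Telegraf's or rclone's actual parsing code. The quoted variant follows the rule in rclone's connection string documentation that values containing ":" or "," must be wrapped in quotes; whether the remotefile output accepts that form has not been verified here.

```python
# Hypothetical helpers for illustration only; NOT Telegraf's or rclone's parser.

UNQUOTED = (
    "s3,env_auth=true,server_side_encryption=aws:kms,"
    "sse_kms_key_id=arn:aws:kms:ap-southeast-2:123456789:key/123456789,"
    "provider=AWS,region=ap-southeast-2:my-bucket"
)

# Same settings, with the colon-containing values wrapped in double quotes.
QUOTED = (
    's3,env_auth=true,server_side_encryption="aws:kms",'
    'sse_kms_key_id="arn:aws:kms:ap-southeast-2:123456789:key/123456789",'
    "provider=AWS,region=ap-southeast-2:my-bucket"
)


def split_first_colon(remote):
    """Naive split: treat everything after the first ':' as the path."""
    config, _, path = remote.partition(":")
    return config, path


def split_quote_aware(remote):
    """Split at the first ':' that is not inside single or double quotes."""
    quote = None
    for i, ch in enumerate(remote):
        if quote:
            if ch == quote:  # closing quote ends the quoted value
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == ":":
            return remote[:i], remote[i + 1:]
    return remote, ""


if __name__ == "__main__":
    # The naive split cuts the string inside "aws:kms".
    print(split_first_colon(UNQUOTED))
    # The quote-aware split finds the bucket name after the last option.
    print(split_quote_aware(QUOTED))
```

With the unquoted string, the naive split ends the backend options at `server_side_encryption=aws` and treats the rest as the path. With the quoted string, the quote-aware split yields `my-bucket` as the path. In telegraf.conf the quoted form would need a single-quoted TOML string or escaped inner quotes.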
Below is my AWS IAM Policy for Telegraf:
{
"Version": "2012-10-17",
"Statement": [
{
"Condition": {
"StringEquals": {
"aws:RequestedRegion": "ap-southeast-2"
}
},
"Action": [
"s3:PutObject",
"s3:PutObjectAcl",
"s3:ListBucket",
"s3:GetBucketLocation",
"s3:GetObject"
],
"Resource": [
"arn:aws:s3:::123456789",
"arn:aws:s3:::123456789/*"
],
"Effect": "Allow"
},
{
"Condition": {
"StringEquals": {
"aws:RequestedRegion": "ap-southeast-2"
}
},
"Action": [
"kms:Decrypt",
"kms:GenerateDataKey",
"kms:DescribeKey"
],
"Resource": "arn:aws:kms:ap-southeast-2:123456789:key/123456789",
"Effect": "Allow",
"Sid": "AllowKMSKeyAccess"
}
]
}
Another Test Scenario (No Crashing but Functionality Not Working)
I have also tried keeping SSE-KMS encryption on the S3 bucket and simply using the Telegraf remote config without any explicit KMS settings. This connects successfully, but files are never written to S3. The logs look as if everything is working, but nothing is actually uploaded.
S3 Bucket uses SSE-KMS
Config:
remote = "s3,env_auth=true,provider=AWS,region=ap-southeast-2:my-bucket"
Logs:
2025-04-10T01:21:08Z I! Loading config: /etc/telegraf/telegraf.conf
2025-04-10T01:21:08Z I! Starting Telegraf 1.34.1 brought to you by InfluxData the makers of InfluxDB
2025-04-10T01:21:08Z I! Available plugins: 239 inputs, 9 aggregators, 33 processors, 26 parsers, 63 outputs, 6 secret-stores
2025-04-10T01:21:08Z I! Loaded inputs: netflow
2025-04-10T01:21:08Z I! Loaded aggregators:
2025-04-10T01:21:08Z I! Loaded processors: enum
2025-04-10T01:21:08Z I! Loaded secretstores:
2025-04-10T01:21:08Z I! Loaded outputs: file health remotefile
2025-04-10T01:21:08Z I! Tags enabled:
2025-04-10T01:21:08Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"telegraf-polling-service", Flush Interval:10s
2025-04-10T01:21:08Z W! [agent] The default value of 'skip_processors_after_aggregators' will change to 'true' with Telegraf v1.40.0! If you need the current default behavior, please explicitly set the option to 'false'!
2025-04-10T01:21:08Z D! [agent] Initializing plugins
2025-04-10T01:21:08Z I! [inputs.netflow] Loaded 0 PEN mappings...
2025-04-10T01:21:08Z D! [agent] Connecting outputs
2025-04-10T01:21:08Z D! [agent] Attempting connection to [outputs.remotefile]
2025-04-10T01:21:08Z I! Config watcher started for /etc/telegraf/telegraf.conf
2025-04-10T01:21:09Z D! [outputs.remotefile] Connected to S3 bucket 123456789 with 1.1 PB total, 0 B used and 1.1 PB free!
2025-04-10T01:21:09Z D! [agent] Successfully connected to outputs.remotefile
2025-04-10T01:21:09Z D! [agent] Attempting connection to [outputs.file]
2025-04-10T01:21:09Z D! [agent] Successfully connected to outputs.file
2025-04-10T01:21:09Z D! [agent] Attempting connection to [outputs.health]
2025-04-10T01:21:09Z I! [outputs.health] Listening on http://[::]:8888
2025-04-10T01:21:09Z D! [agent] Successfully connected to outputs.health
2025-04-10T01:21:09Z D! [agent] Starting service inputs
2025-04-10T01:21:09Z I! [inputs.netflow] Listening on udp://[::]:2055
...
{"fields":{"observation_domain_id":67108864,"sampling_packet_interval":1,"sampling_packet_space":0,"selector_algo":"systematic count-based sampling"},"name":"netflow_options","tags":{"source":"10.199.65.39","version":"IPFIX"},"timestamp":1744248238}
{"fields":{"dst":"::","flow_end_ms":1744248238133,"flow_label":"0x00000000","flow_start_ms":1744248238133,"icmp_code":1,"icmp_type":0,"in_bytes":3,"in_dst_mac":"f3:ad:60:f5:ab:da","in_packets":2,"in_phy_interface":2165659240,"in_src_mac":"e5:58:fb:49:63:31","ip_version":"IPv4","ipv6_extensions":"0x00000000","ipv6_next_header":0,"out_phy_interface":579231806,"out_vlan_customer_id":0,"out_vlan_id":0,"protocol":"udp","src":"::","src_port":2601,"src_tos":"0x00","tcp_flags":"........","vlan_customer_id":2210,"vlan_id":3592},"name":"netflow","tags":{"source":"10.199.65.39","version":"IPFIX"},"timestamp":1744248238}
2025-04-10T01:23:59Z D! [outputs.file] Wrote batch of 20 metrics in 608.192µs
2025-04-10T01:23:59Z D! [outputs.file] Buffer fullness: 0 / 10000 metrics
2025-04-10T01:23:59Z D! [outputs.remotefile] Wrote batch of 20 metrics in 2.826698ms
2025-04-10T01:23:59Z D! [outputs.remotefile] Buffer fullness: 0 / 10000 metrics
As you can see, Telegraf reports writing to both file and remotefile, but nothing appears in S3, and no errors are logged. Once I disable SSE-KMS and switch to SSE-S3, remotefile starts writing to S3 straight away. This suggests that with the rclone config without KMS settings, remotefile can still connect but cannot actually write files to the SSE-KMS bucket. When I add the KMS settings, Telegraf never connects and constantly crashes.