vectordotdev/vector

Timestamp being overwritten (or removed) by cloudwatch logs sink

Open

#15346 opened on November 25, 2022

(11 comments) (2 reactions) (0 assignees) Rust (21,837 stars) (2,126 forks)
Labels: good first issue, sink: aws_cloudwatch_logs

Description

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

When using Vector to forward logs that contain a timestamp field, the timestamp is stripped from each event and replaced with the time of submission to CloudWatch Logs.

Configuration

[sources.audit_trail]
type = "http_client"
endpoint = "$URL"
scrape_interval_secs = 60

[transforms.remap_data]
inputs = [ "audit_trail"]
type = "remap"
source = '''
  parsed = parse_json!(.message)
  # input is a batch of logs in a .data field (jsonapi format)
  . = parsed.data
'''

[sinks.aws_cloudwatch_logs]
type = "aws_cloudwatch_logs"
inputs = [ "remap_data" ]
create_missing_group = false
create_missing_stream = false
group_name = "$AWS_LOG_GROUP"
compression = "none"
region = "$AWS_REGION"
stream_name = "$AWS_LOG_STREAM"
[sinks.aws_cloudwatch_logs.encoding]
codec = "json"

Version

0.25.1

Debug Output

No response

Example Data

{
    "data": [
        {
            "id": "2bc31f88-99e4-4030-b92f-f710fb48ed28",
            "version": "0",
            "type": "Resource",
            "timestamp": "2022-11-24T21:06:29.000Z",
            "auth": {
                "type": "Impersonated",
                "accessor_id": "user-34r07v09n35c4isj",
                "description": "clang",
                "impersonator_id": "user-cy6ylswoknfwj9eg",
                "organization_id": "org-7le9vo0it2cazal5"
            },
            "request": {
                "id": "bccb9c14-c311-469c-a255-814ef3ad72d1"
            },
            "resource": {
                "id": "nc-7pitueipxj7qbtw4",
                "type": "notification_configuration",
                "action": "queue",
                "meta": {
                    "related": "user-3adufxrzoy3sp8nh"
                }
            }
        },
      ...
    ],
    "pagination": {
        "current_page": 1,
        "prev_page": null,
        "next_page": 2,
        "total_pages": 5,
        "total_count": 5000
    }
}

Additional Context

I don't have a snapshot of the CloudWatch Logs output, but each event received has its keys sorted in ascending order as expected, except that the timestamp field has been removed. The timestamp displayed on the CloudWatch log entry is not the timestamp from the object. Either it is being set by Vector, or it is generated on AWS's end, in which case the bug here is just that timestamp is disappearing from my objects.

I think in theory I could work around the issue by adding a new field to each entry (something like original_timestamp), but I am still new to Vector. Would I need to get fancy with for_each?
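A minimal sketch of that workaround, offered as an untested assumption rather than a confirmed fix: when a remap transform assigns an array to `.`, Vector emits one event per array element, so a second remap placed after remap_data sees each entry individually and should not need for_each. The transform name keep_timestamp is hypothetical, and whether the sink leaves original_timestamp untouched has not been verified here.

```toml
# Hypothetical extra transform; the name "keep_timestamp" is illustrative only.
[transforms.keep_timestamp]
type = "remap"
inputs = [ "remap_data" ]
source = '''
  # Each event here is one element of the original .data array.
  # Copy the value into a field the sink is not expected to treat specially.
  .original_timestamp = .timestamp
'''

# The sink would then read from the new transform instead of remap_data:
# [sinks.aws_cloudwatch_logs]
# inputs = [ "keep_timestamp" ]
```

If this behaves as assumed, the encoded JSON would still lose timestamp but keep original_timestamp with the source value.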

References

No response
