vectordotdev/vector

Safeguard `reduce` transform memory use

Open

#3027, created July 10, 2020

1 comment, 2 reactions, 0 assignees. Rust, 21,837 stars, 2,126 forks.
Labels: domain: performance, domain: reliability, good first issue, have: should, transform: reduce, type: enhancement

Description

There are two closely related potential issues with the merge transform as implemented in #2870.

First, we have no upper limit on the potential memory use of the merge states. If the identifier fields are misconfigured, we could end up storing an unbounded amount of data in memory until it expires, and the expiration window is large enough that the chance of OOM is real. Ideally, we should have a configurable solution here similar to the tag cardinality limit transform.
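To make the first point concrete, here is a minimal sketch of one possible safeguard: a hard cap on the number of distinct merge states. It is not Vector's implementation. `MergeState`, `BoundedStates`, `Outcome`, and `max_states` are hypothetical names invented for illustration, and the sketch bounds the number of identifiers rather than the bytes held per identifier.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the per-identifier merge state (not Vector's type).
#[derive(Debug, Default)]
struct MergeState {
    events: Vec<String>,
}

/// Result of offering an event to the bounded store (hypothetical).
#[derive(Debug, PartialEq)]
enum Outcome {
    /// The event was folded into an existing or newly created state.
    Merged,
    /// A new identifier would exceed the cap, so the event was not stored.
    Rejected,
}

/// Merge states keyed by identifier, with an assumed configurable cap.
struct BoundedStates {
    max_states: usize,
    states: HashMap<String, MergeState>,
}

impl BoundedStates {
    fn new(max_states: usize) -> Self {
        Self { max_states, states: HashMap::new() }
    }

    /// Adds `event` under `id`, refusing to create a new state once
    /// `max_states` distinct identifiers are already being tracked.
    fn push(&mut self, id: &str, event: String) -> Outcome {
        if !self.states.contains_key(id) && self.states.len() >= self.max_states {
            return Outcome::Rejected;
        }
        self.states.entry(id.to_string()).or_default().events.push(event);
        Outcome::Merged
    }

    /// Number of identifiers currently tracked.
    fn len(&self) -> usize {
        self.states.len()
    }
}
```

What the caller does with a `Rejected` event (drop it, or pass it through unmerged) is a policy choice that a real option would need to expose. Existing identifiers keep accepting events after the cap is reached, so a complete solution would probably also limit the size of each state.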

The second and much smaller issue is that the hashmap we use to store those merge states is never resized down. This means it will remain the largest size it has ever grown to, even if that was an outlier. I suspect that this is not much of an issue, because the vast majority of state stored in that hashmap is heap-allocated and will get reclaimed as it expires. That being said, it's still worth investigating how much memory is used by the hashmap itself and if there are any convenient points at which we could resize it down.
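One convenient point to resize the map down might be right after expired states are flushed. The sketch below assumes a simplified map from identifier to an expiry timestamp. `flush_expired` and the 4x / 16-entry thresholds are hypothetical choices for illustration, not values taken from Vector. It uses only the standard `HashMap::retain`, `HashMap::capacity`, and `HashMap::shrink_to_fit` methods.

```rust
use std::collections::HashMap;

/// Removes entries whose expiry time has passed, then shrinks the table if it
/// is mostly empty. Returns the capacity after the flush so callers can
/// observe the effect. The value type is a bare expiry timestamp here; a real
/// merge state would carry the accumulated event data as well.
fn flush_expired(states: &mut HashMap<String, u64>, now: u64) -> usize {
    // Keep only the states that have not yet expired.
    states.retain(|_, expires_at| *expires_at > now);

    // Shrink only when capacity is far above the live entry count, so that
    // ordinary fluctuation does not trigger repeated reallocation. The floor
    // of 16 entries avoids shrinking tables that are already small.
    if states.capacity() > 4 * states.len().max(16) {
        states.shrink_to_fit();
    }
    states.capacity()
}
```

Comparing `capacity()` before and after a flush on a realistic workload would also give a rough answer to how much memory the table itself holds, since the table's own allocation scales with its capacity rather than its length.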
