WAL Replay not using more memory than before the restart · prometheus/prometheus#16942

8 comments · 0 reactions · 0 assignees · Go · 64,042 stars · 10,408 forks

Labels: component/wal, help wanted, not-as-easy-as-it-looks

Description

This is to track some ideas from https://github.com/prometheus/prometheus/issues/6934

The current algorithm tries to replay the WAL as fast as possible after a restart, which can use more memory than Prometheus uses during normal operation.

This is problematic when Prometheus is under pressure (tons of metrics and a low memory limit) and some operation, like an expensive query or API call, is OOM-ing it. Recovery is then impossible because startup uses even more memory, so the WAL has to be removed manually.

For OOMs with no specific query or API call as the trigger, where memory use is simply high because too many series are scraped, this feature (reducing startup memory use) will not help on its own. It might, however, unlock other options, such as compacting or truncating on start to move the bulk of the load into TSDB blocks for further debugging and work.

Please use this issue if you have thoughts about replay memory consumption alone. To discuss OOM detection ideas and general OOM handling or safeguards, let's use https://github.com/prometheus/prometheus/issues/13939. For general unexpected OOMs, where Prometheus clearly uses an unexpected amount of memory given the scraped/ingested load you put through it, please open a separate issue.

Acceptance Criteria

A mode (or the default behavior, if fast enough) where Prometheus startup does not use more memory than "normal" use.

Ideas

Add a mode where Prometheus slows down and replays the WAL in segments, guaranteeing stable resource usage.
Truncate before replay (https://github.com/prometheus/prometheus/issues/13939)
General garbage optimizations if possible.
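
To make the first idea a bit more concrete, here is a minimal sketch of a throttled, segment-by-segment replay loop. It is not Prometheus code: `segment`, `replayConfig`, `replayThrottled`, `errOverBudget` and the callback parameters are all hypothetical names invented for illustration, and the heap budget would have to come from somewhere (a flag, the pre-restart usage, a cgroup limit). The only real API used is `runtime.ReadMemStats`.

```go
package main

import (
	"errors"
	"runtime"
	"time"
)

// segment is a hypothetical stand-in for one WAL segment (not a Prometheus type).
type segment struct {
	id int
}

// replayConfig holds the hypothetical knobs of the throttled mode.
type replayConfig struct {
	heapBudget uint64        // replay pauses while heap usage is above this many bytes
	maxWaits   int           // back-off attempts per segment before giving up
	backoff    time.Duration // pause between attempts
}

var errOverBudget = errors.New("heap stayed above budget during replay")

// replayThrottled applies segments in order. Before each segment it checks
// heapInUse; while the value is above the budget it calls relieve (assumed to
// free memory, e.g. by flushing replayed data to a block) and backs off.
// It returns the number of segments applied.
func replayThrottled(
	segs []segment,
	cfg replayConfig,
	heapInUse func() uint64,
	relieve func(),
	apply func(segment) error,
) (int, error) {
	replayed := 0
	for _, s := range segs {
		waits := 0
		for heapInUse() > cfg.heapBudget {
			if waits >= cfg.maxWaits {
				return replayed, errOverBudget
			}
			relieve()
			time.Sleep(cfg.backoff)
			waits++
		}
		if err := apply(s); err != nil {
			return replayed, err
		}
		replayed++
	}
	return replayed, nil
}

// runtimeHeapInUse is one possible heapInUse source. ReadMemStats briefly
// stops the world, so it should be called per segment, not per record.
func runtimeHeapInUse() uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.HeapInuse
}
```

A caller would pass `runtimeHeapInUse` as `heapInUse` and whatever memory-releasing step is available as `relieve`. The check happens only between segments, so a single segment can still overshoot the budget; that is the trade-off of throttling at segment granularity.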

Contributor guide

Tech stack: Go
Domain: backend, database, performance
Issue type: feature
Difficulty: 4
Estimated time: over 1 week
Activity status: fresh
Clarity: mostly clear
Prerequisites: Go programming, understanding of WAL and memory management
Beginner friendliness: 15
Research direction: This issue aims to reduce WAL replay memory consumption during Prometheus startup. Start by reading the linked issues #6934 and #13939 for context. The current replay algorithm replays all WAL segments sequentially and as fast as possible, potentially using more memory than normal operation. The acceptance criteria require a mode (or default behavior) where startup memory does not exceed normal use. Explore ideas such as replaying in segments with controlled memory usage or truncating the WAL before replay. Focus on the Prometheus TSDB package, particularly the WAL-related code in the tsdb/wal directory (named tsdb/wlog in recent releases). Consider how to measure memory usage during replay and how to implement a throttling mechanism or a segmented approach.
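
For the measurement part, one rough option is to sample the Go runtime's heap statistics while the replay runs and compare the peak against the steady-state value. The sketch below assumes nothing about Prometheus internals: `peakHeap` is a hypothetical helper, and `fn` stands in for whatever performs the replay. Sampling can miss short spikes between ticks, so the result is a lower bound on the true peak.

```go
package main

import (
	"runtime"
	"sync"
	"time"
)

// peakHeap runs fn while sampling runtime.MemStats.HeapInuse every interval
// (which must be positive) and returns the largest value observed, in bytes.
func peakHeap(interval time.Duration, fn func()) uint64 {
	var (
		mu   sync.Mutex
		peak uint64
	)
	sample := func() {
		var m runtime.MemStats
		runtime.ReadMemStats(&m) // briefly stops the world; keep interval coarse
		mu.Lock()
		if m.HeapInuse > peak {
			peak = m.HeapInuse
		}
		mu.Unlock()
	}

	done := make(chan struct{})
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		t := time.NewTicker(interval)
		defer t.Stop()
		for {
			select {
			case <-done:
				return
			case <-t.C:
				sample()
			}
		}
	}()

	sample() // baseline before the work starts
	fn()
	sample() // final sample so very short runs are still observed
	close(done)
	wg.Wait()
	return peak
}
```

Wrapping the replay in `peakHeap(time.Second, replay)` and logging the result next to the post-startup `HeapInuse` would give a first number for how far startup overshoots "normal" use.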