opensearch-project/OpenSearch

[Remote Store] Evaluate safely skipping segment_n file uploads after flush for remote store backed indices

Open

#18,239 created on May 8, 2025

(2 comments) (0 reactions) (0 assignees) Java (8,123 stars) (1,505 forks)
Labels: Storage:Remote, enhancement, good first issue, lucene

Description

Is your feature request related to a problem? Please describe

In remote store backed indices, OpenSearch provides refresh-level durability by uploading segment metadata on each refresh. This metadata contains information about the segments and the user data (such as local checkpoint and max seq no) for that refresh. Currently, we also upload the segment_n file to remote storage when Lucene creates it during each flush.

During recovery and replication, we don't use the segment_n file information. Reference: IndexShard.java#L5174

Instead, we use the infobytes from the remote segment metadata to create another commit on the replica. Reference: IndexShard.java#L5202.

Describe the solution you'd like

Proposed Change

Evaluate if we can safely skip uploading the segment_n file to remote storage, as it appears to be unused in recovery and replication scenarios.
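
To make the proposal concrete, here is a minimal sketch of what skipping could look like at the point where the list of files to upload is built. It is not OpenSearch code: the class `SegmentUploadFilter` and its methods are hypothetical, and the sketch assumes that the commit file referred to here as segment_n follows Lucene's `segments_<generation>` naming.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

/** Hypothetical helper for illustration only; not part of OpenSearch. */
final class SegmentUploadFilter {
    // Assumption: Lucene commit point files are named "segments_<generation>".
    private static final String COMMIT_FILE_PREFIX = "segments_";

    private SegmentUploadFilter() {}

    /** Returns true if the file name looks like a Lucene commit point file. */
    static boolean isCommitFile(String fileName) {
        return fileName.startsWith(COMMIT_FILE_PREFIX);
    }

    /** Returns the local files in their original order, minus commit point files. */
    static List<String> filesToUpload(Collection<String> localFiles) {
        List<String> result = new ArrayList<>();
        for (String name : localFiles) {
            if (!isCommitFile(name)) {
                result.add(name);
            }
        }
        return result;
    }
}
```

For example, `SegmentUploadFilter.filesToUpload(List.of("_0.cfs", "segments_3", "_0.si"))` would keep `_0.cfs` and `_0.si` and drop `segments_3`. Whether such a filter is safe is exactly what the investigation points below need to establish.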

Investigation Points

  1. Identify potential risks or edge cases where the segment_n file might be needed
  2. Analyze impact on:
    • Recovery scenarios
    • Replication processes

Related component

No response

Describe alternatives you've considered

No response

Additional context

No response
