opensearch-project/OpenSearch

[Feature Request] Paginate snapshot indices status fetching

Open

#16,985 created on January 9, 2025

19 comments, 0 reactions, 1 assignee. Java, 8,123 stars, 1,505 forks.
Labels: Storage:Snapshots, enhancement, good first issue

Description

Is your feature request related to a problem? Please describe

Our customers depend on the snapshot status API to access information about snapshot indices, such as store size and number of docs. When the specified snapshot(s) are not currently running, TransportSnapshotStatusAction uses a single Generic thread to retrieve the repository data, snapshot information, snapshot index metadata, and shard snapshot status. As a result, when the specified snapshot contains a large number of indices, the action takes significantly longer to execute.

For one snapshot with 15,000+ shards, fetching the snapshot status took 8 minutes.

Describe the solution you'd like

Provide a new API (_snapshot/{repository}/{snapshot}/_list/indices) to paginate snapshot indices status, as we did in #14258. The new API works only for indices belonging to a specific snapshot. Since the order of indices in SnapshotInfo is fixed, we can simply use from + size to paginate. If the specified snapshot is running, the pagination parameters have no effect.
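As a rough illustration of the from + size idea, the sketch below slices an ordered list of index names into one page. The class `SnapshotIndexPager` and its `page` method are hypothetical names made up for this example and are not part of OpenSearch. A plain `List<String>` stands in for the ordered indices of a SnapshotInfo, and clamping out-of-range values to an empty page is an assumption, not a decided behavior.

```java
import java.util.List;

// Hypothetical helper for illustration only; not an OpenSearch class.
final class SnapshotIndexPager {

    // Returns the slice [from, from + size) of the ordered index names,
    // clamped to the list bounds so an out-of-range page is simply empty.
    static List<String> page(List<String> orderedIndices, int from, int size) {
        if (from < 0 || size < 0) {
            throw new IllegalArgumentException("from and size must be non-negative");
        }
        int start = Math.min(from, orderedIndices.size());
        // Widen to long so from + size cannot overflow int.
        int end = (int) Math.min((long) start + size, orderedIndices.size());
        return orderedIndices.subList(start, end);
    }

    public static void main(String[] args) {
        // Stand-in for the fixed index order of a snapshot.
        List<String> indices = List.of("logs-1", "logs-2", "logs-3", "logs-4", "logs-5");
        System.out.println(page(indices, 0, 2));  // first page
        System.out.println(page(indices, 4, 2));  // last, partial page
        System.out.println(page(indices, 10, 2)); // past the end
    }
}
```

With this approach only the indices on the requested page need their metadata and shard status loaded, so the cost of a request scales with size rather than with the total number of indices in the snapshot.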

Related component

Storage:Snapshots

Describe alternatives you've considered

Using the snapshot thread pool to parallelize fetching the snapshot status of indices. However, the snapshot thread pool might be blocked by long-running tasks. Moreover, the maximum number of threads in the snapshot thread pool is only 5, so the speedup may be limited.
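For comparison, here is a minimal sketch of this alternative, assuming a plain `java.util.concurrent` fixed pool of 5 threads as a stand-in for the snapshot thread pool. `ParallelStatusFetch`, `fetchAll`, and the `fetchOne` function are hypothetical names for this example, and the real action would use OpenSearch's own thread pool rather than creating one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch for illustration only; not an OpenSearch class.
final class ParallelStatusFetch {

    // Mirrors the 5-thread cap mentioned above.
    static final int MAX_SNAPSHOT_THREADS = 5;

    // Runs fetchOne for every index on a bounded pool and returns the
    // results in the same order as the input list.
    static <T> List<T> fetchAll(List<String> indices, Function<String, T> fetchOne) {
        ExecutorService pool = Executors.newFixedThreadPool(MAX_SNAPSHOT_THREADS);
        try {
            List<Callable<T>> tasks = new ArrayList<>();
            for (String index : indices) {
                tasks.add(() -> fetchOne.apply(index));
            }
            List<T> results = new ArrayList<>();
            // invokeAll blocks until all tasks finish and preserves task order.
            for (Future<T> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while fetching status", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("status fetch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

At most 5 fetches run at once, so the best case is roughly a 5x speedup, and less if other long-running tasks share the pool. Every index in the snapshot is still read on each request, which pagination avoids.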

Additional context

No response
