apache/hudi

Limit the number of partitions considered for GlobalBloomIndex

Open

#14484 opened on Nov 30, 2025

area:index, from-jira, good first issue, priority:high, type:improvement

Description

Currently, the global bloom index checks incoming records against files in all partitions. In many cases, the user knows the range of partitions that updates can actually touch (e.g. the upstream system drops updates older than a year). In such a scenario, it may make sense to support an option for the global bloom index that limits how many partitions are matched against, in order to improve performance.
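
To make the proposal concrete, here is a minimal sketch of the partition-pruning step such an option might perform before the index lookup. Everything in it is an assumption for illustration: `PartitionLimiter`, `limitToMostRecent`, and the `maxPartitions` parameter are hypothetical names and are not part of Hudi, and the sketch assumes partition paths sort chronologically (e.g. `yyyy/MM/dd`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, NOT part of Hudi: narrows the set of partitions
// a global index lookup would have to scan.
final class PartitionLimiter {

    private PartitionLimiter() {}

    /**
     * Returns at most maxPartitions partition paths, keeping the
     * lexicographically greatest ones, newest first.
     * Assumes paths sort chronologically (e.g. "yyyy/MM/dd").
     * A non-positive limit means "no limit" and returns all partitions.
     */
    static List<String> limitToMostRecent(List<String> allPartitions, int maxPartitions) {
        // Copy first so the caller's list is never mutated.
        List<String> sorted = new ArrayList<>(allPartitions);
        sorted.sort(Comparator.reverseOrder());
        if (maxPartitions <= 0 || maxPartitions >= sorted.size()) {
            return sorted;
        }
        return new ArrayList<>(sorted.subList(0, maxPartitions));
    }

    public static void main(String[] args) {
        List<String> partitions = Arrays.asList("2023/06/01", "2025/01/01", "2024/12/31");
        // Keep only the two most recent partitions for the lookup.
        System.out.println(limitToMostRecent(partitions, 2));
        // prints [2025/01/01, 2024/12/31]
    }
}
```

One design consideration for any such option: a record whose key already exists in an excluded partition would presumably be treated as a new insert, so the limit would only be safe when the user can guarantee that updates never target partitions outside the configured range.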
