[YSQL] Investigate point-lookup path for fully-bound hash PK IN queries
#32,026 opened on Jun 4, 2026
Description
Jira Link: DB-21716
Created by opus 4.8
Problem
For a query whose IN predicates fully determine a hash primary key, e.g.
CREATE TABLE foo (h1 int, h2 int, PRIMARY KEY ((h1, h2) HASH));
SELECT * FROM foo WHERE h1 IN (0,1) AND h2 IN (0,1);
every target row is a complete, known primary key — the four combinations (0,0),(0,1),(1,0),(1,1). This is effectively a batch of point lookups. Instead, the storage layer executes it through HybridScanChoices, the general option-iteration engine, rather than the direct ybctid/SeekTuple point-read path. This issue is to investigate whether the fully-bound case can take the cheaper point-lookup path, and whether it is worth doing.
User impact
The query does the same RocksDB seeks either way, so there is no extra I/O, but HybridScanChoices adds per-target CPU bookkeeping (SkipTargetsUpTo, IncrementScanTargetAtOptionList, ValidateHashGroup, CurrentTargetMatchesKey, group-end/option-group handling) to do what is fundamentally N point seeks.
Root cause / current behavior
The path is chosen in PgDmlRead::Exec (src/yb/yql/pggate/pg_dml_read.cc):
if (doc_op_ && !ybctid_provider() && IsAllKeyColumnsBound()) {
SubstituteKeyBindsWithYbctids(); // ybctid batch -> DocDB SeekTuple
} else {
ProcessEmptyKeyBinds(); // -> hash permutations -> HybridScanChoices
}
IsAllKeyColumnsBound() returns false for any IN on a hash column (i < num_hash_key_columns is an unsupported-IN early-out). The reason is that the ybctid builder, BuildYbctidsFromKeyBinds(), only handles "all hash columns fixed to single values + at most one range column with IN"; it does not enumerate a cross-product of multiple hash IN-lists.
So hash-IN falls through to ProcessEmptyKeyBinds -> PopulateNextHashPermutationOps, which enumerates the permutations and rewrites them into a single pushed-down condition (yb_hash_code, h1, h2) IN ((hc,0,0),(hc,0,1),...). On the DocDB side DocPgsqlScanSpec::InitOptions turns that tuple-IN into options(), and ScanChoices::Create (src/yb/docdb/scan_choices.cc) constructs a HybridScanChoices because doc_spec.options() is non-null.
Notably, the information needed to take the point-lookup path is already available at rewrite time: PgDocReadOp::BindExprsToBatch computes the full hash key per permutation (EncodePgsqlHash + DecodeMultiColumnHashValue, src/yb/yql/pggate/pg_doc_op.cc). At that exact point the full doc key is in hand — it is just emitted as a tuple-IN condition rather than as a ybctid.