On Fri, Oct 31, 2025 at 5:06 AM Victor Yegorov <vyegorov@gmail.com> wrote:
> I took a look at the patch. Proposed comments look highly valuable, especially around NULLs, doesn't look immediately
obvious,so definitely requires a comment. 
> Looks good to commit.
Cool.
> Wouldn't it be good to add such information also into the user documentation, say into
> https://www.postgresql.org/docs/current/functions-comparisons.html#ROW-WISE-COMPARISON
I wasn't thinking about the user-facing documentation. But I think
that you make a good point.
The main problem with that documentation of row compares is that we
don't say anything about keyset pagination with a multicolumn index
(not anywhere). That is almost the entire reason why this feature
exists -- but we don't actually say anything about it to users. No
wonder the feature is underused.
Separately, there is a risk that NULLs will break applications that
implement keyset pagination. Attached test case shows what I mean by
this.
To summarize the test case:
The test case shows that QUERY 1 returns slightly more rows than QUERY
2 due to the presence of NULLs in lower-order index columns. This
seems surprising. (The underlying issue is more or less the same issue
that makes row comparisons confusing to the implementation, especially
in places like _bt_set_startikey).
A user might expect that it won't matter how many or how few "keyset
pages"/row compare queries were used -- one big query (no pagination)
should give the same answer as (say) 4 "keyset pagination" queries
(just in smaller pieces/sets of rows). But as the test case shows,
when there are NULLs in lower-order index columns, that isn't
necessarily true. QUERY 1 and QUERY 2 are only guaranteed to return
precisely the same rows if we somehow make sure that there can be no
rows returned with NULLs in lower order columns (e.g., by using a
primary key for keyset pagination queries, or by always adding IS NOT
NULL conditions against lower-order columns like "b" and "c").
--
Peter Geoghegan