Thread: BUG #18825: Row value equality predicates do not use indices
The following bug has been logged on the website: Bug reference: 18825 Logged by: Nicolas Williams Email address: nico@cryptonector.com PostgreSQL version: 13.11 Operating system: Linux Description: DELETE FROM foo USING (SELECT dels FROM dels) AS dels WHERE foo = dels; -- <--- does not use indices on either table DELETE FROM foo USING (SELECT dels FROM dels) AS dels WHERE foo.a = (dels).a AND foo.b = (dels).b; -- AND so on; this does use indices I can create indices on row values, but they don't get used: CREATE INDEX foo_idx ON foo ((foo)); -- looks like an index on the full row values. (Incidentally, there are no NULLs in these tables.) Basically I'd expect that since I can write queries that take advantage of row values to be simpler, shorter, easier to read, and to generate, like: SELECT lft, rght FROM foo NATURAL FULL OUTER JOIN bar WHERE (lft IS DISTINCT FROM NULL AND rght IS NOT DISTINCT FROM NULL) OR (lft IS NOT DISTINCT FROM NULL AND rght IS DISTINCT FROM NULL); which does use indices, but that's because the NATURAL JOIN isn't looking at row values exactly but rather it expands to be the equivalent of SELECT lft, rght FROM foo FULL OUTER JOIN bar USING (a, b, .., z) WHERE (lft IS DISTINCT FROM NULL AND rght IS NOT DISTINCT FROM NULL) OR (lft IS NOT DISTINCT FROM NULL AND rght IS DISTINCT FROM NULL); So it seems like there is a missing feature where row equality predicates could be desugared into a conjunction of column equality predicates on those values, not unlike NATURAL JOIN desugaring into JOIN .. USING (<common column list>), and JOIN .. USING (<common column list>) desugaring into JOIN ON <conjunction of column equality predicates for all columns in common>, then row equality predicates would naturally be able to use indices. Alternatively, if row equality predicates could use indices on row value expressions, that would also help, though this seems a bit silly since the combination of an index and the table it indexes should function as "row index", and wasting more space and time updating more indices that shouldn't really be necessary seems counter-productive (though on the plus side there should be only one index per table source in row value equality predicates with this approach, which is easier on the planner!).
PG Bug reporting form <noreply@postgresql.org> writes: > DELETE FROM foo USING (SELECT dels FROM dels) AS dels > WHERE foo = dels; -- <--- does not use indices on either table [ shrug... ] Works for me given the full-row indexes, although I do have to force enable_seqscan off or reduce random_page_cost a lot, because otherwise the planner thinks a seqscan-and-sort is cheaper. I strongly suspect it's right, because full-row indexes are going to be bigger than the table proper. The mere fact that a plan uses indexes does not automatically make it better than one that doesn't. Anyway, AFAICS your gripe has nothing to do with "can the planner use these indexes", and everything to do with its cost estimates about the value of doing it that way. regards, tom lane
On Tue, Feb 25, 2025 at 07:44:42PM -0500, Tom Lane wrote: > PG Bug reporting form <noreply@postgresql.org> writes: > > DELETE FROM foo USING (SELECT dels FROM dels) AS dels > > WHERE foo = dels; -- <--- does not use indices on either table > > [ shrug... ] Works for me given the full-row indexes, although I do > have to force enable_seqscan off or reduce random_page_cost a lot, > because otherwise the planner thinks a seqscan-and-sort is cheaper. That seems like a bug, but a separate bug. > I strongly suspect it's right, because full-row indexes are going to > be bigger than the table proper. The mere fact that a plan uses > indexes does not automatically make it better than one that doesn't. If a full row index is a b-tree (or otherwise a prefix) index then one think that skip-scanning the b-tree would be much faster than scanning the table. These are tables with two or three columns, all small, and the indices are all covering indices, therefore the indices are all "bigger than the table proper", yet decomposing the row equality predicate causes indices to be used. (Again, in this example all columns are NOT NULL.) When I change the query to decompose the row equality predicate into column equality predicates then the indices are used, and the query is fast. For the reasons I give above I suspect that a full-row b-tree index would be equally fast, if only I could get the planner to try it. > Anyway, AFAICS your gripe has nothing to do with "can the planner > use these indexes", and everything to do with its cost estimates > about the value of doing it that way. I suspect that the planner doesn't know to try to decompose row equality predicates into column equality predicates to take advantage of indices on those columns. That would be a missing feature, not a bug, though I admit that it was surprising (PG is so good that this sort of thing is surprising, what can I say). As for not using the full row index, from the evidence I have the planner is indeed making a suboptimal choice here. Surely the planner can't always make the optimal choice, but I suspect row equality predicates are rare, and that when b-tree indices on full row values exist then using those should have a slightly higher weight than not using them. Nico --