Thread: BUG #18825: Row value equality predicates do not use indices

BUG #18825: Row value equality predicates do not use indices

From

PG Bug reporting form

Date:

26 February, 01:24:36

The following bug has been logged on the website:

Bug reference:      18825
Logged by:          Nicolas Williams
Email address:      nico@cryptonector.com
PostgreSQL version: 13.11
Operating system:   Linux
Description:

DELETE FROM foo USING (SELECT dels FROM dels) AS dels
WHERE foo = dels; -- <--- does not use indices on either table

DELETE FROM foo USING (SELECT dels FROM dels) AS dels
WHERE foo.a = (dels).a AND foo.b = (dels).b; -- AND so on; this does use
indices

I can create indices on row values, but they don't get used:

CREATE INDEX foo_idx ON foo ((foo)); -- looks like an index on the full row
values.

(Incidentally, there are no NULLs in these tables.)

Basically I'd expect that since I can write queries that take advantage of
row values to be simpler, shorter, easier to read, and to generate, like:

SELECT lft, rght
FROM foo
NATURAL FULL OUTER JOIN bar
WHERE (lft IS DISTINCT FROM NULL AND rght IS NOT DISTINCT FROM NULL) OR
     (lft IS NOT DISTINCT FROM NULL AND rght IS DISTINCT FROM NULL);

which does use indices, but that's because the NATURAL JOIN isn't looking at
row values exactly but rather it expands to be the equivalent of

SELECT lft, rght
FROM foo
FULL OUTER JOIN bar USING (a, b, .., z)
WHERE (lft IS DISTINCT FROM NULL AND rght IS NOT DISTINCT FROM NULL) OR
     (lft IS NOT DISTINCT FROM NULL AND rght IS DISTINCT FROM NULL);

So it seems like there is a missing feature where row equality predicates
could be desugared into a conjunction of column equality predicates on those
values, not unlike NATURAL JOIN desugaring into JOIN .. USING (<common
column list>), and JOIN .. USING (<common column list>) desugaring into JOIN
ON <conjunction of column equality predicates for all columns in common>,
then row equality predicates would naturally be able to use indices.
Alternatively, if row equality predicates could use indices on row value
expressions, that would also help, though this seems a bit silly since the
combination of an index and the table it indexes should function as "row
index", and wasting more space and time updating more indices that shouldn't
really be necessary seems counter-productive (though on the plus side there
should be only one index per table source in row value equality predicates
with this approach, which is easier on the planner!).

Re: BUG #18825: Row value equality predicates do not use indices

From

Tom Lane

Date:

26 February, 03:44:42

PG Bug reporting form <noreply@postgresql.org> writes:
> DELETE FROM foo USING (SELECT dels FROM dels) AS dels
> WHERE foo = dels; -- <--- does not use indices on either table

[ shrug... ]  Works for me given the full-row indexes, although I do
have to force enable_seqscan off or reduce random_page_cost a lot,
because otherwise the planner thinks a seqscan-and-sort is cheaper.
I strongly suspect it's right, because full-row indexes are going to
be bigger than the table proper.  The mere fact that a plan uses
indexes does not automatically make it better than one that doesn't.

Anyway, AFAICS your gripe has nothing to do with "can the planner
use these indexes", and everything to do with its cost estimates
about the value of doing it that way.

            regards, tom lane

Re: BUG #18825: Row value equality predicates do not use indices

From

Nico Williams

Date:

06 March, 19:50:37

On Tue, Feb 25, 2025 at 07:44:42PM -0500, Tom Lane wrote:
> PG Bug reporting form <noreply@postgresql.org> writes:
> > DELETE FROM foo USING (SELECT dels FROM dels) AS dels
> > WHERE foo = dels; -- <--- does not use indices on either table
> 
> [ shrug... ]  Works for me given the full-row indexes, although I do
> have to force enable_seqscan off or reduce random_page_cost a lot,
> because otherwise the planner thinks a seqscan-and-sort is cheaper.

That seems like a bug, but a separate bug.

> I strongly suspect it's right, because full-row indexes are going to
> be bigger than the table proper.  The mere fact that a plan uses
> indexes does not automatically make it better than one that doesn't.

If a full row index is a b-tree (or otherwise a prefix) index then one
think that skip-scanning the b-tree would be much faster than scanning
the table.

These are tables with two or three columns, all small, and the indices
are all covering indices, therefore the indices are all "bigger than the
table proper", yet decomposing the row equality predicate causes indices
to be used.

(Again, in this example all columns are NOT NULL.)

When I change the query to decompose the row equality predicate into
column equality predicates then the indices are used, and the query is
fast.

For the reasons I give above I suspect that a full-row b-tree index
would be equally fast, if only I could get the planner to try it.

> Anyway, AFAICS your gripe has nothing to do with "can the planner
> use these indexes", and everything to do with its cost estimates
> about the value of doing it that way.

I suspect that the planner doesn't know to try to decompose row equality
predicates into column equality predicates to take advantage of indices
on those columns.  That would be a missing feature, not a bug, though I
admit that it was surprising (PG is so good that this sort of thing is
surprising, what can I say).

As for not using the full row index, from the evidence I have the
planner is indeed making a suboptimal choice here.  Surely the planner
can't always make the optimal choice, but I suspect row equality
predicates are rare, and that when b-tree indices on full row values
exist then using those should have a slightly higher weight than not
using them.

Nico
--