Re: Qual push down to table AM - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Qual push down to table AM
Msg-id mmvp5gcbwmjpl2bb7e3qytam3iy3wonpz26djrge5fcyyqnrui@dckh6eggdqlu
In response to Re: Qual push down to table AM  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Qual push down to table AM
List pgsql-hackers
Hi,

On 2025-12-09 16:40:17 -0500, Robert Haas wrote:
> On Fri, Aug 29, 2025 at 4:38 AM Julien Tachoires <julien@tachoires.me> wrote:
> Potentially, there could be a performance problem

I think the big performance hazard with this is repeated deforming. The
scankey infrastructure deforms attributes one-by-one *and* it does not
"persist" the work of deforming for later accesses.  So if you e.g. have
something like

  SELECT sum(col_29) FROM tbl WHERE col_30 = common_value;
or
  SELECT * FROM tbl WHERE col_30 = common_value;


we'll now deform col_30 in isolation for the ScanKey evaluation, and then
deform columns 1-29 in the slot during projection (because we always deform
all the leading columns).

But even leaving the slot issue aside, I'd bet that you'll see overhead due to
*not* deforming multiple columns at once. If you have a ScanKey version of
something like
  WHERE column_20 = common_val AND column_21 = some_val AND column_22 = another_val;

and there's a NULL or varlena value in one of the leading columns, we'll redo
a fair bit of work during the fastgetattr() for column_22.
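
To make the cost difference concrete, here is a minimal toy model in C. It is
not PostgreSQL code: ToySlot, toy_getattr_offset() and
toy_slot_getattr_offset() are hypothetical names, and a "tuple" is reduced to
an array of column widths. It assumes the worst case described above, where a
NULL or varlena in a leading column means an attribute's offset can only be
found by walking over every column before it.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NATTS 30

/* Toy slot: remembers how far deforming got, so later calls can resume. */
typedef struct ToySlot
{
    int     nvalid;             /* number of leading columns deformed so far */
    size_t  off[TOY_NATTS];     /* start offset of each deformed column */
    size_t  next_off;           /* start offset of column nvalid + 1 */
    long    steps;              /* columns visited, our cost metric */
} ToySlot;

/* One-shot fetch: walk from column 1 on every call and remember nothing. */
static size_t
toy_getattr_offset(const int *widths, int attnum, long *steps)
{
    size_t  off = 0;

    assert(attnum >= 1 && attnum <= TOY_NATTS);
    for (int i = 0; i < attnum - 1; i++)
        off += (size_t) widths[i];
    *steps += attnum;           /* visited columns 1..attnum */
    return off;
}

/* Slot-style fetch: continue from wherever the previous call stopped. */
static size_t
toy_slot_getattr_offset(ToySlot *slot, const int *widths, int attnum)
{
    assert(attnum >= 1 && attnum <= TOY_NATTS);
    while (slot->nvalid < attnum)
    {
        slot->off[slot->nvalid] = slot->next_off;
        slot->next_off += (size_t) widths[slot->nvalid];
        slot->nvalid++;
        slot->steps++;
    }
    return slot->off[attnum - 1];
}

/* Quals on columns 20, 21 and 22, evaluated both ways on the same row. */
static void
toy_compare_three_quals(const int *widths, long *one_by_one, long *cached)
{
    static const int qual_atts[] = {20, 21, 22};
    ToySlot slot = {0};

    *one_by_one = 0;
    for (int i = 0; i < 3; i++)
    {
        size_t  a = toy_getattr_offset(widths, qual_atts[i], one_by_one);
        size_t  b = toy_slot_getattr_offset(&slot, widths, qual_atts[i]);

        assert(a == b);         /* both approaches find the same column */
    }
    *cached = slot.steps;
}
```

In this model the one-by-one path visits 20 + 21 + 22 = 63 columns per row,
while the slot-style path visits 22. Real tuples are more complicated than
this, so treat the numbers as an illustration of the shape of the problem, not
as a measurement.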



I don't really see this being viable without first tackling two nontrivial
projects:

1) Make slot deforming for expressions & projections selective, i.e. don't
   deform all the leading columns, but only ones that will eventually be
   needed
2) Perform ScanKey evaluation in slot form, to be able to cache the deforming
   and to make deforming of multiple columns sufficiently efficient.
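
One possible reading of these two points, sketched as a toy model in C. None
of this is PostgreSQL code: ToySlot, toy_slot_store() and toy_slot_getattr()
are hypothetical, every column is a plain int, and "deforming" just means
copying a column out of the row. The sketch assumes the planner could hand the
slot a set of needed columns, and that the qual reads its column through the
same slot the projection uses, so each needed column is deformed at most once.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NATTS 30

typedef struct ToySlot
{
    const int  *row;                /* the stored toy "tuple" */
    uint32_t    needed;             /* bit n-1 set: column n is used somewhere */
    uint32_t    valid;              /* bit n-1 set: values[n-1] is filled in */
    int         values[TOY_NATTS];
    int         ncopied;            /* columns deformed for this row */
} ToySlot;

/* Point the slot at a new row and forget any previously deformed columns. */
static void
toy_slot_store(ToySlot *slot, const int *row, uint32_t needed)
{
    slot->row = row;
    slot->needed = needed;
    slot->valid = 0;
    slot->ncopied = 0;
}

/* Fetch column attnum (1-based); deform it on first use, then reuse it. */
static int
toy_slot_getattr(ToySlot *slot, int attnum)
{
    uint32_t    bit;

    assert(attnum >= 1 && attnum <= TOY_NATTS);
    bit = UINT32_C(1) << (attnum - 1);
    assert(slot->needed & bit);     /* only declared columns may be read */

    if (!(slot->valid & bit))
    {
        slot->values[attnum - 1] = slot->row[attnum - 1];
        slot->valid |= bit;
        slot->ncopied++;
    }
    return slot->values[attnum - 1];
}

/*
 * Toy version of: SELECT sum(col_29) FROM tbl WHERE col_30 = key;
 * rows is a flat array of nrows * TOY_NATTS ints.
 */
static long
toy_sum_col29_where_col30(const int *rows, int nrows, int key, int *ncopied)
{
    const uint32_t needed = (UINT32_C(1) << 28) | (UINT32_C(1) << 29);
    ToySlot slot;
    long    sum = 0;

    *ncopied = 0;
    for (int r = 0; r < nrows; r++)
    {
        toy_slot_store(&slot, rows + (long) r * TOY_NATTS, needed);
        if (toy_slot_getattr(&slot, 30) == key)     /* qual, in slot form */
            sum += toy_slot_getattr(&slot, 29);     /* projection */
        *ncopied += slot.ncopied;
    }
    return sum;
}
```

With fixed-width ints there is no walk over leading columns, so this only
shows the bookkeeping: columns 1-28 are never touched, and a row that fails
the qual deforms a single column.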


> So, somewhat to my surprise, I think that v4-0001 might be basically
> fine. I wonder if anyone else sees a problem that I'm missing?

I doubt this would be safe as-is: ISTM that if you release the page lock
between tuples, things like the number of items on the page can change. But we
keep values like that in registers / on the stack, and those copies would be
stale once the page changes while the lock is not held.

We could refetch the number of items on the page for every loop iteration, but
that'd probably not be free. OTOH, it's probably nothing compared to the cost
of relocking the page...
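
A minimal single-threaded sketch of that hazard, in C. ToyPage,
toy_scan_page() and toy_shrink_to_two() are hypothetical; the "lock" is just a
flag, and the hook stands in for whatever another backend might do to the page
while the lock is released.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_ITEMS 8

typedef struct ToyPage
{
    bool    locked;
    int     nitems;
    int     items[TOY_MAX_ITEMS];
} ToyPage;

typedef void (*toy_unlocked_hook) (ToyPage *page);

/* Example concurrent change: the page ends up with only two items. */
static void
toy_shrink_to_two(ToyPage *page)
{
    assert(!page->locked);      /* may only run while the lock is released */
    if (page->nitems > 2)
        page->nitems = 2;
}

/*
 * Visit every item, releasing the lock between items. Returns the number of
 * items visited, or -1 if the loop tried to read past the current end of the
 * page because it trusted a stale item count.
 */
static int
toy_scan_page(ToyPage *page, toy_unlocked_hook while_unlocked, bool refetch)
{
    int     visited = 0;
    int     maxoff;

    page->locked = true;
    maxoff = page->nitems;      /* cached copy, lives in a local */

    for (int off = 0; off < maxoff; off++)
    {
        if (off >= page->nitems)
        {
            page->locked = false;
            return -1;          /* stale maxoff led us past the end */
        }
        visited++;

        page->locked = false;   /* release the lock between tuples */
        if (while_unlocked)
            while_unlocked(page);
        page->locked = true;    /* relock */

        if (refetch)
            maxoff = page->nitems;  /* re-read after every relock */
    }

    page->locked = false;
    return visited;
}
```

Starting from a four-item page with toy_shrink_to_two() as the hook, the scan
with refetch = false trips over the stale count and returns -1, while
refetch = true stops cleanly after two items. The offset of the current item
could presumably go stale in the same way; this sketch does not model that.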

Greetings,

Andres Freund


