Re: Qual push down to table AM - Mailing list pgsql-hackers
| From | Amit Langote |
|---|---|
| Subject | Re: Qual push down to table AM |
| Date | |
| Msg-id | CA+HiwqFk0QBXCyNVYoD9Li9PNdbyAmF1nJ0CMz-6c2C6Vxw=dA@mail.gmail.com Whole thread Raw |
| In response to | Re: Qual push down to table AM (Robert Haas <robertmhaas@gmail.com>) |
| Responses |
Re: Qual push down to table AM
|
| List | pgsql-hackers |
On Thu, Dec 11, 2025 at 12:41 AM Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Dec 9, 2025 at 6:08 PM Andres Freund <andres@anarazel.de> wrote: > > On 2025-12-09 16:40:17 -0500, Robert Haas wrote: > > > On Fri, Aug 29, 2025 at 4:38 AM Julien Tachoires <julien@tachoires.me> wrote: > > > Potentially, there could be a performance problem > > > > I think the big performance hazard with this is repeated deforming. The > > scankey infrastructure deforms attributes one-by-one *and* it does not > > "persist" the work of deforming for later accesses. So if you e.g. have > > something like > > > > SELECT sum(col_29) FROM tbl WHERE col_30 = common_value; > > or > > SELECT * FROM tbl WHERE col_30 = common_value; > > > > we'll now deform col_30 in isolation for the ScanKey evaluation and then we'll > > deform columns 1-29 in the slot (because we always deform all the leading > > columns), during projection. > > Hmm, this is a good point, and I agree that it's a huge challenge for > this patch set. Repeated tuple deforming is REALLY expensive, which is > why we've spent so much energy trying to use slots in an as many > places as possible. I find it easy to believe that HeapKeyTest's loop > over heap_getattr() is going to prohibitively painful and that this > code will need to somehow also be slot-ified for this to be a viable > project. > > > I don't really see this being viable without first tackling two nontrivial > > projects: > > > > 1) Make slot deforming for expressions & projections selective, i.e. don't > > deform all the leading columns, but only ones that will eventually be > > needed > > 2) Perform ScanKey evaluation in slot form, to be able to cache the deforming > > and to make deforming of multiple columns sufficiently efficient. > > IOW, I agree that we probably need to do #2. I am not entirely sure > about #1. I'm also curious to understand why Andres sees #1 as a prerequisite for qual pushdown. > I'm a little afraid that trying to skip over columns without > deforming them will add a bunch of code complexity that doesn't really > pay off. I think it might be worthwhile. I have a PoC [1] I worked on (at Andres's suggestion) that showed ~2x improvement on simple aggregation queries over wide tables (all pages buffered) by tracking the minimum needed attribute and using cached offsets stored in TupleDesc to skip fixed-not-null prefixes. I'm thinking of reviving it with proper tracking of which attributes are needed and deformed (bitmapset or flag array in TupleTableSlot). > > > So, somewhat to my surprise, I think that v4-0001 might be basically > > > fine. I wonder if anyone else sees a problem that I'm missing? > > > > I doubt this would be safe as-is: ISTM that if you release the page lock > > between tuples, things like the number of items on the page can change. But we > > store stuff like that in registers / on the stack, which could change while > > the lock is not held. > > > > We could refetch the number items on the page for every loop iteration, but > > that'd probably not be free. OTOH, it's probably nothing compared to the cost > > of relocking the page... > > We still hold a pin, though, which I think means very little can > change. More items can be added to the page, so we might want to > refresh the number of items on the page at least when we think we're > done, but I believe that any sort of more invasive page rearrangement > would be precluded by the pin. > > I kind of wonder if it would be good to make a change along the lines > of v4-0001 even if this patch set doesn't move forward overall, or > will need a lot of slot-ification to do so. It seems weird to me that > we're OK with calling out to arbitrary code with a buffer lock held, > and even weirder that whether or not we do that depends on whether > SO_ALLOW_PAGEMODE was set. I don't think a difference of this kind > between pagemode behavior and non-pagemode behavior would survive > review if someone proposed it today; the fact that it works the way it > does is probably an artifact of this mechanism having been added > twenty years ago when the project was in a very different place. One maybe crazy thought: what about only enabling qual pushdown when pagemode is used, since it already processes all tuples on a page in one locked phase? That raises the question of whether there's a class of quals simple enough (built-in ops?) that evaluating them alongside visibility checking would be acceptable, with lock held that is -- but it would avoid the lock churn and racy loop termination issues with v4-0001. -- Thanks, Amit Langote [1] https://www.postgresql.org/message-id/CA%2BHiwqHXDY6TxegR2Cr_4sRa_LY1QJnoL8XRmOqdfrx21pZ6cw%40mail.gmail.com
pgsql-hackers by date: