Re: track needed attributes in plan nodes for executor use - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: track needed attributes in plan nodes for executor use |
Date | |
Msg-id | CA+HiwqHc1EJ1_LSu491fmXS9CqrFmGOKh2Z7udYBH19zBTLKLA@mail.gmail.com |
In response to | Re: track needed attributes in plan nodes for executor use (Japin Li <japinli@hotmail.com>) |
List | pgsql-hackers |
On Fri, Jul 11, 2025 at 6:58 PM Japin Li <japinli@hotmail.com> wrote:
> On Fri, 11 Jul 2025 at 17:16, Amit Langote <amitlangote09@gmail.com> wrote:
> > Hi,
> >
> > I’ve been experimenting with an optimization that reduces executor
> > overhead by avoiding unnecessary attribute deformation. Specifically,
> > if the executor knows which attributes are actually needed by a plan
> > node’s targetlist and qual, it can skip deforming unused columns
> > entirely.
> >
> > In a proof-of-concept patch, I initially computed the needed
> > attributes during ExecInitSeqScan by walking the plan’s qual and
> > targetlist to support deforming only what’s needed when evaluating
> > expressions in ExecSeqScan() or the variant thereof (I started with
> > SeqScan to keep the initial patch minimal). However, adding more work
> > to ExecInit* adds to executor startup cost, which we should generally
> > try to reduce. It also makes it harder to apply the optimization
> > uniformly across plan types.
> >
> > I’d now like to propose computing the needed attributes at planning
> > time instead. This can be done at the bottom of create_plan_recurse,
> > after the plan node has been constructed. A small helper like
> > record_needed_attrs(plan) can walk the node’s targetlist and qual
> > using pull_varattnos() and store the result in a new Bitmapset
> > *attr_used field in the Plan struct. System attributes returned by
> > pull_varattnos() can be filtered out during this step, since they're
> > either not relevant to deformation or not performance sensitive.
> >
> > This also lays the groundwork for a related executor-side optimization
> > that David Rowley suggested to me off-list. The idea is to remember,
> > in the TupleDesc, either the attribute number or the byte offset of
> > the first variable-length attribute. Then, if the minimum required
> > attribute (as provided by attr_used) lies before that, the executor
> > can safely jump directly to it using the cached offset, rather than
> > starting deformation from attno 0 as it currently does. That avoids
> > walking through fixed-length attributes that aren't needed --
> > specifically, skipping per-attribute alignment, null checking, and
> > offset tracking for unused columns -- which reduces CPU work and
> > avoids loading irrelevant tuple bytes into cache.
> >
> > With both patches in place, heap tuple deforming can skip over unused
> > attributes entirely. For example, on a 30-column table where the first
> > 15 columns are fixed-width, the query:
> >
> > select sum(a_1) from foo where a_10 = $1;
> >
> > which references only two fixed-width columns, ran nearly 2x faster
> > with the optimization in place (with heap pages prewarmed into
> > shared_buffers).
> >
> > In more complex plans, for example those involving a Sort or Join
> > between the scan and aggregation, the CPU cost of the intermediate
> > node may dominate, making deforming-related savings at the top less
> > visible in overall performance. Still, I don't think that's a reason
> > to avoid enabling this optimization more broadly across plan nodes.
> >
> > I'll post the PoC patches and performance measurements. Posting this
> > in advance to get feedback on the proposed direction and where best to
> > place attr_used.
> >
>
> That's interesting. If I understand correctly, this approach wouldn't work if
> the first attribute is variable-length, right?

That is correct.

--
Thanks, Amit Langote
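
To make the "jump to the first needed attribute" idea concrete, here is a rough, standalone sketch. It is not the PoC patch and uses no PostgreSQL code: ToyTupleDesc, toy_cache_offsets(), toy_min_needed() and toy_deform_start() are hypothetical names made up for illustration, a plain uint32_t bitmask stands in for the Bitmapset attr_used, column indexes are 0-based, and alignment padding and NULLs are ignored entirely (a real implementation would have to account for both).

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_ATTS 32

/* Hypothetical, simplified stand-in for a tuple descriptor. */
typedef struct ToyTupleDesc
{
    int natts;
    int attlen[TOY_MAX_ATTS];     /* > 0: fixed width in bytes; -1: variable length */
    int cached_off[TOY_MAX_ATTS]; /* valid only for indexes < first_varlen */
    int first_varlen;             /* index of first variable-length column, or natts */
} ToyTupleDesc;

/* Cache byte offsets for the leading run of fixed-width columns. */
static void
toy_cache_offsets(ToyTupleDesc *desc)
{
    int off = 0;
    int i;

    assert(desc->natts >= 0 && desc->natts <= TOY_MAX_ATTS);
    desc->first_varlen = desc->natts;
    for (i = 0; i < desc->natts; i++)
    {
        if (desc->attlen[i] < 0)
        {
            /* offsets past this point depend on the tuple's contents */
            desc->first_varlen = i;
            break;
        }
        desc->cached_off[i] = off;
        off += desc->attlen[i];
    }
}

/* Lowest column index set in attr_used, or natts if none is set. */
static int
toy_min_needed(uint32_t attr_used, int natts)
{
    int i;

    for (i = 0; i < natts; i++)
        if (attr_used & (UINT32_C(1) << i))
            return i;
    return natts;
}

/*
 * Return the column index at which deforming can start and set *start_off
 * to its byte offset. If the first needed column lies before the first
 * variable-length column, jump straight to it; otherwise fall back to
 * starting from column 0.
 */
static int
toy_deform_start(const ToyTupleDesc *desc, uint32_t attr_used, int *start_off)
{
    int min_needed = toy_min_needed(attr_used, desc->natts);

    if (min_needed < desc->first_varlen)
    {
        *start_off = desc->cached_off[min_needed];
        return min_needed;
    }
    *start_off = 0;
    return 0;
}
```

With columns of width {4, 4, variable, 4}, needing only the second column lets the toy start at byte offset 4, while needing only the fourth column falls back to offset 0. If the first column is variable-length, the fallback is always taken, which matches the limitation confirmed above.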