Re: Aggregate leads to superfluous projection from the scan - Mailing list pgsql-hackers

From Zhihong Yu
Subject Re: Aggregate leads to superfluous projection from the scan
Date
Msg-id CALNJ-vTfugZ1-Fx+nuxnJeh7xN56hL-Fha9_nurOBQA02TU5=A@mail.gmail.com
Whole thread Raw
In response to Re: Aggregate leads to superfluous projection from the scan  (Zhihong Yu <zyu@yugabyte.com>)
Responses Re: Aggregate leads to superfluous projection from the scan
List pgsql-hackers


On Fri, Jul 8, 2022 at 12:48 PM Zhihong Yu <zyu@yugabyte.com> wrote:


On Fri, Jul 8, 2022 at 12:30 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Ibrar Ahmed <ibrar.ahmad@gmail.com> writes:
> I give a quick look and I think in case whenever data is extracted from the
> heap it shows all the columns. Therefore when columns are extracted from
> the index only it shows the indexed column only.

This is operating as designed, and I don't think that the proposed
patch is an improvement.  The point of use_physical_tlist() is that
returning all the columns is cheaper because it avoids a projection
step.  That's true for any case where we have to fetch the heap
tuple, so IndexScan is included though IndexOnlyScan is not.

Now, that's something that was true a decade or more ago.
There's been considerable discussion recently about cases where
it's not true anymore, for example with columnar storage or FDWs,
and so we ought to invent a way to prevent createplan.c from
doing it when it would be counterproductive.  But just summarily
turning it off is not an improvement.

                        regards, tom lane
Hi,
In createplan.c, there is `change_plan_targetlist`

Plan *
change_plan_targetlist(Plan *subplan, List *tlist, bool tlist_parallel_safe)

But it doesn't have `Path` as parameter.
So I am not sure whether the check of non-returnable columns should be done in change_plan_targetlist().

bq. for example with columnar storage or FDWs,

Yeah. The above is the case where I want to optimize.

Cheers
Hi, Tom:
I was looking at the following comment in createplan.c :

     * For table scans, rather than using the relation targetlist (which is
     * only those Vars actually needed by the query), we prefer to generate a
     * tlist containing all Vars in order.  This will allow the executor to
     * optimize away projection of the table tuples, if possible. 

Maybe you can give me some background on the above decision.

Thanks

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: replacing role-level NOINHERIT with a grant-level option
Next
From: Jacob Champion
Date:
Subject: Re: Commitfest Update