Re: Extracting only the columns needed for a query - Mailing list pgsql-hackers

From	Melanie Plageman
Subject	Re: Extracting only the columns needed for a query
Date	June 19, 2020 00:46:09
Msg-id	CAAKRu_ZGBOA02zcwx00C7o=37_=oRXOXg6ZwELP0iaA_39VFbA@mail.gmail.com
In response to	Re: Extracting only the columns needed for a query (Dmitry Dolgov <9erthalion6@gmail.com>)
List	pgsql-hackers

On Fri, Mar 13, 2020 at 12:09 PM Dmitry Dolgov <9erthalion6@gmail.com> wrote:

In general implemented functionality looks good. I've checked how it
works on the existing tests, almost everywhere required columns were not
missing in scanCols (which is probably the most important part).
Sometimes expressions were checked multiple times, which could
potentially introduce some overhead, but I believe this effect is
negligible. Just to mention some counterintuitive examples, for this
kind of query

SELECT min(q1) FROM INT8_TBL;

the same column was checked 5 times in my tests, since it's present also
in baserestrictinfo, and build_minmax_path does one extra round of
planning and invoking make_one_rel.

Thanks so much for the very thorough review, Dmitry.

These extra calls to extract_scan_cols() should be okay in this case.
As you mentioned, for min/max queries, the planner attempts an optimization
with an indexscan and, to do this, it modifies the querytree and then
calls query_planner() on it.
It tries it with NULLs first and then NULLs last -- each of which
invokes query_planner(), so that is two out of three calls. The
third is the normal invocation. I'm not sure how you would get five,
though.

Another interesting
example is Values Scan (e.g. in an insert statement with multiple
records): can an abstract table AM user leverage information about
columns in it?

One case, where I believe columns were missing, is statements with
returning:

INSERT INTO foo (col1)
VALUES ('col1'), ('col2'), ('col3')
RETURNING *;

Looks like in this situation the only expression in reltarget is
for col1, but the returning result contains all columns.

This relates to both of your above points:

For this RETURNING query, it is a ValuesScan, so no columns have to be
scanned. We actually do add the reference to col1 to the scanCols
bitmap, though. I'm not sure we want to do so, since we don't want to
scan col1 in this case. I wonder what cases we would miss if we special
cased RTEKind RTE_VALUES when doing extract_scan_cols().
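
To make the special-casing idea concrete, here is a rough, standalone
sketch of what I mean. It is not the patch's code: the SketchRTE type,
the SKETCH_RTE_* kinds, and both helper functions are hypothetical
stand-ins invented for illustration, and a plain 64-bit mask replaces
the real scanCols bitmap. It only shows the shape of the check, namely
that a column reference against a VALUES entry is not recorded as a
column to scan.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not PostgreSQL's real definitions. */
typedef enum SketchRTEKind
{
    SKETCH_RTE_RELATION,    /* ordinary table, read through a table AM */
    SKETCH_RTE_VALUES       /* VALUES (...) list, nothing to scan */
} SketchRTEKind;

typedef struct SketchRTE
{
    SketchRTEKind kind;
    uint64_t      scan_cols;    /* bit N set => column N must be scanned */
} SketchRTE;

/* Record a referenced column (1-based attribute number, at most 63). */
static void
sketch_add_scan_col(SketchRTE *rte, int attno)
{
    assert(attno >= 1 && attno <= 63);

    /* Assumed special case: a VALUES list has no columns to scan. */
    if (rte->kind == SKETCH_RTE_VALUES)
        return;

    rte->scan_cols |= (UINT64_C(1) << attno);
}

/* Return 1 if the column was recorded as needed by the scan, else 0. */
static int
sketch_needs_col(const SketchRTE *rte, int attno)
{
    assert(attno >= 1 && attno <= 63);
    return (int) ((rte->scan_cols >> attno) & 1);
}
```

With this shape, a reference to col1 on a relation entry sets its bit,
while the same reference on a VALUES entry leaves the mask empty. The
open question above still stands: this sketch says nothing about which
real cases such a check would miss.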

Melanie
