Re: Extracting only the columns needed for a query - Mailing list pgsql-hackers

From Ashwin Agrawal
Subject Re: Extracting only the columns needed for a query
Date
Msg-id CALfoeiugKFT+5PGceRgJEDjNOjEL8bxGe3UzAgVByOsXoVCcMg@mail.gmail.com
In response to Re: Extracting only the columns needed for a query  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Extracting only the columns needed for a query  (Melanie Plageman <melanieplageman@gmail.com>)
List pgsql-hackers

On Sat, Jun 15, 2019 at 10:02 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Approach B: after parsing and/or after planning
>
> If we wanted to do something about this, making the planner record
> the set of used columns seems like the thing to do.  We could avoid
> the expense of doing it when it's not needed by setting up an AM/FDW/
> etc property or callback to request it.

Sounds good. In the Zedstore patch, we have added an AM property to convey that
the AM leverages column projection, and we currently skip the physical tlist
optimization based on it. So yes, it can similarly be leveraged for other
planning needs.
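
As a rough illustration of gating planner behavior on such a property, here is a minimal standalone C sketch. The struct, its field, and the function are hypothetical stand-ins, not the actual Zedstore patch or PostgreSQL's planner code; the point is only that the physical tlist shortcut is skipped when the AM says it can project columns.

```c
#include <stdbool.h>

/* Hypothetical AM property block; not PostgreSQL's TableAmRoutine. */
typedef struct AmProperties
{
    bool uses_column_projection;    /* AM can scan a subset of columns */
} AmProperties;

/*
 * Decide whether the planner should emit the full physical row.
 * If the AM can project columns, asking for every column would defeat
 * that, so the shortcut is declined even when otherwise eligible.
 */
static bool
should_use_physical_tlist(const AmProperties *am, bool otherwise_eligible)
{
    if (am->uses_column_projection)
        return false;
    return otherwise_eligible;
}
```

A row-oriented AM would leave the flag false and keep the existing behavior; a column-oriented AM would set it and receive only the columns the query references.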
 
> Another reason for having the planner do this is that presumably, in
> an AM that's excited about this, the set of fetched columns should
> play into the cost estimates for the scan.  I've not been paying
> enough attention to the tableam work to know if we've got hooks for
> the AM to affect scan costing ... but if we don't, that seems like
> a hole that needs plugged.

The AM callback relation_estimate_size exists currently, and the planner
leverages it to fetch tuples, pages, etc. So our thought is to extend this API,
if possible, to pass down the needed columns and help perform better costing
for the query. However, if we wish to leverage this function, we need to know
the list of columns before planning, and hence might need to derive it from the
query tree.
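
To make the costing idea concrete, here is a minimal standalone C sketch of a size estimate that takes the needed columns into account. Everything in it is an assumption for illustration: the SizeEstimate struct, the per-column page counts, and the 64-column bitmask are hypothetical, and this is not the signature of relation_estimate_size nor a proposed extension of it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type; the real callback reports more than this. */
typedef struct SizeEstimate
{
    double tuples;  /* estimated number of rows */
    double pages;   /* estimated pages the scan has to read */
} SizeEstimate;

/*
 * Estimate scan size for a column-oriented table, assuming the AM knows
 * how many pages each column occupies. Bit i of "needed" is set when
 * column i (0-based) is referenced by the query. Only the pages of
 * needed columns are counted; the row count is unaffected.
 */
static SizeEstimate
estimate_size_for_columns(const double *col_pages, size_t ncols,
                          double tuples, uint64_t needed)
{
    SizeEstimate est = { tuples, 0.0 };

    for (size_t i = 0; i < ncols && i < 64; i++)
    {
        if (needed & (UINT64_C(1) << i))
            est.pages += col_pages[i];
    }
    return est;
}
```

With such an estimate, a query touching one narrow column of a wide table would be costed on that column's pages alone rather than on the whole relation.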


> > Approach B, however, does not work for utility statements which do
> > not go through planning.
>
> I'm not sure why you're excited about that case?  Utility statements
> tend to be pretty much all-or-nothing as far as data access goes.

Statements like COPY, CREATE INDEX, constraint creation, etc. can benefit from
scanning only a subset of columns. For example, in Zedstore, for CREATE INDEX
we currently extract the needed columns by walking indexInfo->ii_Predicate and
indexInfo->ii_Expressions. For COPY, we currently use cstate->attnumlist to
know which columns need to be scanned.
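
The column extraction for these two utility cases can be sketched in standalone C as follows. The Expr type here is a hypothetical, much simplified stand-in for PostgreSQL's expression nodes, and the 64-bit mask stands in for a real column set; this is not the Zedstore code, only an illustration of walking a predicate plus expressions, and of turning an attribute-number list into the same kind of set.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified expression node; NOT PostgreSQL's Node. */
typedef enum { EXPR_VAR, EXPR_CONST, EXPR_OP } ExprKind;

typedef struct Expr
{
    ExprKind kind;
    int attno;                  /* 1-based column number, for EXPR_VAR */
    const struct Expr *left;    /* operands, for EXPR_OP (may be NULL) */
    const struct Expr *right;
} Expr;

/* Recursively set bit (attno - 1) for every column reference found. */
static void
collect_columns(const Expr *e, uint64_t *cols)
{
    if (e == NULL)
        return;
    switch (e->kind)
    {
        case EXPR_VAR:
            if (e->attno >= 1 && e->attno <= 64)
                *cols |= UINT64_C(1) << (e->attno - 1);
            break;
        case EXPR_OP:
            collect_columns(e->left, cols);
            collect_columns(e->right, cols);
            break;
        case EXPR_CONST:
            break;
    }
}

/* CREATE INDEX case: union of the predicate's and expressions' columns. */
static uint64_t
index_build_columns(const Expr *predicate,
                    const Expr *const *exprs, size_t nexprs)
{
    uint64_t cols = 0;

    collect_columns(predicate, &cols);
    for (size_t i = 0; i < nexprs; i++)
        collect_columns(exprs[i], &cols);
    return cols;
}

/* COPY case: the statement already carries a list of column numbers. */
static uint64_t
copy_columns(const int *attnums, size_t n)
{
    uint64_t cols = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (attnums[i] >= 1 && attnums[i] <= 64)
            cols |= UINT64_C(1) << (attnums[i] - 1);
    }
    return cols;
}
```

Both helpers produce the same representation, so a scan routine could accept one column set regardless of which utility statement requested it.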
