Re: How to know referenced sub-fields of a composite type? - Mailing list pgsql-hackers

From	Haribabu Kommi
Subject	Re: How to know referenced sub-fields of a composite type?
Date	May 29, 2019 08:44:42
Msg-id	CAJrrPGcEnvK86cLsPmE5Eqw-2t9HG-azBUqCqM-f=E+qn2UcxQ@mail.gmail.com
In response to	Re: How to know referenced sub-fields of a composite type? (Kohei KaiGai <kaigai@heterodb.com>)
List	pgsql-hackers

On Wed, May 29, 2019 at 4:51 PM Kohei KaiGai <kaigai@heterodb.com> wrote:

Hi Amit,

On Wed, May 29, 2019 at 13:26, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>
> Kaigai-san,
>
> On 2019/05/29 12:13, Kohei KaiGai wrote:
> > One interesting data type in Apache Arrow is the "Struct" data type. It is
> > equivalent to a composite type in PostgreSQL. The "Struct" type has
> > sub-fields, and each sub-field has its own values array.
> >
> > It means we can skip loading the unreferenced sub-fields, if the
> > query planner can handle referenced and unreferenced sub-fields correctly.
> > On the other hand, it looks to me that RelOptInfo and the other
> > optimizer-related structures don't have this kind of information.
> > RelOptInfo->attr_needed tells an extension which attributes are
> > referenced by other relations; however, its granularity is not
> > sufficient for sub-fields.
>
> Isn't that true for some other cases as well, like when a query accesses
> only some sub-fields of a json(b) column? In that case too, planner
> itself can't optimize away access to other sub-fields. What it can do
> though is match a suitable index to the operator used to access the
> individual sub-fields, so that the index (if one is matched and chosen)
> can optimize away accessing unnecessary sub-fields. IOW, it seems to me
> that the optimizer leaves it up to the indexes (and plan nodes) to further
> optimize access to within a field. How is this case any different?
>
I think it is a slightly different scenario.
Even if an index on sub-fields can identify the tuples to be fetched,
each fetched tuple contains all the sub-fields because a heap tuple is
row-oriented data.
For example, if the WHERE clause checks a sub-field "x" and an aggregate
function then references another sub-field "y", the Scan/Join node has to
return a tuple that contains both "x" and "y". IndexScan also returns a
tuple with the full composite type, so there is no problem even if we cannot
know which sub-fields are referenced in a later stage.
Maybe, if IndexOnlyScan supported returning a partial composite type,
it would need similar infrastructure, which could also be used for better
composite type support on columnar storage.
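
To make the "x"/"y" example above concrete, here is a minimal sketch in Python rather than PostgreSQL code. It assumes an Arrow-style "Struct" column where every sub-field has its own values array. The names StructColumn, referenced_subfields and sum_y_where_x_positive are hypothetical and exist only for this illustration; nothing like them is in the planner today.

```python
from dataclasses import dataclass, field


@dataclass
class StructColumn:
    """Hypothetical Arrow-style Struct column: one values array per sub-field."""
    subfields: dict                              # sub-field name -> values array
    loaded: list = field(default_factory=list)   # records which arrays were read

    def load(self, names):
        """Read only the requested sub-field arrays."""
        out = {}
        for name in sorted(names):
            self.loaded.append(name)
            out[name] = self.subfields[name]
        return out


def referenced_subfields(*expr_refs):
    """Union of the sub-field names referenced by each part of the query."""
    needed = set()
    for refs in expr_refs:
        needed |= set(refs)
    return needed


def sum_y_where_x_positive(col):
    # The WHERE clause references "x"; the aggregate references "y".
    needed = referenced_subfields({"x"}, {"y"})
    data = col.load(needed)
    return sum(y for x, y in zip(data["x"], data["y"]) if x > 0)


col = StructColumn({"x": [1, -2, 3], "y": [10, 20, 30], "z": ["a", "b", "c"]})
total = sum_y_where_x_positive(col)
```

In this sketch the scan has to see the union of the sub-fields used by the qualifier and by the aggregate, and the array for "z" is never read. That union is the piece of information that attr_needed cannot express at sub-field granularity.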

There is another issue related to columnar stores: the scan needs to know the
targeted columns for projection, which is discussed in the zedstore thread [1].
Projecting all columns from a columnar store is considerably more expensive
than projecting them from a row store.
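
As a rough illustration of why the scan wants the list of targeted columns, here is a toy read-count model in Python. It is only a sketch under my own assumptions: the function names and page counts are made up, it is not how zedstore accounts for I/O, and it ignores the extra work of stitching column values back into tuples.

```python
def row_store_reads(num_pages, projected):
    # Row-oriented: every page holds whole tuples, so the number of pages
    # read does not depend on which columns the query projects.
    return num_pages


def column_store_reads(pages_per_column, projected):
    # Column-oriented: each projected column is read from its own storage,
    # so the cost grows with the number of projected columns.
    return sum(pages_per_column[name] for name in projected)


# Hypothetical table with three columns of 10 pages each.
pages_per_column = {"a": 10, "b": 10, "c": 10}

one_column = column_store_reads(pages_per_column, {"a"})
all_columns = column_store_reads(pages_per_column, {"a", "b", "c"})
row_reads = row_store_reads(30, {"a"})
```

With a known target list the columnar scan in this model touches a third of the pages; without it, the scan has to project every column and loses that advantage.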

[1] - https://www.postgresql.org/message-id/CALfoeivu-n5o8Juz9wW%2BkTjnis6_%2BrfMf%2BzOTky1LiTVk-ZFjA%40mail.gmail.com

Regards,

Haribabu Kommi

Fujitsu Australia
