Thread: RangeTblEntry.inh vs. RTE_SUBQUERY

RangeTblEntry.inh vs. RTE_SUBQUERY

From
Peter Eisentraut
Date:
Various code comments say that the RangeTblEntry field inh may only be 
set for entries of kind RTE_RELATION.

For example

  *    inh is true for relation references that should be expanded to 
include
  *    inheritance children, if the rel has any.  This *must* be false for
  *    RTEs other than RTE_RELATION entries.

and various comments in other files.

(Confusingly, it is also listed under "Fields valid in all RTEs:", but 
that definitely seems wrong.)

I have been deploying some assertions to see if the claims in the 
RangeTblEntry comments are all correct, and I tripped over something.

The function pull_up_simple_union_all() in prepjointree.c sets ->inh to 
true for RTE_SUBQUERY entries:

     /*
      * Mark the parent as an append relation.
      */
     rte->inh = true;

Whatever this is doing appears to be some undocumented magic.  If I 
remove the line, then regression tests fail with plan differences, so it 
definitely seems to do something.

Is this something we should explain the RangeTblEntry comments?



Re: RangeTblEntry.inh vs. RTE_SUBQUERY

From
Dean Rasheed
Date:
On Fri, 23 Feb 2024 at 14:35, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> Various code comments say that the RangeTblEntry field inh may only be
> set for entries of kind RTE_RELATION.
>
> The function pull_up_simple_union_all() in prepjointree.c sets ->inh to
> true for RTE_SUBQUERY entries:
>
>      /*
>       * Mark the parent as an append relation.
>       */
>      rte->inh = true;
>
> Whatever this is doing appears to be some undocumented magic.

Yes, it's explained a bit more clearly/accurately in expand_inherited_rtentry():

/*
 * expand_inherited_rtentry
 *        Expand a rangetable entry that has the "inh" bit set.
 *
 * "inh" is only allowed in two cases: RELATION and SUBQUERY RTEs.
 *
 * "inh" on a plain RELATION RTE means that it is a partitioned table or the
 * parent of a traditional-inheritance set.  In this case we must add entries
 * for all the interesting child tables to the query's rangetable, and build
 * additional planner data structures for them, including RelOptInfos,
 * AppendRelInfos, and possibly PlanRowMarks.
 *
 * Note that the original RTE is considered to represent the whole inheritance
 * set.  In the case of traditional inheritance, the first of the generated
 * RTEs is an RTE for the same table, but with inh = false, to represent the
 * parent table in its role as a simple member of the inheritance set.  For
 * partitioning, we don't need a second RTE because the partitioned table
 * itself has no data and need not be scanned.
 *
 * "inh" on a SUBQUERY RTE means that it's the parent of a UNION ALL group,
 * which is treated as an appendrel similarly to inheritance cases; however,
 * we already made RTEs and AppendRelInfos for the subqueries.  We only need
 * to build RelOptInfos for them, which is done by expand_appendrel_subquery.
 */

> Is this something we should explain the RangeTblEntry comments?
>

+1

Regards,
Dean



Re: RangeTblEntry.inh vs. RTE_SUBQUERY

From
Tom Lane
Date:
Dean Rasheed <dean.a.rasheed@gmail.com> writes:
> On Fri, 23 Feb 2024 at 14:35, Peter Eisentraut <peter@eisentraut.org> wrote:
>> Various code comments say that the RangeTblEntry field inh may only be
>> set for entries of kind RTE_RELATION.

> Yes, it's explained a bit more clearly/accurately in expand_inherited_rtentry():

>  * "inh" is only allowed in two cases: RELATION and SUBQUERY RTEs.

Yes.  The latter has been accurate for a very long time, so I'm
surprised that there are any places that think otherwise.  We need
to fix them --- where did you see this exactly?

(Note that RELATION-only is accurate within the parser and rewriter,
so maybe clarifications about context are in order.)

            regards, tom lane



Re: RangeTblEntry.inh vs. RTE_SUBQUERY

From
Peter Eisentraut
Date:
On 23.02.24 16:19, Tom Lane wrote:
> Dean Rasheed <dean.a.rasheed@gmail.com> writes:
>> On Fri, 23 Feb 2024 at 14:35, Peter Eisentraut <peter@eisentraut.org> wrote:
>>> Various code comments say that the RangeTblEntry field inh may only be
>>> set for entries of kind RTE_RELATION.
> 
>> Yes, it's explained a bit more clearly/accurately in expand_inherited_rtentry():
> 
>>   * "inh" is only allowed in two cases: RELATION and SUBQUERY RTEs.
> 
> Yes.  The latter has been accurate for a very long time, so I'm
> surprised that there are any places that think otherwise.  We need
> to fix them --- where did you see this exactly?

In nodes/parsenodes.h, it says both

     This *must* be false for RTEs other than RTE_RELATION entries.

and also puts it under

     Fields valid in all RTEs:

which are both wrong on opposite ends of the spectrum.

I think it would make more sense to group inh under "Fields valid for a 
plain relation RTE" and then explain the exception for subqueries, like 
it is done for several other fields.

See attached patch for a proposal.  (I also shuffled a few fields around 
to make the order a bit more logical.)
Attachment

Re: RangeTblEntry.inh vs. RTE_SUBQUERY

From
Tom Lane
Date:
Peter Eisentraut <peter@eisentraut.org> writes:
> In nodes/parsenodes.h, it says both
>      This *must* be false for RTEs other than RTE_RELATION entries.

Well, that's true in the parser ...

> and also puts it under
>      Fields valid in all RTEs:
> which are both wrong on opposite ends of the spectrum.
> I think it would make more sense to group inh under "Fields valid for a 
> plain relation RTE" and then explain the exception for subqueries, like 
> it is done for several other fields.

Dunno.  The adjacent "lateral" field is also used for only selected
RTE kinds.

I'd be inclined to leave it where it is and just improve the
commentary.  That could read like

 *    inh is true for relation references that should be expanded to include
 *    inheritance children, if the rel has any.  In the parser this
 *    will only be true for RTE_RELATION entries.  The planner also uses
 *    this field to mark RTE_SUBQUERY entries that contain UNION ALL
 *    queries that it has flattened into pulled-up subqueries
 *    (creating a structure much like the effects of inheritance).

If you do insist on moving it, please put it next to relkind so it
packs better.

I agree that perminfoindex seems to have suffered from add-at-the-end
syndrome, and if we do touch the field order you made an improvement
there.  (BTW, who thought they needn't bother with a comment for
perminfoindex?)

            regards, tom lane



Re: RangeTblEntry.inh vs. RTE_SUBQUERY

From
Alvaro Herrera
Date:
On 2024-Feb-29, Tom Lane wrote:

> I agree that perminfoindex seems to have suffered from add-at-the-end
> syndrome, and if we do touch the field order you made an improvement
> there.  (BTW, who thought they needn't bother with a comment for
> perminfoindex?)

There is a comment for it, or at least a61b1f74823c added one, though
not immediately adjacent.  I do see that it's now further away than it
was.  Perhaps we could add /* index of RTEPermissionInfo entry, or 0 */
to the line.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"The ability of users to misuse tools is, of course, legendary" (David Steele)
https://postgr.es/m/11b38a96-6ded-4668-b772-40f992132797@pgmasters.net



Re: RangeTblEntry.inh vs. RTE_SUBQUERY

From
Tom Lane
Date:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On 2024-Feb-29, Tom Lane wrote:
>> I agree that perminfoindex seems to have suffered from add-at-the-end
>> syndrome, and if we do touch the field order you made an improvement
>> there.  (BTW, who thought they needn't bother with a comment for
>> perminfoindex?)

> There is a comment for it, or at least a61b1f74823c added one, though
> not immediately adjacent.  I do see that it's now further away than it
> was.  Perhaps we could add /* index of RTEPermissionInfo entry, or 0 */
> to the line.

That'd be enough for me.

            regards, tom lane



Re: RangeTblEntry.inh vs. RTE_SUBQUERY

From
Peter Eisentraut
Date:
On 29.02.24 19:14, Tom Lane wrote:
> Peter Eisentraut <peter@eisentraut.org> writes:
>> In nodes/parsenodes.h, it says both
>>       This *must* be false for RTEs other than RTE_RELATION entries.
> 
> Well, that's true in the parser ...
> 
>> and also puts it under
>>       Fields valid in all RTEs:
>> which are both wrong on opposite ends of the spectrum.
>> I think it would make more sense to group inh under "Fields valid for a
>> plain relation RTE" and then explain the exception for subqueries, like
>> it is done for several other fields.
> 
> Dunno.  The adjacent "lateral" field is also used for only selected
> RTE kinds.

The section is

     /*
      * Fields valid in all RTEs:
      */
     Alias      *alias;          /* user-written alias clause, if any */
     Alias      *eref;           /* expanded reference names */
     bool        lateral;        /* subquery, function, or values is 
LATERAL? */
     bool        inh;            /* inheritance requested? */
     bool        inFromCl;       /* present in FROM clause? */
     List       *securityQuals;  /* security barrier quals to apply, if 
any */

According to my testing, lateral is used for RTE_RELATION, RTE_SUBQUERY, 
RTE_FUNCTION, RTE_TABLEFUNC, RTE_VALUES, which is 5 out of 9 possible. 
So I think it might be okay to relabel that section (in actuality or 
mentally) as "valid in several/many/most RTEs".

But I'm not sure what reason there would be for having inh there, which 
is better described as "valid for RTE_RELATION, but also borrowed by 
RTE_SUBQUERY", which is pretty much exactly what is the case for relid, 
relkind, etc.




Re: RangeTblEntry.inh vs. RTE_SUBQUERY

From
Peter Eisentraut
Date:
On 03.03.24 11:02, Peter Eisentraut wrote:
> On 29.02.24 19:14, Tom Lane wrote:
>> Peter Eisentraut <peter@eisentraut.org> writes:
>>> In nodes/parsenodes.h, it says both
>>>       This *must* be false for RTEs other than RTE_RELATION entries.
>>
>> Well, that's true in the parser ...
>>
>>> and also puts it under
>>>       Fields valid in all RTEs:
>>> which are both wrong on opposite ends of the spectrum.
>>> I think it would make more sense to group inh under "Fields valid for a
>>> plain relation RTE" and then explain the exception for subqueries, like
>>> it is done for several other fields.
>>
>> Dunno.  The adjacent "lateral" field is also used for only selected
>> RTE kinds.
> 
> The section is
> 
>      /*
>       * Fields valid in all RTEs:
>       */
>      Alias      *alias;          /* user-written alias clause, if any */
>      Alias      *eref;           /* expanded reference names */
>      bool        lateral;        /* subquery, function, or values is 
> LATERAL? */
>      bool        inh;            /* inheritance requested? */
>      bool        inFromCl;       /* present in FROM clause? */
>      List       *securityQuals;  /* security barrier quals to apply, if 
> any */
> 
> According to my testing, lateral is used for RTE_RELATION, RTE_SUBQUERY, 
> RTE_FUNCTION, RTE_TABLEFUNC, RTE_VALUES, which is 5 out of 9 possible. 
> So I think it might be okay to relabel that section (in actuality or 
> mentally) as "valid in several/many/most RTEs".
> 
> But I'm not sure what reason there would be for having inh there, which 
> is better described as "valid for RTE_RELATION, but also borrowed by 
> RTE_SUBQUERY", which is pretty much exactly what is the case for relid, 
> relkind, etc.

I have committed the patches for this discussion.

Related discussion will continue at 
https://www.postgresql.org/message-id/flat/4b27fc50-8cd6-46f5-ab20-88dbaadca645@eisentraut.org 
/ https://commitfest.postgresql.org/47/4697/ .