Thread: minor fix for acquire_inherited_sample_rows

minor fix for acquire_inherited_sample_rows

From

Amit Langote

Date:

23 April 2018, 16:14:37

Hi.

acquire_inherited_sample_rows() currently uses equalTupleDescs() being
false as the condition for going to tupconv.c to determine whether tuple
conversion is needed.  But equalTupleDescs() will always return false if
it's passed TupleDesc's of two different tables, which is the most common
case here.  So I first thought we should just unconditionally go to
tupconv.c, but there is still one case where we don't need to, which is
the case where the child table is same as the parent table.  However, it
would be much cheaper to just check if the relation OIDs are different
instead of calling equalTupleDescs, which the attached patch teaches it to do.

Thanks,
Amit

Attachment

condition-may-need-tuple-conversion.patch

Re: minor fix for acquire_inherited_sample_rows

From

Ashutosh Bapat

Date:

23 April 2018, 17:25:09

On Mon, Apr 23, 2018 at 3:44 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> Hi.
>
> acquire_inherited_sample_rows() currently uses equalTupleDescs() being
> false as the condition for going to tupconv.c to determine whether tuple
> conversion is needed.  But equalTupleDescs() will always return false if
> it's passed TupleDesc's of two different tables, which is the most common
> case here.  So I first thought we should just unconditionally go to
> tupconv.c, but there is still one case where we don't need to, which is
> the case where the child table is same as the parent table.  However, it
> would be much cheaper to just check if the relation OIDs are different
> instead of calling equalTupleDescs, which the attached patch teaches it to do.

We want to check whether tuple conversion is needed. Theoretically
(not necessarily practically), one could have tuple descs of two
different tables stamped with the same tdtypeid. From that POV,
equalTupleDescs seems to be a stronger check than what you have in the
patch.

BTW the patch you have posted also has a fix you proposed in some
other thread. Probably you want to get rid of it.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: minor fix for acquire_inherited_sample_rows

From

Amit Langote

Date:

23 April 2018, 19:15:47

Thanks for the review.

On Mon, Apr 23, 2018 at 8:25 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> On Mon, Apr 23, 2018 at 3:44 PM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> Hi.
>>
>> acquire_inherited_sample_rows() currently uses equalTupleDescs() being
>> false as the condition for going to tupconv.c to determine whether tuple
>> conversion is needed.  But equalTupleDescs() will always return false if
>> it's passed TupleDesc's of two different tables, which is the most common
>> case here.  So I first thought we should just unconditionally go to
>> tupconv.c, but there is still one case where we don't need to, which is
>> the case where the child table is same as the parent table.  However, it
>> would be much cheaper to just check if the relation OIDs are different
>> instead of calling equalTupleDescs, which the attached patch teaches it to do.
>
> We want to check whether tuple conversion is needed. Theoretically
> (not necessarily practically), one could have tuple descs of two
> different tables stamped with the same tdtypeid. From that POV,
> equalTupleDescs seems to be a stronger check than what you have in the
> patch.

If I'm reading right, that sounds like a +1 to the patch. :)

> BTW the patch you have posted also has a fix you proposed in some
> other thread. Probably you want to get rid of it.

Oops, fixed that in the attached.

Thanks,
Amit

Attachment

condition-may-need-tuple-conversion.patch

Re: minor fix for acquire_inherited_sample_rows

From

Alvaro Herrera

Date:

23 April 2018, 21:16:23

Hello Amit

Amit Langote wrote:

> acquire_inherited_sample_rows() currently uses equalTupleDescs() being
> false as the condition for going to tupconv.c to determine whether tuple
> conversion is needed.  But equalTupleDescs() will always return false if
> it's passed TupleDesc's of two different tables, which is the most common
> case here.  So I first thought we should just unconditionally go to
> tupconv.c, but there is still one case where we don't need to, which is
> the case where the child table is same as the parent table.  However, it
> would be much cheaper to just check if the relation OIDs are different
> instead of calling equalTupleDescs, which the attached patch teaches it to do.

When (not if) we get around to updating equalTupleDescs to cope, we will
need this call right where it is (and we'd have a hard time finding this
potential callsite later, I think).  If this were a hot spot I would be
happy to change it, but it's not so I'd rather leave it alone.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: minor fix for acquire_inherited_sample_rows

From

Ashutosh Bapat

Date:

24 April 2018, 10:29:33

On Mon, Apr 23, 2018 at 6:45 PM, Amit Langote <amitlangote09@gmail.com> wrote:
> Thanks for the review.
>
> On Mon, Apr 23, 2018 at 8:25 PM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> On Mon, Apr 23, 2018 at 3:44 PM, Amit Langote
>> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>> Hi.
>>>
>>> acquire_inherited_sample_rows() currently uses equalTupleDescs() being
>>> false as the condition for going to tupconv.c to determine whether tuple
>>> conversion is needed.  But equalTupleDescs() will always return false if
>>> it's passed TupleDesc's of two different tables, which is the most common
>>> case here.  So I first thought we should just unconditionally go to
>>> tupconv.c, but there is still one case where we don't need to, which is
>>> the case where the child table is same as the parent table.  However, it
>>> would be much cheaper to just check if the relation OIDs are different
>>> instead of calling equalTupleDescs, which the attached patch teaches it to do.
>>
>> We want to check whether tuple conversion is needed. Theoretically
>> (not necessarily practically), one could have tuple descs of two
>> different tables stamped with the same tdtypeid. From that POV,
>> equalTupleDescs seems to be a stronger check than what you have in the
>> patch.
>
> If I'm reading right, that sounds like a +1 to the patch. :)

Not really! To make things clear, I am not in favour of this patch.

>
>> BTW the patch you have posted also has a fix you proposed in some
>> other thread. Probably you want to get rid of it.
>
> Oops, fixed that in the attached.

Thanks.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: minor fix for acquire_inherited_sample_rows

From

Ashutosh Bapat

Date:

24 April 2018, 10:29:48

On Mon, Apr 23, 2018 at 8:46 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> Hello Amit
>
> Amit Langote wrote:
>
>> acquire_inherited_sample_rows() currently uses equalTupleDescs() being
>> false as the condition for going to tupconv.c to determine whether tuple
>> conversion is needed.  But equalTupleDescs() will always return false if
>> it's passed TupleDesc's of two different tables, which is the most common
>> case here.  So I first thought we should just unconditionally go to
>> tupconv.c, but there is still one case where we don't need to, which is
>> the case where the child table is same as the parent table.  However, it
>> would be much cheaper to just check if the relation OIDs are different
>> instead of calling equalTupleDescs, which the attached patch teaches it to do.
>
> When (not if) we get around to updating equalTupleDescs to cope, we will
> need this call right where it is (and we'd have a hard time finding this
> potential callsite later, I think).  If this were a hot spot I would be
> happy to change it, but it's not so I'd rather leave it alone.
>
+1




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: minor fix for acquire_inherited_sample_rows

From

Amit Langote

Date:

24 April 2018, 10:43:21

On 2018/04/24 13:29, Ashutosh Bapat wrote:
> On Mon, Apr 23, 2018 at 6:45 PM, Amit Langote <amitlangote09@gmail.com> wrote:
>> On Mon, Apr 23, 2018 at 8:25 PM, Ashutosh Bapat wrote:
>>> On Mon, Apr 23, 2018 at 3:44 PM, Amit Langote wrote:
>>>> Hi.
>>>>
>>>> acquire_inherited_sample_rows() currently uses equalTupleDescs() being
>>>> false as the condition for going to tupconv.c to determine whether tuple
>>>> conversion is needed.  But equalTupleDescs() will always return false if
>>>> it's passed TupleDesc's of two different tables, which is the most common
>>>> case here.  So I first thought we should just unconditionally go to
>>>> tupconv.c, but there is still one case where we don't need to, which is
>>>> the case where the child table is same as the parent table.  However, it
>>>> would be much cheaper to just check if the relation OIDs are different
>>>> instead of calling equalTupleDescs, which the attached patch teaches it to do.
>>>
>>> We want to check whether tuple conversion is needed. Theoretically
>>> (not necessarily practically), one could have tuple descs of two
>>> different tables stamped with the same tdtypeid. From that POV,
>>> equalTupleDescs seems to be a stronger check than what you have in the
>>> patch.
>>
>> If I'm reading right, that sounds like a +1 to the patch. :)
> 
> Not really! To make things clear, I am not in favour of this patch.

Oh okay.  I read your last sentence as "equalTupleDescs() seems to be a
stronger check than needed" (which is how I see it), but apparently that's
not what you meant to say.  Anyway, thanks for clarifying.

Regards,
Amit

Re: minor fix for acquire_inherited_sample_rows

From

Amit Langote

Date:

24 April 2018, 16:19:44

Hi.

On 2018/04/24 0:16, Alvaro Herrera wrote:
> Hello Amit
> 
> Amit Langote wrote:
> 
>> acquire_inherited_sample_rows() currently uses equalTupleDescs() being
>> false as the condition for going to tupconv.c to determine whether tuple
>> conversion is needed.  But equalTupleDescs() will always return false if
>> it's passed TupleDesc's of two different tables, which is the most common
>> case here.  So I first thought we should just unconditionally go to
>> tupconv.c, but there is still one case where we don't need to, which is
>> the case where the child table is same as the parent table.  However, it
>> would be much cheaper to just check if the relation OIDs are different
>> instead of calling equalTupleDescs, which the attached patch teaches it to do.
> 
> When (not if) we get around to updating equalTupleDescs to cope, we will
> need this call right where it is (and we'd have a hard time finding this
> potential callsite later, I think).

Given what equalTupleDescs was invented for (commit a152ebeec), reducing
it down to what can be sensibly used for checking whether tuple conversion
is needed between a parent and child will, I'm afraid, make it useless for
its original purpose.  Just looking at a few checks that are now in it,
for example:

    for (i = 0; i < tupdesc1->natts; i++)
    {
        Form_pg_attribute attr1 = TupleDescAttr(tupdesc1, i);
        Form_pg_attribute attr2 = TupleDescAttr(tupdesc2, i);

        <...snip...>

        if (attr1->attislocal != attr2->attislocal)
            return false;
        if (attr1->attinhcount != attr2->attinhcount)
            return false;
        <...snip...>
        /* attacl, attoptions and attfdwoptions are not even present... */
    }

attislocal and attinhcount obviously can't be same for parent and a child.
 Also, the comment above seems to assume that this function is being
called from the only place it was designed for (RelationClearRelation).

Further down:

    if (tupdesc1->constr != NULL)
    {
        TupleConstr *constr1 = tupdesc1->constr;
        TupleConstr *constr2 = tupdesc2->constr;

        <...snip...>
        n = constr1->num_defval;
        if (n != (int) constr2->num_defval)
            return false;
        for (i = 0; i < n; i++)
        {
            AttrDefault *defval1 = constr1->defval + i;
            AttrDefault *defval2 = constr2->defval;

            <...snip...>
            if (strcmp(defval1->adbin, defval2->adbin) != 0)
                return false;

Now this last one will always return false, because the location *appears*
to be different in this usage.  We'll have to believe that this test
returns true in at least the cases for which equalTupleDescs() was
designed for.

Then there is code further down that checks other details being equal like
constr->matching, constr->check, etc.  For CHECK constraints, we again
check ccbin, which might have different location values, their inheritance
status, etc.

To summarize, I think equalTupleDescs()'s implementation relies on lots of
details to match between passed-in descriptors (down to basically
everything) for them to be considered "logically" equal.  Not to mention
that "logically" here is for relcache.c's or RelationClearRelation()'s
purposes.

We're considering using this to determine if we need to convert tuples
between parent and child/partition and for that we only need to check if
attributes of the same name appear in the same physical position in both
parent and child.  Rest of the details are either guaranteed same
(individual attributes's types, typemod, typcollate, etc.) or irrelevant
to comparison (their atthasdef, default expressions, check constraints,
etc.).  I proposed to implement such a logic in tupconv.c itself [1],
which I'll re-propose for PG 12.

> If this were a hot spot I would be
> happy to change it, but it's not so I'd rather leave it alone.

Sure, but if we adopt some solution to optimize the check for whether we
need to convert tuples for a hot spot, say, ExecInitPartitionInfo(), then
I'd imagine we'd want to use the same solution everywhere.

Thanks,
Amit

[1]
https://www.postgresql.org/message-id/825031be-942c-8c24-6163-13c27f217a3d%40lab.ntt.co.jp

Re: minor fix for acquire_inherited_sample_rows

From

Robert Haas

Date:

26 April 2018, 01:38:46

On Tue, Apr 24, 2018 at 6:19 AM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> Given what equalTupleDescs was invented for (commit a152ebeec), reducing
> it down to what can be sensibly used for checking whether tuple conversion
> is needed between a parent and child will, I'm afraid, make it useless for
> its original purpose.  Just looking at a few checks that are now in it,
> for example:
>
>     for (i = 0; i < tupdesc1->natts; i++)
>     {
>         Form_pg_attribute attr1 = TupleDescAttr(tupdesc1, i);
>         Form_pg_attribute attr2 = TupleDescAttr(tupdesc2, i);
>
>         <...snip...>
>
>         if (attr1->attislocal != attr2->attislocal)
>             return false;
>         if (attr1->attinhcount != attr2->attinhcount)
>             return false;
>         <...snip...>
>         /* attacl, attoptions and attfdwoptions are not even present... */
>     }
>
> attislocal and attinhcount obviously can't be same for parent and a child.
>  Also, the comment above seems to assume that this function is being
> called from the only place it was designed for (RelationClearRelation).

+1.  I think we're really abusing equalTupleDescs() for purposes for
which it was not invented.  Instead of changing it, let's invent a new
function that tests for the thing partitioning cares about (same
ordering of the same columns with the same type information) and call
it logicallyEqualTupleDescs() or something like that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: minor fix for acquire_inherited_sample_rows

From

Ashutosh Bapat

Date:

26 April 2018, 18:54:35

On Thu, Apr 26, 2018 at 1:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Apr 24, 2018 at 6:19 AM, Amit Langote
> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>> Given what equalTupleDescs was invented for (commit a152ebeec), reducing
>> it down to what can be sensibly used for checking whether tuple conversion
>> is needed between a parent and child will, I'm afraid, make it useless for
>> its original purpose.  Just looking at a few checks that are now in it,
>> for example:
>>
>>     for (i = 0; i < tupdesc1->natts; i++)
>>     {
>>         Form_pg_attribute attr1 = TupleDescAttr(tupdesc1, i);
>>         Form_pg_attribute attr2 = TupleDescAttr(tupdesc2, i);
>>
>>         <...snip...>
>>
>>         if (attr1->attislocal != attr2->attislocal)
>>             return false;
>>         if (attr1->attinhcount != attr2->attinhcount)
>>             return false;
>>         <...snip...>
>>         /* attacl, attoptions and attfdwoptions are not even present... */
>>     }
>>
>> attislocal and attinhcount obviously can't be same for parent and a child.
>>  Also, the comment above seems to assume that this function is being
>> called from the only place it was designed for (RelationClearRelation).
>
> +1.  I think we're really abusing equalTupleDescs() for purposes for
> which it was not invented.  Instead of changing it, let's invent a new
> function that tests for the thing partitioning cares about (same
> ordering of the same columns with the same type information) and call
> it logicallyEqualTupleDescs() or something like that.

Why don't we just rely on the output of convert_tuples_by_name(),
which it seems is always called right now? What's advantage of adding
another tuple descriptor comparison?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: minor fix for acquire_inherited_sample_rows

From

Amit Langote

Date:

26 April 2018, 19:38:00

On Thu, Apr 26, 2018 at 9:54 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> On Thu, Apr 26, 2018 at 1:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> +1.  I think we're really abusing equalTupleDescs() for purposes for
>> which it was not invented.  Instead of changing it, let's invent a new
>> function that tests for the thing partitioning cares about (same
>> ordering of the same columns with the same type information) and call
>> it logicallyEqualTupleDescs() or something like that.
>
> Why don't we just rely on the output of convert_tuples_by_name(),
> which it seems is always called right now? What's advantage of adding
> another tuple descriptor comparison?

The patch I mentioned in my email above does more or less that (what
you're saying we should do).  In fact it even modifies
convert_tuple_by_name and convert_tuple_by_name_map to remove some
redundant computation.  See that patch here if you're interested:

https://www.postgresql.org/message-id/825031be-942c-8c24-6163-13c27f217a3d%40lab.ntt.co.jp

Thanks,
Amit

Re: minor fix for acquire_inherited_sample_rows

From

Alvaro Herrera

Date:

26 April 2018, 19:47:15

Amit Langote wrote:
> On Thu, Apr 26, 2018 at 9:54 PM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
> > On Thu, Apr 26, 2018 at 1:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> >> +1.  I think we're really abusing equalTupleDescs() for purposes for
> >> which it was not invented.  Instead of changing it, let's invent a new
> >> function that tests for the thing partitioning cares about (same
> >> ordering of the same columns with the same type information) and call
> >> it logicallyEqualTupleDescs() or something like that.
> >
> > Why don't we just rely on the output of convert_tuples_by_name(),
> > which it seems is always called right now? What's advantage of adding
> > another tuple descriptor comparison?
> 
> The patch I mentioned in my email above does more or less that (what
> you're saying we should do).  In fact it even modifies
> convert_tuple_by_name and convert_tuple_by_name_map to remove some
> redundant computation.  See that patch here if you're interested:
> 
> https://www.postgresql.org/message-id/825031be-942c-8c24-6163-13c27f217a3d%40lab.ntt.co.jp

Yeah, I didn't like the additional nullness checks in that patch.  I was
thinking it might be saner to create a new struct with the AttrNumber *
array, its length and a boolean flag indicating a no-op conversion.
Then we can call map_variable_attnos unconditionally and it does nothing
if the flag is set.  This localizes the ugliness.

I'm not sure if this idea is better than Robert's logicallyEqualTupleDescs().
Maybe we can use both in different places.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: minor fix for acquire_inherited_sample_rows

From

Ashutosh Bapat

Date:

27 April 2018, 19:42:14

On Thu, Apr 26, 2018 at 7:08 PM, Amit Langote <amitlangote09@gmail.com> wrote:
> On Thu, Apr 26, 2018 at 9:54 PM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
>> On Thu, Apr 26, 2018 at 1:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> +1.  I think we're really abusing equalTupleDescs() for purposes for
>>> which it was not invented.  Instead of changing it, let's invent a new
>>> function that tests for the thing partitioning cares about (same
>>> ordering of the same columns with the same type information) and call
>>> it logicallyEqualTupleDescs() or something like that.
>>
>> Why don't we just rely on the output of convert_tuples_by_name(),
>> which it seems is always called right now? What's advantage of adding
>> another tuple descriptor comparison?
>
> The patch I mentioned in my email above does more or less that (what
> you're saying we should do).  In fact it even modifies
> convert_tuple_by_name and convert_tuple_by_name_map to remove some
> redundant computation.  See that patch here if you're interested:
>
> https://www.postgresql.org/message-id/825031be-942c-8c24-6163-13c27f217a3d%40lab.ntt.co.jp

I spent some time looking at the patch. The patch clubs all kinds of
refactoring together, making review a bit difficult. I think, it would
be better to split the patch into multiple, each addressing one set of
changes, it might become easier to review.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

Re: minor fix for acquire_inherited_sample_rows

From

Amit Langote

Date:

01 May 2018, 15:47:29

On 2018/04/27 22:42, Ashutosh Bapat wrote:
> On Thu, Apr 26, 2018 at 7:08 PM, Amit Langote <amitlangote09@gmail.com> wrote:
>> On Thu, Apr 26, 2018 at 9:54 PM, Ashutosh Bapat
>> <ashutosh.bapat@enterprisedb.com> wrote:
>>> On Thu, Apr 26, 2018 at 1:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>>> +1.  I think we're really abusing equalTupleDescs() for purposes for
>>>> which it was not invented.  Instead of changing it, let's invent a new
>>>> function that tests for the thing partitioning cares about (same
>>>> ordering of the same columns with the same type information) and call
>>>> it logicallyEqualTupleDescs() or something like that.
>>>
>>> Why don't we just rely on the output of convert_tuples_by_name(),
>>> which it seems is always called right now? What's advantage of adding
>>> another tuple descriptor comparison?
>>
>> The patch I mentioned in my email above does more or less that (what
>> you're saying we should do).  In fact it even modifies
>> convert_tuple_by_name and convert_tuple_by_name_map to remove some
>> redundant computation.  See that patch here if you're interested:
>>
>> https://www.postgresql.org/message-id/825031be-942c-8c24-6163-13c27f217a3d%40lab.ntt.co.jp
> 
> I spent some time looking at the patch. The patch clubs all kinds of
> refactoring together, making review a bit difficult. I think, it would
> be better to split the patch into multiple, each addressing one set of
> changes, it might become easier to review.

Thanks Ashutosh for looking at it.  I will think of a way to break it up
when I re-propose it for the next cycle.

Regards,
Amit