Re: [BUG?] strange behavior in ALTER TABLE ... RENAME TO on inherited columns - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [BUG?] strange behavior in ALTER TABLE ... RENAME TO on inherited columns
Date
Msg-id 603c8f071001291036x33ba0b91k6431d35c3bd88a91@mail.gmail.com
Whole thread Raw
In response to Re: [BUG?] strange behavior in ALTER TABLE ... RENAME TO on inherited columns  (KaiGai Kohei <kaigai@ak.jp.nec.com>)
Responses Re: [BUG?] strange behavior in ALTER TABLE ... RENAME TO on inherited columns  (KaiGai Kohei <kaigai@ak.jp.nec.com>)
List pgsql-hackers
2010/1/28 KaiGai Kohei <kaigai@ak.jp.nec.com>:
> (2010/01/29 9:58), KaiGai Kohei wrote:
>> (2010/01/29 9:29), Robert Haas wrote:
>>> 2010/1/28 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>>>> (2010/01/29 0:46), Robert Haas wrote:
>>>>> 2010/1/27 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>>>>>> Hmm, indeed, this logic (V3/V5) is busted.
>>>>>>
>>>>>> The idea of V4 patch can also handle this case correctly, although it
>>>>>> is lesser in performance.
>>>>>> I wonder whether it is really unacceptable cost in performance, or not.
>>>>>> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations,
>>>>>> and I don't think this bugfix will damage to the reputation of PostgreSQL.
>>>>>>
>>>>>> Where should we go on the next?
>>>>>
>>>>> Isn't the problem here just that the following comment is 100% wrong?
>>>>>
>>>>>                    /*
>>>>>                     * Unlike find_all_inheritors(), we need to walk on
>>>>> child relations
>>>>>                     * that have diamond inheritance tree, because this
>>>>> function has to
>>>>>                     * return correct expected inhecount to the caller.
>>>>>                     */
>>>>>
>>>>> It seems to me that the right solution here is to just add one more
>>>>> argument to find_all_inheritors(), something like List
>>>>> **expected_inh_count.
>>>>>
>>>>> Am I missing something?
>>>>
>>>> The find_all_inheritors() does not walk on child relations more than
>>>> two times, even if a child has multiple parents inherited from common
>>>> origin, because list_concat_unique_oid() ignores the given OID if it
>>>> is already on the list. It means all the child relations under the
>>>> relation already walked on does not checked anywhere. (Of course,
>>>> this assumption is correct for the purpose of find_all_inheritors()
>>>> with minimum cost.)
>>>>
>>>> What we want to do here is to compute the number of times a certain
>>>> child relation is inherited from a common origin; it shall be the
>>>> expected-inhcount. So, we need an arrangement to the logic.
>>>>
>>>> For example, see the following diagram.
>>>>
>>>>     T2
>>>>    /  \
>>>> T1    T4---T5
>>>>    \  /
>>>>     T3
>>>>
>>>> If we call find_all_inheritors() with T1. The find_inheritance_children()
>>>> returns T2 and T3 for T1.
>>>> Then, it calls find_inheritance_children() for T2, and it returns T4.
>>>> Then, it calls find_inheritance_children() for T3, and it returns T4, but
>>>> it is already in the "rels_list", so list_concat_unique_oid() ignores it.
>>>> Then, it calls find_inheritance_children() for T4, and it returns T5.
>>>>
>>>> In this example, we want the expected inhcount for T2 and T3 should be 1,
>>>> for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so
>>>> they will have 1 incorrectly.
>>>> Even if we count up the ignored OID (T4), find_all_inheritors() does not
>>>> walk on T5, because it is already walked on obviously when T4 is ignored.
>>>
>>> I think the count for T5 should be 1, and I think that the count for
>>> T4 can easily be made to be 2 by coding the algorithm correctly.
>>
>> Ahh, it is right. I was confused.
>>
>> Is it possible to introduce the logic mathematical-strictly?
>> Now I'm considering whether the find_all_inheritors() logic is suitable
>> for any cases, or not.
>
> I modified the logic to compute an expected inhcount of the child relations.
>
> What we want to compute here is to compute the number of times a certain
> relation being inherited directly from any relations delivered from a unique
> origin. If the column to be renamed is eventually inherited from a common
> origin, its attinhcount is not larger than the expected inhcount.
>
>>    WITH RECURSIVE r AS (
>>      SELECT 't1'::regclass AS inhrelid
>>    UNION ALL
>>      SELECT c.inhrelid FROM pg_inherits c, r WHERE r.inhrelid = c.inhparent
>>    )  -- r is all the child relations inherited from 't1'
>>    SELECT inhrelid::regclass, count(*) FROM pg_inherits
>>      WHERE inhparent IN (SELECT inhrelid FROM r) GROUP BY inhrelid;
>
> The modified logic increments the expected inhcount of the relation already
> walked on. If we found a child relation twice or more, it means the child
> relation is at the junction of the inheritance tree.
> In this case, we don't walk on the child relations any more, but it is not
> necessary, because the first path already checked it.
>
> The find_all_inheritors() is called from several points more than renameatt(),
> so I added find_all_inheritors_with_inhcount() which takes argument of the
> expected inhcount list. And, find_all_inheritors() was also modified to call
> find_all_inheritors_with_inhcount() with NULL argument to avoid code duplication.
>
> It is the result of Berrnd's test case.
>
> - CVS HEAD
>  0.895s
>  0.903s
>  0.884s
>  0.896s
>  0.892s
>
> - with V6 patch
>  0.972s
>  0.941s
>  0.961s
>  0.949s
>  0.946s

Well, that seems a lot better.  Unfortunately, I can't commit this
code: it's mind-bogglingly ugly.  I don't believe that duplicating
find_all_inheritors is the right solution (a point I've mentioned
previously), and it's certainly not right to use typeName->location as
a place to store an unrelated attribute inheritance count.  There is
also at least one superfluous variable renaming; and the recursion
handling looks pretty grotty to me, too.

I wonder if we should just leave this alone for 9.0 and revisit the
issue after doing some of the previously-proposed refactoring for 9.1.The amount of time we're spending trying to fix
thisis somewhat out 
of proportion to the importance of the problem.

...Robert


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Hot Standby: Relation-specific deferred conflict resolution
Next
From: Robert Haas
Date:
Subject: Re: Review: listagg aggregate