Re: [BUG?] strange behavior in ALTER TABLE ... RENAME TO on inherited columns - Mailing list pgsql-hackers

From KaiGai Kohei
Subject Re: [BUG?] strange behavior in ALTER TABLE ... RENAME TO on inherited columns
Date
Msg-id 4B6644D5.9080009@ak.jp.nec.com
Whole thread Raw
In response to Re: [BUG?] strange behavior in ALTER TABLE ... RENAME TO on inherited columns  (KaiGai Kohei <kaigai@ak.jp.nec.com>)
Responses Re: [BUG?] strange behavior in ALTER TABLE ... RENAME TO on inherited columns
List pgsql-hackers
(2010/02/01 8:41), KaiGai Kohei wrote:
> (2010/01/30 3:36), Robert Haas wrote:
>> 2010/1/28 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>>> (2010/01/29 9:58), KaiGai Kohei wrote:
>>>> (2010/01/29 9:29), Robert Haas wrote:
>>>>> 2010/1/28 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>>>>>> (2010/01/29 0:46), Robert Haas wrote:
>>>>>>> 2010/1/27 KaiGai Kohei<kaigai@ak.jp.nec.com>:
>>>>>>>> Hmm, indeed, this logic (V3/V5) is busted.
>>>>>>>>
>>>>>>>> The idea of V4 patch can also handle this case correctly, although it
>>>>>>>> is lesser in performance.
>>>>>>>> I wonder whether it is really unacceptable cost in performance, or not.
>>>>>>>> Basically, I assume ALTER TABLE RENAME/TYPE is not frequent operations,
>>>>>>>> and I don't think this bugfix will damage to the reputation of PostgreSQL.
>>>>>>>>
>>>>>>>> Where should we go on the next?
>>>>>>>
>>>>>>> Isn't the problem here just that the following comment is 100% wrong?
>>>>>>>
>>>>>>>                      /*
>>>>>>>                       * Unlike find_all_inheritors(), we need to walk on
>>>>>>> child relations
>>>>>>>                       * that have diamond inheritance tree, because this
>>>>>>> function has to
>>>>>>>                       * return correct expected inhecount to the caller.
>>>>>>>                       */
>>>>>>>
>>>>>>> It seems to me that the right solution here is to just add one more
>>>>>>> argument to find_all_inheritors(), something like List
>>>>>>> **expected_inh_count.
>>>>>>>
>>>>>>> Am I missing something?
>>>>>>
>>>>>> The find_all_inheritors() does not walk on child relations more than
>>>>>> two times, even if a child has multiple parents inherited from common
>>>>>> origin, because list_concat_unique_oid() ignores the given OID if it
>>>>>> is already on the list. It means all the child relations under the
>>>>>> relation already walked on does not checked anywhere. (Of course,
>>>>>> this assumption is correct for the purpose of find_all_inheritors()
>>>>>> with minimum cost.)
>>>>>>
>>>>>> What we want to do here is to compute the number of times a certain
>>>>>> child relation is inherited from a common origin; it shall be the
>>>>>> expected-inhcount. So, we need an arrangement to the logic.
>>>>>>
>>>>>> For example, see the following diagram.
>>>>>>
>>>>>>       T2
>>>>>>      /  \
>>>>>> T1    T4---T5
>>>>>>      \  /
>>>>>>       T3
>>>>>>
>>>>>> If we call find_all_inheritors() with T1. The find_inheritance_children()
>>>>>> returns T2 and T3 for T1.
>>>>>> Then, it calls find_inheritance_children() for T2, and it returns T4.
>>>>>> Then, it calls find_inheritance_children() for T3, and it returns T4, but
>>>>>> it is already in the "rels_list", so list_concat_unique_oid() ignores it.
>>>>>> Then, it calls find_inheritance_children() for T4, and it returns T5.
>>>>>>
>>>>>> In this example, we want the expected inhcount for T2 and T3 should be 1,
>>>>>> for T4 and T5 should be 2. However, it walks on T4 and T5 only once, so
>>>>>> they will have 1 incorrectly.
>>>>>> Even if we count up the ignored OID (T4), find_all_inheritors() does not
>>>>>> walk on T5, because it is already walked on obviously when T4 is ignored.
>>>>>
>>>>> I think the count for T5 should be 1, and I think that the count for
>>>>> T4 can easily be made to be 2 by coding the algorithm correctly.
>>>>
>>>> Ahh, it is right. I was confused.
>>>>
>>>> Is it possible to introduce the logic mathematical-strictly?
>>>> Now I'm considering whether the find_all_inheritors() logic is suitable
>>>> for any cases, or not.
>>>
>>> I modified the logic to compute an expected inhcount of the child relations.
>>>
>>> What we want to compute here is to compute the number of times a certain
>>> relation being inherited directly from any relations delivered from a unique
>>> origin. If the column to be renamed is eventually inherited from a common
>>> origin, its attinhcount is not larger than the expected inhcount.
>>>
>>>>      WITH RECURSIVE r AS (
>>>>        SELECT 't1'::regclass AS inhrelid
>>>>      UNION ALL
>>>>        SELECT c.inhrelid FROM pg_inherits c, r WHERE r.inhrelid = c.inhparent
>>>>      )  -- r is all the child relations inherited from 't1'
>>>>      SELECT inhrelid::regclass, count(*) FROM pg_inherits
>>>>        WHERE inhparent IN (SELECT inhrelid FROM r) GROUP BY inhrelid;
>>>
>>> The modified logic increments the expected inhcount of the relation already
>>> walked on. If we found a child relation twice or more, it means the child
>>> relation is at the junction of the inheritance tree.
>>> In this case, we don't walk on the child relations any more, but it is not
>>> necessary, because the first path already checked it.
>>>
>>> The find_all_inheritors() is called from several points more than renameatt(),
>>> so I added find_all_inheritors_with_inhcount() which takes argument of the
>>> expected inhcount list. And, find_all_inheritors() was also modified to call
>>> find_all_inheritors_with_inhcount() with NULL argument to avoid code duplication.
>>>
>>> It is the result of Berrnd's test case.
>>>
>>> - CVS HEAD
>>>    0.895s
>>>    0.903s
>>>    0.884s
>>>    0.896s
>>>    0.892s
>>>
>>> - with V6 patch
>>>    0.972s
>>>    0.941s
>>>    0.961s
>>>    0.949s
>>>    0.946s
>>
>> Well, that seems a lot better.  Unfortunately, I can't commit this
>> code: it's mind-bogglingly ugly.  I don't believe that duplicating
>> find_all_inheritors is the right solution (a point I've mentioned
>> previously), and it's certainly not right to use typeName->location as
>> a place to store an unrelated attribute inheritance count.  There is
>> also at least one superfluous variable renaming; and the recursion
>> handling looks pretty grotty to me, too.
>>
>> I wonder if we should just leave this alone for 9.0 and revisit the
>> issue after doing some of the previously-proposed refactoring for 9.1.
>>    The amount of time we're spending trying to fix this is somewhat out
>> of proportion to the importance of the problem.
>
> At least, I think we can fix the bug on renameatt() case in clean way,
> although we need an additional recursion handling in ATPrepAlterColumnType()
> to fix ALTER COLUMN TYPE cases.
>
> Should it focus on only the original renameatt() matter in this stage?

The attached patch modified find_all_inheritors() to return the list of
expected inhcount, if List * pointer is given. And, it focuses on only
the bugs in renameatt() case.

Also, the ALTER COLUMN TYPE case should be also fixed in the 9.1 release
(or 9.0.1?).

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai@ak.jp.nec.com>

Attachment

pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: HS/SR and smart shutdown
Next
From: Tom Lane
Date:
Subject: Re: Pathological regexp match