Re: Oddity in tuple routing for foreign partitions - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: Oddity in tuple routing for foreign partitions
Date
Msg-id 5AE18F9C.6080805@lab.ntt.co.jp
Whole thread Raw
In response to Re: Oddity in tuple routing for foreign partitions  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses Re: Oddity in tuple routing for foreign partitions
List pgsql-hackers
(2018/04/26 15:35), Amit Langote wrote:
> On 2018/04/26 12:43, Etsuro Fujita wrote:
>> (2018/04/25 17:29), Amit Langote wrote:
>>> Hmm, I missed that we do need information from a proper RTE as well.  So,
>>> I suppose we should be doing this instead of creating the RTE for foreign
>>> partition from scratch.
>>>
>>> +    rte = list_nth(estate->es_range_table, resultRelation - 1);
>>> +    rte = copyObject(rte);
>>> +    rte->relid = RelationGetRelid(rel);
>>> +    rte->relkind = RELKIND_FOREIGN_TABLE;
>>
>> As I said upthread, we can use the RTE in the range table as-is if the
>> foreign table is one of the UPDATE subplan partitions or the target
>> specified in a COPY command.  So, I created the patch that way because
>> that would save cycles.  Why not just use that RTE in those cases?
>
> Yes, I agree it's better to reuse the RTE in the cases we can.  So, how
> does this look:
>
> +    /*
> +     * If the foreign table is a partition, we need to create a new RTE
> +     * describing the foreign table for use by deparseInsertSql and
> +     * create_foreign_modify() below, after first copying the parent's
> +     * RTE and modifying some fields to describe the foreign partition to
> +     * work on. However, if this is invoked by UPDATE, the existing RTE
> +     * may already correspond to this partition if it is one of the
> +     * UPDATE subplan target rels; in that case, we can just use the
> +     * existing RTE as-is.
> +     */
> +    rte = list_nth(estate->es_range_table, resultRelation - 1);
> +    if (rte->relid != RelationGetRelid(rel))
> +    {
> +        rte = copyObject(rte);
> +        rte->relid = RelationGetRelid(rel);
> +        rte->relkind = RELKIND_FOREIGN_TABLE;
> +
> +        /*
> +         * For UPDATE, we must use the RT index of the first subplan
> +         * target rel's RTE, because the core code would have built
> +         * expressions for the partition, such as RETURNING, using that
> +         * RT index as varno of Vars contained in those expressions.
> +         */
> +        if (plan&&  plan->operation == CMD_UPDATE&&
> +            resultRelation == plan->nominalRelation)
> +            resultRelation = mtstate->resultRelInfo[0].ri_RangeTableIndex;
> +    }

Seems like a better change than mine; because this simplifies the logic.

> I added the block inside the if because I agree with your comment below
> that the changes to ExecInitPartitionInfo are better kept for later to
> think more carefully about the dependencies it might break.

OK

>>> If we apply the other patch I proposed, resultRelation always points to
>>> the correct RTE.
>>>
>>>>> I tried to do that.  So, attached are two patches.
>>>>>
>>>>> 1. Fix for ExecInitPartitionInfo to use the right RT index to pass to
>>>>>      InitResultRelInfo
>>>>>
>>>>> 2. v5 of the patch to fix the bug of foreign partitions
>>>>>
>>>>> Thoughts?
>>
>> Actually, I also thought the former when creating the patch, but I left
>> that as-is because I'm not sure that would be really safe; ExecConstraints
>> looks at the RT index via GetInsertedColumns/GetUpdatedColumns, so what I
>> thought was: that might cause unexpected behavior.
>
> OK, I have to agree here that we better leave 1 to be looked at later.
>
> After this change, GetInsertedColumns/GetUpdatedColumns will start
> returning a different set of columns in some cases than it does now.
> Currently, it *always* returns a set containing root table's attribute
> numbers, even for UPDATE.  But with this change, for UPDATE, it will start
> returning the set containing the first subplan target rel's attribute
> numbers, which could be any partition with arbitrarily different attribute
> numbers.
>
>> Anyway, I think that
>> the former is more like an improvement rather than a fix, so it would be
>> better to leave that for another patch for PG12?
>
> I agree, so I'm dropping the patch for 1.

OK, let's focus on #2!

> See attached an updated version with changes as described above.

Looks good to me.  Thanks for the updated version!

Best regards,
Etsuro Fujita


pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: Oddity in tuple routing for foreign partitions
Next
From: Vladimir Svedov
Date:
Subject: Re: Racing DEADLOCK on PostgreSQL 9.3