Re: non-bulk inserts and tuple routing - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: non-bulk inserts and tuple routing
Date
Msg-id 5A783549.4020409@lab.ntt.co.jp
Whole thread Raw
In response to Re: non-bulk inserts and tuple routing  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses Re: non-bulk inserts and tuple routing
Re: non-bulk inserts and tuple routing
List pgsql-hackers
(2018/02/05 14:34), Amit Langote wrote:
> On 2018/02/02 19:56, Etsuro Fujita wrote:
>> * ExecInitPartitionResultRelInfo is called from ExecFindPartition, but we
>> could call that another way; in ExecInsert/CopyFrom we call that after
>> ExecFindPartition if the partition chosen by ExecFindPartition has not
>> been initialized yet.  Maybe either would be OK, but I like that because I
>> think that would not only better divide that labor but better fit into the
>> existing code in ExecInsert/CopyFrom IMO.
>
> I see no problem with that, so done that way.

Thanks.

>> * In ExecInitPartitionResultRelInfo:
>> +       /*
>> +        * Note that the entries in this list appear in no predetermined
>> +        * order as result of initializing partition result rels as and when
>> +        * they're needed.
>> +        */
>> +       estate->es_leaf_result_relations =
>> + lappend(estate->es_leaf_result_relations,
>> +                                           leaf_part_rri);
>>
>> Is it OK to put this within the "if (leaf_part_rri == NULL)" block?
>
> Good catch.  I moved it outside the block.  I was under the impression
> that leaf result relations that were reused from the
> mtstate->resultRelInfo arrary would have already been added to the list,
> but it seems they are not.

I commented this because the update-tuple-routing patch has added to the 
list ResultRelInfos reused from the mtstate->resultRelInfo arrary, but 
on reflection I noticed this would cause oddity in reporting execution 
stats for partitions' triggers in EXPLAIN ANALYZE.  Here is an example 
using the head:

postgres=# create table trigger_test (a int, b text) partition by list (a);
CREATE TABLE
postgres=# create table trigger_test1 partition of trigger_test for 
values in (1);
CREATE TABLE
postgres=# create table trigger_test2 partition of trigger_test for 
values in (2);
CREATE TABLE
postgres=# create trigger before_upd_row_trig before update on 
trigger_test1 for
  each row execute procedure trigger_data (23, 'skidoo');
CREATE TRIGGER
postgres=# create trigger before_del_row_trig before delete on 
trigger_test1 for
  each row execute procedure trigger_data (23, 'skidoo');
CREATE TRIGGER
postgres=# insert into trigger_test values (1, 'foo');
INSERT 0 1
postgres=# explain analyze update trigger_test set a = 2 where a = 1;
NOTICE:  before_upd_row_trig(23, skidoo) BEFORE ROW UPDATE ON trigger_test1
NOTICE:  OLD: (1,foo),NEW: (2,foo)
NOTICE:  before_del_row_trig(23, skidoo) BEFORE ROW DELETE ON trigger_test1
NOTICE:  OLD: (1,foo)
                                                   QUERY PLAN

--------------------------------------------------------------------------------
-------------------------------
  Update on trigger_test  (cost=0.00..25.88 rows=6 width=42) (actual 
time=2.337..
2.337 rows=0 loops=1)
    Update on trigger_test1
    ->  Seq Scan on trigger_test1  (cost=0.00..25.88 rows=6 width=42) 
(actual tim
e=0.009..0.011 rows=1 loops=1)
          Filter: (a = 1)
  Planning time: 0.186 ms
  Trigger before_del_row_trig on trigger_test1: time=0.495 calls=1
  Trigger before_upd_row_trig on trigger_test1: time=0.870 calls=1
  Trigger before_del_row_trig on trigger_test1: time=0.495 calls=1
  Trigger before_upd_row_trig on trigger_test1: time=0.870 calls=1
  Execution time: 2.396 ms
(10 rows)

Both trigger stats for the on-update and on-delete triggers are doubly 
shown in the above output.  The reason would be that 
ExplainPrintTriggers called report_triggers twice for trigger_test1's 
ResultRelInfo: once for it from queryDesc->estate->es_result_relations 
and once for it from queryDesc->estate->es_leaf_result_relations.  I 
don't think this is intended behavior, so I think we should fix this.  I 
think we could probably address this by modifying ExecInitPartitionInfo 
in your patch so that it doesn't add to the es_leaf_result_relations 
list ResultRelInfos reused from the mtstate->resultRelInfo arrary, as 
your previous version of the patch.  (ExecGetTriggerResultRel looks at 
the list too, but it would probably work well for this change.)  It 
might be better to address this in another patch, though.

>> * In the same function:
>> +   /*
>> +    * Verify result relation is a valid target for INSERT.
>> +    */
>> +   CheckValidResultRel(leaf_part_rri, CMD_INSERT);
>>
>> I think it would be better to leave the previous comments as-is here:
>>
>>          /*
>>           * Verify result relation is a valid target for an INSERT.  An UPDATE
>>           * of a partition-key becomes a DELETE+INSERT operation, so this
>> check
>>           * is still required when the operation is CMD_UPDATE.
>>           */
>
> Oops, my bad.  Didn't notice that I had ended up removing the part about
> UPDATE.

OK.

>> * ExecInitPartitionResultRelInfo does the work other than the
>> initialization of ResultRelInfo for the chosen partition (eg, create a
>> tuple conversion map to convert a tuple routed to the partition from the
>> parent's type to the partition's).  So I'd propose to rename that function
>> to eg, ExecInitPartition.
>
> I went with ExevInitPartitionInfo.

Fine with me.

>> * CopyFrom is modified so that it calls ExecSetupPartitionTupleRouting and
>> ExecFindPartition with a mostly-dummy ModifyTableState node.  I'm not sure
>> that is a good idea.  My concern about that is that might be something
>> like a headache in future development.
>
> OK, I removed those changes.

Thanks.

>> * The patch 0001 and 0002 are pretty small but can't be reviewed without
>> the patch 0003.  The total size of the three patches aren't that large, so
>> I think it would be better to put those patches together into a single patch.
>
> As I said above, I got rid of 0001.  Then, I merged the
> ExecFindPartition() refactoring patch 0002 into 0003.
>
> The code in tupconv_map_for_subplan() currently assumes that it can rely
> on all leaf partitions having been initialized.  Since we're breaking that
> assumption with this proposal, that needed to be changed.  So the patch
> contained some refactoring to make it not rely on that assumption.  I
> carved that out into a separate patch which can be applied and tested
> before the main patch.

OK, will review that patch separately.

> Here is the updated version that contains two patches as described above.

Thanks for updating the patches!  I'll post my next comments in a few days.

Best regards,
Etsuro Fujita


pgsql-hackers by date:

Previous
From: Pierre Ducroquet
Date:
Subject: Re: JIT compiling with LLVM v9.1
Next
From: Metin Doslu
Date:
Subject: Add PGDLLIMPORT to enable_hashagg