Home > mailing lists

non-bulk inserts and tuple routing - Mailing list pgsql-hackers

From	Amit Langote
Subject	non-bulk inserts and tuple routing
Date	December 19, 2017 13:06:16
Msg-id	8975331d-d961-cbdd-f862-fdd3d97dc2d0@lab.ntt.co.jp Whole thread Raw
Responses	Re: non-bulk inserts and tuple routing Re: non-bulk inserts and tuple routing
List	pgsql-hackers

Tree view

Hi.

I have a patch that rearranges the code around partition tuple-routing,
such that allocation of per-partition objects (ResultRelInfo,
TupleConversionMap, etc.) is delayed until a given partition is actually
inserted into (i.e., a tuple is routed to it).  I can see good win for
non-bulk inserts with the patch and the patch is implemented such that it
doesn't affect the bulk-insert case much.

Performance numbers:

* Uses following hash-partitioned table:

create table t1 (a int, b int) partition by hash (a);
create table t1_x partition of t1 for values with (modulus M, remainder R)
...


* Non-bulk insert uses the following code (insert 100,000 rows one-by-one):

do $$
begin
  for i in 1..100000 loop
    insert into t1 values (i, i+1);
  end loop;
end; $$;

* Times in milliseconds:

#parts           HEAD        Patched

     8       6216.300       4977.670
    16       9061.388       6360.093
    32      14081.656       8752.405
    64      24887.110      13919.384
   128      45926.251      24582.411
   256      88088.084      45490.894

As you can see the performance can be as much as 2x faster with the patch,
although time taken still increases as the number of partitions increases,
because we still lock *all* partitions at the beginning.

* Bulk-inserting 100,000 rows using COPY:

copy t1 from '/tmp/t1.csv' csv;

* Times in milliseconds:

#parts           HEAD        Patched

     8        458.301        450.875
    16        409.271        510.723
    32        500.960        612.003
    64        430.687        795.046
   128        449.314        565.786
   256        493.171        490.187

Not much harm here, although numbers are a bit noisy.

Patch is divided into 4, first 3 of which are refactoring patches.

I know this patch will conflict severely with [1] and [2], so it's fine if
we consider applying these later.  Will add this to next CF.

Thanks,
Amit

[1] https://commitfest.postgresql.org/16/1023/

[2] https://commitfest.postgresql.org/16/1184/

Attachment

pgsql-hackers by date:

From: Amit Kapila
Date: 19 December 2017, 13:01:05
Subject: Re: [HACKERS] parallel.c oblivion of worker-startup failures

From: Ashutosh Bapat
Date: 19 December 2017, 13:12:10
Subject: Re: non-bulk inserts and tuple routing

non-bulk inserts and tuple routing - Mailing list pgsql-hackers

Attachment

Previous

Next