Re: Performance on Bulk Insert to Partitioned Table - Mailing list pgsql-performance

From Charles Gomes
Subject Re: Performance on Bulk Insert to Partitioned Table
Date
Msg-id BLU002-W47FA8B09A7E4DB3918AD6AAB3B0@phx.gbl
In response to Re: Performance on Bulk Insert to Partitioned Table  (Evgeny Shishkin <itparanoia@gmail.com>)
Responses Re: Performance on Bulk Insert to Partitioned Table
List pgsql-performance
I've just found this:
From:
http://archives.postgresql.org/pgsql-hackers/2008-12/msg01221.php

"initial tests to insert 140k rows are as follows:

- direct inserts in a child table: 2 seconds

- pl/pgsql trigger (IF ... ELSE IF ... blocks): 14.5 seconds.

- C trigger: 4 seconds (actually the overhead is in the constraint check)

"

This is from 2008, and it looks like those folks were already having performance issues with partitions at that time.

Going to copy some folks from the old thread; hopefully, 4 years later, they may have found a solution. Maybe they've
moved on to something more exciting, or maybe he is in another world where we don't have database servers. Special thanks
to the brave Emmanuel for posting his trigger code, which I will hack into my own :P
Thanks Emmanuel.






----------------------------------------
> Subject: Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
> From: itparanoia@gmail.com
> Date: Mon, 24 Dec 2012 21:11:07 +0400
> CC: jeff.janes@gmail.com; ondrej.ivanic@gmail.com; pgsql-performance@postgresql.org
> To: charlesrg@outlook.com
>
>
> On Dec 24, 2012, at 9:07 PM, Charles Gomes <charlesrg@outlook.com> wrote:
>
> > By the way, I've just rewritten the code to target the partitions individually, and I've got almost a 4x improvement.
> > Shouldn't the trigger be faster to process? I would understand if there was no CPU left, but there is lots of cpu to chew.
>
> I saw your 20% idle cpu and it raised my eyebrows.
>
> > It seems that there will be no other way to speedup unless the insert code is partition aware.
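[Editorial sketch: the partition-aware insert path described above — targeting each child table directly instead of routing every row through a trigger — could look roughly like this. This is a hypothetical Python illustration, not code from the thread; it assumes the quotes_YYYY_MM_DD child-table naming and the received_time partition key used throughout this discussion.]

```python
from collections import defaultdict
from datetime import datetime

def batch_by_partition(rows):
    """Group rows by their target child table so each partition receives
    one bulk INSERT (or COPY) instead of firing a trigger per row."""
    batches = defaultdict(list)
    for row in rows:
        # received_time is the partition key used in this thread;
        # the child tables follow the quotes_YYYY_MM_DD convention.
        table = "quotes_" + row["received_time"].strftime("%Y_%m_%d")
        batches[table].append(row)
    return batches

rows = [
    {"received_time": datetime(2012, 9, 10, 9, 30), "symbol": "AAA"},
    {"received_time": datetime(2012, 9, 10, 9, 31), "symbol": "BBB"},
    {"received_time": datetime(2012, 9, 11, 9, 30), "symbol": "CCC"},
]
batches = batch_by_partition(rows)
# batches now maps quotes_2012_09_10 to two rows and quotes_2012_09_11 to one
```

Each batch can then be sent with a multi-row INSERT or COPY to its child table, which is essentially what produced the 4x improvement reported above.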
> >
> > ----------------------------------------
> >> From: charlesrg@outlook.com
> >> To: jeff.janes@gmail.com
> >> CC: ondrej.ivanic@gmail.com; pgsql-performance@postgresql.org
> >> Subject: Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
> >> Date: Mon, 24 Dec 2012 10:51:12 -0500
> >>
> >> ________________________________
> >>> Date: Sun, 23 Dec 2012 14:55:16 -0800
> >>> Subject: Re: [PERFORM] Performance on Bulk Insert to Partitioned Table
> >>> From: jeff.janes@gmail.com
> >>> To: charlesrg@outlook.com
> >>> CC: ondrej.ivanic@gmail.com; pgsql-performance@postgresql.org
> >>>
> >>> On Thursday, December 20, 2012, Charles Gomes wrote:
> >>> True, that's the same I feel, I will be looking to translate the
> >>> trigger to C if I can find good examples, that should accelerate.
> >>>
> >>> I think your performance bottleneck is almost certainly the dynamic
> >>> SQL. Using C to generate that dynamic SQL isn't going to help much,
> >>> because it is still the SQL engine that has to parse, plan, and execute
> >>> it.
> >>>
> >>> Are the vast majority of your inserts done on any given day for records
> >>> from that same day or the one before; or are they evenly spread over
> >>> the preceding year? If the former, you could use static SQL in IF and
> >>> ELSIF for those days, and fall back on the dynamic SQL for the
> >>> exceptions in the ELSE block. Of course that means you have to update
> >>> the trigger every day.
> >>>
> >>>
> >>> Using rules would be totally bad as I'm partitioning daily and after
> >>> one year having 365 lines of IF won't be fun to maintain.
> >>>
> >>> Maintaining 365 lines of IF is what Perl was invented for. That goes
> >>> for triggers w/ static SQL as well as for rules.
> >>>
> >>> If you do the static SQL in a trigger and the dates of the records are
> >>> evenly scattered over the preceding year, make sure your IFs are nested
> >>> like a binary search, not a linear search. And if they are mostly for
> >>> "today's" date, then make sure you search backwards.
> >>>
> >>> Cheers,
> >>>
> >>> Jeff
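[Editorial sketch: Jeff's two suggestions — generate the repetitive trigger with a script, and nest the IFs like a binary search rather than a linear scan — could be combined in a small generator along these lines. This is a hypothetical illustration (Python rather than Perl), assuming daily partitions named quotes_YYYY_MM_DD keyed on received_time.]

```python
from datetime import date, timedelta

def emit_dispatch(days, indent="    "):
    """Recursively emit nested IF/ELSE branches that binary-search the
    sorted day list, so routing a row costs O(log n) comparisons, not O(n)."""
    if len(days) == 1:
        name = "quotes_" + days[0].strftime("%Y_%m_%d")
        return "%sINSERT INTO %s VALUES (NEW.*);" % (indent, name)
    mid = len(days) // 2
    # Rows before midnight of days[mid] belong to the left half.
    return ("%sIF NEW.received_time < DATE '%s' THEN\n%s\n%sELSE\n%s\n%sEND IF;"
            % (indent, days[mid].isoformat(),
               emit_dispatch(days[:mid], indent + "    "), indent,
               emit_dispatch(days[mid:], indent + "    "), indent))

def emit_trigger(first_day, n_days):
    """Emit the full trigger function for a window of n_days daily partitions."""
    days = [first_day + timedelta(days=i) for i in range(n_days)]
    return ("CREATE OR REPLACE FUNCTION quotes_insert_trigger()\n"
            "RETURNS trigger AS $$\n"
            "BEGIN\n"
            + emit_dispatch(days) + "\n"
            "    RETURN NULL;\n"
            "END;\n"
            "$$ LANGUAGE plpgsql;")

sql = emit_trigger(date(2012, 9, 10), 14)
```

Rerunning the generator daily (e.g. from cron) and reinstalling the function keeps the window of static branches current, matching Jeff's point that the trigger has to be updated every day.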
> >>
> >> Jeff, I've changed the code from dynamic to:
> >>
> >> CREATE OR REPLACE FUNCTION quotes_insert_trigger()
> >> RETURNS trigger AS $$
> >> DECLARE
> >>   r_date text;
> >> BEGIN
> >>   r_date := to_char(NEW.received_time, 'YYYY_MM_DD');
> >>   -- Static SQL per partition. Note: "USING new" is only valid with
> >>   -- EXECUTE, so it has been dropped; a CASE with no matching WHEN and
> >>   -- no ELSE raises CASE_NOT_FOUND, so out-of-range dates fail loudly.
> >>   CASE r_date
> >>     WHEN '2012_09_10' THEN INSERT INTO quotes_2012_09_10 VALUES (NEW.*);
> >>     WHEN '2012_09_11' THEN INSERT INTO quotes_2012_09_11 VALUES (NEW.*);
> >>     WHEN '2012_09_12' THEN INSERT INTO quotes_2012_09_12 VALUES (NEW.*);
> >>     WHEN '2012_09_13' THEN INSERT INTO quotes_2012_09_13 VALUES (NEW.*);
> >>     WHEN '2012_09_14' THEN INSERT INTO quotes_2012_09_14 VALUES (NEW.*);
> >>     WHEN '2012_09_15' THEN INSERT INTO quotes_2012_09_15 VALUES (NEW.*);
> >>     WHEN '2012_09_16' THEN INSERT INTO quotes_2012_09_16 VALUES (NEW.*);
> >>     WHEN '2012_09_17' THEN INSERT INTO quotes_2012_09_17 VALUES (NEW.*);
> >>     WHEN '2012_09_18' THEN INSERT INTO quotes_2012_09_18 VALUES (NEW.*);
> >>     WHEN '2012_09_19' THEN INSERT INTO quotes_2012_09_19 VALUES (NEW.*);
> >>     WHEN '2012_09_20' THEN INSERT INTO quotes_2012_09_20 VALUES (NEW.*);
> >>     WHEN '2012_09_21' THEN INSERT INTO quotes_2012_09_21 VALUES (NEW.*);
> >>     WHEN '2012_09_22' THEN INSERT INTO quotes_2012_09_22 VALUES (NEW.*);
> >>     WHEN '2012_09_23' THEN INSERT INTO quotes_2012_09_23 VALUES (NEW.*);
> >>     WHEN '2012_09_24' THEN INSERT INTO quotes_2012_09_24 VALUES (NEW.*);
> >>   END CASE;
> >>   -- Returning NULL keeps the row out of the parent table.
> >>   RETURN NULL;
> >> END;
> >> $$
> >> LANGUAGE plpgsql;
> >>
> >>
> >> However I've got no speed improvement.
> >> I need to keep two weeks worth of partitions at a time, that's why all the WHEN statements.
> >> Wish postgres could automate the partition process natively, like other SQL databases do.
> >>
> >> Thank you guys for your help.
> >>
> >> --
> >> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
> >> To make changes to your subscription:
> >> http://www.postgresql.org/mailpref/pgsql-performance
> >
>
>
>
