Re: Benchmarking partitioning triggers and rules - Mailing list pgsql-general

From	Jim Nasby
Subject	Re: Benchmarking partitioning triggers and rules
Date	April 13, 2015 03:21:33
Msg-id	552B0BF4.2050103@BlueTreble.com
In response to	Re: Benchmarking partitioning triggers and rules (Tim Uckun <timuckun@gmail.com>)
List	pgsql-general

On 4/8/15 4:58 PM, Tim Uckun wrote:
> So is there a third and even faster way of doing this?

Please don't top-post.

> On Wed, Apr 8, 2015 at 4:08 PM, Adrian Klaver
> <adrian.klaver@aklaver.com> wrote:
>
>     On 04/07/2015 07:49 PM, Tim Uckun wrote:
>
>         I understand that there is overhead involved in parsing the
>         strings and such.  The amount of overhead was surprising to me
>         but that's another matter.  What I am really puzzled about is
>         the difference between the statements
>
>         EXECUTE 'INSERT INTO ' || quote_ident(partition_name) ||
>             ' SELECT ($1).*' USING NEW;
>
>         and
>
>         EXECUTE 'INSERT INTO ' || quote_ident(partition_name) ||
>             ' VALUES (($1).*)' USING NEW;
>
>
>
>     Offhand I would say because in the first case you are doing a SELECT
>     and in the second you are just doing a substitution.
>
>
>
>         They both do string interpolation but one is significantly
>         faster than the other.   Is there a third and even faster way?
>
>         I am using RDS so I can't really do stored procs in C.

There are only 3 ways you can "steer" data into the correct partition:

1) Outside the database (i.e., the application itself knows which
partition table to use)
2) Rules
3) Triggers

#1 will almost always be faster (I say almost because if something has
to decide row-by-row it may be faster for the database to do it).

#2 and #3 depend on the workload. Rules are parsed once PER STATEMENT,
so if you're using a single INSERT to load a lot of rows they might well
be faster than triggers. But as mentioned, they're a real PITA to use
and they don't work at all with COPY.
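
To make the rule approach concrete, here is a minimal sketch for a single
partition. The table and column names (measurement, measurement_y2015m04,
logdate, reading) are made up for illustration and do not come from this
thread; a real setup needs one such rule per partition.

```sql
-- Hypothetical parent table and one child partition (names are assumptions).
CREATE TABLE measurement (
    logdate  date NOT NULL,
    reading  integer
);

CREATE TABLE measurement_y2015m04 (
    CHECK (logdate >= DATE '2015-04-01' AND logdate < DATE '2015-05-01')
) INHERITS (measurement);

-- One conditional rule per partition. The rule is applied when the
-- statement is rewritten, not once per row.
CREATE RULE measurement_insert_y2015m04 AS
    ON INSERT TO measurement
    WHERE (NEW.logdate >= DATE '2015-04-01' AND NEW.logdate < DATE '2015-05-01')
    DO INSTEAD
    INSERT INTO measurement_y2015m04 VALUES (NEW.*);

-- A single multi-row INSERT: the case where rules have the best chance
-- of beating a per-row trigger.
INSERT INTO measurement (logdate, reading)
SELECT DATE '2015-04-01' + (i % 30), i
FROM generate_series(1, 100000) AS i;
```

Rows that match no rule's WHERE clause fall through to the parent table,
and COPY into the parent bypasses the rules entirely.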

When it comes to triggers, you will get different performance depending
on the trigger language used as well as how you write the trigger
itself. Presumably a trigger function written in C will be faster than
anything else. I would expect plpgsql to be the next fastest after that,
but that's just a guess.
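
As one example of how the way you write the trigger matters, the sketch
below routes rows with a static INSERT instead of building a string for
EXECUTE. plpgsql can cache the plan for a static statement, while a
statement run through EXECUTE is planned on every call. All names here
(metrics, metrics_y2015m04, metrics_insert_trigger) are hypothetical, and
whether this is actually faster for your workload is something to measure.

```sql
-- Hypothetical tables; none of these names appear in the thread.
CREATE TABLE metrics (
    logdate  date NOT NULL,
    reading  integer
);

CREATE TABLE metrics_y2015m04 (
    CHECK (logdate >= DATE '2015-04-01' AND logdate < DATE '2015-05-01')
) INHERITS (metrics);

-- Static routing: the target table is hard-coded per branch, so there is
-- no string concatenation and no EXECUTE.
CREATE FUNCTION metrics_insert_trigger() RETURNS trigger AS $$
BEGIN
    IF NEW.logdate >= DATE '2015-04-01' AND NEW.logdate < DATE '2015-05-01' THEN
        INSERT INTO metrics_y2015m04 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'no partition for logdate %', NEW.logdate;
    END IF;
    RETURN NULL;  -- row already stored in the child; skip the parent insert
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER metrics_insert
    BEFORE INSERT ON metrics
    FOR EACH ROW EXECUTE PROCEDURE metrics_insert_trigger();
```

The trade-off is maintenance: the function has to be edited (or
regenerated) whenever a partition is added.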

As for the difference between SELECT and VALUES above, that's probably
due to a difference in parsing or in the number of transforms the NEW
has to go through in the executor. My guess is that because SELECT is
much more general purpose than VALUES, it's both more expensive to parse
and more expensive to execute. If you really want to know for
certain, connect gdb to a database backend on your laptop/desktop, fire
off an insert (to load plpgsql into backend memory), set a gdb
breakpoint on exec_stmt_block(), and see what's different between the
two use cases.
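
Short of gdb, a rough way to compare the two forms is to put each one in
its own trigger function and let EXPLAIN ANALYZE report the time spent in
the trigger. This is only a sketch: the bench tables and function names
are invented for the example, and the numbers will vary with version,
hardware, and row width.

```sql
-- Hypothetical parent and a single child to receive the rows.
CREATE TABLE bench (id integer, payload text);
CREATE TABLE bench_p1 () INHERITS (bench);

-- Variant 1: dynamic INSERT ... SELECT ($1).*
CREATE FUNCTION bench_route_select() RETURNS trigger AS $$
BEGIN
    EXECUTE 'INSERT INTO ' || quote_ident('bench_p1') ||
            ' SELECT ($1).*' USING NEW;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

-- Variant 2: dynamic INSERT ... VALUES (($1).*)
CREATE FUNCTION bench_route_values() RETURNS trigger AS $$
BEGIN
    EXECUTE 'INSERT INTO ' || quote_ident('bench_p1') ||
            ' VALUES (($1).*)' USING NEW;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

-- Run 1: the SELECT form. Look for the "Trigger bench_ins" line in the
-- EXPLAIN ANALYZE output.
CREATE TRIGGER bench_ins BEFORE INSERT ON bench
    FOR EACH ROW EXECUTE PROCEDURE bench_route_select();

EXPLAIN ANALYZE
INSERT INTO bench
SELECT i, 'x'::text FROM generate_series(1, 100000) AS i;

-- Run 2: swap in the VALUES form and repeat the same insert.
DROP TRIGGER bench_ins ON bench;
CREATE TRIGGER bench_ins BEFORE INSERT ON bench
    FOR EACH ROW EXECUTE PROCEDURE bench_route_values();

EXPLAIN ANALYZE
INSERT INTO bench
SELECT i, 'x'::text FROM generate_series(1, 100000) AS i;
```

Repeat each run a few times and compare the per-trigger times rather
than trusting a single measurement.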
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
