Thread: best performance for simple dml

best performance for simple dml

From
chester c young
Date:
what is the best performance / best practices for frequently-used simple dml, for example, an insert

1. fast-interface

2. prepared statement calling "insert ..." with binary parameters

3. prepared statement calling "myfunc(..." with binary parameters; myfunc takes its arguments and performs an insert using them

Re: best performance for simple dml

From
Pavel Stehule
Date:
Hello

try it and you will see. Depends on network speed, hw speed. But the most fast is using a COPY API

http://www.postgresql.org/docs/9.0/interactive/libpq-copy.html

Regards

Pavel Stehule


2011/6/27 chester c young <chestercyoung@yahoo.com>
what is the best performance / best practices for frequently-used simple dml, for example, an insert

1. fast-interface

2. prepared statement calling "insert ..." with binary parameters

3. prepared statement calling "myfunc(..." with binary parameters; myfunc takes its arguments and performs an insert using them


Re: best performance for simple dml

From
chester c young
Date:
two questions:

I thought copy was for multiple rows - is its setup cost effective for one row?

copy would also only be good for insert or select, not update - is this right?

--- On Mon, 6/27/11, Pavel Stehule <pavel.stehule@gmail.com> wrote:

From: Pavel Stehule <pavel.stehule@gmail.com>
Subject: Re: [SQL] best performance for simple dml
To: "chester c young" <chestercyoung@yahoo.com>
Cc: pgsql-sql@postgresql.org
Date: Monday, June 27, 2011, 12:35 AM

Hello

try it and you will see. Depends on network speed, hw speed. But the most fast is using a COPY API

http://www.postgresql.org/docs/9.0/interactive/libpq-copy.html

Regards

Pavel Stehule


2011/6/27 chester c young <chestercyoung@yahoo.com>
what is the best performance / best practices for frequently-used simple dml, for example, an insert

1. fast-interface

2. prepared statement calling "insert ..." with binary parameters

3. prepared statement calling "myfunc(..." with binary parameters; myfunc takes its arguments and performs an insert using them


Re: best performance for simple dml

From
Pavel Stehule
Date:
2011/6/27 chester c young <chestercyoung@yahoo.com>
>
> two questions:
> I thought copy was for multiple rows - is its setup cost effective for one row?

I expect it will be faster for one row too - it is not sql statement

if you want to understand to performance issues you have to understand to

a) network communication costs
b) SQL parsing and SQL planning costs
c) commits costs
d) other costs - triggers, referential integrity costs

>
> copy would also only be good for insert or select, not update - is this right?

sure,

If you need to call a lot of simple dml statement in cycle, then

a) try tu move it to stored function
b) if you can't to move it, then ensure, so statements will be
executed under outer transaction

slow code

for(i = 0; i < 1000; i++) exec("insert into foo values($1), itoa(i));

10x faster code

exec('begin');
for(i = 0; i < 1000; i++) exec("insert into foo values($1), itoa(i));
exec('commit');

Regards

Pavel Stehule

>
> --- On Mon, 6/27/11, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> From: Pavel Stehule <pavel.stehule@gmail.com>
> Subject: Re: [SQL] best performance for simple dml
> To: "chester c young" <chestercyoung@yahoo.com>
> Cc: pgsql-sql@postgresql.org
> Date: Monday, June 27, 2011, 12:35 AM
>
> Hello
>
> try it and you will see. Depends on network speed, hw speed. But the most fast is using a COPY API
>
> http://www.postgresql.org/docs/9.0/interactive/libpq-copy.html
>
> Regards
>
> Pavel Stehule
>
>
> 2011/6/27 chester c young <chestercyoung@yahoo.com>
>
> what is the best performance / best practices for frequently-used simple dml, for example, an insert
> 1. fast-interface
> 2. prepared statement calling "insert ..." with binary parameters
> 3. prepared statement calling "myfunc(..." with binary parameters; myfunc takes its arguments and performs an insert
usingthem
 
>


Re: best performance for simple dml

From
chester c young
Date:
<table border="0" cellpadding="0" cellspacing="0"><tr><td style="font: inherit;" valign="top">very nice pointers.
 thankyou very much!<br /><br />--- On <b>Mon, 6/27/11, Pavel Stehule <i><pavel.stehule@gmail.com></i></b>
wrote:<br/><blockquote style="border-left: 2px solid rgb(16, 16, 255); margin-left: 5px; padding-left: 5px;"><br
/>From:Pavel Stehule <pavel.stehule@gmail.com><br />Subject: Re: [SQL] best performance for simple dml<br />To:
"chesterc young" <chestercyoung@yahoo.com><br />Cc: pgsql-sql@postgresql.org<br />Date: Monday, June 27, 2011,
1:05AM<br /><br /><div class="plainMail">2011/6/27 chester c young <<a href="/mc/compose?to=chestercyoung@yahoo.com"
ymailto="mailto:chestercyoung@yahoo.com">chestercyoung@yahoo.com</a>><br/>><br />> two questions:<br />> I
thoughtcopy was for multiple rows - is its setup cost effective for one row?<br /><br />I expect it will be faster for
onerow too - it is not sql statement<br /><br />if you want to understand to performance issues you have to understand
to<br/><br />a) network communication costs<br />b) SQL parsing and SQL planning costs<br />c) commits costs<br />d)
othercosts - triggers, referential integrity costs<br /><br />><br />> copy would also only be good for insert or
select,not update - is this right?<br /><br />sure,<br /><br />If you need to call a lot of simple dml statement in
cycle,then<br /><br />a) try tu move it to stored function<br />b) if you can't to move it, then ensure, so statements
willbe<br />executed under outer transaction<br /><br />slow code<br /><br />for(i = 0; i < 1000; i++)<br /> 
exec("insertinto foo values($1), itoa(i));<br /><br />10x faster code<br /><br />exec('begin');<br />for(i = 0; i <
1000;i++)<br />  exec("insert into foo values($1), itoa(i));<br />exec('commit');<br /><br />Regards<br /><br />Pavel
Stehule<br/><br />><br />> --- On Mon, 6/27/11, Pavel Stehule <<a
href="/mc/compose?to=pavel.stehule@gmail.com"ymailto="mailto:pavel.stehule@gmail.com">pavel.stehule@gmail.com</a>>
wrote:<br/>><br />> From: Pavel Stehule <<a href="/mc/compose?to=pavel.stehule@gmail.com"
ymailto="mailto:pavel.stehule@gmail.com">pavel.stehule@gmail.com</a>><br/>> Subject: Re: [SQL] best performance
forsimple dml<br />> To: "chester c young" <<a href="/mc/compose?to=chestercyoung@yahoo.com"
ymailto="mailto:chestercyoung@yahoo.com">chestercyoung@yahoo.com</a>><br/>> Cc: <a
href="/mc/compose?to=pgsql-sql@postgresql.org"ymailto="mailto:pgsql-sql@postgresql.org">pgsql-sql@postgresql.org</a><br
/>>Date: Monday, June 27, 2011, 12:35 AM<br />><br />> Hello<br />><br />> try it and you will see.
Dependson network speed, hw speed. But the most fast is using a COPY API<br />><br />> <a
href="http://www.postgresql.org/docs/9.0/interactive/libpq-copy.html"
target="_blank">http://www.postgresql.org/docs/9.0/interactive/libpq-copy.html</a><br/>><br />> Regards<br
/>><br/>> Pavel Stehule<br />><br />><br />> 2011/6/27 chester c young <<a
href="/mc/compose?to=chestercyoung@yahoo.com"
ymailto="mailto:chestercyoung@yahoo.com">chestercyoung@yahoo.com</a>><br/>><br />> what is the best
performance/ best practices for frequently-used simple dml, for example, an insert<br />> 1. fast-interface<br
/>>2. prepared statement calling "insert ..." with binary parameters<br />> 3. prepared statement calling
"myfunc(..."with binary parameters; myfunc takes its arguments and performs an insert using them<br />><br
/></div></blockquote></td></tr></table>

Re: best performance for simple dml

From
Craig Ringer
Date:
On 27/06/11 15:05, Pavel Stehule wrote:

> exec('begin');
> for(i = 0; i < 1000; i++)
>   exec("insert into foo values($1), itoa(i));
> exec('commit');

You can probably also benefit from multi-valued INSERTs, though I
haven't verified this.

INSERT INTO foo VALUES
(1,'joe','dean'),
(4,'fred','bob'),
(11,'anne','smith');

There'll be a threshhold above which the COPY protocol becomes faster,
though.

--
Craig Ringer


Re: best performance for simple dml

From
chester c young
Date:
forgive me for brain storming a little re copy:

if there are a limited number of tables you're inserting, would there be anything wrong with the app opening a copy connection?  ie, a connection initiates the copy and then stays open like a pipe for any inserts coming through it.  visually it's a very cool paradigm, but is it actually a good idea?

--- On Mon, 6/27/11, Pavel Stehule <pavel.stehule@gmail.com> wrote:

From: Pavel Stehule <pavel.stehule@gmail.com>
Subject: Re: [SQL] best performance for simple dml
To: "chester c young" <chestercyoung@yahoo.com>
Cc: pgsql-sql@postgresql.org
Date: Monday, June 27, 2011, 1:05 AM

2011/6/27 chester c young <chestercyoung@yahoo.com>
>
> two questions:
> I thought copy was for multiple rows - is its setup cost effective for one row?

I expect it will be faster for one row too - it is not sql statement

if you want to understand to performance issues you have to understand to

a) network communication costs
b) SQL parsing and SQL planning costs
c) commits costs
d) other costs - triggers, referential integrity costs

>
> copy would also only be good for insert or select, not update - is this right?

sure,

If you need to call a lot of simple dml statement in cycle, then

a) try tu move it to stored function
b) if you can't to move it, then ensure, so statements will be
executed under outer transaction

slow code

for(i = 0; i < 1000; i++)
  exec("insert into foo values($1), itoa(i));

10x faster code

exec('begin');
for(i = 0; i < 1000; i++)
  exec("insert into foo values($1), itoa(i));
exec('commit');

Regards

Pavel Stehule

>
> --- On Mon, 6/27/11, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> From: Pavel Stehule <pavel.stehule@gmail.com>
> Subject: Re: [SQL] best performance for simple dml
> To: "chester c young" <chestercyoung@yahoo.com>
> Cc: pgsql-sql@postgresql.org
> Date: Monday, June 27, 2011, 12:35 AM
>
> Hello
>
> try it and you will see. Depends on network speed, hw speed. But the most fast is using a COPY API
>
> http://www.postgresql.org/docs/9.0/interactive/libpq-copy.html
>
> Regards
>
> Pavel Stehule
>
>
> 2011/6/27 chester c young <chestercyoung@yahoo.com>
>
> what is the best performance / best practices for frequently-used simple dml, for example, an insert
> 1. fast-interface
> 2. prepared statement calling "insert ..." with binary parameters
> 3. prepared statement calling "myfunc(..." with binary parameters; myfunc takes its arguments and performs an insert using them
>

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql

Re: best performance for simple dml

From
Pavel Stehule
Date:
Hello

2011/6/27 chester c young <chestercyoung@yahoo.com>
forgive me for brain storming a little re copy:

if there are a limited number of tables you're inserting, would there be anything wrong with the app opening a copy connection?  ie, a connection initiates the copy and then stays open like a pipe for any inserts coming through it.  visually it's a very cool paradigm, but is it actually a good idea?

depends on application. Usually you can use a connection better than just "insert connection". I am thinking, so it doesn't carry some special - it remove a connection cost, but nothing more. You can use a more connections to do paralel inserts - it has a sense.

look on pgpool or other similar sw for connection pooling

Pavel


 

--- On Mon, 6/27/11, Pavel Stehule <pavel.stehule@gmail.com> wrote:

From: Pavel Stehule <pavel.stehule@gmail.com>
Subject: Re: [SQL] best performance for simple dml
To: "chester c young" <chestercyoung@yahoo.com>
Cc: pgsql-sql@postgresql.org
Date: Monday, June 27, 2011, 1:05 AM

2011/6/27 chester c young <chestercyoung@yahoo.com>
>
> two questions:
> I thought copy was for multiple rows - is its setup cost effective for one row?

I expect it will be faster for one row too - it is not sql statement

if you want to understand to performance issues you have to understand to

a) network communication costs
b) SQL parsing and SQL planning costs
c) commits costs
d) other costs - triggers, referential integrity costs

>
> copy would also only be good for insert or select, not update - is this right?

sure,

If you need to call a lot of simple dml statement in cycle, then

a) try tu move it to stored function
b) if you can't to move it, then ensure, so statements will be
executed under outer transaction

slow code

for(i = 0; i < 1000; i++)
  exec("insert into foo values($1), itoa(i));

10x faster code

exec('begin');
for(i = 0; i < 1000; i++)
  exec("insert into foo values($1), itoa(i));
exec('commit');

Regards

Pavel Stehule

>
> --- On Mon, 6/27/11, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> From: Pavel Stehule <pavel.stehule@gmail.com>
> Subject: Re: [SQL] best performance for simple dml
> To: "chester c young" <chestercyoung@yahoo.com>
> Cc: pgsql-sql@postgresql.org
> Date: Monday, June 27, 2011, 12:35 AM
>
> Hello
>
> try it and you will see. Depends on network speed, hw speed. But the most fast is using a COPY API
>
> http://www.postgresql.org/docs/9.0/interactive/libpq-copy.html
>
> Regards
>
> Pavel Stehule
>
>
> 2011/6/27 chester c young <chestercyoung@yahoo.com>
>
> what is the best performance / best practices for frequently-used simple dml, for example, an insert
> 1. fast-interface
> 2. prepared statement calling "insert ..." with binary parameters
> 3. prepared statement calling "myfunc(..." with binary parameters; myfunc takes its arguments and performs an insert using them
>

--
Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-sql