Thread: dblink bulk operations
Last night I needed to move a bunch of data from an OLTP database to an archive database, and used dblink with a bunch of insert statements. Since I was moving about 4m records this was distressingly but not surprisingly slow. It set me wondering why we don't build more support for libpq operations into dblink, like transactions and prepared queries, and maybe COPY too. It would be nice to be able to do something like:

select dblink_connect('dbh','dbname=foo');
select dblink_begin('dbh');
select dblink_prepare('dbh','sth','insert into bar values ($1,$2,$3)');
select dblink_exec_prepared('dbh','sth',row(a,b,c)) from bar; -- can we do this?
select dblink_commit('dbh');
select dblink_disconnect('dbh');

Does this seem worthwhile and doable, or am I smoking crack?

cheers

andrew
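(For reference, the slow approach described above, one dblink_exec insert per row wrapped in an explicit remote transaction, might look roughly like the untested sketch below; the table and column names come from the example and the quoting is purely illustrative.)

    select dblink_connect('dbh', 'dbname=foo');
    select dblink_exec('dbh', 'begin');
    -- one remote INSERT per local row: roughly 4m round trips
    select dblink_exec('dbh',
           'insert into bar values ('
           || quote_literal(a::text) || ','
           || quote_literal(b::text) || ','
           || quote_literal(c::text) || ')')
    from bar;
    select dblink_exec('dbh', 'commit');
    select dblink_disconnect('dbh');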
On Thu, Aug 6, 2009 at 11:11 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Last night I needed to move a bunch of data from an OLTP database to an
> archive database, and used dblink with a bunch of insert statements. Since I
> was moving about 4m records this was distressingly but not surprisingly
> slow. It set me wondering why we don't build more support for libpq
> operations into dblink, like transactions and prepared queries, and maybe
> COPY too. It would be nice to be able to do something like:
>
> select dblink_connect('dbh','dbname=foo');
> select dblink_begin('dbh');

you can always exec a sql 'begin'.

> select dblink_prepare('dbh','sth','insert into bar values ($1,$2,$3)');
> select dblink_exec_prepared('dbh','sth',row(a,b,c)) from bar; -- can
> we do this?

The answer to this I think is yes, but not quite that way. Much better, I think, is to use 8.4 variadic (variable-argument) functions, use libpq's parameterized-query features throughout, and use the binary protocol when possible. This does end up running much faster and is easier to use... (we've done exactly that for our in-house stuff). IIRC you can parameterize 'execute', so the above should work for prepared queries as well. If we get the ability to set specific OIDs for types, I can remove some of the hacks we have to send text for composites and arrays of composites.

select * from pqlink_exec(connstr, 'select $1 + $2', 3, 4) as R(v int);
 v
---
 7
(1 row)

merlin
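(To illustrate the 8.4 variadic angle without any C code, here is an untested sketch of a pure-SQL helper in the spirit of the dblink_exec_prepared call proposed above; the function is hypothetical, not part of dblink, and unlike pqlink_exec it still sends everything as quoted text rather than using parameterized or binary libpq calls.)

    create function dblink_exec_prepared(conn text, stmt text, variadic args text[])
    returns text language sql as $$
        -- build "execute <stmt>(<quoted args>)" and run it on the remote connection
        select dblink_exec($1,
               'execute ' || quote_ident($2) || '(' ||
               array_to_string(array(select quote_literal(x) from unnest($3) x), ',')
               || ')');
    $$;

    -- usage, after preparing the statement on the remote side:
    -- select dblink_exec('dbh', 'prepare sth as insert into bar values ($1, $2, $3)');
    -- select dblink_exec_prepared('dbh', 'sth', a::text, b::text, c::text) from bar;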
On Thu, Aug 06, 2009 at 11:11:58AM -0400, Andrew Dunstan wrote:
> Last night I needed to move a bunch of data from an OLTP database to an
> archive database, and used dblink with a bunch of insert statements.
> Since I was moving about 4m records this was distressingly but not
> surprisingly slow. It set me wondering why we don't build more support
> for libpq operations into dblink, like transactions and prepared
> queries, and maybe COPY too. It would be nice to be able to do something
> like:
>
> select dblink_connect('dbh','dbname=foo');
> select dblink_begin('dbh');
> select dblink_prepare('dbh','sth','insert into bar values ($1,$2,$3)');
> select dblink_exec_prepared('dbh','sth',row(a,b,c)) from bar; -- can
> we do this?
> select dblink_commit('dbh');
> select dblink_disconnect('dbh');
>
> Does this seem worthwhile and doable, or am I smoking crack?

For what it's worth, DBI-Link provides a lot of this.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter wrote:
> For what it's worth, DBI-Link provides a lot of this.

Indeed, but that assumes that perl + DBI + DBD::Pg is available, which is by no means always the case. If we're going to have a dblink module ISTM it should be capable of reasonable bulk operations.

cheers

andrew
On Thu, Aug 06, 2009 at 12:28:15PM -0400, Andrew Dunstan wrote:
> David Fetter wrote:
>> For what it's worth, DBI-Link provides a lot of this.
>
> Indeed, but that assumes that perl+DBI+DBD::Pg is available, which
> is by no means always the case. If we're going to have a dblink
> module ISTM it should be capable of reasonable bulk operations.

I didn't mean to suggest that you should use DBI-Link, just that it's a requirement that's come up in very similar contexts to that of dblink.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
On Thu, Aug 6, 2009 at 11:11 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Last night I needed to move a bunch of data from an OLTP database to an
> archive database, and used dblink with a bunch of insert statements. Since I
> was moving about 4m records this was distressingly but not surprisingly
> slow. It set me wondering why we don't build more support for libpq
> operations into dblink, like transactions and prepared queries, and maybe
> COPY too. It would be nice to be able to do something like:
>
> select dblink_connect('dbh','dbname=foo');
> select dblink_begin('dbh');
> select dblink_prepare('dbh','sth','insert into bar values ($1,$2,$3)');
> select dblink_exec_prepared('dbh','sth',row(a,b,c)) from bar; -- can
> we do this?
> select dblink_commit('dbh');
> select dblink_disconnect('dbh');

thinking about this some more, you can get pretty close with vanilla dblink with something like (i didn't test):

select dblink_exec('dbh', 'prepare xyz as insert into foo select ($1::foo).*');
select dblink_exec('dbh', 'execute xyz(' || my_foo::text || ')');

This maybe defeats a little bit of what you are trying to achieve (especially performance), but is much easier to craft for basically any table as long as the fields match. The above runs into problems with quoting (composite with bytea in it), but works ok most of the time. If you want faster/better, dblink needs to be refactored to parameterize queries and, if possible, use binary.

merlin
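(On the quoting problem Merlin mentions: one way to sidestep most of it, sketched here untested, is to run the composite's text form through quote_literal before splicing it into the execute string; the bytea caveat still applies.)

    select dblink_exec('dbh', 'prepare xyz as insert into foo select ($1::foo).*');
    -- quote_literal escapes any quotes inside the composite's text representation
    select dblink_exec('dbh', 'execute xyz(' || quote_literal(f::text) || ')') from foo f;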