On Sat, Oct 5, 2013 at 6:39 AM, Stephen Frost <sfrost@snowman.net> wrote:
* Robert Haas (robertmhaas@gmail.com) wrote: > On Fri, Oct 4, 2013 at 11:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > David Rowley <dgrowleyml@gmail.com> writes: > >> Here's a small patch which greatly increases the speed of > >> pg_dump --column-inserts. > > > > The reason why no one's paid any attention to the speed of that code path > > is that if you care about dump/restore speed, you should be using the COPY > > code paths instead. Is it really worth adding code and complexity to > > pg_dump for this? > > One possible reason to care about this is if you're trying to move > data to another database. The INSERT format is more portable. > > Also, this isn't really adding any net code or complexity AFAICS.
Agreed- this looks more like a "gee, that makes a lot of sense" than a "wow, that's way more complicated". Not a whole lot of point in building up a known-to-be-constant string on every iteration of the loop.
These words made me think that the changes I made were not quite enough to satisfy this what you said.
I understand that most people won't use the --column-inserts feature, but that's not really a great reason to have not very clever and wasteful code in there. This fruit was so low hanging it was pretty much touching the ground, so I couldn't resist fixing it when I saw it.
The attached revised patch goes a little further and prepares everything that is constant on processing the first row, this now includes the "INSERT INTO tablename " part. I don't think the changes make the code any harder to read, the code which builds the staticStmt fits into my small laptop screen.
The timings with my benchmark look something like:
Unpatched: 9200 ms
Version 0.1: 5700 ms
Version 0.2: 5250 ms
So it does shave off a bit more, for what it's worth.