Re: multi-line copy (was: Re: COPY Hacks) - Mailing list pgsql-performance

From Mischa Sandberg
Subject Re: multi-line copy (was: Re: COPY Hacks)
Date
Msg-id 1112936002.42560e420adf7@webmail.telus.net
Whole thread Raw
In response to Re: COPY Hacks (WAS: RE: Postgresql vs SQLserver for this application ?)  ("Greg Sabino Mullane" <greg@turnstep.com>)
List pgsql-performance
Quoting Greg Sabino Mullane <greg@turnstep.com>:

> > Using COPY ... FROM STDIN via the Perl DBI (DBD::Pg) interface,
> > I accidentally strung together several \n-terminated input lines,
> > and sent them to the server with a single "putline".
> ...
> > So. Is it a feechur? Worth stress-testing? Could be VERY cool.
>
> As explained elsewhere, not really a feature, more of a side-effect.
> Keep in mind, however, that any network round-trip time saved has to
> be balanced against some additional overhead of constructing the
> combined strings in Perl before sending them over. Most times COPY
> is used to parse a newline-separated file anyway. If you have a slow
> network connection to the database, it *might* be a win, but my
> limited testing shows that it is not an advantage for a "normal"
> connection: I added 1 million rows via COPY using the normal way
> (1 million pg_putline calls), via pg_putline of 1000 rows at a
> time, and via 10,000 rows at a time. They all ran in 22 seconds,
> with no statistical difference between them. (This was the "real" time,
> the system time was actually much lower for the combined calls).
>
> It can't hurt to test things out on your particular system and see
> if it makes a real difference: it certainly does no harm as long as
> you make sure the string you send always *end* in a newline.

Many thanks for digging into it.

For the app I'm working with, the time delay between rows being posted
is /just/ enough to exceed the TCP Nagle delay, so every row goes across
in its own packet :-( Reducing the number of network roundtrips
by a factor of 40 is enough to cut elapsed time in half.
The cost of join("",@FortyRows), which produces a 1-4K string, is what's
negligible in this case.

--
"Dreams come true, not free" -- S.Sondheim, ITW


pgsql-performance by date:

Previous
From: "Greg Sabino Mullane"
Date:
Subject: Re: COPY Hacks (WAS: RE: Postgresql vs SQLserver for this application ?)
Next
From: Josh Berkus
Date:
Subject: Re: [sfpug] DATA directory on network attached storage