Re: COPY vs INSERT - Mailing list pgsql-performance

From Kris Jurka
Subject Re: COPY vs INSERT
Date
Msg-id Pine.BSO.4.56.0505050406080.14495@leary.csoft.net
Whole thread Raw
In response to Re: COPY vs INSERT  (Mischa Sandberg <mischa.sandberg@telus.net>)
List pgsql-performance

On Wed, 4 May 2005, Mischa Sandberg wrote:

> Quoting Kris Jurka <books@ejurka.com>:
>
> > Not true.  A client may send any number of Bind/Execute messages on
> > a prepared statement before a Sync message.

> Hunh. Interesting optimization in the JDBC driver. I gather it is
> sending a string of (;)-separated inserts.

No, it uses the V3 protocol and a prepared statement and uses
Bind/Execute, as I mentioned.

> Sounds like efficient-but-risky stuff we did for ODBC drivers at Simba
> ... gets interesting when one of the insert statements in the middle
> fails.

When running inside a transaction (as you really want to do anyway when
bulk loading) it is well defined, it is a little odd for auto commit mode
though.  In autocommit mode the transaction boundary is at the Sync
message, not the individual Execute messages, so you will get some
rollback on error.  The JDBC spec is poorly defined in this area, so we
can get away with this.

> Good to know. Hope that the batch size is parametric, given that
> you can have inserts with rather large strings bound to 'text' columns
> in PG --- harder to identify BLOBs when talking to PG, than when talking
> to MSSQL/Oracle/Sybase.

The batch size is not a parameter and I don't think it needs to be.  The
issue of filling both sides of network buffers and deadlocking only needs
to be avoided on one side.  The response to an insert request is small and
not dependent on the size of the data sent, so we can send as much as we
want as long as the server doesn't send much back to us.

Kris Jurka

pgsql-performance by date:

Previous
From: Mischa Sandberg
Date:
Subject: Re: COPY vs INSERT
Next
From: Jona
Date:
Subject: Bad choice of query plan from PG 7.3.6 to PG 7.3.9 part 1