Re: Insert performance - Mailing list pgsql-performance

From: Andreas Kostyrka
Subject: Re: Insert performance
Date:
Msg-id: 20070306124937.GP22981@andi-lap.la.revver.com
In response to: Re: Insert performance (Richard Huxton <dev@archonet.com>)
List: pgsql-performance
* Richard Huxton <dev@archonet.com> [070306 13:47]:
> Andreas Kostyrka wrote:
> >* Richard Huxton <dev@archonet.com> [070306 12:22]:
> >>>>2. You can do a COPY from libpq - is it really not possible?
> >>>>
> >>>Not really, but I have been testing it, and inserts are flying
> >>>(about 100,000 inserts/sec)!
> >>What's the problem with the COPY? Could you COPY into one table then insert from that to your target table?
> >Well, there are some issues. First, your client needs to support it.
> >E.g. psycopg2 supports only some specific CSV formatting in its
> >methods. (Plus I sometimes had random psycopg2 crashes, but guarding
> >against these is cheap compared to the speedup of COPY versus INSERT.)
> >Plus you need to be sure that your data will apply cleanly (which in
> >my app was not the case), or you need to code a fallback that
> >localizes the row that doesn't work.
> >And the worst thing is that it ignores RULES on the tables, which
> >sucks if you use them ;) (e.g. table partitioning).
>
> Ah, but two things deal with these issues:
> 1. Joel is using libpq
> 2. COPY into a holding table, tidy data and INSERT ... SELECT

Clearly COPY is the way to bulk-load data, BUT you asked, so I wanted
to point out some problems and brittle points with COPY.
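
For reference, a minimal sketch of the holding-table approach with
psycopg2 (the table and column names here are made up for
illustration; copy_from() expects tab-separated text by default). The
final INSERT ... SELECT goes through the rewriter, so partitioning
RULES do fire there:

import io
import psycopg2

conn = psycopg2.connect("dbname=test")
cur = conn.cursor()

# Hypothetical rows; "staging_events" is a holding table that
# mirrors the real "events" table.
rows = [(1, "2007-03-06", "click"), (2, "2007-03-06", "view")]
buf = io.StringIO("".join("%d\t%s\t%s\n" % r for r in rows))

# Fast bulk load into the holding table (bypasses RULES).
cur.copy_from(buf, "staging_events", columns=("id", "day", "action"))

# Tidy the data, then INSERT ... SELECT into the target table;
# this statement *is* subject to the partitioning RULES.
cur.execute("""
    INSERT INTO events (id, day, action)
    SELECT id, day, action
      FROM staging_events
     WHERE day IS NOT NULL
""")
cur.execute("TRUNCATE staging_events")
conn.commit()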

(And the COPY into the holding table doesn't completely solve the
problem of dirty, inconsistent data.)
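
One cheap way to code the fallback mentioned above, which localizes
the rows that don't apply cleanly: try COPY for the whole batch, and
only on failure fall back to row-by-row INSERTs so the offending rows
surface with proper errors (a sketch only, same hypothetical names as
above):

import io
import psycopg2

def load_batch(conn, rows):
    """Try the fast COPY path first; on failure, retry row by row
    with INSERT so the bad rows can be identified and set aside."""
    cur = conn.cursor()
    buf = io.StringIO("".join("%d\t%s\t%s\n" % r for r in rows))
    try:
        cur.copy_from(buf, "staging_events",
                      columns=("id", "day", "action"))
        conn.commit()
        return []                 # whole batch applied cleanly
    except psycopg2.Error:
        conn.rollback()           # something in the batch is bad
    bad = []
    for r in rows:
        try:
            cur.execute("INSERT INTO staging_events (id, day, action) "
                        "VALUES (%s, %s, %s)", r)
            conn.commit()
        except psycopg2.Error:
            conn.rollback()
            bad.append(r)         # a row the COPY choked on
    return bad

The slow INSERT path only runs for batches that actually contain bad
rows, so the average cost stays close to that of pure COPY.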

Andreas
