Re: COPY and heap_sync - Mailing list pgsql-hackers

From Fabrízio de Royes Mello
Subject Re: COPY and heap_sync
Date
Msg-id CAFcNs+rraii9UTtc2LC_jbHfUkyF1tDEqdqn+tFh7EJbmMiNUQ@mail.gmail.com
Whole thread
In response to Re: COPY and heap_sync  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers

On Sun, Aug 31, 2014 at 10:10 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
>
> On 8/30/14 2:26 AM, Jeff Janes wrote:
> > But there cases were people use COPY in a loop with a small amount of
> > data in each statement.
>
> What would be the reason for doing that?
>

I used that to the same thing many times. In a company that I was employed we developed scripts to migrate data from one database do another.

The first version we used INSERT statements and was very very slow. Then we wrote a second version changing the INSERT by COPY statements. The performance was very better, but  we believe that could be better, so in the third version we created some kind of "cache" (using arrays) to accumulate the records in memory then after N rows we build the COPY statement with the cache contents and run it. This was a really good performance improvement.

It's my use case to we have a feature to postpone the heap_sync in COPY statements. I don't know if it's a feature that a lot of people wants, but IMHO it could be nice to improve the bulk load operations.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello

pgsql-hackers by date:

Previous
From: Craig Ringer
Date:
Subject: Re: Tips/advice for implementing integrated RESTful HTTP API
Next
From: rohtodeveloper
Date:
Subject: Re: Why data of timestamptz does not store value of timezone passed to it?