Re: generic copy options - Mailing list pgsql-hackers

From Robert Haas
Subject Re: generic copy options
Msg-id 603c8f070909171604l2625e6f3h72af1e0988bd60ef@mail.gmail.com
In response to Re: generic copy options  (Greg Smith <gsmith@gregsmith.com>)
List pgsql-hackers
On Thu, Sep 17, 2009 at 6:54 PM, Greg Smith <gsmith@gregsmith.com> wrote:
> On Thu, 17 Sep 2009, Dan Colish wrote:
>
>>        - Performance appears to be the same although I don't have a good
>>          way for testing this at the moment
>
> Here's what I do to generate simple COPY performance test cases:
>
> CREATE TABLE t (i integer);
> INSERT INTO t SELECT x FROM generate_series(1,100000) AS x;
> \timing
> COPY t TO '/some/file' WITH [options];
> BEGIN;
> TRUNCATE TABLE t;
> COPY t FROM '/some/file' WITH [options];
> COMMIT;
>
> You can adjust the size of the generated table based on whether you want to
> minimize (small number) or maximize (big number) the impact of the setup
> overhead relative to actual processing time.  Big numbers make sense if
> there's a per-row change; small ones make sense if it's mainly COPY setup
> that's been changed and you want only a small bit of data to test against.
>
> An example with one column in it is a good test case for seeing whether
> per-row impact has gone up.  You'd want something with a wider row for other
> types of performance tests.
>
> The reason for the BEGIN/COMMIT there is that this form uses an optimization
> that lowers WAL volume when doing the COPY insertion, which makes it more
> likely you'll be testing performance of the right thing.

Unless something has changed drastically in the last day or two, this
patch affects only the option-parsing phase of COPY, so the impact
should be all but unnoticeable, and it should be an up-front cost,
not a per-row one.  It would be good to verify that, of course.
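
A rough way to check this, extending Greg's recipe above, is to run the
same COPY against a tiny table and a big one. This is only a sketch: the
table names t_small and t_large, the row counts, the file paths, and the
CSV option are placeholders picked for illustration, not anything taken
from the patch. If the cost really is up-front, any difference from an
unpatched build should show up (if at all) in the tiny case, and the big
case should be unchanged.

```sql
-- Sketch only: table names, row counts, paths, and the CSV option are
-- illustrative placeholders.
CREATE TABLE t_small (i integer);
CREATE TABLE t_large (i integer);
INSERT INTO t_small SELECT x FROM generate_series(1, 10) AS x;
INSERT INTO t_large SELECT x FROM generate_series(1, 1000000) AS x;

\timing

-- Tiny table: elapsed time is mostly COPY setup, option parsing included.
COPY t_small TO '/some/file.small' WITH CSV;
BEGIN;
TRUNCATE TABLE t_small;
COPY t_small FROM '/some/file.small' WITH CSV;
COMMIT;

-- Big table: elapsed time is mostly per-row work.
COPY t_large TO '/some/file.large' WITH CSV;
BEGIN;
TRUNCATE TABLE t_large;
COPY t_large FROM '/some/file.large' WITH CSV;
COMMIT;
```

Single timings are noisy, so it is probably worth repeating each COPY a
few times on both the patched and unpatched builds before comparing.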

...Robert

