Thread: Copy performance issues
I am having severe COPY performance issues after adding indices. What used to take a few minutes (without indices) now takes several hours (with indices). I've tried to tweak the database configuration based on the Postgres documentation and forums, but it hasn't helped yet; perhaps I haven't increased the limits sufficiently. Dropping and recreating the indices may not be an option because of the long time it takes to rebuild them all.
I'd appreciate it if someone could look at my configuration and give me a few ideas on how to increase COPY performance.
Thanks.
Saadat.
Table structure:
===========
table C:
Table "public.C"
Column | Type | Modifiers
----------+------------------+-----------
sclk | double precision | not null
chan | smallint | not null
det | smallint | not null
x | real | not null
y | real | not null
z | real | not null
r | real |
t | real |
lat | real |
lon | real |
a | real |
b | real |
c | real |
time | real |
qa | smallint | not null
qb | smallint | not null
qc | smallint | not null
Indexes:
"C_pkey" PRIMARY KEY, btree (sclk, chan, det)
Table C is partitioned into 19 sub-tables covering lat bands. For example:
sub-table C0:
Inherits: C
Check constraints:
"C0_lat_check" CHECK (lat >= (-10::real) AND lat < 0::real)
Indexes:
"C0_pkey" PRIMARY KEY, btree (sclk, chan, det)
"C0_lat" btree (lat)
"C0_time" btree (time)
"C0_lon" btree (lon)
sub-table C1:
Inherits: C
Check constraints:
"C1_lat_check" CHECK (lat >= (-20::real) AND lat < -10::real)
Indexes:
"C1_pkey" PRIMARY KEY, btree (sclk, chan, det)
"C1_lat" btree (lat)
"C1_time" btree (time)
"C1_lon" btree (lon)
The C? partitions are ~30GB (328,000,000 rows) each, except one, which is ~65GB (909,000,000 rows). There are no rows in the umbrella table C itself; it exists only as the parent from which the sub-tables inherit. The data is partitioned to promote better access: most people will query C by specifying a lat range. Also, the partitions can grow quite large over time.
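For reference, each lat-band partition was set up along these lines. This is a simplified sketch reconstructed from the psql output above, so the exact DDL, quoting, and storage options are only approximate:

  CREATE TABLE "C0" (
      CONSTRAINT "C0_lat_check" CHECK (lat >= (-10::real) AND lat < (0::real))
  ) INHERITS ("C");

  ALTER TABLE "C0" ADD CONSTRAINT "C0_pkey" PRIMARY KEY (sclk, chan, det);

  CREATE INDEX "C0_lat"  ON "C0" (lat);
  CREATE INDEX "C0_time" ON "C0" ("time");
  CREATE INDEX "C0_lon"  ON "C0" (lon);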
The COPY operation loads one file into each of the 19 partitions. Each file contains between 300,000 and 600,000 records.
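Concretely, the load step runs one COPY per partition, along these lines (the file paths are placeholders and the default text format is assumed; the real paths aren't shown here):

  COPY "C0" FROM '/staging/c0_chunk.dat';
  COPY "C1" FROM '/staging/c1_chunk.dat';
  -- ... one file for each of the remaining partitions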
System configuration:
================
1. RHEL5 x86_64
2. 32G RAM
3. 8T RAID5 partition for database on a Dell PERC 5/E controller
(I understand that I'll never get fast inserts/updates on it based on
http://wiki.postgresql.org/wiki/SlowQueryQuestions but cannot change
to a RAID0+1 for now).
Database's filesystem is ext4 on LVM on RAID5.
4. Postgres 8.4.2
shared_buffers = 10GB
temp_buffers = 16MB
work_mem = 2GB
maintenance_work_mem = 256MB
max_files_per_process = 1000
effective_io_concurrency = 3
wal_buffers = 8MB
checkpoint_segments = 40
enable_seqscan = off
effective_cache_size = 16GB
5. ANALYZE VERBOSE; was run on the database before the COPY operation
Bonnie++ output:
=============
Version 1.03 ------Sequential Output------ --Sequential Input- --Random-
-Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP /sec %CP
dbtest 64240M 78829 99 266172 42 47904 6 58410 72 116247 9 767.9 1
------Sequential Create------ --------Random Create--------
-Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP /sec %CP
256 16229 98 371704 99 20258 36 16115 97 445680 99 17966 36
dbtest,64240M,78829,99,266172,42,47904,6,58410,72,116247,9,767.9,1,256,16229,98,371704,99,20258,36,16115,97,445680,99,17966,36
s anwar wrote:
> 3. 8T RAID5 partition for database on a Dell PERC 5/E controller
> (I understand that I'll never get fast inserts/updates on it based on
> http://wiki.postgresql.org/wiki/SlowQueryQuestions but cannot change
> to a RAID0+1 for now).
> Database's filesystem is ext4 on LVM on RAID5.

And LVM slows things down too, so you're stuck with basically the worst possible layout here.

Have you checked that your PERC is set up with a write-back cache? If not, that will kill performance. You need a battery for the controller for that to work right. This is the first thing I'd check on your system.

With ext4 on this controller, you might also need to use the "nobarrier" write option when mounting the filesystem to get good performance. That should be safe if you have a battery on the controller.

> shared_buffers = 10GB
> checkpoint_segments = 40

I've gotten some reports that the fall-off where shared_buffers stops helping is lower than this on Linux. I'd suggest at most 8GB, and you might even try something like 4GB just to see if that turns out to be better.

With what you're doing, you could likely increase checkpoint_segments quite a bit from here too. I'd try something >100 and see if that helps.

> enable_seqscan = off

That's going to cause you serious trouble one day if you leave it like this in production, on a table where index scans are much more expensive than sequential ones for what you're doing.

One thing that can help large COPY operations a lot is to increase Linux read-ahead. Something like:

  /sbin/blockdev --setra 4096 /dev/sda

done for each physical drive the system sees will significantly help large sequential reads of files, which is what COPY does. The default is probably 256 on your system given the LVM setup, possibly even smaller.

Beyond that, you may just need to investigate clever ways to reduce the indexing requirements.

--
Greg Smith   2ndQuadrant US   Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us
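Pulling Greg's suggestions together, a rough checklist to try; the commands and values below are starting points under the assumptions noted in the comments, not tested recommendations for this particular box:

  # 1. Confirm the PERC 5/E is running in write-back mode (needs a healthy
  #    battery). With Dell's OpenManage tools installed, something like:
  omreport storage vdisk

  # 2. Remount the database filesystem without write barriers -- only safe
  #    with a battery-backed controller cache. Illustrative /etc/fstab entry
  #    (device and mount point are placeholders):
  #    /dev/VolGroup00/pgdata  /var/lib/pgsql  ext4  defaults,nobarrier  0 0

  # 3. Raise Linux read-ahead on each physical drive the system sees:
  /sbin/blockdev --setra 4096 /dev/sda

  # 4. postgresql.conf changes to experiment with, re-timing the COPY after each:
  #    shared_buffers      = 4GB to 8GB   (down from 10GB)
  #    checkpoint_segments = 128          (up from 40)
  #    enable_seqscan      = on           (don't leave this off in production)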