Thread: Copy performance issues
I am having severe COPY performance issues after adding indices. What used to take a few minutes (without indices) now takes several hours (with indices). I've tried to tweak the database configuration (based on Postgres documentation and forums), but it hasn't helped yet. Perhaps I haven't increased the limits sufficiently. Dropping and recreating indices may not be an option due to the long time it takes to rebuild all indices.
I'd appreciate it if someone could look at my configuration and give me a few ideas on how to improve the copy performance.
Thanks.
Saadat.
Table structure:
===========
table C:
Table "public.C"
  Column |       Type       | Modifiers
---------+------------------+-----------
 sclk    | double precision | not null
 chan    | smallint         | not null
 det     | smallint         | not null
 x       | real             | not null
 y       | real             | not null
 z       | real             | not null
 r       | real             |
 t       | real             |
 lat     | real             |
 lon     | real             |
 a       | real             |
 b       | real             |
 c       | real             |
 time    | real             |
 qa      | smallint         | not null
 qb      | smallint         | not null
 qc      | smallint         | not null
Indexes:
"C_pkey" PRIMARY KEY, btree (sclk, chan, det)
The table is partitioned into 19 sub-tables covering lat bands. For example:
sub-table C0:
Inherits: C
Check constraints:
"C0_lat_check" CHECK (lat >= (-10::real) AND lat < 0::real)
Indexes:
"C0_pkey" PRIMARY KEY, btree (sclk, chan, det)
"C0_lat" btree (lat)
"C0_time" btree (time)
"C0_lon" btree (lon)
sub-table C1:
Inherits: C
Check constraints:
"C1_lat_check" CHECK (lat >= (-20::real) AND lat < -10::real)
Indexes:
"C1_pkey" PRIMARY KEY, btree (sclk, chan, det)
"C1_lat" btree (lat)
"C1_time" btree (time)
"C1_lon" btree (lon)
The partitions (C0, C1, ...) are ~30G (328,000,000 rows) each, except one, which is ~65G (909,000,000 rows). There are no rows in the umbrella table C from which the 19 sub-tables inherit. The data is partitioned this way to promote better access: most users will query C by specifying a lat range. Also, the partitions can grow quite large over time.
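For reference, each partition can be declared along these lines (a sketch reconstructed from the psql output above, not necessarily the exact DDL used):

    CREATE TABLE "C0" (
        CHECK (lat >= -10::real AND lat < 0::real)
    ) INHERITS ("C");
    ALTER TABLE "C0" ADD CONSTRAINT "C0_pkey" PRIMARY KEY (sclk, chan, det);
    CREATE INDEX "C0_lat"  ON "C0" (lat);
    CREATE INDEX "C0_time" ON "C0" ("time");
    CREATE INDEX "C0_lon"  ON "C0" (lon);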
The COPY operation loads one file per partition, for each of the 19 partitions. Each file contains between 300,000 and 600,000 records.
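In SQL terms, the load looks roughly like this, one COPY per partition (the file names here are hypothetical):

    COPY "C0" FROM '/data/load/c0.dat';
    COPY "C1" FROM '/data/load/c1.dat';
    -- ... and so on for the remaining partitions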
System configuration:
================
1. RHEL5 x86_64
2. 32G RAM
3. 8T RAID5 partition for the database on a Dell PERC 5/E controller
(I understand that I'll never get fast inserts/updates on it, based on
http://wiki.postgresql.org/wiki/SlowQueryQuestions, but cannot change
to RAID 0+1 for now).
The database's filesystem is ext4 on LVM on RAID5.
4. Postgres 8.4.2
shared_buffers = 10GB
temp_buffers = 16MB
work_mem = 2GB
maintenance_work_mem = 256MB
max_files_per_process = 1000
effective_io_concurrency = 3
wal_buffers = 8MB
checkpoint_segments = 40
enable_seqscan = off
effective_cache_size = 16GB
5. ANALYZE VERBOSE; was run on the database before the copy operation.
Bonnie++ output:
=============
Version  1.03       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
dbtest      64240M  78829  99 266172 42  47904  6  58410 72 116247  9  767.9  1
                    ------Sequential Create------ --------Random Create--------
                    -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                256  16229 98 371704 99  20258 36  16115 97 445680 99  17966 36
dbtest,64240M,78829,99,266172,42,47904,6,58410,72,116247,9,767.9,1,256,16229,98,371704,99,20258,36,16115,97,445680,99,17966,36
Saadat Anwar <sanwar@asu.edu> writes:
> I am having severe COPY performance issues after adding indices. What used
> to take a few minutes (without indices) now takes several hours (with
> indices). I've tried to tweak the database configuration (based on Postgres
> documentation and forums), but it hasn't helped yet. Perhaps I haven't
> increased the limits sufficiently. Dropping and recreating indices may not
> be an option due to the long time it takes to rebuild all indices.

I suspect your problem is basically that the index updates require a
working set larger than available RAM, so the machine spends all its
time shuffling index pages in and out. Can you reorder the input so
that there's more locality of reference in the index values?

Also, my first reaction to that schema is to wonder whether the lat/lon
indexes are worth anything. What sort of queries are you using them
for, and have you considered an rtree/gist index instead?

			regards, tom lane
On Wed, Aug 18, 2010 at 3:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Saadat Anwar <sanwar@asu.edu> writes:
>> I am having severe COPY performance issues after adding indices. What used
>> to take a few minutes (without indices) now takes several hours (with
>> indices). I've tried to tweak the database configuration (based on Postgres
>> documentation and forums), but it hasn't helped yet. Perhaps I haven't
>> increased the limits sufficiently. Dropping and recreating indices may not
>> be an option due to the long time it takes to rebuild all indices.
>
> I suspect your problem is basically that the index updates require a
> working set larger than available RAM, so the machine spends all its
> time shuffling index pages in and out. Can you reorder the input so
> that there's more locality of reference in the index values?
I can potentially reorder the data so that it has locality of reference w.r.t. one index, but not all. Or did I not interpret your response correctly?
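For the primary-key index alone, one way to get that ordering is to load each file into an unindexed staging table and then insert in key order (a sketch only; the staging-table name and file path are made up):

    CREATE TEMP TABLE staging (LIKE "C0");   -- copies columns only, no indexes
    COPY staging FROM '/data/load/c0.dat';
    INSERT INTO "C0" SELECT * FROM staging ORDER BY sclk, chan, det;
    DROP TABLE staging;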
> Also, my first reaction to that schema is to wonder whether the lat/lon
> indexes are worth anything. What sort of queries are you using them
> for, and have you considered an rtree/gist index instead?
I always assumed that btree indices on individual fields were smaller and more efficient than rtree/gist indices. Is that not the case? And since the users did not need point queries, I decided in favor of indexing the individual fields.
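For comparison, a two-dimensional GiST index can be built over a box expression on (lon, lat), which works with the built-in box operator class (a sketch only; the index name and the lon/lat ranges in the query are made up):

    CREATE INDEX "C0_coord" ON "C0"
        USING gist (box(point(lon, lat), point(lon, lat)));

    -- a range query that can use the index via the box-overlap operator:
    SELECT * FROM "C0"
    WHERE box(point(lon, lat), point(lon, lat))
          && box(point(-120, -10), point(-110, 0));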
Thanks.
Saadat.
Hi,

Try splitting your data into small batches; it helped me in a similar situation recently. I was loading about a million rows into a table heavily populated with indexes and triggers, and used a batch size of 100 (with COPY). The only DDL change I made was to drop the FKs and recreate them afterwards.

BTW, a question for the gurus: why, and in what cases, could loading in small batches theoretically be faster than one huge load, if there is no load on the database other than this?

On 18 August 2010 20:25, Saadat Anwar <sanwar@asu.edu> wrote:
> I am having severe COPY performance issues after adding indices. What used
> to take a few minutes (without indices) now takes several hours (with
> indices). I've tried to tweak the database configuration (based on Postgres
> documentation and forums), but it hasn't helped yet. Perhaps I haven't
> increased the limits sufficiently. Dropping and recreating indices may not
> be an option due to the long time it takes to rebuild all indices.
> [...]

--
Sergey Konoplev
Blog: http://gray-hemp.blogspot.com / Linkedin: http://ru.linkedin.com/in/grayhemp /
JID/GTalk: gray.ru@gmail.com / Skype: gray-hemp / ICQ: 29353802
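A minimal sketch of that batched approach, assuming the input has been pre-split into chunk files (for example with split(1)); all file names here are made up:

    BEGIN;
    COPY "C0" FROM '/data/load/c0_chunk_000.dat';
    COMMIT;

    BEGIN;
    COPY "C0" FROM '/data/load/c0_chunk_001.dat';
    COMMIT;
    -- ... repeat for the remaining chunks; each batch commits on its own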