Getting even more insert performance (250m+rows/day) - Mailing list pgsql-performance

From Daniel J. Luke
Subject Getting even more insert performance (250m+rows/day)
Date
Msg-id CE96DEF6-C455-4BB3-8B57-A78849A519E8@geeklair.net
Whole thread Raw
Responses Re: Getting even more insert performance (250m+rows/day)  ("Dave Dutcher" <dave@tridecap.com>)
Re: Getting even more insert performance (250m+rows/day)  ("Steinar H. Gunderson" <sgunderson@bigfoot.com>)
List pgsql-performance
I have a system that currently inserts ~ 250 million rows per day (I
have about 10k more raw data than that, but I'm at the limit of my
ability to get useful insert performance out of postgres).

Things I've already done that have made a big difference:
- modified postgresql.conf shared_buffers value
- converted to COPY from individual insert statements
- changed BLCKSZ to 32768

I currently get ~35k/sec inserts on a table with one index (~70k/sec
inserts if I don't have any indexes).

The indexed field is basically a time_t (seconds since the epoch),
autovacuum is running (or postgres would stop choosing to use the
index). The other fields have relatively lower cardinality.

Each days worth of data gets inserted into its own table so that I
can expire the data without too much effort (since drop table is much
faster than running a delete and then vacuum).

I would really like to be able to have 1 (or 2) more indexes on the
table since it takes a while for a sequential scan of 250million rows
to complete, but CPU time goes way up.

In fact, it looks like I'm not currently IO bound, but CPU-bound. I
think some sort of lazy-index generation (especially if it could be
parallelized to use the other processors/cores that currently sit
mostly idle) would be a solution. Is anyone working on something like
this? Any other ideas? Where should I look if I want to start to
think about creating a new index that would work this way (or am I
just crazy)?

Thanks for any insight!

--
Daniel J. Luke
+========================================================+
| *---------------- dluke@geeklair.net ----------------* |
| *-------------- http://www.geeklair.net -------------* |
+========================================================+
|   Opinions expressed are mine and do not necessarily   |
|          reflect the opinions of my employer.          |
+========================================================+



Attachment

pgsql-performance by date:

Previous
From: "Robin Ericsson"
Date:
Subject: Re: Selects query stats?
Next
From: "Dave Dutcher"
Date:
Subject: Re: Getting even more insert performance (250m+rows/day)