What is the best COPY performance that you have gotten on a "normal" table?
I know that this is question is almost too general, but it might help
me out a bit, or at least give me the right things to tweak. Perhaps
the question can be rewritten as "Where are the major bottlenecks in a
COPY?" or "How can I compute the max theoretical COPY performance for
my hardware?". The two subquestions that I have from this are:
-Are my ETL scripts (perl) maximizing the database COPY speeds?
-Can I tweak my DB further to eek out a bit more performance?
I'm using perl to ETL a decent sized data set (10 million records) and
then loading it through perl::DBI's copy. I am currently getting
between 10K and 15K inserts/second. I've profiled the ETL scripts a
bit and have performance-improved a lot of the code, but I'd like to
determine whether it makes sense to try and further optimize my Perl
or count it as "done" and look for improvements elsewhere.
I ran trivial little insert into a table with a single integer row and
came close to 250K inserts/second using psql's \copy, so I'm thinking
that my code could be optimized a bit more, but wanted to check around
to see if that was the case.
I am most interested in loading two tables, one with about 21 (small)
VARCHARs where each record is about 200 bytes, and another with 7
INTEGERs, 3 TIMESTAMPs, and 1 BYTEA where each record is about 350
bytes.
I have implemented most of the various bits of PG config advice that I
have seen, both here and with a some googling, such as:
wal_buffers=128
checkpoint_segments=128
checkpoint_timeout=3000
Software: PG 8.1.3 on RHEL 4.3 x86_64
Hardware: Quad Dual-core Opteron, Fibre Channel SAN with 256M BBC
Thanks!