Folks,
I'm doing some massive data transformations on Postgresql, and they're
a lot slower than they should be. I'm looking for some tips on
improving things. If the PGSQL-PERFORMANCE list has been created,
please let me know and I'll take this discussion over there.
The update: A series of 7 update statements which cull data from a 1.5
million row table to update a 120,000 row table.
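For reference, each step in the series looks roughly like the sketch below. The table and column names here are hypothetical stand-ins, not the real schema:

```sql
-- One of the seven steps: pull aggregated values from the large
-- source table (~1.5 million rows) into the small target table
-- (~120,000 rows), joined on a shared key.
UPDATE target_summary t
SET    total_amount = s.total_amount
FROM   (SELECT account_id, SUM(amount) AS total_amount
        FROM   source_detail
        GROUP  BY account_id) s
WHERE  t.account_id = s.account_id;
```

The UPDATE ... FROM join syntax lets the planner do one join or aggregate pass instead of a correlated subquery per updated row.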
The Machine: A dual-processor RAID 5 UW SCSI server with 512 MB of RAM.
The postgresql.conf settings:
Connections: 128
Shared Buffers: 256
Sort Mem: 1024
Checkpoint Segments: 16
Stats on.
Light debug logging.
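In postgresql.conf terms, the settings above correspond roughly to the fragment below. Note that shared_buffers is counted in 8 KB pages and sort_mem in KB, so these are quite small relative to the machine's 512 MB; the stats and debug parameter names vary by PostgreSQL version, so treat those lines as approximate:

```
max_connections = 128
shared_buffers = 256        # 256 x 8 KB pages = only 2 MB
sort_mem = 1024             # 1024 KB per sort operation
checkpoint_segments = 16
stats_start_collector = true    # stats on
# plus light debug logging
```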
The problem: The update series (done as a function) takes 10-15
minutes. During this time, the CPU is never more than 31% busy, only
256 MB of the 512 MB of RAM is in use, and the disk channel is only
25%-50% saturated. As such, it seems like we could run things faster.
What does everybody suggest tweaking?
-Josh Berkus