> You know that too many indices hurts insert/update/delete performance,
> right? For each of these actions, all related indices would need to be
> updated. So if you have any indices that are not used or you could do
> without, remove them.
As mentioned, tables which see a lot of data modification have their index's
removed and then re-created. I don't bother doing this with tables which
aren't so heavily modified.
> Are you using the same database connection for each thread in your
> multi-threaded approach? Postgresql will only benefit from multiple
> processors if there are multiple postgres processes running. In
> application speach that means that you need to have multiple database
> connections open (i.e. one backend process per connection).
Each thread opens up a new connection. Using (g)top I have already verified
that there are several postgres instances running. The machine is entirely
dedicated to this task and there were no other users.
Just to re-iterate.
1. I've done quite a bit of tweaking with the WAL*, shared buffers and sort
mem over several weeks. They're about as optimum as I can get them. Turning
off fsync was one of the first things I did BTW.
2. The executable was compiled on the machine with as many optimisations as
gcc could reasonably support.
3. The queries are just about as good as I can get them and have been
throughly EXPLAIN'ed with live/large amounts of data in the tables. No IN's
used, only EXISTS (where required). This one bit me right at the start. I
had to kill one query after it ran for ~36hrs! BTW, EXISTS seems to be far
more efficient than a JOIN. Is this always true?
4. Temp tables are used to simplify complex queries (and speed them up I
hope).
5. RAID-0 (SCSI/HW) + Dual Proc + 1GB RAM. Linux 2.4.17(smp) (pure Linus. No
other patches). Swap is on a seperate IDE drive.
6. COPY is used where-ever it can be. Index's are dropped before heavy
modification and then recreated. I don't use CLUSTER.
7. Driver app was multi-threaded. It made things worse. BTW, the apps jobs
consists largely of firing off SQL queries in the correct sequence; so you'd
better not go blaming my code!
Is there anything I've missed out?
--Arsalan.