Greetings. We've been having trouble with full statement logging since we
moved from an 8-core server with 16 GB of memory to a machine with double
that spec, and I am wondering whether this *should* work, or whether there
is a point on larger machines where logging and the scheduling of
background writes (that's just a theory on my part) no longer play well
together.
The box in question is a Dell PowerEdge R900 with 16 cores and 64 GB
of RAM (16 GB allocated to shared_buffers), and a not-so-great
root@db04:~# lspci|grep RAID
19:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS
1078 (rev 04)
controller with eight 10k RPM disks in RAID 1+0 (one big filesystem),
running Ubuntu Hardy with kernel version
root@db04:~# uname -a
Linux db04 2.6.24-22-server #1 SMP Mon Nov 24 20:06:28 UTC 2008 x86_64 GNU/Linux
Logging to the disk array actually breaks down much earlier: at
off-peak times we run around 3,000 transactions per second, and if we set
log_statement = 'all', the server bogs down immediately. Load, context
switches, and above all the mean query duration shoot up; the application
slows to a crawl and becomes unusable.
So the idea came up to log to /dev/shm, the tmpfs RAM disk that Linux
mounts by default with a maximum size of half the available memory.
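Concretely, the setup looks roughly like this (a sketch only; the
/dev/shm/pg_log path and the rotation size are assumptions, not our
literal config):

```
# postgresql.conf sketch -- /dev/shm/pg_log is an assumed path
logging_collector = on
log_directory = '/dev/shm/pg_log'   # tmpfs: fast, but lost on reboot
log_statement = 'all'               # log every statement for analysis
log_rotation_size = 262144          # in kB; rotate before tmpfs fills up
```

A cron job would then have to sweep rotated files out of tmpfs to
persistent storage before it fills up.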
This works much better, but once we reach about 80% of peak load
(currently around 8,000 transactions per second), the server goes into a
tailspin in the manner described above and we have to switch full logging
off again.
This is a problem because we can no longer do proper query analysis.
How are others faring with full logging on bigger boxes?
Regards,
Frank