Thread: Urgent: Tuning strategies?
Hello!

We are trying to get a new PostgreSQL database into production, but we are currently experiencing very serious performance problems. Data and application (ColdFusion on four webservers, accessed via ODBC) have been ported from an Oracle 8i database. We have no prior experience with PostgreSQL - that's why we're so desperate for some aid... Here's the data I can supply:

It's a postgresql-7.2.1 installation under SuSE 7.3 on a 1GB RAM, 4xPIII550MHz Xeon machine with ~30MB storage on RAID 5 (3 disks); the filesystem is ext3.

Custom settings in postgresql.conf:

max_connections = 256
shared_buffers = 56320
wal_buffers = 32
sort_mem = 64336
wal_files = 64
fsync = false
effective_cache_size = 18200

cat /proc/sys/kernel/shmmax: 536870912

Exemplary ipcs -m:

------ Shared Memory Segments --------
key        shmid    owner    perms  bytes      nattch  status
0x00000000 32768    root     600    1056768    3       dest
0x00000000 98305    root     600    33554432   3       dest
0x00000000 131074   wwwrun   600    368644     3       dest
0x0052e2c1 1638403  postgres 600    472064000  33
0x07021999 229380   root     644    1104       1

A complete dump.sql of the database installation is roughly 300MB in size. It consists of ten databases, one of which accounts for about 75% of all data (community database for forums, user tables, messaging system), so we suspect that this is where our server-load hog sits. This database has 33 tables; the biggest currently has 237884 records and is 24 fields wide, the second biggest has 84512 records and is 60 fields wide.

Exemplary top output:

12:04pm up 5 days, 1:13, 3 users, load average: 2.78, 2.78, 2.89
143 processes: 138 sleeping, 5 running, 0 zombie, 0 stopped
CPU0 states: 44.0% user, 12.3% system, 0.0% nice, 43.2% idle
CPU1 states: 48.4% user, 12.1% system, 0.0% nice, 39.0% idle
CPU2 states: 48.5% user, 15.5% system, 0.0% nice, 35.1% idle
CPU3 states: 55.0% user, 12.5% system, 0.0% nice, 31.5% idle
Mem:  1029400K av, 1023660K used,    5740K free, 0K shrd, 2932K buff
Swap: 2097136K av,  459800K used, 1637336K free          699220K cached

  PID USER     PRI NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
29565 postgres  16  0  124M 124M  118M R    85.4 12.4  15:42 postmaster
30004 postgres  15  0  305M 305M  303M R    67.4 30.3   0:13 postmaster
29647 postgres  11  0  440M 440M  438M R    56.7 43.8   6:19 postmaster
30057 postgres  17  0   980  980   732 R    17.3  0.0   0:03 top
26122 root      10  0  1000 1000   732 R    17.0  0.0  92:15 top
29726 postgres   9  0 82536  80M 81340 S     3.1  8.0   0:56 postmaster

I vacuum-analyze regularly once a day during low-traffic times (3:00 a.m.), which takes ~3.5 minutes over all databases. Every time we do some big updates/inserts in the process of porting the original Oracle DB data over to PostgreSQL, we do a vacuum analyze afterwards.

The whole project feeds several websites, one of which (currently still on Oracle) will probably cause about 20x as much load as all the rest; we need to switch this last website over to the PostgreSQL DB too, and as soon as possible.

What I'd need is your opinion on the given fundamental data (server and OS config, shared memory, top output) and some hints on how to find the bottlenecks we seem to have. I read as much documentation as I could, but there's no "guided tour to postgres tuning", or at least not one that I could find. We do use EXPLAIN output for finding out whether our indexes are used or not (and it seems they are), but there's still so much left to guesswork, like what bit exactly is causing the high server loads. How can we identify these bottlenecks?
How can I find the "sweet spots" for the optimizer settings in postgresql.conf? What else should or can we do in order to maximize performance? How can we reduce swap activity further? I think it's quite high (see the top output).

I know that I haven't exactly given profound info, but unfortunately I don't know what info you'd need in order to be able to provide specific hints. So what info can you give me now, please, and what info can I give you so that you can provide further assistance? Thank you very much in advance for your help; I am (almost) sure that with your aid we'll be able to accomplish the task :)

Regards,

Markus
Hi,

-- Markus Wollny <Markus.Wollny@computec.de> wrote:

> How can we reduce swap activity further? I think it's quite high
> (see the top output).

You may try to reduce the shared memory and sort memory for postgres; sometimes it's set too high, and it's better not to use too much shared memory. Here are some hints on this topic:

http://www.ca.postgresql.org/docs/momjian/hw_performance/

> We do use EXPLAIN output for finding out whether our indexes are used
> or not (and it seems they are),

EXPLAIN is not only interesting for checking the use of indexes; I also use it to check whether some queries take too long. With "EXPLAIN ANALYZE" in 7.2 you can measure the time a query really needs, not only the planner's estimate.

Ciao
  Alvar

--
// Unterschreiben! http://www.odem.org/informationsfreiheit/
// Internet am Telefon: http://www.teletrust.info/
// Das freieste Medium? http://www.odem.org/insert_coin/
// Blaster: http://www.assoziations-blaster.de/
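P.S. A minimal sketch of the difference (the table and column here are only placeholders, not from Markus's actual schema):

    -- Planner estimate only: cost and estimated row counts.
    EXPLAIN
    SELECT * FROM forum_posts WHERE user_id = 42;

    -- New in 7.2: actually runs the query and reports the real time
    -- spent in each plan node, so you can see where the time goes.
    EXPLAIN ANALYZE
    SELECT * FROM forum_posts WHERE user_id = 42;

If the actual row counts differ wildly from the estimates, the planner is working with stale statistics, and a VACUUM ANALYZE on that table usually helps.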
On Tue, 25 Jun 2002, Markus Wollny wrote:

> It's a postgresql-7.2.1 installation under SuSE 7.3 on a 1GB RAM,
> 4xPIII550MHz Xeon machine with ~30MB storage on RAID 5 (3 disks);
> the filesystem is ext3.

I assume you mean 30 GB of storage, not 30 MB. You don't say what kind of RAID you have. Software? Hardware? IDE drives? SCSI drives? If IDE, does each drive have its own controller? What drives are you using?

Anyway, if you're doing a lot of updates, you definitely want to get the log file on to a separate disk, and one that's preferably not RAID-5. (RAID-5 writes tend to be very slow.) Consider adding a mirrored pair of disks just for the log file.

> fsync = false

You don't like your data? If not, you should ditch the RAID-5 as well, and use a striped (RAID-0) pair for data and the other disk for logs. Otherwise you should turn on fsync, since the RAID is otherwise not giving you much useful protection should the database crash.

> max_connections = 256
> shared_buffers = 56320

That's a lot of shared buffers: 450 MB worth. Since the OS is also doing caching, you're pretty much maximizing your chances that a block will be cached both in the shared buffers and in the OS buffer cache. Reducing your shared buffers to a few thousand might increase your cache hit rate.

> sort_mem = 64336

This is probably a bit high, since a back-end actually uses more than 3x the amount of memory specified in sort_mem when it does a sort. (This is due to the way memory is counted--this is more accurate in 7.3, I think.) So this will let a backend grow to 200 MB or more when sorting, which may drive it into swap, which will then slow down your sort rather than speeding it up. I generally use 16-32 MB for sort_mem. Note that you can always increase it for a particular connection using the SET command if you're rebuilding indexes or whatever.

> Mem:  1029400K av, 1023660K used,    5740K free, 0K shrd, 2932K buff
> Swap: 2097136K av,  459800K used, 1637336K free          699220K cached

Ouch! You are being hosed over big time; you should never, ever see any significant swapping on a database server. (The I/O is supposed to be there for the database; don't waste it on moving programs in and out of memory!) Start cranking down memory limits all over the place until the swapping goes away. Add more memory if you have to. The first place to start is with shared_buffers and sort_mem.

> A complete dump.sql of the database installation is roughly 300MB in

So it sounds like you don't have much data. It sounds like you can easily fit into 10 GB and have plenty of room to grow to many times your current size. If you're doing a lot of updates, and you don't anticipate really growing into that 30 GB, drop the third volume of your RAID-5 and mirror the other pair of disks, for 10 GB of storage. Writes will be much faster.

cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light. --XTC
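P.S. A concrete sketch of the per-connection SET trick (the values and the index are only examples, not a recommendation for this particular box; in 7.2 sort_mem is specified in kilobytes, and each shared buffer is 8 kB, so 56320 buffers comes to roughly 450 MB):

    -- postgresql.conf: keep the global settings modest, e.g.
    --   sort_mem = 16384           -- 16 MB per sort
    --   shared_buffers = 4096      -- 32 MB of shared buffers

    -- Then raise sort_mem only in the session that needs it:
    SET sort_mem = 65536;           -- 64 MB, this connection only
    CREATE INDEX forum_posts_user_id_idx ON forum_posts (user_id);
    RESET sort_mem;                 -- back to the postgresql.conf value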
"Markus Wollny" <Markus.Wollny@computec.de> writes: > ... We do use > EXPLAIN-output for finding out if our indexes are used or not (and it > seems they are), but there's still so much left to guess-work, like what > bit exactly is causing the high server-loads. You already got lots of good advice from Alvar and Curt, but I just wanted to add that you can learn something about which queries are causing the load with appropriate use of logging and stats. Logging: turn on debug_print_query and show_query_stats, run some test cases, eyeball results in postmaster log to see which queries use the most CPU time and I/O. Stats: see http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/monitoring.html You should turn on stats_command_string and then correlate the pg_stat_activity outputs with the PIDs that are chewing CPU according to 'top'. regards, tom lane
Hi!

> -----Original Message-----
> From: Curt Sampson [mailto:cjs@cynic.net]
> Sent: Tuesday, 25 June 2002 13:53
> To: Markus Wollny
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Urgent: Tuning strategies?
>
> On Tue, 25 Jun 2002, Markus Wollny wrote:
>
> > It's a postgresql-7.2.1 installation under SuSE 7.3 on a 1GB RAM,
> > 4xPIII550MHz Xeon machine with ~30MB storage on RAID 5 (3 disks);
> > the filesystem is ext3.
>
> I assume you mean 30 GB of storage, not 30 MB. You don't say what kind
> of RAID you have. Software? Hardware? IDE drives? SCSI drives? If IDE,
> does each drive have its own controller? What drives are you using?

Yes, sorry: 30GB of storage, hardware RAID 5 on a Smart Array controller with 64MB cache, and 5 (not 3) 18GB SCSI disks.

> Anyway, if you're doing a lot of updates, you definitely want to get
> the log file on to a separate disk, and one that's preferably not
> RAID-5. (RAID-5 writes tend to be very slow.) Consider adding a
> mirrored pair of disks just for the log file.

As far as I can see, the logfile isn't a problem on debug_level 0 (which I always set for production) because there's hardly anything to write except on vacuum analyze (which usually only takes place when hardly anybody does anything on the system anyway).

> > fsync = false
>
> You don't like your data? If not, you should ditch the RAID-5 as
> well, and use a striped (RAID-0) pair for data and the other disk
> for logs. Otherwise you should turn on fsync, since the RAID is
> otherwise not giving you much useful protection should the database
> crash.

There's a UPS connected to the server, ext3 as the filesystem, and backups are performed each day. It's a frontend DB anyway, so apart from the community data we wouldn't even lose one day's updates in the worst-case scenario, so I don't see much point in not setting fsync to false for faster updates. And the community data isn't that valuable to us - we can without doubt risk losing one day's updates.

> > max_connections = 256
> > shared_buffers = 56320
>
> That's a lot of shared buffers: 450 MB worth. Since the OS is also
> doing caching, you're pretty much maximizing your chances that a
> block will be cached both in the shared buffers and in the OS buffer
> cache. Reducing your shared buffers to a few thousand might increase
> your cache hit rate.

Okay, I'll try that - I just followed recommendations like "you just cannot have enough shared buffers". Seems to be wrong, though. Thanks for that hint.

> > sort_mem = 64336
>
> This is probably a bit high, since a back-end actually uses more than
> 3x the amount of memory specified in sort_mem when it does a sort.
> (This is due to the way memory is counted--this is more accurate in
> 7.3, I think.) So this will let a backend grow to 200 MB or more when
> sorting, which may drive it into swap, which will then slow down your
> sort rather than speeding it up. I generally use 16-32 MB for
> sort_mem. Note that you can always increase it for a particular
> connection using the SET command if you're rebuilding indexes or
> whatever.

Ah, that might be one big issue - I often noticed several backends growing far beyond 200MB in size, which got me worried...

> > Mem:  1029400K av, 1023660K used,    5740K free, 0K shrd, 2932K buff
> > Swap: 2097136K av,  459800K used, 1637336K free          699220K cached
>
> Ouch! You are being hosed over big time; you should never, ever see
> any significant swapping on a database server.
> (The I/O is supposed to be there for the database; don't waste it on
> moving programs in and out of memory!) Start cranking down memory
> limits all over the place until the swapping goes away. Add more
> memory if you have to. The first place to start is with
> shared_buffers and sort_mem.
>
> > A complete dump.sql of the database installation is roughly 300MB in
>
> So it sounds like you don't have much data. It sounds like you can
> easily fit into 10 GB and have plenty of room to grow to many times
> your current size. If you're doing a lot of updates, and you don't
> anticipate really growing into that 30 GB, drop the third volume of
> your RAID-5 and mirror the other pair of disks, for 10 GB of storage.
> Writes will be much faster.

There aren't really that many updates or inserts; it's mostly selects. But I'll have to consider this, right enough. Thank you very much for your help! It's hard finding the correct settings for one's own scenario without any prior experience :) So thanks a lot :)
"Markus Wollny" <Markus.Wollny@computec.de> writes: >> (RAID-5 writes tend to be very slow.) Consider adding a >> mirrored pair of disks just for the log file. > As far as I can see, logfile isn't a problem on debug_level 0 (which I > always set for production state) because there's hardly anything to > write except on vacuum analyze (which usually only takes place when > hardly anybody does anything on the system anyway). Curt's talking about the WAL log (pg_xlog/), not the postmaster's debug output. However, if your problems are with SELECTs and not with update operations, then the WAL log is not the source of your problems anyway. regards, tom lane
I'll add one more observation here:

On Tue, 25 Jun 2002, Markus Wollny wrote:

> wal_buffers = 32
> wal_files = 64

Everyone else has already mentioned the shared memory and sort memory, but I've found that while increasing wal_buffers and wal_files helps, 64 WAL files is probably overkill.

I'd suggest this strategy: take your postgresql.conf back to the stock settings, run a dozen or so slow queries at once, then make ONE CHANGE at a time and see how much of a difference it makes. And remember, just because a little helps does NOT mean a lot will. Often it's the first change that makes the biggest difference.

--
"Force has no place where there is need of skill.", "Haste in every
business brings failures.", "This is the bitterest pain among men, to
have much knowledge but no power." -- Herodotus
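P.S. A sketch of that one-change-at-a-time loop (the query below is only a stand-in for one of your real slow community queries; EXPLAIN ANALYZE is just a convenient way to get repeatable timings on 7.2):

    -- Confirm what the running server actually has in effect:
    SHOW shared_buffers;
    SHOW sort_mem;

    -- Time the same representative slow query before and after each
    -- single postgresql.conf change (restart the postmaster between
    -- runs so the change takes effect):
    EXPLAIN ANALYZE
    SELECT * FROM forum_posts
    WHERE topic_id = 12345
    ORDER BY posted_at DESC;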
On Tue, 25 Jun 2002, Markus Wollny wrote:

> There's a UPS connected to the server,

Which can also fail, and which does not protect against, e.g., a kernel panic.

> backups are performed each day,

Well, even given the data loss, how long is it going to take you to do a restore from those backups, compared to just letting the database clean things up and continue on? Or do you not care about downtime, either?

> I don't see much point in not setting fsync to false for faster
> updates. And the community data isn't that valuable to us - we can
> without doubt risk losing one day's updates.

fsync or not makes little speed difference if your log has its own disk.

> Ah, that might be one big issue - I often noticed several backends
> growing far beyond 200MB in size, which got me worried...

Right. Also, for your queries doing sorts, check out the query stats ("set show_query_stats yes"; the results go to the error log) and see if they're swapping. If they are, that's killing your performance.

cjs
--
Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light. --XTC
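P.S. A minimal sketch of that check (assuming show_query_stats can be toggled per session as suggested above; the statistics land in the postmaster's log, not in the psql session, and the query is only a stand-in for one of the real sort-heavy queries):

    -- Turn on per-query resource statistics for this connection:
    SET show_query_stats TO 'on';

    -- Run one of the suspect sorting queries:
    SELECT user_id, count(*)
    FROM forum_posts
    GROUP BY user_id
    ORDER BY count(*) DESC;

    -- Then look at the QUERY STATISTICS block in the postmaster log:
    -- non-trivial page-fault or swap counts there mean the sort has
    -- been pushed out into swap.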