Thread: Database size Vs performance degradation
Morning folks,

Long time listener, first time poster. Having an interesting problem related to performance which I'll try and describe below and hopefully get some enlightenment. First the environment:

Postgres 8.1.8
shared_buffers = 2000
max_fsm_pages = 400000
Redhat Enterprise 4
Running on HP DL380 w/ 4GB RAM, dual 10K HDDs in RAID 0+1
Also running on the server is a tomcat web server and other ancillaries

Now, the problem. We have an application that continually writes a bunch of data to a few tables which is then deleted by a batch job each night. We're adding around 80,000 rows to one table per day and removing around 75,000 that are deemed to be "unimportant". The problem we see is that after a period of time, database access becomes very 'slow' and the load avg on the machine gets up around 5. When this happens, the application using the DB basically grinds to a halt. Checking the stats, the DB size is around 7.5GB; no tables or indexes look to be 'bloated' (we have been using Postgres since 7.3 with the classic index bloat problem) and the auto-vac has been running solidly.

We had this problem around a month ago and again yesterday. Because the application needs reasonably high availability, we couldn't full vacuum, so what we did was a dump and load to another system. What I found here was that after the load, the DB size was around 2.7GB - a decrease of 5GB. Re-loading this back onto the main system, and the world is good.

One observation I've made on the DB system is that the disk I/O seems dreadfully slow...we're at around 75% I/O wait sometimes and the read rates seem quite slow (hdparm says around 2.2MB/sec - 20MB/sec for un-cached reads). I've also observed that the OS is using all of the remaining memory for its disk cache (around 3GB), which seems about the best it can do with the available memory.

Now, clearly we need to examine the need for the application to write and remove so much data, but my main question is:

Why does the size of the database, with so much "un-used" space, seem to impact performance so much? If (in this case) the extra 5GB of space is essentially "unallocated", does it factor into any of the caching or performance metrics that the DBMS uses? And if so, would I be better having a higher shared_buffers rather than relying so much on the OS cache?

Yes, I know we need to upgrade to 8.3 but that's going to take some time :)

Many thanks in advance.

Dave

___
Dave North
dnorth@signiant.com
Signiant - Making Media Move
Visit Signiant at: www.signiant.com <http://www.signiant.com/>
Dave North wrote:
> Morning folks,
> Long time listener, first time poster.

Hi Dave

> Postgres 8.1.8
> shared_buffers = 2000
> max_fsm_pages = 400000
> Redhat Enterprise 4
> Running on HP DL380 w/ 4GB RAM, dual 10K HDDs in RAID 0+1
> Also running on the server is a tomcat web server and other ancillaries

The value of 2000 seems a bit low for shared_buffers perhaps. Oh, and 8.1.13 seems to be the latest bugfix release for 8.1 too.

> Now, the problem. We have an application that continually writes a bunch of data to a few tables which is then deleted by a batch job each night. We're adding around 80,000 rows to one table per day and removing around 75,000 that are deemed to be "unimportant".

[snip]

> We had this problem around a month ago and again yesterday. Because the application needs reasonably high availability, we couldn't full vacuum, so what we did was a dump and load to another system. What I found here was that after the load, the DB size was around 2.7GB - a decrease of 5GB. Re-loading this back onto the main system, and the world is good.

Well, that's pretty much the definition of bloat. Are you sure you're vacuuming enough? I don't have an 8.1 to hand at the moment, but a "vacuum verbose" in 8.2+ gives some details at the end about how many free-space slots need to be tracked. Presumably you're not tracking enough of them, or your vacuuming isn't actually taking place.

Check the size of your database every night. It will rise from 2.7GB, but it should stay roughly static (apart from whatever data you add, of course). If you can keep it so that most of the working set of your database fits in RAM, speed will stay just fine.

> Yes, I know we need to upgrade to 8.3 but that's going to take some time :)

I think you'll like some of the improvements, but it's probably more important to get 8.1.13 installed soon-ish.

--
Richard Huxton
Archonet Ltd
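For anyone wanting to script that nightly size check on 8.1, something along these lines should work (the database name 'mydb' is a placeholder):

  -- total on-disk size of the database
  SELECT pg_size_pretty(pg_database_size('mydb'));

  -- the ten largest tables, the usual bloat suspects
  SELECT relname, pg_size_pretty(pg_relation_size(oid)) AS size
  FROM pg_class
  WHERE relkind = 'r'
  ORDER BY pg_relation_size(oid) DESC
  LIMIT 10;

Logged nightly (say, from the same batch job that does the deletes), a sudden jump in either number is an early warning that vacuum has fallen behind.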
On Wed, 30 Jul 2008, Dave North wrote:
> Running on HP DL380 w/ 4GB RAM, dual 10K HDDs in RAID 0+1

> Checking the stats, the DB size is around 7.5GB;

Doesn't fit in RAM.

> ...after the load, the DB size was around 2.7GB

Does fit in RAM.

> One observation I've made on the DB system is the disk I/O seems dreadfully slow...we're at around 75% I/O wait sometimes and the read rates seem quite slow (hdparm says around 2.2MB/sec - 20MB/sec for un-cached reads).

That's incredibly slow in this day and age, especially from 10krpm HDDs. Definitely worth investigating. However, I think vacuuming more aggressively is going to be your best win at the moment.

Matthew

--
Patron: "I am looking for a globe of the earth."
Librarian: "We have a table-top model over here."
Patron: "No, that's not good enough. Don't you have a life-size?"
Librarian: (pause) "Yes, but it's in use right now."
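To sanity-check hdparm's numbers, a simple sequential-read test looks something like this (the device path is an assumption - on a DL380 with a Smart Array controller the array often shows up as /dev/cciss/c0d0 rather than /dev/sda - and the file path is a placeholder):

  # timed device and cache reads
  hdparm -tT /dev/cciss/c0d0

  # ~1GB sequential read through the filesystem
  dd if=/path/to/some/large/file of=/dev/null bs=8k count=131072

A healthy pair of 10krpm drives should manage several tens of MB/s sequentially; single-digit results point at the controller, its cache settings, or a degraded array.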
-----Original Message-----
From: Richard Huxton [mailto:dev@archonet.com]
Sent: July 30, 2008 8:28 AM
To: Dave North
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Database size Vs performance degradation

Dave North wrote:
> Morning folks,
> Long time listener, first time poster.

Hi Dave

> Postgres 8.1.8
> shared_buffers = 2000
> max_fsm_pages = 400000
> Redhat Enterprise 4
> Running on HP DL380 w/ 4GB RAM, dual 10K HDDs in RAID 0+1
> Also running on the server is a tomcat web server and other ancillaries

The value of 2000 seems a bit low for shared_buffers perhaps. Oh, and 8.1.13 seems to be the latest bugfix release for 8.1 too.

DN: Yeah, I was thinking the same. I spent several hours reading info on this list and other places and it's highly inconclusive about having high or low shared buffs Vs letting the OS disk cache handle it.

> Now, the problem. We have an application that continually writes a bunch of data to a few tables which is then deleted by a batch job each night. We're adding around 80,000 rows to one table per day and removing around 75,000 that are deemed to be "unimportant".

[snip]

> We had this problem around a month ago and again yesterday. Because the application needs reasonably high availability, we couldn't full vacuum, so what we did was a dump and load to another system. What I found here was that after the load, the DB size was around 2.7GB - a decrease of 5GB. Re-loading this back onto the main system, and the world is good.

Well, that's pretty much the definition of bloat. Are you sure you're vacuuming enough?

DN: Well, the auto-vac is kicking off pretty darn frequently...around once every 2 minutes. However, you just made me think of the obvious - is it actually doing anything?! The app is pretty darn write intensive so I wonder if it's actually able to vacuum the tables?

I don't have an 8.1 to hand at the moment, but a "vacuum verbose" in 8.2+ gives some details at the end about how many free-space slots need to be tracked. Presumably you're not tracking enough of them, or your vacuuming isn't actually taking place.

DN: I think you've hit it. Now the next obvious problem is how to make the vac actually vac while maintaining a running system?

Check the size of your database every night. It will rise from 2.7GB, but it should stay roughly static (apart from whatever data you add, of course). If you can keep it so that most of the working set of your database fits in RAM, speed will stay just fine.

DN: Yep, I'm just implementing a size tracker now to keep track of it. It grew from 2.5GB to 7GB in around a month, so it's pretty easy to see big jumps I'd say. Does the auto-vac log its results somewhere by any chance, do you know?

Fantastic post, thanks so much.

Dave

> Yes, I know we need to upgrade to 8.3 but that's going to take some time :)

I think you'll like some of the improvements, but it's probably more important to get 8.1.13 installed soon-ish.

--
Richard Huxton
Archonet Ltd
-----Original Message-----
From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Matthew Wakeling
Sent: July 30, 2008 8:37 AM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Database size Vs performance degradation

On Wed, 30 Jul 2008, Dave North wrote:
> Running on HP DL380 w/ 4GB RAM, dual 10K HDDs in RAID 0+1
> Checking the stats, the DB size is around 7.5GB;

Doesn't fit in RAM.

> ...after the load, the DB size was around 2.7GB

Does fit in RAM.

> One observation I've made on the DB system is the disk I/O seems dreadfully slow...we're at around 75% I/O wait sometimes and the read rates seem quite slow (hdparm says around 2.2MB/sec - 20MB/sec for un-cached reads).

That's incredibly slow in this day and age, especially from 10krpm HDDs. Definitely worth investigating.

DN: Yeah, I was thinking the same thing. Unlike the folks here, I'm no performance whiz but it did seem crazy slow. Given the 10K disks, it seems to me there is most likely something on the RAID array itself that is set sub-optimally. Next thing to look at.

However, I think vacuuming more aggressively is going to be your best win at the moment.

DN: As I just replied to the last (very helpful) chap, I think I need to go see what exactly the vac is vac'ing (autovac that is) because although it's running super frequently, the big question is "is it doing anything" :)

Cheers

Dave
I am guessing that you are using DELETE to remove the 75,000 unimportant rows. Change your batch job to CREATE a new table consisting only of the 5,000 important ones. You can use the "CREATE TABLE table_name AS select_statement" command. Then drop the old table. After that you can use ALTER TABLE to change the name of the new table to that of the old one.

I am not an expert, but if this is a viable solution for you then I think doing it this way will rid you of your bloating problem.

Regards,
Val

--- On Wed, 30/7/08, Dave North <DNorth@signiant.com> wrote:

> From: Dave North <DNorth@signiant.com>
> Subject: [PERFORM] Database size Vs performance degradation
> To: pgsql-performance@postgresql.org
> Date: Wednesday, 30 July, 2008, 1:09 PM
> Morning folks,
> Long time listener, first time poster. Having an interesting problem related to performance which I'll try and describe below and hopefully get some enlightenment. First the environment:
>
> Postgres 8.1.8
> shared_buffers = 2000
> max_fsm_pages = 400000
> Redhat Enterprise 4
> Running on HP DL380 w/ 4GB RAM, dual 10K HDDs in RAID 0+1
> Also running on the server is a tomcat web server and other ancillaries
>
> Now, the problem. We have an application that continually writes a bunch of data to a few tables which is then deleted by a batch job each night. We're adding around 80,000 rows to one table per day and removing around 75,000 that are deemed to be "unimportant". Now, the problem we see is that after a period of time, the database access becomes very 'slow' and the load avg on the machine gets up around 5. When this happens, the application using the DB basically grinds to a halt. Checking the stats, the DB size is around 7.5GB; no tables or indexes look to be 'bloated' (we have been using Postgres since 7.3 with the classic index bloat problem) and the auto-vac has been running solidly.
>
> We had this problem around a month ago and again yesterday. Because the application needs reasonably high availability, we couldn't full vacuum so what we did was a dump and load to another system. What I found here was that after the load, the DB size was around 2.7GB - a decrease of 5GB. Re-loading this back onto the main system, and the world is good.
>
> One observation I've made on the DB system is the disk I/O seems dreadfully slow...we're at around 75% I/O wait sometimes and the read rates seem quite slow (hdparm says around 2.2MB/sec - 20MB/sec for un-cached reads). I've also observed that the OS cache seems to be using all of the remaining memory for its cache (around 3GB) which seems about the best it can do with the available memory.
>
> Now, clearly we need to examine the need for the application to write and remove so much data but my main question is:
>
> Why does the size of the database with so much "un-used" space seem to impact performance so much? If (in this case) the extra 5GB of space is essentially "unallocated", does it factor into any of the caching or performance metrics that the DBMS uses? And if so, would I be better having a higher shared_buffers rather than relying so much on OS cache?
>
> Yes, I know we need to upgrade to 8.3 but that's going to take some time :)
>
> Many thanks in advance.
>
> Dave
>
> ___
> Dave North
> dnorth@signiant.com
> Signiant - Making Media Move
> Visit Signiant at: www.signiant.com <http://www.signiant.com/>
Thank you for the suggestion..much appreciated. Alas, I don't think this will be possible without a change to the application, but it's a good idea nonetheless.

Where I am now is looking at the autovac tuning parameters. I strongly suspect that the 2 tables that are "frequent changers" are just not getting enough cleaning. It's hard to tell, though, because the AV messages are only logged at debug2 and setting debug2 on this server would be a killer. However, just from running the math using the autovac limit and how it's a percentage of table size, I'm pretty sure that we're not vac'ing enough and that reducing the multiplier down from 0.4 would make a significant difference.

My main question was answered though, I think - the growth is NOT normal, which was the stimulus I needed to investigate further.

Thanks again

Dave

> -----Original Message-----
> From: Valentin Bogdanov [mailto:valiouk@yahoo.co.uk]
> Sent: July 30, 2008 10:58 AM
> To: pgsql-performance@postgresql.org; Dave North
> Subject: Re: [PERFORM] Database size Vs performance degradation
>
> I am guessing that you are using DELETE to remove the 75,000 unimportant rows. Change your batch job to CREATE a new table consisting only of the 5,000 important ones. You can use the "CREATE TABLE table_name AS select_statement" command. Then drop the old table. After that you can use ALTER TABLE to change the name of the new table to that of the old one.
>
> I am not an expert, but if this is a viable solution for you then I think doing it this way will rid you of your bloating problem.
>
> Regards,
> Val
>
> --- On Wed, 30/7/08, Dave North <DNorth@signiant.com> wrote:
>
> > From: Dave North <DNorth@signiant.com>
> > Subject: [PERFORM] Database size Vs performance degradation
> > To: pgsql-performance@postgresql.org
> > Date: Wednesday, 30 July, 2008, 1:09 PM
> > Morning folks,
> > Long time listener, first time poster. Having an interesting problem related to performance which I'll try and describe below and hopefully get some enlightenment. First the environment:
> >
> > Postgres 8.1.8
> > shared_buffers = 2000
> > max_fsm_pages = 400000
> > Redhat Enterprise 4
> > Running on HP DL380 w/ 4GB RAM, dual 10K HDDs in RAID 0+1
> > Also running on the server is a tomcat web server and other ancillaries
> >
> > Now, the problem. We have an application that continually writes a bunch of data to a few tables which is then deleted by a batch job each night. We're adding around 80,000 rows to one table per day and removing around 75,000 that are deemed to be "unimportant". Now, the problem we see is that after a period of time, the database access becomes very 'slow' and the load avg on the machine gets up around 5. When this happens, the application using the DB basically grinds to a halt. Checking the stats, the DB size is around 7.5GB; no tables or indexes look to be 'bloated' (we have been using Postgres since 7.3 with the classic index bloat problem) and the auto-vac has been running solidly.
> >
> > We had this problem around a month ago and again yesterday. Because the application needs reasonably high availability, we couldn't full vacuum so what we did was a dump and load to another system. What I found here was that after the load, the DB size was around 2.7GB - a decrease of 5GB. Re-loading this back onto the main system, and the world is good.
> >
> > One observation I've made on the DB system is the disk I/O seems dreadfully slow...we're at around 75% I/O wait sometimes and the read rates seem quite slow (hdparm says around 2.2MB/sec - 20MB/sec for un-cached reads). I've also observed that the OS cache seems to be using all of the remaining memory for its cache (around 3GB) which seems about the best it can do with the available memory.
> >
> > Now, clearly we need to examine the need for the application to write and remove so much data but my main question is:
> >
> > Why does the size of the database with so much "un-used" space seem to impact performance so much? If (in this case) the extra 5GB of space is essentially "unallocated", does it factor into any of the caching or performance metrics that the DBMS uses? And if so, would I be better having a higher shared_buffers rather than relying so much on OS cache?
> >
> > Yes, I know we need to upgrade to 8.3 but that's going to take some time :)
> >
> > Many thanks in advance.
> >
> > Dave
> >
> > ___
> > Dave North
> > dnorth@signiant.com
> > Signiant - Making Media Move
> > Visit Signiant at: www.signiant.com <http://www.signiant.com/>
"Dave North" <DNorth@signiant.com> writes: > From: Richard Huxton [mailto:dev@archonet.com] >> Well, that's pretty much the definition of bloat. Are you sure you're >> vacuuming enough? > DN: Well, the auto-vac is kicking off pretty darn frequently...around > once every 2 minutes. However, you just made me think of the obvious - > is it actually doing anything?! The app is pretty darn write intensive > so I wonder if it's actually able to vacuum the tables? IIRC, the default autovac parameters in 8.1 were pretty darn unaggressive. You should also check for long-running transactions that might be preventing vacuum from removing recently-dead rows. One of the reasons for updating off 8.1 is that finding out what autovac is really doing is hard :-(. I think it does log, but at level DEBUG2 or so, which means that the only way to find out is to accept a huge amount of useless chatter in the postmaster log. Newer releases have a saner logging scheme. regards, tom lane
Dave North wrote:
> Thank you for the suggestion..much appreciated. Alas, I don't think this will be possible without a change to the application but it's a good idea nonetheless.

I assume you mean the "create table as select ..." suggestion (don't forget to include a little quoted material so we'll know what you are replying to :-)

You don't have to change the application. One of the great advantages of Postgres is that even table creation, dropping and renaming are transactional. So you can do the select / drop / rename as a transaction by an external app, and your main application will be none the wiser. In pseudo-SQL:

  begin
  create table new_table as (select * from old_table);
  create index ... on new_table ... (as needed)
  drop table old_table
  alter table new_table rename to old_table
  commit

You should be able to just execute this by hand on a running system, and see if some of your bloat goes away.

Craig
On Wed, 2008-07-30 at 10:02 -0500, Dave North wrote:
> Thank you for the suggestion..much appreciated. Alas, I don't think this will be possible without a change to the application but it's a good idea nonetheless.

Affirmative, Dave. I read you.

If I were in your situation (not having access/desire to change the base application), I'd write a sql script that does something like this:

- Create __new_table__ from old_table     # Read lock on old table
- Rename old_table to __old_table__       # Access Exclusive Lock
- Rename __new_table__ to old_table       # Access Exclusive Lock
- Commit                                  # Now the application can write to the new table
- Sync newly written changes to the new table (these would be written between the creation and the access exclusive lock).
- Drop/Vacuum full/Archive old_table

Well, it would at least let you get the benefits of the rename approach without actually altering the application. Additionally, the application's writes would only be blocked for the duration of the rename itself.

This makes the assumption that these writes aren't strictly necessary immediately (such as via a find or insert construct). If this assumption is false, you would need to lock the table and block the application from writing while you create the temporary table. This has the advantage of not requiring the final sync step.

Sorry if all of this seems redundant, but best of luck!

-Mark
On Wed, 30 Jul 2008, Craig James wrote:
> You don't have to change the application. One of the great advantages of Postgres is that even table creation, dropping and renaming are transactional. So you can do the select / drop / rename as a transaction by an external app, and your main application will be none the wiser. In pseudo-SQL:
>
>   begin
>   create table new_table as (select * from old_table);
>   create index ... on new_table ... (as needed)
>   drop table old_table
>   alter table new_table rename to old_table
>   commit

I believe this SQL snippet could cause data loss, because there is a period during which writes can be made to the old table that will not be copied to the new table.

On a side note, I would be interested to know what happens with locks when renaming tables. For example, if we were to alter the above SQL, and add a "LOCK TABLE old_table IN ACCESS EXCLUSIVE" line, would this fix the problem? What I mean is, if the application tries to run "INSERT INTO old_table ...", and blocks on the lock, when the old_table is dropped, will it resume trying to insert into the dropped table and fail, or will it redirect its attentions to the new table that has been renamed into place?

Also, if a lock is taken on a table, and the table is renamed, does the lock follow the table, or does it stay attached to the table name?

Anyway, surely it's much safer to just run VACUUM manually?

Matthew

--
Change is inevitable, except from vending machines.
On Wed, 30 Jul 2008, Dave North wrote:
> One observation I've made on the DB system is the disk I/O seems dreadfully slow...we're at around 75% I/O wait sometimes and the read rates seem quite slow (hdparm says around 2.2MB/sec - 20MB/sec for un-cached reads).

This is typically what happens when you are not buffering enough of the right information in RAM, such that there are lots of small reads and writes to the disk that involve lots of seeking. You'll only get a couple of MB/s out of a disk if it has to move all over the place to retrieve the blocks you asked for.

Setting shared_buffers too low makes this more likely to happen, because PostgreSQL has to constantly read and write out random blocks to make space to read new ones in its limited work area. The OS buffers some of that, but not as well as if the database server has a bit more RAM for itself, because then the blocks it most uses won't leave that area.

> And if so, would I be better having a higher shared_buffers rather than relying so much on OS cache?

The main situation where making shared_buffers too high is a problem on 8.1 involves checkpoints writing out too much information at once. You didn't mention changing checkpoint_segments on your system; if it's at its default of 3, your system is likely continuously doing tiny checkpoints, which might be another reason why you're seeing so much scattered seek behavior above. Something >30 would be more appropriate for checkpoint_segments on your server.

I'd suggest re-tuning as follows:

1) Increase shared_buffers to 10,000, test. Things should be a bit faster.

2) Increase checkpoint_segments to 30, test. What you want to watch for here is whether there are periods where the server seems to freeze for a couple of seconds. That's a "checkpoint spike". If this happens, reduce checkpoint_segments to some sort of middle ground; some people never get above 10 before it's a problem.

3) Increase shared_buffers in larger chunks, as long as you don't see any problematic spikes; you might usefully keep going until it's set to at least 100,000 before improvements level off.

> I spent several hours reading info on this list and other places and it's highly inconclusive about having high or low shared buffs Vs letting the OS disk cache handle it.

A lot of the material floating around the 'net was written circa PostgreSQL 8.0 or earlier, and you need to ignore any advice in this area from those articles. I think if you rescan everything with that filter in place you'll find it's not so inconclusive that increasing shared_buffers is a win, so long as it doesn't trigger checkpoint spikes (on your platform at least; there are still Windows issues). Check out my "Inside the PostgreSQL Buffer Cache" presentation at http://www.westnet.com/~gsmith/content/postgresql for an excess of detail on this topic. Unfortunately the usage_count recommendations given there are impractical for use on 8.1 because pg_buffercache doesn't include that info, but the general "shared_buffers vs. OS cache" theory and suggestions apply.

The other parameter I hope you're setting correctly for your system is effective_cache_size, which should be at least 2GB for your server (exact sizing depends on how much RAM is left over after the Tomcat app is running).

All this is something to consider in parallel with the vacuum investigation you're doing.
It looks like your autovacuum isn't anywhere close to aggressive enough for your workload, which is not unusual at all for 8.1, and that may actually be the majority of your problem.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
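As a concrete sketch of steps 1 and 2 in postgresql.conf form (values taken straight from the advice above and intended as starting points to test, not gospel; the effective_cache_size figure assumes roughly 2GB is left for the OS cache):

  shared_buffers = 10000          # step 1: 10000 x 8KB = ~80MB
  checkpoint_segments = 30        # step 2: watch for checkpoint spikes
  effective_cache_size = 250000   # ~2GB at 8KB per page

Note that, as discussed further down the thread, shared_buffers and checkpoint_segments need a postmaster restart rather than a reload.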
On Wed, 2008-07-30 at 17:16 +0100, Matthew Wakeling wrote:
> I believe this SQL snippet could cause data loss, because there is a period during which writes can be made to the old table that will not be copied to the new table.

It could indeed cause data loss.

> On a side note, I would be interested to know what happens with locks when renaming tables. For example, if we were to alter the above SQL, and add a "LOCK TABLE old_table IN ACCESS EXCLUSIVE" line, would this fix the problem? What I mean is, if the application tries to run "INSERT INTO old_table ...", and blocks on the lock, when the old_table is dropped, will it resume trying to insert into the dropped table and fail, or will it redirect its attentions to the new table that has been renamed into place?

Yes, that would resolve the issue. It would also block the application's writes for however long the process takes (this could be unacceptable).

> Also, if a lock is taken on a table, and the table is renamed, does the lock follow the table, or does it stay attached to the table name?

The lock will follow the table itself (rather than the table name).

> Anyway, surely it's much safer to just run VACUUM manually?

Generally, you would think so. The problem comes from Vacuum blocking the application process' writes.

-Mark
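Combining Matthew's lock idea with Craig's earlier snippet, the locked variant would look roughly like this (the predicate marking which rows to keep is a placeholder):

  BEGIN;
  LOCK TABLE old_table IN ACCESS EXCLUSIVE MODE;
  CREATE TABLE new_table AS
      SELECT * FROM old_table WHERE important = true;  -- placeholder predicate
  -- recreate indexes and grants on new_table here
  DROP TABLE old_table;
  ALTER TABLE new_table RENAME TO old_table;
  COMMIT;

One caveat worth testing before relying on this: since locks attach to the table's OID rather than its name, a statement that was already waiting on the lock has generally resolved "old_table" to the old OID, so when the DROP commits it may fail with a "relation does not exist"-style error instead of retrying against the renamed table.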
Dave North wrote:
> -----Original Message-----
> From: Richard Huxton [mailto:dev@archonet.com]
>
> Well, that's pretty much the definition of bloat. Are you sure you're vacuuming enough?
>
> DN: Well, the auto-vac is kicking off pretty darn frequently...around once every 2 minutes. However, you just made me think of the obvious - is it actually doing anything?! The app is pretty darn write intensive so I wonder if it's actually able to vacuum the tables?

If you've got a big batch delete, it can't hurt to manually vacuum that table immediately afterwards.

--
Richard Huxton
Archonet Ltd
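For example, tacked onto the end of the nightly batch job (the table name is assumed):

  VACUUM VERBOSE ANALYZE my_batch_table;

The VERBOSE output also reports how many dead row versions were removed, which is a quick way to confirm the vacuum actually achieved something.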
> -----Original Message-----
> From: Greg Smith [mailto:gsmith@gregsmith.com]
> Sent: July 30, 2008 12:48 PM
> To: Dave North
> Cc: pgsql-performance@postgresql.org
> Subject: Re: [PERFORM] Database size Vs performance degradation
>
> On Wed, 30 Jul 2008, Dave North wrote:
>
> > One observation I've made on the DB system is the disk I/O seems dreadfully slow...we're at around 75% I/O wait sometimes and the read rates seem quite slow (hdparm says around 2.2MB/sec - 20MB/sec for un-cached reads).
>
> This is typically what happens when you are not buffering enough of the right information in RAM, such that there are lots of small reads and writes to the disk that involve lots of seeking. You'll only get a couple of MB/s out of a disk if it has to move all over the place to retrieve the blocks you asked for.

I could totally see that, except on another identical server the MAX rate I was able to get under no load was 20MB/sec, which just seems awfully low for 10K rpm disks to me (but granted, I'm not a performance analysis expert by any stretch).

> Setting shared_buffers too low makes this more likely to happen, because PostgreSQL has to constantly read and write out random blocks to make space to read new ones in its limited work area. The OS buffers some of that, but not as well as if the database server has a bit more RAM for itself, because then the blocks it most uses won't leave that area.

OK, this makes sense: a "specialist" cache will provide more benefits than a "general" cache. Got it.

> > And if so, would I be better having a higher shared_buffers rather than relying so much on OS cache?
>
> The main situation where making shared_buffers too high is a problem on 8.1 involves checkpoints writing out too much information at once. You didn't mention changing checkpoint_segments on your system; if it's at its default of 3, your system is likely continuously doing tiny checkpoints, which might be another reason why you're seeing so much scattered seek behavior above. Something >30 would be more appropriate for checkpoint_segments on your server.

It appears ours is currently set to 12, but this is something I'll have a play with as well.

> I'd suggest re-tuning as follows:
>
> 1) Increase shared_buffers to 10,000, test. Things should be a bit faster.
>
> 2) Increase checkpoint_segments to 30, test. What you want to watch for here is whether there are periods where the server seems to freeze for a couple of seconds. That's a "checkpoint spike". If this happens, reduce checkpoint_segments to some sort of middle ground; some people never get above 10 before it's a problem.
>
> 3) Increase shared_buffers in larger chunks, as long as you don't see any problematic spikes; you might usefully keep going until it's set to at least 100,000 before improvements level off.

Do you happen to know if these are "reload" or "restart" tunable parameters? I think I've read somewhere before that they are restart parameters (assuming I've set SHMMAX high enough of course).

> > I spent several hours reading info on this list and other places and it's highly inconclusive about having high or low shared buffs Vs letting the OS disk cache handle it.
>
<SNIP good reading info>
>
> The other parameter I hope you're setting correctly for your system is effective_cache_size, which should be at least 2GB for your server (exact sizing depends on how much RAM is left over after the Tomcat app is running).

Now, this is interesting.
I'm seeing just from top and vmstat that the OS cache is around 2-3GB pretty consistently with everything running under full load. So it seems I should be able to pretty safely set this to 2GB as you suggest.

> All this is something to consider in parallel with the vacuum investigation you're doing. It looks like your autovacuum isn't anywhere close to aggressive enough for your workload, which is not unusual at all for 8.1, and that may actually be the majority of your problem.

Yeah, I've pretty well convinced myself that it is. We have 2 tables that see this add/remove pattern, where we add something like 100,000 rows per table and then delete around 75,000 per night...what I've just done is added entries into pg_autovacuum to change the vac_scale_factor down to 0.2 for both of these tables. By my calcs this will lower the vac threshold for these tables from 221,000 to 111,000 tuples each. Even this may be too high with max_fsm_pages at 400,000, but I can go lower if needed. As per other threads, my only real metric to measure this is the overall database size, since the better AV messaging is an 8.3 enhancement.

I'm starting with just changing the autovac parameters to see how that affects things. I'm reluctant to change multiple parameters in one shot (although they all make sense logically) just in case one goes awry ;)

I have to say, I've learnt a whole load from you folks here this morning...very enlightening. I'm now moving on to your site Greg! :)

Cheers

Dave
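For reference, an 8.1 pg_autovacuum entry of the sort Dave describes might look like this (the table name and the base thresholds are placeholders; a negative value in a column means "fall back to the global setting"):

  INSERT INTO pg_autovacuum
      (vacrelid, enabled,
       vac_base_thresh, vac_scale_factor,
       anl_base_thresh, anl_scale_factor,
       vac_cost_delay, vac_cost_limit)
  VALUES
      ('my_busy_table'::regclass, true,
       1000, 0.2,
       500, 0.1,
       -1, -1);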
Mark Roberts <mailing_lists@pandapocket.com> writes:
> On Wed, 2008-07-30 at 17:16 +0100, Matthew Wakeling wrote:
>> Anyway, surely it's much safer to just run VACUUM manually?

> Generally, you would think so. The problem comes from Vacuum blocking the application process' writes.

Huh? Vacuum doesn't block writes.

			regards, tom lane
Dave North wrote:
> [...]
>> I'd suggest re-tuning as follows:
>>
>> 1) Increase shared_buffers to 10,000, test. Things should be a bit faster.
>>
>> 2) Increase checkpoint_segments to 30, test. What you want to watch for here is whether there are periods where the server seems to freeze for a couple of seconds. That's a "checkpoint spike". If this happens, reduce checkpoint_segments to some sort of middle ground; some people never get above 10 before it's a problem.
>>
>> 3) Increase shared_buffers in larger chunks, as long as you don't see any problematic spikes; you might usefully keep going until it's set to at least 100,000 before improvements level off.
>
> Do you happen to know if these are "reload" or "restart" tunable parameters? I think I've read somewhere before that they are restart parameters (assuming I've set SHMMAX high enough of course)

shared_buffers and checkpoint_segments both need a restart.

> [...]
> I have to say, I've learnt a whole load from you folks here this morning...very enlightening. I'm now moving on to your site Greg! :)

There's much to learn from Greg's site. I was kinda impressed by all the good articles in it.

--
Guillaume.
http://www.postgresqlfr.org
http://dalibo.com
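If you'd rather query than memorize which is which, pg_settings shows it; a context of 'postmaster' means a full restart is required, while 'sighup' means a reload is enough:

  SELECT name, setting, context
  FROM pg_settings
  WHERE name IN ('shared_buffers', 'checkpoint_segments', 'effective_cache_size');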
On Wed, 2008-07-30 at 13:51 -0400, Tom Lane wrote:
> Huh? Vacuum doesn't block writes.
>
> 			regards, tom lane

Of course, you are correct. I was thinking of Vacuum full, which is recommended for use when you're deleting the majority of rows in a table.

http://www.postgresql.org/docs/8.1/interactive/sql-vacuum.html

-Mark
Valentin Bogdanov <valiouk@yahoo.co.uk> wrote:
> I am guessing that you are using DELETE to remove the 75,000 unimportant rows. Change your batch job to CREATE a new table consisting only of the 5,000 important ones. You can use the "CREATE TABLE table_name AS select_statement" command. Then drop the old table. After that you can use ALTER TABLE to change the name of the new table to that of the old one.

I have a similar, but different situation, where I TRUNCATE a table with 60k rows every hour, and refill it with new rows. Would it be better (concerning bloat) to just DROP the table every hour and recreate it, than to TRUNCATE it? Or does TRUNCATE take care of the bloat as well as a DROP and CREATE would?

I am running 8.3.3 in a 48 MB RAM Xen, so performance matters much.

--
Miernik
http://miernik.name/
On Wed, 2008-07-30 at 23:58 +0200, Miernik wrote:
> I have a similar, but different situation, where I TRUNCATE a table with 60k rows every hour, and refill it with new rows. Would it be better (concerning bloat) to just DROP the table every hour and recreate it, than to TRUNCATE it? Or does TRUNCATE take care of the bloat as well as a DROP and CREATE would?
>
> I am running 8.3.3 in a 48 MB RAM Xen, so performance matters much.

I've successfully used truncate for this purpose (server 8.2.6):

-----------------------------
psql=> select pg_relation_size(oid) from pg_class where relname = 'asdf';
 pg_relation_size
------------------
            32768
(1 row)

Time: 0.597 ms

psql=> truncate asdf;
TRUNCATE TABLE
Time: 1.069 ms

psql=> select pg_relation_size(oid) from pg_class where relname = 'asdf';
 pg_relation_size
------------------
                0
(1 row)

-Mark
Mark Roberts wrote:
> On Wed, 2008-07-30 at 13:51 -0400, Tom Lane wrote:
>
>> Huh? Vacuum doesn't block writes.
>>
>> 			regards, tom lane
>
> Of course, you are correct. I was thinking of Vacuum full, which is recommended for use when you're deleting the majority of rows in a table.
>
> http://www.postgresql.org/docs/8.1/interactive/sql-vacuum.html

Maybe I'm wrong, but if this "bulk insert and delete" process is cyclical then you don't need vacuum full. Released tuples will fill up again with fresh data the next day - after a regular vacuum. I have such a situation at work: the size of the database on disk is 60GB and is stable.

--
Andrzej Zawadzki
On Thu, 31 Jul 2008, Andrzej Zawadzki wrote:
> Maybe I'm wrong, but if this "bulk insert and delete" process is cyclical then you don't need vacuum full. Released tuples will fill up again with fresh data the next day - after a regular vacuum.

Yes, a regular manual vacuum will prevent the table from growing more than it needs to. However, a vacuum full is required to actually reduce the size of the table from 7.5G to 2.7G if that hasn't been done on the production system already.

Matthew

--
It's one of those irregular verbs - "I have an independent mind," "You are an eccentric," "He is round the twist."
			-- Bernard Woolly, Yes Prime Minister
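If you do take that route during a maintenance window, a sketch of the one-off shrink looks like this (the table name is assumed; VACUUM FULL takes an exclusive lock, and it tends to leave the indexes bloated, hence the REINDEX):

  VACUUM FULL VERBOSE my_table;
  REINDEX TABLE my_table;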
2008/8/1 Matthew Wakeling <matthew@flymine.org>:
> On Thu, 31 Jul 2008, Andrzej Zawadzki wrote:
>> Maybe I'm wrong, but if this "bulk insert and delete" process is cyclical then you don't need vacuum full. Released tuples will fill up again with fresh data the next day - after a regular vacuum.
>
> Yes, a regular manual vacuum will prevent the table from growing more than it needs to. However, a vacuum full is required to actually reduce the size of the table from 7.5G to 2.7G if that hasn't been done on the production system already.

One good option is to upgrade to pg 8.3 to fix the problem. Autovacuum plus HOT brought a significant performance win compared with 8.2 and earlier versions. :)

Kind Regards,

--
Fernando Ike
http://www.midstorm.org/~fike/weblog
Hello to all,

I have a Linux 2.6.24.2-xxxx-std-ipv4-64 #3 SMP Tue Feb 12 12:27:47 CET 2008 x86_64 Intel(R) Xeon(R) CPU X5355 @ 2.66GHz GenuineIntel GNU/Linux with 8Gb of memory, using SATA II disks in RAID 1 (I know that is bad, but I will change it as quickly as I can).

I have a database of 38Go and take 6Go per week. I have a lot of update and insert, especially in 8 tables. 2 tables are used for temporary storage, so I write something like 15000 requests per 2 minutes and empty it into 10 min. I'm making some update or select on tables including more than 20 millions of entrance.

I have the following postgresql.conf settings:

# -----------------------------
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Please note that re-commenting a setting is NOT sufficient to revert it
# to the default value, unless you restart the postmaster.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# "pg_ctl reload". Some settings, such as listen_addresses, require
# a postmaster shutdown and restart to take effect.

#---------------------------------------------------------------------------
# FILE LOCATIONS
#---------------------------------------------------------------------------

# The default values of these variables are driven from the -D command line
# switch or PGDATA environment variable, represented here as ConfigDir.

#data_directory = 'ConfigDir'           # use data in another directory
#hba_file = 'ConfigDir/pg_hba.conf'     # host-based authentication file
#ident_file = 'ConfigDir/pg_ident.conf' # IDENT configuration file

# If external_pid_file is not explicitly set, no extra pid file is written.
#external_pid_file = '(none)'           # write an extra pid file

#---------------------------------------------------------------------------
# CONNECTIONS AND AUTHENTICATION
#---------------------------------------------------------------------------

# - Connection Settings -

listen_addresses = 'xxx.xxx.xxx.xxx'    # what IP address(es) to listen on;
                                        # comma-separated list of addresses;
                                        # defaults to 'localhost', '*' = all
port = 5432
max_connections = 624
# note: increasing max_connections costs ~400 bytes of shared memory per
# connection slot, plus lock space (see max_locks_per_transaction). You
# might also need to raise shared_buffers to support more connections.
#superuser_reserved_connections = 2
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777         # octal
#bonjour_name = ''                      # defaults to the computer name

# - Security & Authentication -

#authentication_timeout = 60            # 1-600, in seconds
#ssl = off
#password_encryption = on
#db_user_namespace = off

# Kerberos
#krb_server_keyfile = ''
#krb_srvname = 'postgres'
#krb_server_hostname = ''               # empty string matches any keytab entry
#krb_caseins_users = off

# - TCP Keepalives -
# see 'man 7 tcp' for details

tcp_keepalives_idle = 300               # TCP_KEEPIDLE, in seconds;
                                        # 0 selects the system default
#tcp_keepalives_interval = 0            # TCP_KEEPINTVL, in seconds;
                                        # 0 selects the system default
#tcp_keepalives_count = 0               # TCP_KEEPCNT;
                                        # 0 selects the system default

#---------------------------------------------------------------------------
# RESOURCE USAGE (except WAL)
#---------------------------------------------------------------------------

# - Memory -

shared_buffers = 250000                 # min 16 or max_connections*2, 8KB each
temp_buffers = 500                      # min 100, 8KB each
max_prepared_transactions = 200         # can be 0 or more
# note: increasing max_prepared_transactions costs ~600 bytes of shared memory
# per transaction slot, plus lock space (see max_locks_per_transaction).
work_mem = 9000                         # min 64, size in KB
maintenance_work_mem = 5000             # min 1024, size in KB
max_stack_depth = 8192                  # min 100, size in KB

# - Free Space Map -

max_fsm_pages = 100000                  # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 5000                # min 100, ~70 bytes each

# - Kernel Resource Usage -

#max_files_per_process = 1000           # min 25
#preload_libraries = ''

# - Cost-Based Vacuum Delay -

vacuum_cost_delay = 5                   # 0-1000 milliseconds
vacuum_cost_page_hit = 10               # 0-10000 credits
vacuum_cost_page_miss = 100             # 0-10000 credits
vacuum_cost_page_dirty = 20             # 0-10000 credits
vacuum_cost_limit = 500                 # 0-10000 credits

# - Background writer -

bgwriter_delay = 50                     # 10-10000 milliseconds between rounds
bgwriter_lru_percent = 1.0              # 0-100% of LRU buffers scanned/round
bgwriter_lru_maxpages = 25              # 0-1000 buffers max written/round
bgwriter_all_percent = 0.333            # 0-100% of all buffers scanned/round
bgwriter_all_maxpages = 50              # 0-1000 buffers max written/round

#---------------------------------------------------------------------------
# WRITE AHEAD LOG
#---------------------------------------------------------------------------

# - Settings -

#fsync = on                             # turns forced synchronization on or off
#wal_sync_method = fsync                # the default is the first option
                                        # supported by the operating system:
                                        #   open_datasync
                                        #   fdatasync
                                        #   fsync
                                        #   fsync_writethrough
                                        #   open_sync
#full_page_writes = on                  # recover from partial page writes
wal_buffers = 16                        # min 4, 8KB each
commit_delay = 500                      # range 0-100000, in microseconds
commit_siblings = 50                    # range 1-1000

# - Checkpoints -

checkpoint_segments = 50                # in logfile segments, min 1, 16MB each
checkpoint_timeout = 1800               # range 30-3600, in seconds
checkpoint_warning = 180                # in seconds, 0 is off

# - Archiving -

#archive_command = ''                   # command to use to archive a logfile
                                        # segment

#---------------------------------------------------------------------------
# QUERY TUNING
#---------------------------------------------------------------------------

# - Planner Method Configuration -

#enable_bitmapscan = on
#enable_hashagg = on
#enable_hashjoin = on
#enable_indexscan = on
#enable_mergejoin = on
#enable_nestloop = on
#enable_seqscan = on
#enable_sort = on
#enable_tidscan = on

# - Planner Cost Constants -

effective_cache_size = 625000           # typically 8KB each
random_page_cost = 3                    # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01                  # (same)
#cpu_index_tuple_cost = 0.001           # (same)
#cpu_operator_cost = 0.0025             # (same)

# - Genetic Query Optimizer -

#geqo = on
#geqo_threshold = 12
#geqo_effort = 5                        # range 1-10
#geqo_pool_size = 0                     # selects default based on effort
#geqo_generations = 0                   # selects default based on effort
#geqo_selection_bias = 2.0              # range 1.5-2.0

# - Other Planner Options -

#default_statistics_target = 10         # range 1-1000
#constraint_exclusion = off
#from_collapse_limit = 8
#join_collapse_limit = 8                # 1 disables collapsing of explicit JOINs

#---------------------------------------------------------------------------
# ERROR REPORTING AND LOGGING
#---------------------------------------------------------------------------

# - Where to Log -

#log_destination = 'stderr'             # Valid values are combinations of
                                        # stderr, syslog and eventlog,
                                        # depending on platform.

# This is used when logging to stderr:
#redirect_stderr = off                  # Enable capturing of stderr into log files

# These are only used if redirect_stderr is on:
#log_directory = 'pg_log'               # Directory where log files are written
                                        # Can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
                                        # Log file name pattern.
                                        # Can include strftime() escapes
#log_truncate_on_rotation = off         # If on, any existing log file of the same
                                        # name as the new log file will be truncated
                                        # rather than appended to. But such truncation
                                        # only occurs on time-driven rotation, not on
                                        # restarts or size-driven rotation. Default is
                                        # off, meaning append to existing files
                                        # in all cases.
#log_rotation_age = 1440                # Automatic rotation of logfiles will happen
                                        # after so many minutes. 0 to disable.
#log_rotation_size = 10240              # Automatic rotation of logfiles will happen
                                        # after so many kilobytes of log output.
                                        # 0 to disable.

# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'

# - When to Log -

#client_min_messages = notice           # Values, in order of decreasing detail:
                                        #   debug5, debug4, debug3, debug2, debug1,
                                        #   log, notice, warning, error
#log_min_messages = notice              # Values, in order of decreasing detail:
                                        #   debug5, debug4, debug3, debug2, debug1,
                                        #   info, notice, warning, error, log,
                                        #   fatal, panic
#log_error_verbosity = default          # terse, default, or verbose messages
#log_min_error_statement = panic        # Values in order of increasing severity:
                                        #   debug5, debug4, debug3, debug2, debug1,
                                        #   info, notice, warning, error, panic(off)
#log_min_duration_statement = -1        # -1 is disabled, 0 logs all statements
                                        # and their durations, in milliseconds.
#silent_mode = off                      # DO NOT USE without syslog or redirect_stderr

# - What to Log -

#debug_print_parse = off
#debug_print_rewritten = off
#debug_print_plan = off
#debug_pretty_print = off
#log_connections = off
#log_disconnections = off
#log_duration = off
#log_line_prefix = ''                   # Special values:
                                        #   %u = user name
                                        #   %d = database name
                                        #   %r = remote host and port
                                        #   %h = remote host
                                        #   %p = PID
                                        #   %t = timestamp (no milliseconds)
                                        #   %m = timestamp with milliseconds
                                        #   %i = command tag
                                        #   %c = session id
                                        #   %l = session line number
                                        #   %s = session start timestamp
                                        #   %x = transaction id
                                        #   %q = stop here in non-session processes
                                        #   %% = '%'
                                        # e.g. '<%u%%%d> '
#log_statement = 'none'                 # none, mod, ddl, all
#log_hostname = off

#---------------------------------------------------------------------------
# RUNTIME STATISTICS
#---------------------------------------------------------------------------

# - Statistics Monitoring -

#log_parser_stats = off
#log_planner_stats = off
#log_executor_stats = off
#log_statement_stats = off

# - Query/Index Statistics Collector -

stats_start_collector = on
#stats_command_string = off
#stats_block_level = off
stats_row_level = on
#stats_reset_on_server_start = off

#---------------------------------------------------------------------------
# AUTOVACUUM PARAMETERS
#---------------------------------------------------------------------------

autovacuum = on                         # enable autovacuum subprocess?
autovacuum_naptime = 180                # time between autovacuum runs, in secs
autovacuum_vacuum_threshold = 100000    # min # of tuple updates before vacuum
autovacuum_analyze_threshold = 9000     # min # of tuple updates before analyze
autovacuum_vacuum_scale_factor = 0.4    # fraction of rel size before vacuum
autovacuum_analyze_scale_factor = 0.2   # fraction of rel size before analyze
autovacuum_vacuum_cost_delay = -1       # default vacuum cost delay for autovac,
                                        # -1 means use vacuum_cost_delay
autovacuum_vacuum_cost_limit = -1       # default vacuum cost limit for autovac,
                                        # -1 means use vacuum_cost_limit

#---------------------------------------------------------------------------
# CLIENT CONNECTION DEFAULTS
#---------------------------------------------------------------------------

# - Statement Behavior -

#search_path = '$user,public'           # schema names
#default_tablespace = ''                # a tablespace name, '' uses the default
#check_function_bodies = on
#default_transaction_isolation = 'read committed'
#default_transaction_read_only = off
#statement_timeout = 0                  # 0 is disabled, in milliseconds

# - Locale and Formatting -

#datestyle = 'iso, mdy'
#timezone = unknown                     # actually, defaults to TZ environment setting
#australian_timezones = off
#extra_float_digits = 0                 # min -15, max 2
#client_encoding = sql_ascii            # actually, defaults to database encoding

# These settings are initialized by initdb -- they might be changed
#lc_messages = 'fr_FR@euro'             # locale for system error message strings
#lc_monetary = 'fr_FR@euro'             # locale for monetary formatting
#lc_numeric = 'fr_FR@euro'              # locale for number formatting
#lc_time = 'fr_FR@euro'                 # locale for time formatting
lc_messages = 'fr_FR.UTF-8'             # locale for system error message strings
lc_monetary = 'fr_FR.UTF-8'             # locale for monetary formatting
lc_numeric = 'fr_FR.UTF-8'              # locale for number formatting
lc_time = 'fr_FR.UTF-8'                 # locale for time formatting

# - Other Defaults -

#explain_pretty_print = on
#dynamic_library_path = '$libdir'

#---------------------------------------------------------------------------
# LOCK MANAGEMENT
#---------------------------------------------------------------------------

#deadlock_timeout = 1000                # in milliseconds
#max_locks_per_transaction = 64         # min 10
# note: each lock table slot uses ~220 bytes of shared memory, and there are
# max_locks_per_transaction * (max_connections + max_prepared_transactions)
# lock table slots.
#---------------------------------------------------------------------------
# VERSION/PLATFORM COMPATIBILITY
#---------------------------------------------------------------------------

# - Previous Postgres Versions -

#add_missing_from = off
#backslash_quote = safe_encoding        # on, off, or safe_encoding
#default_with_oids = off
#escape_string_warning = off
#regex_flavor = advanced                # advanced, extended, or basic
#sql_inheritance = on

# - Other Platforms & Clients -

#transform_null_equals = off

#---------------------------------------------------------------------------
# CUSTOMIZED OPTIONS
#---------------------------------------------------------------------------

#custom_variable_classes = ''           # list of custom variable class names

I'm sure that it could be more optimised. I don't know anything about WAL, autovacuum, fsm, bgwriter, kernel process, geqo or planner cost settings.

Thanks to you all in advance for your precious help.

Regards

David
On Wed, Aug 6, 2008 at 6:12 PM, dforum <dforums@vieonet.com> wrote:
> Hello to all,
>
> I have a Linux 2.6.24.2-xxxx-std-ipv4-64 #3 SMP Tue Feb 12 12:27:47 CET 2008 x86_64 Intel(R) Xeon(R) CPU X5355 @ 2.66GHz GenuineIntel GNU/Linux with 8Gb of memory, using SATA II disks in RAID 1 (I know that is bad, but

<snip>

this is likely your problem...with fsync on (as you have it), you will be lucky to get a couple of hundred transactions/sec out of the database. you are probably just exceeding the operational capabilities of the hardware, so you probably need to upgrade or turn off fsync (which means data corruption in the event of a hard crash).

merlin
Tx for your reply.

You mean that RAID uses the fsync method for keeping a copy of the data?

So you're inviting me to deactivate fsync to increase the performance?

Deactivating fsync, my second disk will not be up to date, so if the machine crashes, I won't be able to get the server working quickly??? But if I use a second machine to replicate the database, I escape this problem, isn't it?

If I understand right, could you tell me how to deactivate fsync please?

Best regards

David

Merlin Moncure wrote:
> On Wed, Aug 6, 2008 at 6:12 PM, dforum <dforums@vieonet.com> wrote:
>> Hello to all,
>>
>> I have a Linux 2.6.24.2-xxxx-std-ipv4-64 #3 SMP Tue Feb 12 12:27:47 CET 2008 x86_64 Intel(R) Xeon(R) CPU X5355 @ 2.66GHz GenuineIntel GNU/Linux with 8Gb of memory, using SATA II disks in RAID 1 (I know that is bad, but
>
> <snip>
>
> this is likely your problem...with fsync on (as you have it), you will be lucky to get a couple of hundred transactions/sec out of the database. you are probably just exceeding the operational capabilities of the hardware, so you probably need to upgrade or turn off fsync (which means data corruption in the event of a hard crash).
>
> merlin
dforum wrote:
> Tx for your reply.
>
> You mean that RAID uses the fsync method for keeping a copy of the data?

No, Merlin means PostgreSQL will issue a sync to force WAL to actual disk.

> So you're inviting me to deactivate fsync to increase the performance?

He means you might have to if you can't afford new hardware. Is disk activity the problem? Have you looked at the output of "vmstat" to check?

> Deactivating fsync, my second disk will not be up to date,

No - the RAID stuff is happening in the operating system.

> so if the machine crashes, I won't be able to get the server working quickly???

Not "quickly", perhaps not "at all".

> But if I use a second machine to replicate the database, I escape this problem, isn't it?

You reduce the chance of a single failure causing disaster.

> If I understand right, could you tell me how to deactivate fsync please?

There's an "fsync = on" setting in your postgresql.conf, but don't change it yet.

> I have a database of 38Go and take 6Go per week.

What do you mean by "take 6Go per week"? You update/delete that much data? It's growing by that amount each week?

> I have a lot of update and insert, especially in 8 tables. 2 tables are used for temporary storage, so I write something like 15000 requests per 2 minutes and empty it into 10 min.

I'm not sure what "15000 requests per 2 minutes and empty it into 10 min" means. Do you have 7500 requests per minute? Are these updates? To the "temporary storage"? What is this "temporary storage" - an ordinary table?

> I'm making some update or select on tables including more than 20 millions of entrance.

Again, I'm not sure what this means.

Oh - *important* - which version of PostgreSQL are you running? Is an upgrade practical?

Looking at your postgresql.conf settings:

  max_connections = 624

That's an odd number. Do you usually have that many connections? What are they doing? They can't all be active; the machine you've got wouldn't cope.

  shared_buffers = 250000
  work_mem = 9000
  temp_buffers = 500

These three are important. The shared_buffers are workspace shared between all backends, and you've allocated about 2GB. You've also set work_mem=9MB, which is how much each backend can use for a single sort. That means it can use double or triple that in a complex query. If you're using temporary tables, then you'll want to make sure the temp_buffers setting is correct.

I can't say whether these figures are good or bad without knowing how the database is being used.

  effective_cache_size = 625000

That's around 5GB - is that roughly the amount of memory used for caching (what does free -m say for buffers/cache)?

  max_prepared_transactions = 200

Do you use a lot of prepared transactions in two-phase commit? I'm guessing that you don't.

> I'm sure that it could be more optimised. I don't know anything about WAL, autovacuum, fsm, bgwriter, kernel process, geqo or planner cost settings.

If you run a "vacuum verbose" it will recommend fsm settings at the end of its output. I think you probably need to make your autovacuum more aggressive, but that's something you'll be able to tell by monitoring your database.

It's quite likely that Merlin's right, and you need better hardware to cope with the number of updates you're making - that's something where you need fast disks. However, he's just guessing because you've not told us enough to tell where the problem really lies.

--
Richard Huxton
Archonet Ltd
Richard Huxton wrote:
> dforum wrote:
>> Thanks for your reply.
>>
>> You mean that the RAID uses fsync to keep the data copies in sync?
>
> No, Merlin means PostgreSQL will issue a sync to force the WAL to actual
> disk.
>
>> So you are suggesting I disable fsync to increase performance?
>
> He means you might have to if you can't afford new hardware. Is disk
> activity the problem? Have you looked at the output of "vmstat" to check?

vmstat is giving:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff    cache   si   so    bi    bo   in   cs  us sy id wa
 0  2   1540  47388  41684  7578976    0    0   131   259    0    1   9  3 82  7

>> If I disable fsync, my second disk will not be up to date,
>
> No - the RAID stuff is happening in the operating system.
>
>> so if the machine crashes, I won't be able to get the server working
>> again quickly?
>
> Not "quickly", perhaps not "at all".

Oops

>> But if I use a second machine to replicate the database, I avoid this
>> problem, don't I?
>
> You reduce the chance of a single failure causing disaster.

That reply is not clear to me. It scares me...

>> If I understand right, could you tell me how to disable fsync, please?
>
> There's an "fsync = on" setting in your postgresql.conf, but don't
> change it yet.

OK

>> I have a database of 38GB and it takes 6GB per week.
>
> What do you mean by "takes 6GB per week"? You update/delete that much
> data? It's growing by that amount each week?

Yes.

>> I have a lot of updates and inserts, especially in 8 tables. 2 tables are
>> used for temporary storage, so I write something like 15000 requests per
>> 2 minutes and empty them every 10 minutes.
>
> I'm not sure what "15000 requests per 2 minutes and empty them every 10
> minutes" means.

I insert 15000 rows every 2 minutes and delete 15000 every 10 minutes in those tables.

> Do you have 7500 requests per minute?

It should be about that, but in fact I'm not processing the data in real time; I buffer the data and push it into the database every 2 minutes.

> Are these updates?

During the delete, the data are aggregated into other tables, which causes updates.

> To the "temporary storage"?
> What is this "temporary storage" - an ordinary table?

Yes. I tried to use temporary tables, but I was never able to access those tables from 2 different sessions/connections; that seems to be how PostgreSQL works, or a misunderstanding on my part.

>> I'm making updates and selects on tables containing more than 20
>> million entries.
>
> Again, I'm not sure what this means.

To aggregate the data, I have to check for the presence of other information stored in 2 tables containing 24 million entries.

> Oh - *important* - which version of PostgreSQL are you running?

8.1.11

> Is an upgrade practical?

We are working on upgrading to 8.3.3, but we are not yet ready for such a migration.

> Looking at your postgresql.conf settings:
>
> max_connections = 624
>
> That's an odd number.

We could decrease this number; it's not really needed now. We could decrease it to 350.

> Do you usually have that many connections?
> What are they doing? They can't all be active; the machine you've got
> wouldn't cope.
>
> shared_buffers = 250000
> work_mem = 9000
> temp_buffers = 500
>
> These three are important. The shared_buffers are workspace shared
> between all backends, and you've allocated about 2GB. You've also set
> work_mem = 9MB, which is how much each backend can use for a single sort.
> That means it can use double or triple that in a complex query.

(I know about that.)
> If you're using temporary tables, then you'll want to make sure the
> temp_buffers setting is correct.

I need help with that; I don't know what it should be.

> I can't say whether these figures are good or bad without knowing how
> the database is being used.
>
> effective_cache_size = 625000
>
> That's around 5GB - is that roughly the amount of memory used for
> caching (what does free -m say for buffers/cache)?

             total       used       free     shared    buffers     cached
Mem:          7984       7828        156          0         38       7349
-/+ buffers/cache:        440       7544
Swap:          509          1        508

> max_prepared_transactions = 200
>
> Do you use a lot of prepared transactions in two-phase commit?
> I'm guessing that you don't.

I don't.

>> I'm sure that it could be more optimised. I don't know anything about
>> WAL, autovacuum, fsm, bgwriter, kernel processes, geqo or planner cost
>> settings.
>
> If you run a "vacuum verbose" it will recommend fsm settings at the end
> of its output. I think you probably need to make your autovacuum more
> aggressive, but that's something you'll be able to tell by monitoring
> your database.
>
> It's quite likely that Merlin's right, and you need better hardware to
> cope with the number of updates you're making - that's something where
> you need fast disks. However, he's just guessing because you've not told
> us enough to tell where the problem really lies.

I hope this new information will help you to help me.

Regards

david

--
David Bigand
Président Directeur Général
51 chemin des moulins
73000 CHAMBERY - FRANCE

Web: http://www.1st-affiliation.fr
Email: david@1st-affiliation.com
Tel.: +33 479 696 685
Mob.: +33 666 583 836
Skype: firstaffiliation_support
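On the temporary-table point above: what David saw is expected behaviour, since a temporary table is only visible to the session (connection) that created it. A minimal illustration, with a hypothetical table name not taken from the thread:

-- session 1 (one connection):
CREATE TEMPORARY TABLE staging (id integer, payload text);
INSERT INTO staging VALUES (1, 'buffered row');
SELECT count(*) FROM staging;   -- returns 1

-- session 2 (any other connection):
SELECT count(*) FROM staging;   -- ERROR: relation "staging" does not exist

A staging area shared between connections therefore has to be an ordinary table, which is what this application uses.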
dforums wrote:
> vmstat is giving:
>
> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>  r  b   swpd   free   buff    cache   si   so    bi    bo   in   cs  us sy id wa
>  0  2   1540  47388  41684  7578976    0    0   131   259    0    1   9  3 82  7

This system is practically idle. Either you're not measuring it at a useful time, or there isn't a performance problem.

>>> But if I use a second machine to replicate the database, I avoid this
>>> problem, don't I?
>> You reduce the chance of a single failure causing disaster.
> That reply is not clear to me. It scares me...

If server A fails, you still have server B. If server A fails so that replication stops working and you don't notice, server B won't help any more.

>> What do you mean by "takes 6GB per week"? You update/delete that much
>> data? It's growing by that amount each week?
> Yes.

That wasn't a yes/no question. Please choose one of:
Are you updating 6GB per week?
Are you adding 6GB per week?

>> I'm not sure what "15000 requests per 2 minutes and empty them every 10
>> minutes" means.
> I insert 15000 rows every 2 minutes and delete 15000 every 10 minutes in
> those tables.
>
>> Do you have 7500 requests per minute?
> It should be about that, but in fact I'm not processing the data in real
> time; I buffer the data and push it into the database every 2 minutes.
>> Are these updates?
> During the delete, the data are aggregated into other tables, which
> causes updates.

OK, so every 2 minutes you run one big query that adds 15000 rows. Every 10 minutes you run one big query that deletes 15000 rows.

>> To the "temporary storage"?
>> What is this "temporary storage" - an ordinary table?
> Yes. I tried to use temporary tables, but I was never able to access
> those tables from 2 different sessions/connections; that seems to be how
> PostgreSQL works, or a misunderstanding on my part.

That's correct - temporary tables are private to a backend (connection).

>>> I'm making updates and selects on tables containing more than 20
>>> million entries.
>>
>> Again, I'm not sure what this means.
> To aggregate the data, I have to check for the presence of other
> information stored in 2 tables containing 24 million entries.

OK. I assume you're happy with the plans you are getting on these queries, since you've not provided any information about them.

>> Oh - *important* - which version of PostgreSQL are you running?
> 8.1.11
>> Is an upgrade practical?
> We are working on upgrading to 8.3.3, but we are not yet ready for such a
> migration.

OK

>> Looking at your postgresql.conf settings:
>>
>> max_connections = 624
>>
>> That's an odd number.
> We could decrease this number; it's not really needed now. We could
> decrease it to 350.

I don't believe you've got 350 active connections either. It will be easier to help if you can provide some useful information.

>> effective_cache_size = 625000
>>
>> That's around 5GB - is that roughly the amount of memory used for
>> caching (what does free -m say for buffers/cache)?
>              total       used       free     shared    buffers     cached
> Mem:          7984       7828        156          0         38       7349
> -/+ buffers/cache:        440       7544
> Swap:          509          1        508

Not far off - free is showing 7349MB cached. You're not running 350 clients there though - you're only using 440MB of RAM.

I don't see anything to show a performance problem from these emails.

--
Richard Huxton
Archonet Ltd
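A note on that vmstat output: the first (and here only) line vmstat prints is an average since boot, so a short burst of I/O wait during the 2-minute batch can disappear into it. To catch the problem in the act, one would sample repeatedly while the batch runs - a hypothetical invocation:

vmstat 5
# prints a fresh line every 5 seconds; ignore the first line (boot averages)
# and watch the 'wa' (I/O wait) and 'b' (blocked processes) columns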
The performance problem is really only on the insertion, and even more on the processing done for the aggregation.

Processing the 3000 entries and inserting into, or updating, the tables takes 10 minutes.

As I told you, I inject 14000 queries every 2 minutes, and it takes 10 minutes to process 3000 of those queries. As you can easily understand, it's a big bottleneck.

I'm not doing the processing in one go, because I can't; it is all managed by a stored procedure.

> That wasn't a yes/no question. Please choose one of:
> Are you updating 6GB per week?

Mostly updates.

> Are you adding 6GB per week?

Less from insertion. Which it is depends on whether the data are already present in the database.

> OK. I assume you're happy with the plans you are getting on these
> queries, since you've not provided any information about them.

The plan seems OK, as it uses an index as well. Here is the plan:

explain analyse SELECT "insertUpdateTracks"(137,2605, 852, ('2008-08-06 19:28:54'::text)::date,3,'dailydisplay',2,NULL);
INFO:  method 1
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Result  (cost=0.00..0.01 rows=1 width=0) (actual time=1.151..1.151 rows=1 loops=1)
 Total runtime: 1.160 ms
(2 rows)

As you can see, that is the runtime for one update through this table. Multiplied by 10000, it is too long.

regards

david
dforums wrote:
> The performance problem is really only on the insertion, and even more
> on the processing done for the aggregation.
>
> Processing the 3000 entries and inserting into, or updating, the tables
> takes 10 minutes.
>
> As I told you, I inject 14000 queries every 2 minutes, and it takes 10
> minutes to process 3000 of those queries.

Sorry - I still don't understand. What is this "processing" you are doing?

>> OK. I assume you're happy with the plans you are getting on these
>> queries, since you've not provided any information about them.
>
> The plan seems OK, as it uses an index as well. Here is the plan:
>
> explain analyse SELECT "insertUpdateTracks"(137,2605, 852, ('2008-08-06
> 19:28:54'::text)::date,3,'dailydisplay',2,NULL);
> INFO:  method 1
>                                      QUERY PLAN
> ------------------------------------------------------------------------------------
>  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=1.151..1.151
>  rows=1 loops=1)
>  Total runtime: 1.160 ms

There's nothing to do with an index here - this is a function call.

> As you can see, that is the runtime for one update through this table.
>
> Multiplied by 10000, it is too long.

So - are you calling this function 14000 times to inject your data? You're doing this in one transaction, yes?

--
Richard Huxton
Archonet Ltd
On Thu, Aug 7, 2008 at 9:30 AM, dforums <dforums@vieonet.com> wrote:
> The performance problem is really only on the insertion, and even more on
> the processing done for the aggregation.
>
> Processing the 3000 entries and inserting into, or updating, the tables
> takes 10 minutes.
>
> As I told you, I inject 14000 queries every 2 minutes, and it takes 10
> minutes to process 3000 of those queries.
>
> As you can easily understand, it's a big bottleneck.
>
> I'm not doing the processing in one go, because I can't; it is all
> managed by a stored procedure.
>
>> That wasn't a yes/no question. Please choose one of:
>> Are you updating 6GB per week?
>
> Mostly updates.
>
>> Are you adding 6GB per week?
>
> Less from insertion. Which it is depends on whether the data are already
> present in the database.
>
>> OK. I assume you're happy with the plans you are getting on these
>> queries, since you've not provided any information about them.
>
> The plan seems OK, as it uses an index as well. Here is the plan:
>
> explain analyse SELECT "insertUpdateTracks"(137,2605, 852, ('2008-08-06
> 19:28:54'::text)::date,3,'dailydisplay',2,NULL);
> INFO:  method 1
>                                      QUERY PLAN
> ------------------------------------------------------------------------------------
>  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=1.151..1.151
>  rows=1 loops=1)
>  Total runtime: 1.160 ms
> (2 rows)
>
> As you can see, that is the runtime for one update through this table.
>
> Multiplied by 10000, it is too long.

please don't top-post (put your reply after the previous comments).

With fsync on, you are lucky to get 10k inserts in 10 minutes on a single SATA drive. The basic issue is that each time the function runs, PostgreSQL tells the disk drive to flush, guaranteeing data safety. You have a few different options here:

*) group multiple inserts into a single transaction
*) modify the function to take multiple 'update' records at once
*) disable fsync (extremely unsafe, as noted)
*) upgrade to 8.3 and disable synchronous_commit (the 'fsync lite', a good compromise between fsync on/off)

merlin
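A sketch of Merlin's first option, reusing the function call from the EXPLAIN output earlier in the thread; the second call's argument values are invented for illustration. The point is that the WAL flush then happens once at COMMIT rather than once per statement:

BEGIN;
SELECT "insertUpdateTracks"(137, 2605, 852, ('2008-08-06 19:28:54'::text)::date, 3, 'dailydisplay', 2, NULL);
SELECT "insertUpdateTracks"(138, 2606, 853, ('2008-08-06 19:29:12'::text)::date, 3, 'dailydisplay', 2, NULL);
-- ... the rest of the 2-minute batch of calls ...
COMMIT;  -- the only point at which the WAL must reach the disk

On 8.3, the per-session alternative would be "SET synchronous_commit = off;", which risks losing the last few seconds of committed transactions after a crash, but does not corrupt the database the way fsync = off can.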
dforums wrote:
> The delete is global; the procedure is called for each line/track.
>
>> So - are you calling this function 14000 times to inject your data?
>> You're doing this in one transaction, yes?
>
> No, I have to make it 14000 times because I use some of the inserted
> information for other inserts, to make links between the data.

Why does that stop you putting all 14000 calls in one transaction?

--
Richard Huxton
Archonet Ltd
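This is the crux of Richard's question: within a single transaction, each statement sees the rows written by the statements before it, so inserts that depend on earlier inserts can still be batched. A minimal illustration with hypothetical table names, not taken from the thread:

BEGIN;
INSERT INTO tracks (id, label) VALUES (1, 'first');
-- the row above is already visible here, inside the same
-- transaction, even though nothing has been committed yet:
INSERT INTO track_links (track_id)
    SELECT id FROM tracks WHERE label = 'first';
COMMIT;  -- both rows become durable with a single WAL flush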