Thread: Large table update/vacuum PLEASE HELP!
Hi, everybody!

This is the continuation of yesterday's trouble after updating a large table. Please see my previous message - "Alter/update large tables - VERRY annoying behaviour". The only responses I got to that were, pretty much, about how annoying I was - posting to several mailing lists at once (I am not doing that any more) and posting messages in HTML (I hope that is fixed now too)... Well... EVEN THOUGH I posted to different lists, I got no response!

Is it because I am asking something stupid? Is my English lousy, so that people just don't understand what I am talking about? Or am I just not welcome here at all?

I am sorry if I sound too irritated... that's just because I am :-(

You see, I ran that vacuum command on that table... it took about 24 hours... AND STILL DID NOT FIX ANYTHING!

select * from a limit 1;

Takes about 30 minutes, and I have no idea what it is thinking about for so long!

If anyone has any idea at all what could be the problem, PLEASE HELP!

Thanks a lot!

Dima
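One quick way to quantify the kind of bloat being described here (a sketch only, assuming the table really is named "a" as in the thread; relpages and reltuples are only refreshed by VACUUM or ANALYZE, so they reflect the last run):

-- How many 8 kB pages does the table occupy, versus how many live rows it holds?
-- A very large relpages count next to a modest reltuples count means lots of dead space,
-- which is exactly what makes even "select * from a limit 1" crawl through a seq scan.
select relname, relpages, reltuples
from pg_class
where relname = 'a';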
At 11:20 AM 4/16/02 -0400, Dmitry Tkach wrote:

>select * from a limit 1;
>
>Takes about 30 minutes, and I have no idea what it is thinking about for so
>long!
>
>If anyone has any idea at all what could be the problem, PLEASE HELP!

If a is still something like:

create table a
(
  id int primary key,
  some_data int
);

Try:

vacuum analyze;
explain select * from a order by id limit 1;
select * from a order by id limit 1;

Hopefully the results are good.

Good luck,
Link.
Lincoln Yeoh wrote:

> At 11:20 AM 4/16/02 -0400, Dmitry Tkach wrote:
>
>> select * from a limit 1;
>>
>> Takes about 30 minutes, and I have no idea what it is thinking about
>> for so long!
>>
>> If anyone has any idea at all what could be the problem, PLEASE HELP!
>
> If a is still something like:
>
> create table a
> (
>   id int primary key,
>   some_data int
> );

Well... Not quite. The whole problem started when I merged the a and b tables together - so now a looks like:

create table a
(
  id int primary key,
  some_data int,
  some_other_data int
);

(I added the last column with alter table and populated it with

update a set some_other_data = b.some_other_data from b where a.id = b.id;

That's when my nightmare started.)

> Try:
>
> vacuum analyze;

Yeah... I did that yesterday... It took about 24 hours (!) to run... And I don't see any difference in the behaviour :-(

> explain select * from a order by id limit 1;

Aha... This is an interesting thing to try. Thanks.
Unfortunately, I can't try that right now either! :-(
I am running 'vacuum full' on that table (out of despair), and, despite what I have repeatedly heard about vacuums in 7.2, my 'explain' command seems to be hopelessly waiting on a lock created by that vacuum :-(
Is it supposed to be that way?

> select * from a order by id limit 1;

As I said, I can't check it right now... My understanding is that you are suggesting I force it to scan the index rather than the table itself... This very well may help the immediate problem (once it's done with the vacuum, which I expect some time tomorrow :-()...

BUT the main issue is that I can't really do a sequential scan on that table (not in any reasonable time, anyway) the way it is... I am just used to thinking that a sequential scan of a table is the most basic operation I can imagine... if my database is so screwed up that even that doesn't work, I won't be able to make much use of it anyway... :-(
Is that right?

> Good luck,

Thanks! :-)

Dima
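For completeness, the approach hinted at later in the thread - rebuilding the table instead of updating every row in place - would look roughly like this (a sketch only; "a_new" is a made-up name, the column list assumes the definitions above, and any other indexes, constraints and permissions on the old table would have to be recreated by hand):

begin;

-- Build the merged table in one pass instead of ALTER TABLE + UPDATE of every row;
-- the left join keeps rows of a that have no match in b (their new column stays NULL),
-- mirroring what the UPDATE ... FROM above leaves behind.
create table a_new as
  select a.id, a.some_data, b.some_other_data
  from a left join b on a.id = b.id;

drop table a;
alter table a_new rename to a;

-- The primary key does not carry over; a unique index restores the equivalent behaviour.
create unique index a_pkey on a (id);

commit;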
On Tue, 2002-04-16 at 14:11, Dmitry Tkach wrote:
[snip]
> > Try:
> >
> > vacuum analyze;
>
> Yeah... I did that yesterday... It took about 24 hours (!) to run... And
> I don't see any difference in the behaviour :-(
>
> > explain select * from a order by id limit 1;
>
> Aha... This is an interesting thing to try. Thanks.
> Unfortunately, I can't try that right now either! :-(
> I am running 'vacuum full' on that table (out of despair), and, despite
[snip]

Just to be sure we all understand: you ran

  vacuum analyze;

you didn't just run vacuum or vacuum full?

The analyze part is important; it creates statistics for the query planner.

--
Tom Jenkins
Development InfoStructure
http://www.devis.com
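The split between the two can be seen directly (a sketch; if I remember right, standalone ANALYZE and the pg_stats view are both new in 7.2, and "a" is again the table from this thread):

-- ANALYZE only samples the table and stores per-column statistics for the planner;
-- it reclaims no space, so it is far cheaper than any flavour of VACUUM.
analyze a;

-- What the planner now knows about the columns:
select attname, null_frac, n_distinct
from pg_stats
where tablename = 'a';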
On Tue, 16 Apr 2002, Dmitry Tkach wrote:

> > vacuum analyze;
>
> Yeah... I did that yesterday... It took about 24 hours (!) to run... And
> I don't see any difference in the behaviour :-(
>
> > explain select * from a order by id limit 1;
>
> Aha... This is an interesting thing to try. Thanks.
> Unfortunately, I can't try that right now either! :-(
> I am running 'vacuum full' on that table (out of despair), and, despite
> what I have repeatedly heard about vacuums in 7.2, my 'explain' command
> seems to be hopelessly waiting on a lock created by that vacuum :-(
> Is it supposed to be that way?

Unfortunately, yes. VACUUM FULL is effectively the 7.1 and earlier vacuum.

Normal vacuum wouldn't have helped you in this case since it wouldn't compress the table (IIRC it only moves tuples within a block rather than between them, so all the dead blocks at the beginning are still there).
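One way to watch the difference Stephan describes (a sketch; the exact VERBOSE output varies by version, and both commands assume the table name from this thread):

-- Plain vacuum: marks dead tuples reusable but does not shrink the file, so a table
-- bloated by a one-off full-table UPDATE stays roughly the same size on disk.
vacuum verbose a;

-- Vacuum full: compacts live tuples toward the front and truncates the file,
-- but holds an exclusive lock the whole time - which is why the EXPLAIN sits waiting.
vacuum full verbose a;

-- relpages (refreshed by vacuum) should drop sharply after the FULL pass.
select relpages, reltuples from pg_class where relname = 'a';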
Tom Jenkins wrote:

> Just to be sure we all understand: you ran
>
>   vacuum analyze;
>
> you didn't just run vacuum or vacuum full?
>
> The analyze part is important; it creates statistics for the query planner.

Well... NO. I just did vacuum, not analyze (I was hoping to save the 'analyze' part till after I create the indexes).

I don't think the statistics matter for this particular query plan, which is nothing but 'seq scan on a'... And it does come up with the correct plan (as if it had much choice anyway :-), despite the missing statistics...

I'll try 'analyze' (after my vacuum full finishes sometime tomorrow), though I doubt it will make any difference (unless I am missing something important in my understanding of what 'analyze' means).

Thanks!

Dima
Stephan Szabo wrote:

> Unfortunately, yes. VACUUM FULL is effectively the 7.1 and earlier
> vacuum. Normal vacuum wouldn't have helped you in this case since it
> wouldn't compress the table (IIRC it only moves tuples within a block
> rather than between them, so all the dead blocks at the beginning are
> still there).

Aha! This (finally) gives me some hope!
If I read you correctly - are you saying that, once my 'vacuum full' is finished, I'm finally back in business?

Thank you very much!

If you are still not tired of me, could you also give me an idea of the extent of this problem, while you are at it? I mean, how bad is it really? I will never update the whole table from now on (I realize now that it's much cheaper to just recreate it)... But in general, if I put this database into production, I am going to routinely update about 10-15% of all the rows in that table every month... How bad is that? Am I going to need a 'vacuum full' after every update? Or how often would it be practical to do that?

Also, do you have any idea why vacuum takes me so long (24 hours sounds a little excessive, doesn't it)? I've got a decent machine (4 CPUs, 750 MHz, 8 Gig of RAM), so hardware shouldn't be a problem... Could it be something wrong with my configuration?

Here is my postgresql.conf (with all the comments removed - just the stuff I changed from the default)... Does anything look wrong to you here?

tcpip_socket = true
max_connections = 100
shared_buffers = 64000
max_locks_per_transaction = 640
wal_buffers = 80
sort_mem = 10240
wal_files = 64
checkpoint_segments = 20
checkpoint_timeout = 600
stats_command_string = true
stats_row_level = true
stats_block_level = true
deadlock_timeout = 300000

Thanks again for giving me my hope back! :-) I really appreciate your response!

Dima
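One knob not shown in the list above that can matter for how long vacuum takes (an assumption on my part - since it is not listed, it is presumably still at its default): vacuum_mem, which limits the memory vacuum uses to remember dead tuples; if it is too small for a huge table, vacuum has to make extra passes over the indexes. A possible addition, in the same postgresql.conf style:

# vacuum_mem is in kilobytes; the default (8192, i.e. 8 MB) is small for a machine
# with 8 GB of RAM and a very large table - raising it lets vacuum remember more
# dead tuples per pass. The exact value here is only an illustration.
vacuum_mem = 65536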
Dmitry Tkach <dmitry@openratings.com> writes:
> But in general, if I put this database into production, I am going to
> routinely update about 10-15% of all the rows in that table every
> month... How bad is that?
> Am I going to need a 'vacuum full' after every update? Or how often
> would it be practical to do that?

I wouldn't recommend a VACUUM FULL at all. Just do plain VACUUMs on a regular basis, and accept the 10% or so storage overhead.

VACUUM FULL is good for the sort of situation where you've updated all or most of the rows at one time, and now you have a factor-of-2 storage overhead; you need to physically compact the table. But the price of doing that is high enough that I wouldn't do it to save 10-15%.

			regards, tom lane
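Spelled out, the policy Tom describes might look like this (a sketch, not a prescription - how often to run the plain vacuum depends entirely on the update rate):

-- Routine maintenance: cheap, does not block normal reads and writes,
-- keeps the dead-space overhead near a steady 10% or so, and refreshes planner stats.
vacuum analyze a;

-- Reserved for exceptional events (e.g. after updating most of the table at once),
-- when the file has roughly doubled and genuinely needs physical compaction:
vacuum full a;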
Tom Lane wrote:

> I wouldn't recommend a VACUUM FULL at all. Just do plain VACUUMs on
> a regular basis, and accept the 10% or so storage overhead.
>
> VACUUM FULL is good for the sort of situation where you've updated all
> or most of the rows at one time, and now you have a factor-of-2 storage
> overhead; you need to physically compact the table. But the price of
> doing that is high enough that I wouldn't do it to save 10-15%.
>
> 			regards, tom lane

I am not worried about storage overhead at all at this point, but rather about performance degradation when it has to scan through all those dead tuples in the table - and there are LOTS of them :-(

Thanks!

Dima
On Wed, 17 Apr 2002, Dima Tkach wrote:

> Tom Lane wrote:
>
> > I wouldn't recommend a VACUUM FULL at all. Just do plain VACUUMs on
> > a regular basis, and accept the 10% or so storage overhead.
> >
> > VACUUM FULL is good for the sort of situation where you've updated all
> > or most of the rows at one time, and now you have a factor-of-2 storage
> > overhead; you need to physically compact the table. But the price of
> > doing that is high enough that I wouldn't do it to save 10-15%.
> >
> > 			regards, tom lane
>
> I am not worried about storage overhead at all at this point, but rather
> about performance degradation when it has to scan through all those dead
> tuples in the table - and there are LOTS of them :-(

In the 10% case, you should be in the realm where the table's steady-state size stays only about that much larger, given reasonably frequent normal VACUUMs and an appropriately sized free space map.
> In the 10% case, you should be in the realm where the table's steady-state
> size stays only about that much larger, given reasonably frequent normal
> VACUUMs and an appropriately sized free space map.

Are you saying that, if I, say, update 1000 tuples today, and another 1000 tomorrow, it will reuse today's dead tuples, and not create new ones, so that I end up with just 1000 of them, not 2000?

Just making sure...

Thanks a lot!

Dima
On Wed, 17 Apr 2002, Dmitry Tkach wrote:

> > In the 10% case, you should be in the realm where the table's steady-state
> > size stays only about that much larger, given reasonably frequent normal
> > VACUUMs and an appropriately sized free space map.
>
> Are you saying that, if I, say, update 1000 tuples today, and another
> 1000 tomorrow, it will reuse today's dead tuples, and not create new
> ones, so that I end up with just 1000 of them, not 2000?
>
> Just making sure...

The expectation is that if you update 1000 tuples today, do a normal vacuum once no transaction is left that can see the old state of those tuples, and then update 1000 tuples tomorrow, it will attempt to reuse as much of that "dead" space as possible - which may very well mean you end up with, say, 1200 of them, but no fewer than 1000 and almost certainly not 2000.

For 1000 that should work; for much larger numbers you may need to play with settings to get an appropriate effect (as the number of updated rows grows by orders of magnitude, you may see the wasted space approach 2x, because the map of free space isn't large enough unless you raise those settings).
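Concretely, the cycle Stephan describes would look something like this (a sketch using the column names from earlier in the thread; the WHERE clause is just a stand-in for whatever rows the real monthly update touches):

update a set some_data = some_data + 1 where id < 1000;  -- leaves ~1000 dead tuples behind
vacuum a;                                                 -- records the freed space in the free space map
update a set some_data = some_data + 1 where id < 1000;  -- new versions mostly land in the reclaimed space

-- Repeating this day after day, relpages (as of the last vacuum) should stay roughly
-- flat instead of growing by another 1000 rows' worth each time.
select relpages from pg_class where relname = 'a';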
Stephan Szabo wrote:

> For 1000 that should work; for much larger numbers you may need to play
> with settings to get an appropriate effect (as the number of updated rows
> grows by orders of magnitude, you may see the wasted space approach 2x,
> because the map of free space isn't large enough unless you raise those
> settings).

I am sorry, I am afraid I don't quite understand this.
What exactly are those settings I need to play with? And what is this 'map of free space'?

Thanks a lot!

Dima
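For what it's worth, the settings Stephan is presumably referring to are the free space map parameters that appeared in 7.2; they bound how much reclaimable space a normal vacuum can remember between runs. In the same postgresql.conf style as the earlier snippet (the defaults quoted in the comments are from memory, so treat them as approximate, and the values shown are only illustrative):

# The free space map tracks which pages in which tables still have reusable space.
# If a big, heavily-updated table has more partially-empty pages than max_fsm_pages
# allows, the excess is forgotten and the table keeps growing despite regular vacuums.
max_fsm_relations = 1000   # how many tables can be tracked (default is on the order of 100)
max_fsm_pages = 100000     # total pages with free space to remember (default is on the order of 10000)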