Thread: Large table update/vacuum PLEASE HELP!
Hi, everybody!

This is the continuation of yesterday's trouble after updating a large table. Please see my previous message - "Alter/update large tables - VERRY annoying behaviour". The only responses I got to that were, pretty much, about how annoying I was - posting to several mailing lists at once (I am not doing that any more) and posting messages in HTML (I hope that is fixed now too)... Well... EVEN THOUGH I posted to different lists, I got no response!

Is it because I am asking something stupid? Is my English lousy, so that people just don't understand what I am talking about? Or am I just not welcome here at all?

I am sorry if I sound too irritated... that's just because I am :-(

You see, I ran that vacuum command on that table... it took about 24 hours... AND STILL DID NOT FIX ANYTHING!

select * from a limit 1;

Takes about 30 minutes, and I have no idea what it is thinking about for so long!

If anyone has any idea at all what could be the problem, PLEASE HELP!

Thanks a lot!

Dima
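One quick way to quantify the kind of bloat being described here (a sketch only, assuming the table really is named "a" as in the thread; relpages and reltuples are only refreshed by VACUUM or ANALYZE, so they reflect the last run):

-- How many 8 kB pages does the table occupy, versus how many live rows it holds?
-- A very large relpages count next to a modest reltuples count means lots of dead space,
-- which is exactly what makes even "select * from a limit 1" crawl through a seq scan.
select relname, relpages, reltuples
from pg_class
where relname = 'a';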
At 11:20 AM 4/16/02 -0400, Dmitry Tkach wrote:

>select * from a limit 1;
>
>Takes about 30 minutes, and I have no idea what it is thinking about for so
>long!
>
>If anyone has any idea at all what could be the problem, PLEASE HELP!

If a is still something like:

create table a
(
  id int primary key,
  some_data int
);

Try:

vacuum analyze;
explain select * from a order by id limit 1;
select * from a order by id limit 1;

Hopefully the results are good.

Good luck,
Link.
Lincoln Yeoh wrote:

> At 11:20 AM 4/16/02 -0400, Dmitry Tkach wrote:
>
>> select * from a limit 1;
>>
>> Takes about 30 minutes, and I have no idea what it is thinking about
>> for so long!
>>
>> If anyone has any idea at all what could be the problem, PLEASE HELP!
>
> If a is still something like:
>
> create table a
> (
>   id int primary key,
>   some_data int
> );

Well... Not quite. The whole problem started when I merged the a and b tables together - so now a looks like:

create table a
(
  id int primary key,
  some_data int,
  some_other_data int
);

(I added the last column with alter table and populated it with

update a set some_other_data = b.some_other_data from b where a.id = b.id;

That's when my nightmare started.)

> Try:
>
> vacuum analyze;

Yeah... I did that yesterday... It took about 24 hours (!) to run... And I don't see any difference in the behaviour :-(

> explain select * from a order by id limit 1;

Aha... This is an interesting thing to try. Thanks.
Unfortunately, I can't try that right now either! :-(
I am running 'vacuum full' on that table (out of despair), and, despite what I have repeatedly heard about vacuums in 7.2, my 'explain' command seems to be hopelessly waiting on a lock created by that vacuum :-(
Is it supposed to be that way?

> select * from a order by id limit 1;

As I said, I can't check it right now... My understanding is that you are suggesting I force it to scan the index rather than the table itself... This very well may help the immediate problem (once it's done with the vacuum, which I expect some time tomorrow :-()...

BUT the main issue is that I can't really do a sequential scan on that table (not in any reasonable time, anyway) the way it is... I am just used to thinking that a sequential scan of a table is the most basic operation I can imagine... if my database is so screwed up that even that doesn't work, I won't be able to make much use of it anyway... :-(
Is that right?

> Good luck,

Thanks! :-)

Dima
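For completeness, the approach hinted at later in the thread - rebuilding the table instead of updating every row in place - would look roughly like this (a sketch only; "a_new" is a made-up name, the column list assumes the definitions above, and any other indexes, constraints and permissions on the old table would have to be recreated by hand):

begin;

-- Build the merged table in one pass instead of ALTER TABLE + UPDATE of every row;
-- the left join keeps rows of a that have no match in b (their new column stays NULL),
-- mirroring what the UPDATE ... FROM above leaves behind.
create table a_new as
  select a.id, a.some_data, b.some_other_data
  from a left join b on a.id = b.id;

drop table a;
alter table a_new rename to a;

-- The primary key does not carry over; a unique index restores the equivalent behaviour.
create unique index a_pkey on a (id);

commit;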
On Tue, 2002-04-16 at 14:11, Dmitry Tkach wrote:
[snip]
> > Try:
> >
> > vacuum analyze;
>
> Yeah... I did that yesterday... It took about 24 hours (!) to run... And
> I don't see any difference in the behaviour :-(
>
> > explain select * from a order by id limit 1;
>
> Aha... This is an interesting thing to try. Thanks.
> Unfortunately, I can't try that right now either! :-(
> I am running 'vacuum full' on that table (out of despair), and, despite
[snip]

Just to be sure we all understand: you ran

  vacuum analyze;

you didn't just run vacuum or vacuum full?

The analyze part is important; it creates statistics for the query planner.

--
Tom Jenkins
Development InfoStructure
http://www.devis.com
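The split between the two can be seen directly (a sketch; if I remember right, standalone ANALYZE and the pg_stats view are both new in 7.2, and "a" is again the table from this thread):

-- ANALYZE only samples the table and stores per-column statistics for the planner;
-- it reclaims no space, so it is far cheaper than any flavour of VACUUM.
analyze a;

-- What the planner now knows about the columns:
select attname, null_frac, n_distinct
from pg_stats
where tablename = 'a';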
On Tue, 16 Apr 2002, Dmitry Tkach wrote:

> > vacuum analyze;
>
> Yeah... I did that yesterday... It took about 24 hours (!) to run... And
> I don't see any difference in the behaviour :-(
>
> > explain select * from a order by id limit 1;
>
> Aha... This is an interesting thing to try. Thanks.
> Unfortunately, I can't try that right now either! :-(
> I am running 'vacuum full' on that table (out of despair), and, despite
> what I have repeatedly heard about vacuums in 7.2, my 'explain' command
> seems to be hopelessly waiting on a lock created by that vacuum :-(
> Is it supposed to be that way?

Unfortunately, yes. VACUUM FULL is effectively the 7.1 and earlier vacuum.

Normal vacuum wouldn't have helped you in this case since it wouldn't compress the table (IIRC it only moves tuples within a block rather than between them, so all the dead blocks at the beginning are still there).
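One way to watch the difference Stephan describes (a sketch; the exact VERBOSE output varies by version, and both commands assume the table name from this thread):

-- Plain vacuum: marks dead tuples reusable but does not shrink the file, so a table
-- bloated by a one-off full-table UPDATE stays roughly the same size on disk.
vacuum verbose a;

-- Vacuum full: compacts live tuples toward the front and truncates the file,
-- but holds an exclusive lock the whole time - which is why the EXPLAIN sits waiting.
vacuum full verbose a;

-- relpages (refreshed by vacuum) should drop sharply after the FULL pass.
select relpages, reltuples from pg_class where relname = 'a';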
Tom Jenkins wrote:

> Just to be sure we all understand: you ran
>
>   vacuum analyze;
>
> you didn't just run vacuum or vacuum full?
>
> The analyze part is important; it creates statistics for the query planner.

Well... NO. I just did vacuum, not analyze (I was hoping to save the 'analyze' part till after I create the indexes).

I don't think the statistics matter for this particular query plan, which is nothing but 'seq scan on a'... And it does come up with the correct plan (as if it had much choice anyway :-), despite the missing statistics...

I'll try 'analyze' (after my vacuum full finishes sometime tomorrow), though I doubt it will make any difference (unless I am missing something important in my understanding of what 'analyze' means).

Thanks!

Dima
Stephan Szabo wrote:

> Unfortunately, yes. VACUUM FULL is effectively the 7.1 and earlier
> vacuum. Normal vacuum wouldn't have helped you in this case since it
> wouldn't compress the table (IIRC it only moves tuples within a block
> rather than between them, so all the dead blocks at the beginning are
> still there).

Aha! This (finally) gives me some hope!
If I read you correctly - are you saying that, once my 'vacuum full' is finished, I'm finally back in business?

Thank you very much!

If you are still not tired of me, could you also give me an idea of the extent of this problem, while you are at it? I mean, how bad is it really? I will never update the whole table from now on (I realize now that it's much cheaper to just recreate it)... But in general, if I put this database into production, I am going to routinely update about 10-15% of all the rows in that table every month... How bad is that? Am I going to need a 'vacuum full' after every update? Or how often would it be practical to do that?

Also, do you have any idea why vacuum takes me so long (24 hours sounds a little excessive, doesn't it)? I've got a decent machine (4 CPUs, 750 MHz, 8 Gig of RAM), so hardware shouldn't be a problem... Could it be something wrong with my configuration?

Here is my postgresql.conf (with all the comments removed - just the stuff I changed from the default)... Does anything look wrong to you here?

tcpip_socket = true
max_connections = 100
shared_buffers = 64000
max_locks_per_transaction = 640
wal_buffers = 80
sort_mem = 10240
wal_files = 64
checkpoint_segments = 20
checkpoint_timeout = 600
stats_command_string = true
stats_row_level = true
stats_block_level = true
deadlock_timeout = 300000

Thanks again for giving me my hope back! :-) I really appreciate your response!

Dima
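One knob not shown in the list above that can matter for how long vacuum takes (an assumption on my part - since it is not listed, it is presumably still at its default): vacuum_mem, which limits the memory vacuum uses to remember dead tuples; if it is too small for a huge table, vacuum has to make extra passes over the indexes. A possible addition, in the same postgresql.conf style:

# vacuum_mem is in kilobytes; the default (8192, i.e. 8 MB) is small for a machine
# with 8 GB of RAM and a very large table - raising it lets vacuum remember more
# dead tuples per pass. The exact value here is only an illustration.
vacuum_mem = 65536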
Dmitry Tkach <dmitry@openratings.com> writes:
> But in general, if I put this database into production, I am going to
> routinely update about 10-15% of all the rows in that table every
> month... How bad is that?
> Am I going to need a 'vacuum full' after every update? Or how often
> would it be practical to do that?

I wouldn't recommend a VACUUM FULL at all. Just do plain VACUUMs on a regular basis, and accept the 10% or so storage overhead.

VACUUM FULL is good for the sort of situation where you've updated all or most of the rows at one time, and now you have a factor-of-2 storage overhead; you need to physically compact the table. But the price of doing that is high enough that I wouldn't do it to save 10-15%.

			regards, tom lane
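Spelled out, the policy Tom describes might look like this (a sketch, not a prescription - how often to run the plain vacuum depends entirely on the update rate):

-- Routine maintenance: cheap, does not block normal reads and writes,
-- keeps the dead-space overhead near a steady 10% or so, and refreshes planner stats.
vacuum analyze a;

-- Reserved for exceptional events (e.g. after updating most of the table at once),
-- when the file has roughly doubled and genuinely needs physical compaction:
vacuum full a;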
Tom Lane wrote:

> I wouldn't recommend a VACUUM FULL at all. Just do plain VACUUMs on
> a regular basis, and accept the 10% or so storage overhead.
>
> VACUUM FULL is good for the sort of situation where you've updated all
> or most of the rows at one time, and now you have a factor-of-2 storage
> overhead; you need to physically compact the table. But the price of
> doing that is high enough that I wouldn't do it to save 10-15%.
>
> 			regards, tom lane

I am not worried about storage overhead at all at this point, but rather about performance degradation when it has to scan through all those dead tuples in the table - and there are LOTS of them :-(

Thanks!

Dima
On Wed, 17 Apr 2002, Dima Tkach wrote:

> Tom Lane wrote:
>
> > I wouldn't recommend a VACUUM FULL at all. Just do plain VACUUMs on
> > a regular basis, and accept the 10% or so storage overhead.
> >
> > VACUUM FULL is good for the sort of situation where you've updated all
> > or most of the rows at one time, and now you have a factor-of-2 storage
> > overhead; you need to physically compact the table. But the price of
> > doing that is high enough that I wouldn't do it to save 10-15%.
> >
> > 			regards, tom lane
>
> I am not worried about storage overhead at all at this point, but rather
> about performance degradation when it has to scan through all those dead
> tuples in the table - and there are LOTS of them :-(

In the 10% case, you should be in the realm where the table's steady-state size stays only about that much larger, given reasonably frequent normal VACUUMs and an appropriately sized free space map.
> In the 10% case, you should be in the realm where the table's steady-state
> size stays only about that much larger, given reasonably frequent normal
> VACUUMs and an appropriately sized free space map.

Are you saying that, if I, say, update 1000 tuples today, and another 1000 tomorrow, it will reuse today's dead tuples, and not create new ones, so that I end up with just 1000 of them, not 2000?

Just making sure...

Thanks a lot!

Dima
On Wed, 17 Apr 2002, Dmitry Tkach wrote:

> > In the 10% case, you should be in the realm where the table's steady-state
> > size stays only about that much larger, given reasonably frequent normal
> > VACUUMs and an appropriately sized free space map.
>
> Are you saying that, if I, say, update 1000 tuples today, and another
> 1000 tomorrow, it will reuse today's dead tuples, and not create new
> ones, so that I end up with just 1000 of them, not 2000?
>
> Just making sure...

The expectation is that if you update 1000 tuples today, do a normal vacuum once no transaction is left that can see the old state of those tuples, and then update 1000 tuples tomorrow, it will attempt to reuse as much of that "dead" space as possible - which may very well mean you end up with, say, 1200 of them, but no fewer than 1000 and almost certainly not 2000.

For 1000 that should work; for much larger numbers you may need to play with settings to get an appropriate effect (as the number of updated rows grows by orders of magnitude, you may see the wasted space approach 2x, because the map of free space isn't large enough unless you raise those settings).
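Concretely, the cycle Stephan describes would look something like this (a sketch using the column names from earlier in the thread; the WHERE clause is just a stand-in for whatever rows the real monthly update touches):

update a set some_data = some_data + 1 where id < 1000;  -- leaves ~1000 dead tuples behind
vacuum a;                                                 -- records the freed space in the free space map
update a set some_data = some_data + 1 where id < 1000;  -- new versions mostly land in the reclaimed space

-- Repeating this day after day, relpages (as of the last vacuum) should stay roughly
-- flat instead of growing by another 1000 rows' worth each time.
select relpages from pg_class where relname = 'a';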
Stephan Szabo wrote:

> For 1000 that should work; for much larger numbers you may need to play
> with settings to get an appropriate effect (as the number of updated rows
> grows by orders of magnitude, you may see the wasted space approach 2x,
> because the map of free space isn't large enough unless you raise those
> settings).

I am sorry, I am afraid I don't quite understand this.
What exactly are those settings I need to play with? And what is this 'map of free space'?

Thanks a lot!

Dima
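For what it's worth, the settings Stephan is presumably referring to are the free space map parameters that appeared in 7.2; they bound how much reclaimable space a normal vacuum can remember between runs. In the same postgresql.conf style as the earlier snippet (the defaults quoted in the comments are from memory, so treat them as approximate, and the values shown are only illustrative):

# The free space map tracks which pages in which tables still have reusable space.
# If a big, heavily-updated table has more partially-empty pages than max_fsm_pages
# allows, the excess is forgotten and the table keeps growing despite regular vacuums.
max_fsm_relations = 1000   # how many tables can be tracked (default is on the order of 100)
max_fsm_pages = 100000     # total pages with free space to remember (default is on the order of 10000)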