Thread: Questions about vacuum analyze

Questions about vacuum analyze

From
"Steven M. Wheeler"
Date:
System: Compaq ProLiant 3000, dual-336MHz Pentium II, 512MB RAM
O/S: Linux 2.2.6 (Slackware 4.0 dist)
DB: Postgresql 6.5.1
DB on 8GB 4-disk stripe-set using md.
DB with two tables, each with about 1M rows.
Current DB size about 1.5GB

I am having trouble deciding when to run vacuum analyze, how often, and
what state the system should be in when I run it.  I have found that if
I run it once a night on a quiet system (no other processes
connected to the DB), it will still take several (4-8) hours.

Does anyone have recommendations regarding vacuum analyze?
Specifically:

1) Should it be run on a system that is quiet or will it run acceptably
with other processes accessing the DB?

2) How often should it be run?


Thanks,
--
Steven Wheeler
UNIX Engineering
(918) 292-4119




Re: [SQL] Questions about vacuum analyze

From
Tom Lane
Date:
"Steven M. Wheeler" <swheeler@sabre.com> writes:
> Does anyone have recommendations regarding vacuum analyze?
> Specifically:
> 1) Should it be run on a system that is quiet or will it run acceptably
> with other processes accessing the DB?

Vacuum will run fine, but it obtains an exclusive lock on each table
while it is working on that table.  You are likely to find your other
processes waiting for the lock...

> 2) How often should it be run?

How fast does your database change?

I'd doubt that running vacuum analyze, as opposed to a plain vacuum,
needs to be done very often --- it's unlikely that the statistics
vacuum analyze measures are changing that fast, especially not in
million-row tables.  The other function of vacuum is reclaiming space
from dead tuples, and the need for that depends on how often you update
or delete tuples.
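
For example (table name is just a stand-in), the two forms are simply:

        VACUUM mytable;            -- plain vacuum: just reclaims dead-tuple space ('mytable' is a stand-in name)
        VACUUM ANALYZE mytable;    -- also refreshes the optimizer statistics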

I'm just guessing here, but a rule of thumb might be that it's worth
vacuuming when something like 20% of the tuples in your table are
dead, ie, the number of updates/deletes you've done since last vacuum
is about 20% of the table row count.
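(With a million-row table, for instance, that works out to vacuuming after
roughly 200,000 updates or deletes.)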

6.5 seems to have some performance problems with vacuuming large
indexes, BTW.  We are looking into that, but in the meantime you might
experiment with dropping the indexes on a table, vacuuming it, and recreating
the indexes to see if that's faster than a straight vacuum.
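Roughly like this (table, index and column names here are made up):

        DROP INDEX mytable_key_idx;
        VACUUM mytable;
        CREATE INDEX mytable_key_idx ON mytable (keycol);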
        regards, tom lane


Re: [SQL] Questions about vacuum analyze

From
"Steven M. Wheeler"
Date:
System: Compaq ProLiant 3000 with 2 300MHz PentPro, 512MB RAM, 32GB hardware-based
stripe-set for the DB area.
OS: Linux kernel 2.2.12
DB: version 6.5

Thanks for the info, thought I'd send a followup regarding my ongoing
problems with vacuum.

After the last email, I vacuumed the DB a couple of times, with indexes
dropped.  The first time it completed in 2-3 hours, the next time was
appreciably longer.  After this I stopped doing anything except for inserts,
trying to catch up on the incoming data.  This went on for a little over a
week until I had inserted several million additional rows (it's now up to
31M+).  I then dropped the indexes and started vacuum again.
Start: 09/28/1999@10:06:57
Finish: 09/30/1999@19:13:14

33 hours - WOW!

This morning, I rebuilt the indexes and tried to do a "select distinct
statdate from currnt;"  This select statement has been running for several
hours now.  The only output has been the following message, repeated twice:
"NOTICE:  BufFileRead: should have flushed after writing".  The system is
running at about 50% utilization.

I have checked everything I can think of, and at this point I have very
little hair left to lose.  What the heck am I doing wrong?!?!?!?

Have I mentioned how desperate I am? ;-}

--
Steven Wheeler
Mid-Range UNIX Engineering
Sabre Inc.
(918) 292-4119





Tom Lane wrote:

> "Steven M. Wheeler" <swheeler@sabre.com> writes:
> > Does anyone have recommendations regarding vacuum analyze?
> > Specifically:
> > 1) Should it be run on a system that is quiet or will it run acceptably
> > with other processes accessing the DB?
>
> Vacuum will run fine, but it obtains an exclusive lock on each table
> while it is working on that table.  You are likely to find your other
> processes waiting for the lock...
>
> > 2) How often should it be run?
>
> How fast does your database change?
>
> I'd doubt that running vacuum analyze, as opposed to a plain vacuum,
> needs to be done very often --- it's unlikely that the statistics
> vacuum analyze measures are changing that fast, especially not in
> million-row tables.  The other function of vacuum is reclaiming space
> from dead tuples, and the need for that depends on how often you update
> or delete tuples.
>
> I'm just guessing here, but a rule of thumb might be that it's worth
> vacuuming when something like 20% of the tuples in your table are
> dead, ie, the number of updates/deletes you've done since last vacuum
> is about 20% of the table row count.
>
> 6.5 seems to have some performance problems with vacuuming large
> indexes, BTW.  We are looking into that, but in the meantime you might
> experiment with dropping indexes on a table, vacuum table, recreating
> indexes to see if that's faster than a straight vacuum.
>
>                         regards, tom lane



Re: [SQL] Questions about vacuum analyze

From
Bruce Momjian
Date:
> System: Compaq ProLiant 3000 with 2 300MHz PentPro, 512MB RAM, 32GB hardware
> based stripe-set for the DB area.
> OS: Linux kernel 2.2.12
> DB: version 6.5
> 
> Thanks for the info, thought I'd send a followup regarding my ongoing
> problems with vacuum.
> 
> After the last email, I vacuumed the DB a couple of times, with indexes
> dropped.  The first time it completed in 2-3 hours, the next time was
> appreciably longer.  After this I stopped doing anything except for inserts,
> trying to catch up on the incoming data.  This went on for a little over a
> week until I had inserted several million additional rows (its now up to
> 31M+).  I then dropped the indexes and started vacuum again.
> Start: 09/28/1999@10:06:57
> Finish: 09/30/1999@19:13:14
> 
> 33 hours - WOW!

Yikes, that is a long time.

> 
> This morning, I rebuilt the indexes and tried to do a "select distinct
> statdate from currnt;"  This select statement has been running for several

SELECT DISTINCT is going to take forever because it has to read all
rows, then sort them to remove the duplicates.  That could take some
time.  DISTINCT doesn't use the indexes, because if it did, it would be
paging in all over the place.  Better to read the table sequentially,
then sort.
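
If you want to see the plan it picks (a sequential scan feeding a sort),
EXPLAIN will show it --- exact output varies:

        EXPLAIN SELECT DISTINCT statdate FROM currnt;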



--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [SQL] Questions about vacuum analyze

From
Tom Lane
Date:
"Steven M. Wheeler" <swheeler@sabre.com> writes:
> DB: version 6.5
> 33 hours [to vacuum] - WOW!

Yipes.  We did find and fix a glitch in memory allocation in vacuum
(causing both slowness and excessive memory usage) between 6.5 and
6.5.2.  You might try updating to see if it's gotten any better...
        regards, tom lane


Re: [SQL] Questions about vacuum analyze

From
"Steven M. Wheeler"
Date:
Boy this is fun! ;-}

I have upgraded to 6.5.2 as you suggested.  Binaries are built and I am now
re-running my vacuum just to give myself a benchmark starting point.

Just wanted to let you know that, for perl5, when I tried to run "make
test" after having built the perl kit, I got the following:

cobra:/opt/pgsql/postgresql-6.5.2/src/interfaces/perl5 $make test
PERL_DL_NONLAZY=1 /usr/bin/perl -Iblib/arch -Iblib/lib
-I/usr/lib/perl5/i386-linux -I/usr/lib/perl5 test.pl
Pg::conndefaults ........ ok
Pg::connectdb ........... ok
$conn->exec ............. ok
$conn->errorMessage ..... ok
pqReadData() -- backend closed the channel unexpectedly.
        This probably means the backend terminated abnormally
        before or while processing the request.

make: *** [test_dynamic] Error 255

Am I better off re-writing my perl scripts in 'C'?  I've been thinking
about it in hopes of boosting the performance.

--
Steven Wheeler
UNIX Engineering
(918) 292-4119


Tom Lane wrote:

> "Steven M. Wheeler" <swheeler@sabre.com> writes:
> > DB: version 6.5
> > 33 hours [to vacuum] - WOW!
>
> Yipes.  We did find and fix a glitch in memory allocation in vacuum
> (causing both slowness and excessive memory usage) between 6.5 and
> 6.5.2.  You might try updating to see if it's gotten any better...
>
>                         regards, tom lane



Re: [SQL] Questions about vacuum analyze

From
"Steven M. Wheeler"
Date:
Tom,

I believe that 6.5.2 and a little space reclamation in my DB directory may
have taken care of the vacuum problem.  The last vacuum ran in about 10
minutes on an already vacuumed DB.  I am now loading another 400K inserts
and will perform the vacuum again at the conclusion.  A couple of questions
remain.

1) With 6.5.2 should I still drop the indexes prior to running vacuum?

2) Is there a command/script that will clean up the temp files and previous
table files in the DB directory?  The reason I ask: I now have 3 copies of
my currnt table file (currnt, currnt.1, currnt.2) for a total usage of
about 3GB.  Since it appears that currnt.2 is the one being accessed, can I
safely delete currnt & currnt.1?

Thanks again for all the help!

--
Steven Wheeler
Mid-Range UNIX Engineering
Sabre Inc.
(918) 292-4119

Tom Lane wrote:

> "Steven M. Wheeler" <swheeler@sabre.com> writes:
> > DB: version 6.5
> > 33 hours [to vacuum] - WOW!
>
> Yipes.  We did find and fix a glitch in memory allocation in vacuum
> (causing both slowness and excessive memory usage) between 6.5 and
> 6.5.2.  You might try updating to see if it's gotten any better...
>
>                         regards, tom lane



Re: [SQL] Questions about vacuum analyze

From
Tom Lane
Date:
"Steven M. Wheeler" <swheeler@sabre.com> writes:
> I believe that 6.5.2 and a little space reclamation in my DB directory may
> have taken care of the vacuum problem.  The last vacuum ran in about 10
> minutes on an already vacuumed DB.

Ah, that's more like it...

> 1) With 6.5.2 should I still drop the indexes prior to running vacuum?

Probably, but I'm not sure.  Try it both ways and see.
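The simplest comparison is just to time the whole thing from the shell,
once with the indexes in place and once after dropping them, e.g.

        $ time psql -c "VACUUM currnt" mydb     # 'mydb' is a stand-in database name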

> 2) Is there a command/script that will cleanup the temp files and previous
> table files in the DB directory.  The reason I ask: I now have 3 copies of
> my currnt table file (currnt, currnt.1, currnt.2) for a total usage of
> about 3GB.  Since it appears that currnt.2 is the one being accessed, can I
> safely delete currnt & currnt.1?

NO NO NO NO NO!!!!

Files named like that are NOT temp files!!  What they are are segments
of a large table.  We segment big tables into gigabyte-sized chunks
to avoid problems on systems that have an int32-related limit on the
size of individual files (which is most Unixes these days, I think).

If you see anything named like pg_tempNNN.NNN, then that really is
a temp file, and if it's not got a very recent mod time then it's
probably left over from a crashed backend.  Old temp files should be
safe to get rid of.  (IIRC, one of the components of the name is the
PID of the creating backend, so you can check to be sure that the
backend is no longer around if you want to be doubly sure.)
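For example (the paths are only a guess at your layout):

        $ ls -l $PGDATA/base/mydb/pg_temp*      # 'mydb' is a stand-in database name
        $ ps -p 12345                           # 12345 = the PID part of the file name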
        regards, tom lane


Re: [SQL] Questions about vacuum analyze

From
"Steven M. Wheeler"
Date:
More information on the continuing vacuum saga:
The latest run after having processed an additional 40K+ msgs (4M+ inserts).
Start: 10/06/1999@07:28:07
complete: 10/06/1999@21:17:32
About 14 hours.
Two tables.
The currnt table has 39.5M+ rows in it.  Select count(*) took 14m41s
The history table only has 190K rows.

I am now running into another problem.  I need to do select distinct from the
currnt to get dates for maintenance activity (rows to summarize and move to
history).  Whenever I do, I get this message in my server.log file:

NOTICE:  BufFileRead: should have flushed after writing

And I get 0 records selected.  Can't find this message in the docs.  Any
suggestions?  Should I submit this on the SQL list?

Thanks, your help is greatly appreciated!

--
Steven Wheeler
Mid-Range UNIX Engineering
Sabre Inc.
(918) 292-4119

Tom Lane wrote:

> "Steven M. Wheeler" <swheeler@sabre.com> writes:
> > I believe that 6.5.2 and a little space reclamation in my DB directory may
> > have taken care of the vacuum problem.  The last vacuum ran in about 10
> > minutes on an already vacuumed DB.
>
> Ah, that's more like it...
>
> > 1) With 6.5.2 should I still drop the indexes prior to running vacuum?
>
> Probably, but I'm not sure.  Try it both ways and see.
>
> > 2) Is there a command/script that will cleanup the temp files and previous
> > table files in the DB directory.  The reason I ask: I now have 3 copies of
> > my currnt table file (currnt, currnt.1, currnt.2) for a total usage of
> > about 3GB.  Since it appears that currnt.2 is the one being accessed, can I
> > safely delete currnt & currnt.1?
>
> NO NO NO NO NO!!!!
>
> Files named like that are NOT temp files!!  What they are are segments
> of a large table.  We segment big tables into gigabyte-sized chunks
> to avoid problems on systems that have an int32-related limit on the
> size of individual files (which is most Unixes these days, I think).
>
> If you see anything named like pg_tempNNN.NNN, then that really is
> a temp file, and if it's not got a very recent mod time then it's
> probably left over from a crashed backend.  Old temp files should be
> safe to get rid of.  (IIRC, one of the components of the name is the
> PID of the creating backend, so you can check to be sure that the
> backend is no longer around if you want to be doubly sure.)
>
>                         regards, tom lane



Re: [SQL] Questions about vacuum analyze

From
Tom Lane
Date:
"Steven M. Wheeler" <swheeler@sabre.com> writes:
> I am now running into another problem.  I need to do select distinct from the
> currnt to get dates for maintenance activity (rows to summarize and move to
> history).  When ever I do I get this message in my server.log file:

> NOTICE:  BufFileRead: should have flushed after writing

Hmph --- just looking at the code that issues that message, it looks
like it is a harmless gripe (it complains, but then goes and does the
missing step).  However, it's suggestive that there may be a logic error
in the sorting code that's calling the BufFileXXX routines.

> And I get 0 records selected.

And that's even more suggestive ;-)

This is going to be tough to debug if it takes a multi-gig database to
replicate.  But if I'm guessing right about where the problem is, it
probably can be replicated on smaller tables if you use a smaller -S
(sort memory limit) setting when starting the backend.  You can set -S
as small as 32 (kb), and having done that the bug might show up on tables
with a few hundred K of data.  Would you try that and let me know?

The easiest way to experiment is to set -S on a per-backend basis
via the PGOPTIONS environment variable, eg
        setenv PGOPTIONS "-S 32"
        psql ...
The backend started for this psql session will set S=32 instead of
whatever the prevailing default is.
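(In a Bourne-style shell the equivalent is

        PGOPTIONS="-S 32" psql ...

rather than the csh setenv form.)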
        regards, tom lane


Re: [SQL] Questions about vacuum analyze

From
Tom Lane
Date:
"Steven M. Wheeler" <swheeler@sabre.com> writes:
> I am now running into another problem.  I need to do select distinct from the
> currnt to get dates for maintenance activity (rows to summarize and move to
> history).  When ever I do I get this message in my server.log file:

> NOTICE:  BufFileRead: should have flushed after writing

> And I get 0 records selected.

After some code browsing I have come up with a theory: is it possible
that you are running out of disk space during the sort?  I see psort.c
neglects to check for write failure :-(, although I am not entirely
clear on why that particular message and nothing else would come out
after it bulled ahead with an incomplete temporary file.

Sorting a large table seems to require temp disk space equal to several
times the size of the table.  Since you said your table is 40M rows,
I can easily believe you don't have that much free space...

I don't believe anyone has taken a hard look at psort.c in a long time.
I will try to dive into it and see if its resource consumption can't
be reduced.  Mergesort inherently requires 2X the original data space,
but it seems that psort is using even more than that, and perhaps this
can be improved.
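
An easy check while the query runs is to watch free space on the filesystem
holding the database directory, e.g.

        $ while true; do df -k /opt/pgsql; sleep 60; done   # substitute your own mount point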
        regards, tom lane


Re: [SQL] Questions about vacuum analyze

From
Tom Lane
Date:
I wrote:
> "Steven M. Wheeler" <swheeler@sabre.com> writes:
>> I am now running into another problem.  I need to do select distinct from the
>> currnt to get dates for maintenance activity (rows to summarize and move to
>> history).  When ever I do I get this message in my server.log file:
>>
>> NOTICE:  BufFileRead: should have flushed after writing
>>
>> And I get 0 records selected.

> After some code browsing I have come up with a theory: is it possible
> that you are running out of disk space during the sort?  I see psort.c
> neglects to check for write failure :-(, although I am not entirely
> clear on why that particular message and nothing else would come out
> after it bulled ahead with an incomplete temporary file.

> Sorting a large table seems to require temp disk space equal to several
> times the size of the table.  Since you said your table is 40M rows,
> I can easily believe you don't have that much free space...

A further thought: 40M rows could well be more than 4gig of data.
Although Postgres supports tables that large (by segmenting them into
1gig-apiece files to avoid OS limitations on file size), we currently
do not cope with temporary sort files that exceed 4 gig, which means
a "select distinct" on that much data will fail regardless of whether
you have enough free disk space :-(.  We have a TODO item to fix this.
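(At, say, 100 bytes per tuple --- a figure picked purely for illustration ---
40M rows is already about 4 gig before you count any sort work files.)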

After looking at psort.c, I see a bunch of other shortcomings, one being
that it's using a polyphase merge algorithm --- which was hot stuff back
when people did this sort of thing on magnetic tape drives, but it's
less than appropriate for sorting on disk.  The worst problem with it is
that the space consumption is about 4x the actual data volume, which is
not too cool when you're talking about a huge file to begin with...

Meanwhile, there is a completely separate implementation of external
sorting over in nbtsort.c (where it's only used for CREATE INDEX,
AFAICS), with a somewhat different set of deficiencies.

I think I'll add "rewrite sort code from the ground up" to my to-do
list ;-)
        regards, tom lane


Re: [SQL] Questions about vacuum analyze

From
Bruce Momjian
Date:
> A further thought: 40M rows could well be more than 4gig of data.
> Although Postgres supports tables that large (by segmenting them into
> 1gig-apiece files to avoid OS limitations on file size), we currently
> do not cope with temporary sort files that exceed 4 gig, which means
> a "select distinct" on that much data will fail regardless of whether
> you have enough free disk space :-(.  We have a TODO item to fix this.
> 
> After looking at psort.c, I see a bunch of other shortcomings, one being
> that it's using a polyphase merge algorithm --- which was hot stuff back
> when people did this sort of thing on magnetic tape drives, but it's
> less than appropriate for sorting on disk.  The worst problem with it is
> that the space consumption is about 4x the actual data volume, which is
> not too cool when you're talking about a huge file to begin with...
> 
> Meanwhile, there is a completely separate implementation of external
> sorting over in nbtsort.c (where it's only used for CREATE INDEX,
> AFAICS), with a somewhat different set of deficiencies.
> 
> I think I'll add "rewrite sort code from the ground up" to my to-do
> list ;-)

I have on TODO:
Make index creation use psort code, because it is now faster (Vadim)

I didn't know sorting algorithms for tape and disk had different
optimizations.  I figured the paging in of disk blocks had a similar
penalty to tape rewinding.  None of us really knows a lot about the best
algorithm for that job.  Nice you recognized it.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [SQL] Questions about vacuum analyze

From
Tom Lane
Date:
Bruce Momjian <maillist@candle.pha.pa.us> writes:
> I didn't know sorting algorithms for tape and disk had different
> optimizations.  I figured the paging in of disk blocks had a similar
> penalty to tape rewinding.  None of us really knows a lot about the best
> algorithm for that job.  Nice you recognized it.

I didn't know much about it either until the comments in psort.c led me
to read the relevant parts of Knuth.  Tape rewind and disk seek times
are not remotely comparable, especially when you're talking about N tape
units versus one disk unit --- using more files (tapes) to minimize
rewind time can be a big win on tapes, but put those same files on a
disk and you're just spreading the seek time around differently.  More
to the point, though, tape-oriented algorithms tend to assume that tape
space is free.  Polyphase merge doesn't care in the least that it has
several copies of any given record laying about on different tapes, only
one of which is of interest anymore.

I know how to get the sort space requirement down to 2x the actual data
volume (just use a balanced merge instead of the "smarter" polyphase)
but I am thinking about ways to get it down to data volume + a few
percent with an extra layer of bookkeeping.  The trick is to release
and recycle space as soon as we have read in the tuples stored in it...
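(Concretely, for a hypothetical 4-gig input: polyphase needs something like
16 gig of temp space, a balanced merge about 8 gig, and the recycling scheme
roughly 4 gig plus a few percent.)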
        regards, tom lane


Re: [SQL] Questions about vacuum analyze

From
Bruce Momjian
Date:
> I didn't know much about it either until the comments in psort.c led me
> to read the relevant parts of Knuth.  Tape rewind and disk seek times

Yes, that's what I did too.

> are not remotely comparable, especially when you're talking about N tape
> units versus one disk unit --- using more files (tapes) to minimize
> rewind time can be a big win on tapes, but put those same files on a
> disk and you're just spreading the seek time around differently.  More
> to the point, though, tape-oriented algorithms tend to assume that tape
> space is free.  Polyphase merge doesn't care in the least that it has
> several copies of any given record laying about on different tapes, only
> one of which is of interest anymore.
> 
> I know how to get the sort space requirement down to 2x the actual data
> volume (just use a balanced merge instead of the "smarter" polyphase)
> but I am thinking about ways to get it down to data volume + a few
> percent with an extra layer of bookkeeping.  The trick is to release
> and recycle space as soon as we have read in the tuples stored in it...

The thing I liked about the existing algorithm is that it did the
sorting without the page faulting/thrashing caused by many sort
algorithms.  I can see better space reuse as being a big win, since we
would get even less page faulting: the replaced page would already be
in memory.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [SQL] Questions about vacuum analyze

From
Tom Lane
Date:
Bruce Momjian <maillist@candle.pha.pa.us> writes:
> The thing I liked about the existing algorithm is that it did the
> sorting without the page faulting/thrashing caused by many sort
> algorithms.

Yes, I think we want to stick with the basic architecture of an
initial replacement-selection filter operating in limited memory,
followed by merge passes.  I'm just thinking about how to reduce
the amount of disk space chewed up by intermediate merge results...
        regards, tom lane


Re: [SQL] Questions about vacuum analyze

From
Bruce Momjian
Date:
> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> > The thing I liked about the existing algorithm is that it did the
> > sorting without the page faulting/thrashing caused by many sort
> > algorithms.
> 
> Yes, I think we want to stick with the basic architecture of an
> initial replacement-selection filter operating in limited memory,
> followed by merge passes.  I'm just thinking about how to reduce
> the amount of disk space chewed up by intermediate merge results...

Wonder how our create index does it.  Seems it must be similar.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [SQL] Questions about vacuum analyze

From
Tom Lane
Date:
Bruce Momjian <maillist@candle.pha.pa.us> writes:
> Wonder how our create index does it.  Seems it is must be similar.

CREATE INDEX (for btrees) is very similar, and really ought to share
code.  Someone apparently didn't want to figure out how to generalize
psort.c to handle index tuples, though.  The CREATE INDEX code is a
little better for large amounts of data but a lot worse for small
amounts --- AFAICT it doesn't take any shortcuts, even when everything
fits in memory.  (I'm sure you've noticed that CREATE INDEX hits the
disk pretty hard even when the source table is empty :-(.)

I'm planning to merge the two sets of code and keep the best features of
both.  I already have some first-draft code that allows them to work
with >2gig data sets.
        regards, tom lane


Re: [SQL] Questions about vacuum analyze

From
Bruce Momjian
Date:
> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> > Wonder how our create index does it.  Seems it is must be similar.
> 
> CREATE INDEX (for btrees) is very similar, and really ought to share
> code.  Someone apparently didn't want to figure out how to generalize
> psort.c to handle index tuples, though.  The CREATE INDEX code is a
> little better for large amounts of data but a lot worse for small
> amounts --- AFAICT it doesn't take any shortcuts, even when everything
> fits in memory.  (I'm sure you've noticed that CREATE INDEX hits the
> disk pretty hard even when the source table is empty :-(.)
> 
> I'm planning to merge the two sets of code and keep the best features of
> both.  I already have some first-draft code that allows them to work
> with >2gig data sets.

Great.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: [SQL] Questions about vacuum analyze

From
"Steven M. Wheeler"
Date:
Tom,

I tried your suggestion for -S 32 through -S 1 running a select distinct against
my history table.  I normally run with -S 2048 on the startup of the master
process.  FYI: the history table uses the same layout as the currnt table; it's
just an hourly summation of the minute-by-minute data in currnt.  history has
189,724 records.

Bottom line: The select never failed.

Thanks

--
Steven Wheeler
Mid-Range UNIX Engineering
Sabre Inc.
(918) 292-4119

Tom Lane wrote:

> "Steven M. Wheeler" <swheeler@sabre.com> writes:
> > I am now running into another problem.  I need to do select distinct from the
> > currnt to get dates for maintenance activity (rows to summarize and move to
> > history).  When ever I do I get this message in my server.log file:
>
> > NOTICE:  BufFileRead: should have flushed after writing
>
> Hmph --- just looking at the code that issues that message, it looks
> like it is a harmless gripe (it complains, but then goes and does the
> missing step).  However, it's suggestive that there may be a logic error
> in the sorting code that's calling the BufFileXXX routines.
>
> > And I get 0 records selected.
>
> And that's even more suggestive ;-)
>
> This is going to be tough to debug if it takes a multi-gig database to
> replicate.  But if I'm guessing right about where the problem is, it
> probably can be replicated on smaller tables if you use a smaller -S
> (sort memory limit) setting when starting the backend.  You can set -S
> as small as 32 (kb), and having done that the bug might show up on tables
> with a few hundred K of data.  Would you try that and let me know?
>
> The easiest way to experiment is to set -S on a per-backend basis
> via the PGOPTIONS environment variable, eg
>         setenv PGOPTIONS "-S 32"
>         psql ...
> The backend started for this psql session will set S=32 instead of
> whatever the prevailing default is.
>
>                         regards, tom lane



Re: [SQL] Questions about vacuum analyze

From
Tom Lane
Date:
"Steven M. Wheeler" <swheeler@sabre.com> writes:
> I tried your suggestion for -S 32 through -S 1 running a select
> distinct against my history table.  I normally run with -S 2048 on the
> startup of the master process.  FYI: the history table uses the same
> layout as the currnt table, its just an hourly summation of the minute
> by minute data in currnt.  history has 189,724 records.
> Bottom line: The select never failed.

My current theory is that you were running out of disk space or else
running up against a 4-gig-per-temp-file limit in the sort that's done
for SELECT DISTINCT.  So -S wouldn't really affect matters, and testing
on a smaller table definitely won't prove anything one way or the other.

BTW, I believe -S 32 is the smallest value the backend will actually
accept --- it just ignores any attempt to set a smaller value.
        regards, tom lane


Re: [SQL] Questions about vacuum analyze

From
"Steven M. Wheeler"
Date:
Space taken up by entire DB:  6.7GB
Space available on filesystem: 9.2GB

Space taken by currnt table: 3.7GB
Space taken by currnt index: 2.5GB

Fluctuation of available space during a select distinct on the currnt table:
max: 9.2GB, min: 4.1GB

--
Steven Wheeler
Mid-Range UNIX Engineering
Sabre Inc.
(918) 292-4119

Tom Lane wrote:

> "Steven M. Wheeler" <swheeler@sabre.com> writes:
> > I am now running into another problem.  I need to do select distinct from the
> > currnt to get dates for maintenance activity (rows to summarize and move to
> > history).  When ever I do I get this message in my server.log file:
>
> > NOTICE:  BufFileRead: should have flushed after writing
>
> > And I get 0 records selected.
>
> After some code browsing I have come up with a theory: is it possible
> that you are running out of disk space during the sort?  I see psort.c
> neglects to check for write failure :-(, although I am not entirely
> clear on why that particular message and nothing else would come out
> after it bulled ahead with an incomplete temporary file.
>
> Sorting a large table seems to require temp disk space equal to several
> times the size of the table.  Since you said your table is 40M rows,
> I can easily believe you don't have that much free space...
>
> I don't believe anyone has taken a hard look at psort.c in a long time.
> I will try to dive into it and see if its resource consumption can't
> be reduced.  Mergesort inherently requires 2X the original data space,
> but it seems that psort is using even more than that, and perhaps this
> can be improved.
>
>                         regards, tom lane



Re: [SQL] Questions about vacuum analyze

From
Tom Lane
Date:
"Steven M. Wheeler" <swheeler@sabre.com> writes:
> Space taken up by entire DB:  6.7GB
> Space available on filesystem: 9.2GB
> Space taken by currnt table: 3.7GB
> Space taken by currnt index: 2.5GB
> Fluctuation of available space during a select distinct on the currnt table:
> max: 9.2GB, min: 4.1GB

OK, so you're not running out of space, but it could easily be that you
are hitting the limit on temp file size --- which is likely either 2 or
4Gb depending on whether your OS uses unsigned arithmetic for file
offsets.

Can you watch the pg_sorttempNNNN.NN files in the database directory and
see how large they get during the sort?  There should be seven of 'em
generated by the sorting process, and at least one needs to grow to
reach the same size as the table (more or less).
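
A crude way to watch them (the path is only a guess; use your own database
directory):

        $ while true; do ls -l $PGDATA/base/mydb/pg_sorttemp*; sleep 60; done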

BTW, I have some code which solves this problem by splitting large
temporary sort files into segments, just as we do for the tables
themselves ... but it's not even well-tested enough to commit to current
yet, let alone backpatch into 6.5.* ;-)
        regards, tom lane