Thread: Allow vacuumdb to only analyze

Allow vacuumdb to only analyze

From
decibel
Date:
One of the talks at PGCon (update in place?) recommended running
vacuumdb -z to analyze all tables to rebuild statistics. Problem with
that is it also vacuums everything. ISTM it'd be useful to be able to
just vacuum all databases in a cluster, so I hacked it into vacuumdb.

Of course, using a command called vacuumdb is rather silly, but I
don't see a reasonable way to deal with that. I did change the name
of the functions from vacuum_* to process_*, since they can vacuum
and/or analyze.

The only thing I see missing is the checks for invalid combinations
of options, which I'm thinking should go in the function rather than
in the option parsing section. But I didn't want to put any more
effort into this if it's not something we actually want.
--
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828



Attachment

Re: Allow vacuumdb to only analyze

From
Robert Haas
Date:
On Sat, May 23, 2009 at 10:31 PM, decibel <decibel@decibel.org> wrote:
> One of the talks at PGCon (update in place?) recommended running vacuumdb -z
> to analyze all tables to rebuild statistics. Problem with that is it also
> vacuums everything. ISTM it'd be useful to be able to just vacuum all
> databases in a cluster, so I hacked it into vacuumdb.

I think you meant "ISTM it'd be useful to be able to just analyze all
databases in a cluster".

> Of course, using a command called vacuumdb is rather silly, but I don't see
> a reasonable way to deal with that. I did change the name of the functions
> from vacuum_* to process_*, since they can vacuum and/or analyze.
>
> The only thing I see missing is the checks for invalid combinations of
> options, which I'm thinking should go in the function rather than in the
> option parsing section. But I didn't want to put any more effort into this
> if it's not something we actually want.

It does seem somewhat useful to be able to analyze all databases
easily from the command-line, but putting it into vacuumdb is
certainly a hack.  (By the way, we don't allow C++ style comments.)

I wonder if we ought not to find a way to make pg_migrator
automatically do some of these things after starting up the database.
Given autovacuum, it should be a pretty rare thing to need to manually
analyze every database in the cluster, so instead of building a
general tool to do this, it might make more sense to make it happen
automatically in the one case where we know it's necessary.

I noticed in Bruce's talk that there are a number of post-migration
steps which are currently partially manual.  Ideally we'd like to
automate them all, preferably in some sort of well-thought-out order.
I actually suspect this is something like: analyze each database,
reindex those indices invalidated by the upgrade, analyze each
database again.  Ideally we'd like to have some control over the
degree of parallelism here too but that might be asking too much for
8.4.

...Robert


Re: Allow vacuumdb to only analyze

From
"Kevin Grittner"
Date:
Robert Haas <robertmhaas@gmail.com> wrote:
> I noticed in Bruce's talk that there are a number of post-migration
> steps which are currently partially manual.  Ideally we'd like to
> automate them all, preferably in some sort of well-thought-out
order.
> I actually suspect this is something like: analyze each database,
> reindex those indices invalidated by the upgrade, analyze each
> database again.
We have found it useful to VACUUM FREEZE ANALYZE a converted database.
The first access to any page will cause writing of hint bits, and
we'd rather deal with that before we let the users in, to avoid having
sluggish performance for them while that happens.  The FREEZE part is
to avoid a freeze of all subsequently untouched data in all tables at
some unpredictable time -- potentially in the middle of a busy
workday.  The usual argument against aggressive freezing (that
forensic information useful in recovery of a corrupted database)
doesn't carry much weight right after a conversion.
-Kevin


Re: Allow vacuumdb to only analyze

From
decibel
Date:
On May 23, 2009, at 9:51 PM, Robert Haas wrote:
>> vacuums everything. ISTM it'd be useful to be able to just vacuum all
>> databases in a cluster, so I hacked it into vacuumdb.
>
> I think you meant "ISTM it'd be useful to be able to just analyze all
> databases in a cluster".

Heh. "Oops".

>> Of course, using a command called vacuumdb is rather silly, but I  
>> don't see
>> a reasonable way to deal with that. I did change the name of the  
>> functions
>> from vacuum_* to process_*, since they can vacuum and/or analyze.
>>
>> The only thing I see missing is the checks for invalid  
>> combinations of
>> options, which I'm thinking should go in the function rather than  
>> in the
>> option parsing section. But I didn't want to put any more effort  
>> into this
>> if it's not something we actually want.
>
> It does seem somewhat useful to be able to analyze all databases
> easily from the command-line, but putting it into vacuumdb is
> certainly a hack.
So... do we want a completely separate analyzedb command? That seems  
like far overkill.

Arguably there are yet other things you'd want to do across an entire  
cluster, so perhaps what we really want is a 'clusterrun' or  
'clustercmd' command?

> (By the way, we don't allow C++ style comments.)
Yeah, was being lazy since they're just temporary TODOs.

> I wonder if we ought not to find a way to make pg_migrator
> automatically do some of these things after starting up the database.
Sure, pg_migrator is what started this, but it's completely  
orthogonal to the lack of a "analyze everything" command.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828




Re: Allow vacuumdb to only analyze

From
decibel
Date:
On May 27, 2009, at 11:31 AM, decibel wrote:
>> It does seem somewhat useful to be able to analyze all databases
>> easily from the command-line, but putting it into vacuumdb is
>> certainly a hack.
> So... do we want a completely separate analyzedb command? That  
> seems like far overkill.
>
> Arguably there are yet other things you'd want to do across an  
> entire cluster, so perhaps what we really want is a 'clusterrun' or  
> 'clustercmd' command?


No one else has commented, so I'm guessing that means no one is  
opposed to allowing for vacuumdb to just analyze. If anyone else  
objects to this please speak up before I put the final touches on the  
patch...
-- 
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828




Re: Allow vacuumdb to only analyze

From
Bruce Momjian
Date:
Robert Haas wrote:
> I noticed in Bruce's talk that there are a number of post-migration
> steps which are currently partially manual.  Ideally we'd like to
> automate them all, preferably in some sort of well-thought-out order.
> I actually suspect this is something like: analyze each database,
> reindex those indices invalidated by the upgrade, analyze each
> database again.  Ideally we'd like to have some control over the
> degree of parallelism here too but that might be asking too much for
> 8.4.

I can easily have pg_migrator run those scripts itself but I pushed it
on to the administrator so pg_migrator could finish and they could
decide when to run those scripts.  For example, there might only be
issues in a few databases and the other database could be used fully
while the upgrade scripts are running.  The same hold for analyzing the
cluster --- anything I thought might take a while I gave to the
administrators.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: Allow vacuumdb to only analyze

From
Bruce Momjian
Date:
decibel wrote:
> One of the talks at PGCon (update in place?) recommended running  
> vacuumdb -z to analyze all tables to rebuild statistics. Problem with  
> that is it also vacuums everything. ISTM it'd be useful to be able to  
> just vacuum all databases in a cluster, so I hacked it into vacuumdb.
> 
> Of course, using a command called vacuumdb is rather silly, but I  
> don't see a reasonable way to deal with that. I did change the name  
> of the functions from vacuum_* to process_*, since they can vacuum  
> and/or analyze.
> 
> The only thing I see missing is the checks for invalid combinations  
> of options, which I'm thinking should go in the function rather than  
> in the option parsing section. But I didn't want to put any more  
> effort into this if it's not something we actually want.

This is implemented in 9.0 from vacuumdb:
  -Z, --analyze-only              only update optimizer hints

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.comPG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do + If your life is a hard
drive,Christ can be your backup. +
 


Re: Allow vacuumdb to only analyze

From
Jaime Casanova
Date:
On Thu, Feb 25, 2010 at 5:09 PM, Bruce Momjian <bruce@momjian.us> wrote:
>
> This is implemented in 9.0 from vacuumdb:
>
>          -Z, --analyze-only              only update optimizer hints
>

maybe just noise, but it's not better to say "optimizer statistics"
instead of "optimizer hints"?

--
Atentamente,
Jaime Casanova
Soporte y capacitación de PostgreSQL
Asesoría y desarrollo de sistemas
Guayaquil - Ecuador
Cel. +59387171157


Re: Allow vacuumdb to only analyze

From
Bruce Momjian
Date:
Jaime Casanova wrote:
> On Thu, Feb 25, 2010 at 5:09 PM, Bruce Momjian <bruce@momjian.us> wrote:
> >
> > This is implemented in 9.0 from vacuumdb:
> >
> > ? ? ? ? ?-Z, --analyze-only ? ? ? ? ? ? ?only update optimizer hints
> >
>
> maybe just noise, but it's not better to say "optimizer statistics"
> instead of "optimizer hints"?

Wow, I never noticed that but --analyze used "hints" too, and in 8.4 as
well.  I have updated it to call it "statistics" in the attached patch.
The manual page does not call them hints.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
  PG East:  http://www.enterprisedb.com/community/nav-pg-east-2010.do
  + If your life is a hard drive, Christ can be your backup. +
Index: src/bin/scripts/vacuumdb.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/scripts/vacuumdb.c,v
retrieving revision 1.36
diff -c -c -r1.36 vacuumdb.c
*** src/bin/scripts/vacuumdb.c    26 Feb 2010 02:01:20 -0000    1.36
--- src/bin/scripts/vacuumdb.c    26 Feb 2010 04:13:42 -0000
***************
*** 336,343 ****
      printf(_("  -q, --quiet                     don't write any messages\n"));
      printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table only\n"));
      printf(_("  -v, --verbose                   write a lot of output\n"));
!     printf(_("  -z, --analyze                   update optimizer hints\n"));
!     printf(_("  -Z, --analyze-only              only update optimizer hints\n"));
      printf(_("  --help                          show this help, then exit\n"));
      printf(_("  --version                       output version information, then exit\n"));
      printf(_("\nConnection options:\n"));
--- 336,343 ----
      printf(_("  -q, --quiet                     don't write any messages\n"));
      printf(_("  -t, --table='TABLE[(COLUMNS)]'  vacuum specific table only\n"));
      printf(_("  -v, --verbose                   write a lot of output\n"));
!     printf(_("  -z, --analyze                   update optimizer statistics\n"));
!     printf(_("  -Z, --analyze-only              only update optimizer statistics\n"));
      printf(_("  --help                          show this help, then exit\n"));
      printf(_("  --version                       output version information, then exit\n"));
      printf(_("\nConnection options:\n"));