Re: Is analyze_new_cluster.sh still useful? - Mailing list pgsql-hackers

From Jeff Janes
Subject Re: Is analyze_new_cluster.sh still useful?
Date
Msg-id CAMkU=1wOkAzO+s_68iU1DN+3ZkX3=SFrjzaEMZEwuKETUw9sGA@mail.gmail.com
Whole thread Raw
In response to Re: Is analyze_new_cluster.sh still useful?  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Wed, Jun 18, 2014 at 10:58 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-06-18 13:54:16 -0400, Tom Lane wrote:

>> I think we're not on the same page.  My point is that someone might want
>> to automate the whole sequence: stop old postmaster, run pg_upgrade, start
>> the updated postmaster normally (hence it *is* open for business), kick
>> off the analyze runs.  If you're concerned about minimal downtime you
>> would not want to be waiting around for the admin to issue a perfectly
>> predictable series of commands.
>
> Oh, yea. Definitely. I think that's what I've seen happen in pretty much
> *all* usages of pg_upgrade.

I think it is a popular way to do it not because it is a particularly
good way, but because the better alternatives are not readily
available.

If your database needs statistics badly enough that you want to do a
coarse pre-pass with default_statistics_target=1, why would you want
that pass to be done on an open database?  Surely you don't want 100
open connections all doing giant seq scans (that should be single-row
look up, but without stats they are not) competing with the analyze.

Having a database which is "open" to queries but they have such
deranged execution plans that they never actually finish is not truly
open, and the attempts to service those futile queries just delays the
true opening even further.

If you really need a multi pass ANALYZE, you probably need the first
pass to be before the database opens because otherwise the open will
be a disaster, and the 2nd pass to be after the database opens but
before your bulk queries (mass deletes, EOM reports, etc.) kick in.
Having both passes be on the same "side" of the opening seems unlikely
to do much good for most use cases.  Fortunately it probably doesn't
do much harm to most people, either, simple because most databases are
not terribly sensitive to the issue.

Cheers,

Jeff



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: pg_control is missing a field for LOBLKSIZE
Next
From: Robert Haas
Date:
Subject: Re: [COMMITTERS] pgsql: Reduce the number of semaphores used under --disable-spinlocks.