Thread: autovacuum launcher using InitPostgres

autovacuum launcher using InitPostgres

From
Alvaro Herrera
Date:
Hi,

This seems to be the last holdup for flatfiles.c.  This patch removes
usage of that code in autovacuum launcher, instead making it go through
the most basic relcache initialization so that it is able to read
pg_database.

To this end, InitPostgres has been split in two.  The launcher only
calls the first half, the rest of the callers have been patched to
invoke the second half.

The only thing I'm aware is missing from this patch is fixing up
avlauncher's signal handling, and adding a bit more commentary; also I
haven't tested it under EXEC_BACKEND yet.

Comments?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachment

Re: autovacuum launcher using InitPostgres

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> To this end, InitPostgres has been split in two.  The launcher only
> calls the first half, the rest of the callers have been patched to
> invoke the second half.

This just seems truly messy :-(.  Let me see if I can find something
cleaner.

BTW, is it *really* the case that the AV launcher won't need
RecentGlobalXmin?  The way the HOT stuff works, I think anything that
examines heap pages at all had better have that set.
        regards, tom lane


Re: autovacuum launcher using InitPostgres

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > To this end, InitPostgres has been split in two.  The launcher only
> > calls the first half, the rest of the callers have been patched to
> > invoke the second half.
> 
> This just seems truly messy :-(.  Let me see if I can find something
> cleaner.

I was considering having InitPostgres be an umbrella function, so that
extant callers stay as today, but the various underlying pieces are
skipped depending on who's calling.  For example I didn't like the bit
about starting a transaction or not depending on whether it was the
launcher.

> BTW, is it *really* the case that the AV launcher won't need
> RecentGlobalXmin?  The way the HOT stuff works, I think anything that
> examines heap pages at all had better have that set.

Ugh.  I forgot about that.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: autovacuum launcher using InitPostgres

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> This just seems truly messy :-(.  Let me see if I can find something
>> cleaner.

> I was considering having InitPostgres be an umbrella function, so that
> extant callers stay as today, but the various underlying pieces are
> skipped depending on who's calling.  For example I didn't like the bit
> about starting a transaction or not depending on whether it was the
> launcher.

Yeah.  If you have InitPostgres know that much about the AV launcher's
requirements, it's not clear why it shouldn't just know everything.
Having it return with the initial transaction still open just seems
completely horrid.

>> BTW, is it *really* the case that the AV launcher won't need
>> RecentGlobalXmin?  The way the HOT stuff works, I think anything that
>> examines heap pages at all had better have that set.

> Ugh.  I forgot about that.

We could possibly put
if (IsAutovacuumLauncher()){    CommitTransactionCommand();    return;}

right after the RelationCacheInitializePhase2 call.  No uglier than
what's here, for sure.

While I was looking at this I wondered whether
RelationCacheInitializePhase2 really needs to be inside the startup
transaction at all.  I think it could probably be moved up before
that.  However, if the AV launcher has to do GetTransactionSnapshot
then it's not clear that improves matters anyway. 
        regards, tom lane


Re: autovacuum launcher using InitPostgres

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Tom Lane wrote:
> >> This just seems truly messy :-(.  Let me see if I can find something
> >> cleaner.
>
> > I was considering having InitPostgres be an umbrella function, so that
> > extant callers stay as today, but the various underlying pieces are
> > skipped depending on who's calling.  For example I didn't like the bit
> > about starting a transaction or not depending on whether it was the
> > launcher.
>
> Yeah.  If you have InitPostgres know that much about the AV launcher's
> requirements, it's not clear why it shouldn't just know everything.
> Having it return with the initial transaction still open just seems
> completely horrid.

How about this?

> While I was looking at this I wondered whether
> RelationCacheInitializePhase2 really needs to be inside the startup
> transaction at all.  I think it could probably be moved up before
> that.  However, if the AV launcher has to do GetTransactionSnapshot
> then it's not clear that improves matters anyway.

Well, the difference is that the initial transaction would be a few
microsec shorter ... not sure if that matters.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Attachment

Re: autovacuum launcher using InitPostgres

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Tom Lane wrote:
>>> This just seems truly messy :-(.  Let me see if I can find something
>>> cleaner.

I quite like the idea of splitting initialization into two phases, one
that let's you access shared catalogs, and one to bind to a database. I
didn't look into the details, though.

>> I was considering having InitPostgres be an umbrella function, so that
>> extant callers stay as today, but the various underlying pieces are
>> skipped depending on who's calling.  For example I didn't like the bit
>> about starting a transaction or not depending on whether it was the
>> launcher.
> 
> Yeah.  If you have InitPostgres know that much about the AV launcher's
> requirements, it's not clear why it shouldn't just know everything.
> Having it return with the initial transaction still open just seems
> completely horrid.

Yeah, that sounds messy. Can AV launcher simply open a 2nd initial
transaction?


--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: autovacuum launcher using InitPostgres

From
Alvaro Herrera
Date:
Heikki Linnakangas wrote:
> Tom Lane wrote:
> > Alvaro Herrera <alvherre@commandprompt.com> writes:
> >> Tom Lane wrote:
> >>> This just seems truly messy :-(.  Let me see if I can find something
> >>> cleaner.
> 
> I quite like the idea of splitting initialization into two phases, one
> that let's you access shared catalogs, and one to bind to a database. I
> didn't look into the details, though.

The problem is that it only gives you access to pg_database, because the
other shared catalogs require more relcache initialization than we do.
So I'm not sure it'd be useful for anything else.

> >> I was considering having InitPostgres be an umbrella function, so that
> >> extant callers stay as today, but the various underlying pieces are
> >> skipped depending on who's calling.  For example I didn't like the bit
> >> about starting a transaction or not depending on whether it was the
> >> launcher.
> > 
> > Yeah.  If you have InitPostgres know that much about the AV launcher's
> > requirements, it's not clear why it shouldn't just know everything.
> > Having it return with the initial transaction still open just seems
> > completely horrid.
> 
> Yeah, that sounds messy. Can AV launcher simply open a 2nd initial
> transaction?

The main body of avlauncher opens a new transaction whenever it needs
one.  The problem is that the transaction in InitPostgres is closed in
the second half -- the code I had was skipping StartTransactionCommand
in the launcher case, but as Tom says that was wrong because it was
failing to set RecentGlobalXmin.

If we're looking at simplifing InitPostgres, one thing we could do is
separate the part for the bootstrap process, which is like 10% of the
work for a regular backend ...

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: autovacuum launcher using InitPostgres

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> How about this?

I think the accounting for the AV launcher in shmem allocation is a bit
confused yet --- for instance, isn't MaxBackends already including the
launcher?  I wonder if it would be cleaner to include the launcher in
the autovacuum_max_workers parameter, and increase the min/default
values of that by one.

>> While I was looking at this I wondered whether
>> RelationCacheInitializePhase2 really needs to be inside the startup
>> transaction at all.  I think it could probably be moved up before
>> that.  However, if the AV launcher has to do GetTransactionSnapshot
>> then it's not clear that improves matters anyway. 

> Well, the difference is that the initial transaction would be a few
> microsec shorter ... not sure if that matters.

I can't see how it would.  At the point where that runs we'd not be
holding any locks of interest, so there's no concurrency benefit
to be gained.  With this setup it wouldn't make InitPostgres any
cleaner anyway, so I'd leave it alone.
        regards, tom lane


Re: autovacuum launcher using InitPostgres

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Heikki Linnakangas wrote:
>> I quite like the idea of splitting initialization into two phases, one
>> that let's you access shared catalogs, and one to bind to a database. I
>> didn't look into the details, though.

> The problem is that it only gives you access to pg_database, because the
> other shared catalogs require more relcache initialization than we do.
> So I'm not sure it'd be useful for anything else.

The part I was unhappy about was falling out with the startup
transaction still open (which the patch as written didn't do, but
would have needed to given the snapshot issue).  I don't object to
trying to restructure InitPostgres along the above lines if it can be
done cleanly.  But on the third hand, Alvaro's right that there isn't
any other obvious use for it.

We've also got the client_encoding (GUC initialization timing) issue
to fix in this area.  I suggest that any major restructuring-for-beauty
wait till after the dust settles.  I think what we have here (in the
revised patch) is ugly but not too ugly to live with.
        regards, tom lane


Re: autovacuum launcher using InitPostgres

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > How about this?
>
> I think the accounting for the AV launcher in shmem allocation is a bit
> confused yet --- for instance, isn't MaxBackends already including the
> launcher?  I wonder if it would be cleaner to include the launcher in
> the autovacuum_max_workers parameter, and increase the min/default
> values of that by one.

Huh, yeah, sorry about that -- fixed here.  I think the name of the
param, which includes "worker", precludes from raising the values.

Changes between v2 and v3:

diff -u src/backend/storage/lmgr/proc.c src/backend/storage/lmgr/proc.c
--- src/backend/storage/lmgr/proc.c     31 Aug 2009 13:36:56 -0000
+++ src/backend/storage/lmgr/proc.c     31 Aug 2009 16:14:08 -0000
@@ -103,7 +103,7 @@
        /* AuxiliaryProcs */
        size = add_size(size, mul_size(NUM_AUXILIARY_PROCS, sizeof(PGPROC)));
        /* MyProcs, including autovacuum workers and launcher */
-       size = add_size(size, mul_size(MaxBackends + 1, sizeof(PGPROC)));
+       size = add_size(size, mul_size(MaxBackends, sizeof(PGPROC)));
        /* ProcStructLock */
        size = add_size(size, sizeof(slock_t));

@@ -192,6 +192,7 @@
                ProcGlobal->freeProcs = &procs[i];
        }

+       /* note: the "+1" here accounts for the autovac launcher */
        procs = (PGPROC *) ShmemAlloc((autovacuum_max_workers + 1) * sizeof(PGPROC));
        if (!procs)
                ereport(FATAL,
diff -u src/backend/utils/misc/guc.c src/backend/utils/misc/guc.c
--- src/backend/utils/misc/guc.c        31 Aug 2009 03:07:47 -0000
+++ src/backend/utils/misc/guc.c        31 Aug 2009 16:12:56 -0000
@@ -7570,7 +7570,7 @@
 static bool
 assign_maxconnections(int newval, bool doit, GucSource source)
 {
-       if (newval + autovacuum_max_workers > INT_MAX / 4)
+       if (newval + autovacuum_max_workers + 1 > INT_MAX / 4)
                return false;

        if (doit)


--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachment

Re: autovacuum launcher using InitPostgres

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> I wonder if it would be cleaner to include the launcher in
>> the autovacuum_max_workers parameter, and increase the min/default
>> values of that by one.

> Huh, yeah, sorry about that -- fixed here.  I think the name of the
> param, which includes "worker", precludes from raising the values.

Well, I'm not sure the average user knows or cares about the difference
between the launcher and the workers.  The thing that was in the back of
my mind was that we would now have the option to have the launcher show
up in pg_stat_activity.  If we were to do that then the case for
counting it in the user-visible number-of-processes parameter would get
a lot stronger (enough to justify renaming the parameter, if you insist
that the launcher isn't a worker).  I don't however have any strong
opinion on whether we *should* include it in pg_stat_activity ---
comments?

In the meantime, this looks reasonably sane in a fast read-through,
but I saw a few comments that could use improvement, and I have not
tried to actually review it (like look for missed places to change).
Do you mind if I work it over for an hour or two?
        regards, tom lane


Re: autovacuum launcher using InitPostgres

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> While I was looking at this I wondered whether
>> RelationCacheInitializePhase2 really needs to be inside the startup
>> transaction at all.  I think it could probably be moved up before
>> that.  However, if the AV launcher has to do GetTransactionSnapshot
>> then it's not clear that improves matters anyway. 

> Well, the difference is that the initial transaction would be a few
> microsec shorter ... not sure if that matters.

Actually, there is a better way to do this: if we move up the
RelationCacheInitializePhase2 call, then we can have the AV launcher
case just fall out *before* the transaction start.  It can do
GetTransactionSnapshot inside its own transaction that reads
pg_database.  This is a better solution because it'll have a more
up-to-date version of RecentGlobalXmin while scanning pg_database.
(Indeed, I think this might be *necessary* over the very long haul.)

I think I've got the signal handling cleaned up, but need to test.
Is there any really good reason why autovacuum has its own avl_quickdie
instead of using quickdie() for SIGQUIT?
        regards, tom lane


Re: autovacuum launcher using InitPostgres

From
Alvaro Herrera
Date:
Tom Lane wrote:

> Actually, there is a better way to do this: if we move up the
> RelationCacheInitializePhase2 call, then we can have the AV launcher
> case just fall out *before* the transaction start.  It can do
> GetTransactionSnapshot inside its own transaction that reads
> pg_database.  This is a better solution because it'll have a more
> up-to-date version of RecentGlobalXmin while scanning pg_database.
> (Indeed, I think this might be *necessary* over the very long haul.)

Hmm, good idea.

> I think I've got the signal handling cleaned up, but need to test.
> Is there any really good reason why autovacuum has its own avl_quickdie
> instead of using quickdie() for SIGQUIT?

No, probably I just copied it because the others were static.  Not sure
about quickdie() itself.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: autovacuum launcher using InitPostgres

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Tom Lane wrote:
> >> I wonder if it would be cleaner to include the launcher in
> >> the autovacuum_max_workers parameter, and increase the min/default
> >> values of that by one.
> 
> > Huh, yeah, sorry about that -- fixed here.  I think the name of the
> > param, which includes "worker", precludes from raising the values.
> 
> Well, I'm not sure the average user knows or cares about the difference
> between the launcher and the workers.  The thing that was in the back of
> my mind was that we would now have the option to have the launcher show
> up in pg_stat_activity.  If we were to do that then the case for
> counting it in the user-visible number-of-processes parameter would get
> a lot stronger (enough to justify renaming the parameter, if you insist
> that the launcher isn't a worker).  I don't however have any strong
> opinion on whether we *should* include it in pg_stat_activity ---
> comments?

The user may not care about the difference, but there's a point in
having the limit be the simpler concept of "this is the maximum amount
of processes running vacuum at any time".  The launcher is very
uninteresting to users.

> In the meantime, this looks reasonably sane in a fast read-through,
> but I saw a few comments that could use improvement, and I have not
> tried to actually review it (like look for missed places to change).
> Do you mind if I work it over for an hour or two?

Please go ahead.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: autovacuum launcher using InitPostgres

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> The only thing I'm aware is missing from this patch is fixing up
> avlauncher's signal handling, and adding a bit more commentary; also I
> haven't tested it under EXEC_BACKEND yet.

I did the signal handling work and fixed a couple of small oversights,
and applied it.  I'm not sure what other commentary you had in mind,
but feel free to add.
        regards, tom lane


Re: autovacuum launcher using InitPostgres

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> Well, I'm not sure the average user knows or cares about the difference
>> between the launcher and the workers.  The thing that was in the back of
>> my mind was that we would now have the option to have the launcher show
>> up in pg_stat_activity.  If we were to do that then the case for
>> counting it in the user-visible number-of-processes parameter would get
>> a lot stronger (enough to justify renaming the parameter, if you insist
>> that the launcher isn't a worker).  I don't however have any strong
>> opinion on whether we *should* include it in pg_stat_activity ---
>> comments?

> The user may not care about the difference, but there's a point in
> having the limit be the simpler concept of "this is the maximum amount
> of processes running vacuum at any time".  The launcher is very
> uninteresting to users.

I committed things that way, but I'm still not convinced that we
shouldn't expose the launcher in pg_stat_activity.  The thing that
is bothering me is that it is now able to take locks and potentially
could block some other process or even participate in a deadlock.
Do we really want to have entries in pg_locks that don't match any
entry in pg_stat_activity?
        regards, tom lane


Re: autovacuum launcher using InitPostgres

From
Dimitri Fontaine
Date:
Hi,

Tom Lane <tgl@sss.pgh.pa.us> writes:
>> The user may not care about the difference, but there's a point in
>> having the limit be the simpler concept of "this is the maximum amount
>> of processes running vacuum at any time".  The launcher is very
>> uninteresting to users.

Adding to this, the launcher will not consume maintenance_work_mem
whereas each worker is able to allocate that much, IIUC.

> I committed things that way, but I'm still not convinced that we
> shouldn't expose the launcher in pg_stat_activity.  The thing that
> is bothering me is that it is now able to take locks and potentially
> could block some other process or even participate in a deadlock.
> Do we really want to have entries in pg_locks that don't match any
> entry in pg_stat_activity?

Having the launcher locks show as such gets my vote too, but then I'm
following on your opinion that a launcher ain't a worker and that users
need to know about it. 

Let's keep the autovacuum_max_workers GUC naming, not counting the
"there can be only one" launcher so that we're able to size
maintenance_work_mem.

Regards,
-- 
dim


Re: autovacuum launcher using InitPostgres

From
Magnus Hagander
Date:
On Mon, Aug 31, 2009 at 21:49, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Tom Lane wrote:
>>> Well, I'm not sure the average user knows or cares about the difference
>>> between the launcher and the workers.  The thing that was in the back of
>>> my mind was that we would now have the option to have the launcher show
>>> up in pg_stat_activity.  If we were to do that then the case for
>>> counting it in the user-visible number-of-processes parameter would get
>>> a lot stronger (enough to justify renaming the parameter, if you insist
>>> that the launcher isn't a worker).  I don't however have any strong
>>> opinion on whether we *should* include it in pg_stat_activity ---
>>> comments?
>
>> The user may not care about the difference, but there's a point in
>> having the limit be the simpler concept of "this is the maximum amount
>> of processes running vacuum at any time".  The launcher is very
>> uninteresting to users.
>
> I committed things that way, but I'm still not convinced that we
> shouldn't expose the launcher in pg_stat_activity.  The thing that
> is bothering me is that it is now able to take locks and potentially
> could block some other process or even participate in a deadlock.
> Do we really want to have entries in pg_locks that don't match any
> entry in pg_stat_activity?

At first I'd have voted that we don't have it in pg_stat_activity, but
the argument about pg_locks really is the killer on. IMHO, it *has* to
go there then.

I would also suggest that we add a separate column to pg_stat_activity
indicating what type of process it is, so that monitoring tools can
figure that out without having to resort to string matching on the
current query field. This field would indicate if it's a backend,
autovac worker, autovac launcher. And perhaps we should even add the
other utility processes to pg_stat_activity, for completeness?

Another thought along a similar line is some way to detect the idle
and idle-in-transaction states without doing string matching as well.
Perhaps this could be overloaded on said extra field, or should it
have a separate one?

-- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/