Thread: vacuum as flags in PGPROC

vacuum as flags in PGPROC

From
Alvaro Herrera
Date:
In the spirit of incremental improvement, here is a patch that turns the
couple of bools in PGPROC into a bitmask, and associated fallout.

This patch also contains a change to make a cancelled autovacuum
continue with the schedule (indeed to continue with the schedule on any
error), rather than aborting completely.

My idea is to add a flag for "analyze" and another for "is for xid
wraparound", and to build on that for the cancel-autovac-on-deadlock-
checker project, as submitted by Simon.

--
Alvaro Herrera       Valdivia, Chile   ICBM: S 39º 49' 18.1", W 73º 13' 56.4"
"El sudor es la mejor cura para un pensamiento enfermo" (Bardia)

Attachment

Re: vacuum as flags in PGPROC

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> In the spirit of incremental improvement, here is a patch that turns the
> couple of bools in PGPROC into a bitmask, and associated fallout.

Maybe declare the field as uint8 instead of char?  Otherwise, +1.

> This patch also contains a change to make a cancelled autovacuum
> continue with the schedule (indeed to continue with the schedule on any
> error), rather than aborting completely.

I think we'd considered that a bug to be fixed.  Are you intending this
for 8.3 or 8.4?  I don't have a problem with it for 8.3, but someone
else might ...

            regards, tom lane

Re: vacuum as flags in PGPROC

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > In the spirit of incremental improvement, here is a patch that turns the
> > couple of bools in PGPROC into a bitmask, and associated fallout.
>
> Maybe declare the field as uint8 instead of char?  Otherwise, +1.

I'm wondering if it's safe to do something like

MyProc->vacuumFlags |= PROC_FOR_XID_WRAPAROUND

without holding the ProcArrayLock.  It seems in practice a type smaller
than "int" (which uint8 always is) is always stored atomically so this
shouldn't be a problem.

--
Alvaro Herrera                               http://www.PlanetPostgreSQL.org/
"La rebeldía es la virtud original del hombre" (Arthur Schopenhauer)

Re: vacuum as flags in PGPROC

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> I'm wondering if it's safe to do something like
> MyProc->vacuumFlags |= PROC_FOR_XID_WRAPAROUND
> without holding the ProcArrayLock.

This seems a bit itchy.

One thing I'd be worried about is processors that implement that by
fetching the whole word containing the field, setting the bit, and
storing back the whole word.  (I believe some RISC processors are likely
to do it like that.)  This would mean that it'd work OK only as long as
no other process was concurrently changing any field that happened to be
in the same word, which is the kind of requirement that seems horribly
fragile.  You could probably fix that by declaring the field as
sig_atomic_t.  Maybe better, just make it int.

Another problem is that if you don't take a lock then there's no memory
fence instructions executed, which would create issues around how soon
other processors would see the effects of the change.  While that might
not be critical for the vacuum flags, it still seems a bit worrisome.

Given that you're only changing these flags twice per vacuum, it doesn't
seem worth taking any risks by not taking a lock ...

            regards, tom lane

Re: vacuum as flags in PGPROC

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > I'm wondering if it's safe to do something like
> > MyProc->vacuumFlags |= PROC_FOR_XID_WRAPAROUND
> > without holding the ProcArrayLock.
>
> This seems a bit itchy.
>
> One thing I'd be worried about is processors that implement that by
> fetching the whole word containing the field, setting the bit, and
> storing back the whole word.  (I believe some RISC processors are likely
> to do it like that.)  This would mean that it'd work OK only as long as
> no other process was concurrently changing any field that happened to be
> in the same word, which is the kind of requirement that seems horribly
> fragile.

I did it that way (i.e. added locking) and then realized that it
shouldn't really be a problem, because the only one who can be setting
vacuum flags is the process itself.  Other processes can only read the
flags.

Maybe the locking is not a problem anyway, but there are two additional
flag sets in analyze and two more in the autovacuum code when it detects
that it's vacuuming a table for Xid wraparound.  (The idea is to use
these flags in the deadlock patch Simon posted, so that signals are
automatically sent or not according to the current activity of
autovacuum.  This way the signal handler can be kept stupid.)

Also, I forgot to mention it on the first email, but this patch adds
errcontext() lines when an autovacuum/analyze is being aborted.  It
works fine, but I'm not seeing code anywhere else that does something
like this.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Attachment

Re: vacuum as flags in PGPROC

From
Heikki Linnakangas
Date:
Alvaro Herrera wrote:
> I did it that way (i.e. added locking) and then realized that it
> shouldn't really be a problem, because the only one who can be setting
> vacuum flags is the process itself.  Other processes can only read the
> flags.

It would still be a problem if there was any other fields that were
updated by other processes, adjacent to the vacuum flags. I don't think
that's the case, however.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: vacuum as flags in PGPROC

From
Alvaro Herrera
Date:
Heikki Linnakangas wrote:
> Alvaro Herrera wrote:
> > I did it that way (i.e. added locking) and then realized that it
> > shouldn't really be a problem, because the only one who can be setting
> > vacuum flags is the process itself.  Other processes can only read the
> > flags.
>
> It would still be a problem if there was any other fields that were
> updated by other processes, adjacent to the vacuum flags. I don't think
> that's the case, however.

Yeah, that's not the case currently.  Tom is right in that it's fragile
if we ever change the definition so that there is such a flag.  Maybe
this is solved by adding a comment however.

--
Alvaro Herrera                  http://www.amazon.com/gp/registry/5ZYLFMCVHXC
Jude: I wish humans laid eggs
Ringlord: Why would you want humans to lay eggs?
Jude: So I can eat them

Re: vacuum as flags in PGPROC

From
Tom Lane
Date:
Heikki Linnakangas <heikki@enterprisedb.com> writes:
> Alvaro Herrera wrote:
>> I did it that way (i.e. added locking) and then realized that it
>> shouldn't really be a problem, because the only one who can be setting
>> vacuum flags is the process itself.  Other processes can only read the
>> flags.

> It would still be a problem if there was any other fields that were
> updated by other processes, adjacent to the vacuum flags. I don't think
> that's the case, however.

Well, that may not be the case today, but it still seems like an
assumption that will come back to bite us someday.  And can you imagine
trying to debug a misbehavior like that?  It's really not worth the risk,
given how seldom these flags will be changed.

            regards, tom lane

Re: vacuum as flags in PGPROC

From
Heikki Linnakangas
Date:
Tom Lane wrote:
> Heikki Linnakangas <heikki@enterprisedb.com> writes:
>> Alvaro Herrera wrote:
>>> I did it that way (i.e. added locking) and then realized that it
>>> shouldn't really be a problem, because the only one who can be setting
>>> vacuum flags is the process itself.  Other processes can only read the
>>> flags.
>
>> It would still be a problem if there was any other fields that were
>> updated by other processes, adjacent to the vacuum flags. I don't think
>> that's the case, however.
>
> Well, that may not be the case today, but it still seems like an
> assumption that will come back to bite us someday.  And can you imagine
> trying to debug a misbehavior like that?  It's really not worth the risk,
> given how seldom these flags will be changed.

Oh, I totally agree. I wasn't trying to argue to the contrary.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

Re: vacuum as flags in PGPROC

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Also, I forgot to mention it on the first email, but this patch adds
> errcontext() lines when an autovacuum/analyze is being aborted.  It
> works fine, but I'm not seeing code anywhere else that does something
> like this.

This is a little bit scary because you might be invoking system catalog
reads after the transaction has already failed.  What would it take to
save the names involved before starting the TRY block?  I'm not worried
about the errcontext() call per se, only about the syscache fetches.

            regards, tom lane

Re: vacuum as flags in PGPROC

From
Alvaro Herrera
Date:
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Also, I forgot to mention it on the first email, but this patch adds
> > errcontext() lines when an autovacuum/analyze is being aborted.  It
> > works fine, but I'm not seeing code anywhere else that does something
> > like this.
>
> This is a little bit scary because you might be invoking system catalog
> reads after the transaction has already failed.  What would it take to
> save the names involved before starting the TRY block?  I'm not worried
> about the errcontext() call per se, only about the syscache fetches.

Hmm, now it's misbehaving strangely.  I just got this log output.  The
interesting thing is the previous-to-last line (CONTEXT).

29789 DEBUG:  autovacuum: processing database "alvherre"
29789 DEBUG:  autovac_balance_cost(pid=29789 db=16384, rel=16413, cost_limit=200, cost_delay=20)
29789 DEBUG:  vacuuming "public.a"
29790 DEBUG:  autovacuum: processing database "alvherre"
29742 DEBUG:  server process (PID 29790) exited with exit code 0
29791 DEBUG:  autovacuum: processing database "alvherre"
29742 DEBUG:  server process (PID 29791) exited with exit code 0
29792 DEBUG:  autovacuum: processing database "alvherre"
29742 DEBUG:  server process (PID 29792) exited with exit code 0
29793 DEBUG:  autovacuum: processing database "alvherre"
29742 DEBUG:  server process (PID 29793) exited with exit code 0
29789 DEBUG:  "a": removed 1000000 row versions in 4425 pages
29789 DEBUG:  "a": found 1000000 removable, 1000000 nonremovable row versions in 8850 pages
29789 DETAIL:  0 dead row versions cannot be removed yet.
        There were 0 unused item pointers.
        4426 pages contain useful free space.
        0 pages are entirely empty.
        CPU 0.00s/0.25u sec elapsed 23.91 sec.
29794 DEBUG:  autovacuum: processing database "alvherre"
29742 DEBUG:  server process (PID 29794) exited with exit code 0
29800 DEBUG:  autovacuum: processing database "alvherre"
29742 DEBUG:  server process (PID 29800) exited with exit code 0
29744 DEBUG:  recycled transaction log file "000000010000000000000064"
29744 DEBUG:  recycled transaction log file "000000010000000000000065"
29744 DEBUG:  recycled transaction log file "000000010000000000000063"
29806 DEBUG:  autovacuum: processing database "alvherre"
29742 DEBUG:  server process (PID 29806) exited with exit code 0
29744 LOG:  checkpoints are occurring too frequently (9 seconds apart)
29744 HINT:  Consider increasing the configuration parameter "checkpoint_segments".
29753 DEBUG:  drop auto-cascades to type pg_temp_16413
29753 DEBUG:  drop auto-cascades to type pg_temp_16413[]
29789 DEBUG:  analyzing "public.a"
29753 DEBUG:  sending cancel to blocking autovacuum pid = 29789
29789 ERROR:  canceling statement due to user request
29789 CONTEXT:  automatic vacuum of table ".."
29742 DEBUG:  server process (PID 29789) exited with exit code 0

Note how the cancel log line has a context of ".." instead of the
qualified table name.

What I did was create a big table using Simon's suggested
create table a as select generate_series(1,1000000)::integer as col1;
then launch
update a set col1 = col1 + 1;

When this is finished, autovac starts to vacuum the table; and I run

alter table a alter column col1 type bigint;

This does not cancel the vacuum (because I'm only cancelling in
analyze).  When the vacuum is finished, alter table proceeds.  When
alter table finishes, the ANALYZE corresponding to the vacuum starts;
then I start another ALTER TABLE and this one is cancelled.  That's the
errcontext message that's messed up.

I don't really know what's going on ...  I suppose the sigsetjmp() in
PG_TRY is messing things up, but I'm not sure how to fix it.  Any
thoughts?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: vacuum as flags in PGPROC

From
Alvaro Herrera
Date:
Alvaro Herrera wrote:

> Hmm, now it's misbehaving strangely.

Nevermind, I think I see what's happening.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support