Thread: AutoVacuum starvation from sinval messages

AutoVacuum starvation from sinval messages

From
Jeff Janes
Date:
While doing a rather contrived test of something else, using:
perl -le 'print "create table foo (x serial); drop table foo;"
foreach (1..1e6)'|psql

I noticed starvation of autovacuum launcher process and bloating of catalogs.

Basically the WaitLatch in autovacuum.c line 602 (in head) never returns.

This was introduced by a180776f7a1c4554f214b, "Teach unix_latch.c to
use poll() where available"

When the poll() gets EINTR by SIGUSR1 for the invalidation, it
restarts at the full timeout (60 seconds in this case) with no
accounting for the time already waited.

It does not seem outrageous to me that there would be real-world
conditions in which invalidations would be sent more than once a
minute over prolonged periods, so this total starvation seems like a
bug.

Should gettimeofday be called before and after the poll() and then the
difference deducted from timeout?

Cheers,

Jeff



Re: AutoVacuum starvation from sinval messages

From
Jeff Janes
Date:
On Thu, Nov 8, 2012 at 12:36 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>
> Should gettimeofday be called before and after the poll() and then the
> difference deducted from timeout?
>

Something like this?

Attachment

Re: AutoVacuum starvation from sinval messages

From
Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes:
> Basically the WaitLatch in autovacuum.c line 602 (in head) never returns.

> This was introduced by a180776f7a1c4554f214b, "Teach unix_latch.c to
> use poll() where available"

> When the poll() gets EINTR by SIGUSR1 for the invalidation, it
> restarts at the full timeout (60 seconds in this case) with no
> accounting for the time already waited.

Hmm.  That seems like a problem, all right, but I don't believe the
claim that it applies only to the poll() code path.  On many platforms
the select() path would have the same issue.

It's a bit annoying to have to incur an extra gettimeofday call per
WaitLatch operation in case this happens, but there doesn't seem to
be a lot of choice.  I guess we can arrange to incur the extra call
only when a timeout is requested --- but IIRC that's going to be
the majority of usages anyway ...
        regards, tom lane



Re: AutoVacuum starvation from sinval messages

From
Tom Lane
Date:
Jeff Janes <jeff.janes@gmail.com> writes:
> On Thu, Nov 8, 2012 at 12:36 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> Should gettimeofday be called before and after the poll() and then the
>> difference deducted from timeout?

> Something like this?

Meh.  Not like that, because

(1) we shouldn't add overhead when no timeout is requested.
(2) it'd be better to do only one gettimeofday at the start and then
one in the EINTR path, not one per loop; this saves syscalls and also
avoids slippage of the timeout time.
(3) the select() path needs a similar fix.

But I think you have a good idea to use the INSTR_TIME macros instead
of custom code, even though this is Unix-only so there's not more than
one underlying implementation.

Will fix this up and commit.
        regards, tom lane



Re: AutoVacuum starvation from sinval messages

From
Tom Lane
Date:
I wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> When the poll() gets EINTR by SIGUSR1 for the invalidation, it
>> restarts at the full timeout (60 seconds in this case) with no
>> accounting for the time already waited.

> Hmm.  That seems like a problem, all right, but I don't believe the
> claim that it applies only to the poll() code path.  On many platforms
> the select() path would have the same issue.

BTW, doesn't win32_latch.c have the identical problem?  I don't see it
updating the timeout after a signal, either.
        regards, tom lane



Re: AutoVacuum starvation from sinval messages

From
Simon Riggs
Date:
On 8 November 2012 20:36, Jeff Janes <jeff.janes@gmail.com> wrote:

> It does not seem outrageous to me that there would be real-world
> conditions in which invalidations would be sent more than once a
> minute over prolonged periods, so this total starvation seems like a
> bug.

Yes, its a bug, but do you really believe the above? In what cases?

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: AutoVacuum starvation from sinval messages

From
Andres Freund
Date:
On 2012-11-08 22:40:43 +0000, Simon Riggs wrote:
> On 8 November 2012 20:36, Jeff Janes <jeff.janes@gmail.com> wrote:
>
> > It does not seem outrageous to me that there would be real-world
> > conditions in which invalidations would be sent more than once a
> > minute over prolonged periods, so this total starvation seems like a
> > bug.
>
> Yes, its a bug, but do you really believe the above? In what cases?

Temporary tables might a good contender.

Greetings,

Andres Freund



Re: AutoVacuum starvation from sinval messages

From
Hannu Krosing
Date:
On 11/08/2012 11:40 PM, Simon Riggs wrote:
> On 8 November 2012 20:36, Jeff Janes <jeff.janes@gmail.com> wrote:
>
>> It does not seem outrageous to me that there would be real-world
>> conditions in which invalidations would be sent more than once a
>> minute over prolonged periods, so this total starvation seems like a
>> bug.
> Yes, its a bug, but do you really believe the above? In what cases?
>
Could heavy use of temporary tables cause this ?

Hannu



Re: AutoVacuum starvation from sinval messages

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> On 8 November 2012 20:36, Jeff Janes <jeff.janes@gmail.com> wrote:
>> It does not seem outrageous to me that there would be real-world
>> conditions in which invalidations would be sent more than once a
>> minute over prolonged periods, so this total starvation seems like a
>> bug.

> Yes, its a bug, but do you really believe the above? In what cases?

It doesn't take a whole lot of DDL to provoke an sinval overrun, if
the recipient process is just sitting idle and not servicing the
messages.  I think Jeff's concern is entirely valid.
        regards, tom lane



Re: AutoVacuum starvation from sinval messages

From
Jeff Janes
Date:
On Thu, Nov 8, 2012 at 2:50 PM, Hannu Krosing <hannu@2ndquadrant.com> wrote:
> On 11/08/2012 11:40 PM, Simon Riggs wrote:
>>
>> On 8 November 2012 20:36, Jeff Janes <jeff.janes@gmail.com> wrote:
>>
>>> It does not seem outrageous to me that there would be real-world
>>> conditions in which invalidations would be sent more than once a
>>> minute over prolonged periods, so this total starvation seems like a
>>> bug.
>>
>> Yes, its a bug, but do you really believe the above? In what cases?

We see lots of traffic on the mail list about people trying to dump
several hundred thousand tables, or they can only create one database
every two minutes, or truncating hundreds of tables at a time over and
over again gets slow, etc.  I know little about the internal of the
invalidation code, but I would think doing that kind of thing must
generate a lot of them.

>>
> Could heavy use of temporary tables cause this ?

It looks like they do.  I'm not sure what is being invalidated in
those cases, but something seems to be.

Cheers,

Jeff



Re: AutoVacuum starvation from sinval messages

From
Tom Lane
Date:
I wrote:
>> Hmm.  That seems like a problem, all right, but I don't believe the
>> claim that it applies only to the poll() code path.  On many platforms
>> the select() path would have the same issue.

> BTW, doesn't win32_latch.c have the identical problem?  I don't see it
> updating the timeout after a signal, either.

I've committed a patch for this, but the win32_latch version is entirely
innocent of any testing.  The buildfarm will probably show any mistake
there, but if someone would like to try Jeff's test case on Windows just
to make sure the autovac launcher wakes up on time, it wouldn't hurt.
        regards, tom lane



Re: AutoVacuum starvation from sinval messages

From
Simon Riggs
Date:
On 8 November 2012 23:58, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Thu, Nov 8, 2012 at 2:50 PM, Hannu Krosing <hannu@2ndquadrant.com> wrote:
>> On 11/08/2012 11:40 PM, Simon Riggs wrote:
>>>
>>> On 8 November 2012 20:36, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>
>>>> It does not seem outrageous to me that there would be real-world
>>>> conditions in which invalidations would be sent more than once a
>>>> minute over prolonged periods, so this total starvation seems like a
>>>> bug.
>>>
>>> Yes, its a bug, but do you really believe the above? In what cases?
>
> We see lots of traffic on the mail list about people trying to dump
> several hundred thousand tables, or they can only create one database
> every two minutes, or truncating hundreds of tables at a time over and
> over again gets slow, etc.  I know little about the internal of the
> invalidation code, but I would think doing that kind of thing must
> generate a lot of them.

OK, so the problem is *any* sinval. I thought you meant one sinval per
object per minute, which seemed much less likely.

I agree one sinval per minute for long periods is actually quite likely.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: AutoVacuum starvation from sinval messages

From
Robert Haas
Date:
On Thu, Nov 8, 2012 at 6:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
>> On 8 November 2012 20:36, Jeff Janes <jeff.janes@gmail.com> wrote:
>>> It does not seem outrageous to me that there would be real-world
>>> conditions in which invalidations would be sent more than once a
>>> minute over prolonged periods, so this total starvation seems like a
>>> bug.
>
>> Yes, its a bug, but do you really believe the above? In what cases?
>
> It doesn't take a whole lot of DDL to provoke an sinval overrun, if
> the recipient process is just sitting idle and not servicing the
> messages.  I think Jeff's concern is entirely valid.

So, do we need a sinval overrun or just a sinval message to provoke
starvation?  The former would be bad but the latter would be really,
really bad.  IIRC the queue has 4K entries, and IIRC a single DDL
operation might provoke a couple of sinvals, but I'm thinking that
somebody would probably have to be creating >1024 temp tables a minute
to overrun the queue, which is very possible but not necessarily
common.  OTOH, creating 1 temp table a minute would hit a much broader
swath of users.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: AutoVacuum starvation from sinval messages

From
Simon Riggs
Date:
On 9 November 2012 13:42, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Nov 8, 2012 at 6:23 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Simon Riggs <simon@2ndQuadrant.com> writes:
>>> On 8 November 2012 20:36, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>> It does not seem outrageous to me that there would be real-world
>>>> conditions in which invalidations would be sent more than once a
>>>> minute over prolonged periods, so this total starvation seems like a
>>>> bug.
>>
>>> Yes, its a bug, but do you really believe the above? In what cases?
>>
>> It doesn't take a whole lot of DDL to provoke an sinval overrun, if
>> the recipient process is just sitting idle and not servicing the
>> messages.  I think Jeff's concern is entirely valid.
>
> So, do we need a sinval overrun or just a sinval message to provoke
> starvation?  The former would be bad but the latter would be really,
> really bad.  IIRC the queue has 4K entries, and IIRC a single DDL
> operation might provoke a couple of sinvals, but I'm thinking that
> somebody would probably have to be creating >1024 temp tables a minute
> to overrun the queue, which is very possible but not necessarily
> common.  OTOH, creating 1 temp table a minute would hit a much broader
> swath of users.

The point is moot because latches don't work that way anymore.

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: AutoVacuum starvation from sinval messages

From
Robert Haas
Date:
On Fri, Nov 9, 2012 at 9:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> So, do we need a sinval overrun or just a sinval message to provoke
>> starvation?  The former would be bad but the latter would be really,
>> really bad.  IIRC the queue has 4K entries, and IIRC a single DDL
>> operation might provoke a couple of sinvals, but I'm thinking that
>> somebody would probably have to be creating >1024 temp tables a minute
>> to overrun the queue, which is very possible but not necessarily
>> common.  OTOH, creating 1 temp table a minute would hit a much broader
>> swath of users.
>
> The point is moot because latches don't work that way anymore.

One of us is confused, because IIUC Tom just fixed this this morning,
and I'm trying to figure out how many users will be affected by it,
and how seriously.  Like, do we need an immediate minor release?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: AutoVacuum starvation from sinval messages

From
Simon Riggs
Date:
On 9 November 2012 14:16, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Nov 9, 2012 at 9:02 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> So, do we need a sinval overrun or just a sinval message to provoke
>>> starvation?  The former would be bad but the latter would be really,
>>> really bad.  IIRC the queue has 4K entries, and IIRC a single DDL
>>> operation might provoke a couple of sinvals, but I'm thinking that
>>> somebody would probably have to be creating >1024 temp tables a minute
>>> to overrun the queue, which is very possible but not necessarily
>>> common.  OTOH, creating 1 temp table a minute would hit a much broader
>>> swath of users.
>>
>> The point is moot because latches don't work that way anymore.
>
> One of us is confused, because IIUC Tom just fixed this this morning,
> and I'm trying to figure out how many users will be affected by it,
> and how seriously.  Like, do we need an immediate minor release?

You asked what provokes starvation, and the answer is nothing anymore,
since Tom's commit. No confusion here...

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



Re: AutoVacuum starvation from sinval messages

From
"Kevin Grittner"
Date:
Simon Riggs wrote:
> Robert Haas <robertmhaas@gmail.com> wrote:

>> One of us is confused, because IIUC Tom just fixed this this
>> morning, and I'm trying to figure out how many users will be
>> affected by it, and how seriously. Like, do we need an immediate
>> minor release?
> 
> You asked what provokes starvation, and the answer is nothing
> anymore, since Tom's commit. No confusion here...

So for all those end users running production with a build compiled
from HEAD after Tom's commit this morning there is no issue. I'm
wondering about those using something else in production. What does
it take for their autovacuum to be stalled?

-Kevin



Re: AutoVacuum starvation from sinval messages

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> So, do we need a sinval overrun or just a sinval message to provoke
> starvation?  The former would be bad but the latter would be really,
> really bad.

The former.  Actually, a signal will be sent as soon as a backend is
determined to be unreasonably far behind, which I think is when the
message queue is half full.  In the AV launcher case, the queue probably
never actually overflows; but the launcher has to wake up to process
messages every so often, and that wakeup cycle is breaking the timeout
management in WaitLatch.

> IIRC the queue has 4K entries, and IIRC a single DDL
> operation might provoke a couple of sinvals, but I'm thinking that
> somebody would probably have to be creating >1024 temp tables a minute
> to overrun the queue, which is very possible but not necessarily
> common.

Well, one DDL typically generates multiple messages --- one for each
catalog row added/modified/removed, roughly speaking.  When I run the
constant create/drop example Jeff posted, I see the AV launcher getting
a catchup signal every few seconds.  I didn't try to determine exactly
how many create/drop cycles that was, but I'm pretty sure it's a lot
less than 1000.
        regards, tom lane



Re: AutoVacuum starvation from sinval messages

From
Alvaro Herrera
Date:
Tom Lane escribió:
> Robert Haas <robertmhaas@gmail.com> writes:

> > IIRC the queue has 4K entries, and IIRC a single DDL
> > operation might provoke a couple of sinvals, but I'm thinking that
> > somebody would probably have to be creating >1024 temp tables a minute
> > to overrun the queue, which is very possible but not necessarily
> > common.
>
> Well, one DDL typically generates multiple messages --- one for each
> catalog row added/modified/removed, roughly speaking.  When I run the
> constant create/drop example Jeff posted, I see the AV launcher getting
> a catchup signal every few seconds.  I didn't try to determine exactly
> how many create/drop cycles that was, but I'm pretty sure it's a lot
> less than 1000.

Just creating the sequence for the serial column means 16 pg_attribute
tuples.  There's also two pg_class entries, one more pg_attribute, two
pg_type entries, a bunch of pg_depend entries ... I doubt it's less than
30 catalog tuples, all things considered.  Double that for the drop.  So
for a 4k entry table that needs to get 50% full, that's only ~35 temp
table creations like that.

I hadn't realized sequences used so many pg_attribute entries.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: AutoVacuum starvation from sinval messages

From
Robert Haas
Date:
On Fri, Nov 9, 2012 at 10:08 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Tom Lane escribió:
>> Robert Haas <robertmhaas@gmail.com> writes:
>
>> > IIRC the queue has 4K entries, and IIRC a single DDL
>> > operation might provoke a couple of sinvals, but I'm thinking that
>> > somebody would probably have to be creating >1024 temp tables a minute
>> > to overrun the queue, which is very possible but not necessarily
>> > common.
>>
>> Well, one DDL typically generates multiple messages --- one for each
>> catalog row added/modified/removed, roughly speaking.  When I run the
>> constant create/drop example Jeff posted, I see the AV launcher getting
>> a catchup signal every few seconds.  I didn't try to determine exactly
>> how many create/drop cycles that was, but I'm pretty sure it's a lot
>> less than 1000.
>
> Just creating the sequence for the serial column means 16 pg_attribute
> tuples.  There's also two pg_class entries, one more pg_attribute, two
> pg_type entries, a bunch of pg_depend entries ... I doubt it's less than
> 30 catalog tuples, all things considered.  Double that for the drop.  So
> for a 4k entry table that needs to get 50% full, that's only ~35 temp
> table creations like that.
>
> I hadn't realized sequences used so many pg_attribute entries.

Hmm.  So, are we going to force a minor release for this, or do we
think it's not serious enough to warrant that?

I'm not expressing an opinion either way, just asking.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company