Thread: widen vacuum buffer counters

widen vacuum buffer counters

From
Alvaro Herrera
Date:
We recently noticed that vacuum buffer counters wraparound in extreme
cases, with ridiculous results.  Example:

2020-01-06 16:38:38.010 EST [45625-1] app= LOG:  automatic vacuum of table "somtab.sf.foobar": index scans: 17
        pages: 0 removed, 207650641 remain, 0 skipped due to pins, 13419403 skipped frozen
        tuples: 141265419 removed, 3186614627 remain, 87783760 are dead but not yet removable
        buffer usage: -2022059267 hits, -17141881 misses, 1252507767 dirtied
        avg read rate: -0.043 MB/s, avg write rate: 3.146 MB/s
        system usage: CPU 107819.92s/2932957.75u sec elapsed 3110498.10 sec

That's to be expected, as tables exist that are large enough for 4 billion
buffer accesses to be a possibility.  Let's widen the counters, as in the
attached patch.

I propose to backpatch this.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: widen vacuum buffer counters

From
Julien Rouhaud
Date:
On Fri, Jan 31, 2020 at 9:59 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
>
> We recently noticed that vacuum buffer counters wraparound in extreme
> cases, with ridiculous results.  Example:
>
> 2020-01-06 16:38:38.010 EST [45625-1] app= LOG:  automatic vacuum of table "somtab.sf.foobar": index scans: 17
>         pages: 0 removed, 207650641 remain, 0 skipped due to pins, 13419403 skipped frozen
>         tuples: 141265419 removed, 3186614627 remain, 87783760 are dead but not yet removable
>         buffer usage: -2022059267 hits, -17141881 misses, 1252507767 dirtied
>         avg read rate: -0.043 MB/s, avg write rate: 3.146 MB/s
>         system usage: CPU 107819.92s/2932957.75u sec elapsed 3110498.10 sec
>
> That's to be expected, as tables exist that are large enough for 4 billion
> buffer accesses to be a possibility.  Let's widen the counters, as in the
> attached patch.
>
> I propose to backpatch this.

+1, and patch LGTM.



Re: widen vacuum buffer counters

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> We recently noticed that vacuum buffer counters wraparound in extreme
> cases, with ridiculous results.

Ugh.

> I propose to backpatch this.

+1 for widening these counters, but since they're global variables, -0.2
or so for back-patching.  I don't know of any reason that an extension
would be touching these, but I feel like the problem isn't severe enough
to justify taking an ABI-break risk.

Also, %zd is the wrong format code for int64.  Recommended practice
these days is to use "%lld" with an explicit cast of the printf argument
to long long (just to be sure).  That doesn't work safely before v12,
and if you did insist on back-patching further, you'd need to jump
through hoops to avoid having platform-specific format codes in a
translatable string.  (The side-effects for translation seem like
an independent argument against back-patching.)

            regards, tom lane



Re: widen vacuum buffer counters

From
Michael Paquier
Date:
On Fri, Jan 31, 2020 at 05:13:53PM -0500, Tom Lane wrote:
> +1 for widening these counters, but since they're global variables, -0.2
> or so for back-patching.  I don't know of any reason that an extension
> would be touching these, but I feel like the problem isn't severe enough
> to justify taking an ABI-break risk.

I would not recommend doing a back-patch because of that.  I don't
think that's worth taking any risk.  Extension authors can have a lot
of imagination.

> Also, %zd is the wrong format code for int64.  Recommended practice
> these days is to use "%lld" with an explicit cast of the printf argument
> to long long (just to be sure).  That doesn't work safely before v12,
> and if you did insist on back-patching further, you'd need to jump
> through hoops to avoid having platform-specific format codes in a
> translatable string.  (The side-effects for translation seem like
> an independent argument against back-patching.)

Surely you meant INT64_FORMAT here?  Anyway, looking at the patch,
couldn't we just use uint64?
--
Michael

Attachment

Re: widen vacuum buffer counters

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> On Fri, Jan 31, 2020 at 05:13:53PM -0500, Tom Lane wrote:
>> Also, %zd is the wrong format code for int64.  Recommended practice
>> these days is to use "%lld" with an explicit cast of the printf argument
>> to long long (just to be sure).  That doesn't work safely before v12,
>> and if you did insist on back-patching further, you'd need to jump
>> through hoops to avoid having platform-specific format codes in a
>> translatable string.  (The side-effects for translation seem like
>> an independent argument against back-patching.)

> Surely you meant INT64_FORMAT here?

No, because that varies depending on platform, so using it in a
translatable string is a bad idea.  See e.g. 6a1cd8b92.

> Anyway, looking at the patch,
> couldn't we just use uint64?

Yeah, I was wondering if those counters shouldn't be unsigned, too.
Probably doesn't matter once we widen them to 64 bits though.

            regards, tom lane



Re: widen vacuum buffer counters

From
Alvaro Herrera
Date:
On 2020-Jan-31, Tom Lane wrote:

> Also, %zd is the wrong format code for int64.  Recommended practice
> these days is to use "%lld" with an explicit cast of the printf argument
> to long long (just to be sure).  That doesn't work safely before v12,
> and if you did insist on back-patching further, you'd need to jump
> through hoops to avoid having platform-specific format codes in a
> translatable string.  (The side-effects for translation seem like
> an independent argument against back-patching.)

Pushed with that change; did not backpatch, because I don't think it's
really worth the possible breakage :-)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services