Thread: Option to not use ringbuffer in VACUUM, using it in failsafe mode

Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Andres Freund
Date:
Hi,

The use of the ringbuffer in VACUUM often causes very substantial slowdowns.

The primary reason for that is that often most/all the buffers in the
ringbuffer have been dirtied when they were processed, with an associated WAL
record. When we then reuse the buffer via the (quite small) ringbuffer, we
need to flush the WAL before reclaiming the buffer.  A synchronous flush of
the WAL every few buffers ends up as a very significant bottleneck, unless you
have local low-latency durable storage (i.e. local storage with some sort of
power protection).

The slowdown caused by the frequent WAL flushes is very significant.

A secondary issue, when we end up doing multiple passes, we'll have to re-read
data into shared_buffers, when we just wrote it out and evicted it.


An example:

On a local SSD with decently fast fdatasync([1]). Table size is 3322MB, with ~10%
updated, ~30% deleted tuples, and a single index. m_w_m is large enough to do
this in one pass.  I used pg_prewarm() of another relation to ensure the
vacuumed table isn't in s_b (otherwise ringbuffers aren't doing anything).

s_b     ringbuffer enabled      time     wal_syncs wal_sync_time
128MB   1                       77797ms
128MB   0                       13676ms       241         2538ms
8GB     1                       72834ms     23976        51989ms
8GB     0                        9544ms       150         1634ms


see [2] for logs / stats of the 8GB run. All the data here is in the OS page
cache, so we don't even pay the real-price for reading the data multiple
times.


On cloud hardware with higher fsync latency I've seen > 15x time differences
between using the ringbuffers and avoiding them by using pg_prewarm.


Of course there's a good reason we have the ringbuffer - we don't want
maintenance operations to completely disturb the buffer pool and harm the
production workload. But if, e.g., the database isn't available due to
anti-wraparound measures, there's no other workload to protect, and the
ringbuffer substantially reduces availability. Initial data loads could
similarly benefit.


Therefore I'd like to add an option to the VACUUM command to use to disable
the use of the ringbuffer. Not sure about the name yet.


I think we should auto-enable that mode once we're using the failsafe mode,
similar to [auto]vacuum cost delays getting disabled
(c.f. lazy_check_wraparound_failsafe()). If things are bad enough that we're
soon going to shut down, we want to be aggressive.


Greetings,

Andres Freund

[1] according to pg_test_fsync:
fdatasync                           769.189 ops/sec    1300 usecs/op


[2]

For the s_b 128MB case:

ringbuffers enabled:

2023-01-11 10:24:58.726 PST [355353][client backend][2/19:0][psql] INFO:  aggressively vacuuming
"postgres.public.copytest_0"
2023-01-11 10:26:19.488 PST [355353][client backend][2/19:0][psql] INFO:  finished vacuuming
"postgres.public.copytest_0":index scans: 1
 
    pages: 0 removed, 424975 remain, 424975 scanned (100.00% of total)
    tuples: 4333300 removed, 6666700 remain, 0 are dead but not yet removable
    removable cutoff: 2981, which was 0 XIDs old when operation ended
    new relfrozenxid: 2981, which is 102 XIDs ahead of previous value
    frozen: 424975 pages from table (100.00% of total) had 6666700 tuples frozen
    index scan needed: 424975 pages from table (100.00% of total) had 4325101 dead item identifiers removed
    index "copytest_0_id_idx": pages: 10032 in total, 0 newly deleted, 0 currently deleted, 0 reusable
    I/O timings: read: 2284.292 ms, write: 4325.009 ms
    avg read rate: 83.203 MB/s, avg write rate: 83.199 MB/s
    buffer usage: 425044 hits, 860113 misses, 860074 dirtied
    WAL usage: 1709902 records, 434990 full page images, 2273501683 bytes
    system usage: CPU: user: 11.62 s, system: 11.86 s, elapsed: 80.76 s


┌─────────────┬─────────┬────────────┬──────────────────┬───────────┬──────────┬────────────────┬───────────────┬───────────────────────────────┐
│ wal_records │ wal_fpi │ wal_bytes  │ wal_buffers_full │ wal_write │ wal_sync │ wal_write_time │ wal_sync_time │
  stats_reset          │
 

├─────────────┼─────────┼────────────┼──────────────────┼───────────┼──────────┼────────────────┼───────────────┼───────────────────────────────┤
│     8092795 │  443356 │ 2999358740 │             1569 │     28651 │    27081 │       1874.391 │     59895.674 │
2023-01-1110:24:58.664859-08 │
 

└─────────────┴─────────┴────────────┴──────────────────┴───────────┴──────────┴────────────────┴───────────────┴───────────────────────────────┘


ringbuffers disabled:

2023-01-11 10:23:05.081 PST [355054][client backend][2/19:0][psql] INFO:  aggressively vacuuming
"postgres.public.copytest_0"
2023-01-11 10:23:18.755 PST [355054][client backend][2/19:0][psql] INFO:  finished vacuuming
"postgres.public.copytest_0":index scans: 1
 
    pages: 0 removed, 424979 remain, 424979 scanned (100.00% of total)
    tuples: 4333300 removed, 6666700 remain, 0 are dead but not yet removable
    removable cutoff: 2879, which was 0 XIDs old when operation ended
    new relfrozenxid: 2879, which is 102 XIDs ahead of previous value
    frozen: 424979 pages from table (100.00% of total) had 6666700 tuples frozen
    index scan needed: 424979 pages from table (100.00% of total) had 4325176 dead item identifiers removed
    index "copytest_0_id_idx": pages: 10032 in total, 0 newly deleted, 0 currently deleted, 0 reusable
    I/O timings: read: 1247.366 ms, write: 2888.756 ms
    avg read rate: 491.485 MB/s, avg write rate: 491.395 MB/s
    buffer usage: 424927 hits, 860242 misses, 860083 dirtied
    WAL usage: 1709918 records, 434994 full page images, 2273503049 bytes
    system usage: CPU: user: 5.42 s, system: 6.26 s, elapsed: 13.67 s


┌─────────────┬─────────┬────────────┬──────────────────┬───────────┬──────────┬────────────────┬───────────────┬───────────────────────────────┐
│ wal_records │ wal_fpi │ wal_bytes  │ wal_buffers_full │ wal_write │ wal_sync │ wal_write_time │ wal_sync_time │
  stats_reset          │
 

├─────────────┼─────────┼────────────┼──────────────────┼───────────┼──────────┼────────────────┼───────────────┼───────────────────────────────┤
│     8092963 │  443362 │ 2999373996 │           212190 │    212333 │      241 │       1209.516 │      2538.706 │
2023-01-1110:23:05.004783-08 │
 

└─────────────┴─────────┴────────────┴──────────────────┴───────────┴──────────┴────────────────┴───────────────┴───────────────────────────────┘



For the s_b 8GB case:

ringbuffers enabled:

2023-01-11 10:04:12.479 PST [352665][client backend][2/19:0][psql] INFO:  aggressively vacuuming
"postgres.public.copytest_0"
2023-01-11 10:05:25.312 PST [352665][client backend][2/19:0][psql] INFO:  finished vacuuming
"postgres.public.copytest_0":index scans: 1
 
    pages: 0 removed, 424977 remain, 424977 scanned (100.00% of total)
    tuples: 4333300 removed, 6666700 remain, 0 are dead but not yet removable
    removable cutoff: 2675, which was 0 XIDs old when operation ended
    new relfrozenxid: 2675, which is 102 XIDs ahead of previous value
    frozen: 424977 pages from table (100.00% of total) had 6666700 tuples frozen
    index scan needed: 424977 pages from table (100.00% of total) had 4325066 dead item identifiers removed
    index "copytest_0_id_idx": pages: 10032 in total, 0 newly deleted, 0 currently deleted, 0 reusable
    I/O timings: read: 2610.875 ms, write: 4177.842 ms
    avg read rate: 81.688 MB/s, avg write rate: 87.515 MB/s
    buffer usage: 523611 hits, 761552 misses, 815868 dirtied
    WAL usage: 1709910 records, 434992 full page images, 2273502729 bytes
    system usage: CPU: user: 11.00 s, system: 11.86 s, elapsed: 72.83 s

┌─────────────┬─────────┬────────────┬──────────────────┬───────────┬──────────┬────────────────┬───────────────┬───────────────────────────────┐
│ wal_records │ wal_fpi │ wal_bytes  │ wal_buffers_full │ wal_write │ wal_sync │ wal_write_time │ wal_sync_time │
  stats_reset          │
 

├─────────────┼─────────┼────────────┼──────────────────┼───────────┼──────────┼────────────────┼───────────────┼───────────────────────────────┤
│     8092707 │  443358 │ 2999354090 │            42259 │     66227 │    23976 │       2050.963 │     51989.099 │
2023-01-1110:04:12.404054-08 │
 

└─────────────┴─────────┴────────────┴──────────────────┴───────────┴──────────┴────────────────┴───────────────┴───────────────────────────────┘


ringbuffers disabled:

2023-01-11 10:08:48.414 PST [353287][client backend][3/19:0][psql] INFO:  aggressively vacuuming
"postgres.public.copytest_0"
2023-01-11 10:08:57.956 PST [353287][client backend][3/19:0][psql] INFO:  finished vacuuming
"postgres.public.copytest_0":index scans: 1
 
    pages: 0 removed, 424977 remain, 424977 scanned (100.00% of total)
    tuples: 4333300 removed, 6666700 remain, 0 are dead but not yet removable
    removable cutoff: 2777, which was 0 XIDs old when operation ended
    new relfrozenxid: 2777, which is 102 XIDs ahead of previous value
    frozen: 424977 pages from table (100.00% of total) had 6666700 tuples frozen
    index scan needed: 424976 pages from table (100.00% of total) had 4325153 dead item identifiers removed
    index "copytest_0_id_idx": pages: 10032 in total, 0 newly deleted, 0 currently deleted, 0 reusable
    I/O timings: read: 1040.230 ms, write: 0.000 ms
    avg read rate: 312.016 MB/s, avg write rate: 356.242 MB/s
    buffer usage: 904078 hits, 381084 misses, 435101 dirtied
    WAL usage: 1709908 records, 434992 full page images, 2273499663 bytes
    system usage: CPU: user: 5.57 s, system: 2.26 s, elapsed: 9.54 s


┌─────────────┬─────────┬────────────┬──────────────────┬───────────┬──────────┬────────────────┬───────────────┬───────────────────────────────┐
│ wal_records │ wal_fpi │ wal_bytes  │ wal_buffers_full │ wal_write │ wal_sync │ wal_write_time │ wal_sync_time │
  stats_reset          │
 

├─────────────┼─────────┼────────────┼──────────────────┼───────────┼──────────┼────────────────┼───────────────┼───────────────────────────────┤
│     8092933 │  443358 │ 2999364596 │           236354 │    236398 │      150 │       1166.314 │      1634.408 │
2023-01-1110:08:48.350328-08 │
 

└─────────────┴─────────┴────────────┴──────────────────┴───────────┴──────────┴────────────────┴───────────────┴───────────────────────────────┘



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Peter Geoghegan
Date:
On Wed, Jan 11, 2023 at 10:27 AM Andres Freund <andres@anarazel.de> wrote:
> Therefore I'd like to add an option to the VACUUM command to use to disable
> the use of the ringbuffer. Not sure about the name yet.

Sounds like a good idea.

> I think we should auto-enable that mode once we're using the failsafe mode,
> similar to [auto]vacuum cost delays getting disabled
> (c.f. lazy_check_wraparound_failsafe()). If things are bad enough that we're
> soon going to shut down, we want to be aggressive.

+1

-- 
Peter Geoghegan



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Andres Freund
Date:
Hi,

On 2023-01-11 10:27:20 -0800, Andres Freund wrote:
> On cloud hardware with higher fsync latency I've seen > 15x time differences
> between using the ringbuffers and avoiding them by using pg_prewarm.

A slightly edited version of what I've in the past to defeat the ringbuffers
using pg_prewarm, as I think it might be useful for others:

WITH what_rel AS (
  SELECT 'copytest_0'::regclass AS vacuum_me
),
what_to_prefetch AS (
    SELECT vacuum_me, greatest(heap_blks_total - 1, 0) AS last_block,
        CASE WHEN phase = 'scanning heap' THEN heap_blks_scanned ELSE heap_blks_vacuumed END AS current_pos
    FROM what_rel, pg_stat_progress_vacuum
    WHERE relid = vacuum_me AND phase IN ('scanning heap', 'vacuuming heap')
)
SELECT
    vacuum_me, current_pos,
    pg_prewarm(vacuum_me, 'buffer', 'main', current_pos, least(current_pos + 10000, last_block))
FROM what_to_prefetch
\watch 0.1

Having this running in the background brings the s_b=128MB, ringbuffer enabled
case down from 77797ms to 14838ms. Close to the version with the ringbuffer
disabled.


Unfortunately, afaik, that trick isn't currently possible for the index vacuum
phase, as we don't yet expose the current scan position. And not every index
might be as readily prefetchable as just prefetching the next 10k blocks from
the current position.

That's not too bad if your indexes are small, but unfortunately that's not
always the case...

Greetings,

Andres Freund



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Andres Freund
Date:
Hi,

On 2023-01-11 10:35:19 -0800, Peter Geoghegan wrote:
> On Wed, Jan 11, 2023 at 10:27 AM Andres Freund <andres@anarazel.de> wrote:
> > Therefore I'd like to add an option to the VACUUM command to use to disable
> > the use of the ringbuffer. Not sure about the name yet.
> 
> Sounds like a good idea.

Any idea about the name? The obvious thing is to reference ring buffers in the
option name, but that's more of an implementation detail...

Some ideas:

USE_RING_BUFFERS on|off
SCAN_PROTECTION on|off
REUSE_BUFFERS on|off
LIMIT_BUFFER_USAGE on|off

Regards,

Andres



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Peter Geoghegan
Date:
On Wed, Jan 11, 2023 at 10:58 AM Andres Freund <andres@anarazel.de> wrote:
> Any idea about the name? The obvious thing is to reference ring buffers in the
> option name, but that's more of an implementation detail...

What are the chances that anybody using this feature via a manual
VACUUM command will also use INDEX_CLEANUP off? It's not really
supposed to be used routinely, at all. Right? It's just for
emergencies.

Perhaps it can be tied to INDEX_CLEANUP=off? That makes it hard to get
just the behavior you want when testing VACUUM, but maybe that doesn't
matter.

Realistically, most of the value here comes from changing the failsafe
behavior, which doesn't require the user to know anything about
VACUUM. I know that AWS has reduced the vacuum_failsafe_age default on
RDS to 1.2 billion (a decision made before I joined Amazon), so it is
already something AWS lean on quite a bit where available.

-- 
Peter Geoghegan



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Andres Freund
Date:
Hi,

On 2023-01-11 11:06:26 -0800, Peter Geoghegan wrote:
> On Wed, Jan 11, 2023 at 10:58 AM Andres Freund <andres@anarazel.de> wrote:
> > Any idea about the name? The obvious thing is to reference ring buffers in the
> > option name, but that's more of an implementation detail...
> 
> What are the chances that anybody using this feature via a manual
> VACUUM command will also use INDEX_CLEANUP off? It's not really
> supposed to be used routinely, at all. Right? It's just for
> emergencies.

I think it's also quite useful for e.g. vacuuming after initial data loads or
if you need to do a first vacuum after a lot of bloat accumulated due to a
stuck transaction.


> Perhaps it can be tied to INDEX_CLEANUP=off? That makes it hard to get
> just the behavior you want when testing VACUUM, but maybe that doesn't
> matter.

I don't like that - it's also quite useful to disable use of ringbuffers when
you actually need to clean up indexes. Especially when we have a lot of dead
tuples we'll rescan indexes over and over...

Greetings,

Andres Freund



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Peter Geoghegan
Date:
On Wed, Jan 11, 2023 at 11:18 AM Andres Freund <andres@anarazel.de> wrote:
> I don't like that - it's also quite useful to disable use of ringbuffers when
> you actually need to clean up indexes. Especially when we have a lot of dead
> tuples we'll rescan indexes over and over...

That's a fair point.

My vote goes to "REUSE_BUFFERS", then.

-- 
Peter Geoghegan



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Justin Pryzby
Date:
On Wed, Jan 11, 2023 at 10:58:54AM -0800, Andres Freund wrote:
> Hi,
> 
> On 2023-01-11 10:35:19 -0800, Peter Geoghegan wrote:
> > On Wed, Jan 11, 2023 at 10:27 AM Andres Freund <andres@anarazel.de> wrote:
> > > Therefore I'd like to add an option to the VACUUM command to use to disable
> > > the use of the ringbuffer. Not sure about the name yet.
> > 
> > Sounds like a good idea.
> 
> Any idea about the name? The obvious thing is to reference ring buffers in the
> option name, but that's more of an implementation detail...
> 
> Some ideas:
> 
> USE_RING_BUFFERS on|off
> REUSE_BUFFERS on|off

+1 for either of these.

I don't think it's an issue to expose implementation details here.
Anyone who wants to change this will know about the implementation
details that they're changing, and it's better to refer to it by the
same/similar name and not by some other name that's hard to find.

BTW I can't see that the ring buffer is currently exposed in any
user-facing docs for COPY/ALTER/VACUUM/CREATE ?

-- 
Justin



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Andres Freund
Date:
Hi,

On 2023-01-11 14:38:34 -0600, Justin Pryzby wrote:
> On Wed, Jan 11, 2023 at 10:58:54AM -0800, Andres Freund wrote:
> > Some ideas:
> > 
> > USE_RING_BUFFERS on|off
> > REUSE_BUFFERS on|off
> 
> +1 for either of these.

Then I'd go for REUSE_BUFFERS. What made you prefer it over
LIMIT_BUFFER_USAGE?

USE_BUFFER_ACCESS_STRATEGY would be a name tied to the implementation that's
not awful, I think..


> I don't think it's an issue to expose implementation details here.
> Anyone who wants to change this will know about the implementation
> details that they're changing, and it's better to refer to it by the
> same/similar name and not by some other name that's hard to find.

A ringbuffer could refer to a lot of things other than something limiting
buffer usage, that's why I don't like it.


> BTW I can't see that the ring buffer is currently exposed in any
> user-facing docs for COPY/ALTER/VACUUM/CREATE ?

Yea, there's surprisingly little in the docs about it, given how much it
influences behaviour. It's mentioned in tablesample-method.sgml, but without
explanation - and it's a page documenting C API...

Greetings,

Andres Freund



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Tom Lane
Date:
Peter Geoghegan <pg@bowt.ie> writes:
> On Wed, Jan 11, 2023 at 11:18 AM Andres Freund <andres@anarazel.de> wrote:
>> I don't like that - it's also quite useful to disable use of ringbuffers when
>> you actually need to clean up indexes. Especially when we have a lot of dead
>> tuples we'll rescan indexes over and over...

> That's a fair point.

> My vote goes to "REUSE_BUFFERS", then.

I wonder whether it could make sense to allow a larger ringbuffer size,
rather than just the limit cases of "on" and "off".

            regards, tom lane



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Andres Freund
Date:
Hi,

On 2023-01-11 16:18:34 -0500, Tom Lane wrote:
> Peter Geoghegan <pg@bowt.ie> writes:
> > On Wed, Jan 11, 2023 at 11:18 AM Andres Freund <andres@anarazel.de> wrote:
> >> I don't like that - it's also quite useful to disable use of ringbuffers when
> >> you actually need to clean up indexes. Especially when we have a lot of dead
> >> tuples we'll rescan indexes over and over...
> 
> > That's a fair point.
> 
> > My vote goes to "REUSE_BUFFERS", then.
> 
> I wonder whether it could make sense to allow a larger ringbuffer size,
> rather than just the limit cases of "on" and "off".

I can see that making sense, particularly if we were to later extend this to
other users of ringbuffers. E.g. COPYs us of the ringbuffer makes loading of
data > 16MB but also << s_b vastly slower, but it can still be very important
to use if there's lots of parallel processes loading data.

Maybe BUFFER_USAGE_LIMIT, with a value from -1 to N, with -1 indicating the
default value, 0 preventing use of a buffer access strategy, and 1..N
indicating the size in blocks?

Would we want to set an upper limit lower than implied by the memory limit for
the BufferAccessStrategy allocation?

Greetings,

Andres Freund



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
"David G. Johnston"
Date:
On Wed, Jan 11, 2023 at 2:39 PM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2023-01-11 16:18:34 -0500, Tom Lane wrote:
> Peter Geoghegan <pg@bowt.ie> writes:
> > On Wed, Jan 11, 2023 at 11:18 AM Andres Freund <andres@anarazel.de> wrote:
> >> I don't like that - it's also quite useful to disable use of ringbuffers when
> >> you actually need to clean up indexes. Especially when we have a lot of dead
> >> tuples we'll rescan indexes over and over...
>
> > That's a fair point.
>
> > My vote goes to "REUSE_BUFFERS", then.
>
> I wonder whether it could make sense to allow a larger ringbuffer size,
> rather than just the limit cases of "on" and "off".

I can see that making sense, particularly if we were to later extend this to
other users of ringbuffers. E.g. COPYs us of the ringbuffer makes loading of
data > 16MB but also << s_b vastly slower, but it can still be very important
to use if there's lots of parallel processes loading data.


Should we just add "ring_buffers" to the existing "shared_buffers" and "temp_buffers" settings?

Then give VACUUM a (BUFFER_POOL=ring*|shared) option?

I think making DBAs aware of this dynamic and making the ring buffer usage user-facing is beneficial in its own right (at least, the concept that changes done by vacuum don't impact shared_buffers, regardless of how that non-impact manifests).  But I don't see much benefit trying to come up with a different name.

David J.

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Andres Freund
Date:
Hi,

On 2023-01-11 17:26:19 -0700, David G. Johnston wrote:
> Should we just add "ring_buffers" to the existing "shared_buffers" and
> "temp_buffers" settings?

The different types of ring buffers have different sizes, for good reasons. So
I don't see that working well. I also think it'd be more often useful to
control this on a statement basis - if you have a parallel import tool that
starts NCPU COPYs you'd want a smaller buffer than a single threaded COPY. Of
course each session can change the ring buffer settings, but still.


> Then give VACUUM a (BUFFER_POOL=ring*|shared) option?

That seems likely to mislead, because it'd still use shared buffers when the
blocks are already present. The ring buffers aren't a separate buffer pool,
they're a subset of the normal bufferpool. Lookup is done normally, only when
a page isn't found, the search for a victim buffer first tries to use a buffer
from the ring.


> I think making DBAs aware of this dynamic and making the ring buffer usage
> user-facing is beneficial in its own right (at least, the concept that
> changes done by vacuum don't impact shared_buffers, regardless of how that
> non-impact manifests).

VACUUM can end up dirtying all of shared buffers, even with the ring buffer in
use...

Greetings,

Andres Freund



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
Hi,

So, I attached a rough implementation of both the autovacuum failsafe
reverts to shared buffers and the vacuum option (no tests or docs or
anything).

The first three patches in the set are just for enabling use of shared
buffers in failsafe mode for autovacuum. I haven't actually ensured it
works (i.e. triggering failsafe mode and checking the stats for whether
or not shared buffers were used).

I was wondering about the status of the autovacuum wraparound failsafe
test suggested in [1]. I don't see it registered for the March's
commitfest. I'll probably review it since it will be useful for this
patchset.

The first patch in the set is to free the BufferAccessStrategy object
that is made in do_autovacuum() -- I don't see when the memory context
it is allocated in is destroyed, so it seems like it might be a leak?

The last patch in the set is a trial implementation of the VACUUM option
suggested -- BUFFER_USAGE_LIMIT. More on that below.

On Wed, Jan 11, 2023 at 4:39 PM Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2023-01-11 16:18:34 -0500, Tom Lane wrote:
> Peter Geoghegan <pg@bowt.ie> writes:
> > On Wed, Jan 11, 2023 at 11:18 AM Andres Freund <andres@anarazel.de> wrote:
> >> I don't like that - it's also quite useful to disable use of ringbuffers when
> >> you actually need to clean up indexes. Especially when we have a lot of dead
> >> tuples we'll rescan indexes over and over...
>
> > That's a fair point.
>
> > My vote goes to "REUSE_BUFFERS", then.
>
> I wonder whether it could make sense to allow a larger ringbuffer size,
> rather than just the limit cases of "on" and "off".

I can see that making sense, particularly if we were to later extend this to
other users of ringbuffers. E.g. COPYs us of the ringbuffer makes loading of
data > 16MB but also << s_b vastly slower, but it can still be very important
to use if there's lots of parallel processes loading data.

Maybe BUFFER_USAGE_LIMIT, with a value from -1 to N, with -1 indicating the
default value, 0 preventing use of a buffer access strategy, and 1..N
indicating the size in blocks?


I have found the implementation you suggested very hard to use.
The attached fourth patch in the set implements it the way you suggest.
I had to figure out what number to set the BUFER_USAGE_LIMIT to -- and,
since I don't specify shared buffers in units of nbuffer, it's pretty
annoying to have to figure out a valid number.

I think that it would be better to have it be either a percentage of
shared buffers or a size in units of bytes/kb/mb like that of shared
buffers.

Using a fraction or percentage appeals to me because you don't need to
reference your shared buffers setting and calculate what size you want
to set it to. Also, parsing the size in different units sounds like more work.

Unfortunately, the fraction doesn't really work if we cap the ring size
of a buffer access strategy to NBuffers / 8. Also, there are other
issues like what would 0% and 100% mean.

I have a list of other questions, issues, and TODOs related to the code
I wrote to implement BUFFER_USAGE_LIMIT, but I'm not sure those are
worth discussing until we shape up the interface.
 
Would we want to set an upper limit lower than implied by the memory limit for
the BufferAccessStrategy allocation?


So, I was wondering what you thought about NBuffers / 8 (the current
limit). Does it make sense?

If we clamp the user-specified value to this, I think we definitely need
to inform them through some kind of logging or message. I am sure there
are lots of other gucs doing this -- do you know any off the top of your
head?

- Melanie

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Andres Freund
Date:
Hi,

On 2023-02-22 16:32:53 -0500, Melanie Plageman wrote:
> I was wondering about the status of the autovacuum wraparound failsafe
> test suggested in [1]. I don't see it registered for the March's
> commitfest. I'll probably review it since it will be useful for this
> patchset.

It's pretty hard to make it work reliably. I was suggesting somewhere that we
ought to add a EMERGENCY parameter to manual VACUUMs to allow testing that
path a tad more easily.


> The first patch in the set is to free the BufferAccessStrategy object
> that is made in do_autovacuum() -- I don't see when the memory context
> it is allocated in is destroyed, so it seems like it might be a leak?

The backend shuts down just after, so that's not a real issue. Not that it'd
hurt to fix it.


> > I can see that making sense, particularly if we were to later extend this
> > to
> > other users of ringbuffers. E.g. COPYs us of the ringbuffer makes loading
> > of
> > data > 16MB but also << s_b vastly slower, but it can still be very
> > important
> > to use if there's lots of parallel processes loading data.
> >
> > Maybe BUFFER_USAGE_LIMIT, with a value from -1 to N, with -1 indicating the
> > default value, 0 preventing use of a buffer access strategy, and 1..N
> > indicating the size in blocks?

> I have found the implementation you suggested very hard to use.
> The attached fourth patch in the set implements it the way you suggest.
> I had to figure out what number to set the BUFER_USAGE_LIMIT to -- and,
> since I don't specify shared buffers in units of nbuffer, it's pretty
> annoying to have to figure out a valid number.

I think we should be able to parse it in a similar way to how we parse
shared_buffers. You could even implement this as a GUC that is then set by
VACUUM (similar to how VACUUM FREEZE is implemented).


> I think that it would be better to have it be either a percentage of
> shared buffers or a size in units of bytes/kb/mb like that of shared
> buffers.

I don't think a percentage of shared buffers works particularly well - you
very quickly run into the ringbuffer becoming impractically big.


> > Would we want to set an upper limit lower than implied by the memory limit
> > for
> > the BufferAccessStrategy allocation?
> >
> >
> So, I was wondering what you thought about NBuffers / 8 (the current
> limit). Does it make sense?

That seems *way* too big. Imagine how large allocations we'd end up with a
shared_buffers size of a few TB.

I'd probably make it a hard error at 1GB and a silent cap at NBuffers / 2 or
such.


> @@ -547,7 +547,7 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace,
>      state->targetcontext = AllocSetContextCreate(CurrentMemoryContext,
>                                                   "amcheck context",
>                                                   ALLOCSET_DEFAULT_SIZES);
> -    state->checkstrategy = GetAccessStrategy(BAS_BULKREAD);
> +    state->checkstrategy = GetAccessStrategy(BAS_BULKREAD, -1);
>  
>      /* Get true root block from meta-page */
>      metapage = palloc_btree_page(state, BTREE_METAPAGE);

Changing this everywhere seems pretty annoying, particularly because I suspect
a bunch of extensions also use GetAccessStrategy(). How about a
GetAccessStrategyExt(), GetAccessStrategyCustomSize() or such?


>  BufferAccessStrategy
> -GetAccessStrategy(BufferAccessStrategyType btype)
> +GetAccessStrategy(BufferAccessStrategyType btype, int buffers)
>  {
>      BufferAccessStrategy strategy;
>      int            ring_size;
> +    const char *strategy_name = btype_get_name(btype);

Shouldn't be executed when we don't need it.


> +    if (btype != BAS_VACUUM)
> +    {
> +        if (buffers == 0)
> +            elog(ERROR, "Use of shared buffers unsupported for buffer access strategy: %s. nbuffers must be -1.",
> +                    strategy_name);
> +
> +        if (buffers > 0)
> +            elog(ERROR, "Specification of ring size in buffers unsupported for buffer access strategy: %s. nbuffers
mustbe -1.",
 
> +                    strategy_name);
> +    }
> +
> +    // TODO: DEBUG logging message for dev?
> +    if (buffers == 0)
> +        btype = BAS_NORMAL;

GetAccessStrategy() often can be executed hundreds of thousands of times a
second, so I'm very sceptical that adding log messages to it useful.

Greetings,

Andres Freund



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Bharath Rupireddy
Date:
On Thu, Jan 12, 2023 at 6:06 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2023-01-11 17:26:19 -0700, David G. Johnston wrote:
> > Should we just add "ring_buffers" to the existing "shared_buffers" and
> > "temp_buffers" settings?
>
> The different types of ring buffers have different sizes, for good reasons. So
> I don't see that working well. I also think it'd be more often useful to
> control this on a statement basis - if you have a parallel import tool that
> starts NCPU COPYs you'd want a smaller buffer than a single threaded COPY. Of
> course each session can change the ring buffer settings, but still.

How about having GUCs for each ring buffer (bulk_read_ring_buffers,
bulk_write_ring_buffers, vacuum_ring_buffers - ah, 3 more new GUCs)?
These options can help especially when statement level controls aren't
easy to add (COPY, CREATE TABLE AS/CTAS, REFRESH MAT VIEW/RMV)? If
needed users can also set them at the system level. For instance, one
can set bulk_write_ring_buffers to other than 16MB or -1 to disable
the ring buffer to use shared_buffers and run a bunch of bulk write
queries.

Although I'm not quite opposing the idea of statement level controls
(like the VACUUM one proposed here), it is better to make these ring
buffer sizes configurable across the system to help with the other
similar cases e.g., a CTAS or RMV can help subsequent reads from
shared buffers if ring buffer is skipped.

Thoughts?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Bharath Rupireddy
Date:
On Thu, Feb 23, 2023 at 3:03 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> Hi,
>
> So, I attached a rough implementation of both the autovacuum failsafe
> reverts to shared buffers and the vacuum option (no tests or docs or
> anything).

Thanks for the patches. I have some comments.

0001:
1. I don't quite understand the need for this 0001 patch. Firstly,
bstrategy allocated once per autovacuum worker in AutovacMemCxt which
goes away with the process. Secondly, the worker exits after
do_autovacuum() with which memory context is gone. I think this patch
is unnecessary unless I'm missing something.

0002:
1. Don't we need to remove vac_strategy for analyze.c as well? It's
pretty-meaningless there than vacuum.c as we're passing bstrategy to
all required functions.

0004:
1. I think no multiple sentences in a single error message. How about
"of %d, changing it to %d"?
+                elog(WARNING, "buffer_usage_limit %d is below the
minimum buffer_usage_limit of %d. setting it to %d",

2. Typically, postgres error messages start with lowercase letters,
hints and detail messages start with uppercase letters.
+        if (buffers == 0)
+            elog(ERROR, "Use of shared buffers unsupported for buffer
access strategy: %s. nbuffers must be -1.",
+                    strategy_name);
+
+        if (buffers > 0)
+            elog(ERROR, "Specification of ring size in buffers
unsupported for buffer access strategy: %s. nbuffers must be -1.",
+                    strategy_name);

3. A function for this seems unnecessary, especially when a static
array would do the needful, something like forkNames[].
+static const char *
+btype_get_name(BufferAccessStrategyType btype)
+{
+    switch (btype)
+    {

4. Why are these assumptions needed? Can't we simplify by doing
validations on the new buffers parameter only when the btype is
BAS_VACUUM?
+        if (buffers == 0)
+            elog(ERROR, "Use of shared buffers unsupported for buffer
access strategy: %s. nbuffers must be -1.",
+                    strategy_name);

+    // TODO: DEBUG logging message for dev?
+    if (buffers == 0)
+        btype = BAS_NORMAL;

5. Is this change needed for this patch?
         default:
             elog(ERROR, "unrecognized buffer access strategy: %d",
-                 (int) btype);
-            return NULL;        /* keep compiler quiet */
+                    (int) btype);
+
+        pg_unreachable();

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
Thank you to all reviewers!

This email is in answer to the three reviews
done since my last version. Attached is v2 and inline below is replies
to all the review comments.

The main difference between this version and the previous version is
that I've added a guc, buffer_usage_limit and the VACUUM option
BUFFER_USAGE_LIMIT is now to be specified in size (like kB, MB, etc).

I currently only use the guc value for VACUUM, but it is meant to be
used for all buffer access strategies and is configurable at the session
level.

I would prefer that we had the option of resizing the buffer access
strategy object per table being autovacuumed. Since autovacuum reloads
the config file between tables, this would be quite possible.

I started implementing this, but stopped because the code is not really
in a good state for that.

In fact, I'm not very happy with my implementation at all because I
think given the current structure of vacuum() and vacuum_rel(), it will
potentially make the code more confusing.

I don't like how autovacuum and vacuum use vacuum_rel() and vacuum()
differently (autovacuum always calls vacuum() with a list containing a
single relation). And vacuum() takes buffer access strategy as a
parameter, supposedly so that autovacuum can change the buffer access
strategy object per call, but it doesn't do that. And then vacuum() and
vacuum_rel() go and access VacuumParams at various places with no rhyme
or reason -- seemingly just based on the random availability of other
objects whose state they would like to check on. So, IMO, in adding a
"buffers" parameter to VacuumParams, I am asking for confusion in
autovacuum code with table-level VacuumParams containing an value for
buffers that isn't used.


On Mon, Feb 27, 2023 at 4:21 PM Andres Freund <andres@anarazel.de> wrote:
> On 2023-02-22 16:32:53 -0500, Melanie Plageman wrote:
> > The first patch in the set is to free the BufferAccessStrategy object
> > that is made in do_autovacuum() -- I don't see when the memory context
> > it is allocated in is destroyed, so it seems like it might be a leak?
>
> The backend shuts down just after, so that's not a real issue. Not that it'd
> hurt to fix it.

I've dropped that patch from the set.

> > > I can see that making sense, particularly if we were to later extend this
> > > to
> > > other users of ringbuffers. E.g. COPYs us of the ringbuffer makes loading
> > > of
> > > data > 16MB but also << s_b vastly slower, but it can still be very
> > > important
> > > to use if there's lots of parallel processes loading data.
> > >
> > > Maybe BUFFER_USAGE_LIMIT, with a value from -1 to N, with -1 indicating the
> > > default value, 0 preventing use of a buffer access strategy, and 1..N
> > > indicating the size in blocks?
>
> > I have found the implementation you suggested very hard to use.
> > The attached fourth patch in the set implements it the way you suggest.
> > I had to figure out what number to set the BUFER_USAGE_LIMIT to -- and,
> > since I don't specify shared buffers in units of nbuffer, it's pretty
> > annoying to have to figure out a valid number.
>
> I think we should be able to parse it in a similar way to how we parse
> shared_buffers. You could even implement this as a GUC that is then set by
> VACUUM (similar to how VACUUM FREEZE is implemented).

in the attached v2, I've used parse_int() to do this.


> > I think that it would be better to have it be either a percentage of
> > shared buffers or a size in units of bytes/kb/mb like that of shared
> > buffers.
>
> I don't think a percentage of shared buffers works particularly well - you
> very quickly run into the ringbuffer becoming impractically big.

It is now a size.

> > > Would we want to set an upper limit lower than implied by the memory limit
> > > for
> > > the BufferAccessStrategy allocation?
> > >
> > >
> > So, I was wondering what you thought about NBuffers / 8 (the current
> > limit). Does it make sense?
>
> That seems *way* too big. Imagine how large allocations we'd end up with a
> shared_buffers size of a few TB.
>
> I'd probably make it a hard error at 1GB and a silent cap at NBuffers / 2 or
> such.

Well, as I mentioned NBuffers / 8 is the current GetAccessStrategy()
cap.

In the attached patchset, I have introduced a hard cap of 16GB which is
enforced for the VACUUM option and for the buffer_usage_limit guc. I
kept the "silent cap" at NBuffers / 8 but am open to changing it to
NBuffers / 2 if we think it is okay for its silent cap to be different
than GetAccessStrategy()'s cap.


> > @@ -547,7 +547,7 @@ bt_check_every_level(Relation rel, Relation heaprel, bool heapkeyspace,
> >       state->targetcontext = AllocSetContextCreate(CurrentMemoryContext,
> >                                                                                                "amcheck context",
> >
ALLOCSET_DEFAULT_SIZES);
> > -     state->checkstrategy = GetAccessStrategy(BAS_BULKREAD);
> > +     state->checkstrategy = GetAccessStrategy(BAS_BULKREAD, -1);
> >
> >       /* Get true root block from meta-page */
> >       metapage = palloc_btree_page(state, BTREE_METAPAGE);
>
> Changing this everywhere seems pretty annoying, particularly because I suspect
> a bunch of extensions also use GetAccessStrategy(). How about a
> GetAccessStrategyExt(), GetAccessStrategyCustomSize() or such?

Yes, I don't know what I was thinking. Changed it to
GetAccessStrategyExt() -- though now I am thinking I don't like Ext and
want to change it.

> >  BufferAccessStrategy
> > -GetAccessStrategy(BufferAccessStrategyType btype)
> > +GetAccessStrategy(BufferAccessStrategyType btype, int buffers)
> >  {
> >       BufferAccessStrategy strategy;
> >       int                     ring_size;
> > +     const char *strategy_name = btype_get_name(btype);
>
> Shouldn't be executed when we don't need it.

I got rid of it for now.

> > +     if (btype != BAS_VACUUM)
> > +     {
> > +             if (buffers == 0)
> > +                     elog(ERROR, "Use of shared buffers unsupported for buffer access strategy: %s. nbuffers must
be-1.", 
> > +                                     strategy_name);
> > +
> > +             if (buffers > 0)
> > +                     elog(ERROR, "Specification of ring size in buffers unsupported for buffer access strategy:
%s.nbuffers must be -1.", 
> > +                                     strategy_name);
> > +     }
> > +
> > +     // TODO: DEBUG logging message for dev?
> > +     if (buffers == 0)
> > +             btype = BAS_NORMAL;
>
> GetAccessStrategy() often can be executed hundreds of thousands of times a
> second, so I'm very sceptical that adding log messages to it useful.

So, in the case of vacuum and autovacuum, I don't see how
GetAccessStrategyExt() could be called hundreds of thousands of times a
second. It is not even called for each table being vacuumed -- it is
only called before vacuuming a list of tables.

On Tue, Feb 28, 2023 at 3:16 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

> On Thu, Jan 12, 2023 at 6:06 AM Andres Freund <andres@anarazel.de> wrote:
> >
> > On 2023-01-11 17:26:19 -0700, David G. Johnston wrote:
> > > Should we just add "ring_buffers" to the existing "shared_buffers" and
> > > "temp_buffers" settings?
> >
> > The different types of ring buffers have different sizes, for good reasons. So
> > I don't see that working well. I also think it'd be more often useful to
> > control this on a statement basis - if you have a parallel import tool that
> > starts NCPU COPYs you'd want a smaller buffer than a single threaded COPY. Of
> > course each session can change the ring buffer settings, but still.
>
> How about having GUCs for each ring buffer (bulk_read_ring_buffers,
> bulk_write_ring_buffers, vacuum_ring_buffers - ah, 3 more new GUCs)?
> These options can help especially when statement level controls aren't
> easy to add (COPY, CREATE TABLE AS/CTAS, REFRESH MAT VIEW/RMV)? If
> needed users can also set them at the system level. For instance, one
> can set bulk_write_ring_buffers to other than 16MB or -1 to disable
> the ring buffer to use shared_buffers and run a bunch of bulk write
> queries.

So, I've rebelled a bit and implemented a single guc,
buffer_usage_limit, in the attached patchset. Users can set it at the
session or system level or they can specify BUFFER_USAGE_LIMIT to
vacuum. It is the same size for all operations. By default all of this
would be the same as it is now.

The attached patchset does not use the guc for any operations except
VACUUM, though. I will add on another patch if people still feel
strongly that we cannot have a single guc. If the other operations use
this guc, I think we could get much of the same flexibility as having
multiple gucs by just being able to set it at the session level (or
having command options).

> Although I'm not quite opposing the idea of statement level controls
> (like the VACUUM one proposed here), it is better to make these ring
> buffer sizes configurable across the system to help with the other
> similar cases e.g., a CTAS or RMV can help subsequent reads from
> shared buffers if ring buffer is skipped.

Yes, I've done both.


On Tue, Feb 28, 2023 at 3:52 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Thu, Feb 23, 2023 at 3:03 AM Melanie Plageman
> 0001:
> 1. I don't quite understand the need for this 0001 patch. Firstly,
> bstrategy allocated once per autovacuum worker in AutovacMemCxt which
> goes away with the process. Secondly, the worker exits after
> do_autovacuum() with which memory context is gone. I think this patch
> is unnecessary unless I'm missing something.

I've dropped this one.

> 0002:
> 1. Don't we need to remove vac_strategy for analyze.c as well? It's
> pretty-meaningless there than vacuum.c as we're passing bstrategy to
> all required functions.

So, it is a bit harder to remove it from analyze because acquire_func
func doesn't take the buffer access strategy as a parameter and
acquire_sample_rows uses the vac_context global variable to pass to
table_scan_analyze_next_block().

We could change acquire_func, but it looks like FDW uses it, so I'm not
sure. It would be more for consistency than as a performance win, as I
imagine analyze is less of a problem than vacuum (i.e. it is probably
reading fewer blocks and probably not dirtying them [unless it does
on-access pruning?]).

I haven't done this in the attached set.

> 0004:
> 1. I think no multiple sentences in a single error message. How about
> "of %d, changing it to %d"?
> +                elog(WARNING, "buffer_usage_limit %d is below the
> minimum buffer_usage_limit of %d. setting it to %d",

I've removed this message, but if I put back a message about clamping, I
will remember this note.

> 2. Typically, postgres error messages start with lowercase letters,
> hints and detail messages start with uppercase letters.
> +        if (buffers == 0)
> +            elog(ERROR, "Use of shared buffers unsupported for buffer
> access strategy: %s. nbuffers must be -1.",
> +                    strategy_name);
> +
> +        if (buffers > 0)
> +            elog(ERROR, "Specification of ring size in buffers
> unsupported for buffer access strategy: %s. nbuffers must be -1.",
> +                    strategy_name);

Thanks! I've removed some of the error messages for now, but, for the
ones I kept, I tthink they are consistent now with this pattern.


> 3. A function for this seems unnecessary, especially when a static
> array would do the needful, something like forkNames[].
> +static const char *
> +btype_get_name(BufferAccessStrategyType btype)
> +{
> +    switch (btype)
> +    {

I've removed it for now.

>
> 4. Why are these assumptions needed? Can't we simplify by doing
> validations on the new buffers parameter only when the btype is
> BAS_VACUUM?
> +        if (buffers == 0)
> +            elog(ERROR, "Use of shared buffers unsupported for buffer
> access strategy: %s. nbuffers must be -1.",
> +                    strategy_name);
>
> +    // TODO: DEBUG logging message for dev?
> +    if (buffers == 0)
> +        btype = BAS_NORMAL;

So, I've moved validation to the vacuum option parsing for the vacuum
option and am using the guc infrastructure to check min and max for the
guc value.

> 5. Is this change needed for this patch?
>          default:
>              elog(ERROR, "unrecognized buffer access strategy: %d",
> -                 (int) btype);
> -            return NULL;        /* keep compiler quiet */
> +                    (int) btype);
> +
> +        pg_unreachable();

The pg_unreachable() is removed, as I've left GetAccessStrategy()
untouched.

- Melanie

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
> On Tue, Feb 28, 2023 at 3:16 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
>
> > On Thu, Jan 12, 2023 at 6:06 AM Andres Freund <andres@anarazel.de> wrote:
> > >
> > > On 2023-01-11 17:26:19 -0700, David G. Johnston wrote:
> > > > Should we just add "ring_buffers" to the existing "shared_buffers" and
> > > > "temp_buffers" settings?
> > >
> > > The different types of ring buffers have different sizes, for good reasons. So
> > > I don't see that working well. I also think it'd be more often useful to
> > > control this on a statement basis - if you have a parallel import tool that
> > > starts NCPU COPYs you'd want a smaller buffer than a single threaded COPY. Of
> > > course each session can change the ring buffer settings, but still.
> >
> > How about having GUCs for each ring buffer (bulk_read_ring_buffers,
> > bulk_write_ring_buffers, vacuum_ring_buffers - ah, 3 more new GUCs)?
> > These options can help especially when statement level controls aren't
> > easy to add (COPY, CREATE TABLE AS/CTAS, REFRESH MAT VIEW/RMV)? If
> > needed users can also set them at the system level. For instance, one
> > can set bulk_write_ring_buffers to other than 16MB or -1 to disable
> > the ring buffer to use shared_buffers and run a bunch of bulk write
> > queries.

In attached v3, I've changed the name of the guc from buffer_usage_limit
to vacuum_buffer_usage_limit, since it is only used for vacuum and
autovacuum.

I haven't added the other suggested strategy gucs, as those could easily
be done in a future patchset.

I've also changed GetAccessStrategyExt() to GetAccessStrategyWithSize()
-- similar to initArrayResultWithSize().

And I've added tab completion for BUFFER_USAGE_LIMIT so that it is
easier to try out my patch.

Most of the TODOs in the code are related to the question of how
autovacuum uses the guc vacuum_buffer_usage_limit. autovacuum creates
the buffer access strategy ring in do_autovacuum() before looping
through and vacuuming tables. It passes this strategy object on to
vacuum(). Since we reuse the same strategy object for all tables in a
given invocation of do_autovacuum(), only failsafe autovacuum would
change buffer access strategies. This is probably okay, but it does mean
that the table-level VacuumParams variable, ring_size, means something
different for autovacuum than vacuum. Autovacuum workers will always
have set it to -1. We won't ever reach code in vacuum() which relies on
VacuumParams->ring_size as long as autovacuum workers pass a non-NULL
BufferAccessStrategy object to vacuum(), though.

- Melanie

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Justin Pryzby
Date:
On Sat, Mar 11, 2023 at 09:55:33AM -0500, Melanie Plageman wrote:
> Subject: [PATCH v3 2/3] use shared buffers when failsafe active
> 
> +        /*
> +         * Assume the caller who allocated the memory for the
> +         * BufferAccessStrategy object will free it.
> +         */
> +        vacrel->bstrategy = NULL;

This comment could use elaboration:

** VACUUM normally restricts itself to a small ring buffer; but in
failsafe mode, in order to process tables as quickly as possible, allow
the leaving behind large number of dirty buffers.

> Subject: [PATCH v3 3/3] add vacuum option to specify ring_size and guc

>  #define INT_ACCESS_ONCE(var)    ((int)(*((volatile int *)&(var))))
> +#define bufsize_limit_to_nbuffers(bufsize) (bufsize * 1024 / BLCKSZ)

Macros are normally be capitalized

It's a good idea to write "(bufsize)", in case someone passes "a+b".

> @@ -586,6 +587,45 @@ GetAccessStrategy(BufferAccessStrategyType btype)
> +BufferAccessStrategy
> +GetAccessStrategyWithSize(BufferAccessStrategyType btype, int ring_size)

Maybe it would make sense for GetAccessStrategy() to call
GetAccessStrategyWithSize().  Or maybe not.

> +        {"vacuum_buffer_usage_limit", PGC_USERSET, RESOURCES_MEM,
> +            gettext_noop("Sets the buffer pool size for operations employing a buffer access strategy."),

The description should mention vacuum, if that's the scope of the GUC's
behavior.

> +#vacuum_buffer_usage_limit = -1 # size of vacuum buffer access strategy ring.
> +                # -1 to use default,
> +                # 0 to disable vacuum buffer access strategy and use shared buffers
> +                # > 0 to specify size

If I'm not wrong, there's still no documentation about "ring buffers" or
postgres' "strategy".  Which seems important to do for this patch, along
with other documentation.

This patch should add support in vacuumdb.c.  And maybe a comment about
adding support there, since it's annoying when it the release notes one
year say "support VACUUM (FOO)" and then one year later say "support
vacuumdb --foo".

-- 
Justin



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Ants Aasma
Date:
On Sat, 11 Mar 2023 at 16:55, Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> > On Tue, Feb 28, 2023 at 3:16 AM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > > On Thu, Jan 12, 2023 at 6:06 AM Andres Freund <andres@anarazel.de> wrote:
> > > >
> > > > On 2023-01-11 17:26:19 -0700, David G. Johnston wrote:
> > > > > Should we just add "ring_buffers" to the existing "shared_buffers" and
> > > > > "temp_buffers" settings?
> > > >
> > > > The different types of ring buffers have different sizes, for good reasons. So
> > > > I don't see that working well. I also think it'd be more often useful to
> > > > control this on a statement basis - if you have a parallel import tool that
> > > > starts NCPU COPYs you'd want a smaller buffer than a single threaded COPY. Of
> > > > course each session can change the ring buffer settings, but still.
> > >
> > > How about having GUCs for each ring buffer (bulk_read_ring_buffers,
> > > bulk_write_ring_buffers, vacuum_ring_buffers - ah, 3 more new GUCs)?
> > > These options can help especially when statement level controls aren't
> > > easy to add (COPY, CREATE TABLE AS/CTAS, REFRESH MAT VIEW/RMV)? If
> > > needed users can also set them at the system level. For instance, one
> > > can set bulk_write_ring_buffers to other than 16MB or -1 to disable
> > > the ring buffer to use shared_buffers and run a bunch of bulk write
> > > queries.
>
> In attached v3, I've changed the name of the guc from buffer_usage_limit
> to vacuum_buffer_usage_limit, since it is only used for vacuum and
> autovacuum.

Sorry for arriving late to this thread, but what about sizing the ring
dynamically? From what I gather the primary motivation for larger ring
size is avoiding WAL flushes due to dirty buffer writes. We already
catch that event with StrategyRejectBuffer(). So maybe a dynamic
sizing algorithm could be applied to the ringbuffer. Make the buffers
array in strategy capable of holding up to the limit of buffers, but
set ring size conservatively. If we have to flush WAL, double the ring
size (up to the limit). If we loop around the ring without flushing,
decrease the ring size by a small amount to let clock sweep reclaim
them for use by other backends.

--
Ants Aasma
Senior Database Engineer
www.cybertec-postgresql.com



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
Thanks for your interest in this patch!

On Mon, Mar 13, 2023 at 8:38 AM Ants Aasma <ants@cybertec.at> wrote:
>
> On Sat, 11 Mar 2023 at 16:55, Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > > On Tue, Feb 28, 2023 at 3:16 AM Bharath Rupireddy
> > > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > >
> > > > On Thu, Jan 12, 2023 at 6:06 AM Andres Freund <andres@anarazel.de> wrote:
> > > > >
> > > > > On 2023-01-11 17:26:19 -0700, David G. Johnston wrote:
> > > > > > Should we just add "ring_buffers" to the existing "shared_buffers" and
> > > > > > "temp_buffers" settings?
> > > > >
> > > > > The different types of ring buffers have different sizes, for good reasons. So
> > > > > I don't see that working well. I also think it'd be more often useful to
> > > > > control this on a statement basis - if you have a parallel import tool that
> > > > > starts NCPU COPYs you'd want a smaller buffer than a single threaded COPY. Of
> > > > > course each session can change the ring buffer settings, but still.
> > > >
> > > > How about having GUCs for each ring buffer (bulk_read_ring_buffers,
> > > > bulk_write_ring_buffers, vacuum_ring_buffers - ah, 3 more new GUCs)?
> > > > These options can help especially when statement level controls aren't
> > > > easy to add (COPY, CREATE TABLE AS/CTAS, REFRESH MAT VIEW/RMV)? If
> > > > needed users can also set them at the system level. For instance, one
> > > > can set bulk_write_ring_buffers to other than 16MB or -1 to disable
> > > > the ring buffer to use shared_buffers and run a bunch of bulk write
> > > > queries.
> >
> > In attached v3, I've changed the name of the guc from buffer_usage_limit
> > to vacuum_buffer_usage_limit, since it is only used for vacuum and
> > autovacuum.
>
> Sorry for arriving late to this thread, but what about sizing the ring
> dynamically? From what I gather the primary motivation for larger ring
> size is avoiding WAL flushes due to dirty buffer writes. We already
> catch that event with StrategyRejectBuffer(). So maybe a dynamic
> sizing algorithm could be applied to the ringbuffer. Make the buffers
> array in strategy capable of holding up to the limit of buffers, but
> set ring size conservatively. If we have to flush WAL, double the ring
> size (up to the limit). If we loop around the ring without flushing,
> decrease the ring size by a small amount to let clock sweep reclaim
> them for use by other backends.

So, the original motivation of this patch was to allow autovacuum in
failsafe mode to abandon use of a buffer access strategy, since, at that
point, there is no reason to hold back. The idea was expanded to be an
option to explicit vacuum, since users often must initiate an explicit
vacuum after a forced shutdown due to transaction ID wraparound.

As for routine vacuuming and the other buffer access strategies, I think
there is an argument for configurability based on operator knowledge --
perhaps your workload will use the data you are COPYing as soon as the
COPY finishes, so you might as well disable a buffer access strategy or
use a larger fraction of shared buffers. Also, the ring sizes were
selected sixteen years ago and average server memory and data set sizes
have changed.

StrategyRejectBuffer() will allow bulkreads to, as you say, use more
buffers than the original ring size, since it allows them to kick
dirty buffers out of the ring and claim new shared buffers.

Bulkwrites and vacuums, however, will inevitably dirty buffers and
require flushing the buffer (and thus flushing the associated WAL) when
reusing them. Bulkwrites and vacuum do not kick dirtied buffers out of
the ring, since dirtying buffers is their common case. A dynamic
resizing like the one you suggest would likely devolve to vacuum and
bulkwrite strategies always using the max size.

As for decreasing the ring size, buffers are only "added" to the ring
lazily and, technically, as it is now, buffers which have been added
added to the ring can always be reclaimed by the clocksweep (as long as
they are not pinned). The buffer access strategy is more of a
self-imposed restriction than it is a reservation. Since the ring is
small and the buffers are being frequently reused, odds are the usage
count will be 1 and we will be the one who set it to 1, but there is no
guarantee. If, when attempting to reuse the buffer, its usage count is
> 1 (or it is pinned), we also will kick it out of the ring and go look
for a replacement buffer.

I do think that it is a bit unreasonable to expect users to know how
large they would like to make their buffer access strategy ring. What we
want is some way of balancing different kinds of workloads and
maintenance tasks reasonably. If your database has no activity because
it is the middle of the night or it was shutdown because of transaction
id wraparound, there is no reason why vacuum should limit the number of
buffers it uses. I'm sure there are many other such examples.

- Melanie



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
Thanks for the review!

On Sat, Mar 11, 2023 at 2:16 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Sat, Mar 11, 2023 at 09:55:33AM -0500, Melanie Plageman wrote:
> > Subject: [PATCH v3 2/3] use shared buffers when failsafe active
> >
> > +             /*
> > +              * Assume the caller who allocated the memory for the
> > +              * BufferAccessStrategy object will free it.
> > +              */
> > +             vacrel->bstrategy = NULL;
>
> This comment could use elaboration:
>
> ** VACUUM normally restricts itself to a small ring buffer; but in
> failsafe mode, in order to process tables as quickly as possible, allow
> the leaving behind large number of dirty buffers.

Agreed. It definitely needs a comment like this. I will update in the
next version along with addressing your other feedback (after sorting
out some of the other points in this mail on which I still have
questions).

> > Subject: [PATCH v3 3/3] add vacuum option to specify ring_size and guc
>
> >  #define INT_ACCESS_ONCE(var) ((int)(*((volatile int *)&(var))))
> > +#define bufsize_limit_to_nbuffers(bufsize) (bufsize * 1024 / BLCKSZ)
>
> Macros are normally be capitalized

Yes, there doesn't seem to be a great amount of consistency around
this... See pgstat.c read_chunk_s and bufmgr.c BufHdrGetBlock and
friends. Though there are probably more capitalized than not. Since it
does a bit of math and returns a value, I wanted to convey that it was
more like a function. Also, since the name was long, I thought all-caps
would be hard to read. However, if you or others feel strongly, I am
attached neither to the capitalization nor to the name at all (what do
you think of the name?).

> It's a good idea to write "(bufsize)", in case someone passes "a+b".

Ah yes, this is a good idea -- I always miss at least one set of
parentheses when writing a macro. In this case, I didn't think of it
since the current caller couldn't pass an expression.

> > @@ -586,6 +587,45 @@ GetAccessStrategy(BufferAccessStrategyType btype)
> > +BufferAccessStrategy
> > +GetAccessStrategyWithSize(BufferAccessStrategyType btype, int ring_size)
>
> Maybe it would make sense for GetAccessStrategy() to call
> GetAccessStrategyWithSize().  Or maybe not.

You mean instead of having anyone call GetAccessStrategyWithSize()?
We would need to change the signature of GetAccessStrategy() then -- and
there are quite a few callers.

>
> > +             {"vacuum_buffer_usage_limit", PGC_USERSET, RESOURCES_MEM,
> > +                     gettext_noop("Sets the buffer pool size for operations employing a buffer access strategy."),
>
> The description should mention vacuum, if that's the scope of the GUC's
> behavior.

Good catch. Will update in next version.

> > +#vacuum_buffer_usage_limit = -1 # size of vacuum buffer access strategy ring.
> > +                             # -1 to use default,
> > +                             # 0 to disable vacuum buffer access strategy and use shared buffers
> > +                             # > 0 to specify size
>
> If I'm not wrong, there's still no documentation about "ring buffers" or
> postgres' "strategy".  Which seems important to do for this patch, along
> with other documentation.

Yes, it is. I have been thinking about where in the docs to add it (the
docs about buffer access strategies). Any ideas?

> This patch should add support in vacuumdb.c.

Oh, I had totally forgotten about vacuumdb.

> And maybe a comment about adding support there, since it's annoying
> when it the release notes one year say "support VACUUM (FOO)" and then
> one year later say "support vacuumdb --foo".

I'm not sure I totally follow. Do you mean to add a comment to
ExecVacuum() saying to add support to vacuumdb when adding a new option
to vacuum?

In the past, did people forget to add support to vacuumdb for new vacuum
options or did they forget to document that they did that or did they
forgot to include that they did that in the release notes?

- Melanie



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Masahiko Sawada
Date:
On Sat, Mar 11, 2023 at 11:55 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> > On Tue, Feb 28, 2023 at 3:16 AM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > > On Thu, Jan 12, 2023 at 6:06 AM Andres Freund <andres@anarazel.de> wrote:
> > > >
> > > > On 2023-01-11 17:26:19 -0700, David G. Johnston wrote:
> > > > > Should we just add "ring_buffers" to the existing "shared_buffers" and
> > > > > "temp_buffers" settings?
> > > >
> > > > The different types of ring buffers have different sizes, for good reasons. So
> > > > I don't see that working well. I also think it'd be more often useful to
> > > > control this on a statement basis - if you have a parallel import tool that
> > > > starts NCPU COPYs you'd want a smaller buffer than a single threaded COPY. Of
> > > > course each session can change the ring buffer settings, but still.
> > >
> > > How about having GUCs for each ring buffer (bulk_read_ring_buffers,
> > > bulk_write_ring_buffers, vacuum_ring_buffers - ah, 3 more new GUCs)?
> > > These options can help especially when statement level controls aren't
> > > easy to add (COPY, CREATE TABLE AS/CTAS, REFRESH MAT VIEW/RMV)? If
> > > needed users can also set them at the system level. For instance, one
> > > can set bulk_write_ring_buffers to other than 16MB or -1 to disable
> > > the ring buffer to use shared_buffers and run a bunch of bulk write
> > > queries.
>
> In attached v3, I've changed the name of the guc from buffer_usage_limit
> to vacuum_buffer_usage_limit, since it is only used for vacuum and
> autovacuum.
>
> I haven't added the other suggested strategy gucs, as those could easily
> be done in a future patchset.
>
> I've also changed GetAccessStrategyExt() to GetAccessStrategyWithSize()
> -- similar to initArrayResultWithSize().
>
> And I've added tab completion for BUFFER_USAGE_LIMIT so that it is
> easier to try out my patch.
>
> Most of the TODOs in the code are related to the question of how
> autovacuum uses the guc vacuum_buffer_usage_limit. autovacuum creates
> the buffer access strategy ring in do_autovacuum() before looping
> through and vacuuming tables. It passes this strategy object on to
> vacuum(). Since we reuse the same strategy object for all tables in a
> given invocation of do_autovacuum(), only failsafe autovacuum would
> change buffer access strategies. This is probably okay, but it does mean
> that the table-level VacuumParams variable, ring_size, means something
> different for autovacuum than vacuum. Autovacuum workers will always
> have set it to -1. We won't ever reach code in vacuum() which relies on
> VacuumParams->ring_size as long as autovacuum workers pass a non-NULL
> BufferAccessStrategy object to vacuum(), though.

I've not reviewed the patchset in depth yet but I got assertion
failure and SEGV when using the buffer_usage_limit parameter.

postgres(1:471180)=# vacuum (buffer_usage_limit 10000000000) ;
2023-03-15 17:10:02.947 JST [471180] ERROR:  buffer_usage_limit for a
vacuum must be between -1 and 16777216. 10000000000 is invalid. at
character 9

The message show the max value is 16777216, but when I set it, I got
an assertion failure:

postgres(1:470992)=# vacuum (buffer_usage_limit 16777216) ;
TRAP: failed Assert("ring_size < MAX_BAS_RING_SIZE_KB"), File:
"freelist.c", Line: 606, PID: 470992

Then when I used 1 byte lower value, 16777215, I got a SEGV:

postgres(1:471180)=# vacuum (buffer_usage_limit 16777215) ;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: 2023-03-15
17:10:59.404 JST [471159] LOG:  server process (PID 471180) was
terminated by signal 11: Segmentation fault

Finally, when I used a more lower value, 16777100, I got a memory
allocation error:

postgres(1:471361)=# vacuum (buffer_usage_limit 16777100) ;
2023-03-15 17:12:17.853 JST [471361] ERROR:  invalid memory alloc
request size 18446744073709551572

Probably vacuum_buffer_usage_limit also has the same issue.

Also, should we support a table option for vacuum_buffer_usage_limit as well?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Ants Aasma
Date:
On Wed, 15 Mar 2023 at 02:29, Melanie Plageman
<melanieplageman@gmail.com> wrote:
> As for routine vacuuming and the other buffer access strategies, I think
> there is an argument for configurability based on operator knowledge --
> perhaps your workload will use the data you are COPYing as soon as the
> COPY finishes, so you might as well disable a buffer access strategy or
> use a larger fraction of shared buffers. Also, the ring sizes were
> selected sixteen years ago and average server memory and data set sizes
> have changed.

To be clear I'm not at all arguing against configurability. I was
thinking that dynamic use could make the configuration simpler by self
tuning to use no more buffers than is useful.

> StrategyRejectBuffer() will allow bulkreads to, as you say, use more
> buffers than the original ring size, since it allows them to kick
> dirty buffers out of the ring and claim new shared buffers.
>
> Bulkwrites and vacuums, however, will inevitably dirty buffers and
> require flushing the buffer (and thus flushing the associated WAL) when
> reusing them. Bulkwrites and vacuum do not kick dirtied buffers out of
> the ring, since dirtying buffers is their common case. A dynamic
> resizing like the one you suggest would likely devolve to vacuum and
> bulkwrite strategies always using the max size.

I think it should self stabilize around the point where the WAL is
either flushed by other commit activity, WAL writer or WAL buffers
filling up. Writing out their own dirtied buffers will still happen,
just the associated WAL flushes will be in larger chunks and possibly
done by other processes.

> As for decreasing the ring size, buffers are only "added" to the ring
> lazily and, technically, as it is now, buffers which have been added
> added to the ring can always be reclaimed by the clocksweep (as long as
> they are not pinned). The buffer access strategy is more of a
> self-imposed restriction than it is a reservation. Since the ring is
> small and the buffers are being frequently reused, odds are the usage
> count will be 1 and we will be the one who set it to 1, but there is no
> guarantee. If, when attempting to reuse the buffer, its usage count is
> > 1 (or it is pinned), we also will kick it out of the ring and go look
> for a replacement buffer.

Right, but while the buffer is actively used by the ring it is
unlikely that clocksweep will find it at usage 0 as the ring buffer
should cycle more often than the clocksweep. Whereas if the ring stops
using a buffer, clocksweep will eventually come and reclaim it. And if
the ring shrinking decision turns out to be wrong before the
clocksweep gets around to reusing it, we can bring the same buffer
back into the ring.

> I do think that it is a bit unreasonable to expect users to know how
> large they would like to make their buffer access strategy ring. What we
> want is some way of balancing different kinds of workloads and
> maintenance tasks reasonably. If your database has no activity because
> it is the middle of the night or it was shutdown because of transaction
> id wraparound, there is no reason why vacuum should limit the number of
> buffers it uses. I'm sure there are many other such examples.

Ideally yes, though I am not hopeful of finding a solution that does
this any time soon. Just to take your example, if a nightly
maintenance job wipes out the shared buffer contents slightly
optimizing its non time-critical work and then causes morning user
visible load to have big latency spikes due to cache misses, that's
not a good tradeoff either.

--
Ants Aasma
Senior Database Engineer
www.cybertec-postgresql.com



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Ants Aasma
Date:
On Wed, 15 Mar 2023 at 02:57, Melanie Plageman
<melanieplageman@gmail.com> wrote:
> > > Subject: [PATCH v3 3/3] add vacuum option to specify ring_size and guc
> >
> > >  #define INT_ACCESS_ONCE(var) ((int)(*((volatile int *)&(var))))
> > > +#define bufsize_limit_to_nbuffers(bufsize) (bufsize * 1024 / BLCKSZ)
> >
> > Macros are normally be capitalized
>
> Yes, there doesn't seem to be a great amount of consistency around
> this... See pgstat.c read_chunk_s and bufmgr.c BufHdrGetBlock and
> friends. Though there are probably more capitalized than not. Since it
> does a bit of math and returns a value, I wanted to convey that it was
> more like a function. Also, since the name was long, I thought all-caps
> would be hard to read. However, if you or others feel strongly, I am
> attached neither to the capitalization nor to the name at all (what do
> you think of the name?).

A static inline function seems like a less surprising and more type
safe solution for this.

-- 
Ants Aasma
Senior Database Engineer
www.cybertec-postgresql.com



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Justin Pryzby
Date:
On Tue, Mar 14, 2023 at 08:56:58PM -0400, Melanie Plageman wrote:
> On Sat, Mar 11, 2023 at 2:16 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

> > > @@ -586,6 +587,45 @@ GetAccessStrategy(BufferAccessStrategyType btype)
> > > +BufferAccessStrategy
> > > +GetAccessStrategyWithSize(BufferAccessStrategyType btype, int ring_size)
> >
> > Maybe it would make sense for GetAccessStrategy() to call
> > GetAccessStrategyWithSize().  Or maybe not.
> 
> You mean instead of having anyone call GetAccessStrategyWithSize()?
> We would need to change the signature of GetAccessStrategy() then -- and
> there are quite a few callers.

I mean to avoid code duplication, GetAccessStrategy() could "Select ring
size to use" and then call into GetAccessStrategyWithSize().  Maybe it's
counter to your intent or otherwise not worth it to save 8 LOC.

> > > +#vacuum_buffer_usage_limit = -1 # size of vacuum buffer access strategy ring.
> > > +                             # -1 to use default,
> > > +                             # 0 to disable vacuum buffer access strategy and use shared buffers
> > > +                             # > 0 to specify size
> >
> > If I'm not wrong, there's still no documentation about "ring buffers" or
> > postgres' "strategy".  Which seems important to do for this patch, along
> > with other documentation.
> 
> Yes, it is. I have been thinking about where in the docs to add it (the
> docs about buffer access strategies). Any ideas?

This patch could add something to the vacuum manpage and to the appendix.
And maybe references from the shared_buffers and pg_buffercache
manpages.

> > This patch should add support in vacuumdb.c.
> 
> Oh, I had totally forgotten about vacuumdb.

:)

> > And maybe a comment about adding support there, since it's annoying
> > when it the release notes one year say "support VACUUM (FOO)" and then
> > one year later say "support vacuumdb --foo".
> 
> I'm not sure I totally follow. Do you mean to add a comment to
> ExecVacuum() saying to add support to vacuumdb when adding a new option
> to vacuum?

Yeah, like:
/* Options added here should also be added to vacuumdb.c */

> In the past, did people forget to add support to vacuumdb for new vacuum
> options or did they forget to document that they did that or did they
> forgot to include that they did that in the release notes?

The first.  Maybe not often, it's not important whether it's in the
original patch, but it's odd if the vacuumdb option isn't added until
the following release, which then shows up as a separate "feature".

-- 
Justin



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
Thanks for the reviews and for trying the patch!

On Wed, Mar 15, 2023 at 4:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, Mar 11, 2023 at 11:55 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > > On Tue, Feb 28, 2023 at 3:16 AM Bharath Rupireddy
> > > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > >
> > > > On Thu, Jan 12, 2023 at 6:06 AM Andres Freund <andres@anarazel.de> wrote:
> > > > >
> > > > > On 2023-01-11 17:26:19 -0700, David G. Johnston wrote:
> > > > > > Should we just add "ring_buffers" to the existing "shared_buffers" and
> > > > > > "temp_buffers" settings?
> > > > >
> > > > > The different types of ring buffers have different sizes, for good reasons. So
> > > > > I don't see that working well. I also think it'd be more often useful to
> > > > > control this on a statement basis - if you have a parallel import tool that
> > > > > starts NCPU COPYs you'd want a smaller buffer than a single threaded COPY. Of
> > > > > course each session can change the ring buffer settings, but still.
> > > >
> > > > How about having GUCs for each ring buffer (bulk_read_ring_buffers,
> > > > bulk_write_ring_buffers, vacuum_ring_buffers - ah, 3 more new GUCs)?
> > > > These options can help especially when statement level controls aren't
> > > > easy to add (COPY, CREATE TABLE AS/CTAS, REFRESH MAT VIEW/RMV)? If
> > > > needed users can also set them at the system level. For instance, one
> > > > can set bulk_write_ring_buffers to other than 16MB or -1 to disable
> > > > the ring buffer to use shared_buffers and run a bunch of bulk write
> > > > queries.
> >
> > In attached v3, I've changed the name of the guc from buffer_usage_limit
> > to vacuum_buffer_usage_limit, since it is only used for vacuum and
> > autovacuum.
> >
> > I haven't added the other suggested strategy gucs, as those could easily
> > be done in a future patchset.
> >
> > I've also changed GetAccessStrategyExt() to GetAccessStrategyWithSize()
> > -- similar to initArrayResultWithSize().
> >
> > And I've added tab completion for BUFFER_USAGE_LIMIT so that it is
> > easier to try out my patch.
> >
> > Most of the TODOs in the code are related to the question of how
> > autovacuum uses the guc vacuum_buffer_usage_limit. autovacuum creates
> > the buffer access strategy ring in do_autovacuum() before looping
> > through and vacuuming tables. It passes this strategy object on to
> > vacuum(). Since we reuse the same strategy object for all tables in a
> > given invocation of do_autovacuum(), only failsafe autovacuum would
> > change buffer access strategies. This is probably okay, but it does mean
> > that the table-level VacuumParams variable, ring_size, means something
> > different for autovacuum than vacuum. Autovacuum workers will always
> > have set it to -1. We won't ever reach code in vacuum() which relies on
> > VacuumParams->ring_size as long as autovacuum workers pass a non-NULL
> > BufferAccessStrategy object to vacuum(), though.
>
> I've not reviewed the patchset in depth yet but I got assertion
> failure and SEGV when using the buffer_usage_limit parameter.
>
> postgres(1:471180)=# vacuum (buffer_usage_limit 10000000000) ;
> 2023-03-15 17:10:02.947 JST [471180] ERROR:  buffer_usage_limit for a
> vacuum must be between -1 and 16777216. 10000000000 is invalid. at
> character 9
>
> The message show the max value is 16777216, but when I set it, I got
> an assertion failure:
>
> postgres(1:470992)=# vacuum (buffer_usage_limit 16777216) ;
> TRAP: failed Assert("ring_size < MAX_BAS_RING_SIZE_KB"), File:
> "freelist.c", Line: 606, PID: 470992
>
> Then when I used 1 byte lower value, 16777215, I got a SEGV:
>
> postgres(1:471180)=# vacuum (buffer_usage_limit 16777215) ;
> server closed the connection unexpectedly
>         This probably means the server terminated abnormally
>         before or while processing the request.
> The connection to the server was lost. Attempting reset: 2023-03-15
> 17:10:59.404 JST [471159] LOG:  server process (PID 471180) was
> terminated by signal 11: Segmentation fault
>
> Finally, when I used a more lower value, 16777100, I got a memory
> allocation error:
>
> postgres(1:471361)=# vacuum (buffer_usage_limit 16777100) ;
> 2023-03-15 17:12:17.853 JST [471361] ERROR:  invalid memory alloc
> request size 18446744073709551572
>
> Probably vacuum_buffer_usage_limit also has the same issue.

Oh dear--it seems I had an integer overflow when calculating the number
of buffers using the specified buffer size in the macro:

 #define bufsize_limit_to_nbuffers(bufsize) (bufsize * 1024 / BLCKSZ)

In the attached v4, I've changed that to:

  static inline int
  bufsize_limit_to_nbuffers(int bufsize_limit_kb)
  {
    int blcksz_kb = BLCKSZ / 1024;

    Assert(blcksz_kb > 0);

    return bufsize_limit_kb / blcksz_kb;
  }

This should address Justin's suggestions and Ants' concern about the
macro as well.

Also, I was missing the = in the Assert(ring_size <= MAX_BAS_RING_SIZE)
I've fixed that as well, so it should work for you to specify up to 16777216.

> Also, should we support a table option for vacuum_buffer_usage_limit as well?

Hmm. Since this is meant more for balancing resource usage globally, it
doesn't make as much sense as a table option to me. But, I could be
convinced.

On Sat, Mar 11, 2023 at 2:16 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> On Sat, Mar 11, 2023 at 09:55:33AM -0500, Melanie Plageman wrote:
> > Subject: [PATCH v3 2/3] use shared buffers when failsafe active
> >
> > +             /*
> > +              * Assume the caller who allocated the memory for the
> > +              * BufferAccessStrategy object will free it.
> > +              */
> > +             vacrel->bstrategy = NULL;
>
> This comment could use elaboration:
>
> ** VACUUM normally restricts itself to a small ring buffer; but in
> failsafe mode, in order to process tables as quickly as possible, allow
> the leaving behind large number of dirty buffers.

I've added a comment in attached v4 which is a bit different than Justin's
suggestion but still more verbose than the previous comment.

> > Subject: [PATCH v3 3/3] add vacuum option to specify ring_size and guc
> > +             {"vacuum_buffer_usage_limit", PGC_USERSET, RESOURCES_MEM,
> > +                     gettext_noop("Sets the buffer pool size for operations employing a buffer access strategy."),
>
> The description should mention vacuum, if that's the scope of the GUC's
> behavior.

I've updated this in v4.

> > +#vacuum_buffer_usage_limit = -1 # size of vacuum buffer access strategy ring.
> > +                             # -1 to use default,
> > +                             # 0 to disable vacuum buffer access strategy and use shared buffers
> > +                             # > 0 to specify size
>
> If I'm not wrong, there's still no documentation about "ring buffers" or
> postgres' "strategy".  Which seems important to do for this patch, along
> with other documentation.

So, on the topic of "other documentation", I have, at least, added docs
for the vacuum_buffer_usage_limit guc and the BUFFER_USAGE option to
VACUUM and the buffer-usage-limit parameter to vacuumdb.

> This patch should add support in vacuumdb.c.  And maybe a comment about
> adding support there, since it's annoying when it the release notes one
> year say "support VACUUM (FOO)" and then one year later say "support
> vacuumdb --foo".

So, v4 adds support for buffer-usage-limit to vacuumdb. There are a few
issues. The main one is that no other vacuumdb option takes a size as a
parameter. I couldn't actually find any other client with a parameter
specified as a size.

My VACUUM option code is using the GUC size parsing code from
parse_int() -- including the unit flag GUC_UNIT_KB. Now that vacuumdb
also needs to parse sizes, I think we'll need to lift the parse_int()
code and the unit_conversion struct and
unit_conversion_memory_unit_conversion_table out of guc.c and put it
somewhere that it can be accessed for more than guc parsing (e.g. option
parsing).

For vacuumdb in this version, I just specified buffer-usage-limit is
only in kB and thus can only be specified as an int.

If we had something like pg_parse_size() in common, would this make
sense? It would be a little bit of work to figure out what to do about
the flags, etc.

Another issue is the server-side guc
#define MAX_BAS_RING_SIZE_KB (16 * 1024 * 1024)
I just redefined it in vacuumdb code. I'm not sure what the preferred
method for dealing with this is.

I know this validation would get done server-side if I just passed the
user-specified option through, but all of the other vacuumdb options
appear to be doing min/max boundary validation on the client side.

On Wed, Mar 15, 2023 at 8:14 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> On Tue, Mar 14, 2023 at 08:56:58PM -0400, Melanie Plageman wrote:
> > On Sat, Mar 11, 2023 at 2:16 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> > > > @@ -586,6 +587,45 @@ GetAccessStrategy(BufferAccessStrategyType btype)
> > > > +BufferAccessStrategy
> > > > +GetAccessStrategyWithSize(BufferAccessStrategyType btype, int ring_size)
> > >
> > > Maybe it would make sense for GetAccessStrategy() to call
> > > GetAccessStrategyWithSize().  Or maybe not.
> >
> > You mean instead of having anyone call GetAccessStrategyWithSize()?
> > We would need to change the signature of GetAccessStrategy() then -- and
> > there are quite a few callers.
>
> I mean to avoid code duplication, GetAccessStrategy() could "Select ring
> size to use" and then call into GetAccessStrategyWithSize().  Maybe it's
> counter to your intent or otherwise not worth it to save 8 LOC.

Oh, that's a cool idea. I will think on it.

> > > > +#vacuum_buffer_usage_limit = -1 # size of vacuum buffer access strategy ring.
> > > > +                             # -1 to use default,
> > > > +                             # 0 to disable vacuum buffer access strategy and use shared buffers
> > > > +                             # > 0 to specify size
> > >
> > > If I'm not wrong, there's still no documentation about "ring buffers" or
> > > postgres' "strategy".  Which seems important to do for this patch, along
> > > with other documentation.
> >
> > Yes, it is. I have been thinking about where in the docs to add it (the
> > docs about buffer access strategies). Any ideas?
>
> This patch could add something to the vacuum manpage and to the appendix.
> And maybe references from the shared_buffers and pg_buffercache
> manpages.

So, I was thinking it would be good to have some documentation in
general about Buffer Access Strategies (i.e. not just for vacuum). It
would have been nice to have something to reference from the pg_stat_io
docs that describe what buffer access strategies are.

> > > And maybe a comment about adding support there, since it's annoying
> > > when it the release notes one year say "support VACUUM (FOO)" and then
> > > one year later say "support vacuumdb --foo".
> >
> > I'm not sure I totally follow. Do you mean to add a comment to
> > ExecVacuum() saying to add support to vacuumdb when adding a new option
> > to vacuum?
>
> Yeah, like:
> /* Options added here should also be added to vacuumdb.c */

I've added a little something to the comment above the VacuumParams
struct.

> > In the past, did people forget to add support to vacuumdb for new vacuum
> > options or did they forget to document that they did that or did they
> > forgot to include that they did that in the release notes?
>
> The first.  Maybe not often, it's not important whether it's in the
> original patch, but it's odd if the vacuumdb option isn't added until
> the following release, which then shows up as a separate "feature".

I've squished in the code for adding the parameter to vacuumdb in a
single commit with the guc and vacuum option, but I will separate it out
after some of the basics get sorted.

- Melanie

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Wed, Mar 15, 2023 at 9:03 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> On Sat, Mar 11, 2023 at 2:16 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > On Sat, Mar 11, 2023 at 09:55:33AM -0500, Melanie Plageman wrote:
> > This patch should add support in vacuumdb.c.  And maybe a comment about
> > adding support there, since it's annoying when it the release notes one
> > year say "support VACUUM (FOO)" and then one year later say "support
> > vacuumdb --foo".
>
> So, v4 adds support for buffer-usage-limit to vacuumdb. There are a few
> issues. The main one is that no other vacuumdb option takes a size as a
> parameter. I couldn't actually find any other client with a parameter
> specified as a size.
>
> My VACUUM option code is using the GUC size parsing code from
> parse_int() -- including the unit flag GUC_UNIT_KB. Now that vacuumdb
> also needs to parse sizes, I think we'll need to lift the parse_int()
> code and the unit_conversion struct and
> unit_conversion_memory_unit_conversion_table out of guc.c and put it
> somewhere that it can be accessed for more than guc parsing (e.g. option
> parsing).
>
> For vacuumdb in this version, I just specified buffer-usage-limit is
> only in kB and thus can only be specified as an int.
>
> If we had something like pg_parse_size() in common, would this make
> sense? It would be a little bit of work to figure out what to do about
> the flags, etc.
>
> Another issue is the server-side guc
> #define MAX_BAS_RING_SIZE_KB (16 * 1024 * 1024)
> I just redefined it in vacuumdb code. I'm not sure what the preferred
> method for dealing with this is.
>
> I know this validation would get done server-side if I just passed the
> user-specified option through, but all of the other vacuumdb options
> appear to be doing min/max boundary validation on the client side.

So, after discussing vacuumdb client-side validation off-list with Jelte,
I realized that I was trying to do too much there.

Attached v5 passes the contents of the buffer-usage-limit option to
vacuumdb unvalidated into the VACUUM command string which vacuumdb
builds. This solves most of the problems.

I also improved the error messages coming from VACUUM
(buffer_usage_limit) handling.

- Melanie

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Justin Pryzby
Date:
> Subject: [PATCH v4 3/3] add vacuum[db] option to specify ring_size and guc

> +        Specifies the ring buffer size to be used for a given invocation of
> +        <command>VACUUM</command> or instance of autovacuum. This size is
> +        converted to a number of shared buffers which will be reused as part of

I'd say "specifies the size of shared_buffers to be reused as .."

> +        a <literal>Buffer Access Strategy</literal>. <literal>0</literal> will
> +        disable use of a <literal>Buffer Access Strategy</literal>.
> +        <literal>-1</literal> will set the size to a default of <literal>256
> +        kB</literal>. The maximum ring buffer size is <literal>16 GB</literal>.
> +        Though you may set <varname>vacuum_buffer_usage_limit</varname> below
> +        <literal>128 kB</literal>, it will be clamped to <literal>128
> +        kB</literal> at runtime. The default value is <literal>-1</literal>.
> +        This parameter can be set at any time.

GUC docs usually also say something like 
"If this value is specified without units, it is taken as .."

> +      is used to calculate a number of shared buffers which will be reused as

*the* number?

> +      <command>VACUUM</command>. The analyze stage and parallel vacuum workers
> +      do not use this size.

I think what you mean is that vacuum's heap scan stage uses the
strategy, but the index scan/cleanup phases doesn't?

> +        The size in kB of the ring buffer used for vacuuming. This size is used
> +        to calculate a number of shared buffers which will be reused as part of

*the* number

> +++ b/doc/src/sgml/ref/vacuumdb.sgml

The docs here duplicate the sql-vacuum docs.  It seems better to refer
to the vacuum page for details, like --parallel does.


Unrelated: it would be nice if the client-side options were documented
separately from the server-side options.  Especially due to --jobs and
--parallel.

> +            if (!parse_int(vac_buffer_size, &result, GUC_UNIT_KB, NULL))
> +            {
> +                ereport(ERROR,
> +                        (errcode(ERRCODE_SYNTAX_ERROR),
> +                            errmsg("buffer_usage_limit for a vacuum must be between -1 and %d. %s is invalid.",
> +                                    MAX_BAS_RING_SIZE_KB, vac_buffer_size),
> +                            parser_errposition(pstate, opt->location)));
> +            }
> +
> +            /* check for out-of-bounds */
> +            if (result < -1 || result > MAX_BAS_RING_SIZE_KB)
> +            {
> +                ereport(ERROR,
> +                        (errcode(ERRCODE_SYNTAX_ERROR),
> +                            errmsg("buffer_usage_limit for a vacuum must be between -1 and %d",
> +                                    MAX_BAS_RING_SIZE_KB),
> +                            parser_errposition(pstate, opt->location)));
> +            }

I think these checks could be collapsed into a single ereport().

if !parse_int() || (result < -1 || result > MAX_BAS_RINGSIZE_KB):
    ereport(ERROR,
        errcode(ERRCODE_SYNTAX_ERROR),
        errmsg("buffer_usage_limit for a vacuum must be an integer between -1 and %d",
            MAX_BAS_RING_SIZE_KB),

There was a recent, similar, and unrelated suggestion here:
https://www.postgresql.org/message-id/20230314.135859.260879647537075548.horikyota.ntt%40gmail.com

> +#vacuum_buffer_usage_limit = -1 # size of vacuum buffer access strategy ring.
> +                # -1 to use default,
> +                # 0 to disable vacuum buffer access strategy and use shared buffers

I think it's confusing to say "and use shared buffers", since
"strategies" also use shared_buffers.  It seems better to remove those 4
words.

> @@ -550,6 +563,13 @@ vacuum_one_database(ConnParams *cparams,
>          pg_fatal("cannot use the \"%s\" option on server versions older than PostgreSQL %s",
>                   "--parallel", "13");
>  
> +    // TODO: this is a problem: if the user specifies this option with -1 in a
> +    // version before 16, it will not produce an error message. it also won't
> +    // do anything, but that still doesn't seem right.

Actually, that seems fine to me.  If someone installs v16 vacuumdb, they
can run it against old servers and specify the option as -1 without it
failing with an error.  I don't know if anyone will find that useful,
but it doesn't seem unreasonable.

I still think adding something to the glossary would be good.

Buffer Access Strategy:
A circular/ring buffer used for reading or writing data pages from/to
the operating system.  Ring buffers are used for sequential scans of
large tables, VACUUM, COPY FROM, CREATE TABLE AS SELECT, ALTER TABLE,
and CLUSTER.  By using only a limited portion of >shared_buffers<, the
ring buffer avoids avoids evicting large amounts of data whenever a
backend performs bulk I/O operations.  Use of a ring buffer also forces
the backend to write out its own dirty pages, rather than leaving them
behind to be cleaned up by other backends.

If there's a larger section added than a glossary entry, the text could
be promoted from src/backend/storage/buffer/README to doc/.

-- 
Justin



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
Thanks for the review!

Attached is an updated v6.

v6 has some updates and corrections. It has two remaining TODOs in the
code: one is around what value to initialize the ring_size to in
VacuumParams, the other is around whether or not parallel vacuum index
workers should in fact stick with the default buffer access strategy
sizes.

I also separated vacuumdb into its own commit.

I also have addressed Justin's review feedback.

On Sat, Mar 18, 2023 at 2:30 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> > Subject: [PATCH v4 3/3] add vacuum[db] option to specify ring_size and guc
>
> > +        Specifies the ring buffer size to be used for a given invocation of
> > +        <command>VACUUM</command> or instance of autovacuum. This size is
> > +        converted to a number of shared buffers which will be reused as part of
>
> I'd say "specifies the size of shared_buffers to be reused as .."

I've included "shared_buffers" in the description.

> > +        a <literal>Buffer Access Strategy</literal>. <literal>0</literal> will
> > +        disable use of a <literal>Buffer Access Strategy</literal>.
> > +        <literal>-1</literal> will set the size to a default of <literal>256
> > +        kB</literal>. The maximum ring buffer size is <literal>16 GB</literal>.
> > +        Though you may set <varname>vacuum_buffer_usage_limit</varname> below
> > +        <literal>128 kB</literal>, it will be clamped to <literal>128
> > +        kB</literal> at runtime. The default value is <literal>-1</literal>.
> > +        This parameter can be set at any time.
>
> GUC docs usually also say something like
> "If this value is specified without units, it is taken as .."

I had updated this in v5 with slightly different wording, but I now am
using the wording you suggested (which does appear standard in the rest
of the docs).

>
> > +      is used to calculate a number of shared buffers which will be reused as
>
> *the* number?

updated.

>
> > +      <command>VACUUM</command>. The analyze stage and parallel vacuum workers
> > +      do not use this size.
>
> I think what you mean is that vacuum's heap scan stage uses the
> strategy, but the index scan/cleanup phases doesn't?

Yes, non-parallel index vacuum and cleanup will use whatever value you
specify but parallel workers make their own buffer access strategy
object. I've updated the docs to indicate that they will use the default
size for this.


>
> > +        The size in kB of the ring buffer used for vacuuming. This size is used
> > +        to calculate a number of shared buffers which will be reused as part of
>
> *the* number

fixed.

> > +++ b/doc/src/sgml/ref/vacuumdb.sgml
>
> The docs here duplicate the sql-vacuum docs.  It seems better to refer
> to the vacuum page for details, like --parallel does.

Good idea.

>
> Unrelated: it would be nice if the client-side options were documented
> separately from the server-side options.  Especially due to --jobs and
> --parallel.

Yes, that would be helpful.

> > +                     if (!parse_int(vac_buffer_size, &result, GUC_UNIT_KB, NULL))
> > +                     {
> > +                             ereport(ERROR,
> > +                                             (errcode(ERRCODE_SYNTAX_ERROR),
> > +                                                     errmsg("buffer_usage_limit for a vacuum must be between -1
and%d. %s is invalid.", 
> > +                                                                     MAX_BAS_RING_SIZE_KB, vac_buffer_size),
> > +                                                     parser_errposition(pstate, opt->location)));
> > +                     }
> > +
> > +                     /* check for out-of-bounds */
> > +                     if (result < -1 || result > MAX_BAS_RING_SIZE_KB)
> > +                     {
> > +                             ereport(ERROR,
> > +                                             (errcode(ERRCODE_SYNTAX_ERROR),
> > +                                                     errmsg("buffer_usage_limit for a vacuum must be between -1
and%d", 
> > +                                                                     MAX_BAS_RING_SIZE_KB),
> > +                                                     parser_errposition(pstate, opt->location)));
> > +                     }
>
> I think these checks could be collapsed into a single ereport().
>
> if !parse_int() || (result < -1 || result > MAX_BAS_RINGSIZE_KB):
>         ereport(ERROR,
>                 errcode(ERRCODE_SYNTAX_ERROR),
>                 errmsg("buffer_usage_limit for a vacuum must be an integer between -1 and %d",
>                         MAX_BAS_RING_SIZE_KB),
>
> There was a recent, similar, and unrelated suggestion here:
> https://www.postgresql.org/message-id/20230314.135859.260879647537075548.horikyota.ntt%40gmail.com

So, these have been updated/improved in v5. I still didn't combine them.
I see what you are saying about combining them (and I checked the link
you shared), but in this case, having them separate allows me to provide
info using the hintmsg passed to parse_int() about why it failed during
parse_int -- which could be something not related to range. So, I think
it makes sense to keep them separate.

> > +#vacuum_buffer_usage_limit = -1 # size of vacuum buffer access strategy ring.
> > +                             # -1 to use default,
> > +                             # 0 to disable vacuum buffer access strategy and use shared buffers
>
> I think it's confusing to say "and use shared buffers", since
> "strategies" also use shared_buffers.  It seems better to remove those 4
> words.

Got it. I've gone ahead and done that.

> > @@ -550,6 +563,13 @@ vacuum_one_database(ConnParams *cparams,
> >               pg_fatal("cannot use the \"%s\" option on server versions older than PostgreSQL %s",
> >                                "--parallel", "13");
> >
> > +     // TODO: this is a problem: if the user specifies this option with -1 in a
> > +     // version before 16, it will not produce an error message. it also won't
> > +     // do anything, but that still doesn't seem right.
>
> Actually, that seems fine to me.  If someone installs v16 vacuumdb, they
> can run it against old servers and specify the option as -1 without it
> failing with an error.  I don't know if anyone will find that useful,
> but it doesn't seem unreasonable.

I sort of skirted around this by removing any validation from vacuumdb
(present in v5 and still the case in v6). Now, the parameter is a string
and I check if it is non-NULL when the version is < 16. However, this
will no longer have the property that someone can use v16 vacuumdb and
pass buffer-usage-limit and have it not fail. I think that is okay,
though, since they might be confused thinking it was doing something.

> I still think adding something to the glossary would be good.
>
> Buffer Access Strategy:
> A circular/ring buffer used for reading or writing data pages from/to
> the operating system.  Ring buffers are used for sequential scans of
> large tables, VACUUM, COPY FROM, CREATE TABLE AS SELECT, ALTER TABLE,
> and CLUSTER.  By using only a limited portion of >shared_buffers<, the
> ring buffer avoids avoids evicting large amounts of data whenever a
> backend performs bulk I/O operations.  Use of a ring buffer also forces
> the backend to write out its own dirty pages, rather than leaving them
> behind to be cleaned up by other backends.

Yes, I have taken some ideas from here and added a separate commit
before all the others adding Buffer Access Strategy to the
documentation.

> If there's a larger section added than a glossary entry, the text could
> be promoted from src/backend/storage/buffer/README to doc/.

This is a good idea. I think we provided enough information in the
glossary (as far as users would care) if it weren't for the new
buffer_usage_limit guc, which probably merits more explanation about how
it interacts with buffer access strategies. Since it is only used for
vacuum now, do you think such a thing would belong in VACUUM-related
documentation? Like somewhere in [1]?

On Wed, Mar 15, 2023 at 9:03 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> On Wed, Mar 15, 2023 at 8:14 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> > On Tue, Mar 14, 2023 at 08:56:58PM -0400, Melanie Plageman wrote:
> > > On Sat, Mar 11, 2023 at 2:16 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> >
> > > > > @@ -586,6 +587,45 @@ GetAccessStrategy(BufferAccessStrategyType btype)
> > > > > +BufferAccessStrategy
> > > > > +GetAccessStrategyWithSize(BufferAccessStrategyType btype, int ring_size)
> > > >
> > > > Maybe it would make sense for GetAccessStrategy() to call
> > > > GetAccessStrategyWithSize().  Or maybe not.
> > >
> > > You mean instead of having anyone call GetAccessStrategyWithSize()?
> > > We would need to change the signature of GetAccessStrategy() then -- and
> > > there are quite a few callers.
> >
> > I mean to avoid code duplication, GetAccessStrategy() could "Select ring
> > size to use" and then call into GetAccessStrategyWithSize().  Maybe it's
> > counter to your intent or otherwise not worth it to save 8 LOC.
>
> Oh, that's a cool idea. I will think on it.

So, I thought about doing a version of this by adding a helper which did
the allocation of the BufferAccessStrategy object given a number of
buffers that could be called by both GetAccessStrategy() and
GetAccessStrategyWithSize(). I decided not to because I wanted to emit a
debug message if the size of the ring was clamped lower or higher than
the user would expect -- but only do this in GetAccessStrategyWithSize()
since there is no user expectation in the use of GetAccessStrategy().

- Melanie

[1] https://www.postgresql.org/docs/devel/routine-vacuuming.html

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Wed, Mar 15, 2023 at 6:46 AM Ants Aasma <ants@cybertec.at> wrote:
>
> On Wed, 15 Mar 2023 at 02:29, Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > As for routine vacuuming and the other buffer access strategies, I think
> > there is an argument for configurability based on operator knowledge --
> > perhaps your workload will use the data you are COPYing as soon as the
> > COPY finishes, so you might as well disable a buffer access strategy or
> > use a larger fraction of shared buffers. Also, the ring sizes were
> > selected sixteen years ago and average server memory and data set sizes
> > have changed.
>
> To be clear I'm not at all arguing against configurability. I was
> thinking that dynamic use could make the configuration simpler by self
> tuning to use no more buffers than is useful.

Yes, but I am struggling with how we would define "useful".

> > StrategyRejectBuffer() will allow bulkreads to, as you say, use more
> > buffers than the original ring size, since it allows them to kick
> > dirty buffers out of the ring and claim new shared buffers.
> >
> > Bulkwrites and vacuums, however, will inevitably dirty buffers and
> > require flushing the buffer (and thus flushing the associated WAL) when
> > reusing them. Bulkwrites and vacuum do not kick dirtied buffers out of
> > the ring, since dirtying buffers is their common case. A dynamic
> > resizing like the one you suggest would likely devolve to vacuum and
> > bulkwrite strategies always using the max size.
>
> I think it should self stabilize around the point where the WAL is
> either flushed by other commit activity, WAL writer or WAL buffers
> filling up. Writing out their own dirtied buffers will still happen,
> just the associated WAL flushes will be in larger chunks and possibly
> done by other processes.

They will have to write out any WAL associated with modifications to the
dirty buffer before flushing it, so I'm not sure I understand how this
would work.

> > As for decreasing the ring size, buffers are only "added" to the ring
> > lazily and, technically, as it is now, buffers which have been added
> > added to the ring can always be reclaimed by the clocksweep (as long as
> > they are not pinned). The buffer access strategy is more of a
> > self-imposed restriction than it is a reservation. Since the ring is
> > small and the buffers are being frequently reused, odds are the usage
> > count will be 1 and we will be the one who set it to 1, but there is no
> > guarantee. If, when attempting to reuse the buffer, its usage count is
> > > 1 (or it is pinned), we also will kick it out of the ring and go look
> > for a replacement buffer.
>
> Right, but while the buffer is actively used by the ring it is
> unlikely that clocksweep will find it at usage 0 as the ring buffer
> should cycle more often than the clocksweep. Whereas if the ring stops
> using a buffer, clocksweep will eventually come and reclaim it. And if
> the ring shrinking decision turns out to be wrong before the
> clocksweep gets around to reusing it, we can bring the same buffer
> back into the ring.

I can see what you mean about excluding a buffer from the ring being a
more effective way of allowing it to be reclaimed. However, I'm not sure
I understand the use case. If the operation, say vacuum, is actively
using the buffer and keeping its usage count at one, then what would be
the criteria for it to decide to stop using it?

Also, if vacuum used the buffer once and then didn't reuse it but, for
some reason, the vacuum isn't over, it isn't any different at that point
than some other buffer with a usage count of one. It isn't any harder
for it to be reclaimed by the clocksweep.

The argument I could see for decreasing the size even when the buffers
are being used by the operation employing the strategy is if there is
pressure from other workloads to use those buffers. But, designing a
system that would reclaim buffers when needed by other workloads is more
complicated than what is being proposed here.

> > I do think that it is a bit unreasonable to expect users to know how
> > large they would like to make their buffer access strategy ring. What we
> > want is some way of balancing different kinds of workloads and
> > maintenance tasks reasonably. If your database has no activity because
> > it is the middle of the night or it was shutdown because of transaction
> > id wraparound, there is no reason why vacuum should limit the number of
> > buffers it uses. I'm sure there are many other such examples.
>
> Ideally yes, though I am not hopeful of finding a solution that does
> this any time soon. Just to take your example, if a nightly
> maintenance job wipes out the shared buffer contents slightly
> optimizing its non time-critical work and then causes morning user
> visible load to have big latency spikes due to cache misses, that's
> not a good tradeoff either.

Yes, that is a valid concern.

- Melanie



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Ants Aasma
Date:
On Mon, 20 Mar 2023 at 00:59, Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Wed, Mar 15, 2023 at 6:46 AM Ants Aasma <ants@cybertec.at> wrote:
> >
> > On Wed, 15 Mar 2023 at 02:29, Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > > As for routine vacuuming and the other buffer access strategies, I think
> > > there is an argument for configurability based on operator knowledge --
> > > perhaps your workload will use the data you are COPYing as soon as the
> > > COPY finishes, so you might as well disable a buffer access strategy or
> > > use a larger fraction of shared buffers. Also, the ring sizes were
> > > selected sixteen years ago and average server memory and data set sizes
> > > have changed.
> >
> > To be clear I'm not at all arguing against configurability. I was
> > thinking that dynamic use could make the configuration simpler by self
> > tuning to use no more buffers than is useful.
>
> Yes, but I am struggling with how we would define "useful".

For copy and vacuum, the only reason I can see for keeping visited
buffers around is to avoid flushing WAL or at least doing it in larger
batches. Once the ring is big enough that WAL doesn't need to be
flushed on eviction, making it bigger only wastes space that could be
used by something that is not going to be evicted soon.

> > > StrategyRejectBuffer() will allow bulkreads to, as you say, use more
> > > buffers than the original ring size, since it allows them to kick
> > > dirty buffers out of the ring and claim new shared buffers.
> > >
> > > Bulkwrites and vacuums, however, will inevitably dirty buffers and
> > > require flushing the buffer (and thus flushing the associated WAL) when
> > > reusing them. Bulkwrites and vacuum do not kick dirtied buffers out of
> > > the ring, since dirtying buffers is their common case. A dynamic
> > > resizing like the one you suggest would likely devolve to vacuum and
> > > bulkwrite strategies always using the max size.
> >
> > I think it should self stabilize around the point where the WAL is
> > either flushed by other commit activity, WAL writer or WAL buffers
> > filling up. Writing out their own dirtied buffers will still happen,
> > just the associated WAL flushes will be in larger chunks and possibly
> > done by other processes.
>
> They will have to write out any WAL associated with modifications to the
> dirty buffer before flushing it, so I'm not sure I understand how this
> would work.

By the time the dirty buffer needs eviction the WAL associated with it
can already be written out by concurrent commits, WAL writer or by WAL
buffers filling up. The bigger the ring is, the higher the chance that
one of these will happen before we loop around.

> > > As for decreasing the ring size, buffers are only "added" to the ring
> > > lazily and, technically, as it is now, buffers which have been added
> > > added to the ring can always be reclaimed by the clocksweep (as long as
> > > they are not pinned). The buffer access strategy is more of a
> > > self-imposed restriction than it is a reservation. Since the ring is
> > > small and the buffers are being frequently reused, odds are the usage
> > > count will be 1 and we will be the one who set it to 1, but there is no
> > > guarantee. If, when attempting to reuse the buffer, its usage count is
> > > > 1 (or it is pinned), we also will kick it out of the ring and go look
> > > for a replacement buffer.
> >
> > Right, but while the buffer is actively used by the ring it is
> > unlikely that clocksweep will find it at usage 0 as the ring buffer
> > should cycle more often than the clocksweep. Whereas if the ring stops
> > using a buffer, clocksweep will eventually come and reclaim it. And if
> > the ring shrinking decision turns out to be wrong before the
> > clocksweep gets around to reusing it, we can bring the same buffer
> > back into the ring.
>
> I can see what you mean about excluding a buffer from the ring being a
> more effective way of allowing it to be reclaimed. However, I'm not sure
> I understand the use case. If the operation, say vacuum, is actively
> using the buffer and keeping its usage count at one, then what would be
> the criteria for it to decide to stop using it?

The criteria for reducing ring size could be that we have cycled the
ring buffer n times without having to do any WAL flushes.

> Also, if vacuum used the buffer once and then didn't reuse it but, for
> some reason, the vacuum isn't over, it isn't any different at that point
> than some other buffer with a usage count of one. It isn't any harder
> for it to be reclaimed by the clocksweep.
>
> The argument I could see for decreasing the size even when the buffers
> are being used by the operation employing the strategy is if there is
> pressure from other workloads to use those buffers. But, designing a
> system that would reclaim buffers when needed by other workloads is more
> complicated than what is being proposed here.

I don't think any specific reclaim is needed, if the ring stops using
a buffer *and* there is pressure from other workloads the buffer will
get used for other stuff by the normal clocksweep. If the ring keeps
using it then the normal clocksweep is highly unlikely to find it with
usage count 0. If there is no concurrent allocation pressure, the ring
can start using it again if that turns out to be necessary (probably
should still check that it hasn't been reused by someone else).
--

Ants Aasma
Senior Database Engineer
www.cybertec-postgresql.com



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Tue, Mar 21, 2023 at 6:03 AM Ants Aasma <ants@cybertec.at> wrote:
>
> On Mon, 20 Mar 2023 at 00:59, Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > On Wed, Mar 15, 2023 at 6:46 AM Ants Aasma <ants@cybertec.at> wrote:
> > >
> > > On Wed, 15 Mar 2023 at 02:29, Melanie Plageman
> > > <melanieplageman@gmail.com> wrote:
> > > > As for routine vacuuming and the other buffer access strategies, I think
> > > > there is an argument for configurability based on operator knowledge --
> > > > perhaps your workload will use the data you are COPYing as soon as the
> > > > COPY finishes, so you might as well disable a buffer access strategy or
> > > > use a larger fraction of shared buffers. Also, the ring sizes were
> > > > selected sixteen years ago and average server memory and data set sizes
> > > > have changed.
> > >
> > > To be clear I'm not at all arguing against configurability. I was
> > > thinking that dynamic use could make the configuration simpler by self
> > > tuning to use no more buffers than is useful.
> >
> > Yes, but I am struggling with how we would define "useful".
>
> For copy and vacuum, the only reason I can see for keeping visited
> buffers around is to avoid flushing WAL or at least doing it in larger
> batches. Once the ring is big enough that WAL doesn't need to be
> flushed on eviction, making it bigger only wastes space that could be
> used by something that is not going to be evicted soon.

Well, I think if you know you will use the data you are COPYing right
away in your normal workload, it could be useful to have the ring be
large or to disable use of the ring. And, for vacuum, if you need to get
it done as quickly as possible, again, it could be useful to have the
ring be large or disable use of the ring.

> > > > StrategyRejectBuffer() will allow bulkreads to, as you say, use more
> > > > buffers than the original ring size, since it allows them to kick
> > > > dirty buffers out of the ring and claim new shared buffers.
> > > >
> > > > Bulkwrites and vacuums, however, will inevitably dirty buffers and
> > > > require flushing the buffer (and thus flushing the associated WAL) when
> > > > reusing them. Bulkwrites and vacuum do not kick dirtied buffers out of
> > > > the ring, since dirtying buffers is their common case. A dynamic
> > > > resizing like the one you suggest would likely devolve to vacuum and
> > > > bulkwrite strategies always using the max size.
> > >
> > > I think it should self stabilize around the point where the WAL is
> > > either flushed by other commit activity, WAL writer or WAL buffers
> > > filling up. Writing out their own dirtied buffers will still happen,
> > > just the associated WAL flushes will be in larger chunks and possibly
> > > done by other processes.
> >
> > They will have to write out any WAL associated with modifications to the
> > dirty buffer before flushing it, so I'm not sure I understand how this
> > would work.
>
> By the time the dirty buffer needs eviction the WAL associated with it
> can already be written out by concurrent commits, WAL writer or by WAL
> buffers filling up. The bigger the ring is, the higher the chance that
> one of these will happen before we loop around.

Ah, I think I understand the idea now. So, I think it is an interesting
idea to try and find the goldilocks size for the ring buffer. It is
especially interesting to me in the case in which we are enlarging the
ring.

However, given that concurrent workload variability, machine I/O latency
fluctuations, etc, we will definitely have to set a max value of some
kind anyway for the ring size. So, this seems more like a complimentary
feature to vacuum_buffer_usage_limit. If we added some kind of adaptive
sizing in a later version, we could emphasize in the guidance for
setting vacuum_buffer_usage_limit that it is the *maximum* size you
would like to allow vacuum to use. And, of course, there are the other
operations which use buffer access strategies.

> > > > As for decreasing the ring size, buffers are only "added" to the ring
> > > > lazily and, technically, as it is now, buffers which have been added
> > > > added to the ring can always be reclaimed by the clocksweep (as long as
> > > > they are not pinned). The buffer access strategy is more of a
> > > > self-imposed restriction than it is a reservation. Since the ring is
> > > > small and the buffers are being frequently reused, odds are the usage
> > > > count will be 1 and we will be the one who set it to 1, but there is no
> > > > guarantee. If, when attempting to reuse the buffer, its usage count is
> > > > > 1 (or it is pinned), we also will kick it out of the ring and go look
> > > > for a replacement buffer.
> > >
> > > Right, but while the buffer is actively used by the ring it is
> > > unlikely that clocksweep will find it at usage 0 as the ring buffer
> > > should cycle more often than the clocksweep. Whereas if the ring stops
> > > using a buffer, clocksweep will eventually come and reclaim it. And if
> > > the ring shrinking decision turns out to be wrong before the
> > > clocksweep gets around to reusing it, we can bring the same buffer
> > > back into the ring.
> >
> > I can see what you mean about excluding a buffer from the ring being a
> > more effective way of allowing it to be reclaimed. However, I'm not sure
> > I understand the use case. If the operation, say vacuum, is actively
> > using the buffer and keeping its usage count at one, then what would be
> > the criteria for it to decide to stop using it?
>
> The criteria for reducing ring size could be that we have cycled the
> ring buffer n times without having to do any WAL flushes.
>
> > Also, if vacuum used the buffer once and then didn't reuse it but, for
> > some reason, the vacuum isn't over, it isn't any different at that point
> > than some other buffer with a usage count of one. It isn't any harder
> > for it to be reclaimed by the clocksweep.
> >
> > The argument I could see for decreasing the size even when the buffers
> > are being used by the operation employing the strategy is if there is
> > pressure from other workloads to use those buffers. But, designing a
> > system that would reclaim buffers when needed by other workloads is more
> > complicated than what is being proposed here.
>
> I don't think any specific reclaim is needed, if the ring stops using
> a buffer *and* there is pressure from other workloads the buffer will
> get used for other stuff by the normal clocksweep. If the ring keeps
> using it then the normal clocksweep is highly unlikely to find it with
> usage count 0. If there is no concurrent allocation pressure, the ring
> can start using it again if that turns out to be necessary (probably
> should still check that it hasn't been reused by someone else).

Yes, you don't need a specific reclaim mechanism. But you would want to
be quite conservative about decreasing the ring size (given workload
variation and machine variations such as bursting in the cloud) and
probably not do so simply because the operation using the strategy
doesn't absolutely need the buffer but also because other concurrent
workloads really need the buffer. And, it seems complicated to determine
if other workloads do need the buffer.

- Melanie



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
On Mon, 20 Mar 2023 at 11:50, Melanie Plageman
<melanieplageman@gmail.com> wrote:
> Attached is an updated v6.

I had a look over the v6-0001 patch.  There are a few things I think
could be done better:

"Some operations will access a large number of pages at a time", does
this really need "at a time"? I think it's more relevant that the
operation uses a large number of pages.

Missing <firstterm> around Buffer Access Strategy.

Various things could be linked to other sections of the glossary, e.g.
pages could link to glossary-data-page, shared buffers could link to
glossary-shared-memory and WAL could link to glossary-wal.

The final paragraph should have <command> tags around the various
commands that you list.

I have adjusted those and slightly reworded a few other things. See
the attached .diff which can be applied atop of v6-0001.

David

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Thu, Mar 30, 2023 at 11:54 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Mon, 20 Mar 2023 at 11:50, Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > Attached is an updated v6.
>
> I had a look over the v6-0001 patch.  There are a few things I think
> could be done better:
>
> "Some operations will access a large number of pages at a time", does
> this really need "at a time"? I think it's more relevant that the
> operation uses a large number of pages.
>
> Missing <firstterm> around Buffer Access Strategy.
>
> Various things could be linked to other sections of the glossary, e.g.
> pages could link to glossary-data-page, shared buffers could link to
> glossary-shared-memory and WAL could link to glossary-wal.
>
> The final paragraph should have <command> tags around the various
> commands that you list.
>
> I have adjusted those and slightly reworded a few other things. See
> the attached .diff which can be applied atop of v6-0001.

There was one small typo keeping this from compiling. Also a repeated
word. I've fixed these. I also edited a bit of indentation and tweaked
some wording. Diff attached (to be applied on top of your diff).

- Melanie

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
On Sat, 1 Apr 2023 at 02:52, Melanie Plageman <melanieplageman@gmail.com> wrote:
> There was one small typo keeping this from compiling. Also a repeated
> word. I've fixed these. I also edited a bit of indentation and tweaked
> some wording. Diff attached (to be applied on top of your diff).

Thanks for fixing that mistake.

For reference, I had changed things to end lines early so that the
glossterm tags could be on a line of their own without breaking to a
new line. The rest of the file seems to be done that way, so I thought
we'd better stick to it.

I swapped out "associated WAL" for "unflushed WAL". I didn't agree
that the WAL that would be flushed would have any particular
association with the to-be-written page.

I dropped CTAS since I didn't see any other mention in the docs about
that. I could maybe see the sense in making reference to the
abbreviated form if we were going to mention it again and didn't want
to spell the whole thing out each time, but that's not the case here.

I pushed the result.

David



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Fri, Mar 31, 2023 at 5:47 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Sat, 1 Apr 2023 at 02:52, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > There was one small typo keeping this from compiling. Also a repeated
> > word. I've fixed these. I also edited a bit of indentation and tweaked
> > some wording. Diff attached (to be applied on top of your diff).
>
> Thanks for fixing that mistake.
>
> For reference, I had changed things to end lines early so that the
> glossterm tags could be on a line of their own without breaking to a
> new line. The rest of the file seems to be done that way, so I thought
> we'd better stick to it.
>
> I swapped out "associated WAL" for "unflushed WAL". I didn't agree
> that the WAL that would be flushed would have any particular
> association with the to-be-written page.
>
> I dropped CTAS since I didn't see any other mention in the docs about
> that. I could maybe see the sense in making reference to the
> abbreviated form if we were going to mention it again and didn't want
> to spell the whole thing out each time, but that's not the case here.
>
> I pushed the result.

Cool!

I've attached v7 with that commit dropped and with support for parallel
vacuum workers to use the same number of buffers in their own Buffer
Access Strategy ring as the main vacuum phase did. I also updated the
docs to indicate that vacuum_buffer_usage_limit is per backend (not per
instance of VACUUM).

- Melanie

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
On Sat, 1 Apr 2023 at 12:57, Melanie Plageman <melanieplageman@gmail.com> wrote:
> I've attached v7 with that commit dropped and with support for parallel
> vacuum workers to use the same number of buffers in their own Buffer
> Access Strategy ring as the main vacuum phase did. I also updated the
> docs to indicate that vacuum_buffer_usage_limit is per backend (not per
> instance of VACUUM).

(was just replying about v6-0002 when this came in. Replying here instead)

For v7-0001, can we just get rid of both of those static globals?  I'm
gobsmacked by the existing "A few variables that don't seem worth
passing around as parameters" comment.  Not wanting to pass parameters
around is a horrible excuse for adding global variables, even static
ones.

Attached is what I propose in .diff form so that the CFbot can run on
your v7 patches without picking this up.

I considered if we could switch memory contexts before calling
expand_vacuum_rel() and get_all_vacuum_rels(), but I see, at least in
the case of expand_vacuum_rel() that we'd probably want to list_free()
the output of find_all_inheritors() to save that from leaking into the
vac_context.  It seems safe just to switch into the vac_context only
when we really want to keep that memory around. (I do think switching
in each iteration of the foreach(part_lc, part_oids) loop is
excessive, however. Just not enough for me to want to change it)

David

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Justin Pryzby
Date:
On Sat, Apr 01, 2023 at 01:05:19PM +1300, David Rowley wrote:
> Attached is what I propose in .diff form so that the CFbot can run on
> your v7 patches without picking this up.

But it processes .diff, too

https://wiki.postgresql.org/wiki/Cfbot#Which_attachments_are_considered_to_be_patches.3F



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Fri, Mar 31, 2023 at 8:05 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Sat, 1 Apr 2023 at 12:57, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > I've attached v7 with that commit dropped and with support for parallel
> > vacuum workers to use the same number of buffers in their own Buffer
> > Access Strategy ring as the main vacuum phase did. I also updated the
> > docs to indicate that vacuum_buffer_usage_limit is per backend (not per
> > instance of VACUUM).
>
> (was just replying about v6-0002 when this came in. Replying here instead)
>
> For v7-0001, can we just get rid of both of those static globals?  I'm
> gobsmacked by the existing "A few variables that don't seem worth
> passing around as parameters" comment.  Not wanting to pass parameters
> around is a horrible excuse for adding global variables, even static
> ones.

Makes sense to me.

> Attached is what I propose in .diff form so that the CFbot can run on
> your v7 patches without picking this up.

Your diff LGTM.

Earlier upthread in [1], Bharath had mentioned in a review comment about
removing the global variables that he would have expected the analogous
global in analyze.c to also be removed (vac_strategy [and analyze.c also
has anl_context]).

I looked into doing this, and this is what I found out (see full
rationale in [2]):

> it is a bit harder to remove it from analyze because acquire_func
> doesn't take the buffer access strategy as a parameter and
> acquire_sample_rows uses the vac_context global variable to pass to
> table_scan_analyze_next_block().

I don't know if this is worth mentioning in the commit removing the
other globals? Maybe it will just make it more confusing...

> I considered if we could switch memory contexts before calling
> expand_vacuum_rel() and get_all_vacuum_rels(), but I see, at least in
> the case of expand_vacuum_rel() that we'd probably want to list_free()
> the output of find_all_inheritors() to save that from leaking into the
> vac_context.  It seems safe just to switch into the vac_context only
> when we really want to keep that memory around. (I do think switching
> in each iteration of the foreach(part_lc, part_oids) loop is
> excessive, however. Just not enough for me to want to change it)

Yes, I see what you mean. Your decision makes sense to me.

- Melanie

[1] https://www.postgresql.org/message-id/CALj2ACXKgAQpKsCPi6ox%2BK5JLDB9TAxeObyVOfrmgTjqmc0aAA%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAAKRu_brtqmd4e7kwEeKjySP22y4ywF32M7pvpi%2Bx5txgF0%2Big%40mail.gmail.com



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
Hi,

I was just doing some cleanup on the main patch in this set and realized
that it was missing a few things. One of which is forbidding the
BUFFER_USAGE_LIMIT with VACUUM FULL since VACUUM FULL does not use a
BAS_VACUUM strategy.

VACUUM FULL technically uses a bulkread buffer access strategy for
reading the original relation if its number of blocks is > number of
shared buffers / 4 (see initscan()). The new rel writing is done using
smgrextend/write directly and doesn't go through shared buffers. I
think it is a stretch to try and use the size passed in to VACUUM by
BUFFER_USAGE_LIMIT for the bulkread strategy ring.

As for forbidding the combination, I noticed that when VACUUM FULL is
specified with INDEX_CLEANUP OFF, there is no syntax error but the
INDEX_CLEANUP option is simply ignored. This is documented behavior.

I somehow feel like VACUUM (FULL, BUFFER_USAGE_LIMIT 'x') should error
out instead of silently not using the buffer usage limit, though.

I am looking for others' opinions.

- Melanie



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Justin Pryzby
Date:
On Sat, Apr 01, 2023 at 01:29:13PM -0400, Melanie Plageman wrote:
> Hi,
> 
> I was just doing some cleanup on the main patch in this set and realized
> that it was missing a few things. One of which is forbidding the
> BUFFER_USAGE_LIMIT with VACUUM FULL since VACUUM FULL does not use a
> BAS_VACUUM strategy.
> 
> VACUUM FULL technically uses a bulkread buffer access strategy for
> reading the original relation if its number of blocks is > number of
> shared buffers / 4 (see initscan()). The new rel writing is done using
> smgrextend/write directly and doesn't go through shared buffers. I
> think it is a stretch to try and use the size passed in to VACUUM by
> BUFFER_USAGE_LIMIT for the bulkread strategy ring.

When you say that it's a stretch, do you mean that it'd be a pain to add
arguments to handful of functions to pass down the setting ?  Or that
it's unclear if doing so would be the desirable/needed/intended/expected
behavior ?

I think if VACUUM FULL were going to allow a configurable strategy size,
then so should CLUSTER.  But it seems fine if they don't.

I wonder if maybe strategy should be configurable in some more generic
way, like a GUC.  At one point I had a patch to allow INSERT to use
strategy buffers (not just INSERT SELECT).  And that's still pretty
desirable.  Also COPY.  I've seen load spikes caused by pg_dumping
tables which are just below 25% of shared_buffers.  Which is exacerbated
because pg_dump deliberately orders tables by size, so those tables are
dumped one after another, each causing eviction of ~20% of shared
buffers.  And exacerbated some more because TOAST don't seem to use a
ring buffer in that case.

> I somehow feel like VACUUM (FULL, BUFFER_USAGE_LIMIT 'x') should error
> out instead of silently not using the buffer usage limit, though.
> 
> I am looking for others' opinions.

Sorry, no opinion here :)

One thing is that it's fine to take something that previously throw an
error and change it to not throw an error anymore.  But it's undesirable
to do the opposite.  For that reason, there's may be a tendency to add
errors for cases like this.

-- 
Justin



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Sat, Apr 1, 2023 at 1:57 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> On Sat, Apr 01, 2023 at 01:29:13PM -0400, Melanie Plageman wrote:
> > Hi,
> >
> > I was just doing some cleanup on the main patch in this set and realized
> > that it was missing a few things. One of which is forbidding the
> > BUFFER_USAGE_LIMIT with VACUUM FULL since VACUUM FULL does not use a
> > BAS_VACUUM strategy.
> >
> > VACUUM FULL technically uses a bulkread buffer access strategy for
> > reading the original relation if its number of blocks is > number of
> > shared buffers / 4 (see initscan()). The new rel writing is done using
> > smgrextend/write directly and doesn't go through shared buffers. I
> > think it is a stretch to try and use the size passed in to VACUUM by
> > BUFFER_USAGE_LIMIT for the bulkread strategy ring.
>
> When you say that it's a stretch, do you mean that it'd be a pain to add
> arguments to handful of functions to pass down the setting ?  Or that
> it's unclear if doing so would be the desirable/needed/intended/expected
> behavior ?

More that I don't think it makes sense. VACUUM FULL only uses a buffer
access strategy (BAS_BULKREAD) for reading the original relation in and
not for writing the new one. It has different concerns because its
behavior is totally different from regular vacuum. It is not modifying
the original buffers (AFAIK) and the amount of WAL it is generating is
different. Also, no matter what, the new relation won't be in shared
buffers because of VACUUM FULL using the smgr functions directly. So, I
think that allowing the two options together is confusing for the user
because it seems to imply we can give them some benefit that we cannot.

> I wonder if maybe strategy should be configurable in some more generic
> way, like a GUC.  At one point I had a patch to allow INSERT to use
> strategy buffers (not just INSERT SELECT).  And that's still pretty
> desirable.  Also COPY.  I've seen load spikes caused by pg_dumping
> tables which are just below 25% of shared_buffers.  Which is exacerbated
> because pg_dump deliberately orders tables by size, so those tables are
> dumped one after another, each causing eviction of ~20% of shared
> buffers.  And exacerbated some more because TOAST don't seem to use a
> ring buffer in that case.

Yes, it is probably worth exploring how configurable or dynamic Buffer
Access Strategies should be for other users (e.g. not just VACUUM).
However, since the ring sizes wouldn't be the same for all the different
operations, it is probably easier to start with a single kind of
operation and go from there.

> > I somehow feel like VACUUM (FULL, BUFFER_USAGE_LIMIT 'x') should error
> > out instead of silently not using the buffer usage limit, though.
> >
> > I am looking for others' opinions.
>
> Sorry, no opinion here :)
>
> One thing is that it's fine to take something that previously throw an
> error and change it to not throw an error anymore.  But it's undesirable
> to do the opposite.  For that reason, there's may be a tendency to add
> errors for cases like this.

So, I have made it error out when you specify BUFFER_USAGE_LIMIT with
VACUUM FULL or VACUUM ONLY_DATABASE_STATS. However, if you specify
buffer_usage_limit -1 with either of these options, it will not error
out. I don't love this, but I noticed that VACUUM (FULL, PARALLEL 0)
does not error out, while VACUUM (FULL, PARALLEL X) where X > 0 does.

If I want to error out when BUFFER_USAGE_LIMIT specified at all but
still do so at the bottom of ExecVacuum() with the rest of the vacuum
option sanity checking, I will probably need to add a flag bit for
VacuumParams->options.

I was wondering why some "sanity checking" of vacuum options is done in
ExecVacuum() and some in vacuum() (it isn't just split by what is
applicable to autovacuum and what isn't).

I noticed that even in cases where we don't use the strategy object we
still made it, which I thought seemed like a bit of a waste and easy to
fix. I've added a commit which does not make the BufferAccessStrategy
object when VACUUM FULL or VACUUM ONLY_DATABASE_STATS are specified. I
noticed that we also don't use the strategy for VACUUM (PROCESS_MAIN
false, PROCESS_TOAST false), but it didn't seem worth handling this very
specific case, so I didn't.

v8 attached has the prohibitions specified above (including for
vacuumdb, as relevant) as well as some cleanup, added test cases, and
updated documentation.

0001 is essentially unmodified (i.e. I didn't do anything with the other
global variable David mentioned).

I still have a few open questions:
- what the initial value of ring_size for autovacuum should be (see the
  one remaining TODO in the code)
- should ANALYZE allow specifying BUFFER_USAGE_LIMIT since it uses the guc
  value when that is set?
- should INDEX_CLEANUP off cause VACUUM to use shared buffers and
  disable use of a strategy (like failsafe vacuum)
- should we add anything to VACUUM VERBOSE output about the number of
  reuses of strategy buffers?
- Should we make BufferAccessStrategyData non-opaque so that we don't
  have to add a getter for nbuffers. I could have implemented this in
  another way, but I don't really see why BufferAccessStrategyData
  should be opaque

- Melanie

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
On Sat, 1 Apr 2023 at 13:24, Melanie Plageman <melanieplageman@gmail.com> wrote:
> Your diff LGTM.
>
> Earlier upthread in [1], Bharath had mentioned in a review comment about
> removing the global variables that he would have expected the analogous
> global in analyze.c to also be removed (vac_strategy [and analyze.c also
> has anl_context]).
>
> I looked into doing this, and this is what I found out (see full
> rationale in [2]):
>
> > it is a bit harder to remove it from analyze because acquire_func
> > doesn't take the buffer access strategy as a parameter and
> > acquire_sample_rows uses the vac_context global variable to pass to
> > table_scan_analyze_next_block().
>
> I don't know if this is worth mentioning in the commit removing the
> other globals? Maybe it will just make it more confusing...

I did look at that, but it seems a little tricky to make work unless
the AcquireSampleRowsFunc signature was changed. To me, it just does
not seem worth doing that to get rid of the two globals in analyze.c.

I pushed the patch with just the vacuum.c changes.

David



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
I've now pushed up v8-0004.  Can rebase the remaining 2 patches on top
of master again and resend?

On Mon, 3 Apr 2023 at 08:11, Melanie Plageman <melanieplageman@gmail.com> wrote:
> I still have a few open questions:
> - what the initial value of ring_size for autovacuum should be (see the
>   one remaining TODO in the code)

I assume you're talking about the 256KB BAS_VACUUM one set in
GetAccessStrategy()? I don't think this patch should be doing anything
to change those defaults.  Anything that does that should likely have
a new thread and come with analysis or reasoning about why the newly
proposed defaults are better than the old ones.

> - should ANALYZE allow specifying BUFFER_USAGE_LIMIT since it uses the guc
>   value when that is set?

That's a good question...

> - should INDEX_CLEANUP off cause VACUUM to use shared buffers and
>   disable use of a strategy (like failsafe vacuum)

I don't see why it should.  It seems strange to have one option
magically make changes to some other option.

> - should we add anything to VACUUM VERBOSE output about the number of
>   reuses of strategy buffers?

Sounds like this would require an extra array of counter variables in
BufferAccessStrategyData?  I think it might be a bit late to start
experimenting with this.

> - Should we make BufferAccessStrategyData non-opaque so that we don't
>   have to add a getter for nbuffers. I could have implemented this in
>   another way, but I don't really see why BufferAccessStrategyData
>   should be opaque

If nothing outside of the .c file requires access then there's little
need to make the members known outside of the file. Same as you'd want
to make classes private rather than public when possible in OOP.

If you do come up with a reason to be able to determine the size of
the BufferAccessStrategy from outside freelist.c, I'd say an accessor
method is the best way.

David

David



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Mon, Apr 3, 2023 at 1:09 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Sat, 1 Apr 2023 at 13:24, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > Your diff LGTM.
> >
> > Earlier upthread in [1], Bharath had mentioned in a review comment about
> > removing the global variables that he would have expected the analogous
> > global in analyze.c to also be removed (vac_strategy [and analyze.c also
> > has anl_context]).
> >
> > I looked into doing this, and this is what I found out (see full
> > rationale in [2]):
> >
> > > it is a bit harder to remove it from analyze because acquire_func
> > > doesn't take the buffer access strategy as a parameter and
> > > acquire_sample_rows uses the vac_context global variable to pass to
> > > table_scan_analyze_next_block().
> >
> > I don't know if this is worth mentioning in the commit removing the
> > other globals? Maybe it will just make it more confusing...
>
> I did look at that, but it seems a little tricky to make work unless
> the AcquireSampleRowsFunc signature was changed. To me, it just does
> not seem worth doing that to get rid of the two globals in analyze.c.

Yes, I came to basically the same conclusion.

On Mon, Apr 3, 2023 at 7:57 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
> I've now pushed up v8-0004.  Can rebase the remaining 2 patches on top
> of master again and resend?

v9 attached.

> On Mon, 3 Apr 2023 at 08:11, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > I still have a few open questions:
> > - what the initial value of ring_size for autovacuum should be (see the
> >   one remaining TODO in the code)
>
> I assume you're talking about the 256KB BAS_VACUUM one set in
> GetAccessStrategy()? I don't think this patch should be doing anything
> to change those defaults.  Anything that does that should likely have
> a new thread and come with analysis or reasoning about why the newly
> proposed defaults are better than the old ones.

I actually was talking about something much more trivial but a little
more confusing.

In table_recheck_autovac(), I initialize the
autovac_table->at_params.ring_size to the value of the
vacuum_buffer_usage_limit guc. However, autovacuum makes its own
BufferAccessStrategy object (instead of relying on vacuum() to do it)
and passes that in to vacuum(). So, if we wanted autovacuum to disable
use of a strategy (and use as many shared buffers as it likes), it would
pass in NULL to vacuum(). If vauum_buffer_usage_limit is not 0, then we
would end up making and using a BufferAccessStrategy in vacuum().

If we instead initialized autovac_table->at_params.ring_size to 0, even
if the passed in BufferAccessStrategy is NULL, we wouldn't make a ring
for autovacuum. Right now, we don't disable the strategy for autovacuum
except in failsafe mode. And it is unclear when or why we would want to.

I also thought it might be weird to have the value of the ring_size be
initialized to something other than the value of
vacuum_buffer_usage_limit for autovacuum, since it is supposed to use
that guc value.

In fact, right now, we don't use the autovac_table->at_params.ring_size
set in table_recheck_autovac() when making the ring in do_autovacuum()
but instead use the guc directly.

I actually don't really like how vacuum() relies on the
BufferAccessStrategy parameter being NULL for autovacuum and feel like
there is a more intuitive way to handle all this. But, I didn't want to
make major changes at this point.

Anyway, the above is quite a bit more analysis than the issue is really
worth. We should pick something and then document it in a comment.

> > - should ANALYZE allow specifying BUFFER_USAGE_LIMIT since it uses the guc
> >   value when that is set?
>
> That's a good question...

I kinda think we should just skip it. It adds to the surface area of the
feature.

> > - should INDEX_CLEANUP off cause VACUUM to use shared buffers and
> >   disable use of a strategy (like failsafe vacuum)
>
> I don't see why it should.  It seems strange to have one option
> magically make changes to some other option.

Sure, sounds good.

> > - should we add anything to VACUUM VERBOSE output about the number of
> >   reuses of strategy buffers?
>
> Sounds like this would require an extra array of counter variables in
> BufferAccessStrategyData?  I think it might be a bit late to start
> experimenting with this.

Makes sense. I hadn't thought through the implementation. We count reuses in
pg_stat_io data structures but that is global and not per
BufferAccessStrategyData instance, so I agree to scrapping this idea.

> > - Should we make BufferAccessStrategyData non-opaque so that we don't
> >   have to add a getter for nbuffers. I could have implemented this in
> >   another way, but I don't really see why BufferAccessStrategyData
> >   should be opaque
>
> If nothing outside of the .c file requires access then there's little
> need to make the members known outside of the file. Same as you'd want
> to make classes private rather than public when possible in OOP.
>
> If you do come up with a reason to be able to determine the size of
> the BufferAccessStrategy from outside freelist.c, I'd say an accessor
> method is the best way.

In the main patch, I wanted access to the number of buffers so that
parallel vacuum workers could make their own rings the same size. I
added an accessor, but it looked a bit silly so I thought I would ask if
we needed to keep the data structure opaque. It isn't called frequently
enough to worry about the function call overhead. Though the accessor
could use a better name than the one I chose.

- Melanie

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
On Tue, 4 Apr 2023 at 02:49, Melanie Plageman <melanieplageman@gmail.com> wrote:
> v9 attached.

I've made a pass on the v9-0001 patch only.  Here's what I noted down:

v9-0001:

1. In the documentation and comments, generally we always double-space
after a period.  I see quite often you're not following this.

2. Doc: We could generally seem to break tags within paragraphs into
multiple lines. You're doing that quite a bit, e.g:

        linkend="glossary-buffer-access-strategy">Buffer Access
        Strategy</glossterm>. <literal>0</literal> will disable use of a

2. This is not a command

        <command>BUFFER_USAGE_LIMIT</command> parameter.

<option> is probably what you want.

3. I'm not sure I agree that it's a good idea to refer to the strategy
with multiple different names. Here you've called it a "ring buffer",
but in the next sentence, you're calling it a Buffer Access Strategy.

      Specifies the ring buffer size for <command>VACUUM</command>. This size
      is used to calculate the number of shared buffers which will be reused as
      part of a <glossterm linkend="glossary-buffer-access-strategy">Buffer
      Access Strategy</glossterm>. <literal>0</literal> disables use of a

4. Can you explain your choice in not just making < 128 a hard error
rather than clamping?

I guess it means checks like this are made more simple, but that does
not seem like a good enough reason:

/* check for out-of-bounds */
if (result < -1 || result > MAX_BAS_RING_SIZE_KB)


postgres=# vacuum (parallel -1) pg_class;
ERROR:  parallel workers for vacuum must be between 0 and 1024

Maybe the above is a good guide to follow.

To allow you to get rid of the clamping code, you'd likely need an
assign hook function for vacuum_buffer_usage_limit.

5. I see vacuum.sgml is full of inconsistencies around the use of
<literal> vs <option>. I was going to complain about your:

      <literal>ONLY_DATABASE_STATS</literal> option. If
      <literal>ANALYZE</literal> is also specified, the
      <literal>BUFFER_USAGE_LIMIT</literal> value is used for both the vacuum

but I see you've likely just copied what's nearby.

There are also plenty of usages of <option> in that file.  I'd rather
see you use <option>. Maybe there can be some other patch that sweeps
the entire docs to look for <literal>OPTION_NAME</literal> and
replaces them to use <option>.

6. I was surprised to see you've added both
GetAccessStrategyWithSize() and GetAccessStrategyWithNBuffers(). I
think the former is suitable for both. GetAccessStrategyWithNBuffers()
seems to be just used once outside of freelist.c

7. I don't think bas_nbuffers() is a good name for an external
function.  StrategyGetBufferCount() seems better.

8. I don't quite follow this comment:

/*
* TODO: should this be 0 so that we are sure that vacuum() never
* allocates a new bstrategy for us, even if we pass in NULL for that
* parameter? maybe could change how failsafe NULLs out bstrategy if
* so?
*/

Can you explain under what circumstances would vacuum() allocate a
bstrategy when do_autovacuum() would not? Is this a case of a config
reload where someone changes vacuum_buffer_usage_limit from 0 to
something non-zero? If so, perhaps do_autovacuum() needs to detect
this and allocate a strategy rather than having vacuum() do it once
per table (wastefully).

9. buffer/README.  I think it might be overkill to document details
about how the new vacuum option works in a section talking about
Buffer Ring Replacement Strategy.  Perhaps it just worth something
like:

"In v16, the 256KB ring was made configurable by way of the
vacuum_buffer_usage_limit GUC and the BUFFER_USAGE_LIMIT VACUUM
option."

10. I think if you do #4 then you can get rid of all the range checks
and DEBUG1 elogs in GetAccessStrategyWithSize().

11. This seems a bit badly done:

int vacuum_buffer_usage_limit = -1;

int VacuumCostPageHit = 1; /* GUC parameters for vacuum */
int VacuumCostPageMiss = 2;
int VacuumCostPageDirty = 20;

I'd class vacuum_buffer_usage_limit as a "GUC parameters for vacuum"
too.  Probably the CamelCase naming should be followed too.


12. ANALYZE too?

{"vacuum_buffer_usage_limit", PGC_USERSET, RESOURCES_MEM,
gettext_noop("Sets the buffer pool size for VACUUM and autovacuum."),

13. VacuumParams.ring_size has no comments explaining what it is.

14. vacuum_buffer_usage_limit seems to be lumped in with unrelated GUCs

extern PGDLLIMPORT int maintenance_work_mem;
extern PGDLLIMPORT int max_parallel_maintenance_workers;
+extern PGDLLIMPORT int vacuum_buffer_usage_limit;

extern PGDLLIMPORT int VacuumCostPageHit;
extern PGDLLIMPORT int VacuumCostPageMiss;


15. No comment explaining what these are:

#define MAX_BAS_RING_SIZE_KB (16 * 1024 * 1024)
#define MIN_BAS_RING_SIZE_KB 128

16. Parameter names in function declaration and definition don't match in:

extern BufferAccessStrategy
GetAccessStrategyWithNBuffers(BufferAccessStrategyType btype, int
nbuffers);
extern BufferAccessStrategy
GetAccessStrategyWithSize(BufferAccessStrategyType btype, int
nbuffers);

Also, line wraps at 79 chars. (80 including line feed)

17. If you want to test the 16GB upper limit, maybe going 1KB (or
8KB?) rather than 1GB over 16GB is better? 2097153kB, I think.

VACUUM (BUFFER_USAGE_LIMIT '17 GB') vac_option_tab;

David



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Mon, Apr 3, 2023 at 8:37 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Tue, 4 Apr 2023 at 02:49, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > v9 attached.
>
> I've made a pass on the v9-0001 patch only.  Here's what I noted down:

Thanks for the review!

Attached v10 addresses the review feedback below.

Remaining TODOs:
- tests
- do something about config reload changing GUC

> v9-0001:
>
> 1. In the documentation and comments, generally we always double-space
> after a period.  I see quite often you're not following this.

I've gone through and done this. I noticed after building the docs that
it doesn't seem to affect how many spaces are after a period in the
rendered docs, but I suppose it affects readability when editing the
sgml files.

> 2. Doc: We could generally seem to break tags within paragraphs into
> multiple lines. You're doing that quite a bit, e.g:
>
>         linkend="glossary-buffer-access-strategy">Buffer Access
>         Strategy</glossterm>. <literal>0</literal> will disable use of a

I've updated all of the ones I could find that I did this with.

> 2. This is not a command
>
>         <command>BUFFER_USAGE_LIMIT</command> parameter.
>
> <option> is probably what you want.

I have gone through and attempted to correct all
option/command/application tag usages.

> 3. I'm not sure I agree that it's a good idea to refer to the strategy
> with multiple different names. Here you've called it a "ring buffer",
> but in the next sentence, you're calling it a Buffer Access Strategy.
>
>       Specifies the ring buffer size for <command>VACUUM</command>. This size
>       is used to calculate the number of shared buffers which will be reused as
>       part of a <glossterm linkend="glossary-buffer-access-strategy">Buffer
>       Access Strategy</glossterm>. <literal>0</literal> disables use of a

I've updated this to always prefix any use of ring with "Buffer Access
Strategy". I don't know how you'll feel about it. It felt awkward in
some places to use Buffer Access Strategy as a complete stand-in for
ring buffer.

> 4. Can you explain your choice in not just making < 128 a hard error
> rather than clamping?
>
> I guess it means checks like this are made more simple, but that does
> not seem like a good enough reason:
>
> /* check for out-of-bounds */
> if (result < -1 || result > MAX_BAS_RING_SIZE_KB)
>
> postgres=# vacuum (parallel -1) pg_class;
> ERROR:  parallel workers for vacuum must be between 0 and 1024
>
> Maybe the above is a good guide to follow.
>
> To allow you to get rid of the clamping code, you'd likely need an
> assign hook function for vacuum_buffer_usage_limit.

I've added a check hook and replicated the same restrictions in
ExecVacuum() where it parses the limit. I have included enforcement of
the conditional limit that the ring cannot occupy more than 1/8 of
shared buffers. The immediate consequence of this was that my tests were
no longer stable (except for the integer overflow one).
I have removed them for now until I can come up with a better testing
strategy.

On the topic of testing, I also thought we should remove the
VACUUM(BUFFER_USAGE_LIMIT X, PARALLEL X) test. Though the parallel
workers do make their own strategy rings and such a test would be
covering some code, I am hesitant to write a test that would never
really fail. The observable behavior of not using a strategy will be
either 1) basically nothing or 2) the same for parallel and
non-parallel. What do you think?

> 5. I see vacuum.sgml is full of inconsistencies around the use of
> <literal> vs <option>. I was going to complain about your:
>
>       <literal>ONLY_DATABASE_STATS</literal> option. If
>       <literal>ANALYZE</literal> is also specified, the
>       <literal>BUFFER_USAGE_LIMIT</literal> value is used for both the vacuum
>
> but I see you've likely just copied what's nearby.
>
> There are also plenty of usages of <option> in that file.  I'd rather
> see you use <option>. Maybe there can be some other patch that sweeps
> the entire docs to look for <literal>OPTION_NAME</literal> and
> replaces them to use <option>.

I haven't done the separate sweep patch, but I have updated my own
usages in this set.

> 6. I was surprised to see you've added both
> GetAccessStrategyWithSize() and GetAccessStrategyWithNBuffers(). I
> think the former is suitable for both. GetAccessStrategyWithNBuffers()
> seems to be just used once outside of freelist.c

This has been updated and reorganized.

> 7. I don't think bas_nbuffers() is a good name for an external
> function.  StrategyGetBufferCount() seems better.

I've used this name.

> 8. I don't quite follow this comment:
>
> /*
> * TODO: should this be 0 so that we are sure that vacuum() never
> * allocates a new bstrategy for us, even if we pass in NULL for that
> * parameter? maybe could change how failsafe NULLs out bstrategy if
> * so?
> */
>
> Can you explain under what circumstances would vacuum() allocate a
> bstrategy when do_autovacuum() would not? Is this a case of a config
> reload where someone changes vacuum_buffer_usage_limit from 0 to
> something non-zero? If so, perhaps do_autovacuum() needs to detect
> this and allocate a strategy rather than having vacuum() do it once
> per table (wastefully).

Hmm. Yes, I started hacking on this, but I think it might be a bit
tricky to get right. I think it would make sense to check if
vacuum_buffer_usage_limit goes from 0 to not 0 or from not 0 to 0 and
allow disabling and enabling the buffer access strategy, however, I'm
not sure we want to allow changing the size during an autovacuum
worker's run. I started writing code to just allow enabling and
disabling, but I'm a little concerned that the distinction will be
difficult to understand for the user with no obvious indication of what
is happening. That is, you change the size and it silently does nothing,
but you set it to/from 0 and it silently does something?

One alternative for now is to save the ring size before looping through
the relations in do_autovacuum() and always restore that value in
tab->at_params.ring_size in table_recheck_autovac().

I'm not sure what to do.

> 9. buffer/README.  I think it might be overkill to document details
> about how the new vacuum option works in a section talking about
> Buffer Ring Replacement Strategy.  Perhaps it just worth something
> like:
>
> "In v16, the 256KB ring was made configurable by way of the
> vacuum_buffer_usage_limit GUC and the BUFFER_USAGE_LIMIT VACUUM
> option."

I've made the change you suggested.

> 10. I think if you do #4 then you can get rid of all the range checks
> and DEBUG1 elogs in GetAccessStrategyWithSize().

Done.

> 11. This seems a bit badly done:
>
> int vacuum_buffer_usage_limit = -1;
>
> int VacuumCostPageHit = 1; /* GUC parameters for vacuum */
> int VacuumCostPageMiss = 2;
> int VacuumCostPageDirty = 20;
>
> I'd class vacuum_buffer_usage_limit as a "GUC parameters for vacuum"
> too.  Probably the CamelCase naming should be followed too.

I've made this change.

> 12. ANALYZE too?
>
> {"vacuum_buffer_usage_limit", PGC_USERSET, RESOURCES_MEM,
> gettext_noop("Sets the buffer pool size for VACUUM and autovacuum."),

I've mentioned this here and also added the option for ANALYZE.

> 13. VacuumParams.ring_size has no comments explaining what it is.

I've added one.

> 14. vacuum_buffer_usage_limit seems to be lumped in with unrelated GUCs
>
> extern PGDLLIMPORT int maintenance_work_mem;
> extern PGDLLIMPORT int max_parallel_maintenance_workers;
> +extern PGDLLIMPORT int vacuum_buffer_usage_limit;
>
> extern PGDLLIMPORT int VacuumCostPageHit;
> extern PGDLLIMPORT int VacuumCostPageMiss;

I've moved it down a line.

> 15. No comment explaining what these are:
>
> #define MAX_BAS_RING_SIZE_KB (16 * 1024 * 1024)
> #define MIN_BAS_RING_SIZE_KB 128

I've added one.

> 16. Parameter names in function declaration and definition don't match in:
>
> extern BufferAccessStrategy
> GetAccessStrategyWithNBuffers(BufferAccessStrategyType btype, int
> nbuffers);
> extern BufferAccessStrategy
> GetAccessStrategyWithSize(BufferAccessStrategyType btype, int
> nbuffers);

I've fixed this.

> Also, line wraps at 79 chars. (80 including line feed)

I've fixed that function prototype instance of it.

In general line wrap limit + pgindent can be quite challenging. I often
break something onto multiple lines to appease the line limit and then
pgindent will add an absurd number of tabs to align the second line in a
way that looks truly awful. I try to make local variables when this is a
problem, but it is often quite annoying to do that. I wish there was
some way to make pgindent do something different in these cases.

> 17. If you want to test the 16GB upper limit, maybe going 1KB (or
> 8KB?) rather than 1GB over 16GB is better? 2097153kB, I think.
>
> VACUUM (BUFFER_USAGE_LIMIT '17 GB') vac_option_tab;

I've removed this test for now until I figure out a way to actually hit
this reliably with different-sized shared buffers.

- Melanie

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
On Wed, 5 Apr 2023 at 05:53, Melanie Plageman <melanieplageman@gmail.com> wrote:
> Attached v10 addresses the review feedback below.

Thanks.  Here's another round on v10-0001:

1. The following documentation fragment does not seem to be aligned
with the code:

        <literal>16 GB</literal>.  The minimum size is the lesser
        of 1/8 the size of shared buffers and <literal>128 KB</literal>.  The
        default value is <literal>-1</literal>. If this value is specified

The relevant code is:

static inline int
StrategyGetClampedBufsize(int bufsize_kb)
{
    int sb_limit_kb;
    int blcksz_kb = BLCKSZ / 1024;

    Assert(blcksz_kb > 0);

    sb_limit_kb = NBuffers / 8 * blcksz_kb;

    return Min(sb_limit_kb, bufsize_kb);
}

The code seems to mean that the *maximum* is the lesser of 16GB and
shared_buffers / 8. You're saying it's the minimum.


2. I think you could get rid of the double "Buffer Access Stategy" in:

        <glossterm linkend="glossary-buffer-access-strategy">Buffer
Access Strategy</glossterm>.
        <literal>0</literal> will disable use of a <literal>Buffer
Access Strategy</literal>.
        <literal>-1</literal> will set the size to a default of
        <literal>256 KB</literal>.  The maximum size is

how about:

        <glossterm linkend="glossary-buffer-access-strategy">Buffer
Access Strategy</glossterm>.
        A setting of <literal>0</literal> will allow the operation to use any
        number of <varname>shared_buffers</varname>, whereas
        <literal>-1</literal> will set the size to a default of
        <literal>256 KB</literal>.  The maximum size is


3. In the following snippet you can use <xref linkend="sql-vacuum"/>
or just <command>VACUUM</command>. There are examples of both in that
file. I don't have a preference as it which, but I think what you've
got isn't great.

        <link linkend="sql-vacuum"><command>VACUUM</command></link> and
        <link linkend="sql-analyze"><command>ANALYZE</command></link>

4. I wonder if there's a reason this needs to be written in the
overview of ANALYZE.

   <command>ANALYZE</command> uses a
   <glossterm linkend="glossary-buffer-access-strategy">Buffer Access
Strategy</glossterm>
   when reading in the sample data. The number of buffers consumed for this can
   be controlled by <xref linkend="guc-vacuum-buffer-usage-limit"/> or by using
   the <option>BUFFER_USAGE_LIMIT</option> option.

I think it's fine just to mention it under BUFFER_USAGE_LIMIT. It just
does not seem fundamental enough to be worth being upfront about it.
The other things mentioned in that section don't seem related to
parameters, so there might be no better place for those to go. That's
not the case for what you're adding here.

5. I think I'd rather see the details spelt out here rather than
telling the readers to look at what VACUUM does:

      Specifies the
      <glossterm linkend="glossary-buffer-access-strategy">Buffer
Access Strategy</glossterm>
      ring buffer size for <command>ANALYZE</command>.  See the
      <link linkend="sql-vacuum"><command>VACUUM</command></link> option with
      the same name.


6. When I asked about restricting the valid values of
vacuum_buffer_usage_limit to -1 / 0 or 128 KB to 16GB, I didn't expect
you to code in the NBuffers / 8 check.  We shouldn't chain
dependencies between GUCs like that.  Imagine someone editing their
postgresql.conf after realising shared_buffers is too large for their
RAM, they reduce it and restart.  The database fails to start because
vacuum_buffer_usage_limit is too large!  Angry DBA?

Take what's already written about vacuum_failsafe_age as your guidance on this:

"The default is 1.6 billion transactions. Although users can set this
value anywhere from zero to 2.1 billion, VACUUM will silently adjust
the effective value to no less than 105% of
autovacuum_freeze_max_age."

Here we just document the silent capping.   You can still claim the
valid range is 128KB to 16GB in the docs. You can mention the 1/8th of
shared buffers cap same as what's mentioned about "105%" above.

When I mentioned #4 and #10 in my review of the v9-0001 patch, I just
wanted to not surprise users who do vacuum_buffer_usage_limit = 64 and
magically get 128.

7. In ExecVacuum(), similar to #6 from above, it's also not great that
you're raising an ERROR based on if StrategyGetClampedBufsize() clamps
or not. If someone has a script that does:

VACUUM (BUFFER_USAGE_LIMIT '1 GB'); it might be annoying that it stops
working when someone adjusts shared buffers from 10GB to 6GB.

I really think the NBuffers / 8 clamping just should be done inside
GetAccessStrategyWithSize().

8. I think this ERROR in vacuum.c should mention that 0 is a valid value.

ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("buffer_usage_limit for a vacuum must be between %d KB and %d KB",
MIN_BAS_RING_SIZE_KB, MAX_BAS_RING_SIZE_KB)));

I doubt there's a need to mention -1 as that's the same as not
specifying BUFFER_USAGE_LIMIT.

9. The following might be worthy of a comment explaining the order of
precedence of how we choose the size:

if (params->ring_size == -1)
{
if (VacuumBufferUsageLimit == -1)
bstrategy = GetAccessStrategy(BAS_VACUUM);
else
bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, VacuumBufferUsageLimit);
}
else
bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, params->ring_size);

10. I wonder if you need to keep bufsize_limit_to_nbuffers(). It's
just used once and seems trivial enough just to write that code inside
GetAccessStrategyWithSize().

11. It's probably worth putting the valid range in the sample config:

#vacuum_buffer_usage_limit = -1 # size of vacuum and analyze buffer
access strategy ring.
# -1 to use default,
# 0 to disable vacuum buffer access strategy
# > 0 to specify size <-- here

12. Is bufmgr.h the right place for these?

/*
 * Upper and lower hard limits for the Buffer Access Strategy ring size
 * specified by the vacuum_buffer_usage_limit GUC and BUFFER_USAGE_LIMIT option
 * to VACUUM and ANALYZE.
 */
#define MAX_BAS_RING_SIZE_KB (16 * 1024 * 1024)
#define MIN_BAS_RING_SIZE_KB 128

Your comment implies they're VACUUM / ANALYZE limits.  If we want to
impose these limits to all access strategies then these seem like good
names and location, otherwise, I imagine miscadmin.h is the correct
place. If so, they'll likely want to be renamed to something more
VACUUM specific.  I don't particularly have a preference. 128 -
1677216 seem like reasonable limits for any buffer access strategy.

13. I think check_vacuum_buffer_usage_limit() does not belong in
freelist.c. Maybe vacuum.c?

14. Not related to this patch, but why do we have half the vacuum
related GUCs in vacuum.c and the other half in globals.c? I see
vacuum_freeze_table_age is defined in vacuum.c but is also needed in
autovacuum.c, so that rules out the globals.c ones being for vacuum.c
and autovacuum.c.  It seems a bit messy.   I'm not really sure where
VacuumBufferUsageLimit should go now.

> Remaining TODOs:
> - tests
> - do something about config reload changing GUC

Shouldn't table_recheck_autovac() pfree/palloc a new strategy if the
size changes?

I'm not sure what the implications are with that and the other patch
you're working on to allow vacuum config changes mid-vacuum. We'll
need to be careful and not immediately break that if that gets
committed then this does or vice-versa.

David



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Peter Geoghegan
Date:
On Tue, Apr 4, 2023 at 8:14 PM David Rowley <dgrowleyml@gmail.com> wrote:
> 14. Not related to this patch, but why do we have half the vacuum
> related GUCs in vacuum.c and the other half in globals.c? I see
> vacuum_freeze_table_age is defined in vacuum.c but is also needed in
> autovacuum.c, so that rules out the globals.c ones being for vacuum.c
> and autovacuum.c.  It seems a bit messy.   I'm not really sure where
> VacuumBufferUsageLimit should go now.

vacuum_freeze_table_age is an abomination, even compared to the rest
of these GUCs. It was added around the time the visibility map first
went in, and so is quite a bit more recent than
autovacuum_freeze_max_age.

Before the introduction of the visibility map, we only had
autovacuum_freeze_max_age, and it was used to schedule antiwraparound
autovacuums -- there was no such thing as aggressive VACUUMs (just
antiwraparound autovacuums), and no need for autovacuum_freeze_max_age
at all. So autovacuum_freeze_max_age was just for autovacuum.c code.
There was only one type of VACUUM, and they always advanced
relfrozenxid to the same degree.

With the introduction of the visibility map, we needed to have
autovacuum_freeze_max_age in vacuum.c for the first time, to deal with
interpreting the then-new vacuum_freeze_table_age GUC correctly. We
silently truncate the vacuum_freeze_table_age setting so that it never
exceeds 95% of autovacuum_freeze_max_age (the
105%-of-autovacuum_freeze_max_age vacuum_failsafe_age thing that
you're discussing is symmetric). So after 2009
autovacuum_freeze_max_age actually plays an important role in VACUUM,
the command, and not just autovacuum.

This is related to the problem of the autovacuum_freeze_max_age
reloption being completely broken [1]. If autovacuum_freeze_max_age
was still purely just an autovacuum.c scheduling thing, then there
would be no problem with having a separate reloption of the same name.
There are big problems precisely because vacuum.c doesn't do anything
with the autovacuum_freeze_max_age reloption. Though it does okay with
the autovacuum_freeze_table_age reloption, which gets passed in. (Yes,
it's called autovacuum_freeze_table_age in reloption form -- not
vacuum_freeze_table_age, like the GUC).

Note that the decision to ignore the reloption version of
autovacuum_freeze_max_age in the failsafe's
105%-of-autovacuum_freeze_max_age thing was a deliberate one. The
autovacuum_freeze_max_age GUC is authoritative in the sense that it
cannot be overridden locally, except in the direction of making
aggressive VACUUMs happen more frequently for a given table (so they
can't be less frequent via reloption configuration). I suppose you'd
have to untangle that mess if you wanted to fix the
autovacuum_freeze_max_age reloption issue I go into in [1].

[1] https://postgr.es/m/CAH2-Wz=DJAokY_GhKJchgpa8k9t_H_OVOvfPEn97jGNr9W=deg@mail.gmail.com
--
Peter Geoghegan



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Andres Freund
Date:
Hi,

On 2023-04-04 13:53:15 -0400, Melanie Plageman wrote:
> > 8. I don't quite follow this comment:
> >
> > /*
> > * TODO: should this be 0 so that we are sure that vacuum() never
> > * allocates a new bstrategy for us, even if we pass in NULL for that
> > * parameter? maybe could change how failsafe NULLs out bstrategy if
> > * so?
> > */
> >
> > Can you explain under what circumstances would vacuum() allocate a
> > bstrategy when do_autovacuum() would not? Is this a case of a config
> > reload where someone changes vacuum_buffer_usage_limit from 0 to
> > something non-zero? If so, perhaps do_autovacuum() needs to detect
> > this and allocate a strategy rather than having vacuum() do it once
> > per table (wastefully).

Hm. I don't much like that we use a single strategy for multiple tables
today. That way even tiny tables never end up in shared_buffers. But that's
really a discussion for a different thread. However, if were to use a
per-table bstrategy, it'd be a lot easier to react to changes of the config.


I doubt it's worth adding complications to the code for changing the size of
the ringbuffer during an ongoing vacuum scan, at least for 16. Reacting to
enabling/disbling the ringbuffer alltogether seems a bit more important, but
still not crucial compared to making it configurable at all.

I think it'd be OK to add a comment saying something like "XXX: In the future
we might want to react to configuration changes of the ring buffer size during
a vacuum" or such.

WRT to the TODO specifically: Yes, passing in 0 seems to make sense. I don't
see a reason not to do so? But perhaps there's a better solution:

Perhaps the best solution for autovac vs interactive vacuum issue would be to
move the allocation of the bstrategy to ExecVacuum()?


Random note while looking at the code:
ISTM that adding handling of -1 in GetAccessStrategyWithSize() would make the
code more readable. Instead of

        if (params->ring_size == -1)
        {
            if (VacuumBufferUsageLimit == -1)
                bstrategy = GetAccessStrategy(BAS_VACUUM);
            else
                bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, VacuumBufferUsageLimit);
        }
        else
            bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, params->ring_size);

you could just have something like:
    bstrategy = GetAccessStrategyWithSize(BAS_VACUUM,
        params->ring_size != -1 ? params->ring_size : VacuumBufferUsageLimit);

by falling back to the default values from GetAccessStrategy().

Or even more "extremely", you could entirely remove references to
VacuumBufferUsageLimit and handle that in freelist.c

Greetings,

Andres Freund



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
v11 attached with updates detailed below.

On Tue, Apr 4, 2023 at 11:14 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Wed, 5 Apr 2023 at 05:53, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > Attached v10 addresses the review feedback below.
>
> Thanks.  Here's another round on v10-0001:
>
> 1. The following documentation fragment does not seem to be aligned
> with the code:
>
>         <literal>16 GB</literal>.  The minimum size is the lesser
>         of 1/8 the size of shared buffers and <literal>128 KB</literal>.  The
>         default value is <literal>-1</literal>. If this value is specified
>
> The relevant code is:
>
> static inline int
> StrategyGetClampedBufsize(int bufsize_kb)
> {
>     int sb_limit_kb;
>     int blcksz_kb = BLCKSZ / 1024;
>
>     Assert(blcksz_kb > 0);
>
>     sb_limit_kb = NBuffers / 8 * blcksz_kb;
>
>     return Min(sb_limit_kb, bufsize_kb);
> }
>
> The code seems to mean that the *maximum* is the lesser of 16GB and
> shared_buffers / 8. You're saying it's the minimum.

Good catch. Fixed.

> 2. I think you could get rid of the double "Buffer Access Stategy" in:
>
>         <glossterm linkend="glossary-buffer-access-strategy">Buffer
> Access Strategy</glossterm>.
>         <literal>0</literal> will disable use of a <literal>Buffer
> Access Strategy</literal>.
>         <literal>-1</literal> will set the size to a default of
>         <literal>256 KB</literal>.  The maximum size is
>
> how about:
>
>         <glossterm linkend="glossary-buffer-access-strategy">Buffer
> Access Strategy</glossterm>.
>         A setting of <literal>0</literal> will allow the operation to use any
>         number of <varname>shared_buffers</varname>, whereas
>         <literal>-1</literal> will set the size to a default of
>         <literal>256 KB</literal>.  The maximum size is

I've made these changes.

> 3. In the following snippet you can use <xref linkend="sql-vacuum"/>
> or just <command>VACUUM</command>. There are examples of both in that
> file. I don't have a preference as it which, but I think what you've
> got isn't great.
>
>         <link linkend="sql-vacuum"><command>VACUUM</command></link> and
>         <link linkend="sql-analyze"><command>ANALYZE</command></link>

I've updated it to use the link. I thought it would be nice to have the
link in case the reader wants to look at the BUFFER_USAGE_LIMIT option
docs there.

> 4. I wonder if there's a reason this needs to be written in the
> overview of ANALYZE.
>
>    <command>ANALYZE</command> uses a
>    <glossterm linkend="glossary-buffer-access-strategy">Buffer Access
> Strategy</glossterm>
>    when reading in the sample data. The number of buffers consumed for this can
>    be controlled by <xref linkend="guc-vacuum-buffer-usage-limit"/> or by using
>    the <option>BUFFER_USAGE_LIMIT</option> option.
>
> I think it's fine just to mention it under BUFFER_USAGE_LIMIT. It just
> does not seem fundamental enough to be worth being upfront about it.
> The other things mentioned in that section don't seem related to
> parameters, so there might be no better place for those to go. That's
> not the case for what you're adding here.

I updated this.

> 5. I think I'd rather see the details spelt out here rather than
> telling the readers to look at what VACUUM does:
>
>       Specifies the
>       <glossterm linkend="glossary-buffer-access-strategy">Buffer
> Access Strategy</glossterm>
>       ring buffer size for <command>ANALYZE</command>.  See the
>       <link linkend="sql-vacuum"><command>VACUUM</command></link> option with
>       the same name.

I've updated it to contain the same text, as relevant, as the VACUUM
option contains. Note that both rely on the vacuum_buffer_usage_limit
GUC documentation for a description of upper and lower bounds.

> 6. When I asked about restricting the valid values of
> vacuum_buffer_usage_limit to -1 / 0 or 128 KB to 16GB, I didn't expect
> you to code in the NBuffers / 8 check.  We shouldn't chain
> dependencies between GUCs like that.  Imagine someone editing their
> postgresql.conf after realising shared_buffers is too large for their
> RAM, they reduce it and restart.  The database fails to start because
> vacuum_buffer_usage_limit is too large!  Angry DBA?
>
> Take what's already written about vacuum_failsafe_age as your guidance on this:
>
> "The default is 1.6 billion transactions. Although users can set this
> value anywhere from zero to 2.1 billion, VACUUM will silently adjust
> the effective value to no less than 105% of
> autovacuum_freeze_max_age."
>
> Here we just document the silent capping.   You can still claim the
> valid range is 128KB to 16GB in the docs. You can mention the 1/8th of
> shared buffers cap same as what's mentioned about "105%" above.
>
> When I mentioned #4 and #10 in my review of the v9-0001 patch, I just
> wanted to not surprise users who do vacuum_buffer_usage_limit = 64 and
> magically get 128.


> 7. In ExecVacuum(), similar to #6 from above, it's also not great that
> you're raising an ERROR based on if StrategyGetClampedBufsize() clamps
> or not. If someone has a script that does:
>
> VACUUM (BUFFER_USAGE_LIMIT '1 GB'); it might be annoying that it stops
> working when someone adjusts shared buffers from 10GB to 6GB.
>
> I really think the NBuffers / 8 clamping just should be done inside
> GetAccessStrategyWithSize().

Got it. I've done what you suggested.
I had some logic issues as well that I fixed and reorderd the code.

> 8. I think this ERROR in vacuum.c should mention that 0 is a valid value.
>
> ereport(ERROR,
> (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> errmsg("buffer_usage_limit for a vacuum must be between %d KB and %d KB",
> MIN_BAS_RING_SIZE_KB, MAX_BAS_RING_SIZE_KB)));
>
> I doubt there's a need to mention -1 as that's the same as not
> specifying BUFFER_USAGE_LIMIT.

I've done this. I didn't say that 0 meant disabling the strategy. Do you
think that would be useful?

> 9. The following might be worthy of a comment explaining the order of
> precedence of how we choose the size:
>
> if (params->ring_size == -1)
> {
> if (VacuumBufferUsageLimit == -1)
> bstrategy = GetAccessStrategy(BAS_VACUUM);
> else
> bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, VacuumBufferUsageLimit);
> }
> else
> bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, params->ring_size);

I've updated this. Also, after doing so, I realized the if/else logic
here and in ExecVacuum() could be better and updated the ordering to
more closely mirror the human readable logic.

> 10. I wonder if you need to keep bufsize_limit_to_nbuffers(). It's
> just used once and seems trivial enough just to write that code inside
> GetAccessStrategyWithSize().

I've gotten rid of it.

> 11. It's probably worth putting the valid range in the sample config:
>
> #vacuum_buffer_usage_limit = -1 # size of vacuum and analyze buffer
> access strategy ring.
> # -1 to use default,
> # 0 to disable vacuum buffer access strategy
> # > 0 to specify size <-- here

Done.

> 12. Is bufmgr.h the right place for these?
>
> /*
>  * Upper and lower hard limits for the Buffer Access Strategy ring size
>  * specified by the vacuum_buffer_usage_limit GUC and BUFFER_USAGE_LIMIT option
>  * to VACUUM and ANALYZE.
>  */
> #define MAX_BAS_RING_SIZE_KB (16 * 1024 * 1024)
> #define MIN_BAS_RING_SIZE_KB 128
>
> Your comment implies they're VACUUM / ANALYZE limits.  If we want to
> impose these limits to all access strategies then these seem like good
> names and location, otherwise, I imagine miscadmin.h is the correct
> place. If so, they'll likely want to be renamed to something more
> VACUUM specific.  I don't particularly have a preference. 128 -
> 1677216 seem like reasonable limits for any buffer access strategy.

I don't assert on these limits in GetAccessStrategyWithSize(), and,
since the rest of the code is mainly only concerned with vacuum, I think
it is better to make these limits vacuum-specific. If we decide to make
other access strategies configurable, we can generalize these macros
then. As such, I have moved them into miscadmin.h.

> 13. I think check_vacuum_buffer_usage_limit() does not belong in
> freelist.c. Maybe vacuum.c?

I've moved it to vacuum.c. I put it above ExecVacuum() since that would
be correct alphabetically, but I'm not sure if it would be better to
move it down since ExecVacuum() is the "main entry point".

> 14. Not related to this patch, but why do we have half the vacuum
> related GUCs in vacuum.c and the other half in globals.c? I see
> vacuum_freeze_table_age is defined in vacuum.c but is also needed in
> autovacuum.c, so that rules out the globals.c ones being for vacuum.c
> and autovacuum.c.  It seems a bit messy.   I'm not really sure where
> VacuumBufferUsageLimit should go now.

I've left it where it is and added a (helpful?) comment.

> > Remaining TODOs:
> > - tests
> > - do something about config reload changing GUC
>
> Shouldn't table_recheck_autovac() pfree/palloc a new strategy if the
> size changes?

See thoughts about that below in response to Andres' mail.

> I'm not sure what the implications are with that and the other patch
> you're working on to allow vacuum config changes mid-vacuum. We'll
> need to be careful and not immediately break that if that gets
> committed then this does or vice-versa.

We can think hard about this. If we go with adding a TODO for the size,
and keeping the same ring, it won't be a problem.

On Wed, Apr 5, 2023 at 1:05 PM Andres Freund <andres@anarazel.de> wrote:
> On 2023-04-04 13:53:15 -0400, Melanie Plageman wrote:
> > > 8. I don't quite follow this comment:
> > >
> > > /*
> > > * TODO: should this be 0 so that we are sure that vacuum() never
> > > * allocates a new bstrategy for us, even if we pass in NULL for that
> > > * parameter? maybe could change how failsafe NULLs out bstrategy if
> > > * so?
> > > */
> > >
> > > Can you explain under what circumstances would vacuum() allocate a
> > > bstrategy when do_autovacuum() would not? Is this a case of a config
> > > reload where someone changes vacuum_buffer_usage_limit from 0 to
> > > something non-zero? If so, perhaps do_autovacuum() needs to detect
> > > this and allocate a strategy rather than having vacuum() do it once
> > > per table (wastefully).
>
> Hm. I don't much like that we use a single strategy for multiple tables
> today. That way even tiny tables never end up in shared_buffers. But that's
> really a discussion for a different thread. However, if were to use a
> per-table bstrategy, it'd be a lot easier to react to changes of the config.

Agreed. I was wondering if it is okay to do the palloc()/pfree() for
every table given that some may be small.

> I doubt it's worth adding complications to the code for changing the size of
> the ringbuffer during an ongoing vacuum scan, at least for 16. Reacting to
> enabling/disbling the ringbuffer alltogether seems a bit more important, but
> still not crucial compared to making it configurable at all.
>
> I think it'd be OK to add a comment saying something like "XXX: In the future
> we might want to react to configuration changes of the ring buffer size during
> a vacuum" or such.

I've added the XXX to the autovacuum code. I think you mean it also
could be considered for VACUUM, but I've refrained from mentioning that
for now.

> WRT to the TODO specifically: Yes, passing in 0 seems to make sense. I don't
> see a reason not to do so? But perhaps there's a better solution:

I've done that (passed in a 0), I was concerned that future code may
reference this vacuum param and expect it to be aligned with the Buffer
Access Strategy in use. Really only vacuum() should be referencing the
params, so, perhaps it is not an issue...

Okay, now I've convinced myself that it is better to allocate the
strategy in ExecVacuum(). Then we can get rid of the
VacuumParams->ring_size altogether.

I haven't done that in this version because of the below concern (re: it
being appropriate to allocate the strategy in ExecVacuum() given its
current concern/focus).

> Perhaps the best solution for autovac vs interactive vacuum issue would be to
> move the allocation of the bstrategy to ExecVacuum()?

Thought about this -- I did think it might be a bit weird since
ExecVacuum() mainly does option handling and sanity checking. Doing
Buffer Access Strategy allocation seemed a bit out of place. I've left
it as is, but would be happy to change it if the consensus is that this
is better.

> Random note while looking at the code:
> ISTM that adding handling of -1 in GetAccessStrategyWithSize() would make the
> code more readable. Instead of
>
>                 if (params->ring_size == -1)
>                 {
>                         if (VacuumBufferUsageLimit == -1)
>                                 bstrategy = GetAccessStrategy(BAS_VACUUM);
>                         else
>                                 bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, VacuumBufferUsageLimit);
>                 }
>                 else
>                         bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, params->ring_size);
>
> you could just have something like:
>     bstrategy = GetAccessStrategyWithSize(BAS_VACUUM,
>         params->ring_size != -1 ? params->ring_size : VacuumBufferUsageLimit);
>
> by falling back to the default values from GetAccessStrategy().
>
> Or even more "extremely", you could entirely remove references to
> VacuumBufferUsageLimit and handle that in freelist.c

Hmm. I see what you mean.

I've updated it to this, which is a bit better.

        if (params->ring_size > -1)
            bstrategy = GetAccessStrategyWithSize(BAS_VACUUM,
params->ring_size);
        else if (VacuumBufferUsageLimit > -1)
            bstrategy = GetAccessStrategyWithSize(BAS_VACUUM,
VacuumBufferUsageLimit);
        else
            bstrategy = GetAccessStrategy(BAS_VACUUM);

Not referencing VacuumBufferUsageLimit except in freelist.c is more
challenging because I think it would be weird to have
GetAccessStrategyWithSize() call GetAccessStrategy() which then calls
GetAccessStrategyWithSize().

- Melanie

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Wed, Apr 5, 2023 at 1:05 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-04-04 13:53:15 -0400, Melanie Plageman wrote:
> > > 8. I don't quite follow this comment:
> > >
> > > /*
> > > * TODO: should this be 0 so that we are sure that vacuum() never
> > > * allocates a new bstrategy for us, even if we pass in NULL for that
> > > * parameter? maybe could change how failsafe NULLs out bstrategy if
> > > * so?
> > > */
> > >
> > > Can you explain under what circumstances would vacuum() allocate a
> > > bstrategy when do_autovacuum() would not? Is this a case of a config
> > > reload where someone changes vacuum_buffer_usage_limit from 0 to
> > > something non-zero? If so, perhaps do_autovacuum() needs to detect
> > > this and allocate a strategy rather than having vacuum() do it once
> > > per table (wastefully).
>
> Hm. I don't much like that we use a single strategy for multiple tables
> today. That way even tiny tables never end up in shared_buffers. But that's
> really a discussion for a different thread. However, if were to use a
> per-table bstrategy, it'd be a lot easier to react to changes of the config.
>
>
> I doubt it's worth adding complications to the code for changing the size of
> the ringbuffer during an ongoing vacuum scan, at least for 16. Reacting to
> enabling/disbling the ringbuffer alltogether seems a bit more important, but
> still not crucial compared to making it configurable at all.
>
> I think it'd be OK to add a comment saying something like "XXX: In the future
> we might want to react to configuration changes of the ring buffer size during
> a vacuum" or such.
>
> WRT to the TODO specifically: Yes, passing in 0 seems to make sense. I don't
> see a reason not to do so? But perhaps there's a better solution:
>
> Perhaps the best solution for autovac vs interactive vacuum issue would be to
> move the allocation of the bstrategy to ExecVacuum()?

So, I started looking into allocating the bstrategy in ExecVacuum().

While doing so, I was trying to understand if the "sanity checking" in
vacuum() could possibly apply to autovacuum, and I don't really see how.

AFAICT, autovacuum does not ever set VACOPT_DISABLE_PAGE_SKIPPING or
VACOPT_FULL or VACOPT_ONLY_DATABASE_STATS.

We could move those sanity checks up into ExecVacuum().

I also noticed that we make the vac_context in vacuum() which says it is
for "cross-transaction storage". We use it for the buffer access
strategy and for the newrels relation list created in vacuum(). Then we
delete it at the end of vacuum().

Autovacuum workers already make a similar kind of memory context called
AutovacMemCxt in do_autovacuum() which the comment says is for the list
of relations to vacuum/analyze across transactions.

What if we made ExecVacuum() make its own memory context and both it and
do_autovacuum() pass that memory context (along with the buffer access
strategy they make) to vacuum(), which then uses the memory context in
the same way it does now?

It simplifies the buffer usage limit patchset and also seems a bit more
clear than what is there now?

- Melanie



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Andres Freund
Date:
Hi,

On 2023-04-05 16:17:20 -0400, Melanie Plageman wrote:
> On Wed, Apr 5, 2023 at 1:05 PM Andres Freund <andres@anarazel.de> wrote:
> > Perhaps the best solution for autovac vs interactive vacuum issue would be to
> > move the allocation of the bstrategy to ExecVacuum()?
> 
> So, I started looking into allocating the bstrategy in ExecVacuum().
> 
> While doing so, I was trying to understand if the "sanity checking" in
> vacuum() could possibly apply to autovacuum, and I don't really see how.
> 
> AFAICT, autovacuum does not ever set VACOPT_DISABLE_PAGE_SKIPPING or
> VACOPT_FULL or VACOPT_ONLY_DATABASE_STATS.
> 
> We could move those sanity checks up into ExecVacuum().

Would make sense.

ISTM that eventually most of what currently happens in vacuum() should be in
ExecVacuum(). There's a lot of stuff that can't happen for autovacuum. So it
just seems to make more sense to move those parts to ExecVacuum().


> I also noticed that we make the vac_context in vacuum() which says it is
> for "cross-transaction storage". We use it for the buffer access
> strategy and for the newrels relation list created in vacuum(). Then we
> delete it at the end of vacuum().

> Autovacuum workers already make a similar kind of memory context called
> AutovacMemCxt in do_autovacuum() which the comment says is for the list
> of relations to vacuum/analyze across transactions.

AutovacMemCxt seems to be a bit longer lived / cover more than the context
created in vacuum(). It's where all the hash tables etc live that
do_autovacuum() uses to determine what to vacuum.

Note that do_autovacuum() also creates:

    /*
     * create a memory context to act as fake PortalContext, so that the
     * contexts created in the vacuum code are cleaned up for each table.
     */
    PortalContext = AllocSetContextCreate(AutovacMemCxt,
                                          "Autovacuum Portal",
                                          ALLOCSET_DEFAULT_SIZES);

which is then what vacuum() creates the "Vacuum" context in.


> What if we made ExecVacuum() make its own memory context and both it and
> do_autovacuum() pass that memory context (along with the buffer access
> strategy they make) to vacuum(), which then uses the memory context in
> the same way it does now?

Maybe? It's not clear to me why it'd be a win.


> It simplifies the buffer usage limit patchset and also seems a bit more
> clear than what is there now?

I don't really see what it'd make simpler? The context in vacuum() is used for
just that vacuum - we couldn't just use AutovacMemCxt, as that'd live much
longer (for all the tables a autovac worker processes).

Greetings,

Andres Freund



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
On Wed, 5 Apr 2023 at 16:37, Peter Geoghegan <pg@bowt.ie> wrote:
>
> On Tue, Apr 4, 2023 at 8:14 PM David Rowley <dgrowleyml@gmail.com> wrote:
> > 14. Not related to this patch, but why do we have half the vacuum
> > related GUCs in vacuum.c and the other half in globals.c? I see
> > vacuum_freeze_table_age is defined in vacuum.c but is also needed in
> > autovacuum.c, so that rules out the globals.c ones being for vacuum.c
> > and autovacuum.c.  It seems a bit messy.   I'm not really sure where
> > VacuumBufferUsageLimit should go now.
>
> vacuum_freeze_table_age is an abomination, even compared to the rest
> of these GUCs. It was added around the time the visibility map first
> went in, and so is quite a bit more recent than
> autovacuum_freeze_max_age.
>
> Before the introduction of the visibility map, we only had
> autovacuum_freeze_max_age, and it was used to schedule antiwraparound
> autovacuums -- there was no such thing as aggressive VACUUMs (just
> antiwraparound autovacuums), and no need for autovacuum_freeze_max_age
> at all. So autovacuum_freeze_max_age was just for autovacuum.c code.
> There was only one type of VACUUM, and they always advanced
> relfrozenxid to the same degree.
>
> With the introduction of the visibility map, we needed to have
> autovacuum_freeze_max_age in vacuum.c for the first time, to deal with
> interpreting the then-new vacuum_freeze_table_age GUC correctly. We
> silently truncate the vacuum_freeze_table_age setting so that it never
> exceeds 95% of autovacuum_freeze_max_age (the
> 105%-of-autovacuum_freeze_max_age vacuum_failsafe_age thing that
> you're discussing is symmetric). So after 2009
> autovacuum_freeze_max_age actually plays an important role in VACUUM,
> the command, and not just autovacuum.
>
> This is related to the problem of the autovacuum_freeze_max_age
> reloption being completely broken [1]. If autovacuum_freeze_max_age
> was still purely just an autovacuum.c scheduling thing, then there
> would be no problem with having a separate reloption of the same name.
> There are big problems precisely because vacuum.c doesn't do anything
> with the autovacuum_freeze_max_age reloption. Though it does okay with
> the autovacuum_freeze_table_age reloption, which gets passed in. (Yes,
> it's called autovacuum_freeze_table_age in reloption form -- not
> vacuum_freeze_table_age, like the GUC).
>
> Note that the decision to ignore the reloption version of
> autovacuum_freeze_max_age in the failsafe's
> 105%-of-autovacuum_freeze_max_age thing was a deliberate one. The
> autovacuum_freeze_max_age GUC is authoritative in the sense that it
> cannot be overridden locally, except in the direction of making
> aggressive VACUUMs happen more frequently for a given table (so they
> can't be less frequent via reloption configuration). I suppose you'd
> have to untangle that mess if you wanted to fix the
> autovacuum_freeze_max_age reloption issue I go into in [1].
>
> [1] https://postgr.es/m/CAH2-Wz=DJAokY_GhKJchgpa8k9t_H_OVOvfPEn97jGNr9W=deg@mail.gmail.com

I read this twice yesterday and again this morning. It looks like
you're taking an opportunity to complain/vent about
vacuum_freeze_table_age and didn't really answer my query about why
all the vacuum GUCs aren't defined in the one file.  I'd just picked
vacuum_freeze_table_age as a random one from vacuum.c to raise the
point about the inconsistency about the GUC locations.

 I don't think this is the place to raise concerns with existing GUCs,
but if you did have a point about the GUCs locations, then you might
need to phase it differently as I didn't catch it.

David



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Peter Geoghegan
Date:
On Wed, Apr 5, 2023 at 2:33 PM David Rowley <dgrowleyml@gmail.com> wrote:
> I read this twice yesterday and again this morning. It looks like
> you're taking an opportunity to complain/vent about
> vacuum_freeze_table_age and didn't really answer my query about why
> all the vacuum GUCs aren't defined in the one file.  I'd just picked
> vacuum_freeze_table_age as a random one from vacuum.c to raise the
> point about the inconsistency about the GUC locations.

I thought that the point was obvious. Which is: the current situation
with the locations of these GUCs came about because the division
between autovacuum and VACUUM used to be a lot clearer, but that
changed. Without the locations of the GUCs also changing. More
generally, the current structure has lots of problems. And so it seems
to me that you're probably not wrong to suspect that it just doesn't
make much sense to keep them in different files now.

--
Peter Geoghegan



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Wed, Apr 5, 2023 at 5:15 PM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2023-04-05 16:17:20 -0400, Melanie Plageman wrote:
> > On Wed, Apr 5, 2023 at 1:05 PM Andres Freund <andres@anarazel.de> wrote:
> > > Perhaps the best solution for autovac vs interactive vacuum issue would be to
> > > move the allocation of the bstrategy to ExecVacuum()?
> >
> > So, I started looking into allocating the bstrategy in ExecVacuum().
> >
> > While doing so, I was trying to understand if the "sanity checking" in
> > vacuum() could possibly apply to autovacuum, and I don't really see how.
> >
> > AFAICT, autovacuum does not ever set VACOPT_DISABLE_PAGE_SKIPPING or
> > VACOPT_FULL or VACOPT_ONLY_DATABASE_STATS.
> >
> > We could move those sanity checks up into ExecVacuum().
>
> Would make sense.
>
> ISTM that eventually most of what currently happens in vacuum() should be in
> ExecVacuum(). There's a lot of stuff that can't happen for autovacuum. So it
> just seems to make more sense to move those parts to ExecVacuum().

I've done that in the attached wip patch. It is perhaps too much of a
change, I dunno.

> > I also noticed that we make the vac_context in vacuum() which says it is
> > for "cross-transaction storage". We use it for the buffer access
> > strategy and for the newrels relation list created in vacuum(). Then we
> > delete it at the end of vacuum().
>
> > Autovacuum workers already make a similar kind of memory context called
> > AutovacMemCxt in do_autovacuum() which the comment says is for the list
> > of relations to vacuum/analyze across transactions.
>
> AutovacMemCxt seems to be a bit longer lived / cover more than the context
> created in vacuum(). It's where all the hash tables etc live that
> do_autovacuum() uses to determine what to vacuum.
>
> Note that do_autovacuum() also creates:
>
>         /*
>          * create a memory context to act as fake PortalContext, so that the
>          * contexts created in the vacuum code are cleaned up for each table.
>          */
>         PortalContext = AllocSetContextCreate(AutovacMemCxt,
>                                                                                   "Autovacuum Portal",
>                                                                                   ALLOCSET_DEFAULT_SIZES);
>
> which is then what vacuum() creates the "Vacuum" context in.

Yea, I realized that when writing the patch.

> > What if we made ExecVacuum() make its own memory context and both it and
> > do_autovacuum() pass that memory context (along with the buffer access
> > strategy they make) to vacuum(), which then uses the memory context in
> > the same way it does now?
>
> Maybe? It's not clear to me why it'd be a win.

Less that it is a win and more that we need access to that memory
context when allocating the buffer access strategy, so we would have had
to make it in ExecVacuum(). And if we have already made it, we would
need to pass it in to vacuum() for it to use.

> > It simplifies the buffer usage limit patchset and also seems a bit more
> > clear than what is there now?
>
> I don't really see what it'd make simpler? The context in vacuum() is used for
> just that vacuum - we couldn't just use AutovacMemCxt, as that'd live much
> longer (for all the tables a autovac worker processes).

Autovacuum already made the BufferAccessStrategy in the AutovacMemCxt,
so this is the same behavior. I simply made autovacuum_do_vac_analyze()
make the per table vacuum memory context and pass that to vacuum(). So
we have the same amount of memory context granularity as before.

Attached patchset has some kind of isolation test failure due to a hard
deadlock that I haven't figured out yet. I thought it was something with
the "in_vacuum" static variable and having VACUUM or ANALYZE called when
already in VACUUM or ANALYZE, but that variable is the same as in
master.

I've mostly shared it because I want to know if this approach is worth
pursuing or not.

Also, while working on it, I noticed that I made a mistake in the code
that was committed in 4830f102 and didn't remember that we should still
make a Buffer Access Strategy in the case of VACUUM (FULL, ANALYZE).

Changing this:

if (params->options & (VACOPT_ONLY_DATABASE_STATS | VACOPT_FULL)) == 0)

to this:

if ((params.options & VACOPT_ONLY_DATABASE_STATS) == 0 ||
        (params.options & VACOPT_FULL && (params.options & VACOPT_ANALYZE) == 0)

should fix it.

- Melanie

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Wed, Apr 5, 2023 at 6:55 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Wed, Apr 5, 2023 at 5:15 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > Hi,
> >
> > On 2023-04-05 16:17:20 -0400, Melanie Plageman wrote:
> > > On Wed, Apr 5, 2023 at 1:05 PM Andres Freund <andres@anarazel.de> wrote:
> > > > Perhaps the best solution for autovac vs interactive vacuum issue would be to
> > > > move the allocation of the bstrategy to ExecVacuum()?
> > >
> > > So, I started looking into allocating the bstrategy in ExecVacuum().
> > >
> > > While doing so, I was trying to understand if the "sanity checking" in
> > > vacuum() could possibly apply to autovacuum, and I don't really see how.
> > >
> > > AFAICT, autovacuum does not ever set VACOPT_DISABLE_PAGE_SKIPPING or
> > > VACOPT_FULL or VACOPT_ONLY_DATABASE_STATS.
> > >
> > > We could move those sanity checks up into ExecVacuum().
> >
> > Would make sense.
> >
> > ISTM that eventually most of what currently happens in vacuum() should be in
> > ExecVacuum(). There's a lot of stuff that can't happen for autovacuum. So it
> > just seems to make more sense to move those parts to ExecVacuum().
>
> I've done that in the attached wip patch. It is perhaps too much of a
> change, I dunno.
>
> > > I also noticed that we make the vac_context in vacuum() which says it is
> > > for "cross-transaction storage". We use it for the buffer access
> > > strategy and for the newrels relation list created in vacuum(). Then we
> > > delete it at the end of vacuum().
> >
> > > Autovacuum workers already make a similar kind of memory context called
> > > AutovacMemCxt in do_autovacuum() which the comment says is for the list
> > > of relations to vacuum/analyze across transactions.
> >
> > AutovacMemCxt seems to be a bit longer lived / cover more than the context
> > created in vacuum(). It's where all the hash tables etc live that
> > do_autovacuum() uses to determine what to vacuum.
> >
> > Note that do_autovacuum() also creates:
> >
> >         /*
> >          * create a memory context to act as fake PortalContext, so that the
> >          * contexts created in the vacuum code are cleaned up for each table.
> >          */
> >         PortalContext = AllocSetContextCreate(AutovacMemCxt,
> >                                                                                   "Autovacuum Portal",
> >                                                                                   ALLOCSET_DEFAULT_SIZES);
> >
> > which is then what vacuum() creates the "Vacuum" context in.
>
> Yea, I realized that when writing the patch.
>
> > > What if we made ExecVacuum() make its own memory context and both it and
> > > do_autovacuum() pass that memory context (along with the buffer access
> > > strategy they make) to vacuum(), which then uses the memory context in
> > > the same way it does now?
> >
> > Maybe? It's not clear to me why it'd be a win.
>
> Less that it is a win and more that we need access to that memory
> context when allocating the buffer access strategy, so we would have had
> to make it in ExecVacuum(). And if we have already made it, we would
> need to pass it in to vacuum() for it to use.
>
> > > It simplifies the buffer usage limit patchset and also seems a bit more
> > > clear than what is there now?
> >
> > I don't really see what it'd make simpler? The context in vacuum() is used for
> > just that vacuum - we couldn't just use AutovacMemCxt, as that'd live much
> > longer (for all the tables a autovac worker processes).
>
> Autovacuum already made the BufferAccessStrategy in the AutovacMemCxt,
> so this is the same behavior. I simply made autovacuum_do_vac_analyze()
> make the per table vacuum memory context and pass that to vacuum(). So
> we have the same amount of memory context granularity as before.
>
> Attached patchset has some kind of isolation test failure due to a hard
> deadlock that I haven't figured out yet. I thought it was something with
> the "in_vacuum" static variable and having VACUUM or ANALYZE called when
> already in VACUUM or ANALYZE, but that variable is the same as in
> master.
>
> I've mostly shared it because I want to know if this approach is worth
> pursuing or not.

Figured out how to fix the issue, though I'm not sure I understand how
the issue can occur.
use_own_xacts seems like it will always be true for autovacuum when it
calls vacuum() and ExecVacuum() only calls vacuum() once, so I thought
that I could make use_own_xacts a parameter to vacuum() and push up its
calculation for VACUUM and ANALYZE into ExecVacuum().
This caused a deadlock, so there must be a way that in_vacuum is false
but vacuum() is called in a nested context.
Anyway, recalculating it every time in vacuum() fixes it.

Attached is a v12 of the whole vacuum_buffer_usage_limit patch set which
includes a commit to fix the bug in master and a commit to move relevant
code from vacuum() up into ExecVacuum().

The logic I suggested earlier for fixing the bug was...not right.
Attached fix should be right?

- Melanie

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
On Thu, 6 Apr 2023 at 12:42, Melanie Plageman <melanieplageman@gmail.com> wrote:
> Attached is a v12 of the whole vacuum_buffer_usage_limit patch set which
> includes a commit to fix the bug in master and a commit to move relevant
> code from vacuum() up into ExecVacuum().

I'm still playing catch up to the moving of the pre-checks from
vacuum() to ExecVacuum().  I'm now wondering...

Is it intended that VACUUM t1,t2; now share the same strategy?
Currently, in master, we'll allocate a new strategy for t2 after
vacuuming t1. Does this not mean we'll now leave fewer t1 pages in
shared_buffers because the reuse of the strategy will force them out
with t2 pages?  I understand there's nothing particularly invalid
about that, but it is a change in behaviour that the patch seems to be
making with very little consideration as to if it's better or worse.

David



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Wed, Apr 5, 2023 at 9:15 PM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Thu, 6 Apr 2023 at 12:42, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > Attached is a v12 of the whole vacuum_buffer_usage_limit patch set which
> > includes a commit to fix the bug in master and a commit to move relevant
> > code from vacuum() up into ExecVacuum().
>
> I'm still playing catch up to the moving of the pre-checks from
> vacuum() to ExecVacuum().  I'm now wondering...
>
> Is it intended that VACUUM t1,t2; now share the same strategy?
> Currently, in master, we'll allocate a new strategy for t2 after
> vacuuming t1. Does this not mean we'll now leave fewer t1 pages in
> shared_buffers because the reuse of the strategy will force them out
> with t2 pages?  I understand there's nothing particularly invalid
> about that, but it is a change in behaviour that the patch seems to be
> making with very little consideration as to if it's better or worse.

I'm pretty sure that in master we also reuse the strategy since we make
it above this loop in vacuum() (and pass it in)

        /*
        * Loop to process each selected relation.
        */
        foreach(cur, relations)
        {
            VacuumRelation *vrel = lfirst_node(VacuumRelation, cur);
            if (params->options & VACOPT_VACUUM)
            {
                if (!vacuum_rel(vrel->oid, vrel->relation, params,
false, bstrategy))
                    continue;
            }

On Wed, Apr 5, 2023 at 8:41 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> On Wed, Apr 5, 2023 at 6:55 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > On Wed, Apr 5, 2023 at 5:15 PM Andres Freund <andres@anarazel.de> wrote:
> > >
> > > Hi,
> > >
> > > On 2023-04-05 16:17:20 -0400, Melanie Plageman wrote:
> > > > On Wed, Apr 5, 2023 at 1:05 PM Andres Freund <andres@anarazel.de> wrote:
> > > > > Perhaps the best solution for autovac vs interactive vacuum issue would be to
> > > > > move the allocation of the bstrategy to ExecVacuum()?
> > > >
> > > > So, I started looking into allocating the bstrategy in ExecVacuum().
> > > >
> > > > While doing so, I was trying to understand if the "sanity checking" in
> > > > vacuum() could possibly apply to autovacuum, and I don't really see how.
> > > >
> > > > AFAICT, autovacuum does not ever set VACOPT_DISABLE_PAGE_SKIPPING or
> > > > VACOPT_FULL or VACOPT_ONLY_DATABASE_STATS.
> > > >
> > > > We could move those sanity checks up into ExecVacuum().
> > >
> > > Would make sense.
> > >
> > > ISTM that eventually most of what currently happens in vacuum() should be in
> > > ExecVacuum(). There's a lot of stuff that can't happen for autovacuum. So it
> > > just seems to make more sense to move those parts to ExecVacuum().
> >
> > I've done that in the attached wip patch. It is perhaps too much of a
> > change, I dunno.
> >
> > > > I also noticed that we make the vac_context in vacuum() which says it is
> > > > for "cross-transaction storage". We use it for the buffer access
> > > > strategy and for the newrels relation list created in vacuum(). Then we
> > > > delete it at the end of vacuum().
> > >
> > > > Autovacuum workers already make a similar kind of memory context called
> > > > AutovacMemCxt in do_autovacuum() which the comment says is for the list
> > > > of relations to vacuum/analyze across transactions.
> > >
> > > AutovacMemCxt seems to be a bit longer lived / cover more than the context
> > > created in vacuum(). It's where all the hash tables etc live that
> > > do_autovacuum() uses to determine what to vacuum.
> > >
> > > Note that do_autovacuum() also creates:
> > >
> > >         /*
> > >          * create a memory context to act as fake PortalContext, so that the
> > >          * contexts created in the vacuum code are cleaned up for each table.
> > >          */
> > >         PortalContext = AllocSetContextCreate(AutovacMemCxt,
> > >                                                                                   "Autovacuum Portal",
> > >                                                                                   ALLOCSET_DEFAULT_SIZES);
> > >
> > > which is then what vacuum() creates the "Vacuum" context in.
> >
> > Yea, I realized that when writing the patch.
> >
> > > > What if we made ExecVacuum() make its own memory context and both it and
> > > > do_autovacuum() pass that memory context (along with the buffer access
> > > > strategy they make) to vacuum(), which then uses the memory context in
> > > > the same way it does now?
> > >
> > > Maybe? It's not clear to me why it'd be a win.
> >
> > Less that it is a win and more that we need access to that memory
> > context when allocating the buffer access strategy, so we would have had
> > to make it in ExecVacuum(). And if we have already made it, we would
> > need to pass it in to vacuum() for it to use.
> >
> > > > It simplifies the buffer usage limit patchset and also seems a bit more
> > > > clear than what is there now?
> > >
> > > I don't really see what it'd make simpler? The context in vacuum() is used for
> > > just that vacuum - we couldn't just use AutovacMemCxt, as that'd live much
> > > longer (for all the tables a autovac worker processes).
> >
> > Autovacuum already made the BufferAccessStrategy in the AutovacMemCxt,
> > so this is the same behavior. I simply made autovacuum_do_vac_analyze()
> > make the per table vacuum memory context and pass that to vacuum(). So
> > we have the same amount of memory context granularity as before.
> >
> > Attached patchset has some kind of isolation test failure due to a hard
> > deadlock that I haven't figured out yet. I thought it was something with
> > the "in_vacuum" static variable and having VACUUM or ANALYZE called when
> > already in VACUUM or ANALYZE, but that variable is the same as in
> > master.
> >
> > I've mostly shared it because I want to know if this approach is worth
> > pursuing or not.
>
> Figured out how to fix the issue, though I'm not sure I understand how
> the issue can occur.
> use_own_xacts seems like it will always be true for autovacuum when it
> calls vacuum() and ExecVacuum() only calls vacuum() once, so I thought
> that I could make use_own_xacts a parameter to vacuum() and push up its
> calculation for VACUUM and ANALYZE into ExecVacuum().
> This caused a deadlock, so there must be a way that in_vacuum is false
> but vacuum() is called in a nested context.
> Anyway, recalculating it every time in vacuum() fixes it.
>
> Attached is a v12 of the whole vacuum_buffer_usage_limit patch set which
> includes a commit to fix the bug in master and a commit to move relevant
> code from vacuum() up into ExecVacuum().
>
> The logic I suggested earlier for fixing the bug was...not right.
> Attached fix should be right?

David had already pushed a fix, so the patchset had merge conflicts.
Attached v13 should work.

- Melanie

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
On Thu, 6 Apr 2023 at 13:14, David Rowley <dgrowleyml@gmail.com> wrote:
> Is it intended that VACUUM t1,t2; now share the same strategy?
> Currently, in master, we'll allocate a new strategy for t2 after
> vacuuming t1. Does this not mean we'll now leave fewer t1 pages in
> shared_buffers because the reuse of the strategy will force them out
> with t2 pages?  I understand there's nothing particularly invalid
> about that, but it is a change in behaviour that the patch seems to be
> making with very little consideration as to if it's better or worse.

Actually, never mind that. I'm wrong. The same strategy is used for
both tables before and after this change.

I stumbled on thinking vacuum() was being called in a loop from
ExecVacuum() rather than it just passing all of the relations to
vacuum() and the loop being done inside vacuum(), which is does.

David



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
Attached is v14 which adds back in tests for the BUFFER_USAGE_LIMIT
option. I haven't included a test for VACUUM (BUFFER_USAGE_LIMIT x,
PARALLEL x) for the reason I mentioned upthread -- even if we force it
to actually do the parallel vacuuming, we are adding exercising the code
where parallel vacuum workers make their own buffer access strategy
rings but not really adding a test that will fail usefully. If something
is wrong with the configurability of the buffer access strategy object,
I don't see how it will break differently in parallel vacuum workers vs
regular vacuum.

- Melanie

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
 second On Thu, 6 Apr 2023 at 14:14, Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
> Attached is v14 which adds back in tests for the BUFFER_USAGE_LIMIT
> option.

I've spent quite a bit of time looking at this since you sent it. I've
also made quite a few changes, mostly cosmetic ones, but there are a
few things below which are more fundamental.

1. I don't really think we need to support VACUUM (BUFFER_USAGE_LIMIT
-1);  It's just the same as VACUUM;  Removing that makes the
documentation more simple.

2. I don't think we really need to allow vacuum_buffer_usage_limit =
-1.  I think we can just set this to 256 and leave it.  If we allow -1
then we need to document what -1 means. The more I think about it, the
more strange it seems to allow -1. I can't quite imagine work_mem = -1
means 4MB. Why 4MB?  Changing this means we don't really need to do
anything special in:

+ if (VacuumBufferUsageLimit > -1)
+     bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, VacuumBufferUsageLimit);
+ else
+     bstrategy = GetAccessStrategy(BAS_VACUUM);

That simply becomes:

bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, VacuumBufferUsageLimit);

The code inside GetAccessStrategyWithSize() handles returning NULL
when the GUC is zero.

The equivalent in ExecVacuum() becomes:

if (ring_size > -1)
    bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, ring_size);
else
    bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, VacuumBufferUsageLimit);

instead of:

if (ring_size > -1)
    bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, ring_size);
else if (VacuumBufferUsageLimit > -1)
    bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, VacuumBufferUsageLimit);
else
    bstrategy = GetAccessStrategy(BAS_VACUUM);

3. There was a bug in GetAccessStrategyBufferCount() (I renamed it to
that from StrategyGetBufferCount()) that didn't handle NULL input. The
problem was that if you set vacuum_buffer_usage_limit = 0 then did a
parallel vacuum that GetAccessStrategyWithSize() would return NULL due
to the 0 buffer input, but GetAccessStrategyBufferCount() couldn't
handle NULL.  I've adjusted GetAccessStrategyBufferCount() just to
return 0 on NULL input.

Most of the rest is cosmetic. GetAccessStrategyWithSize() ended up
looking quite different.  I didn't see the sense in converting the
shared_buffer size into kilobytes to compare when we could just
convert ring_size_kb to buffers slightly sooner and then just do:

/* Cap to 1/8th of shared_buffers */
ring_buffers = Min(NBuffers / 8, ring_buffers);

I renamed nbuffers to ring_buffers as it was a little too confusing to
have nbuffers (for ring size) and NBuffers (for shared_buffers).

A few other changes like getting rid of the regression test and code
check for VACUUM (ONLY_DATABASE_STATS, BUFFER_USAGE_LIMIT 0); There is
already an if check and ERROR that looks for ONLY_DATABASE_STATS with
any other option slightly later in the function.  I also got rid of
the documentation that mentioned that wasn't supported as there's
already a mention in the ONLY_DATABASE_STATS which says it's not
supported with anything else. No other option seemed to care enough to
mention it, so I don't think BUFFER_USAGE_LIMIT is an exception.

I've attached v15.  I've only glanced at the vacuumdb patch so far.
I'm not expecting it to be too controversial.

I'm fairly happy with v15 now but would welcome anyone who wants to
have a look in the next 8 hours or so, else I plan to push it.

David

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Thu, Apr 06, 2023 at 11:34:44PM +1200, David Rowley wrote:
>  second On Thu, 6 Apr 2023 at 14:14, Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > Attached is v14 which adds back in tests for the BUFFER_USAGE_LIMIT
> > option.
> 
> I've spent quite a bit of time looking at this since you sent it. I've
> also made quite a few changes, mostly cosmetic ones, but there are a
> few things below which are more fundamental.
> 
> 1. I don't really think we need to support VACUUM (BUFFER_USAGE_LIMIT
> -1);  It's just the same as VACUUM;  Removing that makes the
> documentation more simple.

Agreed.
 
> 2. I don't think we really need to allow vacuum_buffer_usage_limit =
> -1.  I think we can just set this to 256 and leave it.  If we allow -1
> then we need to document what -1 means. The more I think about it, the
> more strange it seems to allow -1. I can't quite imagine work_mem = -1
> means 4MB. Why 4MB?

Agreed.
 
> 3. There was a bug in GetAccessStrategyBufferCount() (I renamed it to
> that from StrategyGetBufferCount()) that didn't handle NULL input. The
> problem was that if you set vacuum_buffer_usage_limit = 0 then did a
> parallel vacuum that GetAccessStrategyWithSize() would return NULL due
> to the 0 buffer input, but GetAccessStrategyBufferCount() couldn't
> handle NULL.  I've adjusted GetAccessStrategyBufferCount() just to
> return 0 on NULL input.

Good catch.
 
> diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
> index bcc49aec45..c421da348d 100644
> --- a/doc/src/sgml/config.sgml
> +++ b/doc/src/sgml/config.sgml
> @@ -2001,6 +2001,35 @@ include_dir 'conf.d'
>        </listitem>
>       </varlistentry>
>  
> +     <varlistentry id="guc-vacuum-buffer-usage-limit" xreflabel="vacuum_buffer_usage_limit">
> +      <term>
> +       <varname>vacuum_buffer_usage_limit</varname> (<type>integer</type>)
> +       <indexterm>
> +        <primary><varname>vacuum_buffer_usage_limit</varname> configuration parameter</primary>
> +       </indexterm>
> +      </term>
> +      <listitem>
> +       <para>
> +        Specifies the size of <varname>shared_buffers</varname> to be reused
> +        for each backend participating in a given invocation of
> +        <command>VACUUM</command> or <command>ANALYZE</command> or in
> +        autovacuum.  This size is converted to the number of shared buffers
> +        which will be reused as part of a
> +        <glossterm linkend="glossary-buffer-access-strategy">Buffer Access Strategy</glossterm>.
> +        A setting of <literal>0</literal> will allow the operation to use any
> +        number of <varname>shared_buffers</varname>.  The maximum size is
> +        <literal>16 GB</literal> and the minimum size is
> +        <literal>128 KB</literal>.  If the specified size would exceed 1/8 the
> +        size of <varname>shared_buffers</varname>, it is silently capped to
> +        that value.  The default value is <literal>-1</literal>.  If this

This still says that the default value is -1.

> diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
> index b6d30b5764..0b02d9faef 100644
> --- a/doc/src/sgml/ref/vacuum.sgml
> +++ b/doc/src/sgml/ref/vacuum.sgml
> @@ -345,6 +346,24 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
>      </listitem>
>     </varlistentry>
>  
> +   <varlistentry>
> +    <term><literal>BUFFER_USAGE_LIMIT</literal></term>
> +    <listitem>
> +     <para>
> +      Specifies the
> +      <glossterm linkend="glossary-buffer-access-strategy">Buffer Access Strategy</glossterm>
> +      ring buffer size for <command>VACUUM</command>.  This size is used to
> +      calculate the number of shared buffers which will be reused as part of
> +      this strategy.  <literal>0</literal> disables use of a
> +      <literal>Buffer Access Strategy</literal>.  If <option>ANALYZE</option>
> +      is also specified, the <option>BUFFER_USAGE_LIMIT</option> value is used
> +      for both the vacuum and analyze stages.  This option can't be used with
> +      the <option>FULL</option> option except if <option>ANALYZE</option> is
> +      also specified.
> +     </para>
> +    </listitem>
> +   </varlistentry>

I noticed you seemed to have removed the reference to the GUC
vacuum_buffer_usage_limit here. Was that intentional?
We may not need to mention "falling back" as I did before, however, the
GUC docs mention max/min values and such, which might be useful.

> diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
> index ea1d8960f4..e92738c7f0 100644
> --- a/src/backend/commands/vacuum.c
> +++ b/src/backend/commands/vacuum.c
> @@ -56,6 +56,7 @@
>  #include "utils/acl.h"
>  #include "utils/fmgroids.h"
>  #include "utils/guc.h"
> +#include "utils/guc_hooks.h"
>  #include "utils/memutils.h"
>  #include "utils/pg_rusage.h"
>  #include "utils/snapmgr.h"
> @@ -95,6 +96,30 @@ static VacOptValue get_vacoptval_from_boolean(DefElem *def);
>  static bool vac_tid_reaped(ItemPointer itemptr, void *state);
>  static int    vac_cmp_itemptr(const void *left, const void *right);
>  
> +/*
> + * GUC check function to ensure GUC value specified is within the allowable
> + * range.
> + */
> +bool
> +check_vacuum_buffer_usage_limit(int *newval, void **extra,
> +                                GucSource source)
> +{
> +    /* Allow specifying the default or disabling Buffer Access Strategy */
> +    if (*newval == -1 || *newval == 0)
> +        return true;

This should not check for -1 since that isn't the default anymore.
It should only need to check for 0 I think?

> +    /* Value upper and lower hard limits are inclusive */
> +    if (*newval >= MIN_BAS_VAC_RING_SIZE_KB &&
> +        *newval <= MAX_BAS_VAC_RING_SIZE_KB)
> +        return true;
> +
> +    /* Value does not fall within any allowable range */
> +    GUC_check_errdetail("\"vacuum_buffer_usage_limit\" must be -1, 0 or between %d KB and %d KB",
> +                        MIN_BAS_VAC_RING_SIZE_KB, MAX_BAS_VAC_RING_SIZE_KB);

Also remove -1 here.

>   * Primary entry point for manual VACUUM and ANALYZE commands
>   *
> @@ -114,6 +139,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
>      bool        disable_page_skipping = false;
>      bool        process_main = true;
>      bool        process_toast = true;
> +    /* by default use buffer access strategy with default size */
> +    int            ring_size = -1;

We need to update this comment to something like, "use an invalid value
for ring_size" so it is clear whether or not the BUFFER_USAGE_LIMIT was
specified when making the access strategy later". Also, I think just
removing the comment would be okay, because this is the normal behavior
for initializing values, I think.

> @@ -240,6 +309,17 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
>                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                   errmsg("VACUUM FULL cannot be performed in parallel")));
>  
> +    /*
> +     * BUFFER_USAGE_LIMIT does nothing for VACUUM (FULL) so just raise an
> +     * ERROR for that case.  VACUUM (FULL, ANALYZE) does make use of it, so
> +     * we'll permit that.
> +     */
> +    if ((params.options & VACOPT_FULL) && !(params.options & VACOPT_ANALYZE) &&
> +        ring_size > -1)
> +        ereport(ERROR,
> +                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> +                 errmsg("BUFFER_USAGE_LIMIT cannot be specified for VACUUM FULL")));
> +
>      /*
>       * Make sure VACOPT_ANALYZE is specified if any column lists are present.
>       */
> @@ -341,7 +421,20 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
>  
>          MemoryContext old_context = MemoryContextSwitchTo(vac_context);
>  
> -        bstrategy = GetAccessStrategy(BAS_VACUUM);

Is it worth moving this assert up above when we do the "sanity checking"
for VACUUM FULL with BUFFER_USAGE_LIMIT?

> +        Assert(ring_size >= -1);

> +        /*
> +         * If BUFFER_USAGE_LIMIT was specified by the VACUUM command, that
> +         * overrides the value of VacuumBufferUsageLimit.  Otherwise, use
> +         * VacuumBufferUsageLimit to define the size, which might be 0.  We
> +         * expliot that calling GetAccessStrategyWithSize with a zero size

s/expliot/exploit

I might rephrase the last sentence(s). Since it overrides it, I think it
is clear that if it is not specified, then the thing it overrides is
used. Then you could phrase the whole thing like:

 "If BUFFER_USAGE_LIMIT was specified by the VACUUM or ANALYZE command,
 it overrides the value of VacuumBufferUsageLimit.  Either value may be
 0, in which case GetAccessStrategyWithSize() will return NULL, which is
 the expected behavior."

> +         * returns NULL.
> +         */
> +        if (ring_size > -1)
> +            bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, ring_size);
> +        else
> +            bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, VacuumBufferUsageLimit);
> +
>          MemoryContextSwitchTo(old_context);
>      }

> diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
> index c1e911b1b3..1b5f779384 100644
> --- a/src/backend/postmaster/autovacuum.c
> +++ b/src/backend/postmaster/autovacuum.c
> @@ -2287,11 +2287,21 @@ do_autovacuum(void)
>      /*
> -     * Create a buffer access strategy object for VACUUM to use.  We want to
> -     * use the same one across all the vacuum operations we perform, since the
> -     * point is for VACUUM not to blow out the shared cache.
> +     * Optionally, create a buffer access strategy object for VACUUM to use.
> +     * When creating one, we want the same one across all tables being
> +     * vacuumed this helps prevent autovacuum from blowing out shared buffers.

"When creating one" is a bit awkward. I would say something like "Use
the same BufferAccessStrategy object for all tables VACUUMed by this
worker to prevent autovacuum from blowing out shared buffers."

> diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c
> index f122709fbe..710b05cbc5 100644
> --- a/src/backend/storage/buffer/freelist.c
> +++ b/src/backend/storage/buffer/freelist.c
> +/*
> + * GetAccessStrategyWithSize -- create a BufferAccessStrategy object with a
> + *        number of buffers equivalent to the passed in size.
> + *
> + * If the given ring size is 0, no BufferAccessStrategy will be created and
> + * the function will return NULL.  The ring size may not be negative.
> + */
> +BufferAccessStrategy
> +GetAccessStrategyWithSize(BufferAccessStrategyType btype, int ring_size_kb)
> +{
> +    int            ring_buffers;
> +    BufferAccessStrategy strategy;
> +
> +    Assert(ring_size_kb >= 0);
> +
> +    /* Figure out how many buffers ring_size_kb is */
> +    ring_buffers = ring_size_kb / (BLCKSZ / 1024);

Is there any BLCKSZ that could end up rounding down to 0 and resulting
in a divide by 0?

> diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
> index 1b1d814254..011ec18015 100644
> --- a/src/backend/utils/init/globals.c
> +++ b/src/backend/utils/init/globals.c
> @@ -139,7 +139,10 @@ int            max_worker_processes = 8;
>  int            max_parallel_workers = 8;
>  int            MaxBackends = 0;
>  
> -int            VacuumCostPageHit = 1;    /* GUC parameters for vacuum */
> +/* GUC parameters for vacuum */
> +int            VacuumBufferUsageLimit = 256;

So, I know we agreed to make it camel cased, but I couldn't help
mentioning the discussion over in [1] in which Sawada-san says:

> In vacuum.c, we use snake case for GUC parameters and camel case for
> other global variables

Our variable doesn't have a corresponding global that is not a GUC, and
the current pattern is hardly consistent. But, I know we are discussing
following this convention, so I thought I would mention it.

> diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
> index 06a86f9ac1..d4f9ff8077 100644
> --- a/src/include/miscadmin.h
> +++ b/src/include/miscadmin.h
> @@ -263,6 +263,18 @@ extern PGDLLIMPORT double hash_mem_multiplier;
>  extern PGDLLIMPORT int maintenance_work_mem;
>  extern PGDLLIMPORT int max_parallel_maintenance_workers;
>  

GUC name mentioned in comment is inconsistent with current GUC name.

> +/*
> + * Upper and lower hard limits for the buffer access strategy ring size
> + * specified by the vacuum_buffer_usage_limit GUC and BUFFER_USAGE_LIMIT
> + * option to VACUUM and ANALYZE.
> + */
> +#define MIN_BAS_VAC_RING_SIZE_KB 128
> +#define MAX_BAS_VAC_RING_SIZE_KB (16 * 1024 * 1024)

> diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
> index a1fad43657..d23e1a8ced 100644
> --- a/src/test/regress/sql/vacuum.sql
> +++ b/src/test/regress/sql/vacuum.sql
> @@ -272,6 +272,18 @@ SELECT t.relfilenode = :toast_filenode AS is_same_toast_filenode
>    FROM pg_class c, pg_class t
>    WHERE c.reltoastrelid = t.oid AND c.relname = 'vac_option_tab';
>  
> +-- BUFFER_USAGE_LIMIT option
> +VACUUM (BUFFER_USAGE_LIMIT '512 kB') vac_option_tab;

Is it worth adding a VACUUM (BUFFER_USAGE_LIMIT 0) vac_option_tab test?

- Melanie

[1] https://www.postgresql.org/message-id/CAD21AoC5aDwARiqsL%2BKwHqnN7phub9AaMkbGkJ9aUCeETx8esw%40mail.gmail.com



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
On Fri, 7 Apr 2023 at 05:20, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml

> This still says that the default value is -1.

Oops, I had this staged but didn't commit and formed the patch with
"git diff master.."  instead of "git diff master", so missed a few
staged changes.

> > diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
> I noticed you seemed to have removed the reference to the GUC
> vacuum_buffer_usage_limit here. Was that intentional?
> We may not need to mention "falling back" as I did before, however, the
> GUC docs mention max/min values and such, which might be useful.

Unintentional. I removed it when removing the -1 stuff. It's useful to
keep something about the fallback, so I put that part back.

> > +     /* Allow specifying the default or disabling Buffer Access Strategy */
> > +     if (*newval == -1 || *newval == 0)
> > +             return true;
>
> This should not check for -1 since that isn't the default anymore.
> It should only need to check for 0 I think?

Thanks. That one was one of the staged fixes.

> > +     /* Value upper and lower hard limits are inclusive */
> > +     if (*newval >= MIN_BAS_VAC_RING_SIZE_KB &&
> > +             *newval <= MAX_BAS_VAC_RING_SIZE_KB)
> > +             return true;
> > +
> > +     /* Value does not fall within any allowable range */
> > +     GUC_check_errdetail("\"vacuum_buffer_usage_limit\" must be -1, 0 or between %d KB and %d KB",
> > +                                             MIN_BAS_VAC_RING_SIZE_KB, MAX_BAS_VAC_RING_SIZE_KB);
>
> Also remove -1 here.

And this one.

> >   * Primary entry point for manual VACUUM and ANALYZE commands
> >   *
> > @@ -114,6 +139,8 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
> >       bool            disable_page_skipping = false;
> >       bool            process_main = true;
> >       bool            process_toast = true;
> > +     /* by default use buffer access strategy with default size */
> > +     int                     ring_size = -1;
>
> We need to update this comment to something like, "use an invalid value
> for ring_size" so it is clear whether or not the BUFFER_USAGE_LIMIT was
> specified when making the access strategy later". Also, I think just
> removing the comment would be okay, because this is the normal behavior
> for initializing values, I think.

Yeah, I've moved the assignment away from the declaration and wrote
something along those lines.

> > @@ -240,6 +309,17 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
> >                               (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> >                                errmsg("VACUUM FULL cannot be performed in parallel")));
> >
> > +     /*
> > +      * BUFFER_USAGE_LIMIT does nothing for VACUUM (FULL) so just raise an
> > +      * ERROR for that case.  VACUUM (FULL, ANALYZE) does make use of it, so
> > +      * we'll permit that.
> > +      */
> > +     if ((params.options & VACOPT_FULL) && !(params.options & VACOPT_ANALYZE) &&
> > +             ring_size > -1)
> > +             ereport(ERROR,
> > +                             (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> > +                              errmsg("BUFFER_USAGE_LIMIT cannot be specified for VACUUM FULL")));
> > +
> >       /*
> >        * Make sure VACOPT_ANALYZE is specified if any column lists are present.
> >        */
> > @@ -341,7 +421,20 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
> >
> >               MemoryContext old_context = MemoryContextSwitchTo(vac_context);
> >
> > -             bstrategy = GetAccessStrategy(BAS_VACUUM);
>
> Is it worth moving this assert up above when we do the "sanity checking"
> for VACUUM FULL with BUFFER_USAGE_LIMIT?

I didn't do this, but I did adjust that check to check ring_size != -1
and put that as the first condition. It's likely more rare to have
ring_size not set to -1, so probably should check that first.

> s/expliot/exploit

oops

> I might rephrase the last sentence(s). Since it overrides it, I think it
> is clear that if it is not specified, then the thing it overrides is
> used. Then you could phrase the whole thing like:
>
>  "If BUFFER_USAGE_LIMIT was specified by the VACUUM or ANALYZE command,
>  it overrides the value of VacuumBufferUsageLimit.  Either value may be
>  0, in which case GetAccessStrategyWithSize() will return NULL, which is
>  the expected behavior."

Fixed.

> > diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
> > index c1e911b1b3..1b5f779384 100644
> > --- a/src/backend/postmaster/autovacuum.c
> > +++ b/src/backend/postmaster/autovacuum.c
> > @@ -2287,11 +2287,21 @@ do_autovacuum(void)
> >       /*
> > -      * Create a buffer access strategy object for VACUUM to use.  We want to
> > -      * use the same one across all the vacuum operations we perform, since the
> > -      * point is for VACUUM not to blow out the shared cache.
> > +      * Optionally, create a buffer access strategy object for VACUUM to use.
> > +      * When creating one, we want the same one across all tables being
> > +      * vacuumed this helps prevent autovacuum from blowing out shared buffers.
>
> "When creating one" is a bit awkward. I would say something like "Use
> the same BufferAccessStrategy object for all tables VACUUMed by this
> worker to prevent autovacuum from blowing out shared buffers."

Adjusted

> > +     /* Figure out how many buffers ring_size_kb is */
> > +     ring_buffers = ring_size_kb / (BLCKSZ / 1024);
>
> Is there any BLCKSZ that could end up rounding down to 0 and resulting
> in a divide by 0?

I removed that Assert() as I found quite a number of other places in
our code that assume BLCKSZ / 1024 is never 0.

> So, I know we agreed to make it camel cased, but I couldn't help
> mentioning the discussion over in [1] in which Sawada-san says:

I didn't change anything here. I'm happy to follow any rules about
this once they're defined. What we have today is horribly
inconsistent.

> GUC name mentioned in comment is inconsistent with current GUC name.
>
> > +/*
> > + * Upper and lower hard limits for the buffer access strategy ring size
> > + * specified by the vacuum_buffer_usage_limit GUC and BUFFER_USAGE_LIMIT
> > + * option to VACUUM and ANALYZE.

I did adjust this. I wasn't sure it was incorrect as I mentioned "GUC"
as in, the user facing setting.

> Is it worth adding a VACUUM (BUFFER_USAGE_LIMIT 0) vac_option_tab test?

I think so.

I've attached v16.

David

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Justin Pryzby
Date:
+VACUUM uses a ring like sequential scans, however, the size this ring
+controlled by the vacuum_buffer_usage_limit GUC.  Dirty pages are not removed

should say: ".. the size OF this ring IS .." ?



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Fri, Apr 07, 2023 at 09:12:32AM +1200, David Rowley wrote:
> On Fri, 7 Apr 2023 at 05:20, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > GUC name mentioned in comment is inconsistent with current GUC name.
> >
> > > +/*
> > > + * Upper and lower hard limits for the buffer access strategy ring size
> > > + * specified by the vacuum_buffer_usage_limit GUC and BUFFER_USAGE_LIMIT
> > > + * option to VACUUM and ANALYZE.
> 
> I did adjust this. I wasn't sure it was incorrect as I mentioned "GUC"
> as in, the user facing setting.

Oh, actually maybe you are right then.

> diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
> index bcc49aec45..220f9ee84c 100644
> --- a/doc/src/sgml/config.sgml
> +++ b/doc/src/sgml/config.sgml
> @@ -2001,6 +2001,35 @@ include_dir 'conf.d'
>        </listitem>
>       </varlistentry>
>  
> +     <varlistentry id="guc-vacuum-buffer-usage-limit" xreflabel="vacuum_buffer_usage_limit">
> +      <term>
> +       <varname>vacuum_buffer_usage_limit</varname> (<type>integer</type>)
> +       <indexterm>
> +        <primary><varname>vacuum_buffer_usage_limit</varname> configuration parameter</primary>
> +       </indexterm>
> +      </term>
> +      <listitem>
> +       <para>
> +        Specifies the size of <varname>shared_buffers</varname> to be reused
> +        for each backend participating in a given invocation of
> +        <command>VACUUM</command> or <command>ANALYZE</command> or in
> +        autovacuum.  

Rereading this, I think it is not a good sentence (my fault).
Perhaps we should use the same language as the BUFFER_USAGE_LIMIT
options. Something like:

Specifies the
<glossterm linkend="glossary-buffer-access-strategy">Buffer Access Strategy</glossterm>
ring buffer size used by each backend participating in a given
invocation of <command>VACUUM</command> or <command>ANALYZE</command> or
in autovacuum.

Last part is admittedly a bit awkward...

> diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
> index ea1d8960f4..995b4bd54a 100644
> --- a/src/backend/commands/vacuum.c
> +++ b/src/backend/commands/vacuum.c
> @@ -56,6 +56,7 @@
>  #include "utils/acl.h"
>  #include "utils/fmgroids.h"
>  #include "utils/guc.h"
> +#include "utils/guc_hooks.h"
>  #include "utils/memutils.h"
>  #include "utils/pg_rusage.h"
>  #include "utils/snapmgr.h"
> @@ -95,6 +96,26 @@ static VacOptValue get_vacoptval_from_boolean(DefElem *def);
>  static bool vac_tid_reaped(ItemPointer itemptr, void *state);
>  static int    vac_cmp_itemptr(const void *left, const void *right);
>  
> +/*
> + * GUC check function to ensure GUC value specified is within the allowable
> + * range.
> + */
> +bool
> +check_vacuum_buffer_usage_limit(int *newval, void **extra,
> +                                GucSource source)
> +{

I'm not so hot on this comment. It seems very...generic. Like it could
be the comment on any GUC check function. I'm also okay with leaving it
as is.

> @@ -341,7 +422,19 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel)
>  
>          MemoryContext old_context = MemoryContextSwitchTo(vac_context);
>  
> -        bstrategy = GetAccessStrategy(BAS_VACUUM);
> +        Assert(ring_size >= -1);
> +
> +        /*
> +         * If BUFFER_USAGE_LIMIT was specified by the VACUUM or ANALYZE
> +         * command, it overrides the value of VacuumBufferUsageLimit.  Either
> +         * value may be 0, in which case GetAccessStrategyWithSize() will
> +         * return NULL, effectively allowing full use of shared buffers.

Maybe "unlimited" is better than "full"?

> +         */
> +        if (ring_size != -1)
> +            bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, ring_size);
> +        else
> +            bstrategy = GetAccessStrategyWithSize(BAS_VACUUM, VacuumBufferUsageLimit);
> +
>          MemoryContextSwitchTo(old_context);
>      }
>  
> diff --git a/src/backend/commands/vacuumparallel.c b/src/backend/commands/vacuumparallel.c
> @@ -365,6 +371,9 @@ parallel_vacuum_init(Relation rel, Relation *indrels, int nindexes,
>          maintenance_work_mem / Min(parallel_workers, nindexes_mwm) :
>          maintenance_work_mem;
>  
> +    /* Use the same buffer size for all workers */

I would say ring buffer size -- this sounds like it is the size of a
single buffer.

> +    shared->ring_nbuffers = GetAccessStrategyBufferCount(bstrategy);
> +
>      pg_atomic_init_u32(&(shared->cost_balance), 0);
>      pg_atomic_init_u32(&(shared->active_nworkers), 0);
>      pg_atomic_init_u32(&(shared->idx), 0);

> + * Upper and lower hard limits for the buffer access strategy ring size
> + * specified by the VacuumBufferUsageLimit GUC and BUFFER_USAGE_LIMIT option

I agree with your original usage of the actual GUC name, now that I
realize why you were doing it and am rereading it.

> + * to VACUUM and ANALYZE.
> + */
> +#define MIN_BAS_VAC_RING_SIZE_KB 128
> +#define MAX_BAS_VAC_RING_SIZE_KB (16 * 1024 * 1024)


Otherwise, LGTM.



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
On Fri, 7 Apr 2023 at 09:44, Melanie Plageman <melanieplageman@gmail.com> wrote:
> Otherwise, LGTM.

Thanks for looking.  I've also taken Justin's comments about the
README into account and fixed that part.

I've pushed the patch after a little more adjusting.  I added some
text to the docs that mention larger VACUUM_BUFFER_LIMITs can speed up
vacuum and also a reason why they might not want to go nuts with it.

I've also just now pushed the vacuumdb patch too. I ended up adjusting
some of the ERROR messages in the main patch after the following not
so nice user experience:

$ vacuumdb --buffer-usage-limit=1TB --analyze postgres
vacuumdb: vacuuming database "postgres"
SQL: VACUUM (SKIP_DATABASE_STATS, ANALYZE, BUFFER_USAGE_LIMIT '1TB')
vacuumdb: error: processing of database "postgres" failed: ERROR:
buffer_usage_limit option must be 0 or between 128 KB and 16777216 KB

$ vacuumdb --buffer-usage-limit=128KB --analyze postgres
vacuumdb: vacuuming database "postgres"
SQL: VACUUM (SKIP_DATABASE_STATS, ANALYZE, BUFFER_USAGE_LIMIT '128KB')
vacuumdb: error: processing of database "postgres" failed: ERROR:
value: "128KB": is invalid for buffer_usage_limit
HINT:  Valid units for this parameter are "B", "kB", "MB", "GB", and "TB".

David



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Peter Eisentraut
Date:
On 07.04.23 02:52, David Rowley wrote:
> On Fri, 7 Apr 2023 at 09:44, Melanie Plageman <melanieplageman@gmail.com> wrote:
>> Otherwise, LGTM.
> 
> Thanks for looking.  I've also taken Justin's comments about the
> README into account and fixed that part.
> 
> I've pushed the patch after a little more adjusting.  I added some
> text to the docs that mention larger VACUUM_BUFFER_LIMITs can speed up
> vacuum and also a reason why they might not want to go nuts with it.

I came across these new options and had a little bit of trouble figuring 
them out from the documentation.  Maybe this could be polished a bit.

vacuumdb --help says

     --buffer-usage-limit=BUFSIZE

I can guess what a "SIZE" might be, but is "BUFSIZE" different from a 
"SIZE"?  Maybe simplify here.

On the vacuumdb man page, the placeholder is

     <replaceable class="parameter">buffer_usage_limit</replaceable>

which is yet another way of phrasing it.  Maybe also use "size" here?

The VACUUM man page says

     BUFFER_USAGE_LIMIT [ <replaceable ...>string</replaceable> ]

which had me really confused.  The detailed description later doesn't 
give any further explanation of possible values, except that 
<literal>0</literal> is apparently a possible value, which in my mind is 
not a string.  Then there is a link to guc-vacuum-buffer-usage-limit, 
which lifts the mystery that this is really just an integer setting with 
possible memory-size units, but it was really hard to figure that out 
from the start!

Moreover, on the VACUUM man page, right below BUFFER_USAGE_LIMIT, it 
explains the different kinds of accepted values, and "string" wasn't 
added there.  Maybe also change this to "size" here and add an 
explanation there what kinds of sizes are possible.

Finally, the locations of the new options in the various documentation 
places seems a bit random.  The vacuumdb --help output and the man page 
appear to be mostly alphabetical, so --buffer-usage-limit should be 
after -a/--all.  (Also note that right now the option isn't even in the 
same place in the --help output versus the man page.)

The order of the options on the VACUUM man page doesn't make any sense 
anymore.  This isn't really the fault of this patch, but maybe it's time 
to do a fresh reordering there.



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
On Fri, 14 Apr 2023 at 19:20, Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> I came across these new options and had a little bit of trouble figuring
> them out from the documentation.  Maybe this could be polished a bit.
>
> vacuumdb --help says
>
>      --buffer-usage-limit=BUFSIZE
>
> I can guess what a "SIZE" might be, but is "BUFSIZE" different from a
> "SIZE"?  Maybe simplify here.
>
> On the vacuumdb man page, the placeholder is
>
>      <replaceable class="parameter">buffer_usage_limit</replaceable>
>
> which is yet another way of phrasing it.  Maybe also use "size" here?
>
> The VACUUM man page says
>
>      BUFFER_USAGE_LIMIT [ <replaceable ...>string</replaceable> ]
>
> which had me really confused.  The detailed description later doesn't
> give any further explanation of possible values, except that
> <literal>0</literal> is apparently a possible value, which in my mind is
> not a string.  Then there is a link to guc-vacuum-buffer-usage-limit,
> which lifts the mystery that this is really just an integer setting with
> possible memory-size units, but it was really hard to figure that out
> from the start!
>
> Moreover, on the VACUUM man page, right below BUFFER_USAGE_LIMIT, it
> explains the different kinds of accepted values, and "string" wasn't
> added there.  Maybe also change this to "size" here and add an
> explanation there what kinds of sizes are possible.
>
> Finally, the locations of the new options in the various documentation
> places seems a bit random.  The vacuumdb --help output and the man page
> appear to be mostly alphabetical, so --buffer-usage-limit should be
> after -a/--all.  (Also note that right now the option isn't even in the
> same place in the --help output versus the man page.)

These are all valid points. I've attached a patch aiming to address
each of them.

> The order of the options on the VACUUM man page doesn't make any sense
> anymore.  This isn't really the fault of this patch, but maybe it's time
> to do a fresh reordering there.

Agreed, that likely wasn't a big problem say about 5 years ago when we
had far fewer options, but the number has grown quite a bit since
then.

Right after I fix the points you've mentioned seems a good time to address that.

David

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
On Sat, 15 Apr 2023 at 12:59, David Rowley <dgrowleyml@gmail.com> wrote:
> These are all valid points. I've attached a patch aiming to address
> each of them.

I tweaked this a little further and pushed it.

David



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:
On Sat, Apr 15, 2023 at 12:59:52PM +1200, David Rowley wrote:
> On Fri, 14 Apr 2023 at 19:20, Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
> >
> > I came across these new options and had a little bit of trouble figuring
> > them out from the documentation.  Maybe this could be polished a bit.
> >
> > vacuumdb --help says
> >
> >      --buffer-usage-limit=BUFSIZE
> >
> > I can guess what a "SIZE" might be, but is "BUFSIZE" different from a
> > "SIZE"?  Maybe simplify here.
> >
> > On the vacuumdb man page, the placeholder is
> >
> >      <replaceable class="parameter">buffer_usage_limit</replaceable>
> >
> > which is yet another way of phrasing it.  Maybe also use "size" here?
> >
> > The VACUUM man page says
> >
> >      BUFFER_USAGE_LIMIT [ <replaceable ...>string</replaceable> ]
> >
> > which had me really confused.  The detailed description later doesn't
> > give any further explanation of possible values, except that
> > <literal>0</literal> is apparently a possible value, which in my mind is
> > not a string.  Then there is a link to guc-vacuum-buffer-usage-limit,
> > which lifts the mystery that this is really just an integer setting with
> > possible memory-size units, but it was really hard to figure that out
> > from the start!
> >
> > Moreover, on the VACUUM man page, right below BUFFER_USAGE_LIMIT, it
> > explains the different kinds of accepted values, and "string" wasn't
> > added there.  Maybe also change this to "size" here and add an
> > explanation there what kinds of sizes are possible.
> >
> > Finally, the locations of the new options in the various documentation
> > places seems a bit random.  The vacuumdb --help output and the man page
> > appear to be mostly alphabetical, so --buffer-usage-limit should be
> > after -a/--all.  (Also note that right now the option isn't even in the
> > same place in the --help output versus the man page.)
> 
> These are all valid points. I've attached a patch aiming to address
> each of them.

I like that we are now using "size" consistently instead of bufsize etc.

> 
> > The order of the options on the VACUUM man page doesn't make any sense
> > anymore.  This isn't really the fault of this patch, but maybe it's time
> > to do a fresh reordering there.
> 
> Agreed, that likely wasn't a big problem say about 5 years ago when we
> had far fewer options, but the number has grown quite a bit since
> then.
> 
> Right after I fix the points you've mentioned seems a good time to address that.

Are we still thinking that reordering the VACUUM (and ANALYZE) options
makes sense. And, if so, should it be alphabetical within parameter
category? That is, all actual parameters (e.g. FULL and FREEZE) are
alphabetically organized first followed by all parameter types (e.g.
boolean and size) alphabetically listed?

- Melanie



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
On Tue, 18 Apr 2023 at 09:21, Melanie Plageman
<melanieplageman@gmail.com> wrote:
> Are we still thinking that reordering the VACUUM (and ANALYZE) options
> makes sense. And, if so, should it be alphabetical within parameter
> category? That is, all actual parameters (e.g. FULL and FREEZE) are
> alphabetically organized first followed by all parameter types (e.g.
> boolean and size) alphabetically listed?

I've opened a thread for that [1].

David

[1] https://postgr.es/m/CAApHDvo1eWbt5PVpk0G=yCbBNgLU7KaRP6dCBHpNbFaBjyGyQA@mail.gmail.com



Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Masahiko Sawada
Date:
Hi,

On Sun, Apr 16, 2023 at 9:09 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Sat, 15 Apr 2023 at 12:59, David Rowley <dgrowleyml@gmail.com> wrote:
> > These are all valid points. I've attached a patch aiming to address
> > each of them.
>
> I tweaked this a little further and pushed it.
>

I realized that the value of vacuum_buffer_usage_limit parameter in
postgresql.conf.sample doesn't have the unit:

#vacuum_buffer_usage_limit = 256 # size of vacuum and analyze buffer
access strategy ring.
                                 # 0 to disable vacuum buffer access strategy
                                 # range 128kB to 16GB

It works but I think we might want to add the unit kB for
understandability and consistency with other GUC_UNIT_KB parameters.
I've attached a small patch that adds the unit and aligns the indent
of the comments to the perimeter parameters.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
David Rowley
Date:
On Wed, 26 Apr 2023, 8:48 pm Masahiko Sawada, <sawada.mshk@gmail.com> wrote:
I realized that the value of vacuum_buffer_usage_limit parameter in
postgresql.conf.sample doesn't have the unit:

#vacuum_buffer_usage_limit = 256 # size of vacuum and analyze buffer
access strategy ring.
                                 # 0 to disable vacuum buffer access strategy
                                 # range 128kB to 16GB

It works but I think we might want to add the unit kB for
understandability and consistency with other GUC_UNIT_KB parameters.
I've attached a small patch that adds the unit and aligns the indent
of the comments to the perimeter parameters.

I'm not currently able to check, but if work_mem has a unit in the sample conf then I agree that vacuum_buffer_usage_limit should too.

I'm fine for you to go ahead and adjust this, otherwise it'll be Monday before I can.

David

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Daniel Gustafsson
Date:
> On 26 Apr 2023, at 13:26, David Rowley <dgrowleyml@gmail.com> wrote:
> On Wed, 26 Apr 2023, 8:48 pm Masahiko Sawada, <sawada.mshk@gmail.com <mailto:sawada.mshk@gmail.com>> wrote:

> It works but I think we might want to add the unit kB for
> understandability and consistency with other GUC_UNIT_KB parameters.
> I've attached a small patch that adds the unit and aligns the indent
> of the comments to the perimeter parameters.
>
> I'm not currently able to check, but if work_mem has a unit in the sample conf then I agree that
vacuum_buffer_usage_limitshould too. 

+1 work_mem and all other related options in this section has a unit in the
sample conf so adding this makes sense.

--
Daniel Gustafsson




Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Melanie Plageman
Date:

On Wed, Apr 26, 2023 at 8:31 AM Daniel Gustafsson <daniel@yesql.se> wrote:
> On 26 Apr 2023, at 13:26, David Rowley <dgrowleyml@gmail.com> wrote:
> On Wed, 26 Apr 2023, 8:48 pm Masahiko Sawada, <sawada.mshk@gmail.com <mailto:sawada.mshk@gmail.com>> wrote:

> It works but I think we might want to add the unit kB for
> understandability and consistency with other GUC_UNIT_KB parameters.
> I've attached a small patch that adds the unit and aligns the indent
> of the comments to the perimeter parameters.
>
> I'm not currently able to check, but if work_mem has a unit in the sample conf then I agree that vacuum_buffer_usage_limit should too.

+1 work_mem and all other related options in this section has a unit in the
sample conf so adding this makes sense.

Agreed.
for the patch, the other GUCs have a tab instead of a space between the unit and the "#" of the first comment.
(not the fault of this patch but probably makes sense to fix now).
Otherwise, LGTM

- Melanie

Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode

From
Masahiko Sawada
Date:
On Wed, Apr 26, 2023 at 9:59 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
>
>
> On Wed, Apr 26, 2023 at 8:31 AM Daniel Gustafsson <daniel@yesql.se> wrote:
>>
>> > On 26 Apr 2023, at 13:26, David Rowley <dgrowleyml@gmail.com> wrote:
>> > On Wed, 26 Apr 2023, 8:48 pm Masahiko Sawada, <sawada.mshk@gmail.com <mailto:sawada.mshk@gmail.com>> wrote:
>>
>> > It works but I think we might want to add the unit kB for
>> > understandability and consistency with other GUC_UNIT_KB parameters.
>> > I've attached a small patch that adds the unit and aligns the indent
>> > of the comments to the perimeter parameters.
>> >
>> > I'm not currently able to check, but if work_mem has a unit in the sample conf then I agree that
vacuum_buffer_usage_limitshould too. 
>>
>> +1 work_mem and all other related options in this section has a unit in the
>> sample conf so adding this makes sense.
>
>
> Agreed.
> for the patch, the other GUCs have a tab instead of a space between the unit and the "#" of the first comment.
> (not the fault of this patch but probably makes sense to fix now).
> Otherwise, LGTM

Thanks for the review! Pushed after incorporating a comment from Melanie.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com