Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode
Msg-id CAAKRu_Z2JH5yqBRHUshVoM4SNqP9Fa5XezGh-MLJmY6p74C1Tg@mail.gmail.com
In response to Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode
List pgsql-hackers
On Mon, Apr 3, 2023 at 1:09 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Sat, 1 Apr 2023 at 13:24, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > Your diff LGTM.
> >
> > Earlier upthread in [1], Bharath had mentioned in a review comment about
> > removing the global variables that he would have expected the analogous
> > global in analyze.c to also be removed (vac_strategy [and analyze.c also
> > has anl_context]).
> >
> > I looked into doing this, and this is what I found out (see full
> > rationale in [2]):
> >
> > > it is a bit harder to remove it from analyze because acquire_func
> > > doesn't take the buffer access strategy as a parameter and
> > > acquire_sample_rows uses the vac_context global variable to pass to
> > > table_scan_analyze_next_block().
> >
> > I don't know if this is worth mentioning in the commit removing the
> > other globals? Maybe it will just make it more confusing...
>
> I did look at that, but it seems a little tricky to make work unless
> the AcquireSampleRowsFunc signature was changed. To me, it just does
> not seem worth doing that to get rid of the two globals in analyze.c.

Yes, I came to basically the same conclusion.

On Mon, Apr 3, 2023 at 7:57 AM David Rowley <dgrowleyml@gmail.com> wrote:
>
> I've now pushed up v8-0004.  Can rebase the remaining 2 patches on top
> of master again and resend?

v9 attached.

> On Mon, 3 Apr 2023 at 08:11, Melanie Plageman <melanieplageman@gmail.com> wrote:
> > I still have a few open questions:
> > - what the initial value of ring_size for autovacuum should be (see the
> >   one remaining TODO in the code)
>
> I assume you're talking about the 256KB BAS_VACUUM one set in
> GetAccessStrategy()? I don't think this patch should be doing anything
> to change those defaults.  Anything that does that should likely have
> a new thread and come with analysis or reasoning about why the newly
> proposed defaults are better than the old ones.

I actually was talking about something much more trivial but a little
more confusing.

In table_recheck_autovac(), I initialize the
autovac_table->at_params.ring_size to the value of the
vacuum_buffer_usage_limit guc. However, autovacuum makes its own
BufferAccessStrategy object (instead of relying on vacuum() to do it)
and passes that in to vacuum(). So, if we wanted autovacuum to disable
use of a strategy (and use as many shared buffers as it likes), it would
pass in NULL to vacuum(). If vacuum_buffer_usage_limit is not 0, then we
would end up making and using a BufferAccessStrategy in vacuum().

If we instead initialized autovac_table->at_params.ring_size to 0, even
if the passed in BufferAccessStrategy is NULL, we wouldn't make a ring
for autovacuum. Right now, we don't disable the strategy for autovacuum
except in failsafe mode. And it is unclear when or why we would want to.
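
To make the two cases above concrete, here is a toy model of the decision,
not the patch code. ToyStrategy and toy_vacuum_uses_ring() are hypothetical
names invented for this sketch; the only thing it assumes is the behavior
described above: a NULL strategy plus a nonzero ring_size means vacuum()
builds a ring itself, while a NULL strategy plus ring_size 0 means no ring.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a BufferAccessStrategy; not the real type. */
typedef struct ToyStrategy
{
    int         nbuffers;
} ToyStrategy;

/*
 * Toy model only: returns true if this vacuum run would end up using a
 * ring, given the strategy passed in by the caller (may be NULL) and the
 * ring_size stored in the vacuum parameters.
 */
bool
toy_vacuum_uses_ring(const ToyStrategy *passed_in, int ring_size)
{
    assert(ring_size >= 0);

    /* The caller (e.g. autovacuum) already made a ring: use it. */
    if (passed_in != NULL)
        return true;

    /*
     * No ring passed in. A nonzero ring_size means a ring gets built
     * anyway; only ring_size == 0 leaves the run without one.
     */
    return ring_size != 0;
}
```

In this model, toy_vacuum_uses_ring(NULL, 256) is true, which is the case
where passing NULL alone does not disable the strategy, and
toy_vacuum_uses_ring(NULL, 0) is false.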

I also thought it might be weird to have the value of the ring_size be
initialized to something other than the value of
vacuum_buffer_usage_limit for autovacuum, since it is supposed to use
that guc value.

In fact, right now, we don't use the autovac_table->at_params.ring_size
set in table_recheck_autovac() when making the ring in do_autovacuum()
but instead use the guc directly.

I actually don't really like how vacuum() relies on the
BufferAccessStrategy parameter being NULL for autovacuum and feel like
there is a more intuitive way to handle all this. But, I didn't want to
make major changes at this point.

Anyway, the above is quite a bit more analysis than the issue is really
worth. We should pick something and then document it in a comment.

> > - should ANALYZE allow specifying BUFFER_USAGE_LIMIT since it uses the guc
> >   value when that is set?
>
> That's a good question...

I kinda think we should just skip it. It adds to the surface area of the
feature.

> > - should INDEX_CLEANUP off cause VACUUM to use shared buffers and
> >   disable use of a strategy (like failsafe vacuum)
>
> I don't see why it should.  It seems strange to have one option
> magically make changes to some other option.

Sure, sounds good.

> > - should we add anything to VACUUM VERBOSE output about the number of
> >   reuses of strategy buffers?
>
> Sounds like this would require an extra array of counter variables in
> BufferAccessStrategyData?  I think it might be a bit late to start
> experimenting with this.

Makes sense. I hadn't thought through the implementation. We count reuses in
pg_stat_io data structures but that is global and not per
BufferAccessStrategyData instance, so I agree to scrapping this idea.

> > - Should we make BufferAccessStrategyData non-opaque so that we don't
> >   have to add a getter for nbuffers. I could have implemented this in
> >   another way, but I don't really see why BufferAccessStrategyData
> >   should be opaque
>
> If nothing outside of the .c file requires access then there's little
> need to make the members known outside of the file. Same as you'd want
> to make classes private rather than public when possible in OOP.
>
> If you do come up with a reason to be able to determine the size of
> the BufferAccessStrategy from outside freelist.c, I'd say an accessor
> method is the best way.

In the main patch, I wanted access to the number of buffers so that
parallel vacuum workers could make their own rings the same size. I
added an accessor, but it looked a bit silly so I thought I would ask if
we needed to keep the data structure opaque. It isn't called frequently
enough to worry about the function call overhead. Though the accessor
could use a better name than the one I chose.
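
For reference, the opaque-struct-plus-accessor shape under discussion looks
roughly like the sketch below. This is a single-file illustration, not the
freelist.c code: ToyStrategyData, toy_strategy_create(),
toy_strategy_get_nbuffers() and toy_strategy_free() are all hypothetical
names, and the accessor name in the patch is different.

```c
#include <stdlib.h>

/* What a header would expose: an opaque handle plus functions. */
typedef struct ToyStrategyData *ToyStrategy;

ToyStrategy toy_strategy_create(int nbuffers);
int         toy_strategy_get_nbuffers(ToyStrategy strategy);
void        toy_strategy_free(ToyStrategy strategy);

/* What the .c file would keep private: the actual members. */
struct ToyStrategyData
{
    int         nbuffers;   /* number of buffers in the ring */
    int         current;    /* index of the current ring slot */
};

ToyStrategy
toy_strategy_create(int nbuffers)
{
    ToyStrategy strategy = malloc(sizeof(*strategy));

    if (strategy == NULL)
        return NULL;
    strategy->nbuffers = nbuffers;
    strategy->current = 0;
    return strategy;
}

/* Accessor: the only way outside code learns the ring size. */
int
toy_strategy_get_nbuffers(ToyStrategy strategy)
{
    return strategy->nbuffers;
}

void
toy_strategy_free(ToyStrategy strategy)
{
    free(strategy);
}
```

A parallel worker in this model would size its own ring with
toy_strategy_create(toy_strategy_get_nbuffers(leader)), without ever seeing
the struct members.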

- Melanie
