Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode - Mailing list pgsql-hackers
| From | Justin Pryzby |
|---|---|
| Subject | Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode |
| Date | |
| Msg-id | ZBYDTrD1kyGg+HkS@telsasoft.com Whole thread Raw |
| In response to | Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode (Melanie Plageman <melanieplageman@gmail.com>) |
| Responses |
Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode
|
| List | pgsql-hackers |
> Subject: [PATCH v4 3/3] add vacuum[db] option to specify ring_size and guc
> + Specifies the ring buffer size to be used for a given invocation of
> + <command>VACUUM</command> or instance of autovacuum. This size is
> + converted to a number of shared buffers which will be reused as part of
I'd say "specifies the size of shared_buffers to be reused as .."
> + a <literal>Buffer Access Strategy</literal>. <literal>0</literal> will
> + disable use of a <literal>Buffer Access Strategy</literal>.
> + <literal>-1</literal> will set the size to a default of <literal>256
> + kB</literal>. The maximum ring buffer size is <literal>16 GB</literal>.
> + Though you may set <varname>vacuum_buffer_usage_limit</varname> below
> + <literal>128 kB</literal>, it will be clamped to <literal>128
> + kB</literal> at runtime. The default value is <literal>-1</literal>.
> + This parameter can be set at any time.
GUC docs usually also say something like
"If this value is specified without units, it is taken as .."
> + is used to calculate a number of shared buffers which will be reused as
*the* number?
> + <command>VACUUM</command>. The analyze stage and parallel vacuum workers
> + do not use this size.
I think what you mean is that vacuum's heap scan stage uses the
strategy, but the index scan/cleanup phases doesn't?
> + The size in kB of the ring buffer used for vacuuming. This size is used
> + to calculate a number of shared buffers which will be reused as part of
*the* number
> +++ b/doc/src/sgml/ref/vacuumdb.sgml
The docs here duplicate the sql-vacuum docs. It seems better to refer
to the vacuum page for details, like --parallel does.
Unrelated: it would be nice if the client-side options were documented
separately from the server-side options. Especially due to --jobs and
--parallel.
> + if (!parse_int(vac_buffer_size, &result, GUC_UNIT_KB, NULL))
> + {
> + ereport(ERROR,
> + (errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("buffer_usage_limit for a vacuum must be between -1 and %d. %s is invalid.",
> + MAX_BAS_RING_SIZE_KB, vac_buffer_size),
> + parser_errposition(pstate, opt->location)));
> + }
> +
> + /* check for out-of-bounds */
> + if (result < -1 || result > MAX_BAS_RING_SIZE_KB)
> + {
> + ereport(ERROR,
> + (errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("buffer_usage_limit for a vacuum must be between -1 and %d",
> + MAX_BAS_RING_SIZE_KB),
> + parser_errposition(pstate, opt->location)));
> + }
I think these checks could be collapsed into a single ereport().
if !parse_int() || (result < -1 || result > MAX_BAS_RINGSIZE_KB):
ereport(ERROR,
errcode(ERRCODE_SYNTAX_ERROR),
errmsg("buffer_usage_limit for a vacuum must be an integer between -1 and %d",
MAX_BAS_RING_SIZE_KB),
There was a recent, similar, and unrelated suggestion here:
https://www.postgresql.org/message-id/20230314.135859.260879647537075548.horikyota.ntt%40gmail.com
> +#vacuum_buffer_usage_limit = -1 # size of vacuum buffer access strategy ring.
> + # -1 to use default,
> + # 0 to disable vacuum buffer access strategy and use shared buffers
I think it's confusing to say "and use shared buffers", since
"strategies" also use shared_buffers. It seems better to remove those 4
words.
> @@ -550,6 +563,13 @@ vacuum_one_database(ConnParams *cparams,
> pg_fatal("cannot use the \"%s\" option on server versions older than PostgreSQL %s",
> "--parallel", "13");
>
> + // TODO: this is a problem: if the user specifies this option with -1 in a
> + // version before 16, it will not produce an error message. it also won't
> + // do anything, but that still doesn't seem right.
Actually, that seems fine to me. If someone installs v16 vacuumdb, they
can run it against old servers and specify the option as -1 without it
failing with an error. I don't know if anyone will find that useful,
but it doesn't seem unreasonable.
I still think adding something to the glossary would be good.
Buffer Access Strategy:
A circular/ring buffer used for reading or writing data pages from/to
the operating system. Ring buffers are used for sequential scans of
large tables, VACUUM, COPY FROM, CREATE TABLE AS SELECT, ALTER TABLE,
and CLUSTER. By using only a limited portion of >shared_buffers<, the
ring buffer avoids avoids evicting large amounts of data whenever a
backend performs bulk I/O operations. Use of a ring buffer also forces
the backend to write out its own dirty pages, rather than leaving them
behind to be cleaned up by other backends.
If there's a larger section added than a glossary entry, the text could
be promoted from src/backend/storage/buffer/README to doc/.
--
Justin
pgsql-hackers by date: