Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode - Mailing list pgsql-hackers
From | Justin Pryzby |
---|---|
Subject | Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode |
Date | |
Msg-id | ZBYDTrD1kyGg+HkS@telsasoft.com Whole thread Raw |
In response to | Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode (Melanie Plageman <melanieplageman@gmail.com>) |
Responses |
Re: Option to not use ringbuffer in VACUUM, using it in failsafe mode
|
List | pgsql-hackers |
> Subject: [PATCH v4 3/3] add vacuum[db] option to specify ring_size and guc > + Specifies the ring buffer size to be used for a given invocation of > + <command>VACUUM</command> or instance of autovacuum. This size is > + converted to a number of shared buffers which will be reused as part of I'd say "specifies the size of shared_buffers to be reused as .." > + a <literal>Buffer Access Strategy</literal>. <literal>0</literal> will > + disable use of a <literal>Buffer Access Strategy</literal>. > + <literal>-1</literal> will set the size to a default of <literal>256 > + kB</literal>. The maximum ring buffer size is <literal>16 GB</literal>. > + Though you may set <varname>vacuum_buffer_usage_limit</varname> below > + <literal>128 kB</literal>, it will be clamped to <literal>128 > + kB</literal> at runtime. The default value is <literal>-1</literal>. > + This parameter can be set at any time. GUC docs usually also say something like "If this value is specified without units, it is taken as .." > + is used to calculate a number of shared buffers which will be reused as *the* number? > + <command>VACUUM</command>. The analyze stage and parallel vacuum workers > + do not use this size. I think what you mean is that vacuum's heap scan stage uses the strategy, but the index scan/cleanup phases doesn't? > + The size in kB of the ring buffer used for vacuuming. This size is used > + to calculate a number of shared buffers which will be reused as part of *the* number > +++ b/doc/src/sgml/ref/vacuumdb.sgml The docs here duplicate the sql-vacuum docs. It seems better to refer to the vacuum page for details, like --parallel does. Unrelated: it would be nice if the client-side options were documented separately from the server-side options. Especially due to --jobs and --parallel. > + if (!parse_int(vac_buffer_size, &result, GUC_UNIT_KB, NULL)) > + { > + ereport(ERROR, > + (errcode(ERRCODE_SYNTAX_ERROR), > + errmsg("buffer_usage_limit for a vacuum must be between -1 and %d. %s is invalid.", > + MAX_BAS_RING_SIZE_KB, vac_buffer_size), > + parser_errposition(pstate, opt->location))); > + } > + > + /* check for out-of-bounds */ > + if (result < -1 || result > MAX_BAS_RING_SIZE_KB) > + { > + ereport(ERROR, > + (errcode(ERRCODE_SYNTAX_ERROR), > + errmsg("buffer_usage_limit for a vacuum must be between -1 and %d", > + MAX_BAS_RING_SIZE_KB), > + parser_errposition(pstate, opt->location))); > + } I think these checks could be collapsed into a single ereport(). if !parse_int() || (result < -1 || result > MAX_BAS_RINGSIZE_KB): ereport(ERROR, errcode(ERRCODE_SYNTAX_ERROR), errmsg("buffer_usage_limit for a vacuum must be an integer between -1 and %d", MAX_BAS_RING_SIZE_KB), There was a recent, similar, and unrelated suggestion here: https://www.postgresql.org/message-id/20230314.135859.260879647537075548.horikyota.ntt%40gmail.com > +#vacuum_buffer_usage_limit = -1 # size of vacuum buffer access strategy ring. > + # -1 to use default, > + # 0 to disable vacuum buffer access strategy and use shared buffers I think it's confusing to say "and use shared buffers", since "strategies" also use shared_buffers. It seems better to remove those 4 words. > @@ -550,6 +563,13 @@ vacuum_one_database(ConnParams *cparams, > pg_fatal("cannot use the \"%s\" option on server versions older than PostgreSQL %s", > "--parallel", "13"); > > + // TODO: this is a problem: if the user specifies this option with -1 in a > + // version before 16, it will not produce an error message. it also won't > + // do anything, but that still doesn't seem right. Actually, that seems fine to me. If someone installs v16 vacuumdb, they can run it against old servers and specify the option as -1 without it failing with an error. I don't know if anyone will find that useful, but it doesn't seem unreasonable. I still think adding something to the glossary would be good. Buffer Access Strategy: A circular/ring buffer used for reading or writing data pages from/to the operating system. Ring buffers are used for sequential scans of large tables, VACUUM, COPY FROM, CREATE TABLE AS SELECT, ALTER TABLE, and CLUSTER. By using only a limited portion of >shared_buffers<, the ring buffer avoids avoids evicting large amounts of data whenever a backend performs bulk I/O operations. Use of a ring buffer also forces the backend to write out its own dirty pages, rather than leaving them behind to be cleaned up by other backends. If there's a larger section added than a glossary entry, the text could be promoted from src/backend/storage/buffer/README to doc/. -- Justin
pgsql-hackers by date: