Thread: add PROCESS_MAIN to VACUUM

add PROCESS_MAIN to VACUUM

From
Nathan Bossart
Date:
Hi hackers,

I originally suggested $ЅUBJECT as part of the thread that ultimately led
to the addition of PROCESS_TOAST [0], but we decided not to proceed with
it.  Recently, this idea came up again [1], so I thought I'd give it
another try.

The motivation for adding this option is to make it easier to VACUUM only a
relation's TOAST table.  At the moment, you need to find the TOAST table by
examining a relation's reltoastrelid, and you need USAGE on the pg_toast
schema.  This option could also help make it possible to call only
vac_update_datfrozenxid() without processing any relations, as discussed
elsewhere [2].

The demand for all these niche VACUUM options is likely limited, but it
does seem like there are some useful applications.  If a new option is out
of the question, perhaps this functionality could be added to the existing
PROCESS_TOAST option.

[0] https://postgr.es/m/BA8951E9-1524-48C5-94AF-73B1F0D7857F%40amazon.com
[1] https://postgr.es/m/20221215191246.GA252861%40nathanxps13
[2] https://postgr.es/m/20221229213719.GA301584%40nathanxps13

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Nathan Bossart
Date:
rebased for cfbot

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Jeff Davis
Date:
On Thu, 2022-12-29 at 16:00 -0800, Nathan Bossart wrote:
> The motivation for adding this option is to make it easier to VACUUM
> only a
> relation's TOAST table.  At the moment, you need to find the TOAST
> table by
> examining a relation's reltoastrelid, and you need USAGE on the
> pg_toast
> schema.  This option could also help make it possible to call only
> vac_update_datfrozenxid() without processing any relations, as
> discussed
> elsewhere [2].

For completeness, did you consider CLUSTER and REINDEX options as well?


--
Jeff Davis
PostgreSQL Contributor Team - AWS





Re: add PROCESS_MAIN to VACUUM

From
Nathan Bossart
Date:
On Fri, Jan 13, 2023 at 03:24:09PM -0800, Jeff Davis wrote:
> For completeness, did you consider CLUSTER and REINDEX options as well?

I have not, but I can put together patches for those as well.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: add PROCESS_MAIN to VACUUM

From
vignesh C
Date:
On Sat, 7 Jan 2023 at 10:37, Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> rebased for cfbot

The patch does not apply on top of HEAD as in [1], please post a rebased patch:
=== Applying patches on top of PostgreSQL commit ID
d540a02a724b9643205abce8c5644a0f0908f6e3 ===
=== applying patch ./v2-0001-add-PROCESS_MAIN-to-VACUUM.patch
patching file src/backend/commands/vacuum.c
....
Hunk #8 FAILED at 2097.
1 out of 8 hunks FAILED -- saving rejects to file
src/backend/commands/vacuum.c.rej

[1] - http://cfbot.cputube.org/patch_41_4088.log

Regards,
Vignesh



Re: add PROCESS_MAIN to VACUUM

From
Nathan Bossart
Date:
On Thu, Jan 19, 2023 at 05:28:25PM +0530, vignesh C wrote:
> The patch does not apply on top of HEAD as in [1], please post a rebased patch:

rebased

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Justin Pryzby
Date:
On Fri, Jan 13, 2023 at 03:30:15PM -0800, Nathan Bossart wrote:
> On Fri, Jan 13, 2023 at 03:24:09PM -0800, Jeff Davis wrote:
> > For completeness, did you consider CLUSTER and REINDEX options as well?
> 
> I have not, but I can put together patches for those as well.

Are you planning to do that here, on this thread ?

It seems like a good idea - it would allow simplifying an existing or
future scripts which needs to reindex a toast table, saving the trouble
of looking up the name of the toast table, and a race condition and
server error log if the table itself were processed.  I can imagine that
might happen if a separate process used TRUNCATE, for example.

-- 
Justin



Re: add PROCESS_MAIN to VACUUM

From
Nathan Bossart
Date:
On Mon, Feb 20, 2023 at 10:31:11AM -0600, Justin Pryzby wrote:
> On Fri, Jan 13, 2023 at 03:30:15PM -0800, Nathan Bossart wrote:
>> On Fri, Jan 13, 2023 at 03:24:09PM -0800, Jeff Davis wrote:
>> > For completeness, did you consider CLUSTER and REINDEX options as well?
>> 
>> I have not, but I can put together patches for those as well.
> 
> Are you planning to do that here, on this thread ?

Yes, I just haven't made time for it yet.  IIRC I briefly looked into
CLUSTER and decided that it was probably not worth the effort, but I still
think it's worth adding this option to REINDEX.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: add PROCESS_MAIN to VACUUM

From
Michael Paquier
Date:
On Thu, Jan 19, 2023 at 11:08:07AM -0800, Nathan Bossart wrote:
> rebased

PROCESS_TOAST has that:
    /* sanity check for PROCESS_TOAST */
    if ((params->options & VACOPT_FULL) != 0 &&
        (params->options & VACOPT_PROCESS_TOAST) == 0)
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("PROCESS_TOAST required with VACUUM FULL")));
[...]
-   if (params->options & VACOPT_FULL)
+   if (params->options & VACOPT_FULL &&
+       params->options & VACOPT_PROCESS_MAIN)
    {

Shouldn't we apply the same rule for PROCESS_MAIN?  One of the
regression tests added means that FULL takes priority over
PROCESS_MAIN=FALSE, which is a bit confusing IMO.

@@ -190,6 +190,7 @@ typedef struct VacAttrStats
 #define VACOPT_DISABLE_PAGE_SKIPPING 0x80  /* don't skip any pages */
 #define VACOPT_SKIP_DATABASE_STATS 0x100   /* skip vac_update_datfrozenxid() */
 #define VACOPT_ONLY_DATABASE_STATS 0x200   /* only vac_update_datfrozenxid() */
+#define VACOPT_PROCESS_MAIN 0x400  /* process main relation */

Perhaps the options had better be reorganized so as PROCESS_MAIN is
just before PROCESS_TOAST?

+-- PROCESS_MAIN option
+VACUUM (PROCESS_MAIN FALSE) vactst;
+VACUUM (PROCESS_MAIN FALSE, PROCESS_TOAST FALSE) vactst;
+VACUUM (PROCESS_MAIN FALSE, FULL) vactst;

Thinking a bit here.  This set of tests does not make sure that the
main relation and/or the toast relation have been actually processed.
pg_stat_user_tables does not track what's happening on the toast
relations.  So...  What about adding some tests in 100_vacuumdb.pl
that rely on vacuumdb --verbose and check the logs produced?  We
should make sure that the toast or the main relation are processed,
by tracking, for example, logs like vacuuming "schema.table".  When
FULL is involved, we may want to track the changes on relfilenodes
depending on what's wanted.
--
Michael

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Nathan Bossart
Date:
Thanks for taking a look.

On Wed, Mar 01, 2023 at 03:31:48PM +0900, Michael Paquier wrote:
> PROCESS_TOAST has that:
>     /* sanity check for PROCESS_TOAST */
>     if ((params->options & VACOPT_FULL) != 0 &&
>         (params->options & VACOPT_PROCESS_TOAST) == 0)
>         ereport(ERROR,
>                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                  errmsg("PROCESS_TOAST required with VACUUM FULL")));
> [...]
> -   if (params->options & VACOPT_FULL)
> +   if (params->options & VACOPT_FULL &&
> +       params->options & VACOPT_PROCESS_MAIN)
>     {
> 
> Shouldn't we apply the same rule for PROCESS_MAIN?  One of the
> regression tests added means that FULL takes priority over
> PROCESS_MAIN=FALSE, which is a bit confusing IMO.  

I don't think so.  We disallow FULL without PROCESS_TOAST because there
presently isn't a way to VACUUM FULL the main relation without rebuilding
its TOAST table.  However, FULL without PROCESS_MAIN can be used to run
VACUUM FULL on only the TOAST table.

> @@ -190,6 +190,7 @@ typedef struct VacAttrStats
>  #define VACOPT_DISABLE_PAGE_SKIPPING 0x80  /* don't skip any pages */
>  #define VACOPT_SKIP_DATABASE_STATS 0x100   /* skip vac_update_datfrozenxid() */
>  #define VACOPT_ONLY_DATABASE_STATS 0x200   /* only vac_update_datfrozenxid() */
> +#define VACOPT_PROCESS_MAIN 0x400  /* process main relation */
> 
> Perhaps the options had better be reorganized so as PROCESS_MAIN is
> just before PROCESS_TOAST?

Sure.

> +-- PROCESS_MAIN option
> +VACUUM (PROCESS_MAIN FALSE) vactst;
> +VACUUM (PROCESS_MAIN FALSE, PROCESS_TOAST FALSE) vactst;
> +VACUUM (PROCESS_MAIN FALSE, FULL) vactst;
> 
> Thinking a bit here.  This set of tests does not make sure that the
> main relation and/or the toast relation have been actually processed. 
> pg_stat_user_tables does not track what's happening on the toast
> relations.  So...  What about adding some tests in 100_vacuumdb.pl
> that rely on vacuumdb --verbose and check the logs produced?  We
> should make sure that the toast or the main relation are processed,
> by tracking, for example, logs like vacuuming "schema.table".  When
> FULL is involved, we may want to track the changes on relfilenodes
> depending on what's wanted.

That seems like a good idea.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: add PROCESS_MAIN to VACUUM

From
Alvaro Herrera
Date:
On 2023-Mar-01, Michael Paquier wrote:

> +-- PROCESS_MAIN option
> +VACUUM (PROCESS_MAIN FALSE) vactst;
> +VACUUM (PROCESS_MAIN FALSE, PROCESS_TOAST FALSE) vactst;
> +VACUUM (PROCESS_MAIN FALSE, FULL) vactst;
> 
> Thinking a bit here.  This set of tests does not make sure that the
> main relation and/or the toast relation have been actually processed. 
> pg_stat_user_tables does not track what's happening on the toast
> relations.  So...  What about adding some tests in 100_vacuumdb.pl
> that rely on vacuumdb --verbose and check the logs produced?  We
> should make sure that the toast or the main relation are processed,
> by tracking, for example, logs like vacuuming "schema.table".  When
> FULL is involved, we may want to track the changes on relfilenodes
> depending on what's wanted.

Maybe instead of reading the log, read values from pg_stat_all_tables.

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Uno puede defenderse de los ataques; contra los elogios se esta indefenso"



Re: add PROCESS_MAIN to VACUUM

From
Nathan Bossart
Date:
On Wed, Mar 01, 2023 at 07:09:53PM +0100, Alvaro Herrera wrote:
> On 2023-Mar-01, Michael Paquier wrote:
> 
>> +-- PROCESS_MAIN option
>> +VACUUM (PROCESS_MAIN FALSE) vactst;
>> +VACUUM (PROCESS_MAIN FALSE, PROCESS_TOAST FALSE) vactst;
>> +VACUUM (PROCESS_MAIN FALSE, FULL) vactst;
>> 
>> Thinking a bit here.  This set of tests does not make sure that the
>> main relation and/or the toast relation have been actually processed. 
>> pg_stat_user_tables does not track what's happening on the toast
>> relations.  So...  What about adding some tests in 100_vacuumdb.pl
>> that rely on vacuumdb --verbose and check the logs produced?  We
>> should make sure that the toast or the main relation are processed,
>> by tracking, for example, logs like vacuuming "schema.table".  When
>> FULL is involved, we may want to track the changes on relfilenodes
>> depending on what's wanted.
> 
> Maybe instead of reading the log, read values from pg_stat_all_tables.

Here is an attempt at that.  Thanks for the idea.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Michael Paquier
Date:
On Wed, Mar 01, 2023 at 07:09:53PM +0100, Alvaro Herrera wrote:
> Maybe instead of reading the log, read values from pg_stat_all_tables.

Ah, right.  I was looking at pg_stat_user_tables yesterday, and forgot
that pg_stat_all_tables tracks toast tables, so it should be fine to
do some validation with that.
--
Michael

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Masahiko Sawada
Date:
On Thu, Mar 2, 2023 at 4:13 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Wed, Mar 01, 2023 at 07:09:53PM +0100, Alvaro Herrera wrote:
> > On 2023-Mar-01, Michael Paquier wrote:
> >
> >> +-- PROCESS_MAIN option
> >> +VACUUM (PROCESS_MAIN FALSE) vactst;
> >> +VACUUM (PROCESS_MAIN FALSE, PROCESS_TOAST FALSE) vactst;
> >> +VACUUM (PROCESS_MAIN FALSE, FULL) vactst;
> >>
> >> Thinking a bit here.  This set of tests does not make sure that the
> >> main relation and/or the toast relation have been actually processed.
> >> pg_stat_user_tables does not track what's happening on the toast
> >> relations.  So...  What about adding some tests in 100_vacuumdb.pl
> >> that rely on vacuumdb --verbose and check the logs produced?  We
> >> should make sure that the toast or the main relation are processed,
> >> by tracking, for example, logs like vacuuming "schema.table".  When
> >> FULL is involved, we may want to track the changes on relfilenodes
> >> depending on what's wanted.
> >
> > Maybe instead of reading the log, read values from pg_stat_all_tables.
>
> Here is an attempt at that.  Thanks for the idea.

I've reviewed the v4 patch. Here is a minor comment:

+SELECT * FROM vactst_vacuum_counts;
+   left   | vacuum_count
+----------+--------------
+ pg_toast |            1
+ vactst   |            0
+(2 rows)

+CREATE VIEW vactst_vacuum_counts AS
+   SELECT left(s.relname, 8), s.vacuum_count
+   FROM pg_stat_all_tables s
+   LEFT JOIN pg_class c ON s.relid = c.reltoastrelid
+   WHERE c.relname = 'vactst' OR s.relname = 'vactst'
+   ORDER BY s.relname;

Cutting the toast relation name to 'pg_toast' is a bit confusing to me
as we have the pg_toast schema. How about using the following query
instead to improve the readability?

       SELECT
       CASE WHEN c.relname IS NULL THEN
       s.relname
       ELSE
       'toast for ' || c.relname
       END as relname,
       s.vacuum_count
       FROM pg_stat_all_tables s
       LEFT JOIN pg_class c ON s.relid = c.reltoastrelid
       WHERE c.relname = 'vactst' OR s.relname = 'vactst'

We will get like:

 SELECT * FROM vactst_vacuum_counts;
     relname      | vacuum_count
------------------+--------------
 toast for vactst |            0
 vactst           |            1
 (2 rows)

The rest looks good to me.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: add PROCESS_MAIN to VACUUM

From
Michael Paquier
Date:
On Thu, Mar 02, 2023 at 12:58:32PM +0900, Masahiko Sawada wrote:
> Cutting the toast relation name to 'pg_toast' is a bit confusing to me
> as we have the pg_toast schema. How about using the following query
> instead to improve the readability?
>
>        SELECT
>        CASE WHEN c.relname IS NULL THEN
>        s.relname
>        ELSE
>        'toast for ' || c.relname
>        END as relname,
>        s.vacuum_count
>        FROM pg_stat_all_tables s
>        LEFT JOIN pg_class c ON s.relid = c.reltoastrelid
>        WHERE c.relname = 'vactst' OR s.relname = 'vactst'

Another tweak that I have learnt to like is to apply a filter with
regexp_replace(), see 090_reindexdb.pl:
regexp_replace(b.indname::text, '(pg_toast.pg_toast_)\\d+(_index)', '\\1<oid>\\2')

If you make that part of the view definition, the result is the same,
so that's up to which solution one prefers.
--
Michael

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Nathan Bossart
Date:
On Thu, Mar 02, 2023 at 02:21:08PM +0900, Michael Paquier wrote:
> On Thu, Mar 02, 2023 at 12:58:32PM +0900, Masahiko Sawada wrote:
>> Cutting the toast relation name to 'pg_toast' is a bit confusing to me
>> as we have the pg_toast schema. How about using the following query
>> instead to improve the readability?
>> 
>>        SELECT
>>        CASE WHEN c.relname IS NULL THEN
>>        s.relname
>>        ELSE
>>        'toast for ' || c.relname
>>        END as relname,
>>        s.vacuum_count
>>        FROM pg_stat_all_tables s
>>        LEFT JOIN pg_class c ON s.relid = c.reltoastrelid
>>        WHERE c.relname = 'vactst' OR s.relname = 'vactst'
> 
> Another tweak that I have learnt to like is to apply a filter with
> regexp_replace(), see 090_reindexdb.pl:
> regexp_replace(b.indname::text, '(pg_toast.pg_toast_)\\d+(_index)', '\\1<oid>\\2')
> 
> If you make that part of the view definition, the result is the same,
> so that's up to which solution one prefers.

Here's a new version of the patch that uses Sawada-san's suggestion.
Thanks for taking a look.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Masahiko Sawada
Date:
On Thu, Mar 2, 2023 at 2:26 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Thu, Mar 02, 2023 at 02:21:08PM +0900, Michael Paquier wrote:
> > On Thu, Mar 02, 2023 at 12:58:32PM +0900, Masahiko Sawada wrote:
> >> Cutting the toast relation name to 'pg_toast' is a bit confusing to me
> >> as we have the pg_toast schema. How about using the following query
> >> instead to improve the readability?
> >>
> >>        SELECT
> >>        CASE WHEN c.relname IS NULL THEN
> >>        s.relname
> >>        ELSE
> >>        'toast for ' || c.relname
> >>        END as relname,
> >>        s.vacuum_count
> >>        FROM pg_stat_all_tables s
> >>        LEFT JOIN pg_class c ON s.relid = c.reltoastrelid
> >>        WHERE c.relname = 'vactst' OR s.relname = 'vactst'
> >
> > Another tweak that I have learnt to like is to apply a filter with
> > regexp_replace(), see 090_reindexdb.pl:
> > regexp_replace(b.indname::text, '(pg_toast.pg_toast_)\\d+(_index)', '\\1<oid>\\2')
> >
> > If you make that part of the view definition, the result is the same,
> > so that's up to which solution one prefers.
>
> Here's a new version of the patch that uses Sawada-san's suggestion.
> Thanks for taking a look.

Thank you for updating the patch. The patch looks good to me.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: add PROCESS_MAIN to VACUUM

From
Michael Paquier
Date:
On Wed, Mar 01, 2023 at 09:26:37AM -0800, Nathan Bossart wrote:
> Thanks for taking a look.
>
> On Wed, Mar 01, 2023 at 03:31:48PM +0900, Michael Paquier wrote:
> > PROCESS_TOAST has that:
> >     /* sanity check for PROCESS_TOAST */
> >     if ((params->options & VACOPT_FULL) != 0 &&
> >         (params->options & VACOPT_PROCESS_TOAST) == 0)
> >         ereport(ERROR,
> >                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> >                  errmsg("PROCESS_TOAST required with VACUUM FULL")));
> > [...]
> > -   if (params->options & VACOPT_FULL)
> > +   if (params->options & VACOPT_FULL &&
> > +       params->options & VACOPT_PROCESS_MAIN)
> >     {
> >
> > Shouldn't we apply the same rule for PROCESS_MAIN?  One of the
> > regression tests added means that FULL takes priority over
> > PROCESS_MAIN=FALSE, which is a bit confusing IMO.
>
> I don't think so.  We disallow FULL without PROCESS_TOAST because there
> presently isn't a way to VACUUM FULL the main relation without rebuilding
> its TOAST table.  However, FULL without PROCESS_MAIN can be used to run
> VACUUM FULL on only the TOAST table.

-   if (params->options & VACOPT_FULL)
+   if (params->options & VACOPT_FULL &&
+       params->options & VACOPT_PROCESS_MAIN)
Perhaps this is a bit under-parenthesized, while reading through it
once again..

+   {
+       /* we force VACOPT_PROCESS_MAIN so vacuum_rel() processes it */
+       bool force_opt = ((params->options & VACOPT_PROCESS_MAIN) == 0);
+
+       params->options |= VACOPT_PROCESS_MAIN;
        vacuum_rel(toast_relid, NULL, params, true);
+       if (force_opt)
+           params->options &= ~VACOPT_PROCESS_MAIN;
Zigzagging with the centralized VacuumParams is a bit inelegant.
Perhaps it would be neater to copy VacuumParams and append
VACOPT_PROCESS_MAIN on the copy?

An extra question: should we check the behavior of the commands when
applying a list of relations to VACUUM?
--
Michael

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Nathan Bossart
Date:
On Thu, Mar 02, 2023 at 02:53:02PM +0900, Michael Paquier wrote:
> -   if (params->options & VACOPT_FULL)
> +   if (params->options & VACOPT_FULL &&
> +       params->options & VACOPT_PROCESS_MAIN)
> Perhaps this is a bit under-parenthesized, while reading through it
> once again..

fixed

> 
> +   {
> +       /* we force VACOPT_PROCESS_MAIN so vacuum_rel() processes it */
> +       bool force_opt = ((params->options & VACOPT_PROCESS_MAIN) == 0);
> +
> +       params->options |= VACOPT_PROCESS_MAIN;
>         vacuum_rel(toast_relid, NULL, params, true);
> +       if (force_opt)
> +           params->options &= ~VACOPT_PROCESS_MAIN;
> Zigzagging with the centralized VacuumParams is a bit inelegant.
> Perhaps it would be neater to copy VacuumParams and append
> VACOPT_PROCESS_MAIN on the copy?

done

> An extra question: should we check the behavior of the commands when
> applying a list of relations to VACUUM?

I don't feel a strong need for that, especially now that we aren't
modifying params anymore.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Michael Paquier
Date:
On Wed, Mar 01, 2023 at 10:53:59PM -0800, Nathan Bossart wrote:
> I don't feel a strong need for that, especially now that we aren't
> modifying params anymore.

That was mostly OK for me, so applied after tweaking a couple of
places in the tests (extra explanations, for one), the comments and
the code.

The part that improved the tests of PROCESS_TOAST was useful on its
own, so I have done that separately as 46d490a.  Another thing that I
have found incorrect is the need for two pg_stat_reset() calls that
would reflect on the whole database, in the middle of a test running
in parallel of other things.  As far as I understand, you have added
that to provide fresh data after a single command while relying on
vactst, but it is possible to get the same amount of coverage by
relying on cumulated counts, and that gets even more solid with all
these commands run on an independent table.
--
Michael

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Nathan Bossart
Date:
On Mon, Mar 06, 2023 at 04:51:46PM +0900, Michael Paquier wrote:
> That was mostly OK for me, so applied after tweaking a couple of
> places in the tests (extra explanations, for one), the comments and
> the code.

Thanks!

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: add PROCESS_MAIN to VACUUM

From
Melanie Plageman
Date:
On Mon, Mar 06, 2023 at 09:37:23AM -0800, Nathan Bossart wrote:
> On Mon, Mar 06, 2023 at 04:51:46PM +0900, Michael Paquier wrote:
> > That was mostly OK for me, so applied after tweaking a couple of
> > places in the tests (extra explanations, for one), the comments and
> > the code.

I noticed in vacuum_rel() in vacuum.c where table_relation_vacuum() is
called, 4211fbd84 changes the else into an else if [1]. I understand
after reading the commit and re-reading the code why that is now, but I
was initially confused. I was thinking it might be nice to have a
comment mentioning why there is no else case here (i.e. that the main
table relation will be vacuumed on the else if branch).

- Melanie

[1] https://github.com/postgres/postgres/blob/master/src/backend/commands/vacuum.c#L2078



Re: add PROCESS_MAIN to VACUUM

From
Nathan Bossart
Date:
On Mon, Mar 06, 2023 at 02:40:09PM -0500, Melanie Plageman wrote:
> I noticed in vacuum_rel() in vacuum.c where table_relation_vacuum() is
> called, 4211fbd84 changes the else into an else if [1]. I understand
> after reading the commit and re-reading the code why that is now, but I
> was initially confused. I was thinking it might be nice to have a
> comment mentioning why there is no else case here (i.e. that the main
> table relation will be vacuumed on the else if branch).

This was a hack to avoid another level of indentation for that whole block
of code, but based on your comment, it might be better to just surround
this entire section with an "if (params->options & VACOPT_PROCESS_MAIN)"
check.  WDYT?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: add PROCESS_MAIN to VACUUM

From
Melanie Plageman
Date:
On Mon, Mar 6, 2023 at 3:27 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Mon, Mar 06, 2023 at 02:40:09PM -0500, Melanie Plageman wrote:
> > I noticed in vacuum_rel() in vacuum.c where table_relation_vacuum() is
> > called, 4211fbd84 changes the else into an else if [1]. I understand
> > after reading the commit and re-reading the code why that is now, but I
> > was initially confused. I was thinking it might be nice to have a
> > comment mentioning why there is no else case here (i.e. that the main
> > table relation will be vacuumed on the else if branch).
>
> This was a hack to avoid another level of indentation for that whole block
> of code, but based on your comment, it might be better to just surround
> this entire section with an "if (params->options & VACOPT_PROCESS_MAIN)"
> check.  WDYT?

I think that would be clearer.

- Melanie



Re: add PROCESS_MAIN to VACUUM

From
Nathan Bossart
Date:
On Mon, Mar 06, 2023 at 03:48:28PM -0500, Melanie Plageman wrote:
> On Mon, Mar 6, 2023 at 3:27 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>> On Mon, Mar 06, 2023 at 02:40:09PM -0500, Melanie Plageman wrote:
>> > I noticed in vacuum_rel() in vacuum.c where table_relation_vacuum() is
>> > called, 4211fbd84 changes the else into an else if [1]. I understand
>> > after reading the commit and re-reading the code why that is now, but I
>> > was initially confused. I was thinking it might be nice to have a
>> > comment mentioning why there is no else case here (i.e. that the main
>> > table relation will be vacuumed on the else if branch).
>>
>> This was a hack to avoid another level of indentation for that whole block
>> of code, but based on your comment, it might be better to just surround
>> this entire section with an "if (params->options & VACOPT_PROCESS_MAIN)"
>> check.  WDYT?
> 
> I think that would be clearer.

Here's a patch.  Thanks for reviewing.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Melanie Plageman
Date:
On Mon, Mar 06, 2023 at 01:13:37PM -0800, Nathan Bossart wrote:
> On Mon, Mar 06, 2023 at 03:48:28PM -0500, Melanie Plageman wrote:
> > On Mon, Mar 6, 2023 at 3:27 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> >> On Mon, Mar 06, 2023 at 02:40:09PM -0500, Melanie Plageman wrote:
> >> > I noticed in vacuum_rel() in vacuum.c where table_relation_vacuum() is
> >> > called, 4211fbd84 changes the else into an else if [1]. I understand
> >> > after reading the commit and re-reading the code why that is now, but I
> >> > was initially confused. I was thinking it might be nice to have a
> >> > comment mentioning why there is no else case here (i.e. that the main
> >> > table relation will be vacuumed on the else if branch).
> >>
> >> This was a hack to avoid another level of indentation for that whole block
> >> of code, but based on your comment, it might be better to just surround
> >> this entire section with an "if (params->options & VACOPT_PROCESS_MAIN)"
> >> check.  WDYT?
> > 
> > I think that would be clearer.
> 
> Here's a patch.  Thanks for reviewing.

> diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
> index 580f966499..fb1ef28fa9 100644
> --- a/src/backend/commands/vacuum.c
> +++ b/src/backend/commands/vacuum.c
> @@ -2060,23 +2060,25 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params, bool skip_privs)

I would move this comment inside of the outer if statement since it is
distinguishing between the two branches of the inner if statement.

Also, I would still consider putting a comment above that reminds us that
VACOPT_PROCESS_MAIN is the default and will vacuum the main relation.

>      /*
>       * Do the actual work --- either FULL or "lazy" vacuum
>       */
> -    if ((params->options & VACOPT_FULL) &&
> -        (params->options & VACOPT_PROCESS_MAIN))
> +    if (params->options & VACOPT_PROCESS_MAIN)
>      {
> -        ClusterParams cluster_params = {0};
> +        if (params->options & VACOPT_FULL)
> +        {
> +            ClusterParams cluster_params = {0};
>  
> -        /* close relation before vacuuming, but hold lock until commit */
> -        relation_close(rel, NoLock);
> -        rel = NULL;
> +            /* close relation before vacuuming, but hold lock until commit */
> +            relation_close(rel, NoLock);
> +            rel = NULL;
>  
> -        if ((params->options & VACOPT_VERBOSE) != 0)
> -            cluster_params.options |= CLUOPT_VERBOSE;
> +            if ((params->options & VACOPT_VERBOSE) != 0)
> +                cluster_params.options |= CLUOPT_VERBOSE;
>  
> -        /* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
> -        cluster_rel(relid, InvalidOid, &cluster_params);
> +            /* VACUUM FULL is now a variant of CLUSTER; see cluster.c */
> +            cluster_rel(relid, InvalidOid, &cluster_params);
> +        }
> +        else
> +            table_relation_vacuum(rel, params, vac_strategy);
>      }
> -    else if (params->options & VACOPT_PROCESS_MAIN)
> -        table_relation_vacuum(rel, params, vac_strategy);
>  
>      /* Roll back any GUC changes executed by index functions */
>      AtEOXact_GUC(false, save_nestlevel);


- Melanie



Re: add PROCESS_MAIN to VACUUM

From
Nathan Bossart
Date:
On Mon, Mar 06, 2023 at 05:09:58PM -0500, Melanie Plageman wrote:
> I would move this comment inside of the outer if statement since it is
> distinguishing between the two branches of the inner if statement.

Oops, done.

> Also, I would still consider putting a comment above that reminds us that
> VACOPT_PROCESS_MAIN is the default and will vacuum the main relation.

I tried adding something along these lines.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Melanie Plageman
Date:
On Mon, Mar 6, 2023 at 5:43 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Mon, Mar 06, 2023 at 05:09:58PM -0500, Melanie Plageman wrote:
> > I would move this comment inside of the outer if statement since it is
> > distinguishing between the two branches of the inner if statement.
>
> Oops, done.
>
> > Also, I would still consider putting a comment above that reminds us that
> > VACOPT_PROCESS_MAIN is the default and will vacuum the main relation.
>
> I tried adding something along these lines.

LGTM.



Re: add PROCESS_MAIN to VACUUM

From
Michael Paquier
Date:
On Mon, Mar 06, 2023 at 06:12:36PM -0500, Melanie Plageman wrote:
> LGTM.

-    * Do the actual work --- either FULL or "lazy" vacuum
+    * If PROCESS_MAIN is set (the default), it's time to vacuum the main
+    * relation.  Otherwise, we can skip this part.  If required, we'll process
+    * the TOAST table later.

Should we mention that this part could be used for a toast table once
we've already looped once through vacuum_rel() when toast_relid was
set?  VACOPT_PROCESS_MAIN is enforced a few lines down the road,
still..
--
Michael

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Nathan Bossart
Date:
On Tue, Mar 07, 2023 at 09:20:12AM +0900, Michael Paquier wrote:
> -    * Do the actual work --- either FULL or "lazy" vacuum
> +    * If PROCESS_MAIN is set (the default), it's time to vacuum the main
> +    * relation.  Otherwise, we can skip this part.  If required, we'll process
> +    * the TOAST table later.
> 
> Should we mention that this part could be used for a toast table once
> we've already looped once through vacuum_rel() when toast_relid was
> set?  VACOPT_PROCESS_MAIN is enforced a few lines down the road,
> still..

That did cross my mind, but I was worried that trying to explain all that
here could cause confusion.

    If PROCESS_MAIN is set (the default), it's time to vacuum the main
    relation.  Otherwise, we can skip this part.  If processing the TOAST
    table is required (e.g., PROCESS_TOAST is set), we'll force
    PROCESS_MAIN to be set when we recurse to the TOAST table so that it
    gets processed here.

How does that sound?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: add PROCESS_MAIN to VACUUM

From
Michael Paquier
Date:
On Mon, Mar 06, 2023 at 04:59:49PM -0800, Nathan Bossart wrote:
> That did cross my mind, but I was worried that trying to explain all that
> here could cause confusion.
>
>     If PROCESS_MAIN is set (the default), it's time to vacuum the main
>     relation.  Otherwise, we can skip this part.  If processing the TOAST
>     table is required (e.g., PROCESS_TOAST is set), we'll force
>     PROCESS_MAIN to be set when we recurse to the TOAST table so that it
>     gets processed here.
>
> How does that sound?

Sounds clear to me, thanks!  Melanie, what do you think?
--
Michael

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Melanie Plageman
Date:
On Mon, Mar 6, 2023 at 10:45 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Mar 06, 2023 at 04:59:49PM -0800, Nathan Bossart wrote:
> > That did cross my mind, but I was worried that trying to explain all that
> > here could cause confusion.
> >
> >       If PROCESS_MAIN is set (the default), it's time to vacuum the main
> >       relation.  Otherwise, we can skip this part.  If processing the TOAST
> >       table is required (e.g., PROCESS_TOAST is set), we'll force
> >       PROCESS_MAIN to be set when we recurse to the TOAST table so that it
> >       gets processed here.
> >
> > How does that sound?
>
> Sounds clear to me, thanks!  Melanie, what do you think?

Yes, sounds clear to me also!

- Melanie



Re: add PROCESS_MAIN to VACUUM

From
Nathan Bossart
Date:
On Tue, Mar 07, 2023 at 12:39:29PM -0500, Melanie Plageman wrote:
> Yes, sounds clear to me also!

Here is an updated patch.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: add PROCESS_MAIN to VACUUM

From
Michael Paquier
Date:
On Tue, Mar 07, 2023 at 12:55:08PM -0800, Nathan Bossart wrote:
> On Tue, Mar 07, 2023 at 12:39:29PM -0500, Melanie Plageman wrote:
>> Yes, sounds clear to me also!
>
> Here is an updated patch.

Fine by me, so done.  (I have cut a few words from the comment,
without changing its meaning.)
--
Michael

Attachment